Re: Beautiful Soup - close tags more promptly?

2022-10-25 Thread Tim Delaney
On Mon, 24 Oct 2022 at 19:03, Chris Angelico  wrote:

>
> Ah, cool. Thanks. I'm not entirely sure of the various advantages and
> disadvantages of the different parsers; is there a tabulation
> anywhere, or at least a list of recommendations on choosing a suitable
> parser?
>

Coming to this a bit late, but from my experience with BeautifulSoup and
HTML produced by other people ...

lxml is easily the fastest, but also the least forgiving.
html.parser is middling on performance, but as you've seen sometimes makes
mistakes.
html5lib is the slowest, but is most forgiving of malformed input and edge
cases.

I use html5lib - it's fast enough for what I do, and the most likely to
return results matching what the author saw when they maybe tried it in a
single web browser.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Randomizing Strings In A Microservices World

2019-12-09 Thread Tim Delaney
On Tue, 10 Dec 2019 at 12:12, Tim Daneliuk  wrote:

> - Each of these services needs to produce a string of ten digits
>   guaranteed to be unique on a per service instance basis AND to not
>   collide for - oh, let's say - forever :)
>
> Can anyone suggest a randomization method that might achieve this
> efficiently?
>
> My first thought was to do something like nanoseconds since the epoch
> plus something unique about the service instance - like its IP?  (This
> is in a K8s cluster) - to seed the randomization and essentially
> eliminate the string being repeated.
>

10 digits only gets you up to 9,999,999,999. That's not a very big number.
Forget nanoseconds since the epoch - ten digits currently only just holds
seconds since the epoch, let alone leaving room to combine with any other
identifier.

$ date '+%s'
1575942612
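
To put numbers on that, here is a rough sketch of the arithmetic. The
variable names are mine, purely for illustration; the timestamp is the
date output above.

```python
# Illustrative arithmetic only; the names below are made up for this sketch.
MAX_TEN_DIGITS = 10**10 - 1        # largest value a ten-digit string can hold

seconds = 1575942612               # the `date '+%s'` output shown above
nanoseconds = seconds * 10**9      # the same instant expressed in nanoseconds

print(MAX_TEN_DIGITS)              # 9999999999
print(len(str(seconds)))           # 10
print(len(str(nanoseconds)))       # 19
```

So seconds since the epoch already fill all ten digits, and a nanosecond
timestamp needs nearly twice as many.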

Tim Delaney


Re: Are the critiques in "All the things I hate about Python" valid?

2018-02-18 Thread Tim Delaney
On 18 February 2018 at 22:55, Anders Wegge Keller <we...@wegge.dk> wrote:

> På Sat, 17 Feb 2018 15:05:34 +1100
> Ben Finney <ben+pyt...@benfinney.id.au> skrev:
> > boB Stepp <robertvst...@gmail.com> writes:
>
>
> > He blithely conflates “weakly typed” (Python objects are not weakly, but
> > very strongly typed)
>
>  Python is more strongly typed than PHP, but that doesn't really say much.
> However, compared to a language like C, there are big holes in the type
> safety.
>
> >>> alist = [1, 'two', ('three', four), 5*'#']
>
>  That list is not only weakly typed, but rather untyped. There are no
> safeguards in the language, that enforce that all elements in a list or
> other container are in fact of the same type. Before type annotations and
> mypy, I could not enforce that other than at runtime.
>

You couldn't have got the above much more wrong.

As others have said, typing is about how the underlying memory is treated.

I can't comment on PHP typing, as I've actively avoided that language since
my first experience with it.

C is statically and weakly typed. Variables know their types at compile
time (static typing). It is a feature of the language that you can cast any
pointer to any chunk of memory to be a pointer to any other type (normally
via void *). This is not coercion - it takes the bit pattern of memory of
one type and interprets it as the bit pattern for another type, and is weak
typing.

Python is strongly and dynamically typed. In Python, once you create an
object, it remains that type of object, no matter what you do to it*. That
makes it strongly typed. Python does not have variables - it instead has
names with no type information at compile time. That makes it dynamically
typed. In your list example, each element of the list is a name - the
element itself doesn't have a type, but the object it names does.

* In some cases it is possible to change the __class__ of an object, but
that can only be done in restricted circumstances and will usually result
in runtime exceptions unless you've specifically planned your class
hierarchy to do it. The cases where it is possible to change the __class__
do not result in reinterpretation of memory bit patterns.

>>> (1.0).__class__ = int
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __class__ assignment only supported for heap types or ModuleType subclasses

In some implementations it is possible to subvert the Python typing system
by stepping out of Python code and into (for example) a C extension, but
that does not make Python *the language* weakly typed.
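
A small sketch of the distinction, as I understand it. The variable names
are mine, and the struct call is only there to mimic what a C pointer cast
does implicitly; it is an illustration, not a claim about how any
particular implementation works internally.

```python
import struct

# The list from the quoted example (with 4 in place of the undefined `four`).
alist = [1, 'two', ('three', 4), 5 * '#']
element_types = [type(item).__name__ for item in alist]
print(element_types)               # ['int', 'str', 'tuple', 'str']

# Strong typing: objects are not silently reinterpreted as another type.
try:
    1 + '1'
except TypeError as exc:
    mixed_error = exc
    print('TypeError:', exc)

# Even explicitly changing the class of a float is refused.
try:
    (1.0).__class__ = int
except TypeError as exc:
    class_error = exc
    print('TypeError:', exc)

# Reinterpreting a bit pattern (what a C cast does) has to be spelled out:
# pack 1.0 as a 32-bit float, then read the same 4 bytes as an unsigned int.
bits = struct.unpack('<I', struct.pack('<f', 1.0))[0]
print(hex(bits))                   # 0x3f800000
```

The container holds references to objects of any type, but each object
keeps its own type for its whole life.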

Tim Delaney


Re: Where has the practice of sending screen shots as source code come from?

2018-01-28 Thread Tim Delaney
On 29 January 2018 at 11:27, Steven D'Aprano <
steve+comp.lang.pyt...@pearwood.info> wrote:

> On Mon, 29 Jan 2018 08:55:54 +1100, Tim Delaney wrote:
>
> > I got back a Word document containing about 10 screenshots where they'd
> > apparently taken a screenshot, moved the horizontal scrollbar one
> > screen, taken another screenshot, etc.
>
> You're lucky they didn't just take a single screen shot, thinking that
> you can scroll past the edges to see what is off-screen.


I suspect that was the case with the original screenshot in the Word
document ...

Tim Delaney


Re: Where has the practice of sending screen shots as source code come from?

2018-01-28 Thread Tim Delaney
On 29 January 2018 at 02:04, Steven D'Aprano <
steve+comp.lang.pyt...@pearwood.info> wrote:

> I'm seeing this annoying practice more and more often. Even for trivial
> pieces of text, a few lines, people post screenshots instead of copying
> the code.


I don't tend to see this from programmers I work with, but I'm constantly
having to deal with support tickets where the person raising the ticket put
a screenshot of something like a console or grid output of an SQL tool or
even a logfile opened in a text editor ... Even worse, usually they'll
paste the screenshot into a Word document first (which then causes
difficulties to view the screenshot due to page width, etc).

I had one case the other day where they'd taken a screenshot of some of the
columns of the output of an SQL query and pasted it into a Word document. I
specifically asked them not to do this, explained that the tool they were
using could export to CSV and that would be much more useful as I could
search it, etc. I offered to walk them through how to do the CSV export.
And I requested that they send me the entire output (all columns) of the
SQL output.

I got back a Word document containing about 10 screenshots where they'd
apparently taken a screenshot, moved the horizontal scrollbar one screen,
taken another screenshot, etc.

These are support people who are employed by the company I'm contracted to.
Doesn't matter how often I try to train them otherwise, this type of thing
keeps happening.

BTW: I have nothing to do with the final persistence format of the data,
but in practice I've had to learn the DB schema and stored procedures for
everything I support. Strangely the DB team don't have to learn my parts ...

Tim Delaney


Re: Lies in education [was Re: The "loop and a half"]

2017-10-05 Thread Tim Delaney
On 5 October 2017 at 14:22, Steve D'Aprano <steve+pyt...@pearwood.info>
wrote:

> The A and E in the word "are" are not vowels, since they are silent. The U
> in "unicorn" and "university" are not vowels either, and if you write "an
> unicorn" you are making a mistake.
>

There are dialects of English where the "u" in unicorn or university would
be pronounced "oo" (e.g. where heavily influenced by Spanish or
Portuguese), in which case writing "an unicorn" would not necessarily be a
mistake.

For a similar example, I and most Australians would say that writing "an
herb" is a mistake since we pronounce the "h", but millions of people
elsewhere would disagree with us.

Tim Delaney


Re: scandir slower than listdir

2017-07-20 Thread Tim Delaney
On 20 July 2017 at 21:43, Skip Montanaro <skip.montan...@gmail.com> wrote:

> > scandir returns an iterator of DirEntry objects which contain more
> > information than the mere name.
> >
>
> As I recall, the motivation for scandir was to avoid subsequent system
> calls, so it will be slower than listdir the way you've tested it. If you
> add in the cost of fetching the other bits Terry mentioned, I suspect your
> relative timing will change.
>

In addition, listdir() returns a list of names, so building a new list from
that is fairly fast (can use a single allocation of the correct size).
scandir() returns an iterator, so building a list from that may require
multiple reallocations (depending on the number of entries in the
directory), which could skew the test results.

In neither case is building a list from the result the way you would
normally use it. A more accurate test of the way both functions would
normally be used would be to iterate over the results instead of eagerly
building a list. In this test you would also expect scandir() to use less
memory for a large directory.
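
A rough sketch of the kind of test I mean, iterating over the results
rather than building lists. The function names, the directory size and the
repeat count are all arbitrary choices of mine, and I make no claim about
what timings you will see on your system.

```python
import os
import tempfile
import timeit


def names_via_listdir(path):
    """Iterate over the names from os.listdir() without building a second list."""
    return sum(1 for _name in os.listdir(path))


def names_via_scandir(path):
    """Iterate over os.scandir() lazily, never materialising a list."""
    with os.scandir(path) as entries:
        return sum(1 for _entry in entries)


def files_via_listdir(path):
    """Count regular files; needs an extra system call per name."""
    return sum(1 for name in os.listdir(path)
               if os.path.isfile(os.path.join(path, name)))


def files_via_scandir(path):
    """Count regular files; DirEntry.is_file() can often avoid the extra call."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_file())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(500):
            open(os.path.join(tmp, f"file{i}.txt"), "w").close()
        for func in (names_via_listdir, names_via_scandir,
                     files_via_listdir, files_via_scandir):
            seconds = timeit.timeit(lambda: func(tmp), number=50)
            print(f"{func.__name__}: {seconds:.4f}s")
```

The first pair only touches names, which is the case that favours
listdir(); the second pair asks for per-entry information, which is what
scandir() was designed for.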

Tim Delaney


Re: Awful code of the week

2016-08-07 Thread Tim Delaney
On 7 August 2016 at 16:54, Steven D'Aprano <
steve+comp.lang.pyt...@pearwood.info> wrote:

> Seen in the office IRC channel:
>
>
> (13:23:07) fred: near_limit = []
> (13:23:07) fred: near_limit.append(1)
> (13:23:07) fred: near_limit = len(near_limit)
> (13:23:09) fred: WTF
>

Assuming you're not talking about the semantics of this specific code (which
could be reduced to near_limit = 1), I have to say I've been guilty of
reusing a name in precisely this way - set up the structure using the name,
then set the name to a calculated value.

I would only do this when I'm totally uninterested in the structure itself
- it's purely a way to get to the final result, but getting to that final
result requires several steps that justifies an intermediate name binding.
It avoids namespace pollution (but of course that can be fixed with del)
and avoids having to think up another name for a very temporary structure
(names are hard). And I would include a comment explaining the reuse of the
name.

The alternative would be something like (replace first line by something
complex ...):

near_limit_list = [1]
near_limit = len(near_limit_list)
del near_limit_list

Tim Delaney


Re: [OT] Compression of random binary data

2016-07-13 Thread Tim Delaney
On 14 July 2016 at 05:35, Marko Rauhamaa <ma...@pacujo.net> wrote:

> Michael Torrie <torr...@gmail.com>:
> > If the data is truly random then it does not matter whether you have 5
> > bytes or 5 GB. There is no pattern to discern, and having more chunks
> > of random data won't make it possible to compress.
>
> That's true if "truly random" means "evenly distributed". You might have
> genuine random numbers with some other distribution, for example
> Gaussian: <https://www.random.org/gaussian-distributions/>. Such
> sequences of random numbers may well be compressible.
>

No one is saying that *all* random data is incompressible - in fact, some
random samples are *very* compressible. A single sample of random data
might look very much like the text of "A Midsummer Night's Dream"
(especially if your method of choosing the random sample was to pick a book
off a library shelf).

But unless otherwise qualified, a claim of being able to compress random
data is taken to mean any and all sets of random data.
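
A quick sketch of both halves of that, using zlib as a stand-in for "a
compressor" (my choice, not something from the thread). The sample sizes
and names are arbitrary. The last part is the usual counting argument:
there are fewer shorter bit strings than there are inputs, so no lossless
scheme can shrink every input.

```python
import os
import zlib


def ratio(data):
    """Compressed size divided by original size (below 1.0 means it shrank)."""
    return len(zlib.compress(data, 9)) / len(data)


random_sample = os.urandom(100_000)
# One very compressible "sample": a line from A Midsummer Night's Dream, repeated.
text_sample = b"Lord, what fools these mortals be! " * 3000

print(f"random bytes:  {ratio(random_sample):.3f}")
print(f"repeated text: {ratio(text_sample):.3f}")

# Counting argument for n-bit inputs: 2**n possible inputs, but only
# 2**0 + 2**1 + ... + 2**(n-1) = 2**n - 1 strictly shorter bit strings.
n = 16
inputs = 2**n
shorter_outputs = sum(2**k for k in range(n))
print(inputs, shorter_outputs)     # 65536 65535
```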

Anyway, that's going to be my only contribution to this thread.

Tim Delaney


Re: [OT] Java generics (was: Guido sees the light: PEP 8 updated)

2016-04-18 Thread Tim Delaney
On 18 April 2016 at 09:30, Chris Angelico <ros...@gmail.com> wrote:

> On Mon, Apr 18, 2016 at 8:02 AM, Tim Delaney
> <timothy.c.dela...@gmail.com> wrote:
> > I also wouldn't describe Java as a
> > "perfectly good language" - it is at best a compromise language that just
> > happened to be heavily promoted and accepted at the right time.
> >
> > Python is *much* closer to my idea of a perfectly good language.
>
> "Java" was originally four related, but separate, concepts: a source
> language, a bytecode, a sandboxing system, and one other that I can't
> now remember.


I was very specifically referring to Java the language. The JVM is fairly
nice, especially with the recent changes specifically aimed at more easily
supporting dynamic languages.

Speaking of JVM changes - I had to take over support of a chat applet
developed by a contractor built on Java 1.0. We just could not get it to
work reliably (this was for the original Foxtel web site - I remember
trying to keep it up and running while Richard Fidler was doing a promoted
chat session ...). Then Java 1.1 was released and what a huge improvement
that was.

Tim Delaney


[OT] Java generics (was: Guido sees the light: PEP 8 updated)

2016-04-17 Thread Tim Delaney
On 17 April 2016 at 23:38, Ian Kelly <ian.g.ke...@gmail.com> wrote:


> > Java generics ruined a perfectly good language. I mean:
>
> The diamond operator in JDK 7 makes this a lot more tolerable, IMO:
>
> Map<AccountManager, List> customersOfAccountManager =
> new HashMap<>();
>

To some extent - you can't use the diamond operator when creating an
anonymous subclass, and you often need to explicitly specify the types for
generic methods. The inference engine is fairly limited.

I wouldn't say generics ruined Java - they made it better in some ways (for
a primarily statically-typed language) but worse in others (esp. that
they're implemented by erasure). I also wouldn't describe Java as a
"perfectly good language" - it is at best a compromise language that just
happened to be heavily promoted and accepted at the right time.

Python is *much* closer to my idea of a perfectly good language.

Tim Delaney


Re: Guido sees the light: PEP 8 updated

2016-04-16 Thread Tim Delaney
On 17 April 2016 at 07:50, Tim Chase <python.l...@tim.thechases.com> wrote:

> On 2016-04-17 06:08, Ben Finney wrote:
> > Larry Martell <larry.mart...@gmail.com> writes:
> > > if we still had 1970's 80 character TTYs that would matter but on
> > > my 29" 1920x1080 screen it doesn't.
> >
> > Larry, you've been around long enough to know that's not an argument
> > against a limited line length for code. It is not about the
> > technology of your terminal. It's about the technology of the brain
> > reading the text.
>
> But just in case you do want to consider hardware limits, I do some
> of my coding on my phone & tablet, both of which are ~80 characters
> wide at best (or less if I use the phone in portrait mode).  I also do
> some editing/diffing within a cmd.exe window on Windows which is
> limited to 80 characters unless you do some hijinks in the settings
> to expand it.
>

Personally, I've given up on 80 characters (or even 120 in rare cases) for
Java code (esp method declarations), where just specifying the generics can
often take almost that much.

But for Python code it's generally fairly easy to break a line in a natural
place.

Tim Delaney


Re: How to read from a file to an arbitrary delimiter efficiently?

2016-02-28 Thread Tim Delaney
On 29 February 2016 at 07:28, Oscar Benjamin <oscar.j.benja...@gmail.com>
wrote:

> On 25 February 2016 at 06:50, Steven D'Aprano
> <steve+comp.lang.pyt...@pearwood.info> wrote:
> >
> > I have a need to read to an arbitrary delimiter, which might be any of a
> > (small) set of characters. For the sake of the exercise, lets say it is
> > either ! or ? (for example).
> >
> > I want to read from files reasonably efficiently. I don't mind if there
> > is a little overhead, but my first attempt is 100 times slower than the
> > built-in "read to the end of the line" method.
>
> You can get something much faster using mmap and searching for a
> single delimiter:
>
> My timing makes that ~7x slower than iterating over the lines of the
> file but still around 100x faster than reading individual characters.
> I'm not sure how to generalise it to looking for multiple delimiters
> without dropping back to reading individual characters though.
>

You can use an mmapped file as the input for regular expressions. May or
may not be particularly efficient.

Otherwise, if reading from a file, I think reading a chunk and then calling
seek() to reposition just after the delimiter is probably going to be most
efficient, whilst leaving the file position where the caller expects it.

If reading from a stream, I think Chris' approach is the way to go: read a
chunk, maintain an internal buffer, and don't give access to the underlying
stream.
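
A minimal sketch of the read-a-chunk-and-seek-back idea for a seekable
binary file. The function name, the default delimiters and the chunk size
are my own choices, and I haven't benchmarked it against the mmap version.

```python
import io
import re


def read_until(f, delimiters=b"!?", chunk_size=4096):
    """Read from seekable binary file f up to and including the first delimiter.

    Leaves the file position just after the delimiter. Returns whatever is
    left if no delimiter is found, and b"" at end of file.
    """
    pattern = re.compile(b"[" + re.escape(delimiters) + b"]")
    pieces = []
    while True:
        start = f.tell()
        chunk = f.read(chunk_size)
        if not chunk:
            break                              # end of file, no delimiter found
        match = pattern.search(chunk)
        if match:
            pieces.append(chunk[:match.end()])
            f.seek(start + match.end())        # rewind to just after the delimiter
            break
        pieces.append(chunk)
    return b"".join(pieces)


# A deliberately tiny chunk size so that a record spans several chunks.
source = io.BytesIO(b"hello! world? rest")
print(read_until(source, chunk_size=4))        # b'hello!'
print(read_until(source, chunk_size=4))        # b' world?'
print(read_until(source, chunk_size=4))        # b' rest'
```

Over-reading costs one seek() per record, but the caller can keep using
the file object normally between calls.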

Tim Delaney


Re: What use for reversed()?

2015-05-31 Thread Tim Delaney
On 1 June 2015 at 05:40, fl rxjw...@gmail.com wrote:

> Hi,
>
> I have a string b='1234'. I run: br=reversed(b)
>
> I hope that I can print out '4321' by:
>
> for br in b
>
> but it complains:
> SyntaxError: invalid syntax


Any time you get a SyntaxError, it means that you have coded something
which does not match the specified syntax of the language version.

Assuming you copied and pasted the above, I can see an error:

for br in b

The for statement must have a colon at the end of line e.g. a complete for
statement and block is:

for br in b:
    print br

This will output the characters one per line (on Python 3.x), since that is
what the reversed() iterator will return. You will need to do something
else to get it back to a single string.

Have you read through the python tutorials?

https://docs.python.org/3/tutorial/

or for Python 2.x:

https://docs.python.org/2/tutorial/

Tim Delaney


Re: What use for reversed()?

2015-05-31 Thread Tim Delaney
On 1 June 2015 at 10:30, Mark Lawrence breamore...@yahoo.co.uk wrote:

> On 01/06/2015 00:23, Tim Delaney wrote:
>
> > The for statement must have a colon at the end of line e.g. a complete
> > for statement and block is:
> >
> > for br in b:
> >     print br
> >
> > This will output the characters one per line (on Python 3.x), since that
> > is what the reversed() iterator will return. You will need to do
> > something else to get it back to a single string.
>
> Will it indeed?  Perhaps fixing the syntax error will get something to
> print :)


Indeed - as Mark is so gently alluding to, I've done the reverse of what I
said - given Python 2.x syntax instead of Python 3.x.

That should have been:

for br in b:
    print(br)
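
For completeness, a short Python 3 sketch of the "something else" needed
to get a single string back. The names joined and sliced are just mine for
illustration.

```python
b = '1234'

# reversed() gives a one-shot iterator over the characters, last to first.
br = reversed(b)
for ch in br:
    print(ch)                      # 4, 3, 2, 1 on separate lines

# Join the characters to get one string back ...
joined = ''.join(reversed(b))
print(joined)                      # 4321

# ... or use a slice with a negative step.
sliced = b[::-1]
print(sliced)                      # 4321
```

Note that the loop has to iterate over br (or reversed(b)), not over b
itself.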

Tim Delaney


Re: Delivery Status Notification (Failure)

2015-05-11 Thread Tim Delaney
On 9 May 2015 at 13:56, Chris Angelico ros...@gmail.com wrote:


> Yeah, I know, shocking. But I wanted to at least *try* doing the
> normal and official thing, in the hopes that they were a legit company
> that perhaps didn't realize what this looked like.


By all means report to abuse@ in the future, but please do not CC the list.
My spam filters have learned to filter out most job spam automatically by
now, but it doesn't filter out your reply.

Tim Delaney


Re: (Still OT) Nationalism, language and monoculture [was Re: Python Worst Practices]

2015-03-04 Thread Tim Delaney
On 5 March 2015 at 07:11, Steven D'Aprano 
steve+comp.lang.pyt...@pearwood.info wrote:


> As for your comments about spoken accents, I sympathise. But changing
> accents is very hard for most people (although a very few people find it
> incredibly easy). Even professionals typically need to have voice coaches
> to teach them to change accents successfully. One of the problems is that
> most people don't hear their own accent. My wife usually has a fairly
> generic English accent that most people think is American, but within
> seconds of beginning to talk to another Irish person she is speaking in a
> full-blown Irish accent, and she is *completely* unaware of it.


This is very much the case - any time someone is reacquainted with their
native accent they tend to strongly slip back into it, and it takes some
time to get their more neutral accent back.

A related thing is when you have multiple multi-lingual people talking
together where at least two of their languages match (or are close enough
for most uses e.g. Spanish and Portuguese). They'll slip in and out of
multiple languages depending on which best expresses what they're trying to
say, and no one involved will realise.

Tim Delaney


Re: (Still OT) Nationalism, language and monoculture [was Re: Python Worst Practices]

2015-03-04 Thread Tim Delaney
On 5 March 2015 at 09:39, Emile van Sebille em...@fenx.com wrote:

> On 3/4/2015 12:40 PM, Tim Delaney wrote:
>
> > A related thing is when you have multiple multi-lingual people talking
> > together where at least two of their languages match (or are close
> > enough for most uses e.g. Spanish and Portuguese). They'll slip in and
> > out of multiple languages depending on which best expresses what they're
> > trying to say, and no one involved will realise.
>
> Except for my poor grandmother who hadn't understood a word my mother had
> said the previous ten minutes.  :)


The phenomenon I'm talking about involves people switching languages
mid-sentence without the participants noticing. It mainly occurs with
people who grew up speaking multiple languages, and commonly switch between
them in their thoughts. If your grandmother learned her second/third/etc
languages after she was a teenager then it's likely she mainly thinks in
one language and translates to others.

It can also be seen with people who have recently had long-term saturation
exposure to a second language - for example, exchange students who have
just come back from a year's stay. When I'd recently returned from Brasil
(20-odd years ago now ...) there was one time when everyone was a native
(Australian) English speaker and had a mix of Latin-based second languages -
that was close enough for it to happen.

Tim Delaney


Re: Style question: Importing modules from packages - 'from' vs 'as'

2014-12-03 Thread Tim Delaney
On 3 December 2014 at 22:02, Chris Angelico ros...@gmail.com wrote:


> import os.path as path
> from os import path


Bah - deleted the list and sent directly to Chris ... time to go to bed.

The advantage of the former is that if you want to use a different name,
it's a smaller change. But the disadvantage of the former is that if you
*don't* want to rename, it violates DRY (don't repeat yourself).

The difference is so marginal that I'd leave it to personal preference, and
wouldn't pull someone up for either in a code review.
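
For what it's worth, a quick sketch showing that the two forms bind the
same module object, and what the rename looks like in each. The aliases
path_via_as, path_via_from and osp are just names I made up for this.

```python
import os

# The two spellings under discussion, bound to different names so both can
# be compared side by side.
import os.path as path_via_as
from os import path as path_via_from

same_module = path_via_as is path_via_from is os.path
print(same_module)                          # True

# Renaming: the first form only changes the alias, the second grows one.
import os.path as osp
from os import path as osp

print(osp.join('a', 'b') == os.path.join('a', 'b'))    # True
```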

Tim Delaney


Re: Classes

2014-11-02 Thread Tim Delaney
On 2 November 2014 20:50, Denis McMahon denismfmcma...@gmail.com wrote:


> The question (I thought) was to write a class for Square that inherited a
> class Rectangle but imposed on it the additional constraints of a square
> over a rectangle, namely that length == width.


I'm late to the party and this has already been partially addressed in the
thread, but it always annoys me. A square is as much a rhombus with 90
degree angles as it is a rectangle with equal length and width, and yet I
*never* see the former given as an option.

Of course, that's probably because rectangles have a multitude of uses for
user interfaces, whilst other quadrilaterals are somewhat less useful.

Tim Delaney


Re: Lazy-evaluation lists/dictionaries

2014-10-26 Thread Tim Delaney
On 27 October 2014 01:14, Jon Ribbens jon+use...@unequivocal.co.uk wrote:

> I have a need, in a Python C extension I am writing, for lists and
> dictionaries with lazy evaluation - by which I mean that at least
> some of the values in the lists/dictionaries are proxy objects
> which, rather than returning as themselves, should return the thing
> they are a proxy for when retrieved. This is because retrieving
> the proxied objects is expensive and only a small minority of them
> will actually be accessed, so retrieving them all before they are
> actually accessed is massively inefficient.


Why not put proxy objects into the list/dict? Have a look at the weakref
module for an API that may be suitable for such proxy objects (if you used
the same API, that would also allow you to transparently use weakrefs in
your lists/dicts).
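
A rough Python-level sketch of what I mean (the real thing would live in
the C extension). LazyRef, resolve, Record and fetch are hypothetical
names of mine; the only real API being borrowed is the weakref.ref
convention that you call the reference to get the referent.

```python
import weakref


class LazyRef:
    """Hypothetical lazy proxy: call it to get the real object, as with weakref.ref."""

    def __init__(self, loader):
        self._loader = loader      # zero-argument callable doing the expensive fetch
        self._loaded = False
        self._value = None

    def __call__(self):
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
            self._loader = None    # the loader is no longer needed once it has run
        return self._value


def resolve(item):
    """Return the referent for proxies and weakrefs, or the item itself otherwise."""
    if isinstance(item, (LazyRef, weakref.ref)):
        return item()
    return item


class Record:
    """Stand-in for an expensive object (plain instances can be weakly referenced)."""

    def __init__(self, key):
        self.key = key


fetched = []                       # records which keys were actually loaded


def fetch(key):
    fetched.append(key)
    return Record(key)


keep_alive = Record("cached")
items = [LazyRef(lambda: fetch("a")), LazyRef(lambda: fetch("b")),
         weakref.ref(keep_alive), 42]

print(resolve(items[0]).key)       # a
print(resolve(items[2]).key)       # cached
print(resolve(items[3]))           # 42
print(fetched)                     # ['a']
```

Only the entry that was actually asked for gets fetched, and the same
calling code copes with weakrefs and plain values.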

Tim Delaney


Re: [OT] spelling colour / color was Re: Toggle

2014-10-09 Thread Tim Delaney
On 10 October 2014 05:24, duncan smith buzzard@invalid.invalid wrote:

> On 09/10/14 18:43, mm0fmf wrote:
> > On 09/10/2014 02:29, Steven D'Aprano wrote:
> > > Apart from the horrible spelling of colour :-)
> >
> > I've always spelt colour as color when programming and as colour
> > when writing language including documentation about software.
> >
> > colour in a programme doesn't seem right.
> >
>
> Even in British English that is usually spelt 'program' (from the US
> spelling, of course). Let's not cave in on 'colour' too. It's bad enough
> that we can't use 'whilst' loops :-).


That would be a theatre programme vs a computer program.

I try to stick with the current spelling style when modifying existing code
- esp. for APIs. It's very annoying to have some methods use z and others
s in the same package. So since I'm currently working for a US company I
have to consciously remind myself to use their abominations ;)

Tim Delaney


Re: [OT] Question about Git branches

2014-09-16 Thread Tim Delaney
On 16 September 2014 22:14, Steven D'Aprano 
steve+comp.lang.pyt...@pearwood.info wrote:

> Chris Angelico wrote:
>
> > On Tue, Sep 16, 2014 at 6:21 PM, Marko Rauhamaa ma...@pacujo.net
> > wrote:
> > > Frank Millman fr...@chagford.com:
> > >
> > > > You are encouraged to make liberal use of 'branches',
> > >
> > > Personally, I only use forks, IOW, git clone. I encourage that
> > > practice. Then, there is little need for git checkout. Instead, I just
> > > cd to a different directory.
> > >
> > > Branches and clones are highly analogous processwise; I would go so far
> > > as to say that they are redundant.
> >
> > But rather than listening to, shall we say, *strange* advice like
> > this, Frank, you'll do well to pick up a reliable git tutorial, which
> > should explain branches, commits, the working tree, etc, etc, etc.
>
> Isn't this strange advice standard operating procedure in Mercurial? I'm
> not an expert on either hg or git, but if I've understood hg correctly, the
> way to begin an experimental branch is to use hg clone.


It depends entirely on how you're comfortable working. I tend to have a
clone per feature branch (they all push to the same central repo) and then
create a named branch per task (which may be a prototype, bugfix,
enhancement, whatever).

Makes it very easy to switch between tasks - I just update to a different
changeset (normally the tip of a named branch) and force a refresh in my
IDE. When I'm happy, I merge into the feature branch, then pull the
necessary changesets into other feature branch repos to merge/graft as
appropriate.

Branches and clones are two different ways of organising, and I find that
things work best for me when I use both.

Tim Delaney


Re: [OT] Question about Git branches

2014-09-16 Thread Tim Delaney
On 17 September 2014 02:25, Chris Angelico ros...@gmail.com wrote:

> On Wed, Sep 17, 2014 at 2:08 AM, Robert Kern robert.k...@gmail.com
> wrote:
> > Yes, but this is due to different design decisions of git and Mercurial.
> > git prioritized the multiple branches in a single clone use case;
> > Mercurial prioritized re-cloning. It's natural to do this kind of
> > branching in git, and more natural to re-clone in Mercurial.


I disagree that it's more natural to re-clone in Mercurial. It's just that
the preferred workflow of the Mercurial developers is to use clones,
anonymous branches and bookmarks (almost the same as git branches) rather
than named branches - and so that is the workflow that is most associated
with using Mercurial.

Mercurial fully supports multiple lines of development by:

1. cloning;

2. anonymous branching (i.e. multiple heads on the same branch) - normally
combined with bookmarks;

3. named branching (the branch name is an integral part of the commit);

4. all of the above combined.

Eventually if you want to merge between lines of development then you end
up with multiple branches (either anonymous or named) in the one repo.


> Ah, I wasn't aware of that philosophical difference. Does hg use
> hardlinks or something to minimize disk usage when you clone, or does
> it actually copy everything? (Or worse, does it make the new directory
> actually depend on the old one?)


If you clone a repo to the same filesystem (e.g. the same disk partition)
then Mercurial will use hardlinks for the repository files (i.e. things in
.hg). This means that clones are quick (although if you don't prevent
updating the working directory while cloning that can take some time ...).

Hardlinks may be broken any time changesets are added to the repo e.g. via
a commit or pull. Only the hardlinks involved in the commit (and the
manifest) will be broken.

Mercurial provides a standard extension (relink) to re-establish hardlinks
between identical storage files. For example, running hg relink in my
current feature branch repo:

[feature_branch_repo:65179] [feature_branch] hg relink default
relinking d:\home\repos\feature_branch_repo\.hg/store to
d:\home\repos\default_repo\.hg/store
tip has 22680 files, estimated total number of files: 34020
collected 229184 candidate storage files
pruned down to 49838 probably relinkable files
relinked 359 files (221 MB reclaimed)

Tim Delaney


Re: hg, git, fossil, ...

2014-08-28 Thread Tim Delaney
On 29 August 2014 02:32, Tim Chase python.l...@tim.thechases.com wrote:


> No, you wouldn't use hg pull nor git pull but rather git
> cherry-pick or what Mercurial calls transplant (I've not used this
> in Mercurial, but I believe it's an extension).


hg transplant has been deprecated for a long time now. The correct command
for cherry-picking is hg graft.

I do sometimes miss the ability to easily cherry-pick the changes in a
single file. When grafting, you graft the entire revision, and then need to
revert individual files and amend the changeset if you don't want the graft
as-is. It's a bit messy, and could cause problems if you later do a merge
that includes the originally-grafted changeset on top of the amended
changeset (since the changes committed to the amended changeset will be
considered during the merge).

Tim Delaney


Re: Why Python 4.0 won't be like Python 3.0

2014-08-18 Thread Tim Delaney
On 19 August 2014 00:51, Grant Edwards invalid@invalid.invalid wrote:

> On 2014-08-17, Mark Lawrence breamore...@yahoo.co.uk wrote:
> > A blog from Nick Coghlan
> > http://www.curiousefficiency.org/posts/2014/08/python-4000.html that
> > should help put a few minds to rest.
>
> I agree with the comments that the appellation for simply the next
> version after 3.9 should be 3.10 and not 4.0.  Everybody I know
> considers SW versions numbers to be dot-separated tuples, not
> floating point numbers.
>
> To all of us out here in user-land a change in the first value in the
> version tuple means breakage and incompatibilities. And when the
> second value is 0, you avoid it until some other sucker has found
> the bugs and a few more minor releases have come out.


No. A major version increase *may* introduce breakage and
incompatibilities. It does not mean that it *has* to introduce breakage and
incompatibilities. If the major version increase is documented as just
being the next version then there's no reason to avoid it - unless your
policy is "wait for the first patch release", i.e. never take major.minor.0
but always wait for major.minor.1.

What is more important is that minor and patch version increases should
avoid introducing breakage and incompatibilities wherever possible
(security fixes are one reason to allow incompatibility in a minor release).
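
As a small illustration of the "dot-separated tuples, not floating point numbers" view quoted above, here is a minimal sketch in Python. The helper name parse_version is made up for this example, and it assumes purely numeric components.

```python
import sys


def parse_version(text):
    """Split a dotted version string into a tuple of ints (hypothetical helper)."""
    return tuple(int(part) for part in text.split("."))


# Compared as tuples, 3.10 comes after 3.9 ...
print(parse_version("3.10") > parse_version("3.9"))

# ... but read as a float, "3.10" is just 3.1, which sorts before 3.9.
print(float("3.10") > float("3.9"))

# The interpreter exposes its own version as a tuple-like object too.
print(sys.version_info[:2] >= (3, 0))
```

Tuple comparison is element by element, so the first value dominates and the second only matters when the first values are equal.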

BTW I agree with the idea that 4.0 would be an appropriate time to remove
anything that has been deprecated for the requisite number of versions.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python and IDEs [was Re: Python 3 is killing Python]

2014-07-19 Thread Tim Delaney
On 20 July 2014 04:08, C.D. Reimer ch...@cdreimer.com wrote:

 On 7/19/2014 12:28 AM, Steven D'Aprano wrote:

 Earlier, I mentioned a considerable number of IDEs which are available
 for Python, including:


 I prefer to use Notepad++ (Windows) and TextWrangler (Mac). Text editors
 with code highlighting can get the job done as well, especially if the
 project is modest and doesn't require version control.


IMO there is no project so modest that it doesn't require version control.
Especially since version control is as simple as:

cd project
hg init
hg add
hg commit

FWIW I also don't find a need for an IDE for Python - I'm quite happy using
EditPlus (which I preferred enough to other alternatives on Windows to pay
for many years ago).

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python and IDEs [was Re: Python 3 is killing Python]

2014-07-19 Thread Tim Delaney
On 20 July 2014 09:19, Chris Angelico ros...@gmail.com wrote:

 On Sun, Jul 20, 2014 at 7:50 AM, Tim Delaney
 timothy.c.dela...@gmail.com wrote:
  IMO there is no project so modest that it doesn't require version
 control.
  Especially since version control is as simple as:
 
  cd project
  hg init
  hg add
  hg commit

 That said, though, there are some projects so modest they don't
 require dedicated repositories. I have a repo called shed - it's a
 collection of random tools that I've put together, no more logical
 grouping exists.


Agreed. I have a "utils" one - but I do like "shed" and think I'm going to
rename :)

The main thing is that versioning should be automatic now - it's almost
free, and the benefits are huge because even trivial scripts end up
evolving.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python and IDEs [was Re: Python 3 is killing Python]

2014-07-19 Thread Tim Delaney
On 20 July 2014 11:53, C.D. Reimer ch...@cdreimer.com wrote:


 On 7/19/2014 6:23 PM, Steven D'Aprano wrote:

 I haven't used Python on Windows much, but when I did use it, I found the
 standard Python interactive interpreter running under cmd.exe to be bare-
 bones but usable for testing short snippets. If I recall correctly, it is
 missing any sort of command history or line editing other than backspace,
 which I guess it would have been painful to use for extensive interactive
 work, but when I started using Python on Linux the interactive interpreter
 had no readline support either so it was just like old times :-)


 Windows PowerShell supports very basic Linux commands and has a command
 history. I'm always typing ls for a directory listing when I'm on a
 Windows machine. The regular command line would throw a DOS fit. PowerShell
 lets me get away with it.

 http://en.wikipedia.org/wiki/Windows_PowerShell#Comparison_of_cmdlets_with_similar_commands

 I prefer working on my vintage 2006 Black MacBook. Alas, the CPU fan is
 dying and MacBook shuts down after 15 minutes. I'm surprised at how well I
 was able to set up a equivalent programming environment on Windows.


I advise anyone who works cross-platform to install MSYS on their Windows
boxes (for the simplest, most consistent behaviour ignore rxvt and just
launch bash -l -i directly). Or use cygwin if you prefer.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: PyPy3 2.3.1 released

2014-06-20 Thread Tim Delaney
Congratulations.

I can't find the details of PyPy3's unicode implementation documented
anywhere. Is it equivalent to:

- a Python 3.2 narrow build
- a Python 3.2 wide build
- PEP 393
- something else?
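
For context, a rough way to probe this from inside an interpreter is sketched below. It is only an assumption-laden check: sys.maxunicode separates a narrow build from the rest, but it cannot tell a wide build from a PEP 393 one. The function name describe_unicode_build is invented for the example.

```python
import sys


def describe_unicode_build():
    """Rough classification based on sys.maxunicode (hypothetical helper)."""
    if sys.maxunicode == 0xFFFF:
        return "narrow (UTF-16 code units)"
    return "wide or PEP 393 (full code point range)"


astral = "\U0001F600"  # a code point outside the Basic Multilingual Plane

print(describe_unicode_build())
# On a narrow build this is 2 (a surrogate pair); otherwise it is 1.
print(len(astral))
```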

Cheers,

Tim Delaney



On 21 June 2014 06:32, Philip Jenvey pjen...@underboss.org wrote:

 =====================
 PyPy3 2.3.1 - Fulcrum
 =====================

 We're pleased to announce the first stable release of PyPy3. PyPy3
 targets Python 3 (3.2.5) compatibility.

 We would like to thank all of the people who donated_ to the `py3k
 proposal`_
 for supporting the work that went into this.

 You can download the PyPy3 2.3.1 release here:

 http://pypy.org/download.html#pypy3-2-3-1

 Highlights
 ==========

 * The first stable release of PyPy3: support for Python 3!

 * The stdlib has been updated to Python 3.2.5

 * Additional support for the u'unicode' syntax (`PEP 414`_) from Python 3.3

 * Updates from the default branch, such as incremental GC and various JIT
   improvements

 * Resolved some notable JIT performance regressions from PyPy2:

  - Re-enabled the previously disabled collection (list/dict/set) strategies

  - Resolved performance of iteration over range objects

  - Resolved handling of Python 3's exception __context__ unnecessarily
    forcing frame object overhead

 .. _`PEP 414`: http://legacy.python.org/dev/peps/pep-0414/

 What is PyPy?
 =============

 PyPy is a very compliant Python interpreter, almost a drop-in replacement
 for
 CPython 2.7.6 or 3.2.5. It's fast due to its integrated tracing JIT
 compiler.

 This release supports x86 machines running Linux 32/64, Mac OS X 64,
 Windows,
 and OpenBSD,
 as well as newer ARM hardware (ARMv6 or ARMv7, with VFPv3) running Linux.

 While we support 32 bit python on Windows, work on the native Windows 64
 bit python is still stalling, we would welcome a volunteer
 to `handle that`_.

 .. _`handle that`:
 http://doc.pypy.org/en/latest/windows.html#what-is-missing-for-a-full-64-bit-translation

 How to use PyPy?
 ================

 We suggest using PyPy from a `virtualenv`_. Once you have a virtualenv
 installed, you can follow instructions from `pypy documentation`_ on how
 to proceed. This document also covers other `installation schemes`_.

 .. _donated:
 http://morepypy.blogspot.com/2012/01/py3k-and-numpy-first-stage-thanks-to.html
 .. _`py3k proposal`: http://pypy.org/py3donate.html
 .. _`pypy documentation`:
 http://doc.pypy.org/en/latest/getting-started.html#installing-using-virtualenv
 .. _`virtualenv`: http://www.virtualenv.org/en/latest/
 .. _`installation schemes`:
 http://doc.pypy.org/en/latest/getting-started.html#installing-pypy


 Cheers,
 the PyPy team

 --
 Philip Jenvey

 --
 https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Micro Python -- a lean and efficient implementation of Python 3

2014-06-10 Thread Tim Delaney
On 11 June 2014 05:43, alister alister.nospam.w...@ntlworld.com wrote:


 Your error reports always seem to revolve around benchmarks despite speed
 not being one of Python's prime objectives


By his own admission, jmf doesn't use Python anymore. His only reason to
remain on this mailing list/newsgroup is to troll about the FSR. Please don't
reply to him (and preferably add him to your killfile).

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: try/except/finally

2014-06-10 Thread Tim Delaney
On 11 June 2014 10:00, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info
 wrote:

 On Wed, 11 Jun 2014 06:37:01 +1000, Chris Angelico wrote:

  I don't know
  a single piece of programming advice which, if taken as an inviolate
  rule, doesn't at some point cause suboptimal code.

 Don't try to program while your cat is sleeping on the keyboard.


Lying down, the weight is spread across the whole keyboard so you're
unlikely to suffer extra keypresses due to the cat. So if you're a
touch-typist that one may not be too bad (depending on how easily their fur
gets up your nose).

Now, a cat *standing* on the keyboard, between you and the monitor, and
rubbing his head against your hands, is a whole other matter.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python 3.2 has some deadly infection

2014-06-02 Thread Tim Delaney
On 2 June 2014 17:45, Wolfgang Maier 
wolfgang.ma...@biologie.uni-freiburg.de wrote:

 Tim Delaney timothy.c.delaney at gmail.com writes:

  For some purposes, there needs to be a way to treat an arbitrary stream
 of
 bytes as an arbitrary stream of 8-bit characters. iso-latin-1 is a
 convenient way to do that.
 

 For that purpose, Python3 has the bytes() type. Read the data as is, then
 decode it to a string once you figured out its encoding.


I know that, you know that. Convincing other people of that is the
difficulty.

I probably should have mentioned it, but in my case it's not even Python
(Java). It's exactly the same principle - an assumption was made that has
become entrenched due to the fear of breakage. If they'd been forced to
think about encodings up-front, it shouldn't have been an issue, which was
the point I was trying to make.

In Java, it's much worse. At least with Python you can perform string-like
operations on bytes. In Java you have to convert it to characters before
you can really do anything with it, so people just use the default encoding
all the time - especially if they want the convenience of line-by-line
reading using BufferedReader ...
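
To make the "string-like operations on bytes" point concrete, here is a minimal sketch. The sample data and field names are made up for illustration; the idea is simply to stay in bytes until the encoding is known.

```python
# Made-up sample: one Latin-1 encoded value plus an ASCII-only encoding hint.
raw = b"name=caf\xe9\nencoding=iso-8859-1\n"

# String-like operations work directly on bytes, no decoding needed.
lines = raw.splitlines()
fields = dict(line.split(b"=", 1) for line in lines)

# Decode only once the encoding is actually known.
encoding = fields[b"encoding"].decode("ascii")
name = fields[b"name"].decode(encoding)

print(encoding, name)
```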

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python 3.2 has some deadly infection

2014-06-01 Thread Tim Delaney
On 1 June 2014 12:26, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info
wrote:


 with cross-platform behavior preferred over system-dependent one --
 It's not clear how cross-platform behaviour has anything to do with the
 Internet age. Python has preferred cross-platform behaviour forever,
 except for those features and modules which are explicitly intended to be
 interfaces to system-dependent features. (E.g. a lot of functions in the
 os module are thin wrappers around OS features. Hence the name of the
 module.)


There is the behaviour of defaulting input and output to the system
encoding. I personally think we would all be better off if Python (and
Java, and many other languages) defaulted to UTF-8. This hopefully would
eventually have the effect of producers changing to output UTF-8 by
default, and consumers learning to manually specify an encoding when it's
not UTF-8 (due to invalid codepoints).

I'm currently working on a product that interacts with lots of other
products. These other products can be using any encoding - but most of the
functions that interact with I/O assume the system default encoding of the
machine that is collecting the data. The product has been in production for
nearly a decade, so there's a lot of pushback against changes deep in the
code for fear that it will break working systems. The fact that they are
working largely by accident appears to escape them ...

FWIW, changing to use iso-latin-1 by default would be the most sensible
option (effectively treating everything as bytes), with the option for
another encoding if/when more information is known (e.g. there's often a
call to return the encoding, and the output of that call is guaranteed to
be ASCII).

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python 3.2 has some deadly infection

2014-06-01 Thread Tim Delaney
On 2 June 2014 11:14, Steven D'Aprano steve+comp.lang.pyt...@pearwood.info
wrote:

 On Mon, 02 Jun 2014 08:54:33 +1000, Tim Delaney wrote:
  I'm currently working on a product that interacts with lots of other
  products. These other products can be using any encoding - but most of
  the functions that interact with I/O assume the system default encoding
  of the machine that is collecting the data. The product has been in
  production for nearly a decade, so there's a lot of pushback against
  changes deep in the code for fear that it will break working systems.
  The fact that they are working largely by accident appears to escape
  them ...
 
  FWIW, changing to use iso-latin-1 by default would be the most sensible
  option (effectively treating everything as bytes), with the option for
  another encoding if/when more information is known (e.g. there's often a
  call to return the encoding, and the output of that call is guaranteed
  to be ASCII).

 Python 2 does what you suggest, and it is *broken*. Python 2.7 creates
 moji-bake, while Python 3 gets it right:


The purpose of my example was to show a case where no thought was put into
encodings - the assumption was that the system encoding and the remote
system encoding would be the same. This is most definitely not the case a
lot of the time.

I also should have been more clear that *in the particular situation I was
talking about* iso-latin-1 as default would be the right thing to do, not
in the general case. Quite often we won't know the correct encoding until
we've executed a command via ssh - iso-latin-1 will allow us to extract the
info we need (which will generally be 7-bit ASCII) without the possibility
of an invalid encoding. Sure we may get mojibake, but that's better than
the alternative when we don't yet know the correct encoding.
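
A minimal sketch of that two-stage approach, in Python rather than the Java of the real product. The sample output, the LANG convention and the variable names are assumptions made for the example, not details of the actual system.

```python
# Made-up sample of remote command output, really encoded as UTF-8.
remote_output = "LANG=en_US.UTF-8\nowner=δжç\n".encode("utf-8")

# Stage 1: decoding as Latin-1 cannot fail, and 7-bit ASCII content survives.
provisional = remote_output.decode("latin-1")
settings = dict(
    line.split("=", 1) for line in provisional.split("\n") if "=" in line
)
charset = settings["LANG"].split(".", 1)[1]

# Stage 2: Latin-1 maps bytes to code points one-to-one, so the mojibake can
# be turned back into the original bytes and decoded properly.
owner = settings["owner"].encode("latin-1").decode(charset)

print(charset, owner)
```

The second stage only works because the Latin-1 decode is lossless: every byte value 0-255 maps to exactly one code point and back.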


 Latin-1 is one of those legacy encodings which needs to die, not to be
 entrenched as the default. My terminal uses UTF-8 by default (as it
 should), and if I use the terminal to input δжç, Python ought to see
 what I input, not Latin-1 moji-bake.


For some purposes, there needs to be a way to treat an arbitrary stream of
bytes as an arbitrary stream of 8-bit characters. iso-latin-1 is a
convenient way to do that. It's not the only way, but settling on it and
being consistent is better than not having a way.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: singleton ... again

2014-02-13 Thread Tim Delaney
On 13 February 2014 20:00, Piet van Oostrum p...@vanoostrum.org wrote:

 Ben Finney ben+pyt...@benfinney.id.au writes:
  Make that “somewhere” a module namespace, and you effectively have a
  Singleton for all practical purposes. So yes, I see the point of it; but
  we already have it built in :-)

 There is a use case for a singleton class: when creating the singleton
 object takes considerable resources and you don't need it always in your
 program.


Then have that resource in its own module, and import that module only when
needed e.g. inside a function. Python already has the machinery - no need
to reinvent the wheel.
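
As a minimal sketch of that idea: the function below defers the import until first use, and the import machinery's cache (sys.modules) guarantees a single shared instance afterwards. The stdlib module colorsys is only a stand-in for a module that is expensive to initialise, and get_resource_module is a made-up name.

```python
import sys


def get_resource_module():
    """Import the (stand-in) expensive module on first use only."""
    import colorsys  # stand-in for a module that is costly to initialise
    return colorsys


first = get_resource_module()
second = get_resource_module()

# Both calls return the very same module object, cached in sys.modules.
print(first is second)
print(sys.modules["colorsys"] is first)
```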

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python programming

2014-02-12 Thread Tim Delaney
On 13 February 2014 00:55, Larry Martell larry.mart...@gmail.com wrote:

 On Tue, Feb 11, 2014 at 7:21 PM, ngangsia akumbo ngang...@gmail.com
 wrote:
  Please i have a silly question to ask.
 
  How long did it take you to learn how to write programs?

 My entire life.

 I started in 1975 when I was 16 - taught myself BASIC and wrote a very
 crude downhill skiing game.


OK - it's degenerated into one of these threads - I'm going to participate.

I received a copy of The Beginners Computer Handbook: Understanding &
programming the micro (Judy Tatchell and Bill Bennet, edited by Lisa Watts
- ISBN 0860206947) for Christmas of 1985 (I think - I would have been 11
years old). As you may be able to tell from that detail, I have it sitting
in front of me right now - other books have come and gone, but I've kept
that one with me. It appears to have been published elsewhere under a
slightly different name with a very different (and much more boring) cover
- I can't find any links to my edition.

My school had a couple of Apple IIe and IIc machines, so I started by
entering the programs in the book. Then I started modifying them. Then I
started writing my own programs from scratch.

A couple of years later my dad had been asked to teach a programming class
and was trying to teach himself Pascal. We had a Mac 512K he was using.
He'd been struggling with it for a few months and getting nowhere. One
weekend I picked up his Pascal manual + a 68K assembler Mac ROM guide,
combined the two and by the end of the weekend had a semi-working graphical
paint program.

A few years after that I went to university (comp sci); blitzed my
computer-related classes; scraped by in my non-computer-related classes;
did some programming work along the way; was recommended to a job by a
lecturer half-way through my third year of uni; spent the next 4 years
working while (slowly) finishing my degree; eventually found my way into an
organisation which treated software development as a discipline and a
craft, stayed there for 10 years learning how to be more than just a
programmer; came out the other end a senior developer/technical lead and
effective communicator.

And that's how I learned to program.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Newcomer Help

2014-02-12 Thread Tim Delaney
On 13 February 2014 02:17, Grant Edwards invalid@invalid.invalid wrote:

 On 2014-02-12, Ben Finney ben+pyt...@benfinney.id.au wrote:
 
  In other contexts eg corporates, often the culture is the opposite:
  top-posting with strictly NO trimming.
 
  I've never found a corporation that objects to the sensible
  conversation-style, minimal-quotes-for-context interleaved posting style.

 I've always worked in corporations where the email culture is the
 Microsoft-induced lazy and stupid style as you describe.  And yet
 when I respond with edited quotes and interleaved replies like this I
 consistently get nothing but favorable comments about it.  Some people have
 even asked how I do it -- though they don't seem to adopt it.


Yep - the problem is that you usually have to fight against the tools to do
it. It's worth the effort, but it can be really hard when you've got an
already existing top-posted email thread with people using bizarre fonts
and colours throughout.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python programming

2014-02-12 Thread Tim Delaney
On 13 February 2014 08:02, Tim Delaney timothy.c.dela...@gmail.com wrote:

 I received a copy of The Beginners Computer Handbook: Understanding &
 programming the micro (Judy Tatchell and Bill Bennet, edited by Lisa Watts
 - ISBN 0860206947)


I should have noted that the examples were all BASIC (with details for how
to modify for various BASIC implementations on various platforms).

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: singleton ... again

2014-02-12 Thread Tim Delaney
On 13 February 2014 08:34, Ned Batchelder n...@nedbatchelder.com wrote:

 On 2/12/14 12:50 PM, Asaf Las wrote:

 On Wednesday, February 12, 2014 7:48:51 AM UTC+2, Dave Angel wrote:


 Perhaps if you would state your actual goal,  we could judge
   whether this code is an effective way to accomplish
   it.
 DaveA


 Thanks!

 There is no specific goal, i am in process of building pattern knowledge
 in python by doing some examples.


 Not all patterns are useful.  Just because it's been enshrined in the GoF
 patterns book doesn't mean that it's good for Python.


Speaking of which, my monitor is currently sitting on my copy of Design
Patterns.

A lot of Design Patterns isn't directly relevant to Python, because
Python either already has various patterns implemented, or obviates the
need for them. For example, if you really need a singleton (answer - you
don't) just use a module attribute. Functions as objects and iterators
being so pervasive means that visitor and related patterns are just a
normal style of programming, instead of having to be explicit.
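
To illustrate the last point, here is a small sketch (names and data are invented for the example) where any callable acts as the "visitor", and a generator removes the need for one altogether.

```python
def walk(node, visit):
    """Call visit(value) for every leaf value in a nested list."""
    for item in node:
        if isinstance(item, list):
            walk(item, visit)
        else:
            visit(item)


def iter_leaves(node):
    """Yield every leaf value in a nested list, depth first."""
    for item in node:
        if isinstance(item, list):
            yield from iter_leaves(item)
        else:
            yield item


tree = [1, [2, 3], [4, [5]]]

seen = []
walk(tree, seen.append)  # a bound method serves as the visitor
print(seen)

print(sum(iter_leaves(tree)))  # plain iteration, no visitor class at all
```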

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: __init__ is the initialiser

2014-02-02 Thread Tim Delaney
On 1 February 2014 14:42, Steven D'Aprano 
steve+comp.lang.pyt...@pearwood.info wrote:

 On Fri, 31 Jan 2014 14:52:15 -0500, Ned Batchelder wrote:

 (In hindsight, it was probably a mistake for Python to define two create-
 an-object methods, although I expect it was deemed necessary for
 historical reasons. Most other languages make do with a single method,
 Objective-C being an exception with alloc and init methods.)


I disagree. In nearly every language I've used which only has single-phase
construction, I've wished for two-phase construction. By the time you get
to __init__ you know the following things about the instance:

1. It is a complete instance of the subclass - there's no part of the
structure that is invalid to access (of course, many attributes might not
yet exist).

2. Calling a method from __init__ will call the subclass' method. This
allows subclasses to hook into the initialisation process by overriding
methods (of course, the subclass will need to ensure it has initialised all
the state it needs). This is generally not allowed in languages with
single-phase construction because the object is in an intermediate state.

For example, in C++ the vtable is for the class currently being
constructed, not the subclass, so it will always call the current class'
implementation of the method.

In Java you can actually call the subclass' implementation, but in that
case it will call the subclass method before the subclass constructor is
actually run, meaning that instance variables will have their default
values (null for objects). When the base class constructor is eventually
run the instance variables will be assigned the values in the class
definition (replacing anything set by the subclass method call).
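
For comparison, the two points about Python can be sketched as follows. The class and attribute names are invented for the example; it only aims to show that by the time __init__ runs the instance already belongs to the subclass and method calls reach the subclass override.

```python
class Base:
    def __new__(cls, *args, **kwargs):
        # Phase 1: allocate the instance; it is already of the final subclass.
        instance = super().__new__(cls)
        instance.created_as = cls.__name__
        return instance

    def __init__(self):
        # Phase 2: initialise; method calls dispatch to subclass overrides.
        self.parts = []
        self.setup()

    def setup(self):
        self.parts.append("base")


class Child(Base):
    def setup(self):
        super().setup()
        self.parts.append("child")


obj = Child()
print(obj.created_as, obj.parts)
```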

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: __init__ is the initialiser

2014-02-01 Thread Tim Delaney
On 1 February 2014 23:28, Ned Batchelder n...@nedbatchelder.com wrote:

 You are looking at things from an accurate-down-to-the-last-footnote
 detailed point of view (and have provided some footnotes!).  That's a very
 valuable and important point of view.  It's just not how most programmers
 approach the language.


This is the *language reference* that is being discussed. It documents the
intended semantics of the language. We most certainly should strive to
ensure that it is accurate-down-to-the-last-footnote - any difference
between the reference documentation and the implementation is a bug in
either the documentation or the implementation.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Is it possible to protect python source code by compiling it to .pyc or .pyo?

2014-01-17 Thread Tim Delaney
On 18 January 2014 08:31, Joshua Landau jos...@landau.ws wrote:

 On 17 January 2014 00:58, Sam lightai...@gmail.com wrote:
  I would like to protect my python source code. It need not be foolproof
 as long as it adds inconvenience to pirates.
 
  Is it possible to protect python source code by compiling it to .pyc or
 .pyo? Does .pyo offer better protection?

 If you're worried about something akin to corporate espionage or
 some-such, I don't know of a better way than ShedSkin or Cython. Both
 of those will be far harder to snatch the source of. Cython will be
 particularly easy to use as it is largely compatible with Python
 codebases.


Indeed - I've only had one time someone absolutely insisted that this be
done (for trade secret reasons - there needed to be a good-faith attempt to
prevent others from trivially getting the source). I pointed them at Pyrex
(this was before Cython, or at least before it was dominant). They fully
understood that it wouldn't stop a determined attacker - this was a place
where a large number of the developers were used to working on bare metal.

If you're going to do this, I strongly suggest only using Cython on code
that needs to be obscured (and if applicable, performance-critical
sections). I'm currently working with a system which works this way - edge
scripts in uncompiled .py files, and inner code as compiled extensions. The
.py files have been really useful for interoperability purposes e.g. I was
able to verify yesterday that one of the scripts had a bug in its
command-line parsing and I wasn't going insane after all.

Also, remember that any extension can be imported and poked at (e.g. in the
interactive interpreter). You'd be surprised just how much information you
can get that way just using help, dir, print and some experimentation. The
output I was parsing from one of the scripts was ambiguous, and it was one
where most of the work was done in an extension. I was able to poke around
using the interactive interpreter to understand what it was doing and obtain
the data in an unambiguous manner to verify against my parser.
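
A rough sketch of that kind of poking around, using the stdlib math module as a stand-in for a compiled extension whose source isn't available (the real extension in question is obviously not shown here):

```python
import inspect
import math  # compiled C code; stands in for a closed-source extension

# dir() lists what the module exposes, compiled or not.
public_names = [name for name in dir(math) if not name.startswith("_")]
print(public_names[:5])

# Docstrings survive compilation, so help() and inspect.getdoc() still work.
print(inspect.getdoc(math.hypot))

# Experimentation fills in the rest.
print(math.hypot(3, 4))
```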

The only way to truly protect code is to not ship any version of it
(compiled or otherwise), but have the important parts hosted remotely under
your control (and do your best to ensure it doesn't become compromised).

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Blog about python 3

2014-01-07 Thread Tim Delaney
On 8 January 2014 00:34, wxjmfa...@gmail.com wrote:


 Point 2: This Flexible String Representation does no
 effectuate any memory optimization. It only succeeds
 to do the opposite of what a corrrect usage of utf*
 do.


UTF-8 is a variable-width encoding that uses less memory to encode code
points with lower numerical values, on a per-character basis e.g. if a code
point <= U+007F it will use a single byte to encode; if <= U+07FF two bytes
will be used; ... up to a maximum of 6 bytes for code points >= U+4000000.

FSR is a variable-width memory structure that uses the width of the code
point with the highest numerical value in the string e.g. if all code
points in the string are <= U+00FF a single byte will be used per
character; if all code points are <= U+FFFF two bytes will be used per
character; and in all other cases 4 bytes will be used per character.

In terms of memory usage the difference is that UTF-8 varies its width
per-character, whereas the FSR varies its width per-string. For any
particular string, UTF-8 may well result in using less memory than the FSR,
but in other (quite common) cases the FSR will use less memory than UTF-8
e.g. if the string contains only code points <= U+00FF, but some
are between U+0080 and U+00FF (inclusive).
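
The two width rules above can be sketched in a few lines. This is a simplified model, not CPython's implementation: it counts payload bytes only and ignores the per-object header, and both function names are made up for the example.

```python
def fsr_payload_bytes(text):
    """Payload size under the per-string width rule (header ignored)."""
    if not text:
        return 0
    highest = max(map(ord, text))
    if highest <= 0xFF:
        width = 1
    elif highest <= 0xFFFF:
        width = 2
    else:
        width = 4
    return width * len(text)


def utf8_payload_bytes(text):
    """Payload size under UTF-8's per-character width rule."""
    return len(text.encode("utf-8"))


samples = ("\u0081" * 1000, "\u0101" * 1000, "a" * 999 + "\U0001F600")
for sample in samples:
    print(fsr_payload_bytes(sample), utf8_payload_bytes(sample))
```

The three samples cover the FSR being smaller, the two being equal, and UTF-8 being smaller (a single high code point widens the whole string).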

In most cases the FSR uses the same or less memory than earlier versions of
Python 3 and correctly handles all code points (just like UTF-8). In the
cases where the FSR uses more memory than previously, the previous
behaviour was incorrect.

No matter which representation is used, there will be a certain amount of
overhead (which is the majority of what most of your examples have shown).
Here are examples which demonstrate cases where UTF-8 uses less memory,
cases where the FSR uses less memory, and cases where they use the same
amount of memory (accounting for the minimum amount of overhead required
for each).

Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:57:17) [MSC v.1600 64
bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys

>>> fsr = u""
>>> utf8 = fsr.encode("utf-8")
>>> min_fsr_overhead = sys.getsizeof(fsr)
>>> min_utf8_overhead = sys.getsizeof(utf8)
>>> min_fsr_overhead
49
>>> min_utf8_overhead
33

>>> fsr = u"\u0001" * 1000
>>> utf8 = fsr.encode("utf-8")
>>> sys.getsizeof(fsr) - min_fsr_overhead
1000
>>> sys.getsizeof(utf8) - min_utf8_overhead
1000

>>> fsr = u"\u0081" * 1000
>>> utf8 = fsr.encode("utf-8")
>>> sys.getsizeof(fsr) - min_fsr_overhead
1024
>>> sys.getsizeof(utf8) - min_utf8_overhead
2000

>>> fsr = u"\u0001\u0081" * 1000
>>> utf8 = fsr.encode("utf-8")
>>> sys.getsizeof(fsr) - min_fsr_overhead
2024
>>> sys.getsizeof(utf8) - min_utf8_overhead
3000

>>> fsr = u"\u0101" * 1000
>>> utf8 = fsr.encode("utf-8")
>>> sys.getsizeof(fsr) - min_fsr_overhead
2025
>>> sys.getsizeof(utf8) - min_utf8_overhead
2000

>>> fsr = u"\u0101\u0081" * 1000
>>> utf8 = fsr.encode("utf-8")
>>> sys.getsizeof(fsr) - min_fsr_overhead
4025
>>> sys.getsizeof(utf8) - min_utf8_overhead
4000

Indexing a character in UTF-8 is O(N) - you have to traverse the string
up to the character being indexed. Indexing a character in the FSR is O(1).
In all cases the FSR has better performance characteristics for indexing
and slicing than UTF-8.
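
To show why indexing into UTF-8 means a scan, here is a minimal sketch. It assumes valid UTF-8 as produced by Python's own encoder (at most 4 bytes per code point), and utf8_index is a made-up helper, not a library function.

```python
def utf8_index(data, index):
    """Return the code point at position `index` by scanning UTF-8 bytes."""
    position = 0
    offset = 0
    while offset < len(data):
        lead = data[offset]
        # The lead byte tells us how many bytes this code point occupies.
        if lead < 0x80:
            length = 1
        elif lead < 0xE0:
            length = 2
        elif lead < 0xF0:
            length = 3
        else:
            length = 4
        if position == index:
            return data[offset:offset + length].decode("utf-8")
        offset += length
        position += 1
    raise IndexError("code point index out of range")


text = "a\u0081\u0101\U0001F600z"
encoded = text.encode("utf-8")

# Every lookup walks from the start; str indexing just computes an offset.
print(all(utf8_index(encoded, i) == text[i] for i in range(len(text))))
```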

There are tradeoffs with both UTF-8 and the FSR. The Python developers
decided the priorities for Unicode handling in Python were:

1. Correctness
  a. all code points must be handled correctly;
  b. it must not be possible to obtain part of a code point (e.g. the
first byte only of a multi-byte code point);

2. No change in the Big O characteristics of string operations e.g.
indexing must remain O(1);

3. Reduced memory use in most cases.

It is impossible for UTF-8 to meet both criteria 1b and 2 without
additional auxiliary data (which uses more memory and increases complexity
of the implementation). The FSR meets all 3 criteria.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [OT]Royal pardon for codebreaker Turing

2013-12-27 Thread Tim Delaney
On 28 December 2013 04:34, Mark Lawrence breamore...@yahoo.co.uk wrote:


 Personally, I think that people ought to throw a party celebrating
 Turing's rehabilitation, and do it right outside the Russian Embassy.


 Any particular reason for the restriction to Russian Embassy?


I suspect it's in reference to the difficulties homosexuals are likely to
face when attending or competing in the 2014 Winter Olympic and Paralympic
Games at Sochi. Adam Hills in particular has had a real go about it on his
UK show The Last Leg where he decided to turn Vladimir Putin into a
homosexual icon (search "last leg sochi" without the quotes).

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: [OT]Royal pardon for codebreaker Turing

2013-12-27 Thread Tim Delaney
On 28 December 2013 15:16, Steven D'Aprano st...@pearwood.info wrote:

 I don't care about the Olympians. Their presence in Russia is voluntary,
 and so long as they keep it in their pants for a few weeks (or at least
 don't get caught) they get to go home again a few weeks later. Have a
 thought for those who don't get to go home again. I'm talking about the
 situation in Russia, where the government is engaging in 1930s-style
 scape-goating and oppression of homosexuals. They haven't quite reached
 the level of Kristallnacht or concentration camps, but the rhetoric and
 laws coming out of the Kremlin are just like that coming out of the
 Reichstag in the thirties.


You are of course correct - I was still groggy from waking up when I
replied, and focused on the element that I had been most exposed to.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Experiences/guidance on teaching Python as a first programming language

2013-12-11 Thread Tim Delaney
On 12 December 2013 03:25, Chris Angelico ros...@gmail.com wrote:

 On Thu, Dec 12, 2013 at 3:18 AM, Mark Lawrence breamore...@yahoo.co.uk
 wrote:
  On 11/12/2013 16:04, Chris Angelico wrote:
 
  I strongly believe that a career
  programmer should learn as many languages and styles as possible, but
  most of them can wait.
 
 
  I chuckle every time I read this one.  Five years per language, ten
  languages, that's 50 years I think.  Or do I rewrite my diary for next
 week,
  so I learn Smalltalk Monday morning, Ruby Monday afternoon, Julia Tuesday
  morning ...

 Well, I went exploring the Wikipedia list of languages [1] one day,
 and found I had at least broad familiarity with about one in five. I'd
 like to get that up to one in four, if only because four's a power of
 two.

 More seriously: Once you've learned five of very different styles, it
 won't take you five years to learn a sixth language. I picked up Pike
 in about a weekend by realizing that it was Python semantics meets C
 syntax, and then went on to spend the next few years getting to know
 its own idioms. I'd say anyone who knows a dozen languages should be
 able to pick up any non-esoteric language in a weekend, at least to a
 level of broad familiarity of being able to read and comprehend code
 and make moderate changes to it.


Absolutely. 10 years ago I was saying I'd forgotten at least 20 languages,
and there have been many more since.

Once you know enough programming languages you (and by you I mean me)
get to the point where if you don't know a specific language you can pick
up enough to be useful in a day or two, reasonably proficient in a week,
and have a fairly high level of mastery by the time you've finished
whatever project you picked it up for. And then you don't use it for a
while, forget it to make room for something else, and pick it up again when
you need it (much faster this time).

Except Prolog. Never could get my head around it - I should go back and
have another try one of these days.

Some languages stick with you (e.g. Python) and I don't tend to learn
languages that are too similar to what I already know unless it's for a
specific project. So I've never learned Ruby ... but I have had to modify a
few Ruby scripts along the way, and been able to achieve what I wanted the
same day.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python Unicode handling wins again -- mostly

2013-12-01 Thread Tim Delaney
On 2 December 2013 07:15, wxjmfa...@gmail.com wrote:

 0.11.13 02:44, Steven D'Aprano wrote:
  (2) If you reverse that string, does it give lëon? The implication of
  this question is that strings should operate on grapheme clusters rather
  than code points. ...
 

 BTW, a grapheme cluster *is* a code points cluster.


Anyone with a decent level of reading comprehension would have understood
that Steven knows that. The implied word is "individual", i.e. "... rather
than [individual] code points".

Why am I responding to a troll? Probably because out of all his baseless
complaints about the FSR, he *did* have one valid point about performance
that has now been fixed.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python Unicode handling wins again -- mostly

2013-12-01 Thread Tim Delaney
On 2 December 2013 09:06, Mark Lawrence breamore...@yahoo.co.uk wrote:

 I don't remember him ever having a valid point, so FTR can we have a
 reference please.  I do remember Steven D'Aprano showing that there was a
 regression which I flagged up here http://bugs.python.org/issue16061.  It
 was fixed by Serhiy Storchaka, who appears to have forgotten more about
 Python than I'll ever know, grrr!!! :)


From your own bug report (quoting Steven): Nevertheless, I think there is
something here. The consequences are nowhere near as dramatic as jmf claims
...

His initial postings did lead to a regression being found.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Jython - Can't access enumerations?

2013-11-29 Thread Tim Delaney
On 30 November 2013 03:15, Eamonn Rea eamonn...@gmail.com wrote:

 Ok, here's the code:
 [elided]


As I said, please also show the *exact* error - copy and paste the stack
trace.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Jython - Can't access enumerations?

2013-11-28 Thread Tim Delaney
On 29 November 2013 04:22, Eamonn Rea eamonn...@gmail.com wrote:

 Hello! I'm using Jython to write a game with Python/Jython/LibGDX. I
 wasn't having any trouble, and everything was going well, but sadly I can't
 access items in enumerations.

 If you know about the LibGDX library and have used it, you'll probably
 know about the BodyType enumeration for Box2D. I was trying to access the
 BodyType item in the enumeration. This didn't work, as Jython couldn't find
 the BodyType enumeration. I asked on the LibGDX forum and no one could
 help. So I'm hoping that someone here could help :-)

 So, is this a problem with LibGDX or Jython? I'm using the latest version
 of Jython.


There are no problems accessing the elements of enumerations with Jython.
The following was tested using Jython 2.7b1:

Jython 2.7b1 (default:ac42d59644e9, Feb 9 2013, 15:24:52)
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.7.0_40
Type "help", "copyright", "credits" or "license" for more information.
>>> import java.util.Collections as Collections
>>> import java.util.Arrays as Arrays

>>> a = Arrays.asList(1, 2, 3)
>>> print(a)
[1, 2, 3]
>>> e = Collections.enumeration(a)
>>> print(e)
java.util.Collections$2@f48007e

>>> for i in e:
...     print(i)
...
1
2
3

Therefore the problem is either with LibGDX or (more likely) the way you are
using it. Please post a minimal example that demonstrates the problem plus
the exact error message you get (I don't know LibGDX at all, but someone
else might, plus trimming it down to a minimal example may reveal the
problem to you).

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Got a Doubt ! Wanting for your Help ! Plz make it ASAP !

2013-11-26 Thread Tim Delaney
On 27 November 2013 03:57, Antoon Pardon antoon.par...@rece.vub.ac.be wrote:


 So I can now ask my questions in dutch and expect others to try and
 understand me instead of me asking them in english? Or can I use
 literal translations of dutch idioms even if I suspect that such
 a literal translation could be misunderstood and even be insulting?


1. No, because this is stated to be an English-speaking list/newsgroup. It
just doesn't specify what dialect of English.

2. If you suspect that the literal translation could be misunderstood or
insulting, then I would expect you to make an effort to find a better
translation. If someone I didn't know posted it, I'd be willing to give
them leeway if the rest of their message indicated that they are used to
another dialect or language. If *you* posted it, I'd probably assume you
meant it, because I know your command of the English language is pretty
extensive ...

Participants are expected to attempt to be understandable in English, but I
personally expect responders to make an effort to work with multiple
dialects. If you're so unfamiliar with a dialect that you cannot respond,
either don't respond, or respond saying something like "I think I can help
here, but I'm confused about <unfamiliar phrase> - could you or someone
else clarify please?"

And if an unfamiliar dialect annoys you, killfile the person. No skin off
my nose.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Cracking hashes with Python

2013-11-26 Thread Tim Delaney
On 27 November 2013 13:28, Chris Angelico ros...@gmail.com wrote:

 On Wed, Nov 27, 2013 at 12:04 PM, Mark Lawrence breamore...@yahoo.co.uk
 wrote:
  On 26/11/2013 23:06, TheRandomPast . wrote:
  I'm stumped.
 
  Good to see another cricketer on the list :)

 May I be bowled enough to suggest that stumped doesn't necessarily
 imply a background in cricket?

 *dives for cover*


Surely that should have been "drives for cover" ;) I guess I'll play on ...

Before I go look it up, I'm guessing that the etymology of "stumped" is
actually coming from the problem of a plough getting stuck on a stump (i.e.
can't progress any further). Not much of an issue anymore since the
invention of the stump-jump plough:
https://en.wikipedia.org/wiki/Stump-jump_plough

(Looked it up, my guess is considered the most likely origin of the term).

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Off-topic: Aussie place names [was Re: Automation]

2013-11-20 Thread Tim Delaney
On 21 November 2013 11:58, Steven D'Aprano 
steve+comp.lang.pyt...@pearwood.info wrote:

 For a serious look at Australian placenames named after Australian
 Aboriginal words, see wikipedia:


 http://en.wikipedia.org/wiki/List_of_Australian_place_names_of_Aboriginal_origin


Just noticed that my town was missing - added it:
https://en.wikipedia.org/wiki/Mittagong,_New_South_Wales

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: To whoever hacked into my Database

2013-11-08 Thread Tim Delaney
On 8 November 2013 21:00, Νίκος Αλεξόπουλος nikos.gr...@gmail.com wrote:

 I have never exposed my client's data. Prove otherwise.


https://mail.python.org/pipermail/python-list/2013-June/648550.html

Or don't you consider giving the root password for a server containing
client data to a complete stranger to be exposing that data?

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: To whoever hacked into my Database

2013-11-07 Thread Tim Delaney
On 8 November 2013 09:18, Νίκος Αλεξόπουλος nikos.gr...@gmail.com wrote:

 I feel a bit proud because as it seems i have manages to secure it more
 tight. All i need to do was to validate user input data, so the hacker
 won't be able again to pass bogus values to specific variables that my
 script was using.


So we now have confirmation that Nikos' site is subject to SQL injection
attacks on anything that he is not specifically validating. And I'm
absolutely sure that he has identified every location where input needs to
be validated, and that it is impossible to get past the level of validation
that he's doing, so the site is completely secure! Just like the last time
he claimed that (and the time before, and the time before that ...).

Nikos, please please please do yourself and your customers a favour and
quit your so-called business. All you are doing is opening your customers
up to potentially disastrous situations and yourself to lawsuits. It's not
a question of *if*, but *when* one of your customers is compromised to the
extent that they decide to take it out of you.

Also, you're an embarrassment to our profession.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: To whoever hacked into my Database

2013-11-07 Thread Tim Delaney
On 8 November 2013 09:45, Tim Delaney timothy.c.dela...@gmail.com wrote:

 On 8 November 2013 09:18, Νίκος Αλεξόπουλος nikos.gr...@gmail.com wrote:

 I feel a bit proud because as it seems i have manages to secure it more
 tight. All i need to do was to validate user input data, so the hacker
 won't be able again to pass bogus values to specific variables that my
 script was using.


 So we now have confirmation that Nikos' site is subject to SQL injection
 attacks on anything that he is not specifically validating. And I'm
 absolutely sure that he has identified every location where input needs to
 be validated, and that it is impossible to get past the level of validation
 that he's doing, so the site is completely secure! Just like the last time
 he claimed that (and the time before, and the time before that ...).


Not to mention the idiocy of exposing your web server logs to the outside
world ... (no - I didn't go there - I want no chance of getting malware
from his site).

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Algorithm that makes maximum compression of completly diffused data.

2013-10-30 Thread Tim Delaney
On 31 October 2013 05:21, jonas.thornv...@gmail.com wrote:

 I am searching for the program or algorithm that makes the best possible
 of completly (diffused data/random noise) and wonder what the state of art
 compression is.

 I understand this is not the correct forum but since i think i have an
 algorithm that can do this very good, and do not know where to turn for
 such question i was thinking to start here.

 It is of course lossless compression i am speaking of.


This is not an appropriate forum for this question. If you know it's an
inappropriate forum (as you stated) then do not post the question here. Do
a search with your preferred search engine and look up lossless
compression on Wikipedia. And read and understand the following link:

http://www.catb.org/esr/faqs/smart-questions.html

paying special attention to the following parts:

http://www.catb.org/esr/faqs/smart-questions.html#forum
http://www.catb.org/esr/faqs/smart-questions.html#prune
http://www.catb.org/esr/faqs/smart-questions.html#courtesy
http://www.catb.org/esr/faqs/smart-questions.html#keepcool
http://www.catb.org/esr/faqs/smart-questions.html#classic

If you have *python* code implementing this algorithm and want help, post
the parts you want help with (and preferably post the entire algorithm in a
repository).

However, having just seen the following from you in a reply to Mark ("I do
not follow instructions, i make them accesible to anyone"), I am not
going to give a second chance - fail to learn from the above advice and
you'll meet my spam filter.

If the data is truly completely random noise, then there is very little
that lossless compression can do. On any individual truly random data set
you might get a lot of compression, a small amount of compression, or even
expansion, depending on what patterns have randomly occurred in the data
set. But there is no current lossless compression algorithm that can take
truly random data and systematically compress it to be smaller than the
original.

If you think you have an algorithm that can do this on truly random data,
you're probably wrong - either your data has patterns the algorithm can
exploit, or you've simply been lucky with the randomness of your data so
far.
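
Anyone who wants to check this for themselves can do so with the standard
library. The following is only a minimal sketch: the seeded random.Random
generator stands in for "truly random" data (an assumption - it is
pseudo-random, but zlib has no way of exploiting that), and all the variable
names are mine.

```python
import random
import zlib

rng = random.Random(12345)  # seeded only so the run is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(100_000))
patterned = b"spam and eggs " * 7_000  # repetitive data of similar size

packed_noise = zlib.compress(noise, 9)
packed_patterned = zlib.compress(patterned, 9)

# The noise does not shrink (expect a few bytes of overhead instead),
# while the repetitive data collapses to a small fraction of its size.
print(len(noise), "->", len(packed_noise))
print(len(patterned), "->", len(packed_patterned))

# Lossless means the round trip gives back exactly what went in.
assert zlib.decompress(packed_noise) == noise
assert zlib.decompress(packed_patterned) == patterned
```

If your own algorithm beats this on noise, try it on many independently
generated inputs before drawing any conclusions.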

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: personal library

2013-10-30 Thread Tim Delaney
On 31 October 2013 07:02, patrick vrijlandt patrick.vrijla...@gmail.com wrote:

 Chris Angelico ros...@gmail.com wrote:
  On Wed, Oct 30, 2013 at 3:33 PM, Ben Finney ben+pyt...@benfinney.id.au
 wrote:
  Chris Angelico ros...@gmail.com writes:
 
  *Definitely* use source control.
 
  +1, but prefer to call it a “version control system” which is (a) more
  easily searched on the internet, and (b) somewhat more accurate.
 
  Right. I've picked up some bad habits, and I think Dave may also
  have... but yes, distributed version control system is what I'm
  talking about here.
 
  ChrisA

 Thanks. Do you all agree that Mercurial is the way to go, or is there
 another distributed version control system that I should shortlist?


There are huge arguments all over the net on this topic. Having extensively
used the top two contenders (Git and Mercurial) I would strongly advise you
to use Mercurial.

What it comes down to for me is that Mercurial usage fits in my head and I
rarely have to go to the docs, whereas with Git I have to constantly go to
the docs for anything but the most trivial usage - even when it's something
I've done many times before. I'm always afraid that I'm going to do
something *wrong* in Git.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: personal library

2013-10-30 Thread Tim Delaney
On 31 October 2013 08:31, Chris Angelico ros...@gmail.com wrote:

 On Thu, Oct 31, 2013 at 7:19 AM, Tim Delaney
 timothy.c.dela...@gmail.com wrote:
  What it comes down to for me is that Mercurial usage fits in my head and
 I
  rarely have to go to the docs, whereas with Git I have to constantly go
 to
  the docs for anything but the most trivial usage - even when it's
 something
  I've done many times before. I'm always afraid that I'm going to do
  something *wrong* in Git.

 Oddly enough, I've had the opposite experience. With git, I can do
 whatever I want easily, but with Mercurial, some tasks seem to elude
 me. (Is there a Mercurial cheat-sheet for git users somewhere?


https://github.com/sympy/sympy/wiki/Git-hg-rosetta-stone

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: personal library

2013-10-30 Thread Tim Delaney
On 31 October 2013 08:43, Tim Delaney timothy.c.dela...@gmail.com wrote:

 On 31 October 2013 08:31, Chris Angelico ros...@gmail.com wrote:

 On Thu, Oct 31, 2013 at 7:19 AM, Tim Delaney
 timothy.c.dela...@gmail.com wrote:
  What it comes down to for me is that Mercurial usage fits in my head
 and I
  rarely have to go to the docs, whereas with Git I have to constantly go
 to
  the docs for anything but the most trivial usage - even when it's
 something
  I've done many times before. I'm always afraid that I'm going to do
  something *wrong* in Git.

 Oddly enough, I've had the opposite experience. With git, I can do
 whatever I want easily, but with Mercurial, some tasks seem to elude
 me. (Is there a Mercurial cheat-sheet for git users somewhere?


 https://github.com/sympy/sympy/wiki/Git-hg-rosetta-stone


And the de facto standard GUI for Mercurial is TortoiseHg (available on
Windows, Linux and OSX).

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Check if this basic Python script is coded right

2013-10-27 Thread Tim Delaney
On 27 October 2013 23:20, rusi rustompm...@gmail.com wrote:

 On Saturday, October 26, 2013 11:50:33 PM UTC+5:30, MRAB wrote:
  On 26/10/2013 18:36, HC wrote:
   I'm doing my first year in university and I need help with this basic
 assignment.
  
   Assignment: Write Python script that prints sum of cubes of numbers
 between 0-200 that are multiples of 3. 3^3+6^3+9^3+12^3+...+198^3=?
 code snipped
  
   Is it all okay?
  
  Just one small point: the assignment says that the numbers should be in
  the range 0-200, but the loop's condition is count<200; it's excluding
  200, so the numbers will actually be in the range 0-199.
 
  However, as 200 isn't a multiple of 3, it won't affect the result, but
  in another assignment it might.
 
  (For the record, this is known as an off-by-one error.)

 So is an off-by-one error that did not happen an off-by-an-off-by-one
 error?


I would say yes. When someone realises that the requirements were also off
by one and the specification gets changed to "between 0-201 (inclusive)"
then whoever fixes it might naively just add one to the existing code,
giving an incorrect result.

Obviously I'm ignoring the possibility of appropriate unit tests to prevent
this - just looking at the code itself.
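
To make that concrete, here is a rough sketch. The original code was snipped,
so sum_of_cubes below is my own hypothetical reconstruction of a loop with an
exclusive upper bound, not what HC actually wrote.

```python
def sum_of_cubes(limit_exclusive):
    """Sum n**3 over the multiples of 3 with 0 <= n < limit_exclusive."""
    total = 0
    count = 0
    while count < limit_exclusive:
        if count % 3 == 0:
            total += count ** 3
        count += 1
    return total


as_written = sum_of_cubes(200)  # condition "count < 200": really 0-199
as_specified = sum_of_cubes(201)  # 0-200 inclusive
print(as_written == as_specified)  # True: 200 is not a multiple of 3

# The spec changes to 0-201 inclusive and someone "just adds one":
naive_fix = sum_of_cubes(201)  # still stops at 200, so 201 is missed
real_fix = sum_of_cubes(202)  # actually includes 201
print(real_fix - naive_fix)  # the missing term, 201 ** 3
```

The latent bug costs nothing until the day 201 (a multiple of 3) comes into
range.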

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python Front-end to GCC

2013-10-25 Thread Tim Delaney
On 26 October 2013 06:18, Mark Janssen dreamingforw...@gmail.com wrote:

  As for the hex value for Nan who really gives a toss?  The whole point is
  that you initialise to something that you do not expect to see.  Do you
 not
  have a text book that explains this concept?

 No, I don't think there is a textbook that explains such a concept of
 initializing memory to anything but 0 -- UNLESS you're from Stupid
 University.

 Thanks for providing fodder...


I know I'm replying to someone who has trolled many threads over multiple
years ... or as I'm now starting to suspect, possibly a bot, but I'll give
him (it?) this one chance to show the capability to read and learn.

http://en.wikipedia.org/wiki/Hexspeak

Search for 0xBAADF00D; 0xBADDCAFE; and (in particular) 0xDEADBEEF. These
are historical examples of this technique used by major companies.
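
For anyone who hasn't met the technique, here is a toy illustration in Python.
It is only a sketch of the idea: the helper names are made up, and real uses
live in allocators and debuggers rather than in a bytearray.

```python
import struct

SENTINEL = 0xDEADBEEF  # a value you do not expect to see in real data


def make_buffer(n_words):
    """Return a buffer of n_words 32-bit words, all set to the sentinel."""
    return bytearray(struct.pack("<I", SENTINEL) * n_words)


def untouched_words(buf):
    """Indexes of the words that still hold the sentinel."""
    words = struct.iter_unpack("<I", buf)
    return [i for i, (word,) in enumerate(words) if word == SENTINEL]


buf = make_buffer(4)
struct.pack_into("<I", buf, 4, 42)  # write word 1 only (byte offset 4)
print(untouched_words(buf))  # every word except word 1
```

Had the buffer been zero-filled, a word that was never written would be
indistinguishable from one that was legitimately set to 0.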

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python Front-end to GCC

2013-10-25 Thread Tim Delaney
On 26 October 2013 07:36, Mark Lawrence breamore...@yahoo.co.uk wrote:

 I can't see it being a bot on the grounds that a bot wouldn't be smart
 enough to snip a URL that referred to itself as a quack.


My thought based on some of the responses is that they seem auto-generated,
then tweaked - so not a bot per se, but verging on it.

But OTOH, it can also be explained away entirely by (as you previously
noted) the Dunning-Kruger effect, with the same uninformed responses
trotted out to everything. Not necessarily a troll as I injudiciously
claimed in my previous post (I'd just woken up after 4 hours sleep - my
apologies to the list).

Anyway, not going to get sucked into this bottomless hole.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Sexism in the Ruby community: how does the Python community manage it?

2013-10-17 Thread Tim Delaney
On 18 October 2013 04:16, Roy Smith r...@panix.com wrote:

 On Thursday, October 17, 2013 11:07:48 AM UTC-4, Chris Angelico wrote:
  Module names should be  descriptive, not fancy.

 Interesting comment, on a mailing list for a language named after a snake,
 especially by a guy who claims to prefer a language named after a fish :-)


That would be https://en.wikipedia.org/wiki/Monty_Python not
https://en.wikipedia.org/wiki/Pythonidae.

The snake has been adopted as a mascot (see the Python icon) but is not the
inspiration.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: JUST GOT HACKED

2013-10-01 Thread Tim Delaney
On 2 October 2013 00:00, Νίκος nikos.gr...@gmail.com wrote:


 Thanks for visiting my website: you help me increase my google page rank
 without actually utilizing SEO.

 Here:  
 http://superhost.gr/?show=log&page=index.html


Speaking of which, I would strongly advise against *anyone* going to Nikos'
web site. With the length of time his credentials have been available for
anyone in the world to obtain and use it's highly likely that by now his
website is a malware-spewing zombie member of a botnet.

Of course, I'm not going to risk it by going there to check myself ...

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: JUST GOT HACKED

2013-10-01 Thread Tim Delaney
On 2 October 2013 09:28, Νίκος nikos.gr...@gmail.com wrote:


 con = pymysql.connect( db = 'mypass', user = 'myuser', passwd =
 'mysqlpass', charset = 'utf8', host = 'localhost' )

 That was viewable by the link Mark has posted.

 But this wasn't my personal account's login password, that was just the
 mysql password.

 Mysql pass != account's password


Because there's no chance with the brilliance you display that there could
be any possibility of login details being kept in plaintext in your
database.

And of course your database is so well locked down that no attacker with a
login to it could then execute arbitrary code on your system.

And there's also zero chance that your personal account login details are
also available in plaintext somewhere that you're unaware of.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Do I really need a web framework?

2013-09-30 Thread Tim Delaney
On 1 October 2013 05:57, duf...@gmail.com wrote:

 I want to set up a very simple website, and I need to know if it is
 necessary to use a web framework (e.g. Django) to do basic interactive
 operations such as receiving input from the user, looking up a database and
 returning some data to the user.
 I know that this is exactly the purpose of web frameworks, and that they
 work fine.
 However, I read somewhere that for small projects such operations can be
 managed without a web framework, just by using Python with mod_python or
 with the CGI module. Is this correct?

 What do you suggest, keeping in mind that I am a newbie and that my
 website project would be very simple and very small?


There is no *need* to use a web framework. But a web framework can make
things a lot easier for you.

Have a look at webapp2: http://webapp-improved.appspot.com/

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Print statement not printing as it suppose to

2013-09-20 Thread Tim Delaney
On 21 September 2013 07:57, Sam anasdah...@gmail.com wrote:

 hi everybody i am just starting to learn python, i was writing a simple
 i/o program but my print statement is acting weird. here is my code i want
 to know why it prints this way. thank you

 print("\nThe total amount required is ", total )


 OUTPUT

 ('\nThe total amount required is ', 3534)

 === the problem is obviously on the last print statement that is supposed
 to print the output


Check your version of Python. The output you have given says that you're
using a Python 2 version, but the print syntax you're using is for Python
3. Unfortunately, you've hit one of the edge cases where they produce
different output.

As a general rule, either use % formatting or format() to produce a single
string to print, rather than relying on print to output them correctly for
you (or using string concatenation). Since you're only just starting you
won't have got to them yet - the simplest way to do it is to just insert
the string representation of all parameters. The above done using %
formatting would be:

print("\nThe total amount required is  %s" % (total,))

which will produce the same output on both Python 2 and Python 3. Note the
double space before %s - that matches your print statement (there would be
a soft-space inserted by your print statement, which is another reason not
to rely on print for anything other than single strings). If you didn't want
that extra space, it's easy to delete.
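
Here is a small sketch (written for Python 3, with 3534 taken from your output
as the value of total) that shows where the parentheses and quotes came from,
alongside the two ways of building a single string first:

```python
total = 3534

# Python 3: print is a function called with two arguments.
# Python 2: print is a statement followed by ONE parenthesised tuple, so it
# displays the repr of that tuple. repr() reproduces that here.
what_python2_shows = repr(("\nThe total amount required is ", total))
print(what_python2_shows)

# Build the complete string yourself and every version agrees.
with_percent = "\nThe total amount required is  %s" % (total,)
with_format = "\nThe total amount required is  {}".format(total)
print(with_percent)
print(with_format)
```

The first print shows the same line as in your OUTPUT section; the other two
show the sentence you were after.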

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: PEP8 79 char max

2013-09-06 Thread Tim Delaney
On 6 September 2013 20:35, Tim Chase python.l...@tim.thechases.com wrote:

 On 2013-09-06 05:09, Skip Montanaro wrote:
  And thank goodness for SIGWINCH. :-)

 BEDEVERE: How do you know she is a SIGWINCH?

 VILLAGER: She looks like one.

 CROWD: Right! Yeah! Yeah!


 :-)

 I'm just glad it's no longer 40-chars-per-column and purely
 upper-case like the Apple ][+ on which I cut my programming teeth.


Couldn't you switch the ][+ into high-res mode? You could with the IIe.
Made programming in DOS 3.3 BASIC so much nicer.

Tim Delaney
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Encapsulation unpythonic?

2013-09-01 Thread Tim Delaney
On 2 September 2013 06:33, Ethan Furman et...@stoneleaf.us wrote:


 class PlainPython:

     value = None


 In the Javaesque class we see the unPythonic way of using getters/setters;
 in the ProtectedPython* class we see the pythonic way of providing
 getters/setters**; in the PlainPython class we have the standard,
 unprotected, direct access to the class attribute.

 Nowhere in PlainPython is a getter/setter defined, nor does Python define
 one for us behind our backs.

 If you have evidence to the contrary I'd like to see it.


I think Roy is referring to the fact that attribute access is implemented
via __getattr__ / __getattribute__ / __setattr__ / __delattr__. From one
point of view, he's absolutely correct - nearly all attributes are accessed
via getters/setters in Python.
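
A quick sketch of what I mean. The Traced class is purely illustrative (it is
not from anyone's earlier post); it overrides the two hooks to record that
plain attribute syntax really does pass through them:

```python
class Traced(object):
    """Records every attribute read and write made on an instance."""

    def __init__(self):
        # Go straight to object.__setattr__ so creating the log is not logged.
        object.__setattr__(self, "log", [])

    def __getattribute__(self, name):
        if name != "log":
            object.__getattribute__(self, "log").append(("get", name))
        return object.__getattribute__(self, name)

    def __setattr__(self, name, value):
        self.log.append(("set", name))
        object.__setattr__(self, name, value)


t = Traced()
t.value = 42  # plain assignment is routed through __setattr__
print(t.value)  # plain read is routed through __getattribute__
print(t.log)
```

Nobody wrote get_value() or set_value(), yet both accesses went through a
method call that the class is free to customise.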

Tim Delaney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Encapsulation unpythonic?

2013-08-31 Thread Tim Delaney
On 1 September 2013 03:31, Dennis Lee Bieber wlfr...@ix.netcom.com wrote:

 On Fri, 30 Aug 2013 23:07:47 -0700 (PDT), Fabrice Pombet fp2...@gmail.com
 
 declaimed the following:

 well, look at that:
 
 a=(1,2)
 a=2+3 -a is an object and I have changed its type and value from
 outside. As far as I am concerned this is one hell of an encapsulation
 violation... Could you do this -strictly speaking- in Java or C++?

 There is where your major misunderstanding is...

 a is a NAME attached (bound) to an object. In the first statement, the
 object is the tuple (1,2). That object was not changed when you execute the
 second statement -- which is taking two integer objects and creating a new
 integer object having a value of '5', and then attaches the NAME a to the
 new object. If no other names are bound to the (1,2) object, it will be
 garbage collected.


I'll try another way to explain it, using Java terminology (since Fabrice
appears to be familiar with Java).

// a is a reference to the List<Integer> returned by Arrays.asList
Object a = Arrays.asList(1, 2);
// a is now a reference to the Integer returned by Integer.valueOf
a = Integer.valueOf(2 + 3);

You have not changed the type of 'a' in any way - you have simply changed
what the name 'a' refers to. This is functionally identical to your Python
code above, except that in Python you do not have to downcast the Object
reference 'a' or use reflection to call methods on it or access its
members (think of it as Python does reflection automatically for you).
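
The same point can be checked directly in Python. This is just a sketch; the
second name b is my addition, there to keep the tuple alive so it can be
inspected afterwards:

```python
a = (1, 2)
b = a  # a second name bound to the same tuple object
tuple_id = id(a)

a = 2 + 3  # rebinds the name a; the tuple itself is untouched

print(type(a), a)  # the name a now refers to an int
print(b)  # the tuple is still (1, 2)
print(id(b) == tuple_id)  # and it is still the very same object
```

Nothing about the tuple's type or value changed; only the name moved.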

Tim Delaney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why is str(None) == 'None' and not an empty string?

2013-08-29 Thread Tim Delaney
On 29 August 2013 20:43, Ian Kelly ian.g.ke...@gmail.com wrote:

 On Wed, Aug 28, 2013 at 6:21 AM, Steven D'Aprano
 steve+comp.lang.pyt...@pearwood.info wrote:
  On Wed, 28 Aug 2013 01:57:16 -0700, Piotr Dobrogost wrote:
 
  Hi!
 
  Having repr(None) == 'None' is sure the right thing but why does
  str(None) == 'None'? Wouldn't it be more correct if it was an empty
  string?
 
 
  Why do you think an empty string is more correct? Would you expect
  str([]) or str(0.0) or str({}) to also give an empty string?
 
 
  I can't see any reason for str(None) to return the empty string.

 I've had many occasions where it would have been convenient for
 str(None) to return the empty string, e.g. when exporting tabular data
 that includes null values from a database to a spreadsheet.  Generally
 it's safe to just call str() on the data, except that I'd rather empty
 cells just be empty rather than spamming the word None all over the
 place, so I end up having to do something like (str(value) if value is
 not None else '') instead.  Not a major inconvenience, but enough to
 make me wonder if there could be a better way.


There is.

def format(value):
    if value is None:
        return ''

    return str(value)

print(format(value))

This also allows you to format other types differently e.g. only output 2
decimal places for non-integer numeric types.
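
For example, something along these lines. Treat it as a sketch: format_cell is
a made-up name (chosen so it doesn't shadow the builtin format), and the choice
of the numbers ABCs to detect "non-integer numeric" is my assumption. Note that
decimal.Decimal is not registered as numbers.Real, so Decimals fall through to
str().

```python
import numbers


def format_cell(value):
    """Render one value for export: None becomes an empty cell."""
    if value is None:
        return ''

    # Real but not Integral covers floats and Fractions; ints and bools
    # fall through to str() below.
    is_real = isinstance(value, numbers.Real)
    if is_real and not isinstance(value, numbers.Integral):
        return '%.2f' % value

    return str(value)


row = [1, None, 2.5, 'x']
print(','.join(format_cell(v) for v in row))
```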

Tim Delaney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: New VPS Provider needed

2013-08-27 Thread Tim Delaney
On 27 August 2013 18:45, Νικόλαος ni...@superhost.gr wrote:


 I am having major issues with my VPS provider and losing customers because
 the provider doesn't set things up correctly.


Given your posting history in this newsgroup/mailing list, I wouldn't be so
sure that the problem is on your VPS provider's end.

Tim Delaney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Fast conversion of numbers to numerator/denominator pairs

2013-08-24 Thread Tim Delaney
On 24 August 2013 13:30, Steven D'Aprano 
steve+comp.lang.pyt...@pearwood.info wrote:


 def convert(d):
     sign, digits, exp = d.as_tuple()
     num = int(''.join([str(digit) for digit in digits]))
     if sign: num = -num
     return num, 10**-exp

 which is faster, but not fast enough. Any suggestions?


Straightforward multiply and add takes about 60% of the time for a single
digit on my machine compared to the above, and 55% for 19 digits (so
reasonably consistent). It's about 10x slower than fractions.

def convert_muladd(d):
    # d is a decimal.Decimal; digits is a tuple of ints, most significant first
    sign, digits, exp = d.as_tuple()
    num = 0

    for digit in digits:
        num *= 10
        num += digit

    if sign:
        num = -num

    return num, 10**-exp

Breakdown of the above (for 19 digits):

d.as_tuple() takes about 35% of the time.

The multiply and add takes about 55% of the time.

The exponentiation takes about 10% of the time.
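
If you want to reproduce the comparison on your own machine, a harness like
the following will do. It is only a sketch: convert_join is my name for your
version, the 19-digit test value and the repeat count are arbitrary, and the
percentages above came from my machine, so expect different numbers.

```python
from decimal import Decimal
from timeit import timeit


def convert_join(d):
    # String-based version from the original post.
    sign, digits, exp = d.as_tuple()
    num = int(''.join([str(digit) for digit in digits]))
    if sign:
        num = -num
    return num, 10**-exp


def convert_muladd(d):
    # Multiply-and-add version.
    sign, digits, exp = d.as_tuple()
    num = 0
    for digit in digits:
        num *= 10
        num += digit
    if sign:
        num = -num
    return num, 10**-exp


d = Decimal('-1234567890.123456789')  # 19 digits
assert convert_join(d) == convert_muladd(d)

for func in (convert_join, convert_muladd):
    print(func.__name__, timeit(lambda: func(d), number=20000))
```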

Tim Delaney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Fast conversion of numbers to numerator/denominator pairs

2013-08-24 Thread Tim Delaney
On 25 August 2013 07:59, Tim Delaney timothy.c.dela...@gmail.com wrote:

 Breakdown of the above (for 19 digits):

 d.as_tuple() takes about 35% of the time.

 The multiply and add takes about 55% of the time.

 The exponentiation takes about 10% of the time.


Bah - sent before complete.

Since the multiply and add takes such a significant proportion of the time,
compiling the above with Cython should gain you a big win as well. Or find
some other way to turn that loop into native code.

Tim Delaney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: RE Module Performance

2013-07-30 Thread Tim Delaney
On 31 July 2013 00:01, wxjmfa...@gmail.com wrote:


 I am pretty sure that once you have typed your 127504
 ascii characters, you are very happy the buffer of your
 editor does not waste time in reencoding the buffer as
 soon as you enter an €, the 125505th char. Sorry, I wanted
 to say z instead of euro, just to show that backspacing the
 last char and reentering a new char implies twice a reencoding.


And here we come to the root of your complete misunderstanding and
mischaracterisation of the FSR. You don't appear to understand that
strings in Python are immutable and that to add a character to an
existing string requires copying the entire string + new character. In
your hypothetical situation above, you have already performed 127504
copy + new character operations before you ever get to a single widening
operation. The overhead of the copy + new character repeated 127504
times dwarfs the overhead of a single widening operation.

Given your misunderstanding, it's no surprise that you are focused on
microbenchmarks that demonstrate that copying entire strings and adding
a character can be slower in some situations than others. When the only
use case you have is implementing the buffer of an editor using an
immutable string I can fully understand why you would be concerned about
the performance of adding and removing individual characters. However,
in that case *you're focused on the wrong problem*.
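
As a rough sketch of what focusing on the right problem might look like, assuming a toy editor buffer (the names here are illustrative only): keep the characters in a mutable list, so that typing and backspacing never copy the whole text, and build the immutable string once when it is actually needed.

```python
# A mutable buffer: appending and removing characters does not copy the text.
buffer = list("a" * 10)

buffer.append("\u20ac")  # type a euro sign
buffer.pop()             # backspace over it
buffer.append("z")       # type 'z' instead

# Build the immutable str once, when the text is actually needed.
text = "".join(buffer)
print(text)
```

A real editor would more likely use a gap buffer or a rope, but even a plain list avoids the copy-per-keystroke pattern.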

Until you can demonstrate an understanding that doing the above in any
language which has immutable strings is completely insane you will have
no credibility and the only interest anyone will pay to your posts is
refuting your FUD so that people new to the language are not driven off
by you.

Tim Delaney


Re: Dihedral

2013-07-16 Thread Tim Delaney
On 16 July 2013 08:59, Chris Angelico ros...@gmail.com wrote:

 On Tue, Jul 16, 2013 at 8:54 AM, Fábio Santos fabiosantos...@gmail.com
 wrote:
 
  On 07/15/2013 08:36 AM, Steven D'Aprano wrote:
 
  Devyn,
 
  8 Dihedral is our resident bot, not a human being. Nobody knows who
  controls it, and why they are running it, but we are pretty certain that
  it is a bot responding mechanically to keywords in people's posts.
 
  It's a very clever bot, but still a bot. About one post in four is
  meaningless jargon, the other three are relevant enough to fool people
  into thinking that maybe it is a human being. It had me fooled for a long
  time.
 
 
  Does this mean he passes the Turing test?

 Yes, absolutely. The original Turing test was defined in terms of five
 minutes of analysis, and Dihedral and jmf have clearly been
 indistinguishably human across that approximate period.


The big difference between them is that the jmfbot does not appear to
evolve its routines in response to external sources - it seems to be stuck
in a closed feedback loop.

Tim Delaney


Re: RE Module Performance

2013-07-13 Thread Tim Delaney
On 13 July 2013 09:16, MRAB pyt...@mrabarnett.plus.com wrote:

 On 12/07/2013 23:16, Tim Delaney wrote:

 On 13 July 2013 03:58, Devyn Collier Johnson devyncjohn...@gmail.com wrote:


 Thanks for the thorough response. I learned a lot. You should write
 articles on Python.
 I plan to spend some time optimizing the re.py module for Unix
 systems. I would love to amp up my programs that use that module.


 If you are finding that regular expressions are taking too much time,
 have a look at the https://pypi.python.org/pypi/re2/ and
 https://pypi.python.org/pypi/regex/2013-06-26 modules to see if they
 already give you enough of a speedup.

  FYI, you're better off going to http://pypi.python.org/pypi/regex
 because that will take you to the latest version.


Absolutely - what was I thinking?

Tim Delaney


Re: RE Module Performance

2013-07-12 Thread Tim Delaney
On 13 July 2013 03:58, Devyn Collier Johnson devyncjohn...@gmail.com wrote:


 Thanks for the thorough response. I learned a lot. You should write
 articles on Python.
 I plan to spend some time optimizing the re.py module for Unix systems. I
 would love to amp up my programs that use that module.


If you are finding that regular expressions are taking too much time, have
a look at the https://pypi.python.org/pypi/re2/ and
https://pypi.python.org/pypi/regex/2013-06-26 modules to see if they
already give you enough of a speedup.
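
One rough way to check is sketched below. It assumes the third-party regex package may or may not be installed, and simply skips it if the import fails; the pattern and sample text are made up for illustration, so measure with your own patterns before drawing conclusions.

```python
import re
import timeit

try:
    import regex  # third-party package; may not be installed
except ImportError:
    regex = None

# Hypothetical pattern and input, purely for illustration.
PATTERN = r"\b\d{3}-\d{4}\b"
TEXT = "call 555-1212 or 555-3434 " * 1000


def time_engine(module, repeats=5):
    """Best-of-N time for 10 findall() passes using the given regex module."""
    compiled = module.compile(PATTERN)
    return min(timeit.repeat(lambda: compiled.findall(TEXT),
                             number=10, repeat=repeats))


results = {"re": time_engine(re)}
if regex is not None:
    results["regex"] = time_engine(regex)

for name, seconds in results.items():
    print(name, seconds)
```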

Tim Delaney


Re: Version Control Software

2013-06-15 Thread Tim Delaney
On 16 June 2013 01:29, Giorgos Tzampanakis giorgos.tzampana...@gmail.com wrote:

 On 2013-06-15, Roy Smith wrote:

 Also, is working without connection to the server such big an issue? One
 would expect that losing access to the central server would indicate
 significant problems that would impact development anyway.


I work almost 100% remotely (I chose to move back to a country town). Most
of the time I have a good internet connection. But sometimes my clients are
in other countries (I'm in Australia, my current client is in the US) and
the VPN is slow or doesn't work (heatwaves have taken down their systems a
few times). Sometimes I'm on a train going to Sydney and mobile internet is
pretty patchy much of the way. Sometimes my internet connection dies - we
had a case where someone put a backhoe through the backhaul and my backup
mobile internet was also useless.

But so long as at some point I can sync the repositories, I can work away
(on things that are not dependent on something new from upstream).

Tim Delaney


Re: Version Control Software

2013-06-14 Thread Tim Delaney
On 15 June 2013 06:55, Dave Angel da...@davea.name wrote:

 On 06/14/2013 10:24 AM, Grant Edwards wrote:

 On 2013-06-14, Roy Smith r...@panix.com wrote:

  All that being said, it is, as Anssi points out, a horrible, bloated,
 overpriced, complicated mess which requires teams of specially
 trained ClearCase admins to run.  In other words, it's exactly the
 sort of thing big, stupid, Fortune-500 companies buy because the IBM
 salesperson plays golf with the CIO.


 Years ago, I worked at one largish company where a couple of the
 embedded development projects used ClearCase.  The rest of us used CVS
 or RCS or some other cheap commercial systems.  Judging by those
 results, ClearCase requires a full-time administrator for every 10 or
 so users.  The other systems seemed to require almost no regular
 administration, and what was required was handled by the developers
 themselves (maybe a couple hours per month).  The cost of ClearCase
 was also sky-high.


 if I remember rightly, it was about two-thousand dollars per seat.  And
 the people I saw using it were using XCOPY to copy the stuff they needed
 onto their local drives, then disabling the ClearCase service so they could
 get some real work done.  Compiles were about 10x slower with the service
 active.


I can absolutely confirm how much ClearCase slows things down. I completely
refused to use dynamic views for several reasons - #1 being that if you
lost your network connection you couldn't work at all, and #2 being how
slow they were. Static views were slightly better as you could at least
hijack files in that situation and keep working (and then be very very
careful when you were back online).

And then of course there was ClearCase Remote Client. I was working from
home much of the time, so I got to use CCRC. It worked kinda well enough,
and in that situation was much better than the native client. Don't ever
ever try to use ClearCase native over a non-LAN connection. I can't stress
this enough. The ClearCase protocol is unbelievably noisy, even if using
static views.

CCRC did have one major advantage over the native client though. I had the
fun task when I moved my local team from CC to Mercurial of keeping the
Mercurial and CC clients in sync. Turns out that CCRC was the best option,
as I was able to parse its local state files and work out what timestamp
ClearCase thought its files should be, set it appropriately from a
Mercurial extension and convince CCRC that really, only these files have
changed, not the thousand or so that just had their timestamp changed ...
CCRC at least made that possible, even if it was a complete accident by the
CCRC developers.

Tim Delaney


Re: Changing filenames from Greeklish => Greek (subprocess complain)

2013-06-02 Thread Tim Delaney

 A programmer chooses his own clients, and you are the Atherton Wing to
 my Inara Serra.


I've just been watching this train wreck (so glad I didn't get involved at
the start) but I have to say - that's brilliant Chris. Thank you for
starting my week off so well.

Tim Delaney


Re: Changing filenames from Greeklish => Greek (subprocess complain)

2013-06-02 Thread Tim Delaney
On 3 June 2013 09:10, Tim Delaney timothy.c.dela...@gmail.com wrote:

 A programmer chooses his own clients, and you are the Atherton Wing to
 my Inara Serra.


 I've just been watching this train wreck (so glad I didn't get involved at
 the start) but I have to say - that's brilliant Chris. Thank you for
 starting my week off so well.


And I just realised I missed a shiny opportunity to wrangle train job in
there ...

Tim Delaney


Re: PyWart: The problem with print

2013-06-02 Thread Tim Delaney
On 3 June 2013 13:23, Jason Swails jason.swa...@gmail.com wrote:

 Yea, I've only run into Heisenbugs with Fortran or C/C++.  Every time I've
 seen one it's been due to an uninitialized variable somewhere -- something
 valgrind is quite good at pinpointing.  (And yes, a good portion of our
 code is -still- in Fortran -- but at least it's F90+ :).


With the increase in use of higher-level languages, these days Heisenbugs
most often appear with multithreaded code that doesn't properly protect
critical sections, but as you say, with lower-level languages uninitialised
memory is a common source of them.

I had a fun one once (in C++, but could have happened in any language)
where a semaphore was being acquired twice on the one thread. There were 10
semaphore slots available, and very occasionally the timings would result
in one of the threads deadlocking. Fortunately, by reducing to a single
thread + single semaphore slot I was able to turn it from a Heisenbug to a
100% replicable bug.
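
The effect of shrinking the number of slots can be sketched in Python (this is an illustration, not the original C++ code). Non-blocking acquires are used so the sketch reports the problem instead of hanging: with ten slots the double acquire usually goes unnoticed, while with one slot it fails every time.

```python
import threading


def double_acquire(slots):
    """Acquire the same semaphore twice on one thread, without blocking."""
    sem = threading.Semaphore(slots)
    first = sem.acquire(blocking=False)
    # A blocking acquire here would hang forever when slots == 1.
    second = sem.acquire(blocking=False)
    # Release whatever was actually taken.
    for got in (first, second):
        if got:
            sem.release()
    return first, second


many_slots = double_acquire(10)
one_slot = double_acquire(1)
print(many_slots, one_slot)
```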

Tim Delaney


Re: is operator versus id() function

2013-04-05 Thread Tim Delaney
On 6 April 2013 03:40, candide c.cand...@laposte.net wrote:

 Le vendredi 5 avril 2013 16:53:55 UTC+2, Arnaud Delobelle a écrit :


 
  You've fallen victim to the fact that CPython is very quick to collect
  garbage.


 OK, I get it but it's a fairly unexpected behavior.
 Thanks for the demonstrative snippet of code and the instructive answer.


If you read the docs for id() 
http://docs.python.org/3.3/library/functions.html#id, you will see that it
says:

Return the identity of an object. This is an integer which is guaranteed
to be unique and constant for this object during its lifetime. Two objects
with non-overlapping lifetimes may have the same id() value.

If you think it could explain things better, please submit a doc bug.

I think part of your confusion here is that bound methods in Python are
created when accessed. So A.f and a.f are not the same object - one is a
function (an unbound method, but there's no distinction in Python 3.x) and
the other is a bound method. For that reason, accessing a.f twice will
return two different bound method instances.

Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:57:17) [MSC v.1600 64
bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> class A(object):
...     def f(self):
...         print("A")
...
>>> a=A()
>>> print(id(a.f) == id(a.f), a.f is a.f)
True False
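
A small sketch of the same point that does not depend on id() reuse, assuming Python 3: holding both bound methods alive at once shows they are distinct objects that wrap the same underlying function.

```python
class A:
    def f(self):
        return "A"


a = A()

# Keep both bound methods alive so their lifetimes overlap.
m1 = a.f
m2 = a.f

distinct = m1 is not m2                 # two separate bound method objects
equal = m1 == m2                        # same instance and same function
same_function = m1.__func__ is A.f and m2.__func__ is A.f

print(distinct, equal, same_function)
```

Because m1 and m2 exist at the same time, their id() values are guaranteed to differ, unlike the id(a.f) == id(a.f) case where the first object may be collected before the second is created.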


Tim Delaney


Re: monty python

2013-03-20 Thread Tim Delaney
On 21 March 2013 06:40, jmfauth wxjmfa...@gmail.com wrote:

 
 [snip usual rant from jmf]


Franz, please pay no attention to jmf. He has become obsessed with a single
small regression in Python 3.3 in performance with how strings perform in a
very small domain that rarely shows up in practice (although as he has
demonstrated, it is easy to create a microbenchmark that makes it appear to
be much worse than it is).

The regression is a consequence of the decision in Python 3.3 to
*correctly* support the full range of Unicode characters whilst also
reducing the required memory where possible. In the vast majority of cases
this is a performance *improvement*. It is only optimised for the ascii
user in the sense that in the Unicode standard the pre-existing ASCII
characters only require 1 byte per code point and hence can be stored in
less memory than most other Unicode code points. The possible character
widths are 1, 2 and 4 bytes.

The actual regression occurs when concatenating/replacing/etc a character
to a string that is wider than any other character currently in the string.
In this situation the new string needs to be widened (increase the number
of bytes used by every character) which is a much more expensive operation
than simply creating a new string (which is what would happen if the
character was the same size or smaller).
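
The three widths can be observed with a quick sketch, assuming CPython 3.3 or later (sys.getsizeof reports implementation-specific sizes, so other implementations may differ). The helper name is made up for this example.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per character from the size difference of two strings."""
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n


widths = {
    "ASCII 'a'": bytes_per_char("a"),
    "euro sign": bytes_per_char("\u20ac"),
    "U+1F600": bytes_per_char("\U0001F600"),
}

for label, width in widths.items():
    print(label, width)
```

Taking the difference between two string lengths cancels out the fixed per-object header, leaving only the per-character cost.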

It has been acknowledged as a real regression, but he keeps hijacking every
thread where strings are mentioned to harp on about it. He has shown no
inclination to attempt to *fix* the regression and is rapidly coming to be
regarded as a troll by most participants in this list.

Tim Delaney


Re: Instances as dictionary key, __hash__ and __eq__

2013-02-18 Thread Tim Delaney
On 19 February 2013 06:51, Jean-Michel Pichavant jeanmic...@sequans.com wrote:

 Greetings,

 I opened something like a month ago a thread about hash functions and how
 I could write classes whose instances can be safely used as dictionary keys.
 I thought I had it but when I read back my code, I think I wrote yet
 another bug.

 Consider the following simple (buggy) class, python 2.5

 class FooSet(object):
     """Define an algorithm set, containing pdcch/pdsch (or none)."""
     def __init__(self, pdcch, pdsch):
         self.pdcch = bool(pdcch)
         self.pdsch = bool(pdsch)
     # __hash__ and __eq__ allow to use the object as a dictionary key
     def __hash__(self):
         return hash((self.pdcch, self.pdsch))
     def __eq__(self, other):
         return hash(self) == hash(other)

 Can you confirm that using the hash function for testing equality is a
 very bad idea ?


Yes - it is a *very* bad idea. A hash by definition can produce collisions,
since you are taking a much larger amount of data and are trying to represent
it in a smaller amount of space. It's effectively lossy compression - you
can never reliably get the original back.
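
A concrete collision, as a sketch that relies on a CPython implementation detail (hash(-1) and hash(-2) are both -2 there): a class whose __eq__ compares hashes will report two different values as equal. The ByHash class is hypothetical and exists only to show the failure.

```python
a, b = -1, -2

same_hash = hash(a) == hash(b)  # True on CPython: both hash to -2
equal = a == b                  # False: they are different values


class ByHash:
    """Deliberately wrong: uses the hash to decide equality."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        return hash(self) == hash(other)


wrongly_equal = ByHash(-1) == ByHash(-2)
print(same_hash, equal, wrongly_equal)
```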


 One obvious solution would be:

 def __eq__(self, other):
     return self.pdsch == other.pdsch and self.pdcch == other.pdcch


This is correct, and it is the simplest way to do it.


 But I was looking for a standard solution, that I could use for
 basically all my container classes

 So I came up with these ones:

 def __hash__(self):
     return hash(tuple(vars(self).values()))
 def __eq__(self, other):
     return vars(self) == vars(other)

 But I'm not sure about vars(self).values(), I don't really care about the
 order of the values, but I need to be sure that for 2 equal dictionaries,
 they will both return their values in the same order.
 And that's the point, I'm not sure at all.


You cannot rely on this. Dictionaries are unordered, and the order that
items are added affects the order that the elements will be iterated over.
You could sort the vars by name (thus giving the stable order you need) but
there's another flaw - vars() contains more than just the attributes you
set.

>>> class A():
...     pass
...
>>> vars(A)
mappingproxy({'__qualname__': 'A', '__dict__': <attribute '__dict__' of 'A'
objects>, '__module__':
'__main__', '__weakref__': <attribute '__weakref__' of 'A' objects>,
'__doc__': None})

So by using vars you are preventing instances of subclasses of your class
from comparing equal to each other (or to instances of the base class).

 Additionally, if I'm making things much more complicated than they need to
 be, let me know.


You are. There are ways to achieve what you want, but it requires a lot
more setup and discipline. The simplest way is probably to have a
_equal_fields() method that subclasses override, returning a tuple of the
attributes that should be hashed. Then in __hash__() and __eq__ you iterate
over the returned tuple, get the value for each attribute and either hash
or compare.

Of course, you have to take into account in __eq__ that the other instance
may not have the same attributes (e.g. self is a subclass that uses extra
attributes in its __hash__ and __eq__).
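
A minimal sketch of that approach follows. The base class name FieldEquality and the helper _field_values() are hypothetical, and treating instances with different field lists as unequal is just one possible policy for the subclass problem mentioned above.

```python
class FieldEquality:
    """Hypothetical base class: equality and hashing driven by _equal_fields()."""

    def _equal_fields(self):
        # Subclasses override this to name the attributes that matter.
        return ()

    def _field_values(self):
        return tuple(getattr(self, name) for name in self._equal_fields())

    def __hash__(self):
        return hash(self._field_values())

    def __eq__(self, other):
        if not isinstance(other, FieldEquality):
            return NotImplemented
        # Instances that compare different sets of attributes are never equal.
        if self._equal_fields() != other._equal_fields():
            return False
        return self._field_values() == other._field_values()


class FooSet(FieldEquality):
    def __init__(self, pdcch, pdsch):
        self.pdcch = bool(pdcch)
        self.pdsch = bool(pdsch)

    def _equal_fields(self):
        return ("pdcch", "pdsch")


lookup = {FooSet(True, False): "found"}
print(lookup[FooSet(1, 0)])
```

The attributes named in _equal_fields() must not change while an instance is in use as a dictionary key, otherwise its hash changes and the entry can no longer be found.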

Tim Delaney


Re: PyWart (Terminolgy): Class

2013-01-14 Thread Tim Delaney
On 15 January 2013 07:57, Chris Angelico ros...@gmail.com wrote:


 Oh, and Dennis? Mal. Bad. From the Latin. :)


I was about to point out the same thing, using the same quote ;)

Tim Delaney


Re: String manipulation in python..NEED HELP!!!!

2012-12-11 Thread Tim Delaney
On 12 December 2012 07:52, Ross Ridge rri...@csclub.uwaterloo.ca wrote:

 John Gordon wrote:
  def encode(plain):
      '''Return a substituted version of the plain text.'''
      encoded = ''
      for ch in plain:
          encoded += key[alpha.index(ch)]
      return encoded

 Terry Reedy  tjre...@udel.edu wrote:
 This turns an O(n) problem into a slow O(n*n) solution. Much better to
 build a list of chars and then join them.

 There have been much better suggestions in this thread, but John Gordon's
 code above is faster than the equivalent list and join implementation
 with Python 2.6 and Python 3.1 (the newest versions I have handy).
 CPython optimized this case of string concatenation into O(n) back in
 Python 2.4.


From What's New in Python 2.4:
http://docs.python.org/release/2.4.4/whatsnew/node12.html#SECTION000121

String concatenations in statements of the form s = s + "abc" and s +=
"abc" are now performed more efficiently *in certain circumstances*. This
optimization *won't be present in other Python implementations such as
Jython*, so you shouldn't rely on it; using the join() method of strings is
still recommended when you want to efficiently glue a large number of
strings together.

Emphasis mine.

The optimisation was added to improve the situation for programs that were
already using the anti-pattern of string concatenation, not to encourage
people to use it.

As a real-world case, a bug was recently found in Mercurial where an
operation on Windows was taking orders of magnitude longer than on
Linux due to use of string concatenation rather than the join idiom (from
~12 seconds spent on string concatenation to effectively zero).
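
For reference, the join idiom applied to the encode() function quoted above might look like the sketch below. The alpha and key values are assumptions (a ROT13-style substitution), since the original definitions are not shown in this message.

```python
import string

# Assumed substitution tables; the thread's actual alpha/key are not shown.
alpha = string.ascii_lowercase
key = alpha[13:] + alpha[:13]


def encode(plain):
    '''Return a substituted version of the plain text.'''
    # Collect the substituted characters and join them once at the end.
    return ''.join(key[alpha.index(ch)] for ch in plain)


encoded = encode("hello")
print(encoded)
```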

Tim Delaney


Re: Python does not take up available physical memory

2012-10-21 Thread Tim Delaney
On 22 October 2012 01:14, Pradipto Banerjee 
pradipto.baner...@adainvestments.com wrote:

 I tried this on a different PC with 12 GB RAM. As expected, this time,
 reading the data was no issue. I noticed that for large files, Python takes
 up 2.5x size in memory compared to size on disk, for the case when each
 line in the file is retained as a string within a Python list. As an
 anecdote, for MATLAB, the similar overhead is 2x, slightly lower than
 Python, and each line in the file was retained as string within a MATLAB
 cell. I'm curious, has any one compared the overhead of data in memory for
 other languages like for instance Ruby?


What version of Python were you using? 2.7? 3.2? 3.3?

If you can, try running the same program in Python 3.3 and compare the
amount of memory used and report it here. It sounds like this might be a
case that would greatly benefit from the new string representation in 3.3.

If you're using Python 3.x then the bytes and bytearray types might be
of interest to you:
http://docs.python.org/py3k/library/stdtypes.html#binary-sequence-types-bytes-bytearray-memoryview

Alternatively, the array type might be useful:
http://docs.python.org/py3k/library/array.html

As to the core problem, I can only echo what others have said - only hold
in memory what you absolutely have to. There are various techniques to
avoid holding unnecessary data in memory that have been mentioned. One I
haven't seen here yet (I may have missed it) is dumping the data into a
database of some form and using its capabilities.
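
As a minimal sketch of the database idea, using the sqlite3 module from the standard library: the table layout and the three sample lines are invented for illustration, and a real run would read the lines from the file and connect to a file path rather than ":memory:".

```python
import sqlite3

# Stand-in for lines read one at a time from a large file.
lines = ["alpha,1", "beta,2", "gamma,3"]

# ":memory:" keeps the demo self-contained; use a file path for real data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rows (name TEXT, value INTEGER)")

# The generator feeds rows to SQLite without building a list first.
conn.executemany(
    "INSERT INTO rows VALUES (?, ?)",
    (line.split(",") for line in lines),
)

total = conn.execute("SELECT SUM(value) FROM rows").fetchone()[0]
conn.close()
print(total)
```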

Tim Delaney


Re: Aggressive language on python-list

2012-10-13 Thread Tim Delaney
On 14 October 2012 08:22, Roel Schroeven r...@roelschroeven.net wrote:

 Zero Piraeus wrote:

  :

 Not sure exactly how to put this ...

 I'm a mostly passive subscriber to this list - my posts here over the
 years could probably be counted without having to take my socks off -
 so perhaps I have no right to comment, but I've noticed a marked
 increase in aggressive language here lately, so I'm putting my head
 above the parapet to say that I don't appreciate it.


 Same here. I've been lurking here for a number of years, and I've always
 regarded this list as an example of friendly civilized behavior, quite
 exceptional on the Internet. I also have the impression that situation is
 changing for the worse, and it worries me too.


If everyone *plonks* the jerks/trolls/bots/etc and no one responds to them,
they won't have an audience and will either go away; act out more (but no
one will see it); or reform and become a useful member of the group
(probably needing to change email addresses to be un-*plonked*).

The problem is mainly when people respond to them. That's what they want -
it gives them an audience. No matter how much you want to *just this once*
respond to one, resist the urge. And if you can't prevent yourself from
replying to someone who has quoted one in order to tell them that the
person is a known troll/bot, tell them privately, not on the list.

Cheers,

Tim Delaney


Re: Aggressive language on python-list

2012-10-13 Thread Tim Delaney

 A response to someone who quotes a trollbot just stating *Username* is a
 trollbot. where *no* further correspondence occurs doesn't seem like
 trollbotbait to me, and it makes it easy for people to know who's been
 warned.


If properly trimmed, so there is no reference to the troll/bot or any text
from the troll/bot - fine. But any reference to the original will make it
harder for those of us who use bayesian-based spam filtering.

Tim Delaney


Re: write a regex matches 800-555-1212, 555-1212, and also (800) 555-1212.

2012-09-29 Thread Tim Delaney
On 29 September 2012 20:05, Chris Angelico ros...@gmail.com wrote:

 On Sat, Sep 29, 2012 at 7:38 PM, Mark Lawrence breamore...@yahoo.co.uk
 wrote:
 
  My understanding is that Python 3.3 has regressed the performance of ''.
  Surely the Python devs can speed the performance back up and, just for us,
  use less memory at the same time?

 Yes, but to do that we'd have to make Python more Australia-focused
 instead of US-centric. As of Python 3.4, the empty string will be
 lazily evaluated and be delimited by redback spiders instead of
 quotes. That will give a 25% speed and 50% memory usage improvement,
 but you'll need to be careful you don't get bitten


Look - the worst that will happen is nausea and painful swelling and maybe
death if you're a very young child.

Personally I voted for the Fierce Snake[1][2] as the delimiter, but it was
voted down as not Pythonic enough.
I'm sure they were using that as a euphemism for Python*ish* though.

[1] https://en.wikipedia.org/wiki/Inland_Taipan
[2] It's so pretty:
https://upload.wikimedia.org/wikipedia/commons/f/fe/Fierce_Snake-Oxyuranus_microlepidotus.jpg

Tim Delaney


Re: write a regex matches 800-555-1212, 555-1212, and also (800) 555-1212.

2012-09-29 Thread Tim Delaney
On 30 September 2012 09:26, Chris Angelico ros...@gmail.com wrote:

 On Sun, Sep 30, 2012 at 6:51 AM, Tim Delaney
 timothy.c.dela...@gmail.com wrote:
  Personally I voted for the Fierce Snake[1][2] as the delimiter, but it was
  voted down as not Pythonic enough.
  I'm sure they were using that as a euphemism for Python*ish* though.
 
  [1] https://en.wikipedia.org/wiki/Inland_Taipan
  [2] It's so pretty:
  https://upload.wikimedia.org/wikipedia/commons/f/fe/Fierce_Snake-Oxyuranus_microlepidotus.jpg

 A tempting idea, but it's rather a large delimiter. We should reserve
 that for multi-line strings, I think. Although you may have a problem
 with i18n; when you take your code to the southern hemisphere, the
 snake will be facing the other way, so what you thought was an
 open-quote marker is now a close-quote marker instead. Could get
 awkward for naive coders


You seem to have that backwards. With the Oz-centric focus, it's taking
code to the northern hemisphere that's the problem.

Tim Delaney


Re: The opener parameter of Python 3 open() built-in

2012-09-06 Thread Tim Delaney
On 6 September 2012 16:34, Steven D'Aprano 
steve+comp.lang.pyt...@pearwood.info wrote:

 On Thu, 06 Sep 2012 00:34:56 +, Antoine Pitrou wrote:
  Monkey-patching globals is not thread-safe: other threads will see your
  modification, which is risky and fragile.

 Isn't that assuming that you don't intend the other threads to see the
 modification?

 If I have two functions in my module that call open, and I monkey-patch
 the global (module-level) name open to intercept that call, I don't see
 that there is more risk of breakage just because one function is called
 from a thread.

 Obviously monkey-patching the builtin module itself is much riskier,
 because it doesn't just affect code in my module, it affects *everything*.


It's not as though the option to monkey-patch has been taken away. But
hopefully there is now less of a need for it.
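
A small sketch of the opener parameter in use, assuming Python 3.3 or later. The logging_opener function and the file name are made up for the example; the opener is called with the path and flags and must return an open file descriptor.

```python
import os
import tempfile

opened = []


def logging_opener(path, flags):
    """Record each path, then open it the usual way with restrictive permissions."""
    opened.append(path)
    return os.open(path, flags, 0o600)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "demo.txt")

    # Only these two calls are intercepted; the global open() is untouched.
    with open(target, "w", opener=logging_opener) as f:
        f.write("hello")
    with open(target, opener=logging_opener) as f:
        content = f.read()

print(content, len(opened))
```

Because the opener is passed per call, other code that uses open() sees no change, which avoids the thread-safety concern raised about patching globals.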

Tim Delaney


[OT] git and hg in prompt (was: My first ever Python program, comments welcome)

2012-07-24 Thread Tim Delaney
On 24 July 2012 21:34, Lipska the Kat lipskathe...@yahoo.co.uk wrote:

 On 24/07/12 06:13, rusi wrote:

 On Jul 22, 10:23 pm, Lipska the Kat lip...@lipskathekat.com wrote:

  Heh heh, Nothing to do with Eclipse, just another thing to get my head
 around. For work and Java IMHO you can't beat eclipse...
 at the moment I'm getting my head around git,


 Bumped into this yesterday. Seems like a good aid to git-comprehension
 https://github.com/git/git/blob/master/contrib/completion/git-prompt.sh


 eek ... now that's a shell script to be proud of isn't it .. and it works
 [lipska@ubuntu fileio (master)]$ impressive. Good find, thanks.


OT, but I have the following in my prompt covering both git and hg - much
simpler, but gives similar information. Extending should be fairly easy.

function hg_ps1() {
    # requires http://stevelosh.com/projects/hg-prompt/
    #hg prompt '[{update}{{tags|:}:}{{bookmark}:}{branch}:{rev}] ' 2> /dev/null
    hg prompt '[{update}{branch}:{rev}] ' 2> /dev/null
}

function git_ps1() {
    local head=`git rev-list -n 1 --abbrev-commit HEAD 2> /dev/null`

    if [ "${head}" != "" ] ; then
        local branch=`git branch --no-color 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/\1/'`

        # Puts a ^ in the prompt if this revision is not FETCH_HEAD
        local uptodate=`git log --no-color -n 1 --pretty=format:^ HEAD..FETCH_HEAD 2> /dev/null`

        # Puts a comparison with the remote tracking branch in the prompt:
        # + (ahead), - (behind) or * (both - diverged).
        local tracking=`git branch -avv --no-color 2> /dev/null | sed -e '/^[^*]/d' -e 's/  */ /g' -e 's/* \(.*\)/\1/' -e 's/^[^[]*\[\([^]]*\)\].*$/\1/' -e 's/^.*ahead [0-9][0-9]*/+/' -e 's/[^+].*behind [0-9][0-9]*.*$/-/' -e '/^[^+-]/d' -e 's/+-/*/'`

        echo "[${tracking}${uptodate}${branch}:${head}] "
        return 0
    fi

    return 1
}

function git_hg_ps1() {
    git_ps1

    if [ $? -eq 0 ] ; then
        return 0
    fi

    hg_ps1
    return $?
}

export PS1='$(git_hg_ps1)\[\033[1;30m\]${USERNAME}@${HOSTNAME}:\[\033[0m\]\[\033[1;30m\]${PWD%%${PWD##$HOME}}\[\033[0m\]${PWD##$HOME}
'

It's designed to call as few external processes as possible (esp. when not
in a git repository) since it's used on Windows as well (msys, should work
in cygwin) and spawning on Windows is slow.

Tim Delaney


Re: cPython, IronPython, Jython, and PyPy (Oh my!)

2012-05-16 Thread Tim Delaney
On 17 May 2012 07:33, Ethan Furman et...@stoneleaf.us wrote:

 Just hit a snag:

 In cPython the deterministic garbage collection allows me a particular
 optimization when retrieving records from a dbf file -- namely, by using
 weakrefs I can tell if the record is still in memory and active, and if so
 not hit the disk to get the data;  with PyPy (and probably the others) this
 doesn't work because the record may still be around even when it is no
 longer active because it hasn't been garbage collected yet.


What is the distinguishing feature of an active record? What is the
problem if you get back a reference to an inactive record? And if there is
indeed a problem, don't you already have a race condition on CPython?

1. Record is active;
2. Get reference to record through weak ref;
3. Record becomes inactive;
4. Start trying to use the (now inactive) record.

Tim Delaney


Re: cPython, IronPython, Jython, and PyPy (Oh my!)

2012-05-16 Thread Tim Delaney
On 17 May 2012 11:13, Chris Angelico ros...@gmail.com wrote:

 On Thu, May 17, 2012 at 9:01 AM, Ethan Furman et...@stoneleaf.us wrote:
  A record is an interesting critter -- it is given life either from the user
  or from the disk-bound data;  its fields can then change, but those changes
  are not reflected on disk until .write_record() is called;  I do this
  because I am frequently moving data from one table to another, making
  changes to the old record contents before creating the new record with the
  changes -- since I do not call .write_record() on the old record those
  changes do not get backed up to disk.

 I strongly recommend being more explicit about usage and when it gets
 written and re-read, rather than relying on garbage collection.
 Databasing should not be tied to a language's garbage collection.
 Imagine you were to reimplement the equivalent logic in some other
 language - could you describe it clearly? If so, then that's your
 algorithm. If not, you have a problem.


Agreed. To me, this sounds like a perfect case for with: blocks and
explicit reference counting.  Something like (pseudo-python - not runnable):

class Record:
    def __init__(self):
        self.refs = 0
        self.lock = threading.Lock()

    def __enter__(self):
        with self.lock:
            self.refs += 1

    def __exit__(self, exc_type, exc_value, traceback):
        with self.lock:
            self.refs -= 1

            if self.refs == 0:
                self.write_record()

    # ... rest of Record class ...

rec = record_weakrefs.get('record_name')

if rec is None:
    rec = load_record()
    record_weakrefs.put('record_name', rec)

with rec:
    do_stuff
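
A runnable sketch of the same idea follows, with several assumptions: write_record() just counts calls instead of touching a disk, get_record() is a made-up helper, and weakref.WeakValueDictionary stands in for whatever cache the real library keeps.

```python
import threading
import weakref


class Record:
    def __init__(self, name):
        self.name = name
        self.refs = 0
        self.lock = threading.Lock()
        self.writes = 0  # stands in for actual disk writes

    def write_record(self):
        self.writes += 1

    def __enter__(self):
        with self.lock:
            self.refs += 1
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        with self.lock:
            self.refs -= 1
            # Flush only when the last user of the record is done with it.
            if self.refs == 0:
                self.write_record()
        return False


record_cache = weakref.WeakValueDictionary()


def get_record(name):
    """Return the cached record if it is still alive, else create one."""
    rec = record_cache.get(name)
    if rec is None:
        rec = Record(name)  # a real implementation would load from disk
        record_cache[name] = rec
    return rec


rec = get_record('record_name')
with rec:
    with get_record('record_name') as same:
        nested_is_same = same is rec

print(nested_is_same, rec.refs, rec.writes)
```

Whether a record is "active" is now decided by the explicit count rather than by when the garbage collector happens to run, so the behaviour is the same on CPython, PyPy and the others.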

Tim Delaney
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: why () is () and [] is [] work in other way?

2012-04-23 Thread Tim Delaney
On 24 April 2012 06:40, Devin Jeanpierre jeanpierr...@gmail.com wrote:

 On Mon, Apr 23, 2012 at 4:27 PM, Devin Jeanpierre
 jeanpierr...@gmail.com wrote:
  Well, no. Immutable objects could always compare equal, for example.
  This is more expensive though. "is" as-it-stands is very quick to
  execute, which is probably attractive to some people (especially for
  its use in detecting special constants).

 I don't know what made me write that so wrong. I meant immutable
 objects that are equal could always compare the same via is.


And doing that would make zero sense, because it directly contradicts the
whole *point* of "is". The point of "is" is to tell you whether or not two
references are to the same object. This is a *useful* property.
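
To make the distinction concrete, here is a minimal sketch (the variable
names are only for illustration). == asks whether two objects have equal
values; "is" asks whether two references point at the very same object.

```python
# Two separate list objects with equal contents.
a = [1, 2, 3]
b = [1, 2, 3]
c = a          # a second reference to the *same* list as a

print(a == b)  # compares values
print(a is b)  # compares identity: a and b are distinct objects
print(a is c)  # a and c are one object reached through two names

# Because a and c are the same object, a change made through one name
# is visible through the other; b is a different object and is unaffected.
c.append(4)
print(a, b)
```

Knowing that two names refer to one object is exactly what tells you the
append above will show up under both names.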

I'll leave aside the question of how you determine if an object is
immutable, and restrict the discussion to a few built-in types that are
known to be immutable.

If two objects are not the same object, then lying and saying they are
would remove the opportunity for various programming techniques, such as
interning. Of course, you could say that all immutable objects should be
interned automatically. There are a couple of problems with this that I can
think of off the top of my head.

The first problem is memory. If every immutable object is interned then
welcome to the world of ever-expanding memory usage. Ah - but Python has
got around this for interned strings! They're ejected from the intern cache
when there are no more references. Surely we could do the same for integers
and other immutables?

That brings us to performance. You do not want computations involving
immutable objects to suffer severe performance degradation just to make
equal immutable objects have the same identity. But if every single step of
a numerical calculation involved the following sequence of possible steps,
that's exactly what you would be doing:

1. Calculate result;

2. Look up result in integer intern cache (involves hash() and ==);
   - unavoidable

3. Add result to integer intern cache (involves hash() and ==, and maybe
   resizing the cache);
   - necessary if your result is not currently referenced anywhere else in
     the Python VM

4. Look up previous intermediate result in integer intern cache (involves
   hash() and ==);
   - necessary if you have a previous intermediate result

5. Eject previous intermediate result from integer intern cache (involves
   hash() and ==).
   - necessary if you have a previous intermediate result that is not
     currently referenced anywhere else in the Python VM

Now think of the Python implementation of any checksum algorithm. Nearly
every intermediate result (a reasonably-large hash) is not going to be used
anywhere else in the VM, and will require all 4 extra steps. Ouch.
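
A rough way to see the cost is to simulate it. The sketch below is a toy
model, not anything CPython actually does: InternCache, acquire, release
and the checksum function are all hypothetical names invented for this
illustration, and the cache simply counts how many dictionary operations
(each needing at least a hash()) the steps above would add.

```python
class InternCache:
    """Toy intern cache for ints: maps value -> reference count.

    Hypothetical illustration only; CPython has no such structure.
    """

    def __init__(self):
        self.refcounts = {}
        self.ops = 0  # dictionary operations, each costing at least a hash()

    def acquire(self, value):
        # Steps 2 and 3: look the result up, add it if it is missing.
        self.ops += 1
        if value not in self.refcounts:
            self.ops += 1
            self.refcounts[value] = 0
        self.refcounts[value] += 1
        return value

    def release(self, value):
        # Steps 4 and 5: look up the old intermediate, eject if unreferenced.
        self.ops += 1
        self.refcounts[value] -= 1
        if self.refcounts[value] == 0:
            self.ops += 1
            del self.refcounts[value]


def checksum(data, cache):
    """Simple multiplicative checksum that interns every intermediate."""
    total = cache.acquire(0)
    for byte in data:
        new_total = cache.acquire((total * 31 + byte) % (2**61 - 1))
        cache.release(total)
        total = new_total
    return total


cache = InternCache()
data = bytes(range(1, 101))
result = checksum(data, cache)
print(f"{len(data)} steps, {cache.ops} cache operations")
```

Whenever an intermediate value is new to the cache and the previous one is
no longer needed, the model charges four cache operations for that step on
top of the arithmetic itself.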

Instead, CPython makes the (sensible) choice to intern heavily-used
integers permanently - -5 through 256 inclusive, IIRC - and leaves the rest
up to the programmer.
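
You can poke at this yourself with a small sketch like the one below. It
relies on a CPython implementation detail, so the identity results may
differ on other implementations or versions; the variable names are only
for illustration.

```python
import platform

# Build the integers at run time so the compiler cannot share constants.
small_a, small_b = int("256"), int("256")
large_a, large_b = int("257"), int("257")

print(platform.python_implementation())
print("256:", small_a is small_b)  # CPython hands back its cached small int
print("257:", large_a is large_b)  # usually two separate objects on CPython

# Equality is the portable check and holds regardless of caching.
print(small_a == small_b, large_a == large_b)
```

The takeaway is that == is what you should use to compare numbers; whether
"is" happens to agree is up to the implementation.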

Strings are a different question. Unlike integers, where == is cheap, ==
for strings can be prohibitively expensive. Consider the case where, for
whatever reason, you create a 1GB string. Now imagine that creating or
deleting a reference to *any* string potentially involves calling == on the
1GB string. Ouch again.

Instead, CPython makes the (sensible) choice to automatically intern short
strings that look like names (in the Python sense) and leave everything
else up to the programmer. It's possible for the programmer to manually
intern their 1GB string, but they've then got to deal with the consequences
of doing so.
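
For the manual route, sys.intern is the standard-library call. Here is a
minimal sketch, using short strings rather than a 1GB one; the variable
names are only for illustration.

```python
import sys

# Build two equal strings at run time. They contain spaces, so they do not
# look like names and CPython does not intern them automatically.
words = ["not", "a", "name"]
s1 = " ".join(words)
s2 = " ".join(words)

print(s1 == s2)  # equal values
print(s1 is s2)  # typically False on CPython: two separate objects

# sys.intern returns the canonical copy for a given string value,
# so interning both yields one shared object.
i1 = sys.intern(s1)
i2 = sys.intern(s2)
print(i1 is i2)
```

Once both strings are interned, an identity check can stand in for ==, at
the price of a hash and possibly a comparison on the way into the cache.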

Tim Delaney
-- 
http://mail.python.org/mailman/listinfo/python-list

