Re: The future of Python immutability

2009-09-03 Thread Nigel Rantor


John Nagle wrote:

Python's concept of immutability is useful, but it could be more
general.

In the beginning, strings, tuples, and numbers were immutable, and
everything else was mutable.  That was simple enough.  But over time,
Python has acquired more immutable types - immutable sets and immutable
byte arrays.  Each of these is a special case.

Python doesn't have immutable objects as a general concept, but
it may be headed in that direction.  There was some fooling around
with an immutability API associated with NumPy back in 2007, but
that was removed.  As more immutable types are added, a more general
approach may be useful.

Suppose, for discussion purposes, we had general immutable objects.
Objects inherited from immutableobject instead of object would be
unchangeable once __init__ had returned.  Where does this take us?

Immutability is interesting for threaded programs, because
immutable objects can be shared without risk.  Consider a programming
model where objects shared between threads must be either immutable or
synchronized in the sense that Java uses the term.
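
[For concreteness: the hypothetical immutableobject idea can be sketched in today's Python by blocking attribute writes once __init__ has finished. The class below is purely illustrative, not a real or proposed API.]

```python
class Immutable:
    """Instances become read-only once __init__ has returned."""

    def __init__(self, **fields):
        # bypass our own __setattr__ while we are still initialising
        for name, value in fields.items():
            object.__setattr__(self, name, value)
        object.__setattr__(self, "_frozen", True)

    def __setattr__(self, name, value):
        if getattr(self, "_frozen", False):
            raise AttributeError("instances of %s are immutable"
                                 % type(self).__name__)
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if getattr(self, "_frozen", False):
            raise AttributeError("instances of %s are immutable"
                                 % type(self).__name__)
        object.__delattr__(self, name)

point = Immutable(x=1, y=2)
print(point.x)  # 1
```

Objects like this could be shared between threads without locking, which is the appeal described above.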


Yes, this is one of the reasons I am currently learning Haskell. I am 
not yet anywhere near proficient, but the reason I am looking into FP is 
because of some of the claims of the FP community, particularly Erlang's, 
regarding the benefits of pure FP with respect to multi-threading.


It's a shame this post came right now since I'm not really up-to-speed 
enough with Haskell to comment on it with respect to multi-threading.


<context>
I program Perl, Java and C++ for my day job. I've spent a lot of time 
making multithreaded programs work correctly and have even experienced 
the POE on a large project. So my comments below are based on experience 
of these languages.
</context>

 Such programs are free of most race conditions, without much
 programmer effort to make them so.

I disagree. They are not free of most race conditions, and it still 
takes a lot of effort. Where did you get this idea from? Have you been 
reading some Java primer that attempts to make it sound easy?


Java synchronized turned out to be a headache, partly because trying to
figure out how to lock all the little stuff being passed around was
a headache.  But Java doesn't have immutable objects.  Python does,
and that can be exploited to make thread-based programming cleaner.


This has nothing to do with Java; any multithreaded language that has 
mutable shared state has exactly the same problems. Can we talk about 
threading rather than Java, please? Additionally, Java provides a lot more 
than monitors (synchronized) for controlling multiple threads.


Java does have immutable objects. Strings in Java are immutable, for 
example, as are the object-based numeric and character wrapper types 
(Byte, Character, etc.).


There are lots and lots of immutable types in Java and you can make your 
own by creating a class with no mutator methods and declaring it final.



The general idea is that synchronized objects would have built in
locks which would lock at entry to any function of the object and
unlock at exit.  The lock would also unlock at explicit waits.  A
Queue object would be a good example of a synchronized object.

With this mechanism, multi-thread programs with shared data
structures can be written with little or no explicit locking by
the programmer.  If the restrictions are made a bit stricter,
strict enough that threads cannot share mutable unsynchronized data,
removal of the global interpreter lock is potentially possible.
This is a route to improved performance on modern multi-core CPUs.
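
[The proposal quoted above, a lock taken on entry to every method and released at exit, is essentially a monitor. It can be sketched in present-day Python with a decorator; this illustrates the idea rather than any existing language feature.]

```python
import functools
import threading

def synchronized(method):
    """Run the method while holding the instance's lock, monitor-style."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        with self._lock:
            return method(self, *args, **kwargs)
    return wrapper

class Counter:
    def __init__(self):
        # re-entrant, so synchronized methods may call each other
        self._lock = threading.RLock()
        self._value = 0

    @synchronized
    def increment(self):
        self._value += 1

    @synchronized
    def value(self):
        return self._value
```

An explicit wait inside a method would also need to release the lock, which is what threading.Condition provides.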


Right, this is where I would love to have had more experience with Haskell.

Yes, as soon as you get to a situation where no thread can access shared 
state that is mutable, your problems go away. You're also getting no work 
done, because the threads, whilst they may be performing lots of 
interesting calculations, have no way of letting the rest of the 
program, or the operating system, know about it.


You can, today, in any language that provides threads, make any number 
of threaded programs that do not contain any race conditions; it's just 
that most of them are terribly dull and uninteresting.


I'd love for someone from the Haskell/Erlang/other pure FP community to 
provide some canonical example of how this is achieved in pure FP. I'll 
get there soon, but I'm not going to skip ahead in my reading; I'm still 
trying to learn the basics.


So, in response to your point of trying to get an immutable API so that 
Python can easily have multi-threaded programs that do not present race 
conditions, I would say the following:


That is not the challenge, that's the easy part. The challenge is 
getting useful information out of a system that has only been fed 
immutable objects.


Regards,

  Nigel
--
http://mail.python.org/mailman/listinfo/python-list


Re: The future of Python immutability

2009-09-03 Thread Nigel Rantor

Stefan Behnel wrote:

Nigel Rantor wrote:

John Nagle wrote:

Immutability is interesting for threaded programs, because
immutable objects can be shared without risk.  Consider a programming
model where objects shared between threads must be either immutable or
synchronized in the sense that Java uses the term.
Such programs are free of most race conditions, without much
programmer effort to make them so.

I disagree. They are not free of most race conditions, and it still
takes a lot of effort. Where did you get this idea from? Have you been
reading some Java primer that attempts to make it sound easy?


Read again what he wrote. In a language with only immutable data types
(which doesn't mean that you can't efficiently create modified versions of
a data container), avoiding race conditions is trivial. The most well known
example is clearly Erlang. Adding synchronised data structures to that
will not make writing race conditions much easier.


My comment you quoted was talking about Java and the use of 
synchronized. If that was unclear I apologise.


Please feel free to read the entirety of my post before replying.

  n


Re: evolution [was Re: An assessment of the Unicode standard]

2009-09-02 Thread Nigel Rantor

r wrote:

I'd like to present a bug report to evolution, obviously the garbage
collector is malfunctioning.


I think most people think that when they read the drivel that you generate.

I'm done with your threads and posts.

*plonk*


Re: An assessment of the Unicode standard

2009-08-31 Thread Nigel Rantor

Hendrik van Rooyen wrote:

On Sunday 30 August 2009 22:46:49 Dennis Lee Bieber wrote:


Rather elitist viewpoint... Why don't we just drop nukes on some 60%
of populated landmasses that don't have a western culture and avoid
the whole problem?


Now yer talking, boyo!  It will surely help with the basic problem which is 
the heavy infestation of people on the planet!

:-)


<bait>
On two conditions:

1) We drop some test bombs on Slough to satisfy Betjeman.

2) We strap both Xah and r to aforementioned bombs.
</bait>

<switch>
Also, I'm surprised no-one has mentioned Esperanto yet. Sounds like 
something r and Xah would *love*.


Slightly off-topic - does anyone have a good recipe for getting 
Thunderbird to kill whole threads for good? Either based on a rule or 
just some extension I can use?


The Xah/r threads are like car crashes: I can't help but watch, but my 
time could be better spent and I don't want to unsub the whole list.

</switch>

Cheers,

  n


Re: Need help with Python scoping rules

2009-08-26 Thread Nigel Rantor
kj wrote:

 Needless to say, I'm pretty beat by this point.  Any help would be
 appreciated.
 
 Thanks,

Based on your statement above, and the fact that multiple people have
now explained *exactly* why your attempt at recursion hasn't worked, it
might be a good idea to step back, accept the advice and walk away
instead of trying to convince people that the language forbids recursion
and doesn't provide decent OO encapsulation.

Otherwise I'd wager you'll soon be appearing in multiple kill-files.

  n



Re: zip codes

2009-08-17 Thread Nigel Rantor

MRAB wrote:

Sjoerd Mullender wrote:

Martin P. Hellwig wrote:

Shailen wrote:

Is there any Python module that helps with US and foreign zip-code
lookups? I'm thinking of something that provides basic mappings of zip
to cities, city to zips, etc. Since this kind of information is so
often used for basic user-registration, I'm assuming functionality of
this sort must be available for Python. Any suggestions will be much
appreciated.


There might be an associated can of worms here, for example in the
Netherlands zip codes are actually copyrighted and require a license if
you want to do something with them, on the other hand you get a nice SQL
formatted db to use it. I don't know how this works in other countries
but I imagine that it is likely to be generally the same.



Also in The Netherlands, ZIP codes are much more fine-grained than in
some other countries: ZIP code plus house number together are sufficient
to uniquely identify an address.  I.e. you don't need the street name.
E.g., my work address has ZIP code 1098 XG and house number 123, so
together they indicate that I work at Science Park 123, Amsterdam.

In other words, a simple city-to-ZIP mapping is not sufficient.


The same comment applies to UK postcodes, which are also alphanumeric.
My home postcode, for example, is shared with only 3 other houses, IIRC.


Kind of off-topic...but nevertheless...

Yes, the UK postcode database (PAF) can be bought from the Royal Mail 
for a fee.


The data cannot be copyrighted, but the version they maintain and 
distribute is.


As an aside, the PAF has finer-grained information than simply the 
postal code: every letterbox in the UK has (or is meant to have) a DPS 
(delivery point suffix), so that given a postcode and DPS you can 
uniquely identify an individual letterbox even when, for example, a house 
has been split into multiple flats.


So, nastily, you *can* identify individual letterboxes, but the Royal 
Mail does not publicise the fact, so you cannot actually look at a post 
code on a letter and determine the letterbox it is intended for.


Shame really.

  n


Re: callable virtual method

2009-08-14 Thread Nigel Rantor

Jean-Michel Pichavant wrote:


Your solution will work, for sure. The problem is that it will dumb down 
the Base class interface, multiplying the number of methods by 2. This 
would not be an issue in many cases; in mine there are already too many 
meaningful methods in my class for me to add artificial ones.


Thanks for the tip anyway.


I suggest you reconsider.

You asked a question and have been given a standard way of achieving the 
desired outcome.


It's common in OO to use a Template pattern like this.

If you're not interested in finding out how loads of people have already 
solved the problem then why ask?


The methods that require overriding can be prefixed with an underscore 
so that people get a hint that they are an implementation detail rather 
than part of the public interface.


I don't see your problem, other than a vague aesthetic unease.

Regards,

  n


Re: callable virtual method

2009-08-14 Thread Nigel Rantor

Jean-Michel Pichavant wrote:

Nigel Rantor wrote:

Jean-Michel Pichavant wrote:


Your solution will work, for sure. The problem is that it will dumb 
down the Base class interface, multiplying the number of methods by 
2. This would not be an issue in many cases; in mine there are already 
too many meaningful methods in my class for me to add artificial ones.


Thanks for the tip anyway.


I suggest you reconsider.

You asked a question and have been given a standard way of achieving 
the desired outcome.


It's common in OO to use a Template pattern like this.

If you're not interested in finding out how loads of people have 
already solved the problem then why ask?


The methods that require overriding can be prefixed with an underscore 
so that people get a hint that they are an implementation detail 
rather than part of the public interface.


I don't see your problem, other than a vague aesthetic unease.

Regards,

  n
I understand how refuting some obvious solution may look just stupid. 
You're right, I shouldn't have asked.


I never said it seemed stupid. I was merely curious as to why you'd ask 
a question and ignore solutions.



By the way, I'd like to know if I am alone in finding that

class Stream:
    def start(self): ...
    def stop(self): ...
    def reset(self): ...

is better than

class Stream:
    def start(self): ...
    def _start(self): ...
    def stop(self): ...
    def _stop(self): ...
    def reset(self): ...
    def _reset(self): ...

(try to figure it out with 20+ methods)
What you call aesthetic may sometimes fall into readability.


Depends on what you mean by better.

Do you mean pleasing to your eye or performs the task you want it to?

Assuming you are taking the aesthetic viewpoint I think that in this 
case it will depend on how you set out your code.


Realise that all of the underscore methods for your class are 
boilerplate: they simply raise an exception.


They can all be at the end of the file, commented as an entire block to 
be left alone.


Editing the main body of code is then fairly easy, and uncluttered...

e.g.

#
# Stream class blah blah blah
#
class Stream:

    def start(self):
        ...

    def stop(self):
        ...

    def reset(self):
        ...

    #
    # stubs to be over-ridden in sub-classes, add one for each
    # method that requires overriding.
    #
    def _start(self): ...
    def _stop(self): ...
    def _reset(self): ...
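
[Filled in, and assuming the usual template-method arrangement in which the public method delegates to the stub, the skeleton runs like this; FileStream is an invented subclass for illustration.]

```python
class Stream:
    def start(self):
        # public entry point; shared setup/teardown would live here
        return self._start()

    def _start(self):
        # stub: subclasses must override this implementation detail
        raise NotImplementedError("%s must implement _start()"
                                  % type(self).__name__)

class FileStream(Stream):
    def _start(self):
        return "file stream started"

print(FileStream().start())  # file stream started
```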

Regards,

  Nigel

p.s. Please take this in the spirit it is offered. I'm trying to stop 
you from ignoring a good suggestion, not make you feel like a fool.



Re: cross platform method Re: How to get the total size of a local hard disk?

2009-06-16 Thread Nigel Rantor
Tim Harig wrote:
 <warning font=small print>
 This is a joke.  Do not take it seriously.  I do not actually suggest
 anybody use this method to measure the size of their drive.  I do not take any
 responsibility for any damages incurred by using this method.  I will laugh
 at you if you do.  Offer not valid in AK, HI, Puerto Rico, or U.S. Virgin 
 Islands.
 </warning>

Like most jokes it's not really funny if you have to explain it.

But I appreciate that you're worried that anyone who would actually
follow the advice would also probably be rabidly litigious, even if they
were one of that rare breed of living brain-donors.

  n



Re: Connection tester

2009-06-10 Thread Nigel Rantor
Sparky wrote:
 Hey! I am developing a small application that tests multiple websites
 and compares their response time. Some of these sites do not respond
 to a ping and, for the measurement to be standardized, all sites must
 have the same action preformed upon them. Another problem is that not
 all of the sites have the same page size and I am not interested in
 how long it takes to load a page but instead just how long it takes
 for the website to respond. Finally, I am looking to keep this script
 platform independent, if at all possible.

Yes, lots of people block ICMP so you can't use it to reliably tell
whether a machine is there or not.

At least three possible solutions.

1) Perform a HEAD request against the document root. This is likely to
be a static page and making it a HEAD request will make most responses
take similar times.

2) Perform an OPTIONS request as specified in the RFC below for the *
resource. This doesn't always work.

3) Perform a request you believe will fail so that you are provided with
a 4XX error code, the only time this should take any appreciable time is
when someone has cute server-generated error pages.

HTTP/1.1 RFC - http://www.ietf.org/rfc/rfc2616.txt
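
Option 1 might look like the sketch below, using the standard library's http.client (the httplib of the Python 2 era is equivalent); the host name in the comment is a placeholder.

```python
import http.client
import time

def head_response_time(host, path="/", timeout=5.0):
    """Time a HEAD request against a server; returns (status, seconds)."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        start = time.monotonic()
        conn.request("HEAD", path)
        response = conn.getresponse()
        elapsed = time.monotonic() - start
        return response.status, elapsed
    finally:
        conn.close()

# e.g. status, seconds = head_response_time("www.example.com")
```

Because HEAD returns headers only, differing page sizes stop skewing the measurement.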

  n


Re: Winter Madness - Passing Python objects as Strings

2009-06-05 Thread Nigel Rantor
Hendrik van Rooyen wrote:
  Nigel Rantor wi...@wiggly.org wrote:
 
 It just smells to me that you've created this elaborate and brittle hack 
 to work around the fact that you couldn't think of any other way of 
 getting the thread to change its behaviour whilst waiting on input.
 
 I am beginning to think that you are a troll, as all your comments are
 haughty and disparaging, while you either take no trouble to follow, 
 or are incapable of following, my explanations.
 
 In the event that this is not the case, please try to understand my 
 reply to Skip, and then suggest a way that will perform better
 in my use case, out of your vast arsenal of better, quicker, 
 more reliable, portable and comprehensible ways of doing it.

Well, why not have a look at Gabriel's response.

That seems like a much more portable way of doing it if nothing else.

I'm not trolling, you just seem to be excited about something that
sounds like a fundamentally bad idea.

  n


Re: Winter Madness - Passing Python objects as Strings

2009-06-04 Thread Nigel Rantor
Hendrik van Rooyen wrote:
 
 If you have any interest, contact me and I will
 send you the source.

Maybe you could tell people what the point is...

  n


Re: Winter Madness - Passing Python objects as Strings

2009-06-04 Thread Nigel Rantor
Hendrik van Rooyen wrote:
 Nigel Rantor wi...@wiggly.org wrote:
 
 Hendrik van Rooyen wrote:
 If you have any interest, contact me and I will
 send you the source.
 Maybe you could tell people what the point is...
 
 Well its a long story, but you did ask...

[snip]

Maybe I should have said

why should people care

or

why would someone use this

or

what problem does this solve

Your explanation doesn't make a whole lot of sense to me; I'm sure it
does to you.

Why, for example, would someone use your system to pass objects between
processes (I think this is the main thing you are providing?) rather
than POSH or some other system?

Regards,

  n


Re: Winter Madness - Passing Python objects as Strings

2009-06-04 Thread Nigel Rantor

Hendrik van Rooyen wrote:

It is not something that would find common use - in fact, I have
never, until I started struggling with my current problem, ever even
considered the possibility of converting a pointer to a string and 
back to a pointer again, and I would be surprised if anybody else

on this list has done so in the past, in a context other than debugging.


Okay, well, I think that's probably because it sounds like a fairly good 
way of making things slower and hard to port to another interpreter. 
Obviously depending on how you're achieving this.


If you need to pass information to a thread then I would suggest there 
are better, quicker, more reliable, more portable and more comprehensible 
ways of doing it.


It just smells to me that you've created this elaborate and brittle hack 
to work around the fact that you couldn't think of any other way of 
getting the thread to change its behaviour whilst waiting on input.


Just my 0.02p

  n


Re: binary file compare...

2009-04-17 Thread Nigel Rantor

Adam Olsen wrote:

On Apr 16, 11:15 am, SpreadTooThin bjobrie...@gmail.com wrote:

And yes he is right CRCs hashing all have a probability of saying that
the files are identical when in fact they are not.


Here's the bottom line.  It is either:

A) Several hundred years of mathematics and cryptography are wrong.
The birthday problem as described is incorrect, so a collision is far
more likely than 42 trillion trillion to 1.  You are simply the first
person to have noticed it.

B) Your software was buggy, or possibly the input was maliciously
produced.  Or, a really tiny chance that your particular files
contained a pattern that provoked bad behaviour from MD5.

Finding a specific limitation of the algorithm is one thing.  Claiming
that the math is fundamentally wrong is quite another.


You are confusing yourself about probabilities young man.

Just because something is extremely unlikely does not mean it can't 
happen on the first attempt.


This is true *no matter how big the numbers are*.

If you persist in making these ridiculous claims that people *cannot* 
have found collisions then as I said, that's up to you, but I'm not 
going to employ you to do anything except make tea.


Thanks,

  Nigel



Re: binary file compare...

2009-04-17 Thread Nigel Rantor

Adam Olsen wrote:

On Apr 16, 4:27 pm, Rhodri James rho...@wildebst.demon.co.uk
wrote:

On Thu, 16 Apr 2009 10:44:06 +0100, Adam Olsen rha...@gmail.com wrote:

On Apr 16, 3:16 am, Nigel Rantor wig...@wiggly.org wrote:

Okay, before I tell you about the empirical, real-world evidence I have
could you please accept that hashes collide and that no matter how many
samples you use the probability of finding two files that do collide is
small but not zero.

I'm afraid you will need to back up your claims with real files.

So that would be a no then.  If the implementation of dicts in Python,
say, were to assert as you are that the hashes aren't going to collide,
then I'd have to walk away from it.  There's no point in using something
that guarantees a non-zero chance of corrupting your data.


Python's hash is only 32 bits on a 32-bit box, so even 2**16 keys (or
65 thousand) will give you a decent chance of a collision.  In
contrast MD5 needs 2**64, and a *good* hash needs 2**128 (SHA-256) or
2**256 (SHA-512).  The two are at totally different extremes.
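
[The figures quoted above follow from the birthday approximation p ≈ 1 − exp(−n(n−1)/2d) for n items hashed into a space of size d; a quick check of those numbers:]

```python
import math

def collision_probability(n, d):
    """Approximate chance of at least one collision among n values
    drawn uniformly from a space of size d (birthday approximation)."""
    return 1.0 - math.exp(-n * (n - 1) / (2.0 * d))

# 2**16 keys into a 32-bit hash: a decent chance, as stated
print(round(collision_probability(2**16, 2**32), 2))  # 0.39

# the same 2**16 files into MD5's 128-bit space: negligible
# (the exponent underflows to 0.0 at double precision)
print(collision_probability(2**16, 2**128))
```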


I'm just going to go ahead and take the above as an admission by you 
that the chance of collision is non-zero, and that if we accept that 
fact you cannot rely on a hash function to tell you if two files are 
identical.


Thanks.


There is *always* a non-zero chance of corruption, due to software
bugs, hardware defects, or even operator error.  It is only in that
broader context that you can realize just how minuscule the risk is.


Please explain why you're dragging the notion of corruption into this 
when it seems to be beside the point?



Can you explain to me why you justify great lengths of paranoia, when
the risk is so much lower?


Because in the real world, where I work, in practical, real, factual 
terms I have seen it happen. Not once. Not twice. But many, many times.



Why are you advocating a solution to the OP's problem that is more
computationally expensive than a simple byte-by-byte comparison and
doesn't guarantee to give the correct answer?


For single, one-off comparison I have no problem with a byte-by-byte
comparison.  There's a decent chance the files won't be in the OS's
cache anyway, so disk IO will be your bottleneck.



Only if you're doing multiple comparisons is a hash database
justified.  Even then, if you expect matching files to be fairly rare
I won't lose any sleep if you're paranoid and do a byte-by-byte
comparison anyway.  New vulnerabilities are found, and if you don't
update promptly there is a small (but significant) chance of a
malicious file leading to collision.


If I have a number of files then I would certainly use a hash as a quick 
test, but if two files hash to the same value I still have to go compare 
them. Hashing in this case saves time doing a comparison for each file 
but it doesn't obviate the need to do a byte-by-byte comparison to see 
if the files that hash to the same value are actually the same.
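
[That two-stage scheme, hashing to narrow down candidates and then comparing byte-by-byte to confirm, might be sketched like this; the function names are mine, not from any post in this thread.]

```python
import hashlib
from collections import defaultdict

def md5_of(path, chunk=1 << 16):
    """Hash a file in chunks; used only as a quick first-pass filter."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.digest()

def same_bytes(path_a, path_b, chunk=1 << 16):
    """Byte-by-byte comparison; the hash alone is never trusted."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a, block_b = a.read(chunk), b.read(chunk)
            if block_a != block_b:
                return False
            if not block_a:  # both streams exhausted together
                return True

def find_duplicates(paths):
    by_hash = defaultdict(list)
    for p in paths:
        by_hash[md5_of(p)].append(p)
    # only files whose hashes collide get the expensive comparison
    return [(a, b)
            for group in by_hash.values() if len(group) > 1
            for i, a in enumerate(group)
            for b in group[i + 1:]
            if same_bytes(a, b)]
```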



That's not my concern though.  What I'm responding to is Nigel
Rantor's grossly incorrect statements about probability.  The chance
of collision, in our life time, is *insignificant*.


Please tell me which statements? That the chance of two files hashing to 
the same value is non-zero? You admit as much above.


Also, please remember I gave you a way of verifying what I said: go 
crawl the web for images, pages, whatever, build a hash DB and tell me 
how long it takes you to find a collision using MD5 (which is the hash I 
was talking about when I told you I had real-world experience to back 
up the theoretical claim that collisions occur).


Regards,

  Nigel



Re: binary file compare...

2009-04-16 Thread Nigel Rantor

Adam Olsen wrote:

On Apr 16, 3:16 am, Nigel Rantor wig...@wiggly.org wrote:

Adam Olsen wrote:

On Apr 15, 12:56 pm, Nigel Rantor wig...@wiggly.org wrote:

Adam Olsen wrote:

The chance of *accidentally* producing a collision, although
technically possible, is so extraordinarily rare that it's completely
overshadowed by the risk of a hardware or software failure producing
an incorrect result.

Not when you're using them to compare lots of files.
Trust me. Been there, done that, got the t-shirt.
Using hash functions to tell whether or not files are identical is an
error waiting to happen.
But please, do so if it makes you feel happy, you'll just eventually get
an incorrect result and not know it.

Please tell us what hash you used and provide the two files that
collided.

MD5


If your hash is 256 bits, then you need around 2**128 files to produce
a collision.  This is known as a Birthday Attack.  I seriously doubt
you had that many files, which suggests something else went wrong.

Okay, before I tell you about the empirical, real-world evidence I have
could you please accept that hashes collide and that no matter how many
samples you use the probability of finding two files that do collide is
small but not zero.


I'm afraid you will need to back up your claims with real files.
Although MD5 is a smaller, older hash (128 bits, so you only need
2**64 files to find collisions), and it has substantial known
vulnerabilities, the scenario you suggest where you *accidentally*
find collisions (and you imply multiple collisions!) would be a rather
significant finding.


No. It wouldn't. It isn't.

The files in question were millions of audio files. I no longer work at 
the company where I had access to them so I cannot give you examples, 
and even if I did Data Protection regulations wouldn't have allowed it.


If you still don't believe me you can easily verify what I'm saying by 
doing some simple experiments. Go spider the web for images, and keep 
collecting them until you get an MD5 hash collision.


It won't take long.


Please help us all by justifying your claim.


Now, please go and re-read my request first and admit that everything I 
have said so far is correct.



Mind you, since you use MD5 I wouldn't be surprised if your files were
maliciously produced.  As I said before, you need to consider
upgrading your hash every few years to avoid new attacks.


Good grief, this is nothing to do with security concerns, this is about 
someone suggesting to the OP that they use a hash function to determine 
whether or not two files are identical.


Regards,

  Nige


Re: binary file compare...

2009-04-16 Thread Nigel Rantor

Adam Olsen wrote:

On Apr 15, 12:56 pm, Nigel Rantor wig...@wiggly.org wrote:

Adam Olsen wrote:

The chance of *accidentally* producing a collision, although
technically possible, is so extraordinarily rare that it's completely
overshadowed by the risk of a hardware or software failure producing
an incorrect result.

Not when you're using them to compare lots of files.

Trust me. Been there, done that, got the t-shirt.

Using hash functions to tell whether or not files are identical is an
error waiting to happen.

But please, do so if it makes you feel happy, you'll just eventually get
an incorrect result and not know it.


Please tell us what hash you used and provide the two files that
collided.


MD5


If your hash is 256 bits, then you need around 2**128 files to produce
a collision.  This is known as a Birthday Attack.  I seriously doubt
you had that many files, which suggests something else went wrong.


Okay, before I tell you about the empirical, real-world evidence I have 
could you please accept that hashes collide and that no matter how many 
samples you use the probability of finding two files that do collide is 
small but not zero.


Which is the only thing I've been saying.

Yes, it's unlikely. Yes, it's possible. Yes, it happens in practice.

If you are of the opinion though that a hash function can be used to 
tell you whether or not two files are identical then you are wrong. It 
really is that simple.


I'm not sitting here discussing this for my health, I'm just trying to 
give the OP the benefit of my experience, I have worked with other 
people who insisted on this route and had to find out the hard way that 
it was a Bad Idea (tm). They just wouldn't be told.


Regards,

  Nige


Re: binary file compare...

2009-04-15 Thread Nigel Rantor

Martin wrote:

On Wed, Apr 15, 2009 at 11:03 AM, Steven D'Aprano
ste...@remove.this.cybersource.com.au wrote:

The checksum does look at every byte in each file. Checksumming isn't a
way to avoid looking at each byte of the two files, it is a way of
mapping all the bytes to a single number.


My understanding of the original question was a way to determine
whether 2 files are equal or not. Creating a checksum of 1-n files and
comparing those checksums IMHO is a valid way to do that. I know it's
a (one way) mapping between a (possibly) longer byte sequence and
another one; how does checksumming not take each byte in the original
sequence into account?


The fact that two md5 hashes are equal does not mean that the sources 
they were generated from are equal. To do that you must still perform a 
byte-by-byte comparison which is much less work for the processor than 
generating an md5 or sha hash.


If you insist on using a hashing algorithm to determine the equivalence 
of two files you will eventually realise that it is a flawed plan 
because you will eventually find two files with different contents that 
nonetheless hash to the same value.


The more files you test with the quicker you will find out this basic truth.

This is not complex, it's a simple fact about how hashing algorithms work.
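
[For what it's worth, the standard library already provides the byte-by-byte route, exiting early on the first difference; a minimal sketch:]

```python
import filecmp

def files_equal(path_a, path_b):
    # shallow=False forces a content comparison instead of
    # trusting the os.stat() signature (size, mtime) alone
    return filecmp.cmp(path_a, path_b, shallow=False)
```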

  n



Re: binary file compare...

2009-04-15 Thread Nigel Rantor

Grant Edwards wrote:

We all rail against premature optimization, but using a
checksum instead of a direct comparison is premature
unoptimization.  ;)


And more than that, will provide false positives for some inputs.

So, basically it's a worse-than-useless approach for determining if two 
files are the same.


   n


Re: binary file compare...

2009-04-15 Thread Nigel Rantor

Adam Olsen wrote:

The chance of *accidentally* producing a collision, although
technically possible, is so extraordinarily rare that it's completely
overshadowed by the risk of a hardware or software failure producing
an incorrect result.


Not when you're using them to compare lots of files.

Trust me. Been there, done that, got the t-shirt.

Using hash functions to tell whether or not files are identical is an 
error waiting to happen.


But please, do so if it makes you feel happy, you'll just eventually get 
an incorrect result and not know it.


  n


Re: Ordered Sets

2009-03-20 Thread Nigel Rantor

Aahz wrote:

In article 9a5d59e1-2798-4864-a938-9b39792c5...@s9g2000prg.googlegroups.com,
Raymond Hettinger  pyt...@rcn.com wrote:

Here's a new, fun recipe for you guys:

http://code.activestate.com/recipes/576694/


That is *sick* and perverted.


I'm not sure why.

Would it be less sick if it had been called UniqueList?
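
[On modern Python (3.7+, where dicts preserve insertion order), the uniqueness-plus-order effect can be had from a plain dict, although the recipe linked above also supports O(1) removal, which this one-liner does not:]

```python
def unique_in_order(iterable):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(iterable))

print(unique_in_order("abracadabra"))  # ['a', 'b', 'r', 'c', 'd']
```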

  n
--
http://mail.python.org/mailman/listinfo/python-list


Re: file locking...

2009-03-01 Thread Nigel Rantor

bruce wrote:

Hi.

Got a bit of a question/issue that I'm trying to resolve. I'm asking
this of a few groups so bear with me.

I'm considering a situation where I have multiple processes running,
and each process is going to access a number of files in a dir. Each
process accesses a unique group of files, and then writes the group
of files to another dir. I can easily handle this by using a form of
locking, where I have the processes lock/read a file and only access
the group of files in the dir based on the  open/free status of the
lockfile.

However, the issue with the approach is that it's somewhat
synchronous. I'm looking for something that might be more
asynchronous/parallel, in that I'd like to have multiple processes
each access a unique group of files from the given dir as fast as
possible.


I don't see how this is synchronous if you have a lock per file. Perhaps 
you've missed something out of your description of your problem.



So.. Any thoughts/pointers/comments would be greatly appreciated. Any
 pointers to academic research, etc.. would be useful.


I'm not sure you need academic papers here.

One trivial solution to this problem is to have a single process 
determine the complete set of files that require processing then fork 
off children, each with a different set of files to process.


The parent then just waits for them to finish and does any 
post-processing required.


A more concrete problem statement may of course change the solution...

  n

--
http://mail.python.org/mailman/listinfo/python-list


Re: file locking...

2009-03-01 Thread Nigel Rantor

koranthala wrote:

On Mar 1, 2:28 pm, Nigel Rantor wig...@wiggly.org wrote:

bruce wrote:

Hi.
Got a bit of a question/issue that I'm trying to resolve. I'm asking
this of a few groups so bear with me.
I'm considering a situation where I have multiple processes running,
and each process is going to access a number of files in a dir. Each
process accesses a unique group of files, and then writes the group
of files to another dir. I can easily handle this by using a form of
locking, where I have the processes lock/read a file and only access
the group of files in the dir based on the  open/free status of the
lockfile.
However, the issue with the approach is that it's somewhat
synchronous. I'm looking for something that might be more
asynchronous/parallel, in that I'd like to have multiple processes
each access a unique group of files from the given dir as fast as
possible.

I don't see how this is synchronous if you have a lock per file. Perhaps
you've missed something out of your description of your problem.


So.. Any thoughts/pointers/comments would be greatly appreciated. Any
 pointers to academic research, etc.. would be useful.

I'm not sure you need academic papers here.

One trivial solution to this problem is to have a single process
determine the complete set of files that require processing then fork
off children, each with a different set of files to process.

The parent then just waits for them to finish and does any
post-processing required.

A more concrete problem statement may of course change the solution...

   n


Using twisted might also be helpful.
Then you can avoid the problems associated with threading too.


No one mentioned threads.

I can't see how Twisted in this instance isn't like using a sledgehammer 
to crack a nut.


  n
--
http://mail.python.org/mailman/listinfo/python-list


Re: file locking...

2009-03-01 Thread Nigel Rantor


Hi Bruce,

Excuse me if I'm a little blunt below. I'm ill and grumpy...

bruce wrote:

hi nigel...

using any kind of file locking process requires that i essentially have a
gatekeeper, allowing a single process to enter, access the files at a
time...


I don't believe this is a necessary condition. That would only be the 
case if you allowed yourself a single lock.



i can easily setup a file read/write lock process where a client app
gets/locks a file, and then copies/moves the required files from the initial
dir to a tmp dir. after the move/copy, the lock is released, and the client
can go ahead and do whatever with the files in the tmp dir.. this process
allows multiple clients to operate in a pseudo-parallel manner...

i'm trying to figure out if there's a much better/faster approach that might
be available.. which is where the academic/research issue was raised..


I'm really not sure why you want to move the files around. Here are two 
further approaches, different from the one I initially gave you, that 
deal perfectly well with a directory where files are constantly being added.


In both approaches we are going to try and avoid using OS-specific 
locking mechanisms, advisory locking, flock etc. So it should work 
everywhere as long as you also have write access to the filesystem 
you're on.



Approach 1 - Constant Number of Processes

This requires no central manager but for every file lock requires a few 
OS calls.


Start up N processes with the same working directory WORK_DIR.

Each process then follows this algorithm:

- sleep for some small random period.

- scan the WORK_DIR for a FILE that does not have a corresponding LOCK_FILE

- open LOCK_FILE in append mode and write our PID into it.

- close LOCK_FILE

- open LOCK_FILE

- read first line from LOCK_FILE and compare to our PID

- if the PID we just read from the LOCK_FILE matches ours then we may 
process the corresponding FILE otherwise another process beat us to it.


- repeat

After processing a file completely you can remove it and the lockfile at 
the same time.


As long as filenames follow some pattern then you can simply say that 
the LOCK_FILE for FILE is called FILE.lock


e.g.

WORK_DIR  : /home/wiggly/var/work
FILE  : /home/wiggly/var/work/data_2354272.dat
LOCK_FILE : /home/wiggly/var/work/data_2354272.dat.lock
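A rough sketch of Approach 1 in Python (single process shown; the claim 
check is the append-then-read-back trick described above, and a real 
version would also need to handle stale locks left by dead PIDs):

```python
import glob
import os
import tempfile

def try_claim(path):
    """Try to claim `path` by writing our PID into path + '.lock'.

    Append mode never truncates, so if another process got there first
    its PID is the first line and we lose the race.
    """
    lock = path + ".lock"
    with open(lock, "a") as f:
        f.write("%d\n" % os.getpid())
    with open(lock) as f:
        first = f.readline().strip()
    return first == str(os.getpid())

# Stand-in for WORK_DIR, populated with a few files for the demo.
work_dir = tempfile.mkdtemp()
for i in range(3):
    open(os.path.join(work_dir, "data_%d.dat" % i), "w").close()

claimed = []
for path in sorted(glob.glob(os.path.join(work_dir, "*.dat"))):
    if not os.path.exists(path + ".lock") and try_claim(path):
        claimed.append(os.path.basename(path))
        # ... process the file here ...
        os.remove(path)            # remove file and lock together
        os.remove(path + ".lock")

print(claimed)  # ['data_0.dat', 'data_1.dat', 'data_2.dat']
```

Run N copies of this loop (with the sleep step added back) and each file 
gets processed by exactly one of them.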


Approach 2 - Managed Processes

Here we have a single main process that spawns children. The children 
listen for filenames on a pipe that the parent has open to them.


The parent constantly scans the WORK_DIR for new files to process and as 
it finds one it sends that filename to a child process.


You can either be clever about the children and ensure they tell the 
parent when they're free or just pass them work in a round-robin fashion.


I hope the two above descriptions make sense, let me know if they don't.

   n



the issue that i'm looking at is analogous to a FIFO, where i have lots of
files being shoved in a dir from different processes.. on the other end, i
want to allow multiple client processes to access unique groups of these
files as fast as possible.. access being fetch/gather/process/delete the
files. each file is only handled by a single client process.

thanks..




Re: file locking...

2009-03-01 Thread Nigel Rantor

zugnush wrote:

You could do something like this so that  every process will know if
the file belongs to it without prior coordination, it  means a lot
of redundant hashing though.

In [36]: import md5

In [37]: pool = 11

In [38]: process = 5

In [39]: [f for f in glob.glob('*') if int(md5.md5(f).hexdigest(),16)
% pool == process ]
Out[39]:


You're also relying on the hashing being perfectly distributed, 
otherwise some processes aren't going to be performing useful work even 
though there is useful work to perform.


In other words, why would you rely on a scheme that limits some 
processes to certain parts of the data? If we're already talking about 
trying to get away without some global lock for synchronisation this 
seems to go against the original intent of the problem...


  n
--
http://mail.python.org/mailman/listinfo/python-list


Re: code challenge: generate minimal expressions using only digits 1,2,3

2009-02-20 Thread Nigel Rantor

Trip Technician wrote:

anyone interested in looking at the following problem.


if you can give me a good reason why this is not homework I'd love to 
hear it...I just don't see how this is a real problem.



we are trying to express numbers as minimal expressions using only the
digits one two and three, with conventional arithmetic. so for
instance

33 = 2^(3+2)+1 = 3^3+(3*2)

are both minimal, using 4 digits but

33 = ((3+2)*2+1)*3

using 5 is not.

I have tried coding a function to return the minimal representation
for any integer, but haven't cracked it so far. The naive first
attempt is to generate lots of random strings, eval() them and sort by
size and value. this is inelegant and slow.


Wow. Okay, what other ways have you tried so far? Or are you still 
beating your head against the "search the entire problem space" solution?


This problem smells a lot like factorisation, so I would think of it in 
terms of wanting to reduce the target number using as few operations as 
possible.


If you allow exponentiation that's going to be your biggest hitter so 
you know that the best you can do using 2 digits is n^n where n is the 
largest digit you allow yourself.


Are you going to allow things like n^n^n or not?

  n


--
http://mail.python.org/mailman/listinfo/python-list


Re: code challenge: generate minimal expressions using only digits 1,2,3

2009-02-20 Thread Nigel Rantor

Luke Dunn wrote:

yes power towers are allowed


right, okay, without coding it here's my thought.

factorise the numbers you have, but allow only primes that exist in 
your digit set.


then take that factorisation and turn any runs of repeated factors 
into power-towers


any remainder can then be created in other ways, starting with a way 
other than exponentiation that is able to create the largest number, 
i.e. multiplication, then addition...


I've not got time to put it into code right now  but it shouldn't be too 
hard...


e.g.

digits : 3, 2, 1

n : 10
10 = 2*5 - but we don't have 5...
10 = 3*3 + 1
10 = 3^2+1
3 digits

n : 27
27 = 3*3*3
27 = 3^3
2 digits

n : 33
33 = 3*3*3 + 6
33 = 3*3*3 + 3*2
33 = 3^3+3*2
4 digits
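For what it's worth, here is a brute-force sketch of the search organised 
by expression size, which is the structured version of "throwing darts". 
This is not the factorisation heuristic above: it just builds, for each 
digit count k, the set of values reachable with exactly k digits, so the 
first k at which a value appears is its minimal size. The magnitude guards 
on exponentiation are arbitrary choices of mine to keep the sets small:

```python
def minimal_digits(target, digits=(1, 2, 3), max_digits=5):
    """Return the minimal number of digits needed to express `target`,
    or None if it is not reachable within max_digits."""
    reach = {1: set(digits)}          # reach[k]: values using exactly k digits
    first_seen = {d: 1 for d in digits}
    for k in range(2, max_digits + 1):
        new = set()
        for i in range(1, k):         # split k digits between two subtrees
            for a in reach[i]:
                for b in reach[k - i]:
                    new.add(a + b)
                    new.add(a - b)
                    new.add(a * b)
                    if b != 0 and a % b == 0:   # exact division only
                        new.add(a // b)
                    if 0 <= b <= 12 and abs(a) <= 100:  # arbitrary caps
                        new.add(a ** b)
        reach[k] = new
        for v in new:
            first_seen.setdefault(v, k)
    return first_seen.get(target)

print(minimal_digits(27))  # 2, i.e. 3^3
print(minimal_digits(10))  # 3, e.g. 3^2+1
print(minimal_digits(33))  # 4, e.g. 3^3+3*2
```

Recording the actual expression strings alongside the values is a small 
extension (store value -> string instead of a bare set).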

Allowed operations: exponentiation, multiplication, division, addition 
and subtraction. Brackets when necessary, but length is measured by 
number of digits, not number of operators plus digits.
 
I always try my homework myself first. in 38 years of life I've 
learned only to do what i want, if I wanted everyone else to do my work 
for me I'd be a management consultant !
On Fri, Feb 20, 2009 at 3:52 PM, Luke Dunn luke.d...@gmail.com 
mailto:luke.d...@gmail.com wrote:


I am teaching myself coding. No university or school, so i guess its
homework if you like. i am interested in algorithms generally, after
doing some of Project Euler. Of course my own learning process is
best served by just getting on with it but sometimes you will do
that while other times you might just choose to ask for help. if no
one suggests then i will probably shelve it and come back to it
myself when I'm fresh.
 
no it's not a real world problem but my grounding is in math so i

like pure stuff anyway. don't see how that is a problem, as a math
person i accept the validity of pure research conducted just for
curiosity and aesthetic satisfaction. it often finds an application
later anyway
 
Thanks for your helpful suggestion of trying other methods and i

will do that in time. my motive was to share an interesting problem
because a human of moderate math education can sit down with this
and find minimal solutions easily but the intuition they use is
quite subtle, hence the idea of converting the human heuristic into
an algorithm became of interest, and particularly a recursive one. i
find that the development of a piece of recursion usually comes as
an 'aha', and since i hadn't had such a moment, i thought i'd turn
the problem loose on the public. also i found no online reference to
this problem so it seemed ripe for sharing.

On Fri, Feb 20, 2009 at 3:39 PM, Nigel Rantor wig...@wiggly.org
mailto:wig...@wiggly.org wrote:

Trip Technician wrote:

anyone interested in looking at the following problem.


if you can give me a good reason why this is not homework I'd
love to hear it...I just don't see how this is a real problem.


we are trying to express numbers as minimal expressions
using only the
digits one two and three, with conventional arithmetic. so for
instance

33 = 2^(3+2)+1 = 3^3+(3*2)

are both minimal, using 4 digits but

33 = ((3+2)*2+1)*3

using 5 is not.

I have tried coding a function to return the minimal
representation
for any integer, but haven't cracked it so far. The naive first
attempt is to generate lots of random strings, eval() them
and sort by
size and value. this is inelegant and slow.


Wow. Okay, what other ways have you tried so far? Or are you
beating your head against the search the entire problem space
solution still?

This problem smells a lot like factorisation, so I would think
of it in terms of wanting to reduce the target number using as
few operations as possible.

If you allow exponentiation that's going to be your biggest
hitter so you know that the best you can do using 2 digits is
n^n where n is the largest digit you allow yourself.

Are you going to allow things like n^n^n or not?

 n






--
http://mail.python.org/mailman/listinfo/python-list


Re: code challenge: generate minimal expressions using only digits 1,2,3

2009-02-20 Thread Nigel Rantor

Trip Technician wrote:


yes n^n^n would be fine. agree it is connected to factorisation.
building a tree of possible expressions is my next angle.


I think building trees of the possible expressions, as a couple of other 
people have suggested, is simply a more structured way of doing what 
you're currently doing.


Right now you're throwing darts at the problem space, and hoping that 
the next point you hit will be a more optimal solution.


If you enumerate all the expression trees you are just ensuring you 
don't miss any solutions.


I think the algorithm/heuristic I just posted should get you to the 
answer quicker though...


  n
--
http://mail.python.org/mailman/listinfo/python-list


Re: To Troll or Not To Troll (aka: as keyword woes)

2008-12-10 Thread Nigel Rantor

James Stroud wrote:

Andreas Waldenburger wrote:

Is it me, or has c.l.p. developed a slightly harsher tone recently?
(Haven't been following for a while.)


Yep. I can only post here for about a week or two until someone blows a 
cylinder and gets ugly because they interpreted something I said as a 
criticism of the language and took it personally by extension. Then I 
have to take a 4 month break because I'm VERY prone to 
reciprocating--nastily. I think it's a symptom of the language's 
maturing, getting popular, and a minority fraction* of the language's 
most devout advocates developing an egotism that complements their 
python worship in a most unsavory way.


I wish they would instead spend their energy volunteering to moderate 
this list and culling out some of the spam.


*No names were mentioned in the making of this post.


I joined this list some time ago, I am not a regular python user.

I have maintained my list subscription because when I'm bored the flames 
here are very entertaining.


I don't think I need to mention specifics really.

Oh, and the weekly thread about immutable default arguments is a 
cracker...more please.


  n
--
http://mail.python.org/mailman/listinfo/python-list


Re: Exhaustive Unit Testing

2008-11-28 Thread Nigel Rantor

Roy Smith wrote:


There's a well known theory in studies of the human brain which says people 
are capable of processing about 7 +/- 2 pieces of information at once.  


It's not about processing multiple tasks; it's about the amount of 
things that can be held in working memory.


  n

--
http://mail.python.org/mailman/listinfo/python-list


Re: How to best explain a subtle difference between Python and Perl ?

2008-08-13 Thread Nigel Rantor

Jonathan Gardner wrote:
[...eloquent and interesting discussion of variable system snipped...]


Is Python's variable system better than perl's? It depends on which
way you prefer. As for me, being a long-time veteran of perl and
Python, I don't think having a complicated variable system such as
perl's adds anything to the language. Python's simplicity in this
regard is not only sufficient, but preferable.


Very well put.

I am not however sure I agree with your very final thought.

I am a long-time C/C++/Java/Perl developer. I know some Python too.

The Python system is the same as the Java system, apart from Java's 
primitive types, which is a completely different discussion that I 
really don't want to get into right now.


So, everything is by reference.

I understand, and agree that a simple system is good. And maybe even 
preferable. But it isn't always sufficient.


Some algorithms are much easier to write if you know that your 
parameters are going to be copied and that the function may use them as 
local variables without having to explicitly create copies.


You can also reason more easily about what side-effects the function 
could have if you know it cannot possibly modify your parameters.


Other systems out there require pointer-like semantics (for example 
CORBA out and inout parameters) which have to be kludged in languages 
like Java to pass in wrapper objects/boxes that can be assigned values.


Whilst it may be easier to learn a system like python/java, in the end 
the amount of time spent learning the system is normally dwarfed by the 
time spent using the system to build software. I would rather have a 
type system that is as expressive as possible.


Also, note that it is entirely possible to create many, many, many 
interesting and useful things in Perl without having to resort to 
references. They are a relatively new feature after all.


Just my 0.02p

  n
--
http://mail.python.org/mailman/listinfo/python-list


Re: You advice please

2008-08-13 Thread Nigel Rantor

Calvin Spealman wrote:

Ruby (on Rails) people love to talk about Ruby (on Rails).

Python people are too busy getting things done to talk as loudly.


Have you read this list?

I would suggest your comment indicates not.

Throwaway comments like yours that are pithy, emotional and devoid of 
any factual content are just the kind of thing that makes lists such as 
this less useful than they could be.


You are acting as a source of noise, not signal. I'm sure you don't want 
to be considered in that manner, so perhaps you should think about 
adding something to the conversation instead.


Before you reply please think about what you plan on saying, you'll be 
helping not only me but yourself and anyone who reads your post.


  n
--
http://mail.python.org/mailman/listinfo/python-list


Re: You advice please

2008-08-13 Thread Nigel Rantor

Fredrik Lundh wrote:

Nigel Rantor wrote:

Throwaway comments like yours that are pithy, emotional and devoid of 
any factual content are just the kind of thing that makes lists such 
as this less useful than they could be.


Oh, please.  It's a fact that Python advocacy is a lot more low-key than 
the advocacy of certain potentially competing technologies.  It's always 
been that way.   Too many Europeans involved, most likely.


Your opinion. We simply disagree on this point.

I'm not sure what the comment about Europeans even means though.


  Have you read this list?
 
  I would suggest your comment indicates not.

This list is a Python forum.  Calvin (who's a long time contributor to 
this forum, which you would have known if you'd actually followed the 
list for some time) was talking about the real world.


I did not mean it in a "how long have you been here" way. I apologise. 
I meant it as: have you not seen how much traffic, including rabid 
fanboys, this list gets?


You're right, I should have been much clearer on that point.

  n
--
http://mail.python.org/mailman/listinfo/python-list


Re: You advice please

2008-08-13 Thread Nigel Rantor

Calvin Spealman wrote:

God forbid I try to make a joke.


Ah, sorry, sense of humour failure for me today obviously.

  n
--
http://mail.python.org/mailman/listinfo/python-list


Re: How to best explain a subtle difference between Python and Perl ?

2008-08-12 Thread Nigel Rantor

Palindrom wrote:

### Python ###

liste = [1,2,3]

def foo( my_list ):
my_list = []


The above rebinds the my_list name to a different object, in this case 
a newly created empty list. It does not modify the liste object; it 
simply points my_list at something else.
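You can see the difference directly in Python: rebinding the parameter 
leaves the caller's list untouched, while mutating it in place does not. 
(Minimal demo; the helper names are mine.)

```python
liste = [1, 2, 3]

def rebind(my_list):
    my_list = []        # rebinds the local name only; caller unaffected

def mutate(my_list):
    del my_list[:]      # mutates the object both names refer to

rebind(liste)
print(liste)   # [1, 2, 3] -- unchanged

mutate(liste)
print(liste)   # [] -- emptied in place
```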



### Perl ###

@lst = (1,2,3);
$liste = \@lst;
foo($liste);
print "@lst\n";

sub foo {
 my ($my_list) = @_;
 @{$my_list} = ();
}


The above code *de-references* $my_list and assigns an empty list to its 
referent (@lst).


The two code examples are not equivalent.

An equivalent perl example would be as follows:

### Perl ###

@lst = (1,2,3);
$liste = \@lst;
foo($liste);
print "@lst\n";

sub foo {
 my ($my_list) = @_;
 $my_list = [];
}

The above code does just what the Python code does. It assigns a newly 
created list object to the $my_list reference. Any changes to this now 
have no effect on @lst because $my_list no longer points there.


  n
--
http://mail.python.org/mailman/listinfo/python-list


Re: Terminate a python script from linux shell / bash script

2008-07-31 Thread Nigel Rantor

Gros Bedo wrote:

Thank you guys for your help. My problem is that I plan to use this command 
to terminate a script when uninstalling the software, so I can't store the PID. 
This command will be integrated in the spec file of the RPM package. Here's the 
script I'll use, it may help someone else:

#!/bin/sh
# PYTHON SCRIPT PROCESS KILLER by GBO v0.1
# This script will look for all the lines containing $SOFTWARENAME in the 
process list, and close them

SOFTWARENAME='yoursoftware' #This is case insensitive
JOBPRESENT=$(ps -ef | grep -i $SOFTWARENAME | grep -v grep)
echo $JOBPRESENT
ps -ef | grep -i $SOFTWARENAME | grep -v grep | awk '{print $2}' | xargs kill


If you have a long-running process you wish to be able to kill at a 
later date, the normal way of doing it is for the script itself to 
write its own PID to a file that you can then inspect from a different 
process and use to kill it.


So, my_daemon.py might shove its PID into /var/run/my_daemon.pid

And later my_daemon_killer.py (or indeed my_daemon_killer.sh) would read 
the PID out of /var/run/my_daemon.pid and pass that to a kill command.


Using ps/grep in the way you're trying to do is always going to be 
inexact, and people will not thank you for killing processes they wanted 
running.
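Both halves of the PID-file approach are a few lines each. A hedged 
sketch (paths and names are mine; the demo only exercises the write/read 
half, since actually sending SIGTERM to ourselves would be rude):

```python
import os
import signal
import tempfile

def write_pid_file(path):
    """Called by the daemon at startup: record our own PID."""
    with open(path, "w") as f:
        f.write(str(os.getpid()))

def kill_from_pid_file(path, sig=signal.SIGTERM):
    """Called later (e.g. by an uninstaller): signal exactly the
    recorded process, nothing else, then clean up the PID file."""
    with open(path) as f:
        pid = int(f.read().strip())
    os.kill(pid, sig)
    os.remove(path)

# Demo of the write/read half only.
pid_path = os.path.join(tempfile.mkdtemp(), "my_daemon.pid")
write_pid_file(pid_path)
with open(pid_path) as f:
    print(int(f.read()) == os.getpid())  # True
```

Unlike the ps/grep pipeline, this can never kill an unrelated process 
that merely has a similar name.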


  n
--
http://mail.python.org/mailman/listinfo/python-list


Re: new style class

2007-11-02 Thread Nigel Rantor
gert wrote:
 Could not one of you just say @staticmethod for once damnit :)
 

why were you asking if you knew the answer?

yeesh
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: new style class

2007-11-02 Thread Nigel Rantor
gert wrote:
 On Nov 2, 12:27 pm, Boris Borcic [EMAIL PROTECTED] wrote:
 gert wrote:
 class Test(object):
 def execute(self,v):
 return v
 def escape(v):
 return v
 if  __name__ == '__main__':
 gert = Test()
 print gert.m1('1')
 print Test.m2('2')
 Why doesn't this new style class work in python 2.5.1 ?
 why should it ?
 
 I don't know I thought it was supported from 2.2?
 

I think what Boris was saying, in an exceedingly unhelpful way, is: why 
should it work when you're calling methods that do not exist?

I don't see 'm1' or 'm2' defined for the class 'Test'.

   n

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: modules and generated code

2006-11-15 Thread Nigel Rantor
J. Clifford Dyer wrote:
 
 Maybe I'm missing something obvious, but it sounds like you are 
 over-complicating the idea of inheritance.  Do you just want to create a 
 subclass of the other class?

Nope, that isn't my problem.

I have an IDL file that is used to generate a set of stub and skeleton 
code that is not human-modifiable.

Eventually I would like to have my IDL in source control and have a 
setup script able to generate my stubs and skels and install them for me.

At the same time I want to produce code that uses this code but in the 
same package.

In Java or Perl I can easily create a couple of packages/modules like this:

package org.mine.package;

[...class definitions...]


and then somewhere else

package org.mine.otherpackage;

[...class definitions...]

These can be compiled into separate Jar files and just work.

Since Python is the final target, though, I don't want to put it all 
in one directory, because then I need to be clever when I regenerate the 
generated code: I don't want old Python modules lying around that are no 
longer in the IDL. Blowing the entire generated directory away is the 
best way of doing this, so I don't want my implementation code in there.

Basically, I want the same top-level package to have bits of code in 
different directories, but because Python requires the __init__.py file 
it only picks up the first one in PYTHONPATH.

I'm not sure if that makes sense, my brain is already toast from 
meetings today.

   n





-- 
http://mail.python.org/mailman/listinfo/python-list


Re: modules and generated code

2006-11-14 Thread Nigel Rantor
Peter Otten wrote:
 Nigel Rantor wrote:
 
 So, if I have a tool that generates python code for me (in my case,
 CORBA stubs/skels) in a particular package is there a way of placing my
 own code under the same package hierarchy without all the code living in
 the same directory structure.
 
 http://docs.python.org/lib/module-pkgutil.html

Ooh, thanks for that.

Yep, looks like that should work, but it doesn't. :-/

Do you have any idea whether other __init__.py scripts from the same 
logical module will still be run in this case?

The generated code uses its init script to pull in other code.

Off, to tinker some more with this.

   n


-- 
http://mail.python.org/mailman/listinfo/python-list


modules and generated code

2006-11-14 Thread Nigel Rantor

Hi all,

Python newbie here with what I hope is a blindingly obvious question 
that I simply can't find the answer for in the documentation.

So, if I have a tool that generates python code for me (in my case, 
CORBA stubs/skels) in a particular package is there a way of placing my 
own code under the same package hierarchy without all the code living in 
the same directory structure.

Ideally I would like something like the following:

package_dir/
top_level_package/
generated_code_package/
implementation_code_package/

but have two distinct directories that hold them so that I can simply 
delete the generated code and regenerate it without worrying that 
anything got left behind.

So, I want something like:

generated_package_dir/
top_level_package/
generated_code_package/

implementation_package_dir/
top_level_package/
implementation_code_package/

Whilst I can create this structure, and add 'generated_package_dir' and 
'implementation_package_dir' to PYTHONPATH the fact that both 
directories contain 'top_level_package' seems to be causing clashes, 
perhaps because there are multiple __init__.py files for 
'top_level_package'?

I know that this is possible in Java, Perl and C++ so I am finding it 
hard to believe I can't do the same in Python, I just think I'm too new 
to know how.

I have spent most of this morning searching through all the docs I can 
find, searching on USENET and the web to no avail.

Any help or pointers greatly appreciated.

Regards,

   n
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: modules and generated code

2006-11-14 Thread Nigel Rantor
Peter Otten wrote:
 Nigel Rantor wrote:
 
 Peter Otten wrote:
 Nigel Rantor wrote:
 
 So, if I have a tool that generates python code for me (in my case,
 CORBA stubs/skels) in a particular package is there a way of placing my
 own code under the same package hierarchy without all the code living in
 the same directory structure.
 
 http://docs.python.org/lib/module-pkgutil.html
 
 Yep, looks like that should work, but it doesn't. :-/

 Do you have any idea whether other __init__.py scripts from the same
 logical module will still be run in this case?
 
 I don't think it will.

Yeah, I am getting that impression. Gah!

 The generated code uses its init script to pull in other code.
 
 You could invoke it explicitly via 
 
  execfile("/path/to/generated/package/__init__.py") 
 
 in the static package/__init__.py.

Hmm, yes, that works. It's not pretty though; it seems to be finding the 
file relative to the current directory. I suppose writing a bit of code 
that figures out where this package is located and adjusting the path 
won't be too hard.

And, at the risk of being flamed or sounding like a troll, this seems 
like something that should be easy to do...other languages manage it 
quite neatly. Up until this point I was really liking my exposure to 
Python :-/

I wonder if there is any more magic that I'm missing, the thing is the 
pkgutil method looks exactly like what I want, except for not executing 
the additional __init__.py files in the added directories.
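For anyone following along, here is a self-contained demo of the pkgutil 
approach, building the split-directory layout from my earlier post at 
runtime (names like `top`, `gen`, `impl` are invented for the demo). Both 
copies of top/__init__.py call pkgutil.extend_path, so submodules from 
both directories become importable — though, as noted, only the first 
__init__.py found on sys.path actually executes:

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
init_src = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)

# Two package roots, each containing a 'top' package with one module:
#   gen/top/generated_mod.py   and   impl/top/impl_mod.py
for part, module in (("gen", "generated_mod"), ("impl", "impl_mod")):
    pkg = os.path.join(root, part, "top")
    os.makedirs(pkg)
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write(init_src)
    with open(os.path.join(pkg, module + ".py"), "w") as f:
        f.write("VALUE = %r\n" % part)
    sys.path.insert(0, os.path.join(root, part))

# Both halves of 'top' are now importable, from different directories.
from top import generated_mod, impl_mod
print(generated_mod.VALUE, impl_mod.VALUE)  # gen impl
```

The generated directory can now be blown away and regenerated without 
touching the implementation directory.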

Thanks for the help so far Peter, if anyone has a prettier solution then 
I'm all ears.

Cheers,

   n





-- 
http://mail.python.org/mailman/listinfo/python-list