Re: python and ARM memory types

2015-10-06 Thread Jorgen Grahn
On Mon, 2015-10-05, dieter wrote:
> voxner@gmail.com writes:
>> ...

>> But how do I specify (streaming,write-combining,write-back) memory
>> types in python ? Is there a library that I can use ? I am thinking of
>> programming some fixed memory space [...]

>
> Python is quite a high level programming language - i.e. lots of things
> are out of direct control of the programmer - among others memory management.
>
> I suppose you will need an approach that gives you more control over
> memory use - maybe, write your algorithms partially in the "C" programming
> language.

C or C++ would work.  He may want to check the assembly code that's
generated too.

> You might find "cython" helpful to easily combine Python parts
> and "C" parts.

In this case I doubt that involving Python will be helpful at all ...
There won't be a lot for it to do: all the interesting stuff will
happen at the level close to the hardware.  There's tweaking the
parameters for the benchmark, but that could be just command-line
options.

He may also want to instrument his programs using valgrind, time(1),
oprofile and perf(1), and I guess the Python interpreter would
introduce some noise there.

/Jorgen

-- 
  // Jorgen Grahn <grahn@  Oo  o.   . .
\X/ snipabacken.se>   O  o   .
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Lightwight socket IO wrapper

2015-09-22 Thread Jorgen Grahn
On Mon, 2015-09-21, Cameron Simpson wrote:
> On 21Sep2015 10:34, Chris Angelico <ros...@gmail.com> wrote:
>>If you're going to add sequencing and acknowledgements to UDP,
>>wouldn't it be easier to use TCP and simply prefix every message with
>>a two-byte length?
>
> Frankly, often yes. That's what I do. (different length encoding, but 
> otherwise...)
>
> UDP's neat if you do not care if a packet fails to arrive and if you can
> guarantee that your data fits in a packet in the face of different MTUs.

There's also the impact on your application. With TCP you need to
consider that you may block when reading or writing, and you'll be
using threads and/or a state machine driven by select() or something.
UDP is more fire-and-forget.

> I like TCP myself, most of the time. Another nice thing about TCP is that
> with a little effort you get to pack multiple data packets (or partial data
> packets) into a network packet, etc.

That, and also (again) the impact on the application.  With UDP you
can easily end up wasting a lot of time reading tiny datagrams one by
one.  It has often been a performance bottleneck for me, with certain
UDP-based protocols which cannot pack multiple application-level
messages into one datagram.

Although perhaps you tend not to use Python in those situations.

/Jorgen



Re: Lightwight socket IO wrapper

2015-09-22 Thread Jorgen Grahn
On Mon, 2015-09-21, Chris Angelico wrote:
> On Mon, Sep 21, 2015 at 6:38 PM, Marko Rauhamaa <ma...@pacujo.net> wrote:
>> Chris Angelico <ros...@gmail.com>:
>>
>>> On Mon, Sep 21, 2015 at 5:59 PM, Marko Rauhamaa <ma...@pacujo.net> wrote:
>>>> You can read a full buffer even if you have a variable-length length
>>>> encoding.
>>>
>>> Not sure what you mean there. Unless you can absolutely guarantee that
>>> you didn't read too much, or can absolutely guarantee that your
>>> buffering function will be the ONLY way anything reads from the
>>> socket, buffering is a problem.
>>
>> Only one reader can read a socket safely at any given time so mutual
>> exclusion is needed.
>>
>> If you read "too much," the excess can be put in the application's read
>> buffer where it is available for whoever wants to process the next
>> message.
>
> Oops, premature send - sorry! Trying again.
>
> Which works only if you have a single concept of "application's read
> buffer". That means that you have only one place that can ever read
> data. Imagine a protocol that mainly consists of lines of text
> terminated by CRLF, but allows binary data to be transmitted by
> sending "DATA N\r\n" followed by N arbitrary bytes. The simplest and
> most obvious way to handle the base protocol is to buffer your reads
> as much as possible, but that means potentially reading the beginning
> of the data stream along with its header. You therefore cannot use the
> basic read() method to read that data - you have to use something from
> your line-based wrapper, even though you are decidedly NOT using a
> line-based protocol at that point.
>
> That's what I mean by guaranteeing that your buffering function is the
> only way data gets read from the socket. Either that, or you need an
> underlying facility for un-reading a bunch of data - de-buffering and
> making it readable again.

The way it seems to me, reading a TCP socket always ends up as:

- keep an application buffer
- do one socket read and append to the buffer
- consume zero or more complete "entries" from the beginning
  of the buffer; keep the incomplete one which may exist
  at the end
- go back and read some more when there's a chance more data
  has arrived

So the buffer is a circular buffer of octets, which you chop up
by parsing it so you can see it as a circular buffer of complete and
incomplete entries or messages.

At that level, yes, the line-oriented data and the binary data would
coexist in the same application buffer.
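
A minimal sketch of that loop in Python, applied to the hypothetical
"DATA N\r\n" protocol quoted above.  The class name MessageBuffer and the
("line", ...)/("data", ...) tuples are made up for illustration, and a plain
bytearray stands in for the circular buffer:

```python
class MessageBuffer:
    """Application-level read buffer for one TCP connection (a sketch)."""

    def __init__(self):
        self._buf = bytearray()
        self._pending = None  # size of the binary entry we are waiting for

    def feed(self, chunk):
        """Append the result of one socket read; return the complete entries."""
        self._buf += chunk
        entries = []
        while True:
            if self._pending is not None:
                if len(self._buf) < self._pending:
                    break  # incomplete binary entry stays in the buffer
                entries.append(("data", bytes(self._buf[:self._pending])))
                del self._buf[:self._pending]
                self._pending = None
                continue
            end = self._buf.find(b"\r\n")
            if end < 0:
                break  # incomplete line stays in the buffer
            line = bytes(self._buf[:end])
            del self._buf[:end + 2]
            if line.startswith(b"DATA "):
                self._pending = int(line[5:])  # next N octets are binary
            else:
                entries.append(("line", line))
        return entries
```

The caller does one recv() at a time and passes whatever arrived to feed();
anything read "too much" simply waits in the buffer for the next call.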

/Jorgen



Re: Lightwight socket IO wrapper

2015-09-21 Thread Jorgen Grahn
On Mon, 2015-09-21, Dennis Lee Bieber wrote:
> On Sun, 20 Sep 2015 23:36:30 +0100, "James Harris"
> <james.harri...@gmail.com> declaimed the following:

...
>>I thought UDP would deliver (or drop) a whole datagram but cannot find 
>>anything in the Python documentation to guarantee that. In fact
>>documentation for the send() call says that apps are responsible for 
>>checking that all data has been sent. They may mean that to apply to 
>>stream protocols only but it doesn't state that. (Of course, UDP 
>>datagrams are limited in size so the call may validly indicate 
>>incomplete transmission even when the first part of a big message is 
>>sent successfully.)
>>
>   Looking in the wrong documentation  
>
>   You probably should be looking at the UDP RFC. Or maybe just
>
> http://www.diffen.com/difference/TCP_vs_UDP
>
> """
> Packets are sent individually and are checked for integrity only if they
> arrive. Packets have definite boundaries which are honored upon receipt,
> meaning a read operation at the receiver socket will yield an entire
> message as it was originally sent.
> """
>
>   Even if the IP layer has to fragment a UDP packet to meet limits of the
> transport media, it should put them back together on the other end before
> passing it up to the UDP layer. To my knowledge, UDP does not have a size
> limit on the message (well -- a 16-bit length field in the UDP header).

So they are "limited in size" like the OP wrote.  (A TCP stream OTOH is
potentially infinite.)

But also, the IPv4 RFC says:

All hosts must be prepared to accept datagrams of up to 576 octets
(whether they arrive whole or in fragments).  It is recommended
that hosts only send datagrams larger than 576 octets if they have
assurance that the destination is prepared to accept the larger
datagrams.

As for "all or nothing" with UDP datagrams, you also have the socket
layer case where the user does read() into a 1000 octet buffer and the
datagram was 1200 octets.  With BSD sockets you can (if you try)
detect this, but the extra 200 octets are lost forever.
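
One way to do that detection from Python, as a sketch: on Unix-like systems
socket.recvmsg() returns the receive flags, and MSG_TRUNC is set when the
datagram did not fit.  The helper name recv_datagram is made up here, and
recvmsg() is not available on every platform:

```python
import socket

def recv_datagram(sock, bufsize):
    """Receive one datagram and report whether it was cut to fit bufsize."""
    data, _ancdata, flags, addr = sock.recvmsg(bufsize)
    truncated = bool(flags & socket.MSG_TRUNC)  # the excess is already gone
    return data, truncated, addr
```

Knowing about the truncation does not bring the lost octets back; the only
real fix is a buffer at least as large as the biggest datagram the protocol
allows.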

> But since it /is/ "got it all" or "dropped" with no inherent confirmation,
> one would have to embed their own protocol within it -- sequence numbers with
> ACK/NAK, for example. Problem: if using LARGE UDP packets, this protocol
> would mean having LARGE resends should packets be dropped or arrive out of
> sequence (and since the ACK/NAK could be dropped too, you may have to
> handle the case of a duplicated packet -- also large).
>
>   TCP is a stream protocol -- the protocol will ensure that all data
> arrives, and that it arrives in order, but does not enforce any boundaries
> on the data; what started as a relatively large packet at one end may
> arrive as lots of small packets due to intermediate transport limits (one
> can visualize a worst case: each TCP packet is broken up to fit Hollerith
> cards; 20bytes for header and 60 bytes of data -- then fed to a reader and
> sent on AS-IS).

The problem is IMO more this: the chunks of data that the application
writes don't map to what the other application reads.  In the lower
layers, I don't expect TCP segments to be split, and IP fragmentation
(if it happens at all) operates at an even lower level.

However the end result is still just as you write:

> Boundaries are the end-user responsibility... line endings
> (look at SMTP, where an email message ends on a line containing just a ".")
> or embedded length counter (not the TCP packet length).
>
>>Receiving no bytes is taken as indicating the end of the communication. 
>>That's OK for TCP but not for UDP so there should be a way to 
>>distinguish between the end of data and receiving an empty datagram.
>>
>   I don't believe UDP supports a truly empty datagram (length of 0) --
> presuming a sending stack actually sends one, the receiving stack will
> probably drop it as there is no data to pass on to a client

UDP datagrams of length 0 work (just tried it on Linux).  There's
nothing special about it.
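
A small sketch of that experiment, assuming UDP over the loopback interface
is available; the function name send_and_receive is made up for
illustration:

```python
import socket

def send_and_receive(payload, bufsize=2048):
    """Send one UDP datagram over loopback; return what recvfrom() yields."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
        rx.bind(("127.0.0.1", 0))  # let the OS pick a free port
        rx.settimeout(5)           # don't hang if the datagram is dropped
        tx.sendto(payload, rx.getsockname())
        data, sender = rx.recvfrom(bufsize)
        return data, sender
```

Calling send_and_receive(b"") should hand back an empty bytes object together
with the sender's address, so with UDP an empty result from recvfrom() is a
datagram, not an end-of-stream marker.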

> (there is a PR
> at work because we have a UDP driver that doesn't drop 0-length messages,
> but also can't deliver them -- so the circular buffer might fill with
> undeliverable headers)

Those messages should be delivered to the receiving socket, in the
sense that they are sanity-checked, used to wake up the application
and mark the socket readable, fill up one entry in the read queue and
so on ...

Of course your system at work may have the right to be more
restrictive, if it's special-purpose.

/Jorgen



Re: Socket programming

2015-01-04 Thread Jorgen Grahn
On Sat, 2015-01-03, Dan Stromberg wrote:
> On Sat, Jan 3, 2015 at 3:43 AM, pramod gowda pramod.s...@gmail.com wrote:
...

>> data=client_socket.recv(1024)
>> print(data)
>> client_socket.close()
>
> But note that if you send 10 bytes into a socket, it could be received
> as two chunks of 5, or other strangeness. So you should frame your
> data somehow - adding crlf to the end of your send's is one simple
> way.

I like to think of it as defining the protocol rather than framing
your data.  But it ends up as the same thing: making sure each end
knows when it should stop looking for more data and start /acting/ on
it.

And yes, you can't do much with a TCP socket without setting up these
rules. It's important to see that no one does it /for/ you.
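
One possible set of such rules, as a sketch: each message is prefixed with
its length as a 2-byte big-endian integer (a CRLF terminator, as suggested
above, would work equally well for text).  The function names are made up
for illustration:

```python
import socket
import struct

def send_msg(sock, payload):
    """Send one message, prefixed with its length as a 2-byte integer."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too long for a 2-byte length prefix")
    sock.sendall(struct.pack("!H", len(payload)) + payload)

def recv_exact(sock, n):
    """Keep calling recv() until exactly n bytes have arrived."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_msg(sock):
    """Receive one complete message, however TCP chose to chop it up."""
    (length,) = struct.unpack("!H", recv_exact(sock, 2))
    return recv_exact(sock, length)
```

With this in place the receiver knows exactly when to stop reading and start
acting, which a bare recv(1024) cannot tell it.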

/Jorgen



Re: New user's initial thoughts / criticisms of Python

2013-11-10 Thread Jorgen Grahn
On Sat, 2013-11-09, Chris Angelico wrote:
> On Sun, Nov 10, 2013 at 12:08 AM, John von Horn j@btinternet.com wrote:
...
>> * Why not allow floater=float(int1/int2) - rather than floater=float
>> (int1)/float(int2)?
>>
>> Give me a float (or an error message) from evaluating everything in the
>> brackets. Don't make me explicitly convert everything myself (unless I
>> want to)
>
> As others have said, what you're asking for is actually magic. One of
> the rules of Python - one for which I'm not aware of any exceptions -
> is that you can always take a subexpression out and give it a new
> name:

And it's not just Python: programming languages have been designed
that way since at least the 1960s.  People are used to analysing
expressions inside and out according to rules common for almost all
languages.
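
A small illustration of the rule, as a sketch.  The `//` operator is used
here to get the integer division that `/` performed on two ints in Python 2;
in Python 3, `/` already yields a float.  The function names are made up:

```python
def floater_inline(int1, int2):
    # The inner expression is evaluated first, on its own terms.
    return float(int1 // int2)

def floater_named(int1, int2):
    # Same computation with the subexpression pulled out and named.
    quotient = int1 // int2
    return float(quotient)

def floater_converted(int1, int2):
    # Converting the operands first is a different computation.
    return float(int1) / float(int2)
```

The first two always agree, because float() only ever sees the finished
result of the division; only the third keeps the fractional part.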

/Jorgen



Re: Languages for different purposes (was Re: New user's initial thoughts / criticisms of Python)

2013-11-10 Thread Jorgen Grahn
On Sun, 2013-11-10, Chris Angelico wrote:
> On Sun, Nov 10, 2013 at 11:41 AM, Roy Smith r...@panix.com wrote:
>> On 09/11/2013 22:58, Chris Angelico wrote:
>>>
>>> * Some languages are just fundamentally bad. I do not recommend ever
>>> writing production code in Whitespace, Ook, or Piet.
>>
>> One of the worst coding experiences I ever had was trying to build an
>> app for a Roku media player.  They have a home-grown language called
>> BrightScript.  Barf.
>
> And this is exactly why I was so strongly against the notion of
> developing an in-house scripting language. It may be a lot of work to
> evaluate Lua, Python, JavaScript, and whatever others we wanted to
> try, but it's a *lot* less work than making a new language that
> actually is worth using.

Yes.  I am baffled that people insist on doing the latter. Designing a
limited /data/ language is often a good idea; designing something
which eventually will need to become Turing-complete is not.

/Jorgen



Re: What does it take to implement a chat system in Python (Not asking for code just advice before I start my little project)

2013-07-19 Thread Jorgen Grahn
On Thu, 2013-07-18, Chris Angelico wrote:
...
> You can certainly do your server-side programming directly in Python;
> in fact, I recommend it for this task. There's no reason to use HTTP,
> much less a web framework (which usually consists of a structured way
> to build HTML pages, plus a bunch of routing and stuff, none of which
> you need). All you need is a simple structure for separating one
> message from another.
>
> I would recommend either going MUD/TELNET style
> and ending each message with a newline, or prefixing each message with
> its length in octets. Both ways work very nicely; newline-termination
> allows you to use a MUD client for debugging, which I find very
> convenient

It's definitely the way to go.  It's not just MUDs -- a lot of
Internet protocols work that way.  Netcat is one popular client for
talking to/debugging/testing such servers.  No doubt MUD clients too,
but last time I did such stuff was in 1993, and I think I used telnet
...

In fact, I'd design the protocol before starting to write code.  Or
better, steal some existing chat protocol.  Like a subset of IRC.


There's also another question in the original posting that bothers me:
paraphrased, "do I need to learn database programming to manage users?"
No!  Unix does fine with plain-text files.

Managing credentials (understanding cryptography, setting up a support
organization for resetting lost passwords ...) is non-trivial though,
so try to do without such things at first.  It's not obvious that you
should need an account for an experimental chat, anyway.

/Jorgen



Re: Diagnosing socket Connection reset by peer

2013-05-22 Thread Jorgen Grahn
On Wed, 2013-05-22, Dave Angel wrote:
> On 05/22/2013 04:46 AM, loial wrote:
SNIP

>> Is there any additional tracing I can do (either within my
>> python code or on the network) to establish what is causing this
>> error?
>
> Try using Wireshark.  It can do a remarkable job of filtering,
> capturing, and analyzing packets.  It can also read and write pcap
> files, which you could either save for later analysis, or send to
> someone who might help.

Or use tcpdump, which has a text interface so you can show the problem
in a text medium like Usenet.

> (Note - unfiltered pcap files can be very large
> on a busy network, but if you can quiet other traffic, you may not need
> to filter at all.)

Or simply filter.  It's not hard -- the capture filter
"host my-printer-hostname-or-address" is enough.

/Jorgen



Re: Diagnosing socket Connection reset by peer

2013-05-22 Thread Jorgen Grahn
On Wed, 2013-05-22, Matt Jones wrote:
> On Wed, May 22, 2013 at 3:46 AM, loial jldunn2...@gmail.com wrote:
>
>> I have a sockets client that is connecting to a printer and occasionally
>> getting the error "104 Connection reset by peer"
>>
>> I have not been able to diagnose what is causing this. Is there any
>> additional tracing I can do (either within my python code or on the
>> network) to establish what is causing this error?
>>
>> Currently I am simply trapping socket.error
>>
>> Python version is 2.6
>
> This typically indicates that the peer at the other end of the tcp
> connection severed the session without the typical FIN packet.

I.e. by sending a RST (reset) instead.  Yes, that's what "Connection
reset by peer" means.  I don't think there are any other causes for
this signal.

A server application can cause a reset explicitly, or if it crashes
the OS will send one for it, as part of the resource cleanup.

Also, if you're behind a cheap NATing gateway, I think it may send
fake RSTs if it has lost track of the TCP session.

> If you're
> treating the printer as a blackbox then there really isn't anything you
> can do here except catch the exception and attempt to reconnect.

Yes.  Note that there *may* be some uncertainty re: did the printer
process the last request before the reset or not? E.g. I wouldn't
endlessly retry printing a 100-page document in that case.
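
A sketch of a bounded retry along those lines.  It is written for Python 3,
where a reset surfaces as an OSError with errno ECONNRESET (the Python 2.6
in the original question would catch socket.error and inspect its errno
instead).  The name call_with_reset_retry and the `action` callable are made
up for illustration:

```python
import errno

def call_with_reset_retry(action, max_attempts=3):
    """Call action(); retry a limited number of times on ECONNRESET only."""
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except OSError as exc:
            # Give up on any other error, or when the attempts are used up.
            if exc.errno != errno.ECONNRESET or attempt == max_attempts:
                raise
```

Here `action` would be whatever connects to the printer and sends one job.
Whether repeating it is safe is exactly the uncertainty mentioned above, so
the limit should stay small.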

/Jorgen



Re: Diacretical incensitive search

2013-05-20 Thread Jorgen Grahn
On Fri, 2013-05-17, Olive wrote:

> One feature that seems to be missing in the re module (or any tools
> that I know for searching text) is diacritic-insensitive search. I
> would like to have a match for something like this:
>
> re.match("franc", "français")
...

> The algorithm to write such a function is trivial but there are a
> lot of marks we can put on a letter. It would be necessary to have the
> list of a's with something on it. i.e. à,á,ã, etc. and this for
> every letter. Trying to make such a list by hand would inevitably lead
> to some symbols forgotten (and would be tedious).

Ok, but please remember that the diacriticals are of varying importance.
The English "naïve" is easily recognizable when written as "naive".
The Swedish word "får" cannot be spelled "far" and still be understood.

This is IMHO out of the scope of re, and perhaps case-insensitivity
should have been too.  Perhaps it /would/ have been, if regular
expressions hadn't come from the ASCII world where these things are
easy.
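
For what it's worth, the list of marked letters need not be built by hand.
A sketch of doing the folding outside re, using the standard unicodedata
module: decompose the text (NFD) and drop the combining marks before
matching.  The function names are made up for illustration:

```python
import re
import unicodedata

def strip_marks(text):
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def match_ignoring_marks(pattern, text):
    """re.match() against a copy of text with the diacritics removed."""
    return re.match(pattern, strip_marks(text))
```

This treats every mark as unimportant, so it turns "får" into "far" just as
readily as "naïve" into "naive"; and letters that have no decomposition,
such as "ø", pass through unchanged.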

/Jorgen



Re: Imaging libraries in active development?

2012-11-28 Thread Jorgen Grahn
On Wed, 2012-11-28, Christian Heimes wrote:
> On 28.11.2012 19:14, Michael Torrie wrote:
>> I'm curious.  What features do you need that pil doesn't have?  Other
>> than updating pil to fix bugs, support new image types or new versions
>> of Python, what kind of active development do you think it needs to
>> have? Maybe pil has all the features the original author wanted and is
>> pretty stable.  To judge a package on how fast it's changing seems a bit
>> odd to me.

Not to me -- the slower the change, the better!

>> Obviously you want to know that bugs can get fixed of
>> course.  Perhaps none have been reported recently.
>
> PIL is missing a bunch of features like proper TIFF support (no
> multipage, g3/g4 compression and more), JPEG 2000,

I thought those formats had been dead for about a decade?  (Ok, I know
TIFF has niches, but JPEG 2000?)

> RAW and HDR image
> formats, tone mapping, proper ICC support, PEP 3118 buffer support ...

I won't comment on those, but they seem likely to be valid complaints.

> PIL is also rather slow. My smc.freeimage library can write JPEGs about
> six times faster, because it uses libjpeg-turbo. Only some Linux
> distributions have replaced libjpeg with the turbo implementation.

That seems like an argument for *not* having support for many file
formats in the imaging library itself -- just pipeline into the best
standalone utilities available.

/Jorgen



Re: please help me to debud my local chat network program

2012-11-28 Thread Jorgen Grahn
On Wed, 2012-11-28, Chris Angelico wrote:
> On Wed, Nov 28, 2012 at 1:50 PM, Minh Dang dangbaminh...@gmail.com wrote:
>> Hello everybody, i am doing my project: local network chat using python
>> here is my file
>> http://www.mediafire.com/?cc2g9tmsju0ba2m
>
> Hmm. Might I recommend some other means of sharing your code? The
> unrar-free utility from the Debian repo won't extract more than the
> first file (accounts.txt), and I don't know if that's a problem with
> unrar-free or your file. A better-known format like zip or tar.gz
> would make things easier.

Or a Git repository at github.com or similar.

/Jorgen



Re: Preventing crap email from google?

2012-10-20 Thread Jorgen Grahn
On Sat, 2012-10-20, Michael Torrie wrote:
> On 10/19/2012 06:43 PM, Mark Lawrence wrote:
>> Good morning/afternoon/evening all,
>>
>> Is there any possibility that we could find a way to prevent the double
>> spaced rubbish that comes from G$ infiltrating this ng/ml?  For example,
>> does Python have anybody who works for G$ who could pull a few strings,
>> preferably a Dutch national who has named a quite well known programming
>> language after a humorous BBC television programme?
>
> I don't know what G$ is, but from your subject, I assume you are
> complaining about posts from gmail users, of which I am one.  However my
> messages have never been double-spaced.  And I haven't noticed any
> recently either.  I just checked and the last dozen or so posts from
> gmail accounts don't appear to have any problems.

He's probably talking about Google Groups and its integration with
Usenet. (I'm seeing this as the comp.lang.python newsgroup, via a real
news reader and a real Usenet provider.  I am vaguely aware that it's
also a mailing list.)

Google Groups has a long history of creating broken Usenet postings in
new and exciting ways.  It's at least as bad as MS Outlook Express and
AOL were in Usenet's past.

Your own posting (or mail) is almost flawless: correct quoting, and a
properly formatted response.  But you seem to be using gmail and the
mailing list interface; that's not the technology he's complaining
about.

/Jorgen



Re: Experimental Python-based shell

2012-10-03 Thread Jorgen Grahn
On Tue, 2012-10-02, Jonathan Hayward wrote:
> I've made an experimental Python-based Unix/Linux shell at:
>
> http://JonathansCorner.com/cjsh/
>
> An experimental Unix/Linux command line shell, implemented in Python
> 3, that takes advantage of some more recent concepts in terms of
> usability and searching above pinpointing files in hierarchies.
>
> I invite you to try it.

Hard to do without a manual page, or any documentation at all except
for a tiny "hello world"-style example ...

/Jorgen



Re: simple client data base

2012-09-08 Thread Jorgen Grahn
On Sat, 2012-09-08, Mark R Rivet wrote:
> On Thu, 6 Sep 2012 01:57:04 -0700 (PDT), Bryan
> bryanjugglercryptograp...@yahoo.com wrote:
...
>> comp.lang.python tries to be friendly and helpful, and to that end
>> responders have read and answered your question as directly as
>> possible. There's good stuff available for Python.
>>
>> Mark, there is absolutely no chance, no how, no way, that your stated
>> plan is a good idea. Fine CRM apps are available for free; excellent
>> ones for a few dollars. You're reading about lists, tuples, and
>> dictionary data? Great, but other home accounting businesses have
>> their client databases automatically synced with their smart-phones
>> and their time-charging and their invoicing.
>>
>> -Bryan
>
> Well I have to say that this is most discouraging. I should give up
> learning to program. I don't have a chance at all. Thanks.

He's saying you don't have a chance, but he's *not* telling you to
give up programming.

Personal reflection: it's risky to make friends and relatives depend
on the success of your hobby projects.  I got away with it once or
twice (under special circumstances) but it might equally well have
ended with resentment. "Why did he sell this crap idea to me?  Now I'm
stuck with it."

/Jorgen



Re: strptime format string nasty default

2012-05-09 Thread Jorgen Grahn
On Wed, 2012-05-09, Javier Novoa C. wrote:
> Hi,
>
> I am using time.strptime method as follows:
>
> I receive an input string, representing some date in the following
> format:
>
> %d%m%Y
>
> However, the day part may be a single digit or two, depending on
> magnitude.
>
> For example:
>
> '10052012' will be parsed as day 10, month 5, year 2012
>
> Again:
>
> '8052012' will be parsed as day 8, month 5, year 2012
>
> What happens when day is 1 or 2?
>
> '1052012' will be parsed as day 10, month 5, year 2012
>
> That's not the expected behaviour! Not for me at least. I mean, in my
> case, month will always be a 2 digit string, so there's no ambiguity
> problem by pretending that... say '1052012' is correctly parsed.
>
> Is there a way out of here? I know I can pre-parse the string and
> append a '0' to it when length == 7, but I think that a better way
> would be if strptime allowed me to define my format in a better
> way... To say that the month is the optional one-two digit part is
> just a default, isn't it? Why can't I specify that the day part is the
> one with one-or-two digits on the input string...?
>
> Or is there a way out that I don't know yet?

You'd have to read the strptime(3) manual page (it's a Unix function,
imported straight into Python, I'm sure). Judging from a quick read
it's not intended to support things like these. I'm surprised it
doesn't parse your last example to (10, 52, 12) and then fail it due
to month > 12.

Can't you use a standard date format, like ISO? Apart from not being
possible to parse with standard functions, this one looks quite odd
and isn't very human-readable.

If you have to use this format, I strongly recommend parsing it
manually as text first. Then you can create an ISO date and feed
that to strptime, or perhaps use your parsed (day, month, year) tuple
directly.
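
A sketch of the manual route, assuming (as stated in the question) that the
month is always two digits and the year four, so only the day varies in
width.  The function name parse_dmy is made up for illustration:

```python
from datetime import date

def parse_dmy(text):
    """Parse 'DMMYYYY' or 'DDMMYYYY' into a date, splitting from the right."""
    if not text.isdigit() or len(text) not in (7, 8):
        raise ValueError("expected 7 or 8 digits, got %r" % (text,))
    day = int(text[:-6])       # whatever is left: one or two digits
    month = int(text[-6:-4])   # always two digits
    year = int(text[-4:])      # always four digits
    return date(year, month, day)  # date() rejects impossible values
```

Splitting from the right removes the ambiguity: '1052012' becomes day 1,
month 5, and date.isoformat() then gives the ISO form if one is needed.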

/Jorgen



Re: Fast file data retrieval?

2012-03-13 Thread Jorgen Grahn
On Mon, 2012-03-12, MRAB wrote:
> On 12/03/2012 19:39, Virgil Stokes wrote:
>> I have a rather large ASCII file that is structured as follows
>>
>> header line
>> 9 nonblank lines with alphanumeric data
>> header line
>> 9 nonblank lines with alphanumeric data
>> ...
>> ...
>> ...
>> header line
>> 9 nonblank lines with alphanumeric data
>> EOF
>>
>> where, a data set contains 10 lines (header + 9 nonblank) and there can
>> be several thousand
>> data sets in a single file. In addition,*each header has a* *unique ID
>> code*.
>>
>> Is there a fast method for the retrieval of a data set from this large
>> file given its ID code?

[Responding here since the original is not available on my server.]

It depends on what you want to do. Access a few of the entries (what
you call data sets) from your program? Process all of them?  How fast
do you need it to be?

> Probably the best solution is to put it into a database. Have a look at
> the sqlite3 module.

Some people like to use databases for everything, others never use
them. I'm in the latter crowd, so to me this sounds like overkill, and
possibly impractical. What if he has to keep the text file around? A
database on disk would mean duplicating the data. A database in memory
would not offer any benefits over a hash.

> Alternatively, you could scan the file, recording the ID and the file
> offset in a dict so that, given an ID, you can seek directly to that
> file position.

Mmapping the file (the mmap module) is another option.
But I wonder if this really would improve things.

Several thousand entries is not much these days. If a line is 80
characters, 5000 entries would take ~3MB of memory. The time to move
this from disk to a Python list of 9-tuples of strings would be almost
only disk I/O.

I think he should try to do it the dumb way first: read everything
into memory once.
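
The dumb way could look something like this sketch.  The header layout was
not given in the question, so taking the ID to be the first
whitespace-separated field of the header line is purely an assumption, as is
the function name load_data_sets:

```python
def load_data_sets(lines, lines_per_set=10):
    """Read every data set into a dict keyed by the ID in its header line."""
    data_sets = {}
    block = []
    for line in lines:
        block.append(line.rstrip("\n"))
        if len(block) == lines_per_set:
            header, body = block[0], tuple(block[1:])
            set_id = header.split()[0]  # assumption: ID is the first field
            data_sets[set_id] = body    # the 9 data lines as a tuple
            block = []
    return data_sets
```

Any iterable of lines will do, including an open file object; after one pass
over the file each lookup is a plain dictionary access.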

/Jorgen



Re: What other languages use the same data model as Python?

2011-05-02 Thread Jorgen Grahn
On Sun, 2011-05-01, Terry Reedy wrote:
> On 5/1/2011 4:45 AM, Steven D'Aprano wrote:
...
>> What other languages use the same, or mostly similar, data model as
>> Python?
>
> Natural languages. That is why I think it is better to think of Python
> as an algorithm language or information-object manipulation language
> rather than as just a linear-memory machine language. A linear memory
> with bytes addressed from 0 to max-int or max-long is an explicit part
> of the definition of assembly languages and C. It is no part of the
> definition of Python.

It's not part of the definition of C either -- C supports segmented
memory (pre-386 Intel) and separate code/data address spaces. (Even if
most C users tend not to think of it that way.)

/Jorgen



Re: Compile 32bit C-lib on 64 bit

2011-05-02 Thread Jorgen Grahn
On Sun, 2011-05-01, Hegedüs Ervin wrote:
> Hello,
>
> this is not a clear Python question - I've written a module in C,
> which uses a 3rd-party lib - it's a closed source, I just get the
> .so, .a and a header file.
>
> Looks like it works on 32bit (on my desktop), but it must be run
> on 64bit servers.
>
>
> When I'm compiling it on 64bit, gcc says:
>
> /usr/bin/ld: skipping incompatible /lib32/lib3rdpartyCrypt.so when searching for -l3rdpartyCrypt

A different angle: if you need to use closed-source, non-standard
crypto, and the suppliers cannot even be bothered to compile it for
the CPU architecture that has been standard for at least 6--7 years,
start planning to replace it *now*.

/Jorgen



Re: Python IDE/text-editor

2011-04-17 Thread Jorgen Grahn
On Sat, 2011-04-16, Chris Angelico wrote:
> Based on the comments here, it seems that emacs would have to be the
> editor-in-chief for programmers. I currently use SciTE at work; is it
> reasonable to, effectively, bill my employer for the time it'll take
> me to learn emacs? I'm using a lot of the same features that the OP
> was requesting (multiple files open at once, etc), plus I like syntax
> highlighting (multiple languages necessary - I'm often developing
> simultaneously in C++, Pike, PHP, and gnu make, as well as Python).

Your editor seems popular, free, cross-platform and capable ... if you
already know it well, I can't see why you should switch.

Unless you're truly not productive in SciTE, but I'd have to watch
you use it for hours to tell.

(That should really be a new job title. Just as there are aerobics
instructors or whatever at the gyms to help you use the equipment
there safely and efficiently, there should be text editor instructors!)

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Python IDE/text-editor

2011-04-16 Thread Jorgen Grahn
On Sat, 2011-04-16, Alec Taylor wrote:
 On Sat, Apr 16, 2011 at 3:29 PM, John Bokma j...@castleamber.com wrote:
 Ben Finney ben+pyt...@benfinney.id.au writes:

 Alec Taylor alec.tayl...@gmail.com writes:

 I'm looking for an IDE which offers syntax-highlighting,
 code-completion, tabs, an embedded interpreter and which is portable
 (for running from USB on Windows).

 Either of Emacs URL:http://www.gnu.org/software/emacs/ or Vim
 URL:http://www.vim.org/ are excellent general-purpose editors that
 have strong features for programmers of any popular language or text
 format.

 I second Emacs or vim. I currently use Emacs the most, but I think it's
 good to learn both.

 Thanks, but none of the IDEs so far suggested have an embedded python
 interpreter AND tabs...
 emacs having the opposite problem, missing tabs (also,
 selecting text with my mouse is something I do often).

Does it *have* to be tabs? Why? Both Emacs and Vim can have multiple
files open, and have various powerful ways to navigate between them.

If you cannot stand non-tabbed interfaces, you probably can't stand
other non-Windows-like features of these two, like their menu systems.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Purely historic question: VT200 text graphic programming

2011-03-11 Thread Jorgen Grahn
On Thu, 2011-03-10, Martin Gregorie wrote:
 On Thu, 10 Mar 2011 20:31:11 +, Grant Edwards wrote:

 You tricked me by saying only DEC VAX/VMS programmers would know what it
 was.  In fact, many, many Unix programmers knew about curses (and still
 do) and very few VMS programmers ever did.  C wasn't very widely used
 under VMS, and VMS had its own screen formatting and form handling
 libraries.

 From the context the "only DEC VAX/VMS programmers" remark applied to the 
 VT-100. However, the OP is wrong about that - VT-100s were well-known and 
 popular devices in the 8-bit microprocessor world too, together with 
 assorted clones. In addition, many other terminals had a VT-100 emulation 
 mode. IIRC all the Wyse terminals had that.

But he wrote VT-200, not VT-100. I assumed he meant those (vt200) had
some exotic graphics mode. The VT-xxx series was pretty heterogeneous,
although most of us think of them as more or less fancy VT-100s.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: ImSim: Image Similarity

2011-03-05 Thread Jorgen Grahn
On Sat, 2011-03-05, Grigory Javadyan wrote:
 At least you could've tried to make the script more usable by adding
 the possibility to supply command line arguments, instead of editing
 the source every time you want to compare a couple of images.

 On Sat, Mar 5, 2011 at 11:23 AM, n00m n...@narod.ru wrote:
 Let me present my newborn project (in Python) ImSim:

 http://sourceforge.net/projects/imsim/

 Its README.txt:
 -
 ImSim is a python script for finding the most similar pic(s) to
 a given one among a set/list/db of your pics.
 The script is very short and very easy to follow and understand.
 Its sample output looks like this:
...
 The *lower* the numeric value, the *more similar* this pic is to the
 tested pic. If this value > 70 almost for sure these pictures are
 absolutely different (from totally different domains, so to speak).

 What is similarity and how can/could/should it be estimated? This
 point I'm leaving for your consideration/contemplation/arguing etc.

So basically you're saying you won't tell the users what the program
*does*. I don't get that.

Is it better than this?
- scale each image to 100x100
- go black-and-white in such a way that half the pixels are black
- XOR the images and count the mismatches

That takes care of JPEG quality, scaling and possibly gamma
correction, but not cropping or rotation. I'm sure there are better,
well-known algorithms.
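
The three steps listed above can be sketched in pure Python. This is
only an illustration under assumptions: each image is taken to be
already decoded into a flat, row-major list of grayscale values plus
its width and height, and the function names are made up here. It says
nothing about how ImSim itself works.

```python
from statistics import median


def resize_nearest(pixels, width, height, size):
    """Nearest-neighbour scale of a row-major grayscale image to size x size."""
    return [
        pixels[(y * height // size) * width + (x * width // size)]
        for y in range(size)
        for x in range(size)
    ]


def to_bits(pixels):
    """Threshold at the median so that roughly half the pixels are black (0)."""
    threshold = median(pixels)
    return [1 if p > threshold else 0 for p in pixels]


def mismatches(img_a, img_b, size=100):
    """Each image is a (pixels, width, height) tuple.

    Both are scaled to size x size and reduced to one bit per pixel;
    the result is the number of positions where the bits differ (XOR).
    """
    bits_a = to_bits(resize_nearest(*img_a, size))
    bits_b = to_bits(resize_nearest(*img_b, size))
    return sum(a ^ b for a, b in zip(bits_a, bits_b))
```

Because the threshold is the image's own median, a uniformly brighter
or darker copy of a picture yields the same bits, which is why the
scheme tolerates differences in overall brightness.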

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Creating Long Lists

2011-02-23 Thread Jorgen Grahn
On Tue, 2011-02-22, Ben Finney wrote:
 Kelson Zawack zawack...@gis.a-star.edu.sg writes:

 I have a large (10gb) data file for which I want to parse each line
 into an object and then append this object to a list for sorting and
 further processing.

 What is the nature of the further processing?

 Does that further processing access the items sequentially? If so, they
 don't all need to be in memory at once, and you can produce them with a
 generator URL:http://docs.python.org/glossary.html#term-generator.

He mentioned sorting them -- you need all of them for that.

If that's the *only* such use, I'd experiment with writing them as
sortable text to file, and run GNU sort (the Unix utility) on the file.
It seems to have a clever file-backed sort algorithm.
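
A rough sketch of that idea, assuming a sort(1) binary is on the PATH
and that each record can be reduced to a non-negative integer key plus
one line of text. The function names and the zero-padded key format
are illustrative choices, not anything from the original poster's code.

```python
import os
import subprocess


def write_sortable(records, path):
    """Write (int_key, text) pairs one per line.

    The key is zero-padded so that plain byte-wise ordering of the
    lines equals numeric ordering of the keys.
    """
    with open(path, "w") as f:
        for key, text in records:
            f.write("%020d\t%s\n" % (key, text))


def sort_file(src_path, dst_path):
    """Sort the lines of src_path into dst_path with the external sort."""
    # LC_ALL=C asks for byte-wise comparison, independent of the locale.
    env = dict(os.environ, LC_ALL="C")
    subprocess.run(["sort", "-o", dst_path, src_path], check=True, env=env)
```

The Python process then only has to read the sorted file back one
line at a time, so the whole data set never has to fit in memory.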

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: How to use Python well?

2011-02-19 Thread Jorgen Grahn
On Sat, 2011-02-19, Ben Finney wrote:
 Roy Smith r...@panix.com writes:
...
 HTML also gives you much greater formatting flexibility than what's
 still basically 35-year old nroff.

 Full agreement there.

Some disagreement here. There are typographical features in
nroff/troff today which you don't get in web browsers: ligatures and
hyphenation for example.

Then of course there's the argument that formatting flexibility
isn't a good thing for reference manuals -- you want them to look
similar no matter who wrote them. (Not that all man pages look similar
in reality, but there are some pretty decent conventions which most
follow).

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Archiving Modules

2011-02-18 Thread Jorgen Grahn
On Fri, 2011-02-18, Alexander Kapps wrote:
 On 18.02.2011 19:51, Westley Martínez wrote:
 On Fri, 2011-02-18 at 04:55 -0800, peter wrote:
 On Feb 17, 9:55 pm, Jorgen Grahn grahn+n...@snipabacken.se wrote:


 RAR is a proprietary format, which complicates things. For example,
 Linux distributions like Debian cannot distribute software which
 handles it.  If Python included such a module, they'd be forced to
 remove it from their version.

 Good point, and one that I did not appreciate.  But there are freeware
 applications such as jzip (http://www.jzip.com) which can handle .rar
 files, so there must be a way round it.

 I wouldn't encourage its use by writing /more/ software which handles
 it. IMHO, archives should be widely readable forever, and to be that
 they need to be in a widely used, open format.

 I agree, but if a file is only available as a rar archive I would like
 to be able to extract it without using another 3rd party application.

 peter
 Freeware is still proprietary software.

It can be (freeware is a vague term). As I understand the situation
here, such software is either in a gray area legally, or the author
has made some kind of special agreement with the RAR people.

 While I agree with the general refusal of .rar or other non-free 
 archive formats, a useful archiving tool should still be able to 
 extract them. Creating them is another issue.

 There is a free (open source) un-rar for Linux which AFAIK can at 
 least handle .rar archives below v3.

That's part of my point -- unrar-free is the only decoder free enough
to be distributed by Debian, and yes, it's limited to decoding old
versions of the rar file format. Wikipedia seems to say it was based
on RAR as it looked before some license terms change.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: How to use Python well?

2011-02-18 Thread Jorgen Grahn
On Thu, 2011-02-17, Roy Smith wrote:
 In article slrnilr5lj.15e.grahn+n...@frailea.sa.invalid,
  Jorgen Grahn grahn+n...@snipabacken.se wrote:

 - Write user documentation and build/installation scripts. Since I'm
   on Unix, that means man pages and a Makefile.

 Wow, I haven't built a man page in eons.  These days, user documentation 
 for me means good help text for argparse to use.

Perhaps I'm old-fashioned, but all other software I use (on Unix) has
man pages. I /expect/ there to be one. (It's not hard to write a man
page either, if you have a decent one as a template.)

Help texts are better than nothing though (and unlike man pages they
work on Windows too).

 If I need something 
 more than that, I'll write it up in our wiki.

I guess you're working within an organization? Local rules override
personal preferences -- if everyone else is using the wiki, I guess
you must do too.

I have to say though that *not* handling the documentation together
with the source code is harmful.  If source code and documentation
aren't in version control together, they *will* go out of sync.

 Anyway, I don't feel bad if I don't find any classes at first.

 Same here.  I don't usually find a reason to refactor things into 
 classes until I've written the second or third line of code :-)

 Somewhat more seriously, the big clue for me that I've got a class 
 hiding in there is when I start having all sorts of globals.  That's 
 usually a sign you've done something wrong.

Or a whole bunch of related arguments to a function, and/or the same
arguments being passed to many functions.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: How to use Python well?

2011-02-17 Thread Jorgen Grahn
On Wed, 2011-02-16, snorble wrote:
 I use Python a lot, but not well. I usually start by writing a small
 script, no classes or modules. Then I add more content to the loops,
 and repeat. It's a bit of a trial and error learning phase, making
 sure I'm using the third party modules correctly, and so on. I end up
 with a working script, but by the end it looks messy, unorganized, and
 feels hacked together. I feel like in order to reuse it or expand it
 in the future, I need to take what I learned and rewrite it from
 scratch.

 If I peeked over a Python expert's shoulder while they developed
 something new, how would their habits differ? Do they start with
 classes from the start?

 I guess I'm looking for something similar to Large Scale C++ Software
 Design for Python. Or even just a walkthrough of someone competent
 writing something from scratch. I'm not necessarily looking for a
 finished product that is well written. I'm more interested in, I have
 an idea for a script/program, and here is how I get from point A to
 point B.

 Or maybe I'm looking for is best practices for how to organize the
 structure of a Python program. I love Python and I just want to be
 able to use it well.

Good questions -- and you got some really good answers already!

What I always do when starting a program is:

- Split it into a 'if __name__ == "__main__":' which does the
  command-line parsing, usage message and so on; and a function
  which contains the logic, i.e. works like the program would have
  if the OS had fed it its arguments as Python types

- Document functions and classes.

- Avoid having functions use 'print' and 'sys.std*', in case I need to
  use them with other files. I pass file-like objects as arguments
  instead.

- Write user documentation and build/installation scripts. Since I'm
  on Unix, that means man pages and a Makefile.

And that's all in the normal case. No need to do anything more fancy
if it turns out I'll never have to touch that program again.
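
A minimal skeleton along those lines might look like this. The task
(counting lines that start with a prefix) and all the names are
invented for illustration; only the overall shape matters.

```python
import sys


def count_matching(lines, prefix, out):
    """The logic: count lines starting with prefix and report on 'out'.

    'lines' is any iterable of strings and 'out' any file-like object,
    so nothing here touches sys.stdin, sys.stdout or sys.argv.
    """
    count = sum(1 for line in lines if line.startswith(prefix))
    out.write("%d\n" % count)
    return count


def main(argv, infile, outfile, errfile):
    """Command-line handling only: check argv, print usage, call the logic."""
    if len(argv) != 2:
        errfile.write("usage: %s prefix < input\n" % argv[0])
        return 2
    count_matching(infile, argv[1], outfile)
    return 0


if __name__ == "__main__":
    status = main(sys.argv, sys.stdin, sys.stdout, sys.stderr)
```

A real script would normally hand main()'s return value to sys.exit().
Since count_matching() only sees an iterable and a file-like object,
it can be reused or tested with io.StringIO instead of real files.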

I use classes when I see a use for them. The "see" part comes from
quite a few years' worth of experience with object-oriented design in
Python and C++ ... not sure how to learn that without getting lost in
Design with a capital 'D' for a few years ...

Anyway, I don't feel bad if I don't find any classes at first.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Archiving Modules

2011-02-17 Thread Jorgen Grahn
On Wed, 2011-02-16, peter wrote:
 I am writing a small Tkinter utility to control archive files in
 multiple formats (mainly for my own amusement and education).
 Basically it presents the user with two adjacent listboxes, one with
 the contents of the target directory and one with the contents of the
 archive. By clicking buttons labelled '' and '' the user can copy
 files to and from the archive.  The actual archiving functionality
 derives from the modules zipfile and tarfile.

 It all seems to be working fine, but I have two residual queries.
 Firstly for the sake of completeness I would like to include .rar
 archives, but there doesn't seem to be an equivalent rarfile module.
 I use both Windows and Linux on various machines, so need a cross
 platform solution which does not depend on external modules. The only
 one I have found is at http://pypi.python.org/pypi/rarfile/1.0, but it
 seems this does rely on an external module.  Is there anything out
 there?

RAR is a proprietary format, which complicates things. For example,
Linux distributions like Debian cannot distribute software which
handles it.  If Python included such a module, they'd be forced to
remove it from their version.

I wouldn't encourage its use by writing /more/ software which handles
it. IMHO, archives should be widely readable forever, and to be that
they need to be in a widely used, open format.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Fully functioning email client in Python

2011-02-06 Thread Jorgen Grahn
On Sun, 2011-02-06, iaoua iaoua wrote:
...
 In a nutshell my problem is this. I have
 developed an intelligent conversational agent which dialogues with a
 user to find out what they are really searching for. The domain is
 open and helps the user find which businesses/websites offer the
 services or products the user is looking for according to needs
 elicited from the user during the dialogue. I have small but growing
 base of experimental users who want to access the agent via email (the
 agent can be quite slow at times due to the complexity of the system)
 so that they can ask questions and forget about them until an answer
 comes to their inbox.

OK, so the agent becomes a server with a mail interface.

 And so I need to extend my agent so that it can
 receive and send emails and extract the relevant text answers to
 questions from the user. This requires full threading capabilities and
 automatic text extraction of only the new and relevant material (minus
 signatures of course).

 Now, I'm not completely lazy and so I have already done a little
 homework. I've been reading O'Reilly's Programming Python and
 Foundations of Python Network Programming which have both been very
 helpful in ascertaining that Python is a strong candidate solution.
 Both books talk about the email (for message and header construction
 and parsing) sendmail (for sending mail via SMTP) and poplib (for
 receiving mail with POP3) libraries.
...
 What I really need is a ready made fully functional Python client that
 already has it that offers me a simple interface to extract text from
 received messages along with a message id that I can use to associate
 the response to a question with and allows me to send out new
 questions or answers along with a message id I can use to track future
 responses with. I need something that already has the whole shebang.
 Can support POP3, IMAP, SSL authentication, message threading etc.

 Can anyone point to something that will save me the effort of building
 this thing from scratch and first principles? Thanks in advance for
 any useful suggestions.

I imagine your only worry is the server side. What are your
requirements on the system it's running on?

If it's on Unix you don't need to do any network programming at all,
and can forget about IMAP, POP3, SSL and SMTP.  Using procmail you can
arrange to start a program of your choice each time a mail arrives
(and that program would put the message on your agent's work queue).
To send mail, you feed the text and the recipient address into
/usr/lib/sendmail.
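
As a sketch of the Unix approach, the program started for each
incoming mail could look roughly like this. It assumes the message
arrives as text (for example on standard input) and that the local
sendmail binary accepts the recipient as an argument and the complete
message on its standard input; the function names are hypothetical.

```python
import email
import subprocess


def parse_incoming(raw_text):
    """Return (sender, message id, first text/plain body) of one message."""
    msg = email.message_from_string(raw_text)
    body = ""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            body = part.get_payload()
            break
    return msg["From"], msg["Message-ID"], body


def send_mail(recipient, message_text, sendmail="/usr/lib/sendmail"):
    """Hand a complete message (headers, blank line, body) to the local MTA."""
    subprocess.run([sendmail, recipient],
                   input=message_text.encode("utf-8"), check=True)
```

The delivery script would call parse_incoming(sys.stdin.read()) and
put the result on the agent's work queue. The Message-ID it returns
is what a reply would quote in its In-Reply-To header for threading.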

If you build the interface on POP3 and SMTP, you have to bypass the
system's builtin mail capabilities.

If on the other hand it's on Windows, there's certainly some very
different scheme you have to follow (or bypass).

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Perl Hacker, Python Initiate

2011-02-03 Thread Jorgen Grahn
On Wed, 2011-02-02, Gary Chambers wrote:
 All,

 Given the following Perl script:

 #!/usr/bin/perl

I'm a Perl user, but I generally refuse to read Perl code which
doesn't utilize 'use warnings/-w' and 'use strict'.  There are just too
many crazy bugs and 1980s constructs which go unnoticed without them.

 %dig = (
  solaris => "/usr/sbin/dig",
  linux   => "/usr/bin/dig",
  darwin  => "/usr/bin/dig"
 );

Not related to your question, except that you'll have to deal with
this in Python too:

I really suggest letting the user's $PATH decide which dig to call.
/usr/bin is always in the path.  /usr/sbin may not be, but if that's a
problem for your users, just let your script start by appending it to
the pre-existing $PATH.  You don't even have to do OS detection on
that one -- it's safe to do everywhere.
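
In Python the same idea might look like the sketch below. The helper
name is made up, and it assumes nothing about the rest of the script.

```python
import os


def path_with_sbin(path, extra="/usr/sbin"):
    """Return 'path' with 'extra' appended, unless it is already listed."""
    dirs = path.split(os.pathsep) if path else []
    if extra not in dirs:
        dirs.append(extra)
    return os.pathsep.join(dirs)
```

At startup the script would do something like
os.environ["PATH"] = path_with_sbin(os.environ.get("PATH", ""))
and then run plain "dig" through the subprocess module, with no table
of per-OS locations.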

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Digitally Signing a XML Document (using SHA1+RSA or SHA1+DSA)

2010-12-30 Thread Jorgen Grahn
On Tue, 2010-12-28, Adam Tauno Williams wrote:
 On Tue, 2010-12-28 at 03:25 +0530, Anurag Chourasia wrote:
 Hi All,

 I have a requirement to digitally sign a XML Document using SHA1+RSA
 or SHA1+DSA
 Could someone give me a lead on a library that I can use to fulfill
 this requirement?

 http://stuvel.eu/rsa  Never used it though.

 The XML Document has values such as 
 <RSASK>-----BEGIN RSA PRIVATE KEY-----
 MIIBOgIBAAJBANWzHfF5Bppe4JKlfZDqFUpNLrwNQqguw76g/jmeO6f4i31rDLVQ
 n7sYilu65C8vN+qnEGnPB824t/A3yfMu1G0CAQMCQQCOd2lLpgRm6esMblO18WOG
...

 Is this any kind of standard or just something someone made up?  Is
 there a namespace for the document?

 It seems quite odd that the document contains a *private* key.

 If all you need to do is parse the document to retrieve the values that
 seems straight-forward enough.

 And the XML also has another node that has a Public Key with Modules
 and Exponents etc that I apparently need to utilize.
 <RSAPK>
   <M>1bMd8XkGml7gkqV9kOoVSk0uvA1CqC7DvqD
 +OZ47p/iLfWsMtVCfuxiKW7rkLy836qcQac8Hzbi38DfJ8y7UbQ==</M>
   <E>Aw==</E>
 </RSAPK>

 I am a little thin on this concept and expecting if you could guide me
 to a library/documentation that I could utilize.

[The original posting by Anurag Chourasia did not reach my news server.]

I'd simply invoke GnuPG. A simple example:

% gpg --sign --armor foo
You need a passphrase to unlock the secret key for
user: ...

% head foo.asc  
-----BEGIN PGP MESSAGE-----
Version: GnuPG v1.4.9 (GNU/Linux)

owGs+TuuLdGWRQu9B1hTwsAHaRUhPjN+DjVAWBRgxs+nGAgHA58aUA88RHVw6K3N
2PfefJn5Mg2ko6N99lkrYn7G6KN//m//6//l//C/+N/8X/5P/6//+//u//r/+P/+
...

The result isn't XML, but it *is* a standardized file format readable
by anyone. That's worth a lot.  You can also create a detached signature
and ship it together with the original file, or skip the '--armor' and
get a binary signed file.

If you really *do* have a requirement to make the result XML-like and
incompatible with anything else, I'm afraid you're on your own, and
will have a lot of extra work testing and making sure it's all secure.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why flat is better than nested?

2010-10-27 Thread Jorgen Grahn
On Tue, 2010-10-26, Carl Banks wrote:
 On Oct 25, 11:20 pm, Jorgen Grahn grahn+n...@snipabacken.se wrote:
 On Mon, 2010-10-25, bruno.desthuilli...@gmail.com wrote:
  On 25 oct, 15:34, Alex Willmer a...@moreati.org.uk wrote:
  On Oct 25, 11:07 am, kj no.em...@please.post wrote:

   In The Zen of Python, one of the maxims is flat is better than
   nested?  Why?  Can anyone give me a concrete example that illustrates
   this point?

  I take this as a reference to the layout of the Python standard
  library and other packages i.e. it's better to have a module hierarchy
  of depth 1 or 2 and many top level items, than a depth of 5+ and only
  a few top level items.

  (snip)

  This also applies to inheritance hierarchies (which tend to be rather
  flat in Python compared to most mainstream OOPLs), as well as nested
  classes etc.

 Which mainstream languages are you thinking of?  Java?  Because C++ is
 as flat as Python.

 Not in my experience.  The only way to get dynamic polymorphism (as
 opposed to the static polymorphism you get with templates) in C++ is
 to use inheritance, so when you have a class library in C++ you tend
 to get hierarchies with all kinds of abstract base classes so that
 types can be polymorphic.

I should have mentioned that I talked about the standard C++ library:
almost no inheritance[1] and just one namespace level.

Of course you can make a layered mess of C++ if you really try[2], but
it's not something the language encourages. IMHO.

 In Python you don't need
 abstract base classes so libraries tend to be flatter, only inheriting
 when behavior is shared.

 However it's not really that big of a difference.

Right, that's one level, and you can't avoid it if you really *do* need
inheritance.

/Jorgen

[1] Not counting the black sheep, iostreams.
[2] I have seen serious C++ code trying to mimic the Java bottomless
namespace pit of despair: com::company::division::product::subsystem::...

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why flat is better than nested?

2010-10-26 Thread Jorgen Grahn
On Mon, 2010-10-25, bruno.desthuilli...@gmail.com wrote:
 On 25 oct, 15:34, Alex Willmer a...@moreati.org.uk wrote:
 On Oct 25, 11:07 am, kj no.em...@please.post wrote:

  In The Zen of Python, one of the maxims is flat is better than
  nested?  Why?  Can anyone give me a concrete example that illustrates
  this point?

 I take this as a reference to the layout of the Python standard
 library and other packages i.e. it's better to have a module hierarchy
 of depth 1 or 2 and many top level items, than a depth of 5+ and only
 a few top level items.

 (snip)

 This also applies to inheritance hierarchies (which tend to be rather
 flat in Python compared to most mainstream OOPLs), as well as nested
 classes etc.

Which mainstream languages are you thinking of?  Java?  Because C++ is
as flat as Python.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: how to handle network failures

2010-10-10 Thread Jorgen Grahn
On Fri, 2010-10-08, harryos wrote:
 hi
 I  am trying to write a DataGrabber which reads some data from given
 url..I made DataGrabber as a Thread and want to wait for some interval
 of time in case there is a network failure that prevents read().
 I am not very sure how to implement this

 import threading
 import time
 import urllib

 class DataGrabber(threading.Thread):
     def __init__(self, url):
         threading.Thread.__init__(self)
         self.url = url

     def run(self):
         data = self.get_page_data()
         process_data(data)

     def get_page_data(self):
         try:
             f = urllib.urlopen(self.url)
             data = f.read(1024)
         except IOError:
             # wait for some time and try again
             time.sleep(120)
             data = self.get_page_data()
         return data

 Is this the way to  implement the part where the thread waits and
 reads the  data again? Will this handle network failures?Can somebody
 please help?

You are using TCP sockets. When you get an error on one of those, the
TCP connection is dead (except for a few special cases like EAGAIN,
EINTR).

But you also risk *not* getting told and hanging forever, or anyway
for far longer than your application is likely to want to wait. For
example if the peer host is suddenly disconnected from the network --
TCP will keep trying, in case a connection suddenly reappears much
later.

Try provoking that situation and see what happens.
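
One common way to keep the application from waiting forever is to put
a timeout on the connection and to bound the number of retries. The
sketch below assumes Python 3's urllib.request; fetch() and its
parameters are illustrative names, and the 'opener' and 'sleep'
arguments exist only so the retry logic can be exercised without a
network.

```python
import time
import urllib.request


def fetch(url, attempts=3, delay=120.0, timeout=30.0,
          opener=urllib.request.urlopen, sleep=time.sleep):
    """Read up to 1024 bytes from url, retrying on network errors.

    Every attempt opens a fresh connection, since a TCP connection
    that reported an error is normally dead. 'timeout' bounds how long
    a blocking connect or read may take.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(url, timeout=timeout) as f:
                return f.read(1024)
        except OSError as exc:  # URLError and socket timeouts are OSErrors
            last_error = exc
            if attempt + 1 < attempts:
                sleep(delay)
    raise last_error
```

A loop is used instead of recursion so that a long outage cannot grow
the call stack, and the last error is re-raised once the attempts are
used up so the caller can decide what to do.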

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: meta-class review

2010-10-06 Thread Jorgen Grahn
On Wed, 2010-10-06, Ethan Furman wrote:
 MRAB wrote:
 On 06/10/2010 00:17, Ethan Furman wrote:
   [snip]
   Any comments appreciated, especially ideas on how to better handle
   class- and staticmethods
  
 I think that's a bit of overkill. The problem lies in the printing
 part, but you're spreading the solution into the rest of the
 application! (A case of the tail wagging the dog, perhaps? :-))
 
 IMHO you should just use a simple function when printing:
 
 def dash_zero(x):
     return str(x) if x else '-'
 
 '%-25s: %7s' % ('DPV Failure', dash_zero(counts['D']))

 Yes, simple is better than complex, isn't it?  :)  And certainly a *lot* 
 less code!

 Thank you for pointing that out -- hopefully my blush of embarrassment 
 will fade by morning.

IMHO wrapping it in a class made much sense -- I just didn't see why
it exploded with more and more.  There are a few classes like that which
I frequently use:

a. statistics counters which are like ints, but can only be incremented
   and printed (or placed into SNMP messages, or whatever the system
   uses)

b. integers to be printed right-aligned in tables of a certain width,
   and as '-' or 'n/a' or '' when they are zero.  If they are so
   int-like that you can't do (a), then just build them on-the-fly
   when you're printing:
  
   f.write('%s: %s\n' % (name, MyFormatted(value)))

   Class MyFormatted here is very much like dash_zero above; it
   has no methods except __init__ and __str__.
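
A class like the MyFormatted described in (b) could be as small as
this sketch. The 'zero' parameter is an addition for illustration.

```python
class MyFormatted:
    """Wrap a number so that it prints as a placeholder when it is zero."""

    def __init__(self, value, zero="-"):
        self.value = value
        self.zero = zero

    def __str__(self):
        return str(self.value) if self.value else self.zero
```

Since '%s' formatting calls str() on its argument before applying the
field width, '%7s' % MyFormatted(count) right-aligns either the number
or the placeholder in a seven-character column.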

I mostly do this in C++; perhaps it makes more sense in a language with
static typing, overloading and templates.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Learning inheritance

2010-09-18 Thread Jorgen Grahn
On Sat, 2010-09-18, Niklasro wrote:
 Hi
 How can I make the visibility of a variable across many methods or
 files? To avoid repeating the same line eg url =
 os.environ['HTTP_HOST'] if os.environ.get('HTTP_HOST') else
 os.environ['SERVER_NAME'] I repeat for many methods. So declaring it
 to a super class and inheriting it is my plan. Do you agree or propose
 otherwise?

Inheritance is not the main tool for sharing code. Just make it a
function and place it in one of your modules (files):

def get_host():
   """Return the environment's $HTTP_HOST if
   it exists, otherwise $SERVER_NAME or (if that
   doesn't exist either) None.
   """
   ...
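
One way the elided body might be filled in is shown below. The
'environ' parameter is an assumption added here so the function can
be tried with a plain dict instead of the real environment.

```python
import os


def get_host(environ=os.environ):
    """Return $HTTP_HOST if set and non-empty, otherwise
    $SERVER_NAME, or None if that is missing as well.
    """
    return environ.get("HTTP_HOST") or environ.get("SERVER_NAME")
```

Any module can then do "from mymodule import get_host" (the module
name is made up) and call it, with no base class involved.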

Perhaps you are focusing too much on inheritance in general.
I personally almost never use it in Python -- it has much fewer
uses here than in statically typed languages.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: socket.error: [Errno 98] Address already in use

2010-09-18 Thread Jorgen Grahn
On Sat, 2010-09-18, Lawrence D'Oliveiro wrote:
 In message 
 2f830099-4264-47bc-98ee-31950412a...@q21g2000prm.googlegroups.com, cerr 
 wrote:

 I get a socket error [Errno 98] Address already in use when i try to
 open a socket that got closed before with close(). How come close()
 doesn't close the socket properly?

 The usual case this happens is you have a client connection open at the 
 time, that was not properly terminated. Then the TCP stack goes through a 
 holdoff period (2 minutes, I believe it is), to make absolutely sure all 
 packets destined for the old connection have completely disappeared off the 
 entire Internet, before it will let you open a socket on the same port 
 again.

That's why Stevens recommends that all TCP servers use the
SO_REUSEADDR socket option.  He also noted in his book:

This scenario is one of the most frequently asked
questions on Usenet.

Possibly I missed something in the question, but it's worth googling for.
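
For reference, setting the option in Python looks roughly like this.
The function name and defaults are illustrative. The option has to be
set before bind().

```python
import socket


def make_listener(port, host="127.0.0.1", backlog=5):
    """Create a listening TCP socket with SO_REUSEADDR set."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow bind() even if old connections on this port are in TIME_WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(backlog)
    return sock
```

With this in place a restarted server can normally bind to its port
right away instead of failing with "Address already in use" until the
old connections have timed out.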

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: The trouble with dynamic attributes.

2010-09-18 Thread Jorgen Grahn
On Fri, 2010-09-17, James Mills wrote:
 On Fri, Sep 17, 2010 at 11:33 AM, moerchendiser2k3
 googler.1.webmas...@spamgourmet.com wrote:
 I am really sorry, but what are you talking about ? Hmmm, ...I have
 problems to compile Python on SL, I did not ask anything about
 dynamic attribute. I don't get it...

 You are subscribed to the python mailing list.

 Check your subscription status with the link below.

JN's posting was technically a reply to JM's SL question -- a
References: header led back to it. That's why he was confused.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why are String Formatted Queries Considered So Magical?

2010-07-01 Thread Jorgen Grahn
On Wed, 2010-06-30, Steven D'Aprano wrote:
 On Wed, 30 Jun 2010 14:14:38 +, Jorgen Grahn wrote:

 On Tue, 2010-06-29, Stephen Hansen wrote:
 On 6/29/10 5:41 AM, Roy Smith wrote:
 Nobody nob...@nowhere.com wrote:

 And what about regular expressions?

 What about them? As the saying goes:

   Some people, when confronted with a problem, think "I know, I'll
use regular expressions." Now they have two problems.

 That's silly.  RE is a good tool.  Like all good tools, it is the
 right tool for some jobs and the wrong tool for others.

 There's nothing silly about it.

 It is an exaggeration though: but it does represent a good thing to
 keep in mind.
 
 Not an exaggeration: it's an absolute. It literally says that any time
 you try to solve a problem with a regex, (A) it won't solve the problem
 and (B) it will in itself become a problem.  And it doesn't tell you
 why: you're supposed to accept or reject this without thinking.

 It's a *two sentence* summary, not a reasoned and nuanced essay on the 
 pros and cons for REs.

Well, perhaps you cannot say anything useful about REs in general in
two sentences, and should use either more words, or not say anything
at all.

The way it was used in the quoted text above is one example of what I
mean. (Unless other details have been trimmed -- I can't check right
now.) If he meant to say REs aren't really a good solution for this
kind of problem, even though they look tempting, then he should have
said that.

 Sheesh, I can just imagine you as a child, arguing with your teacher on 
 being told not to run with scissors -- but teacher, there may be 
 circumstances where running with scissors is the right thing to do, you 
 are guilty of over-simplifying a complex topic into a single simplified 
 sound bite, instead of providing a detailed, rich heuristic for analysing 
 each and every situation in full before making the decision whether or 
 not to run with scissors.

When I was a child I expected that kind of argumentation from adults.
I expect something more as an adult.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: [OT] Re: Why Is Escaping Data Considered So Magical?

2010-07-01 Thread Jorgen Grahn
On Wed, 2010-06-30, Michael Torrie wrote:
 On 06/30/2010 03:00 AM, Jorgen Grahn wrote:
 On Wed, 2010-06-30, Michael Torrie wrote:
 On 06/29/2010 10:17 PM, Michael Torrie wrote:
 On 06/29/2010 10:05 PM, Michael Torrie wrote:
 #include <stdio.h>

 int main(int argc, char ** argv)
 {
   char *buf = malloc(512 * sizeof(char));
   const int a = 2, b = 3;
   snprintf(buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
^^
 Make that 512*sizeof(buf)

 Sigh.  Try again.  How about 512 * sizeof(char) ?  Still doesn't make
 a difference.  The code still crashes because the buf is incorrect.
 
 I haven't tried to understand the rest ... but never write
 'sizeof(char)' unless you might change the type later. 'sizeof(char)'
 is by definition 1 -- even on odd-ball architectures where a char is
 e.g. 16 bits.

 You're right.  I normally don't use sizeof(char).  This is obviously a
 contrived example; I just wanted to make the example such that there's
 no way the original poster could argue that the crash is caused by
 something other than buf.

 Then again, it's always a bad idea in C to make assumptions about
 anything.

There are some things you cannot assume, others which few fellow
programmers can care to memorize, and others which you often can get
away with (like assuming an int is more than 16 bits, when your code
is tied to a modern Unix anyway).

But sizeof(char) is always 1.

 If you're on Windows and want to use the unicode versions of
 everything, you'd need to do sizeof().  So using it here would remind
 you that when you move to the 16-bit Microsoft unicode versions of
 snprintf need to change the sizeof(char) lines as well to sizeof(wchar_t).

Yes -- see "unless you might change the type later" above.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: [OT] Re: Why Is Escaping Data Considered So Magical?

2010-06-30 Thread Jorgen Grahn
On Wed, 2010-06-30, Michael Torrie wrote:
 On 06/29/2010 10:17 PM, Michael Torrie wrote:
 On 06/29/2010 10:05 PM, Michael Torrie wrote:
 #include <stdio.h>

 int main(int argc, char ** argv)
 {
 char *buf = malloc(512 * sizeof(char));
 const int a = 2, b = 3;
 snprintf(buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
^^
 Make that 512*sizeof(buf)

 Sigh.  Try again.  How about 512 * sizeof(char) ?  Still doesn't make
 a difference.  The code still crashes because the buf is incorrect.

I haven't tried to understand the rest ... but never write
'sizeof(char)' unless you might change the type later. 'sizeof(char)'
is by definition 1 -- even on odd-ball architectures where a char is
e.g. 16 bits.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Ancient C string conventions (was Re: Why Is Escaping Data Considered So Magical?)

2010-06-30 Thread Jorgen Grahn
On Wed, 2010-06-30, Carl Banks wrote:
 On Jun 28, 2:44 am, Gregory Ewing greg.ew...@canterbury.ac.nz wrote:
 Carl Banks wrote:
  Indeed, strncpy does not copy that final NUL if it's at or beyond the
  nth element.  Probably the most mind-bogglingly stupid thing about the
  standard C library, which has lots of mind-boggling stupidity.

 I don't think it was as stupid as that back when C was
 designed. Every byte of memory was precious in those days,
 and if you had, say, 10 bytes allocated for a string, you
 wanted to be able to use all 10 of them for useful data.

 So the convention was that a NUL byte was used to mark
 the end of the string *if it didn't fill all the available
 space*.

 I can't think of any function in the standard library that observes
 that convention,

Me neither, except strncpy(), according to above.

 which inclines me to disbelieve this convention ever
 really existed.  If it did, there would be functions to support it.

Maybe others existed, but got killed off early. That would make
strncpy() a living fossil, like the Coelacanth ...

 For that matter, I'm not really inclined to believe bytes were *that*
 precious in those days.

It's somewhat believable. If I handled thousands of student names in a
big C array char[30][], I would resent the fact that 1/30 of the
memory was wasted on NUL bytes.  I'm sure plenty of people have done what
Gregory suggests ... but it's not clear that strncpy() was designed to
support those people.

I suppose it's all lost in history.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why are String Formatted Queries Considered So Magical?

2010-06-30 Thread Jorgen Grahn
On Tue, 2010-06-29, Stephen Hansen wrote:
 On 6/29/10 5:41 AM, Roy Smith wrote:
 Nobodynob...@nowhere.com  wrote:

 And what about regular expressions?

 What about them? As the saying goes:

 Some people, when confronted with a problem, think
 "I know, I'll use regular expressions."
 Now they have two problems.

 That's silly.  RE is a good tool.  Like all good tools, it is the right
 tool for some jobs and the wrong tool for others.

 There's nothing silly about it.

 It is an exaggeration though: but it does represent a good thing to keep 
 in mind.

Not an exaggeration: it's an absolute. It literally says that any time
you try to solve a problem with a regex, (A) it won't solve the problem
and (B) it will in itself become a problem.  And it doesn't tell you
why: you're supposed to accept or reject this without thinking.

How can that be a good thing to keep in mind?

I wouldn't normally be annoyed by the quote, but it is thrown around a
lot in various places, not just here.

 Yes, re is a tool -- and a useful one at that. But it's also a tool which 
 /seems/ like an omnitool capable of tackling everything.

That's more like my attitude towards them.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Build unordered list in HTML from a python list

2010-06-30 Thread Jorgen Grahn
On Wed, 2010-06-30, Kushal Kumaran wrote:
 On Wed, Jun 30, 2010 at 2:04 PM, Nico Grubert nicogrub...@yahoo.de wrote:
 Dear list members

 I have this python list that represents a sitemap:

 tree = [{'indent': 1, 'title':'Item 1', 'hassubfolder':False},
        {'indent': 1, 'title':'Item 2', 'hassubfolder':False},
        {'indent': 1, 'title':'Folder 1', 'hassubfolder':True},
        {'indent': 2, 'title':'Sub Item 1.1', 'hassubfolder':False},
        {'indent': 2, 'title':'Sub Item 1.2', 'hassubfolder':False},
        {'indent': 1, 'title':'Item 3', 'hassubfolder':False},
        {'indent': 1, 'title':'Folder 2', 'hassubfolder':True},
        {'indent': 2, 'title':'Sub Item 2.1', 'hassubfolder':False},
        {'indent': 2, 'title':'Folder 2.1', 'hassubfolder':True},
        {'indent': 3, 'title':'Sub Item 2.1.1', 'hassubfolder':False},
        {'indent': 3, 'title':'Sub Item 2.1.2', 'hassubfolder':False},
       ]

 From that list I want to create the following HTML code:

 <ul id="tree">
  <li>Item 1</li>
  <li>Item 2</li>
  <li>Folder 1
    <ul>
      <li>Sub Item 1.1</li>
      <li>Sub Item 1.2</li>
    </ul>
  </li>
  <li>Item 3</li>
  <li>Folder 2
    <ul>
      <li>Sub Item 2.1</li>
      <li>Folder 2.1
        <ul>
          <li>Sub Item 2.1.1</li>
          <li>Sub Item 2.1.2</li>
        </ul>
      </li>
    </ul>
  </li>
 </ul>

 If an item of the list has 'True' for the 'hassubfolder' key then a new
 <ul><li> must be created instead of </li> after its title. (See the
 "Folder 2" node in the HTML code above.)

 My problem is: How do I keep track of the closing tags while iterating over
 the python list?


 Use a stack?

 Whenever you start a new list, push the corresponding closing tag onto
 a stack.  Whenever your indent level decreases, pop the stack and
 write out the closing tag you get.

 It's straightforward to use a python list as a stack.

Or even simpler.

You keep track of your current indentation level (0, 1, ...). If
level==1 and you come to an indent: 2, you generate a <ul> and
increase level to 2.  Similarly for going left. When you reach the end
you add </ul>s to go back up to level 1 (or maybe you want to call it
level 0 instead).

That probably assumes you use HTML (like you say) rather than XHTML
(which your example hints at).  In HTML you don't need to supply the
</li>s.
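
The level-tracking idea above can be sketched in Python roughly as
follows. This is only an illustration, not code from the thread: the
function name render_tree is made up, the sketch emits the closing
</li> tags anyway (so the result is also acceptable as XHTML), and it
rejects input that skips a level rather than guessing what was meant.

```python
from html import escape


def render_tree(items):
    """Render a flat list of {'indent', 'title'} dicts as nested <ul> markup.

    Hypothetical helper: only the 'indent' and 'title' keys are used;
    'hassubfolder' is not needed because the next item's indent tells us
    whether a sublist follows.
    """
    out = []
    level = 0  # number of <ul> elements currently open
    for item in items:
        indent = item['indent']
        if indent < 1 or indent > level + 1:
            # Going right by more than one step would produce <ul><ul>.
            raise ValueError("unexpected indent %d at %r"
                             % (indent, item['title']))
        if indent > level:
            # Going right: open a sublist inside the still-open <li>.
            out.append('<ul>')
            level = indent
        else:
            # Same level or going left: close the previous item first,
            # then every sublist (and its parent item) that we leave.
            out.append('</li>')
            while level > indent:
                out.append('</ul></li>')
                level -= 1
        out.append('<li>' + escape(item['title']))
    # End of input: close whatever is still open.
    while level > 0:
        out.append('</li></ul>')
        level -= 1
    return ''.join(out)
```

Calling render_tree(tree) on the list from the original question
should give the nested structure shown there, without the id
attribute and without the whitespace.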

I did all this in Perl earlier today, but in my case it was unsuitable
because I skipped levels (had a list of HTML headings, wanted a
table-of-contents, but sometimes a <h1> was followed by <h3> with no
<h2> in between). I'd get invalid stuff like <ul><ul>.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Python as a scripting language. Alternative to bash script?

2010-06-29 Thread Jorgen Grahn
On Mon, 2010-06-28, John Nagle wrote:
 On 6/28/2010 7:58 AM, Benjamin Kaplan wrote:
 How does a program return anything other than an exit code?

 Ah, yes, the second biggest design mistake in UNIX.

 Programs have argv and argc, plus environment variables,
 going in.  So, going in, there are essentially subroutine parameters.
 But all that comes back is an exit code. They should have had
 something similar coming back, with arguments to exit() returning
 the results.  Then the many small intercommunicating programs
 concept would have worked much better.

Like others said, you have standard output. sys.stdout for data,
sys.stderr for human-readable errors and warnings, and the exit code
for machine-readable errors.
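
As a rough illustration of those three channels seen from the calling
side, here is a small Python sketch using the standard subprocess
module. The child program is a stand-in written for this example (it
is not ImageMagick or anything else from the thread), and the name
run_child is made up.

```python
import subprocess
import sys

# Stand-in child program: data on stdout, a human-readable warning on
# stderr, and a non-zero exit status for the machine to inspect.
CHILD = r"""
import sys
print("640 480")
print("warning: guessed the colour space", file=sys.stderr)
sys.exit(3)
"""


def run_child():
    """Run the stand-in child and return (width, height, warning, status)."""
    proc = subprocess.run([sys.executable, "-c", CHILD],
                          capture_output=True, text=True)
    # The "return parameters" are simply parsed from standard output.
    width, height = (int(n) for n in proc.stdout.split())
    return width, height, proc.stderr.strip(), proc.returncode
```

The same pattern should apply to any external program that prints its
results: parse stdout, show stderr to a human, branch on the exit
status.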

 C was like that once.  In the 1970s, all you could return was
 an int or a float.  But that got fixed.

Huh? A C program today still cannot return a float, and not even a full int.
0 and 1 work, small integers up to 255 are likely to work, but beyond
that common systems (Unix) will chop off the high bits.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Python as a scripting language. Alternative to bash script?

2010-06-29 Thread Jorgen Grahn
On Mon, 2010-06-28, Dave Pawson wrote:
 I've a fairly long bash script and I'm wondering
 how easy it would be to port to Python.

 Main queries are:
 Ease of calling out to bash to use something like imageMagick or Java?
 Ease of grabbing return parameters? E.g. convert can return both
 height and width of an image. Can this be returned to the Python program?
 Can Python access the exit status of a program?

 I'd prefer the advantages of using Python, just wondering if I got
 so far with the port then found it wouldn't do something?

As others remarked, bash has advantages, too.

Personally, if my main job is chaining other non-trivial programs into
pipelines and sequences, I don't hesitate to use Bourne shell or bash.
Perl is for heavier text processing, and Python for problems with more
elaborate data types.

Note the distinction Bourne shell/bash. If you can get away with it,
use bash for medium/large-sized scripts. Many people try to avoid
bash-specific syntax, but they miss out on lots of things that make
programs maintainable, like local variables.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why Is Escaping Data Considered So Magical?

2010-06-28 Thread Jorgen Grahn
On Mon, 2010-06-28, Kushal Kumaran wrote:
 On Mon, Jun 28, 2010 at 2:00 AM, Jorgen Grahn grahn+n...@snipabacken.se 
 wrote:
 On Sun, 2010-06-27, Lawrence D'Oliveiro wrote:
 In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:

 I recently fixed a bug in some production code.  The programmer was
 careful to use snprintf() to avoid buffer overflows.  The only problem
 is, he wrote something along the lines of:

 snprintf(buf, strlen(foo), foo);

 A long while ago I came up with this macro:

     #define Descr(v) v, sizeof v

 making the correct version of the above become

     snprintf(Descr(buf), foo);

 This is off-topic, but I believe snprintf() in C can *never* safely be
 the only thing you do to the buffer: you also have to NUL-terminate it
 manually in some corner cases. See the documentation.

 snprintf goes to great lengths to be safe, in fact.  You might be
 thinking of strncpy.

Yes, it was indeed strncpy I was thinking of. Thanks.

But actually, the snprintf(3) man page I have is not 100% clear on
this issue, so last time I used it, I added a manual NUL-termination
plus a comment saying I wasn't sure it was needed.  I normally use C++
or Python, so I am a bit rusty on these things.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Jorgen Grahn
On Sat, 2010-06-26, Lawrence D'Oliveiro wrote:
 In message slrni297ec.1m5.grahn+n...@frailea.sa.invalid, Jorgen Grahn 
 wrote:

 I thought it was well-known that the solution is *not* to try to
 sanitize the input -- it's to switch to an interface which doesn't
 involve generating an intermediate executable.  In the Python example,
 that would be something like os.popen2(['zcat', '-f', '--', untrusted]).

 That's what I mean. Why do people consider input sanitization so hard?

I'm not sure you understood me correctly, because I advocate
*not* doing input sanitization. Hard or not -- I don't want to know,
because I don't want to do it.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Jorgen Grahn
On Fri, 2010-06-25, Nobody wrote:
 On Fri, 25 Jun 2010 12:15:08 +, Jorgen Grahn wrote:

 I don't do SQL and I don't even understand the terminology properly
 ... but the discussion around it bothers me.
 
 Do those people really do this?

 Yes. And then some.

 Among web developers, the median level of programming knowledge amounts to
 the first 3 chapters of "Learn PHP in 7 Days".

 It doesn't help that the guy who wrote PHP itself wasn't much better.

 - accept untrusted user data
 - try to sanitize the data (escaping certain characters etc)
 - turn this data into executable code (SQL)
 - executing it
 
 Like the example in the article
 
   SELECT * FROM hotels WHERE city = 'untrusted';

 Yep. Search the BugTraq archives for "SQL injection". And most of those
 are for widely-deployed middleware; the zillions of bespoke site-specific
 scripts are likely to be worse.

 Also: http://xkcd.com/327/

Priceless!

As is often the case with xkcd, I learned something, too: there's a
widely used web application/portal/database thingy which silently
strips some characters from my input.  I thought it had to do with
HTML, but it's in fact exactly the sequences ', ')', ';' and '--'
from the comic, and a few more like '' and undoubtedly some I haven't
noticed yet.

That is surely input sanitization gone horribly wrong: I enter "6--8
slices of bread", but the system stores "68 slices of bread".

 I thought it was well-known that the solution is *not* to try to
 sanitize the input

 Well known by anyone with a reasonable understanding of the principles of
 programming, but somewhat less well known by the other 98% of web
 developers.

 Am I missing something?

 There's a world of difference between a skilled chef and the people
 flipping burgers for a minimum wage. And between a chartered civil
 engineer and the people laying the asphalt. And between what you
 probably consider a programmer and the people doing most web development.

I don't know them, so I wouldn't know ... What I would *expect* is
that safe tools are provided for them, not just workarounds so they
can keep using the unsafe tools. That's what Python did, with its
multitude of alternatives to os.system and os.popen.
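
On the SQL side, the corresponding "safe tool" is usually a
parameterized query. A minimal sketch with Python's standard sqlite3
module follows; the hotels table and the helper names are assumptions
made up for this example, loosely modelled on the query in the
article.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical hotels table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hotels (name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO hotels VALUES (?, ?)",
        [("Grand", "Lund"), ("Odd", "x'; DROP TABLE hotels; --")],
    )
    return conn


def find_hotels(conn, city):
    """Look up hotels by city without building SQL text from the input."""
    # The ? placeholder hands `city` to the database as data; it is
    # never spliced into the statement, so no escaping is involved.
    cur = conn.execute("SELECT name FROM hotels WHERE city = ?", (city,))
    return [row[0] for row in cur.fetchall()]
```

With this interface nothing has to be stripped from the input: a city
name containing quotes, semicolons or "--" is stored and matched as
ordinary text.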

Anyway, thanks. It's always nice to be able to map foreign terminology
like "SQL injection" to something you already know.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Jorgen Grahn
On Sun, 2010-06-27, Lawrence D'Oliveiro wrote:
 In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:

 I recently fixed a bug in some production code.  The programmer was
 careful to use snprintf() to avoid buffer overflows.  The only problem
 is, he wrote something along the lines of:
 
 snprintf(buf, strlen(foo), foo);

 A long while ago I came up with this macro:

 #define Descr(v) v, sizeof v

 making the correct version of the above become

 snprintf(Descr(buf), foo);

This is off-topic, but I believe snprintf() in C can *never* safely be
the only thing you do to the buffer: you also have to NUL-terminate it
manually in some corner cases. See the documentation.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Jorgen Grahn
On Fri, 2010-06-25, Lawrence D'Oliveiro wrote:
 Just been reading this article
 http://www.theregister.co.uk/2010/06/23/xxs_sql_injection_attacks_testing_remedy/
 which says that a lot of security holes are arising these days because
 everybody is concentrating on unit testing of their own particular
 components, with less attention being devoted to overall integration
 testing.

I don't do SQL and I don't even understand the terminology properly
... but the discussion around it bothers me.

Do those people really do this?
- accept untrusted user data
- try to sanitize the data (escaping certain characters etc)
- turn this data into executable code (SQL)
- executing it

Like the example in the article

  SELECT * FROM hotels WHERE city = 'untrusted';

If so, it's isomorphic with doing os.popen('zcat -f %s' % untrusted)
in Python (at least on Unix, where 'zcat ...' is executed as a shell
script).

I thought it was well-known that the solution is *not* to try to
sanitize the input -- it's to switch to an interface which doesn't
involve generating an intermediate executable.  In the Python example,
that would be something like os.popen2(['zcat', '-f', '--', untrusted]).
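
In current Python the os.popen2 family is gone (it was removed in
Python 3) and the subprocess module fills that role. A hedged sketch
of the same idea: the argument vector is passed as a list, so no shell
ever parses the untrusted string. The Python interpreter stands in for
zcat here only so the example runs anywhere, and the function name is
made up.

```python
import subprocess
import sys


def echo_argument(untrusted):
    """Hand `untrusted` to a child process as a single argv entry."""
    # The child just writes its first argument back, to show that the
    # string arrives byte for byte, however hostile it looks.
    child = "import sys; sys.stdout.write(sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child, untrusted],  # a list: no shell involved
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

For the real case one would pass something like
['zcat', '-f', '--', untrusted] as the list; the '--' keeps a file
name starting with a dash from being read as an option.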

Am I missing something?  If not, I can go back to sleep -- and keep
avoiding SQL and web programming like the plague until that community
has entered the 21st century.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Pythonic Idiom For Searching An Include Path

2010-06-25 Thread Jorgen Grahn
On Thu, 2010-06-24, Nobody wrote:
 On Wed, 23 Jun 2010 17:27:16 -0500, Tim Daneliuk wrote:

 Given a program 'foo' that takes a command line argument '-I
 includefile', I want to be able to look for 'includefile' in a path
 specified in an environment variable, 'FOOPATH'.
 
 I'd like a semantic that says:
 
   If 'includefile' contains one or more path separator characters,
ignore 'FOOPATH'. If it contains no path separators, look for it in
the paths specified by 'FOOPATH', beginning with the leftmost path
first.
 
 Is there a standard Pythonic idiom for doing this or do I need to cook
 up my own.

 There isn't an idiom.

 There are a surprising number of choices for such a simple task, e.g.
 whether the search path is used for relative paths containing a separator,
 whether you stop at the first file which exists or the first file which
 meets other criteria (e.g. suitable permissions), whether default
 locations come first or last, what happens if a default location is
 included in the search path, etc.

Another favorite is whether relative paths are relative to your
current directory, or to the location of whatever file this is to be
included /into/.
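
One possible reading of the semantics asked for, as a Python sketch.
It makes several of the choices listed above explicit and arbitrary:
it stops at the first regular file that exists, ignores permissions,
and takes a base_dir parameter for the "relative to what?" question.
The names find_include and base_dir are made up for the example;
FOOPATH is the variable from the original question.

```python
import os


def find_include(name, search_path, base_dir=os.curdir):
    """Return the path of the first existing file for `name`, or None.

    search_path is a string such as the value of FOOPATH, with entries
    separated by os.pathsep.
    """
    seps = [os.sep] + ([os.altsep] if os.altsep else [])
    if any(sep in name for sep in seps):
        # The name contains a path separator: ignore the search path.
        # A relative name is resolved against base_dir, which could be
        # the current directory or the including file's directory.
        candidates = [os.path.join(base_dir, name)]
    else:
        # Leftmost directory wins; empty entries are skipped.
        candidates = [os.path.join(d, name)
                      for d in search_path.split(os.pathsep) if d]
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None
```

A call might look like
find_include('defs.inc', os.environ.get('FOOPATH', '')).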

For an example where it mattered (to C compilers), google for the
paper "Recursive Make Considered Harmful". It took compiler writers
decades to realize what the best choice was there.

(By the way, -I commonly means "search this directory for include
files" rather than "include this file". You may want to avoid
confusing people by choosing another letter, like -i.)

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Syntax problem - cannot solve it by myself

2010-06-09 Thread Jorgen Grahn
On Wed, 2010-06-09, Deadly Dirk wrote:
 On Tue, 08 Jun 2010 18:52:44 -0700, alex23 wrote:


 Unless you have a clear need for 3rd party libraries that currently
 don't have 3.x versions, starting with Python 3 isn't a bad idea.

But see below.

 From what I see, most of the people are still using Python 2.x. My reason 
 for learning Python is the fact that my CTO decided that the new company 
 standard for scripting languages will be Python.

Not a bad choice.

 I've been using Perl for 
 15 years and it was completely adequate but, apparently, Perl is no 
 longer in.

I hope your CTO still lets you use Perl for the things Perl does
better (like quickly and elegantly parse huge text files, and various
one-liners). For many other tasks, I think you will quickly find
Python superior.

 I am afraid that Python3 is like Perl 6, the one with Parrot: 
 everybody is reading articles about it but nobody is using it. 

It seemed like that for a year or two (when people regularly called it
Python 3000). Now it's in use -- although perhaps not so much as you
would think when you read comp.lang.python.

I am still perfectly happy with Python 2.4 and 2.5. These are the
versions which are installed by default in modern, recent Linux
distributions.  I bet it will be years before Python 3 replaces them.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Python + vim + spaces vs tab

2010-06-09 Thread Jorgen Grahn
On Mon, 2010-06-07, Neil Cerutti wrote:
 On 2010-06-07, Jean-Michel Pichavant jeanmic...@sequans.com wrote:
 Hello,

 Does anyone know a way to configure vim so it automatically
 selects the correct expandtab value depending on the current
 buffer's 'way of doing'? I need to edit different files, some
 are using spaces, others tabs. Those belong to different
 projects, and changing all spaces to tabs is not an option for
 me.

 I can't make vim automatically comply with the current buffer's
 coding style; does anyone know if it is possible?

 :h filetypes will get you started on the right path. It'll be up
 to you to program the recognition logic. Do you have a heuristic
 in mind?

 You will be better off converting tabbed files to be tabless,
 which is pretty easy in vim.

But as he wrote, that is not an option.  And I can believe that -- if
you are many programmers, working in parallel on some fairly big and
mature project, the *last* thing you want is someone coming in and
reindenting everything.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: py2exe and pydocs. Downloads?

2010-01-21 Thread Jorgen Grahn
On Thu, 2010-01-21, Gib Bogle wrote:
 Gabriel Genellina wrote:

 You found a bug. Looks like it depends on the environment, or what 
 packages are installed, or something like that, because it worked on my 
 other PC but not here.
 Please report it at http://bugs.python.org so it doesn't get forgotten.
 

 Done

And for reference, it's http://bugs.python.org/issue7749, pydoc error.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: How to execute a script from another script and other script does not do busy wait.

2010-01-08 Thread Jorgen Grahn
On Thu, 2010-01-07, danmcle...@yahoo.com wrote:
 On Jan 7, 9:18 am, Jorgen Grahn grahn+n...@snipabacken.se wrote:
 On Thu, 2010-01-07, Rajat wrote:
  I want to run a python script( aka script2) from another python script
  (aka script1). While script1 executes script2 it waits for script2 to
  complete and in doing so it also does some other useful work.(does not
  do a busy wait).

  My intention is to update a third party through script1 that script2
  is going to take longer.

 I do not understand that sentence.
 What are you trying to do, more exactly?  The best solution can be
 threads, os.popen, os.system or something different -- depending on
 the details of what you want to do.

...

 I personally use subprocess. Once you launch via subprocess you can
 wait or not.

 p = subprocess.Popen(...)
 p.wait() #or not.

 See subprocess docs.

Yes, that was included in "or something different" above. I have never
used it myself, since I've needed to be compatible with Python < 2.4.
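
For reference, the "wait or not" pattern with subprocess could look
roughly like the sketch below: start the second script, then do other
work between polls instead of blocking or spinning. What the "other
work" should be is exactly the open question, so it is left as a
callback; run_with_progress and on_wait are names invented for this
example.

```python
import subprocess
import sys
import time


def run_with_progress(cmd, on_wait, interval=0.1):
    """Start cmd, call on_wait() while it runs, and return its exit code."""
    proc = subprocess.Popen(cmd)
    while proc.poll() is None:   # None means the child is still running
        on_wait()                # e.g. tell the third party to be patient
        time.sleep(interval)     # sleep between polls: no busy wait
    return proc.returncode
```

A call might be run_with_progress([sys.executable, 'script2.py'],
notify), where notify is whatever function updates the third party.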

Still, we need to know what he tries to do.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Do I have to use threads?

2010-01-08 Thread Jorgen Grahn
On Wed, 2010-01-06, Gary Herron wrote:
 aditya shukla wrote:
 Hello people,

 I have 5 directories corresponding to 5 different URLs. I want to 
 download images from those URLs and place them in the respective 
 directories. I have to extract the contents and download them 
 simultaneously. I can extract the contents and do them one by one. My 
 question is: for doing it simultaneously, do I have to use threads?

 Please point me in the right direction.


 Thanks

 Aditya

 You've been given some bad advice here.

 First -- threads are lighter-weight than processes, so threads are 
 probably *more* efficient.  However, with only five threads/processes, 
 the difference is probably not noticeable.  (If the prejudice against 
 threads comes from concerns over the GIL -- that also is a misplaced 
 concern in this instance.  Since you only have one network connection, you 
 will receive only one packet at a time, so only one thread will be 
 active at a time.   If the extraction process uses a significant enough 
 amount of CPU time

I wonder what that extraction would be, by the way.  Unless you ask
for compression of the HTTP data, the images come as-is on the TCP
stream.

 so that the extractions are all running at the same 
 time *AND* if you are running on a machine with separate CPU/cores *AND* 
 you would like the extractions to be running truly in parallel on those 
 separate cores,  *THEN*, and only then, will processes be more efficient 
 than threads.)

I can't remember what the bad advice was, but here processes versus
threads clearly doesn't matter performance-wise.  I generally
recommend processes, because how they work is well-known and they're
not as vulnerable to weird synchronization bugs as threads.

 Second, running 5 wgets is equivalent to 5 processes not 5 threads.

 And third -- you don't have to use either threads *or* processes.  There 
 is another possibility which is much more light-weight:  asynchronous 
 I/O,  available through the low level select module, or more usefully 
 via the higher-level asyncore module.

Yeah, that would be my first choice too for a problem which isn't
clearly CPU-bound.  Or my second choice -- the first would be calling
on a utility like wget(1).

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Scripting (was Re: Python books, literature etc)

2010-01-08 Thread Jorgen Grahn
On Thu, 2010-01-07, Peter wrote:

 [...] depending on your 
 application domain, I liked:

 1) Hans Petter Langtangen: Python Scripting for Computational Science
 A truly excellent book, not only with respect to Python scripting, but 
 also on how to avoid paying license fees by using open-source tools as 
 an engineer (plotting, graphing, GUI dev etc.). Very good, practical 
 introduction to Python with careful and non-trivial examples and exercises.

Sounds good.

Regarding the book's title: is it just me, or are Python programmers
in general put off when people call it "scripting"?

I won't attempt a strict definition of the term "scripting language",
but it seems like non-programmers use it to mean "less scary than what
you might think of as programming", while programmers interpret it as
"not useful as a general-purpose language".

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Recommended new way for config files

2010-01-08 Thread Jorgen Grahn
On Thu, 2010-01-07, Jean-Michel Pichavant wrote:
 Peter wrote:
 Hi
 There seems to be several strategies to enhance the old ini-style 
 config files with real python code, for example:
...
 Is there a strategy that should be preferred for new projects?

...
 The .ini file is the simplest solution, at least from the user's point of 
 view: no need to learn any Python syntax.

Yeah. Use whatever your users expect, and deal with it. The language
you've happened to implement your stuff in should normally be
irrelevant to the users.

I wouldn't use .ini-style, but that's because I'm a Unix guy and they
remind me of brief and painful experiments with Windows 3.1.

Just remember to include support for commented-out lines.
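
If .ini-style is what the users expect, the standard configparser
module already covers the commented-out lines: by default it treats
lines starting with '#' or ';' as comments. A small sketch, where the
section and option names are invented for the example:

```python
import configparser

# Made-up configuration text; the last two settings are commented out.
SAMPLE = """\
[server]
host = example.org
# port = 8080
; timeout = 30
"""


def load_settings(text):
    """Parse ini-style text and return the [server] section as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["server"])
```

Only the host option survives parsing here; the two commented-out
lines are ignored, so users can disable a setting without deleting it.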

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: How to execute a script from another script and other script does not do busy wait.

2010-01-07 Thread Jorgen Grahn
On Thu, 2010-01-07, Rajat wrote:
 I want to run a python script( aka script2) from another python script
 (aka script1). While script1 executes script2 it waits for script2 to
 complete and in doing so it also does some other useful work.(does not
 do a busy wait).

 My intention is to update a third party through script1 that script2
 is going to take longer.

I do not understand that sentence.
What are you trying to do, more exactly?  The best solution can be
threads, os.popen, os.system or something different -- depending on
the details of what you want to do.

 Please suggest how should I go about implementing it.

 I'm currently executing it as:

 from script2 import main
 ret_code  = main()
 return ret_code

 which surely is not going to achieve me what I intend.


 Thanks,
 Rajat.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Do I have to use threads?

2010-01-07 Thread Jorgen Grahn
On Thu, 2010-01-07, Marco Salden wrote:
 On Jan 6, 5:36 am, Philip Semanchuk phi...@semanchuk.com wrote:
 On Jan 5, 2010, at 11:26 PM, aditya shukla wrote:

  Hello people,

  I have 5 directories corresponding to 5 different URLs. I want to
  download images from those URLs and place them in the respective
  directories. I have to extract the contents and download them
  simultaneously. I can extract the contents and do them one by one.
  My question is: for doing it simultaneously, do I have to use threads?

 No. You could spawn 5 copies of wget (or curl or a Python program that  
 you've written). Whether or not that will perform better or be easier  
 to code, debug and maintain depends on the other aspects of your  
 program(s).

 bye
 Philip

 Yep, the easier and more straightforward the approach, the better:
 threads are always (programmers')-error-prone by nature.
 But my question would be: does it REALLY need to be simultaneous?
 The CPU/OS only has more overhead doing this in parallel with
 processes. Measuring sequential processing and then trying to
 optimize (e.g. for user response or whatever) would be my preferred way
 to go. Less=More.

Normally when you do HTTP in parallel over several TCP sockets, it
has nothing to do with CPU overhead. You just don't want every GET to
be delayed just because the server(s) are lazy responding to the first
few ones; or you might want to read the text of a web page and the CSS
before a few huge pictures have been downloaded.

His "I have to [do them] simultaneously" makes me want to ask "Why?".

If he's expecting *many* pictures, I doubt that the parallel download
will buy him much.  Reusing the same TCP socket for all of them is
more likely to help, especially if the pictures aren't tiny. One
long-lived TCP connection is much more efficient than dozens of
short-lived ones.

Personally, I'd popen() wget and let it do the job for me.
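
A rough sketch of that idea in present-day Python, assuming wget is on
the PATH. The helper names wget_command and run_all are made up for
illustration, and subprocess.Popen stands in for os.popen() because
nothing needs to be read back from the children:

```python
import subprocess


def wget_command(url, directory):
    """Build a wget command line that saves url into directory."""
    # -q: quiet, -P: directory prefix for the downloaded file
    return ["wget", "-q", "-P", directory, url]


def run_all(commands):
    """Start every command at once, then wait for all of them.

    Returns the exit statuses in the same order as the commands.
    """
    children = [subprocess.Popen(argv) for argv in commands]
    return [child.wait() for child in children]
```

Something like run_all([wget_command(u, d) for u, d in pairs]) would
then download everything in parallel without a single thread.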

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Python books, literature etc

2010-01-07 Thread Jorgen Grahn
On Thu, 2010-01-07, Stuart Murray-Smith wrote:
...
 [...] ESR's guide to
 smart questions [1] helps set the pace of list culture.

It's good, if you can ignore the "These People Are Very Important
Hacker Gods, Not Mere Mortals" subtext.

...
 Anyways, to rephrase, could someone kindly mention any of their
 preferred Python books, websites, tutorials etc to help me get to an
 intermediate/advanced level? Something that would help me add
 functionality to Ubiquity, say.

I may be alone in this, but Alex Martelli's book (Python in a
nutshell?) on Python 2.2 and a bit of 2.3, plus the official
documentation, plus this group, is all I think I need.
But I had a lot of Unix, C, C++ and Perl experience to help me.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: How to schedule system calls with Python

2009-10-23 Thread Jorgen Grahn
On Thu, 2009-10-22, Al Fansome wrote:


 Jorgen Grahn wrote:
 On Fri, 2009-10-16, Jeremy wrote:
 On Oct 15, 6:32 pm, MRAB pyt...@mrabarnett.plus.com wrote:
 TerryP wrote:
 On Oct 15, 7:42 pm, Jeremy jlcon...@gmail.com wrote:
 I need to write a Python script that will call some command line
 programs (using os.system).  I will have many such calls, but I want
 to control when the calls are made.  I won't know in advance how long
 each program will run and I don't want to have 10 programs running
 when I only have one or two processors.  I want to run one at a time
 (or two if I have two processors), wait until it's finished, and then
 call the next one.
 ...
 You could use multithreading: put the commands into a queue; start the
 same number of worker threads as there are processors; each worker
 thread repeatedly gets a command from the queue and then runs it using
 os.system(); if a worker thread finds that the queue is empty when it
 tries to get a command, then it terminates.
 Yes, this is it.  If I have a list of strings which are system
 commands, this seems like a more intelligent way of approaching it.
 My previous response will work, but won't take advantage of multiple
 cpus/cores in a machine without some manual manipulation.  I like this
 idea.
 
 Note that you do not *need* multithreading for this. To me it
 seems silly to have N threads sitting just waiting for one process
 each to die -- those threads contribute nothing to the multiprocessing
 you want.
...

 Another way to approach this, if you do want to use threads, is to use a 
 counting semaphore. Set it to the maximum number of threads you want to 
 run at any one time. Then loop starting up worker threads in the main 
 thread. acquire() the semaphore before starting the next worker thread; 
 when the semaphore reaches 0, your main thread will block. Each worker 
 thread should then release() the semaphore when it  exits; this will 
 allow the main thread to move on to creating the next worker thread.

 This doesn't address the assignment of threads to CPU cores, but I have 
 used this technique many times, and it is simple and fairly easy to 
 implement. [---]

But do you *want* to handle the CPUs manually, anyway? Your program
typically has no idea what other processes are running on the machine
or how important they are.

(Of course, in this case the threads do next to nothing, so controlling
them on that detailed level neither helps nor hurts performance.)
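
For reference, the counting-semaphore scheme described above could look
roughly like this in present-day Python. The name run_with_semaphore is
invented for the sketch, and subprocess.call replaces os.system so that
commands are argument lists rather than shell strings:

```python
import subprocess
import threading


def run_with_semaphore(commands, max_running=2):
    """Run each command in its own thread, at most max_running at a time."""
    slots = threading.BoundedSemaphore(max_running)
    results = [None] * len(commands)

    def worker(index, argv):
        try:
            results[index] = subprocess.call(argv)
        finally:
            slots.release()          # free the slot when the command is done

    threads = []
    for index, argv in enumerate(commands):
        slots.acquire()              # blocks while max_running are active
        thread = threading.Thread(target=worker, args=(index, argv))
        thread.start()
        threads.append(thread)

    for thread in threads:
        thread.join()
    return results
```

Each thread does nothing but wait for its child process, which is the
point made above: the semaphore limits concurrency, but the threads
themselves add no parallelism.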

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: search python wifi module

2009-10-23 Thread Jorgen Grahn
On Fri, 2009-10-23, Clint Mourlevat wrote:
 hello,

 I am looking for a Python wifi module on Windows; I already have scapy!

What is a "wifi module"?  Your OS is supposed to hide networking
implementation details (Ethernet, PPP, Wi-fi, 3G ...) and provide
specific management interfaces when needed. What are you trying to do?

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: ANN: Testoob 1.15 released

2009-10-21 Thread Jorgen Grahn
On Mon, 2009-10-19, oripel wrote:
 On Oct 14, 5:59 pm, Jorgen Grahn grahn+n...@snipabacken.se wrote:
 But this sentence on the home page

     The documentation is sadly outdated, but may be
     a starting point:

 made me stop looking.  As far as I can tell, you cannot even find out
 what's so advanced about it (or why "advanced" is a good thing)
 without starting to use it.  A brief comparison with module unittest
 (which I am rather unhappy with) would have been nice, too.

 Those are good points Jorgen, thanks.

 The briefest summary I would give is:
 (a) You can run your unittest suites unmodified (so it's easy to try
 out)
 (b) The test running options have the potential to whet your appetite:

 % testoob -h
 Usage
 =
   testoob [options] [test1 [test2 [...]]]

 examples:
   testoob  - run default set of tests
   testoob MyTestSuite  - run suite 'MyTestSuite'
   testoob MyTestCase.testSomething - run MyTestCase.testSomething
   testoob MyTestCase   - run all 'test*' test methods in
 MyTestCase

 Options
 ===
[dozens of options snipped]

Oh, good. Both (a) and (b) are certainly good info for the web page.

Many of the options are for transforming the output -- something I
prefer (as a Unix guy) to do myself with a filtering script I have
control over. Others will like it though, and I like some of the other
options -- especially the one which lists all tests, and the "run
tests which match this string" option.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: The rap against while True: loops

2009-10-21 Thread Jorgen Grahn
On Wed, 2009-10-14, Steven D'Aprano wrote:
...

 Setting up a try...except block is cheap in Python. According to my 
 tests, the overhead is little more than that of a single pass statement.

 But actually raising and catching the exception is not cheap. If you use 
 a lot of exceptions for flow control, performance will probably suffer.

You seem to have experimented with this, so you might be right.
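
For anyone who wants to repeat that kind of experiment, here is a rough
timeit sketch. The function names and the dictionary lookup are just an
illustration, not the test that was actually run, and the numbers will
vary between machines and Python versions:

```python
import timeit


def lookup(table, key):
    """Return table[key], or None if the key is missing."""
    try:
        return table[key]
    except KeyError:
        return None


def time_both(number=100000):
    """Time the no-exception path and the raise-and-catch path."""
    table = {"present": 1}
    hit = timeit.timeit(lambda: lookup(table, "present"), number=number)
    miss = timeit.timeit(lambda: lookup(table, "absent"), number=number)
    return hit, miss


if __name__ == "__main__":
    hit, miss = time_both()
    print("no exception raised: %.4f s" % hit)
    print("exception raised:    %.4f s" % miss)
```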

 In C++, exceptions are expensive, whether you catch one or not.

I am not sure that is objectively true, even if you consider that
"expensive" among C++ users often means "costs more than a semi-decent
alternative".  For example, Stroustrup claimed back in 1994 that the
non-catching case can be implemented at no speed cost or no memory
usage cost (Design and Evolution of C++, 1994, p397).

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: How to schedule system calls with Python

2009-10-21 Thread Jorgen Grahn
On Thu, 2009-10-15, TerryP wrote:
...
 launching external programs, regardless of language, generally falls
 into 3 major categories:

   0.) blocks until program is done; like system
   1.) replaces your program with process, never returns; like exec
   2.) quickly return after asynchronously launching the program

 Most languages will implement the first method because of the standard
 system() function in C, which is fairly popular in its own right.
 Most multi-tasking operating systems will implement some form of exec
 function, which Python exposes through the os module. The last method
 is the least portable, because obviously if the OS lacks multi-tasking
 you're screwed. The best examples of 2. are the UNIX popen() function
 and Microsoft's spawn() family, when used with the P_DETACH flag.

Not sure that popen() fits nicely into that category -- you have to
eat the child's output or feed it with input, or it will eventually
stall.
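
A small sketch of what "eating the output" means in present-day Python,
using the subprocess module rather than popen() itself. The helper name
run_and_collect is made up; communicate() feeds the child's input and
drains its output at the same time, so neither pipe can fill up:

```python
import subprocess


def run_and_collect(argv, input_text=""):
    """Run argv, feed it input_text, and return (exit status, output)."""
    child = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # communicate() writes the input, reads all output, then waits.
    output, _ = child.communicate(input_text)
    return child.returncode, output
```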

 Python being made with much loving kindness, exposes each interface.

Nicely put!

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: How to schedule system calls with Python

2009-10-21 Thread Jorgen Grahn
On Fri, 2009-10-16, Jeremy wrote:
 On Oct 15, 6:32 pm, MRAB pyt...@mrabarnett.plus.com wrote:
 TerryP wrote:
  On Oct 15, 7:42 pm, Jeremy jlcon...@gmail.com wrote:
  I need to write a Python script that will call some command line
  programs (using os.system).  I will have many such calls, but I want
  to control when the calls are made.  I won't know in advance how long
  each program will run and I don't want to have 10 programs running
  when I only have one or two processors.  I want to run one at a time
  (or two if I have two processors), wait until it's finished, and then
  call the next one.
...
 You could use multithreading: put the commands into a queue; start the
 same number of worker threads as there are processors; each worker
 thread repeatedly gets a command from the queue and then runs it using
 os.system(); if a worker thread finds that the queue is empty when it
 tries to get a command, then it terminates.

 Yes, this is it.  If I have a list of strings which are system
 commands, this seems like a more intelligent way of approaching it.
 My previous response will work, but won't take advantage of multiple
 cpus/cores in a machine without some manual manipulation.  I like this
 idea.

Note that you do not *need* multithreading for this. To me it
seems silly to have N threads sitting just waiting for one process
each to die -- those threads contribute nothing to the multiprocessing
you want.

In Unix, you can have one process fork() and exec() as many programs
as you like, have them run on whatever CPUs you have, and wait for
them to die and reap them using wait() and related calls.  (Not sure
what the equivalent is in non-Unix OSes or portable Python.)
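
A Unix-only sketch of that approach through Python's os module. The
function name run_limited and the exit-status conventions (127 for a
failed exec, a negative number for death by signal) are choices made
for this example, not anything prescribed above:

```python
import os


def run_limited(commands, max_running=2):
    """Run each argv list in commands, at most max_running at a time."""
    results = [None] * len(commands)
    running = {}                      # pid -> index into commands
    next_index = 0
    while next_index < len(commands) or running:
        # Start children until the limit is reached or nothing is left.
        while next_index < len(commands) and len(running) < max_running:
            argv = commands[next_index]
            pid = os.fork()
            if pid == 0:
                # Child: become the requested program.
                try:
                    os.execvp(argv[0], argv)
                finally:
                    os._exit(127)     # only reached if exec failed
            running[pid] = next_index
            next_index += 1
        # Parent: block until any child dies, then reap it.
        pid, status = os.wait()
        if pid in running:
            index = running.pop(pid)
            if os.WIFEXITED(status):
                results[index] = os.WEXITSTATUS(status)
            else:
                results[index] = -os.WTERMSIG(status)
    return results
```

The single parent process sleeps in wait() and never needs a thread;
the kernel decides which CPU each child runs on.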

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: cpython compilation parameter

2009-10-14 Thread Jorgen Grahn
On Thu, 2009-10-08, Diez B. Roggisch wrote:
 cEd wrote:

 Hello,
 
 I'm wondering how to compile python to get good performance.
 Because when I compare this ugly code which find prime number:

...

  between :
  - the python distributed with my ubuntu
 # time python prime_number.py > /dev/null
 real    0m12.237s
 user    0m12.129s
 sys     0m0.024s
 
  - and the one compiled by my self
 time my_python_compiled prime_number.py > /dev/null
 real    0m42.193s
 user    0m41.891s
 sys     0m0.044s
 
 so which option should I give or which flag ???

 I doubt that there is such a flag. There must be a different reason for
 this. Can you give us the python versions for each, and architecture (32/64
 bit)?

He could start by compiling it exactly like Ubuntu does. Just get the
Ubuntu source package -- it's all in there, Ubuntu doesn't keep it a
secret.
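
A quick first step, before digging through the source package, might be
to ask each interpreter how it was built. This sketch uses the sysconfig
module of present-day Python; build_settings is a made-up helper name,
and the values can be None on platforms that don't record them:

```python
import sysconfig


def build_settings(names=("CONFIG_ARGS", "OPT", "CFLAGS")):
    """Return the recorded build-time settings of the running interpreter."""
    return {name: sysconfig.get_config_var(name) for name in names}


if __name__ == "__main__":
    for name, value in build_settings().items():
        print("%s = %s" % (name, value))
```

Running it under both interpreters and comparing the output should show
differences in configure arguments or optimization flags.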

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: python performance on Solaris

2009-10-14 Thread Jorgen Grahn
On Wed, 2009-10-14, Antoine Pitrou wrote:
 inaf cem.ezberci at gmail.com writes:
 
 Good point. I failed to compare the CPU power on these machines.. 32
 bit linux box I have is 2666 MHz vs the Solaris zone is 1415 MHz.. I
 guess that explains :) Thank you for the tip..

 You have to compare not only CPU frequencies but the CPU models.

Yes, at least that.  Megahertz figures have been useless for decades,
except in advertising.

 Recently Sun has been selling CPUs optimized for multi-threading (e.g. the
 UltraSPARC T2 or Niagara CPUs) which have, by design, very poor
 single-threaded performance. If your Solaris zone uses such a CPU then a 6-8x
 difference in single-threaded performance compared to a modern Intel
 or AMD CPU
 is totally expected.

(Had to Google it. A Solaris Zone is apparently some kind of
virtualization thing, with low CPU overhead.)

s/multi-threading/multi-programming/ I suppose. I certainly hope you
can still get performance while running many separate true processes in
parallel.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: ANN: Testoob 1.15 released

2009-10-14 Thread Jorgen Grahn
On Thu, 2009-10-08, oripel wrote:
 Testoob is the advanced Python test runner and testing framework that
 spices up any existing unittest test suite.

 Home: http://code.google.com/p/testoob

But this sentence on the home page

The documentation is sadly outdated, but may be
a starting point:

made me stop looking.  As far as I can tell, you cannot even find out
what's so advanced about it (or why "advanced" is a good thing)
without starting to use it.  A brief comparison with module unittest
(which I am rather unhappy with) would have been nice, too.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Are there any modules for IRC, that work with Python 3.1?

2009-10-14 Thread Jorgen Grahn
On Sat, 2009-10-10, TerryP wrote:
 Does anyone know of any modules for dealing with the IRC protocol,
 that will work with Python 3.1? It doesn't have to be super great,
 just less time consuming than playing with sockets directly (and obv.
 stable). The only module in my systems package manager is irclib for
 Python 2.6. I can live with writing code for Python 2.4+ easily but,
 ahem, I think it would be wise to write new code around Python 3.1
 instead...

Even though it is not widely used yet, and the module you want to use
doesn't support it?  I assume you have installed Python 3.x manually
too (my Debian 'stable' is only at Python 2.5 at the moment -- it
probably takes lots of work to bring in Python 3 without losing
important packages).

Or you can ask the irclib maintainers if they have something. If not,
you can do the work for them, after you have convinced yourself it's
good enough (by starting to use it with Python 2.x).

I don't have any more substantial advice, sorry.

 # circumstances

 Having recently been put into search for a new IRC client, and
 everything I've thrown in the cauldron having become a
 disappointment... let's just say, I've come to a conclusion -- either
 I'm going to install ircII and live with whatever it has to offer(!),
 or hash out something quickly in Python that fits my needs. If I'm
 considering writing an IRC client, it makes sense to check for modules
 implementing the protocol before I have to roll something myself, but
 nothing seems to fit the bill.


 (For those that don't know it, ircII is a really freaking old Internet
 Relay Chat client ;)

I would have thought (given the number of hackers who use it a lot)
there were lots of good IRC clients, but I don't use it myself, so ...

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: The rap against while True: loops

2009-10-14 Thread Jorgen Grahn
On Mon, 2009-10-12, RDrewD wrote:
...
 I was a bit surprised that nobody in this discussion so far bantered
 around the phrase "loop invariant", but then I looked in
 http://en.wikipedia.org/wiki/Loop_invariant and found it was draped in
 so much formalism that it's sure to put off all but the most dedicated
 of Computer Science fans.

Haven't read it. But much of the CS parts of the Wikipedia sucks, and
whoever writes there doesn't own the trademark on loop invariants
anyway.

IME, a loop invariant is a simple and useful tool for thinking about
the correctness of code. Class invariants (or whatever they are called)
are even better.

 I haven't been in college in 35 years, so
 I'll admit to being rusty on this, but as I remember it, any time we
 wrote a loop, we were expected to be able to say what the loop
 invariant is.

Yes, it's as simple as that.

 my_prissy_little_indicator_variable = true
 while (my_prissy_little_indicator_variable){
 body
 }
 isn't satisfying because it doesn't guard the body with any
 assurance that the loop invariant will be true before you enter into
 that block of code.

Why not? To me, it obviously does.

It would also help if you didn't use intentionally meaningless and
annoying variable names in your examples. In reality you would have a
meaningful expression like "not inputqueue.empty()" or
"time() < deadline" or something.
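
As an illustration only, here is a loop whose condition says what it
means. The function drain and its arguments are invented for this
sketch:

```python
import queue
import time


def drain(inputqueue, seconds):
    """Take items off inputqueue until it is empty or time runs out."""
    deadline = time.time() + seconds
    handled = []
    # The condition reads as the reason the loop keeps going.
    while not inputqueue.empty() and time.time() < deadline:
        handled.append(inputqueue.get())
    return handled
```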

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: The rap against while True: loops

2009-10-14 Thread Jorgen Grahn
On Mon, 2009-10-12, Grant Edwards wrote:
 On 2009-10-12, Gabriel Genellina gagsl-...@yahoo.com.ar wrote:

 my_prissy_little_indicator_variable = true
 while (my_prissy_little_indicator_variable){
 body
 }
 isn't satisfying because it doesn't guard the body with any
 assurance that the loop invariant will be true before you enter into
 that block of code.

 I think you meant the other way; the above is the simplest loop case, with  
 the test at the start.

 Except the test at the start is meaningless when it comes to
 reading the code and troubleshooting.  What counts are
 assignments to my_prissy_little_indicator_variable inside the
 loop.  And those aren't really any easier to spot than break
 statements.

It's a red herring.  A good loop tends to *not* have a boolean
variable as the "while ..." expression.  That smells like "flag
programming", and if I cannot come up with anything better than that, I
often prefer a "while 1" with breaks in it.

For a real-life loop, see for example

  http://en.wikipedia.org/wiki/Binary_search#Iterative

(except it confuses me because it's a "repeat ... until" and it's in
Pascal with that quaint 1-based indexing)
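
The same idea as a Python sketch with 0-based indexing and a plain
while loop. This is not a transcription of the Wikipedia code, just one
common way to write it, with the loop invariant spelled out in a
comment:

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1."""
    low, high = 0, len(items)
    # Invariant: if target is in items at all, its index is in [low, high).
    while low < high:
        mid = (low + high) // 2
        if items[mid] < target:
            low = mid + 1     # target can only be to the right of mid
        elif items[mid] > target:
            high = mid        # target can only be to the left of mid
        else:
            return mid
    return -1
```

The loop condition is a real expression about the search range, and no
flag variable is needed.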

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: The rap against while True: loops

2009-10-14 Thread Jorgen Grahn
On Wed, 2009-10-14, Marco Mariani wrote:
 Dennis Lee Bieber wrote:

  One thing to note is that break ONLY exits the innermost loop --
 Ada adds the confusion that one could define a label on the loops, and
 have the innermost use
  exit outer_label [when condition]
 
 
  THAT I find scary... Since you have to match the label name to
 something that occurs somewhere prior to the exit, and THEN have to
 find the end of that loop.

 But we have exceptions. And I know somebody, in other languages, thinks 
 it's a Best Practice to avoid using exceptions for flow control.

A lot of C++ programmers think so, and Stroustrup himself says
"exceptions are for exceptional things" or something to that effect.
Is that what you're thinking of?

Thankfully, Stroustrup doesn't use the dreaded phrase "Best Practice",
which as far as I can tell is designed to shut down rational thought
in the audience.

 Thankfully, python programmers are less dogmatic, and use whatever makes 
 sense to use. I hope.

Calling it "dogmatic" is unfair.  C++ is very different from Python,
and has a different implementation of exceptions. You also tend to use
the language to solve a different set of problems.

That said, I still don't fully understand the rationale behind that
advice or rule ... so I'm willing to break it, and sometimes I do.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: how to handle broken pipes

2009-10-14 Thread Jorgen Grahn
On Wed, 2009-10-14, Igor Mikushkin wrote:
 Hello all!

 Could anyone please tell me what is the right way to handle broken
 pipes in Python?
 I can wrap all my print statements with try/except blocks but it looks
 like overkill for me.

 I'm using my Python script this way: my_script | less
 The script produces a lot of data.
 So usually when I find what I'm looking for and press 'q' script is
 still writing something.

You mean like this on Unix?

  python -c 'while 1: print "hello, world"' | less

which produces

  Traceback (most recent call last):
    File "<string>", line 1, in <module>
  IOError: [Errno 32] Broken pipe
  
Well, you can catch IOError, examine the errno, and do a sys.exit()
if it's EPIPE. Don't know if it should be sys.exit(0) or sys.exit(1)
though.

Oh, and *now* I see what you wrote at the top:

 I can wrap all my print statements with try/except blocks but it looks
 like overkill for me.

It's overkill if you have to do it for each print. You should always
(IMHO) wrap all your logic inside an object or a function, let's say
foo(). Then you only have to wrap the single call to foo().

There should be an even cleaner way. Mine is kind of ugly (catch,
examine, exit or re-raise) and it also incorrectly catches broken pipes
which aren't related to sys.stdout/stderr.
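
The catch, examine, exit-or-re-raise pattern might look like this in
present-day Python, where IOError is an alias of OSError. The name
exit_status is made up, and returning 1 for a broken pipe is an
arbitrary choice, since the right value is an open question above:

```python
import errno
import sys


def exit_status(main):
    """Call main() and translate a broken pipe into an exit status."""
    try:
        main()
        sys.stdout.flush()
    except OSError as error:
        if error.errno != errno.EPIPE:
            raise             # not a broken pipe: let it propagate
        return 1              # assumption: a vanished reader counts as failure
    return 0
```

With all the program logic inside foo(), the script would then end with
a single sys.exit(exit_status(foo)). As noted, this also swallows broken
pipes that have nothing to do with standard output.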

There is a similar problem with Ctrl-C, by the way -- the user gets a
KeyboardInterrupt exception thrown in his face where other languages
would have exited silently by default.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: best vi / emacs python features

2009-10-12 Thread Jorgen Grahn
On Wed, 2009-10-07, OdarR wrote:
 hello,

 * this is not a troll *

 which kind of help you have with your favorite editor ?

Syntax highlighting and help with the indentation (move to the
right after an else:, keep in the same column normally, etc).
Nothing else specific to Python.

 personally, I find emacs very nice, in the current state of my
 knowledge, when I need to reindent the code.
 you know how this is critical in python...:-)

Moving a block one step right or left?  Oh, I use that, too.

I am also a heavy user of dabbrev-expand (that's an Emacs term, but
it exists in Vim too).  Also some other vital features which aren't
specific to Python. The best help an editor can give is language-
independent.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: start external program from python

2009-10-12 Thread Jorgen Grahn
On Mon, 2009-10-12, Bjorn wrote:

 Hi, I would like to start a program from within python (under linux):
 This works fine:

 import os
 path = 'tclsh AppMain.tcl hej.gb'
 os.system(path)

 The file AppMain.tcl is the executable

Not really -- tclsh is the executable from Python's and the system's
view. If you are on Unix, you might want to use a shebang

  http://en.wikipedia.org/wiki/Shebang_(Unix)

and rename AppMain.tcl to AppMain so you don't have to rewrite your
Python program just because you rewrite the Tcl program in some
other (more modern) language.

 and the file hej.gb is a
 textfile in the same directory.
 The text file gets opened in the app in the correct way.

So the subject line doesn't really capture your question?

 I wonder if I could pass information from the clipboard to the
 AppMain.tcl instead of the file hej.gb ?
 I use wxPython.
 any comment is appreciated!

Sure you can, but the clipboard wasn't really designed for that.

You can document 'AppMain' as taking its input from whatever is in the
clipboard when it starts, but if there is nothing there it cannot wait
for it, and there are no guarantees that the user or some other
program (or another copy of the program) hasn't put something else
there in the milliseconds it takes to start AppMain. It's also not a
private channel; any other program can read it.

Why do you want to use the clipboard?

If you think you need it because you don't want a temporary file,
maybe you can let AppMain read from standard input, and let the Python
program write the data using os.popen or one of the alternatives.
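
A sketch of the standard-input route using the subprocess module, one
of the alternatives to os.popen in present-day Python. The helper name
feed is invented, and it assumes AppMain has been changed to read its
data from standard input, which the current program does not do:

```python
import subprocess


def feed(argv, text):
    """Run argv with text on its standard input; return its exit status."""
    completed = subprocess.run(argv, input=text, text=True)
    return completed.returncode
```

The call would then be something like feed(["tclsh", "AppMain.tcl"],
data), with no temporary file and no clipboard involved.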

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Daemon call python program

2009-10-12 Thread Jorgen Grahn
On Mon, 2009-10-12, §ä´m¦Û¤vª�...@¤ù¤Ñ wrote:
 : I have a daemon process which will call a python program.
 : What do I do if I want to dump the exception when the python program exits
 : due to an uncaught exception?
 : Thanks a lot!

 By the way, the python program is multi-threaded

It doesn't really matter if it's multi-threaded, or even that it is
Python. You would have the same problem with any program which may
print stuff to standard output or standard error, and/or exit.

I think it depends completely on the design of your daemon, and why
it calls another program.  And what it does while that other program
is running.

inetd/xinetd on Unix is one example, but they feed the program's output
(all of it, both standard output and standard error, IIRC) to the remote
client. Same with CGI, I think.
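
If the daemon itself happened to be written in Python, one possible
design is to capture the child's standard error, which is where an
uncaught exception's traceback ends up. The helper name run_and_capture
is made up, and what the daemon does with the captured text (log file,
syslog, mail) is left open:

```python
import subprocess


def run_and_capture(argv):
    """Run argv; return (exit status, standard output, standard error)."""
    completed = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    return completed.returncode, completed.stdout, completed.stderr
```

A Python child that dies from an uncaught exception exits with a
non-zero status, so the daemon can check the status and dump the
captured standard error wherever it keeps its logs.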

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Error en el bus from python

2009-10-12 Thread Jorgen Grahn
On Mon, 2009-10-12, Philip Semanchuk wrote:

 On Oct 11, 2009, at 4:45 PM, Yusniel wrote:

 Hi. I installed a library for python named pyswip-0.2.2 but when I
 run a python example with the next lines, the python interpreter
 throws the following error: "Error en el bus". The code lines are:

Makes me think of that guy from the Simpsons, in the bumble-bee suit ...
fortunately you don't need to know tech Spanish to decode this one.

...
 Are you on a Mac by any chance? I get a bus error out of Python once  
 in a while, usually when a C library has done something bad. I don't  
 know if this error is specific to OS X or not.

"Bus Error" is an old BSD-ism which I guess you don't see much in
Linux or Solaris these days (or maybe I never run buggy code ;-).  It
translates roughly to "segmentation fault", but IIRC it is more about
accessing memory words on nonaligned addresses than about accessing
addresses your process doesn't own.

[...]

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Creating a local variable scope.

2009-09-12 Thread Jorgen Grahn
On Fri, 11 Sep 2009 19:07:55 -0700 (PDT), Bearophile bearophileh...@lycos.com 
wrote:
...
 No need to add other things to the
 language as the OP suggests.

He didn't suggest that (although he did give examples from other
languages).

Personally ... yes, I sometimes feel like the OP, but usually if I
want that kind of subtle improvements, I am also willing to
restructure my code so the natural scopes become short enough.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Code formatting question: conditional expression

2009-08-26 Thread Jorgen Grahn
On Wed, 26 Aug 2009 16:47:33 +1000, Ben Finney
ben+pyt...@benfinney.id.au wrote:
 Nicola Larosa (tekNico) nicola.lar...@gmail.com writes:

 Nicola Larosa wrote:
  Here's my take:
 
      excessblk = Block(total - P.BASE, srccol,
  carry_button_suppress=True
          ) if total  P.BASE else None

 Oops, it got shortened out: line longer than 72 chars, acceptable in
 code, but not in email. I'll try again.

 Fine in email; just munged on your behalf (and probably without your
 knowledge) by your email service provider. If you don't want that
 happening, it's probably best to avoid Google Mail.

But this is Usenet (or at least it gets gatewayed there) and there
are such limits here (either by common agreement or by RFC).
Probably not Google's fault.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: ubuntu dist-packages

2009-08-26 Thread Jorgen Grahn
On Wed, 26 Aug 2009 12:46:13 +0200, Diez B. Roggisch
de...@nospam.web.de wrote:
 Robin Becker wrote:

 I was surprised a couple of days ago when trying to assist a colleague with
 his python setup on a ubuntu 9.04 system.
 
 We built our c-extensions and manually copied them into place, but
 site-packages wasn't there. It seems that ubuntu now wants stuff to go
 into lib/python2.6/dist-packages.
...

 I don't know much about the atrocities distributions commit to
 python-installations (ripping out distutils because it's a developer-only
 thing,

Well, if you are thinking about Debian Linux, it's not as much
"ripping out" as "splitting into a separate package with a non-obvious
name". Annoying at times, but hardly an atrocity.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: ubuntu dist-packages

2009-08-26 Thread Jorgen Grahn
On Wed, 26 Aug 2009 17:20:35 +0100, Robin Becker ro...@reportlab.com wrote:
 Jorgen Grahn wrote:
 On Wed, 26 Aug 2009 12:46:13 +0200, Diez B. Roggisch
 
 Well, if you are thinking about Debian Linux, it's not as much
 ripping out as splitting into a separate package with a non-obvious
 name. Annoying at times, but hardly an atrocity.

 so where is the official place for user-installed stuff on
 ubuntu/debian, i.e. will there be dist-packages and site-packages?

I don't know, but I know Debian has a group of people working out how
to package Python and software written in Python.  Those guys have a
home page somewhere at debian.org -- there should be information
there, and/or under /usr/share/doc/python* on your system.

Another answer is "let distutils do its job and do not worry", but
sometimes you need to know the rules and the reasons ...
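
One way to at least see what a given interpreter considers its install
locations is to ask it. This sketch uses the sysconfig module of
present-day Python; package_dirs is a made-up name, and a distribution
may patch these values, so treat the answer as what this interpreter
reports rather than as official Debian or Ubuntu policy:

```python
import sysconfig


def package_dirs():
    """Return (pure-Python dir, platform-specific dir) for add-on packages."""
    paths = sysconfig.get_paths()
    return paths["purelib"], paths["platlib"]


if __name__ == "__main__":
    pure, plat = package_dirs()
    print("pure Python packages:", pure)
    print("C extensions:        ", plat)
```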

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Career Track: Computer Programmer

2009-06-10 Thread Jorgen Grahn
On Mon, 8 Jun 2009 07:49:42 -0700 (PDT), youssef_edward3...@yahoo.com 
youssef_edward3...@yahoo.com wrote:
 Roles and Responsibilities :

 The primary role of a Computer Programmer is to write programs
 according to the instructions determined primarily by computer
 software engineers and systems analysts.

I hope this is a direct quote from a 1976 issue of some sleazy
business magazine, not something anyone believes in today. Except for
the systems analysts, maybe.

 In a nutshell, Computer
 Programmers are the ones that take the completed designs and convert
 them into the instructions that the computer can actually follow.

That's not a programmer, that's a compiler. Or (to at least *pretend*
to be on topic) that's the Python interpreter.

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: xml application advice

2009-06-10 Thread Jorgen Grahn
On Wed, 10 Jun 2009 08:57:42 -0500, William Purcell flye...@gmail.com wrote:
...
 I am writing an application to calculate pressure drop for a piping
 network.  Namely a building sprinkler system.  This will be a
 command line program at first with the system described in xml (at
 least that is how I think I want to do it).

How about (re)using the dot graph language from Graphviz?  It's a file
format for describing directed graphs, which I suppose a sprinkler
system is. It might fit; it might not.

 An important part of this calculation is finding the 'hydraulically
 most remote' sprinkler.  This is something that I could specify with
 an attribute for now and later think about how to automate it.  I
 need to walk through the dom tree until I find a node of type
 sprinkler that has an attribute of hydraulically_most_remote with
 a value of True.

 After I find this I need to break the iterator/for loop and then
 start walking backwards keeping a running total of the pressure drop
 until I reach a node that has multiple pipesections and then walk to
 the end of each branch and calculate the pressure drop, and then add
 them to the branch that contained the hydraulically most remote
 sprinkler, and then move on, repeating this until I walk all the way
 back to the inflow node.

 I am having trouble finding a decent python/xml resource on the web.
 I have ordered Python & XML by Jones and Drake, but I am anxious to
 get something started.

If what you're interested in is to get real work done, why decide to
make XML a showstopper?

I see two tasks: (a) transforming a text file description of a sprinkler
system into a Python data structure, and (b) implementing algorithms
to find out important stuff about such a data structure.

You do not need (a) before you can do (b). You can even have Python as
your input format, and eval() the file. Crude and insecure, but it
works, at almost zero cost.
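
A minimal sketch of that idea, assuming a tree-shaped network and
invented field names ("name", "drop", "remote", "children" are all
hypothetical, not from the original post). It uses ast.literal_eval()
instead of a bare eval(), which only accepts literals and so is a
little less insecure:

```python
import ast

# Hypothetical input file contents: a plain Python literal describing the
# network as a tree. Every field name here is invented for illustration.
NETWORK_TEXT = """
{
    "name": "inflow", "drop": 0.0, "children": [
        {"name": "pipe1", "drop": 1.5, "children": [
            {"name": "sprinkler_a", "drop": 0.7, "children": []},
            {"name": "pipe2", "drop": 2.0, "children": [
                {"name": "sprinkler_b", "drop": 0.5, "remote": True,
                 "children": []},
            ]},
        ]},
    ],
}
"""

def load_network(text):
    # literal_eval only evaluates literals (dicts, lists, numbers, ...)
    return ast.literal_eval(text.strip())

def path_to_remote(node, path=()):
    """Return the nodes from the root down to the one flagged 'remote'."""
    path = path + (node,)
    if node.get("remote"):
        return list(path)
    for child in node["children"]:
        found = path_to_remote(child, path)
        if found:
            return found
    return None

network = load_network(NETWORK_TEXT)
route = path_to_remote(network)
names = [n["name"] for n in route]
total_drop = sum(n["drop"] for n in route)
print(names, total_drop)
```

This only sums the drop along one path; the branch handling described
in the question is left out. The point is that task (b) can be
developed against such a structure long before any XML exists.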

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Using logging module to log into flash drive

2009-06-10 Thread Jorgen Grahn
On Tue, 9 Jun 2009 18:10:18 +0100, A. Cavallo a.cava...@mailsnare.com wrote:

[top-posting fixed]

 On Tuesday 09 June 2009 16:57:00 kretel wrote:
 Hi All,

 I am trying to implement the following functionality:
 1. log messages to the flash drive
 2. if the flash drive is not available, switch handler to the
 BufferingHandler and log into buffer,
 3. once the flash drive is plugged in and available store the logs
 from BufferingHandler into that flash drive and switch the handler into
 RotatingFileHandler.

 Which approach would you suggest to use while implementing this
 functionality?

First, ignore the words "flash drive" and think in terms of "can I
open the file named so-and-so for writing?". Unless you absolutely
must avoid storing your file on a file system based on some other
storage technology.

 One that comes to my mind is to have one process or
 thread to check periodically if the flashdrive is available, and have
 a flag that will indicate that we can store the logs into the flash
 drive. But I don't particularly like this approach.
 Would you do it different way?
 Any suggestions are appreciated.

 Hi,
 the problem screams for a separate thread.

I don't hear any screaming.  It seems simpler to just

def log(something):
    if logging_to_memory() and seconds_since_i_last_checked() > N:
        try_to_switch_to_file()
    do_the_actual_logging(something)

unless it's vital that as many of these logs as possible survive a
crash (in which case I guess the program would also refuse to exit
until the user finds the physical flash memory device somewhere and
mounts it correctly -- flashback to ancient floppy-based Macs).
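
The pseudocode above could be fleshed out roughly like this with the
standard logging module. This is only a sketch under assumptions: the
class name FallbackLogger, the retry interval and the buffer capacity
are all made up, and a plain FileHandler stands in for whatever
rotating handler is actually wanted.

```python
import logging
import logging.handlers
import time

class FallbackLogger:
    """Buffer records in memory until the log file can be opened.

    Hypothetical helper; not part of the logging module.
    """

    def __init__(self, path, retry_seconds=5.0):
        self.path = path
        self.retry_seconds = retry_seconds
        self.last_check = float("-inf")   # force a check on the first call
        self.logger = logging.getLogger("fallback-%d" % id(self))
        self.logger.setLevel(logging.INFO)
        self.logger.propagate = False
        # flushLevel is above CRITICAL, so nothing is flushed until we
        # set a target and call flush() ourselves.
        self.memory = logging.handlers.MemoryHandler(
            capacity=100000, flushLevel=logging.CRITICAL + 1)
        self.logger.addHandler(self.memory)
        self.file_handler = None

    def _try_to_switch_to_file(self):
        self.last_check = time.monotonic()
        try:
            handler = logging.FileHandler(self.path)
        except OSError:
            return                        # still unavailable, keep buffering
        self.memory.setTarget(handler)
        self.memory.flush()               # replay buffered records
        self.logger.removeHandler(self.memory)
        self.logger.addHandler(handler)
        self.file_handler = handler

    def log(self, message):
        waited = time.monotonic() - self.last_check
        if self.file_handler is None and waited > self.retry_seconds:
            self._try_to_switch_to_file()
        self.logger.info(message)
```

No thread is involved: the availability check simply piggybacks on
calls to log(), at most once per retry interval. Records logged while
the file is unavailable are lost if the process dies first, which is
the trade-off mentioned above.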

Yes, I find the requirements quite odd.

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: python and getopt and spaces in option

2009-06-10 Thread Jorgen Grahn
On Tue, 9 Jun 2009 12:22:20 -0400, David Shapiro david.shap...@sas.com wrote:
 I have been trying to find an example of how to deal with options
 that have spaces in them.  I am using jython, which is the same I
 think as python 2.2.3.  I feebly tried to use optparse and argparse
 with no success (got gettext, locale, and optparse).  The code is as
 follows:


 try:
     (opts, args) = getopt.getopt(sys.argv[1:], "hs:p:n:j:d:l:u:c:t:w:q:v",
         ["help", "server=", "port=", "dsName=", "jndiName=",
          "driverName=", "driverURL=", "user=", "passWD=",
          "targetServer=", "whereProp=", "testTableName=", "version"])
 except getopt.GetoptError:
     usage()

 for opt in opts:
     (key, value) = opt
 ...
     if (key in ("-q", "--testTableName")):
...

 The one that gives me trouble is with the -q option, because it can
 look like: -q "SQL 1 TABLE".  It returns back just SQL.  How do I get
 it to return the whole thing that is in double quotes?

You are probably calling the program incorrectly. A non-broken getopt has no
trouble with such things. When executing a program from a normal Unix
shell, single or double quotes (like you use above) are enough. I'd expect
other environments to behave in the same way.

If -q only eats the string "SQL", where does "1 TABLE" go?  It cannot
just disappear; does it end up in 'args'?
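
A small sketch of what to expect, using shlex.split() as a stand-in
for the word splitting a Unix shell does (an approximation; the option
string is cut down to the two options that matter here):

```python
import getopt
import shlex

# Roughly what a Unix shell hands to the program for:
#   prog -q "SQL 1 TABLE" -v
quoted = shlex.split('-q "SQL 1 TABLE" -v')
# ... and for the unquoted form:
#   prog -q SQL 1 TABLE -v
unquoted = shlex.split('-q SQL 1 TABLE -v')

opts_q, args_q = getopt.getopt(quoted, "q:v", ["testTableName=", "version"])
opts_u, args_u = getopt.getopt(unquoted, "q:v", ["testTableName=", "version"])

print(opts_q, args_q)   # the whole quoted string is the value of -q
print(opts_u, args_u)   # -q gets only "SQL"; the rest lands in args
```

If the second pattern is what you see, the quotes are being lost
before getopt ever looks at the arguments.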

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Which C compiler?

2009-05-19 Thread Jorgen Grahn
On Mon, 18 May 2009 15:47:41 -0700, norseman norse...@hughes.net wrote:

 I suspect that if all python users were in the same room and the 
 question "Are you NOT happy with python's upgrade requirements?" was 
 asked you would find most hands in the air.  I have said it before - the 
 current attitude of 'new means we start over' was what nearly destroyed 
 Apple. Doesn't take joe public long to get tired of constantly 
 re-buying, re-writing themselves, re-hiring the same people to re-write 
 the same thing, etc...

I dislike the bleeding edge aspect of Python culture too, but (as
long as everyone ignores Python 3.0) it's not really something which
hurts me in my daily life. *Not* using Python would hurt, though.

I'm on Linux though, and use no third-party modules which haven't
already been filtered by Debian's maintainers. I don't know if that's
the reason, but my applications rarely or never break.  So I'm not
quite sure what happened in your case ...

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: strip char from list of strings

2009-05-19 Thread Jorgen Grahn
On Tue, 19 May 2009 10:25:35 +0200, Piet van Oostrum p...@cs.uu.nl wrote:
 Laurent Luce laurentluc...@yahoo.com (LL) wrote:

LL I have the following list:

LL [ 'test\n', 'test2\n', 'test3\n' ]

LL I want to remove the '\n' from each string in place, what is the
LL most efficient way to do that ? 

 I suppose you mean you have lists similar to the one given because with
 a list of 3 elements efficiency is a non-issue unless you do something
 stupid or repeat the operation thousands of times. Even with a list of
 1000 elements efficiency isn't very important. In fact you should worry
 about efficiency only after there are signs that there might be a
 problem. 

 Secondly, in Python you cannot remove a character from a string in place
 if that means modifying the string. Strings are immutable.

So the best way is probably to make sure the '\n's do not end up in
the list in the first place. I suspect that is often more elegant too.
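
For instance, if the strings come from a file, the newlines can be
dropped while reading. A minimal sketch (io.StringIO stands in for a
real file here, and the variable names are made up):

```python
import io

# Stand-in for an open text file
fake_file = io.StringIO("test\ntest2\ntest3\n")

# Strip the newline as each line is read, so it never enters the list
lines = [line.rstrip("\n") for line in fake_file]

# For text already in memory, splitlines() does the same job
also_lines = "test\ntest2\ntest3\n".splitlines()

# If the list already exists, slice assignment replaces its contents
# in place (the strings themselves are still new objects)
items = ['test\n', 'test2\n', 'test3\n']
items[:] = [s.rstrip("\n") for s in items]

print(lines, also_lines, items)
```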

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: CSV performance

2009-04-29 Thread Jorgen Grahn
On Mon, 27 Apr 2009 23:56:47 +0200, dean de...@yahoo.com wrote:
 On Mon, 27 Apr 2009 04:22:24 -0700 (PDT), psaff...@googlemail.com wrote:

 I'm using the CSV library to process a large amount of data - 28
 files, each of 130MB. Just reading in the data from one file and
 filing it into very simple data structures (numpy arrays and a
 cstringio) takes around 10 seconds. If I just slurp one file into a
 string, it only takes about a second, so I/O is not the bottleneck. Is
 it really taking 9 seconds just to split the lines and set the
 variables?

 I assume you're reading a 130 MB text file in 1 second only after OS
 already cached it, so you're not really measuring disk I/O at all.

 Parsing a 130 MB text file will take considerable time no matter what.
 Perhaps you should consider using a database instead of CSV.

Why would that be faster? (Assuming all data is actually read from the
database into data structures in the program, as in the text file
case.)

I am asking because people who like databases tend to overestimate the
time it takes to parse text. (And I guess people like me who prefer
text files tend to underestimate the usefulness of databases.)

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
--
http://mail.python.org/mailman/listinfo/python-list


Re: Third Party Modules

2009-04-29 Thread Jorgen Grahn
On Tue, 28 Apr 2009 10:15:23 -0700, John Nagle na...@animats.com wrote:
 Brock wrote:
 Hi Everyone,
 
 I know this is most likely a basic question and you will roll your
 eyes, but I am just starting out with Python (hobbyist) and I see many
 tutorials on the web referring to the use of external modules.
...

 There are several different mechanisms for handling this, and they all 
 suck.
 The whole Python module distribution scheme is so uncoordinated that there's
 no uniform way to do this.  It's not your fault.
...
 I'm not going to put Python software out for public use again.  I don't
 have the time to deal with this crap.

And which other language would have made it easier? Once you have odd
third-party dependencies, you (or your users, rather) will have
problems.

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
--
http://mail.python.org/mailman/listinfo/python-list


Re: stuck with PyOBEX

2009-04-29 Thread Jorgen Grahn
On Tue, 28 Apr 2009 18:52:38 +0200, Diez B. Roggisch de...@nospam.web.de 
wrote:
 alejandro wrote:

[AF_BLUETOOTH]

 Can you tell me what is it? Maybe I can search it and pass it in another
 way... if it is an address or protocol name

 I'm not entirely sure, but I guess no, you can't simply pass it in.

 Unix uses streams as abstraction for a lot of things - all kinds of devices
 for example.

You mean "uses the BSD Socket API as an abstraction" here. That's the
framework where AF_BLUETOOTH apparently lives.
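
In Python that shows up as a constant in the socket module, alongside
AF_INET and friends. Whether it exists depends on the platform and on
how Python was built, so a hedged check looks something like this:

```python
import socket

# AF_BLUETOOTH is only defined where the platform and build support it
AF_BLUETOOTH = getattr(socket, "AF_BLUETOOTH", None)

# All address families this build of the socket module knows about
families = sorted(name for name in dir(socket) if name.startswith("AF_"))

if AF_BLUETOOTH is None:
    print("no Bluetooth sockets in this build; families:", families)
else:
    print("AF_BLUETOOTH =", int(AF_BLUETOOTH))
```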

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
--
http://mail.python.org/mailman/listinfo/python-list


Re: how to handle/generate pcap file

2009-04-02 Thread Jorgen Grahn
On Wed, 1 Apr 2009 18:59:12 -0700 (PDT), Evan xdi...@gmail.com wrote:
 On Apr 2, 6:59 am, Rhodri James rho...@wildebst.demon.co.uk wrote:
 On Wed, 01 Apr 2009 14:53:34 +0100, Evan xdi...@gmail.com wrote:

  Hello -

  I'm trying to decode the pcap file, which is a packet capture made by
  tcpdump or wireshark.   Is there a python module that I can use for this
  problem?

  Can python-libpcap or pycap or dpkt do that?

 A quick browse of the pypcap website suggests that yes, it can.

 --
 Rhodri James *-* Wildebeeste Herder to the Masses


 Yap, I found that dpkt can do this, Thanks all.

I have used the 'pcapy' module successfully for this. Might be better
than the ones mentioned above, might be worse.

Also, the pcap file format isn't really hard: you can write such code
by yourself in a few hours. I've done that too.
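
A rough sketch of such a reader, assuming the classic pcap layout (a
24-byte global header followed by 16-byte per-packet headers) and
handling only the little-endian, microsecond-timestamp variant. The
names read_pcap, GLOBAL_HEADER and RECORD_HEADER are my own:

```python
import io
import struct

# magic, version major/minor, thiszone, sigfigs, snaplen, link type
GLOBAL_HEADER = struct.Struct("<IHHiIII")   # 24 bytes
# ts_sec, ts_usec, captured length, original length
RECORD_HEADER = struct.Struct("<IIII")      # 16 bytes
MAGIC = 0xA1B2C3D4

def read_pcap(stream):
    """Yield (timestamp_seconds, packet_bytes) from a pcap stream."""
    raw = stream.read(GLOBAL_HEADER.size)
    if len(raw) < GLOBAL_HEADER.size:
        raise ValueError("truncated pcap header")
    if GLOBAL_HEADER.unpack(raw)[0] != MAGIC:
        # byte-swapped and nanosecond files use other magic values
        raise ValueError("unsupported pcap variant")
    while True:
        raw = stream.read(RECORD_HEADER.size)
        if len(raw) < RECORD_HEADER.size:
            return                          # end of file
        ts_sec, ts_usec, incl_len, orig_len = RECORD_HEADER.unpack(raw)
        yield ts_sec + ts_usec / 1e6, stream.read(incl_len)

# Build a tiny two-packet capture in memory to exercise the reader
sample = io.BytesIO(
    GLOBAL_HEADER.pack(MAGIC, 2, 4, 0, 0, 65535, 1)
    + RECORD_HEADER.pack(10, 500000, 3, 3) + b"abc"
    + RECORD_HEADER.pack(11, 0, 2, 2) + b"hi")
packets = list(read_pcap(sample))
print(packets)
```

Decoding the packet contents (Ethernet, IP, ...) is the part that
takes the remaining hours, and is where the modules mentioned above
earn their keep.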

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
--
http://mail.python.org/mailman/listinfo/python-list


Re: A design problem I met again and again.

2009-04-02 Thread Jorgen Grahn
[top-posting fixed]

On Thu, 2 Apr 2009 08:02:23 -0700 (PDT), 一首诗 
newpt...@gmail.com wrote:
 On Apr 2, 5:58 am, Carl Banks pavlovevide...@gmail.com wrote:
 On Apr 1, 12:44 am, ?? newpt...@gmail.com wrote:

  I got the same problem when writing C#/C++ when I have to provide a
  lot of method to my code's user.  So I create a big class as the entry
  point of my code.  Although these big classes don't contain much
  logic,  they do grow bigger and bigger.

 This seems to be a classic result of code-based organization, that
 is, you are organizing your code according to how your functions are
 used.  That's appropriate sometimes.  Procedural libraries are often
 organized by grouping functions according to use.  The os module is a
 good example.

 However, it's usually much better to organize code according to what
 data it acts upon: data-based organization.  In other words, go
 though your big class and figure out what data belongs together
 conceptually, make a class for each conceptual set of data, then
 assign methods to classes based on what data the methods act upon.

 Consider the os module again.  It's a big collection of functions, but
 there is a group of functions in os that all act on a particular
 piece of data, namely a file descriptor.  This suggests that all the
 functions that act upon file descriptors (os.open, os.close, os.seek,
 etc.) could instead be methods of a single class, with the file
 descriptor as a class member.
...

 You get it.  Sometimes I feel that my head is trained to work in a
 procedural way.  I use a big class just as a container of functions.

If that's true, then your problems are not surprising.
A real class normally doesn't get that big.

 About the data-based approach, what if these functions all share a
 little data, e.g. a socket, but nothing else?

If that is true, then those functions *are* the Python socket class
and everything has already been done for you.

Turn your question around and it makes more sense (to me, at least).
You don't primarily work with functions: you work with data, a.k.a.
state, a.k.a. objects.  The functions follow from the data.

To me, if I can find something with a certain lifetime, a certain set
of invariants, and a suitable name and catchphrase describing it, then
that's probably a class. Then I keep my fingers crossed and hope it
works out reasonably well. If it doesn't, I try another approach.
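
To make that concrete with the file descriptor example quoted above,
here is a sketch of a class built around one piece of state. The class
name Descriptor is made up; it has a lifetime (open until closed) and
an invariant (self.fd is a valid descriptor or None):

```python
import os
import tempfile

class Descriptor:
    """Owns one OS-level file descriptor (illustrative wrapper only)."""

    def __init__(self, path, flags):
        self.fd = os.open(path, flags)

    def write(self, data):
        return os.write(self.fd, data)

    def seek(self, offset):
        return os.lseek(self.fd, offset, os.SEEK_SET)

    def read(self, count):
        return os.read(self.fd, count)

    def close(self):
        if self.fd is not None:
            os.close(self.fd)
            self.fd = None

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()

# Usage: the methods follow from the data they share
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with Descriptor(path, os.O_RDWR | os.O_CREAT) as d:
    d.write(b"hello")
    d.seek(0)
    data = d.read(5)
print(data)
```

Python's built-in file objects already do this job, of course; the
point is only the shape: data first, functions attached to it.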

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
--
http://mail.python.org/mailman/listinfo/python-list


Re: Problem while copying a file from a remote filer

2009-03-17 Thread Jorgen Grahn
On Sun, 15 Mar 2009 22:47:54 -0700, Chris Rebert c...@rebertia.com wrote:
 On Sun, Mar 15, 2009 at 10:24 PM, venutaurus...@gmail.com
 venutaurus...@gmail.com wrote:
 Hi all,
      I have to write an application which does a move and copy of a
 file from a remote machine to the local machine. I tried something
 like:

 file = urvenuwin2008\\C\\4Folders\\Folder02\\Folder002\
 \TextFile_06.txt

 The 'r' prefix on the string makes it a raw string, meaning you don't
 have to double up the backslashes, but you did so anyway, so your path
 has many extra backslashes, making it invalid. Dropping the 'r' prefix
 should fix the problem.

Also, the file isn't really remote if you can use the normal local
file system calls to read it.
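
The raw-string point quoted above is easy to check interactively. A
small sketch with a made-up path (server, share and file.txt are
placeholders, not the poster's real names):

```python
# Raw string with doubled backslashes: every backslash is kept
doubled_raw = r"\\\\server\\share\\file.txt"

# Two correct spellings of the same UNC-style path
plain = "\\\\server\\share\\file.txt"   # normal string, escapes doubled
raw = r"\\server\share\file.txt"        # raw string, written as-is

print(doubled_raw)
print(plain)
print(raw)
print(plain == raw, doubled_raw == raw)
```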

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
--
http://mail.python.org/mailman/listinfo/python-list


Re: download x bytes at a time over network

2009-03-17 Thread Jorgen Grahn
On Tue, 17 Mar 2009 13:38:31 +0530, Saurabh phoneth...@gmail.com wrote:
 Here's the reason behind wanting to get chunks at a time.
 I'm actually retrieving data from a list of RSS feeds and need to
 continuously check for latest posts.
 But I don't want to depend on the Last-Modified header or the <pubDate> tag
 in <channel>, because a lot of feeds just output date('now') instead
 of the actual last-updated timestamp.
 But when continuously checking for latest posts, I don't want to
 bombard other people's bandwidth - so I just want to get chunks of
 bytes at a time and internally check for <item>...</item> with my
 database against timestamp values.
 Is there a better way to achieve this ?

I don't know much about RSS, but one approach is "If they are too lazy
to provide the information which protects their bandwidth, they
deserve being bombarded." But they also deserve a polite mail telling
them that they have that problem.

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
--
http://mail.python.org/mailman/listinfo/python-list


Re: python book for a C programmer

2009-03-17 Thread Jorgen Grahn
On Fri, 13 Mar 2009 22:10:37 -0700 (PDT), Saurabh nirkh...@gmail.com wrote:
 Hi all,
 I am an experienced C programmer, I have done some perl code as well.
 But while thinking about large programs, I find perl syntax a
 hindrance.
 I read Eric Raymond's article regarding python
 (http://www.linuxjournal.com/article/3882).

I have a similar background, and I was pleased with this one:

%A Alex Martelli
%T Python in a nutshell
%I O'Reilly
%D 2003

Plus the excellent online docs.

/Jorgen

-- 
  // Jorgen Grahn grahn@Ph'nglui mglw'nafh Cthulhu
\X/ snipabacken.se  R'lyeh wgah'nagl fhtagn!
--
http://mail.python.org/mailman/listinfo/python-list

