On 2009-Aug-27 08:02:19 +0100, "Dr. David Kirkby" <[email protected]> 
wrote:
>Robert Bradshaw wrote:
>> Probably belongs in devel/sage/c_lib. Just make them functions that  
>> return longs (bytes is the most logical unit), and then we can make  
>> getusage invoke them directly.
>
>I was anticipating making this a stand alone executable to replace 
>'top', since 'top' is being called that way.

I think this is a really bad idea.  See the separate thread a few days
ago that discussed the performance difference between using pipe()s
(which is what you are proposing) and library calls.

> (I would tend to agree that 
>getting information from a library function is probably a better way to 
>do this, but I don't know enough python to do all this.)

It's not that difficult - have a look at how get_memory_usage() is
implemented in Darwin (as per the example I posted) for one approach.
It's possible that a selection of conditionally compiled code in
c_lib may be a better approach - but the underlying principle remains
the same.

>At the moment 'make test' is bringing my Solaris box to an almost 
>standstill, as 'top' is being called thousands of times per second.

Just calling fork() and exec() thousands of times a second will put a
significant load on a typical system.
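
As a rough illustration of the difference, the following Python sketch
times a trivial child process against an in-process library call.  It
is only a sketch under assumptions: the child is the Python
interpreter itself rather than 'top', resource.getrusage() merely
stands in for whatever c_lib function we end up with, and the helper
names (time_calls, via_subprocess, via_library) are made up for this
example.

```python
import resource
import subprocess
import sys
import time


def time_calls(func, count):
    """Return the wall-clock seconds taken to call func() count times."""
    start = time.perf_counter()
    for _ in range(count):
        func()
    return time.perf_counter() - start


def via_subprocess():
    # Spawns a trivial child process and waits for it; this stands in
    # for running 'top' (assumption: the real child would cost at least
    # as much as this one).
    subprocess.run([sys.executable, "-c", "pass"], check=True)


def via_library():
    # In-process query, no child process is created.  Note that the
    # units of ru_maxrss differ between platforms.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


if __name__ == "__main__":
    n = 20
    print("subprocess: %.4f s" % time_calls(via_subprocess, n))
    print("library:    %.4f s" % time_calls(via_library, n))
```

The absolute numbers will vary from system to system; the point is
only the relative cost of creating a process for every sample.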

>Note also the code at present in Sage assumes the data will be in MB, 
>which is what 'top' usually reports.

Actually, 'top' output varies between KB, MB and GB depending on the
process size.
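
If the 'top' output is parsed at all, the suffix has to be handled
explicitly.  A minimal Python sketch, assuming the size appears as a
number followed by an optional single-letter K, M or G suffix and that
the units are powers of 1024 (both are assumptions; the exact format
differs between 'top' implementations, and size_to_bytes is a
hypothetical helper name):

```python
# Multipliers for the suffixes this sketch understands (assumed 1024-based).
_UNITS = {"": 1, "B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def size_to_bytes(text):
    """Convert a size such as '812K', '35M' or '1.5G' to an integer byte count."""
    text = text.strip().upper()
    # Split the trailing unit letter (if any) from the numeric part.
    number, suffix = text, ""
    if text and text[-1].isalpha():
        number, suffix = text[:-1], text[-1]
    if suffix not in _UNITS:
        raise ValueError("unknown size suffix: %r" % suffix)
    return int(float(number) * _UNITS[suffix])
```

Returning bytes everywhere avoids the caller having to guess which
unit a particular process happened to be reported in.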

>If my code returns things like load averages, then clearly in that case 
>it will need to return an array of floating point numbers, not a 'long'.

Well, load averages will normally be extracted from the kernel as an
array of longs - but you definitely don't want to present the data
to Python as the pseudo-FP-in-a-long format that the kernel uses.
The first step is to decide what information we are returning and how
it should be presented to Python:  top() returns a string (which makes
it fairly useless for doing anything except displaying it to the user)
and get_memory_usage() returns MB as a double.
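
For the load averages, the conversion from the kernel's fixed-point
values to something Python can use directly is just a division by the
kernel's scale factor.  A sketch, assuming a scale factor of 2048
purely for illustration (real code would have to obtain the factor
from the OS, and loadavg_to_float is a hypothetical name):

```python
def loadavg_to_float(raw, fscale=2048):
    """Convert kernel fixed-point load averages to floats.

    raw    -- iterable of integers as read from the kernel
    fscale -- fixed-point scale factor (assumed value; query the OS
              in real code)
    """
    return [value / float(fscale) for value in raw]
```

With that scale factor, raw values of 2048, 1024 and 512 correspond to
load averages of 1.0, 0.5 and 0.25.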

Do we want a getusage() function that returns all the information in
an array, or a number of individual functions that each return one
piece (or closely related pieces) of information?  For the former,
should it be a vector (e.g. the n-th element is the 5-min load average)
or a hash (e.g. the 5-min load average has a key of 'LA5' or similar)?
(And BTW, not all OSs measure load averages over the same periods.)
Once that is decided, the most natural units for that information can
be determined.
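
To make the two shapes concrete, here is a Python sketch of the load
average part only.  The function names, the 'LA<n>' key format applied
to other periods and the 1/5/15-minute periods are assumptions for
illustration; os.getloadavg() is used only as a convenient source of
sample data.

```python
import os

# Averaging periods in minutes; an assumption, since they differ between OSs.
PERIODS = (1, 5, 15)


def load_as_vector(samples=None):
    """Positional form: the caller must know that index 1 is the 5-min figure."""
    if samples is None:
        samples = os.getloadavg()
    return tuple(float(s) for s in samples)


def load_as_hash(samples=None, periods=PERIODS):
    """Keyed form: each value is labelled with its averaging period."""
    vector = load_as_vector(samples)
    return {"LA%d" % p: v for p, v in zip(periods, vector)}
```

With the keyed form a caller asks for load_as_hash()['LA5'] and gets a
KeyError on a system that does not provide that period, whereas the
positional form silently returns whatever happens to be in that slot.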

-- 
Peter Jeremy
