Linux-Development-Sys Digest #9, Volume #7       Thu, 29 Jul 99 20:14:10 EDT

Contents:
  Re: Linux SCSI Performance Issues (Dimi Shahbaz)
  glibc 2.1.1 clock() problem (returning -ve number?) (Dr H. T. Leung)
  Re: Device Drivers Programming problems in linux (Eric Hegstrom)
  Re: SIGFPE delivery (Andreas Jaeger)
  Re: glibc 2.1.1 clock() problem (returning -ve number?) (Andreas Jaeger)
  Re: SIGFPE delivery (Paul Kimoto)
  Re: SIGFPE delivery (David H. Munro)
  Re: f90 compiling error
  Re: Linux on PS/2 MCA ESDI???? (Matija Nalis)
  Re: Writing shared libraries (Ulrich Weigand)
  Re: Linux Journal - worth or not? (Christopher Browne)

----------------------------------------------------------------------------

From: Dimi Shahbaz <[EMAIL PROTECTED]>
Crossposted-To: linux.dev.c-programming,linux.dev.kernel,linux.dev.scsi,comp.os.linux.hardware
Subject: Re: Linux SCSI Performance Issues
Date: Thu, 29 Jul 1999 13:28:30 -0700

> Dave Platt wrote:
>
> Here are some quick results for 100MB sequential disk reads
> on sg (2.2.10-ac10, advansys 940UW and 2 DCHS04U IBM disks):
>
> # time ./sg_dd512 if=/dev/sgb of=/dev/null count=200k
> 204800+0 records in
> 204800+0 records out
> 0.00user 1.23system 0:10.44elapsed 11%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (99major+27minor)pagefaults 0swaps
>
> # time ./sg_dd512 if=/dev/sgb of=/dev/null count=200k
> 204800+0 records in
> 204800+0 records out
> 0.02user 1.75system 0:12.87elapsed 13%CPU (0avgtext+0avgdata 0maxresident)k
> 0inputs+0outputs (99major+27minor)pagefaults 0swaps
>
> The first result was obtained without any other loading.
> The second result was obtained when a similar command
> was run on /dev/sga (which took 12.5 seconds elapsed).
> When command queueing was used (i.e. sgq_dd512 command)
> the coincident runs took 10.5 and 11.1 seconds each.
>
> Doug Gilbert

I got some interesting results with the sg_dd512 program:

time ./sg_dd512 if=/dev/sga of=/dev/null count=20k
20480+0 records in
20480+0 records out
0.00user 0.08system 0:03.52elapsed 2%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (88major+44minor)pagefaults 0swaps

time ./sg_dd512 if=/dev/sga of=/dev/null count=20k &
time ./sg_dd512 if=/dev/sgb of=/dev/null count=20k &
20480+0 records in
20480+0 records out
0.00user 0.11system 0:06.87elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (88major+44minor)pagefaults 0swaps
20480+0 records in
20480+0 records out
0.00user 0.12system 0:06.92elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (88major+44minor)pagefaults 0swaps

time ./sg_dd512 if=/dev/sga of=/dev/null count=20k &
time ./sg_dd512 if=/dev/sgb of=/dev/null count=20k &
time ./sg_dd512 if=/dev/sgc of=/dev/null count=20k &
20480+0 records in
20480+0 records out
0.00user 0.12system 0:07.43elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (88major+44minor)pagefaults 0swaps
20480+0 records in
20480+0 records out
0.00user 0.16system 0:10.17elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (88major+44minor)pagefaults 0swaps
20480+0 records in
20480+0 records out
0.00user 0.15system 0:10.43elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (88major+44minor)pagefaults 0swaps

Is this the sort of performance I'm supposed to get?  The elapsed time roughly
doubles going from 1 disk to 2 disks, and nearly triples going from 1 disk to
3 disks (btw, the 2- and 3-disk runs were launched simultaneously in the
background via a shell script).  This was run on a 2.2.10 kernel, on 3
identical 9GB SCSI disks.
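
Working the numbers may help frame the question. The sketch below converts
each run into a transfer rate, assuming 512-byte blocks (as the sg_dd512 name
and the "100MB for count=200k" figure in the quoted post suggest) and taking
the slowest run of each batch as the batch's elapsed time. The function names
are made up for illustration.

```c
#include <stdio.h>

/* Throughput in MiB/s for `blocks` 512-byte blocks read in `elapsed_s` seconds. */
double mib_per_sec(unsigned long blocks, double elapsed_s)
{
    const double bytes = (double)blocks * 512.0;
    return bytes / (1024.0 * 1024.0) / elapsed_s;
}

/* Print the single-disk rate and the aggregate rate of each parallel batch. */
void print_scsi_throughput(void)
{
    printf("1 disk : %.2f MiB/s\n", mib_per_sec(20480UL, 3.52));
    printf("2 disks: %.2f MiB/s aggregate\n", mib_per_sec(2 * 20480UL, 6.92));
    printf("3 disks: %.2f MiB/s aggregate\n", mib_per_sec(3 * 20480UL, 10.43));
}
```

All three cases land between roughly 2.8 and 2.9 MiB/s in total, so the
aggregate rate barely changes as disks are added. That would be consistent
with the transfers being serialized somewhere rather than overlapping, though
the timings alone do not say where. For comparison, the quoted single-disk
run (204800 blocks in 10.44 s) works out to about 9.6 MiB/s.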

Thank you,
Dimi Shahbaz


------------------------------

From: [EMAIL PROTECTED] (Dr H. T. Leung)
Subject: glibc 2.1.1 clock() problem (returning -ve number?)
Date: 29 Jul 1999 18:22:05 GMT


I have a big (long-running) program that outputs its CPU time usage continuously with
this routine; compiled with libc 5 it gives the correct output, but when compiled
and linked against glibc 2.1.1, 

GLIBCFLAGS = -b i686-linux-glibc21 -nostdinc \
 -I/usr/local/i686-linux-glibc21/include \
 -I/usr/local/lib/gcc-lib/i686-linux-glibc21/egcs-2.91.66/include 

CFLAGS = $(GLIBCFLAGS) -Wall -O3 -malign-double -fomit-frame-pointer 

it gives wrong times like this:
===============================================
# Time per run = 5.75937 
# Run number = 347 ;  Time since execution = 2061.7 s 
# Time per run = 5.79459 
# Run number = 348 ;  Time since execution = 2079.99 s 
# Time per run = 5.8305 
# Run number = 349 ;  Time since execution = 2098.7 s 
# Time per run = 5.8674 
# Run number = 350 ;  Time since execution = 2117.46 s 
# Time per run = 5.90424 
# Run number = 351 ;  Time since execution = 2135.48 s 
# Time per run = 5.93875 
# Run number = 352 ;  Time since execution = -2141.58 s 
# Time per run = -6.22888 
# Run number = 353 ;  Time since execution = -2123.7 s 
# Time per run = -6.16058 
# Run number = 354 ;  Time since execution = -2105.44 s 
# Time per run = -6.09159 
# Run number = 355 ;  Time since execution = -2087.21 s 
# Time per run = -6.02308 
# Run number = 356 ;  Time since execution = -2068.9 s 
===============================================

I am sure it is some silly thing I am doing, but I can't seem to figure out why it
does that. I think under libc 5 CLOCKS_PER_SEC is 100, while in glibc 2 it is much
bigger (1e6), but that still doesn't account for the change.

TIA.

=============================================================
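#include <stdio.h>
#include <time.h>

/* Print the CPU time used so far and the average CPU time per run. */
void write_time(unsigned long int run_number, double offset)
{
  double my_time ;
  /* clock() counts in units of 1/CLOCKS_PER_SEC seconds */
  my_time = ((double) clock()) / CLOCKS_PER_SEC ; 
  printf( "# Run number = %lu ; ", run_number) ; 
  printf( " Time since execution = %g s \n", my_time) ;
  printf( "# Time per run = %g \n", ( my_time - offset )/run_number) ;
  return ;
}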
=============================================================
-- 
          --------------------------------------------------
"What you don't care cannot hurt you."            Chap. 7a, AMS-NS

------------------------------

From: Eric Hegstrom <[EMAIL PROTECTED]>
Subject: Re: Device Drivers Programming problems in linux
Date: Thu, 29 Jul 1999 12:53:58 -0700

Ivo:
   Sounds like a great class!
   What exactly is the error message from insmod? You are logged in as
root, right? (I still forget that sometimes.) There is a great book on
writing Linux device drivers called, oddly enough, "Linux Device
Drivers" by Alessandro Rubini. It is published by O'Reilly and
Associates, Inc. (www.ora.com, I think). It's something like $30 US (the
ISBN is 1-56592-292-1). It is oriented toward the version of the
kernel used in RedHat 5.2 (there are some changes if you move up to
6.0).

The examples out of the book are available via ftp:
ftp://ftp.ora.com/published/oreilly/linux/drivers/

Their basic character driver example is called scull (Simple Character
Utility for Loading Localities).
I am sure that if you give us some more info about your exact problem,
we can help.

Peace,
Eric

Ivo wrote:
> 
> Dear reader,
> 
> At school we need to make a character device driver for linux.
> I already know how to program the program itself (the driver), but can't seem
> to run it with insmod. I talked with my teacher, who is unfortunately on
> vacation now. And he said something about recompiling the kernel with the new
> code added to it in some way (mem.c ?).....
> Anyway, I have been searching the internet for a while now in search of people
> who have built drivers for linux or know how to do it....all help is welcome.
> Oh I use redhat 5.2 (we need to for school).
> 
> Thanks in advance
> 
> Ivo Klerkx
> 
> e-mail: [EMAIL PROTECTED]

-- 
Eric Hegstrom                          .~.
Senior Software Engineer               /V\  
Sonoran Scanners, Inc.                // \\          L I N U X
[EMAIL PROTECTED]        /(   )\  >don't fear the penguin<
520-617-0072 x402                     ^^-^^

------------------------------

From: Andreas Jaeger <[EMAIL PROTECTED]>
Subject: Re: SIGFPE delivery
Date: 29 Jul 1999 21:47:34 +0200

>>>>> David H Munro writes:

 > [...]

 > Here's my immediate problem:

 > A recent GNU libc release withdrew the __setfpucw function (or, more
 > correctly, made it a static function only callable by their own crt0
 > code!).  This reduces anyone who wants SIGFPE delivered to duplicating
 > the assembly code used by __setfpucw -- making Linux the only OS
 > requiring assembly code to perform this important function.  (The only
 > exception is the alpha Linux platform, which has adopted Digital's
 > ieee_set_fp_control() function.  See fpuset.c below.)

 > I was told by Cygnus support that the new fenv.h interface replaces
 > the __setfpucw functionality.  I have read the C9X draft standard
 > describing fenv.h very carefully, and it provides no means for setting
 > the FPE mask bits -- you can use the fenv.h interface only to set,
 > query, and test the bits which determine whether an exception
 > condition has occurred.  Numerical software requires that the FPU
 > actually generate an interrupt; it doesn't do any good to know
 > something happened long afterward.  I also spoke with a Livermore
 > representative on the C9X committee, who told me that delivery or
 > non-delivery of any signal is not regarded as a C language issue (by
 > way of an excuse for the fenv.h interface not providing this crucial
 > functionality).  Unless GNU libc extends the fenv.h interface, it is
 > useless.

 > I suspect that the reason __setfpucw was withdrawn has to do with the
 > proliferation of Linux architectures.  Unfortunately, the absence of
 > any standard to replace it makes my programming job substantially
 > harder: I only have access to Intel Linux machines, so I have a hard
 > time testing code for my favorite OS anywhere else.  If any of you can
 > help me by building, testing, and correcting the enclosed test program
 > on non-Intel Linux platforms, I would be extremely grateful.

 > Any immediate help with the test program, or hope for a more rational
 > interface in the future will be deeply appreciated.

With:
#include <fenv.h>
void u_fpu_setup(void)
{
  fesetenv (FE_NOMASK_ENV);
}

and compiling with -D_GNU_SOURCE -lm I get on ix86:
$ ./fputest  
SIGFPE improperly generated on underflow

FE_NOMASK_ENV is documented in the glibc manual.

glibc 2.1 comes with a test program for fenv (math/test-fenv.c), it
might help you.

Andreas
-- 
 Andreas Jaeger   [EMAIL PROTECTED]    [EMAIL PROTECTED]
  for pgp-key finger [EMAIL PROTECTED]

------------------------------

From: Andreas Jaeger <[EMAIL PROTECTED]>
Subject: Re: glibc 2.1.1 clock() problem (returning -ve number?)
Date: 29 Jul 1999 21:15:19 +0200

>>>>> H T Leung writes:

 > I have a big (long-running) program that outputs its CPU time usage
 > continuously with this routine; compiled with libc 5 it gives the
 > correct output, but when compiled and linked against glibc 2.1.1,
 > [..]
 > I am sure it is some silly thing I do, but I can't seem to figure
 > out why it does that. I think under libc 5 CLOCKS_PER_SEC is 100,
 > while in Glibc 2 it is much bigger(1e6), but it still doesn't
 > account for the change.

Do a bit of mathematics. clock_t is 32 bits on ix86, and CLOCKS_PER_SEC is
required by POSIX to be 1e6; this leads to a wraparound of clock_t
after 71 minutes.  Everything is working as expected - only your
expectations aren't right ;-)

There are better functions for longer running processes like
getrusage.
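
The arithmetic can be checked directly. A signed 32-bit clock_t counting 1e6
ticks per second passes 2^31 after about 2147 s (roughly 36 minutes) and
goes negative, which matches the sign flip between runs 351 and 352 in the
original post; the full 2^32 cycle is the 71 minutes. Below is a minimal
sketch of the getrusage() alternative. The helper names are made up for
illustration, and it assumes that user plus system time is what is wanted.

```c
#include <sys/time.h>
#include <sys/resource.h>

/* Seconds until a signed 32-bit tick counter goes negative (2^31 ticks). */
double clock_sign_flip_seconds(double ticks_per_sec)
{
    return 2147483648.0 / ticks_per_sec;
}

/* CPU time (user + system) used by this process, in seconds; -1.0 on error. */
double cpu_seconds(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1.0;
    return (double)(ru.ru_utime.tv_sec + ru.ru_stime.tv_sec)
         + (double)(ru.ru_utime.tv_usec + ru.ru_stime.tv_usec) / 1e6;
}
```

In write_time(), replacing ((double) clock()) / CLOCKS_PER_SEC with
cpu_seconds() should avoid the wrap, because struct timeval keeps seconds
and microseconds in separate fields instead of one tick counter.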

Andreas
-- 
 Andreas Jaeger   [EMAIL PROTECTED]    [EMAIL PROTECTED]
  for pgp-key finger [EMAIL PROTECTED]

------------------------------

From: [EMAIL PROTECTED] (Paul Kimoto)
Subject: Re: SIGFPE delivery
Date: 29 Jul 1999 17:26:01 -0500
Reply-To: [EMAIL PROTECTED]

In article <[EMAIL PROTECTED]>, David H. Munro wrote:
> most FPUs require
> detailed bits to be set to tell precisely which FPEs will generate
> signals: The possibilities are zero divide, invalid operation,
> overflow, underflow, inexact result, and (for some FPUs) denormalized
> operand.  However, the vast majority of numerical applications care
> about the first three and want the others ignored, so delivery of zero
> divide, invalid operation, and overflow, and non-delivery of all the
> rest is a very obvious default behavior for programs registering a
> SIGFPE handler.

Do you know of Bill Metzenthen's work (for x86 only)?
  http://www.linuxsupportline.com/~billm/

If you do, can you comment on it?

-- 
Paul Kimoto             <[EMAIL PROTECTED]>

------------------------------

From: [EMAIL PROTECTED] (David H. Munro)
Subject: Re: SIGFPE delivery
Date: 28 Jul 1999 21:32:48 -0700
Reply-To: [EMAIL PROTECTED]


Andreas Jaeger <[EMAIL PROTECTED]> writes:

> With:
> #include <fenv.h>
> void u_fpu_setup(void)
> {
>   fesetenv (FE_NOMASK_ENV);
> }
> 
> and compiling with -D_GNU_SOURCE -lm I get on ix86:
> $ ./fputest  
> SIGFPE improperly generated on underflow
> 
> FE_NOMASK_ENV is documented in the glibc manual.
> 
> glibc 2.1 comes with a test program for fenv (math/test-fenv.c), it
> might help you.

Thank you very much for reminding me about the glibc FE_NOMASK_ENV
extension to the C9X fenv.h interface.

However, as you can see, it does not work properly:

> $ ./fputest  
> SIGFPE improperly generated on underflow

Unless they add a second extension, perhaps FE_NUMERICAL_ENV, the
FE_NOMASK_ENV is worthless for numerical applications.  Many common
algorithms (including fft and some matrix solvers) commonly generate
very large numbers of underflows; the performance hit for interrupting
on underflows is totally unacceptable.

Incidentally, some FPU hardware also requires that a "rapid underflow"
bit be set in order to get high performance in numerical
applications.  If this bit is not set on those platforms, many
underflows are handled as interrupts by the operating system with a
huge penalty in speed (see the note under the FPU_GCC_SPARC branch of
my fpuset.c code).  FE_NUMERICAL_ENV would need to set that bit as
well as unmasking the correct subset of the exceptions.

Once again, numerical applications need to interrupt on zero divide,
overflow, and invalid operation, but *not* interrupt on any other
conditions, which represent normal operation of many important
algorithms.  I was apparently incorrect to say that this default
behavior is "obvious".

Note that the fenv.h interface fesetenv() parameter can only be a
predefined macro (such as FE_NOMASK_ENV) or the return from a previous
fegetenv() call.  Therefore, there is no way for the interested
programmer to set the exception mask correctly -- only the fenv.h
implementors can do that, and they haven't.  As I said before, the
fenv.h interface cannot replace the old __setfpucw() functionality
without considerable extension.
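
For the x87 FPU specifically, the mask described above amounts to clearing
three bits of the control word and leaving the other three set. The
following is only a sketch of that idea, assuming the standard x87
control-word layout and GCC-style inline assembly. It is not the fpuset.c
code (which is not reproduced in this digest), the names numerical_cw and
u_fpu_setup_x87 are hypothetical, and it does not touch the separate SSE
control register.

```c
/* x87 control-word exception mask bits (a set bit means "masked"). */
#define X87_IM 0x01  /* invalid operation */
#define X87_DM 0x02  /* denormalized operand */
#define X87_ZM 0x04  /* zero divide */
#define X87_OM 0x08  /* overflow */
#define X87_UM 0x10  /* underflow */
#define X87_PM 0x20  /* inexact result (precision) */

/* Pure helper: unmask invalid, zero divide and overflow; mask the rest. */
unsigned short numerical_cw(unsigned short cw)
{
    cw |= (X87_DM | X87_UM | X87_PM);
    cw &= (unsigned short)~(X87_IM | X87_ZM | X87_OM);
    return cw;
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Load the adjusted control word into the x87 FPU (x87 only, not SSE). */
void u_fpu_setup_x87(void)
{
    unsigned short cw;

    __asm__ __volatile__("fnstcw %0" : "=m"(cw));   /* read current word */
    cw = numerical_cw(cw);
    __asm__ __volatile__("fnclex");                 /* drop stale exception flags */
    __asm__ __volatile__("fldcw %0" : : "m"(cw));   /* install new word */
}
#endif
```

Starting from the usual default control word 0x037F, numerical_cw() yields
0x0372: precision and rounding fields are untouched, and only the three
exceptions of interest can raise SIGFPE.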

Dave Munro

------------------------------

Crossposted-To: comp.lang.fortran,comp.os.linux.misc,comp.os.linux.setup,comp.os.linux.development.apps
Subject: Re: f90 compiling error
From: <[EMAIL PROTECTED]>
Date: Thu, 29 Jul 1999 21:07:12 GMT

Kamran Mohseni <[EMAIL PROTECTED]> writes:

> I have a red hat linux 6 on my machine.

Version 6.0 of RedHat is based on GNU libc version 2.1.
You may want to read what http://www.nag.co.uk/ has to say about
the need to compile and link against the libc5 libraries and
include files. (Actually, linking against GNU libc 2.0 often
seems to work, but 2.1 has more drastic incompatibilities with libc5.)
Their solution is worded in terms of the f95 compiler, but you should
be able to transpose it to f90 2.2 with little or no difficulty.

> I recently installed f90 Version 2.2(260) compiler from NAG.
[...]
>  Kami > make -f Makeinit
> f90   -o initdns initdns.o /home/mohseni/mohseni/Br/FFTLibHome/libgpfa.a
> 
> /usr/local/lib/f90/libf90.a(open.o): In function `__NAGf90_open':
> open.o(.text+0xc77): undefined reference to `_fxstat'

You could probably get away with mapping _fxstat to __fxstat, but then
you'll stumble into the __setfpucw business; and if you work around
that, who knows what subtle problem will bite you next. Oh, I know:
there are the changes to stdio that will force me to recompile
libpgplot myself instead of copying it over from Debian 2.1 :-(

Definitely install the libc5 development environment and use that.

------------------------------

From: [EMAIL PROTECTED] (Matija Nalis)
Subject: Re: Linux on PS/2 MCA ESDI????
Date: 29 Jul 1999 22:33:31 GMT

On 21 Jul 1999 07:49:28 GMT, Jan Andres <[EMAIL PROTECTED]> wrote:
>In article <[EMAIL PROTECTED]>, Roy Grimm wrote:
>>User Bsdbob BSD Bob wrote:
>>> Anyone have a pointer to a Linux that is known to boot and run
>>> directly on an IBM PS/2 Model 80, 10mb ram, 2 x 315mb ESDI drives?
>>> If so, I would like the pointer to it.
>
>Yes, it supports Microchannel. But I don't know if it supports ESDI,
>whatever this might be. :-)

HDD interface. Like IDE, SCSI, etc.

Yes, it works nice. I'm running it on IBM Thinkpad 720 notebook, MCA, 120MB
ESDI disks. Any 2.2 kernel should do fine.

-- 
Opinions above are GNU-copylefted.

------------------------------

From: [EMAIL PROTECTED] (Ulrich Weigand)
Subject: Re: Writing shared libraries
Date: 30 Jul 1999 01:28:18 +0200

Graffiti <[EMAIL PROTECTED]> writes:
>In article <[EMAIL PROTECTED]>,
>Paul D. Smith <[EMAIL PROTECTED]> wrote:
>>My understanding is that it affects all calls referencing code in the
>>shared library, whether they originate from within the lib or not.  The
>>thing is that the code in a shared library is generated as PIC (position
>>independent).  That means all references in the shared lib are
>>essentially offsets.  So, whenever you want to access data or functions
>>in the shared lib you have to do an extra calculation to add the offset
>>to whatever fixed value was used to load the shared lib in this run of
>>the app.

>Yes, this is all resolved when the binary is first run and loaded into
>the system's memory.  It's not resolved at run-time while the app is
>running (i.e. enters main()) unless you explicitly tell it to via
>the dlsym() functions.

No.  Code that is supposed to run in a dynamic library must be compiled
as position-independent code (using the -fPIC switch in gcc), because
you never know at what absolute address the library is going to be 
loaded.  A statically linked executable is loaded always at the fixed
base address determined by the linker.  This means that e.g. accessing
global variables can be done simply by using the well-known absolute
addresses, like so:

             movl %eax, 0xdeadbeef

(which would store the contents of the %eax register into the global
variable at the memory address 0xdeadbeef.)

A dynamic library, however, can be loaded at varying base addresses,
because the base address that the linker intended might already be
occupied by *another* dynamic library -- the linker cannot foresee
which other libraries might happen to be linked together with any
of the potential users of this library ...

Hence, you tell the compiler that it must not generate code that contains
*any* absolute address, even to private variables.  This is possible,
as the Intel assembly language contains some instructions that construct
an address *relative* to the current EIP (instruction pointer).  These
instructions are jumps and calls.  A typical PIC sequence looks like this:

             call +0
             pop  %ebx
             addl $-1234, %ebx
             ...
             movl %eax, 1111(%ebx)

Note that the 'call' jumps to the current EIP +0, i.e. it jumps to the
line which would have been executed next anyway.  But, a 'call' always
pushes the return address onto the stack.  Instead of returning to this
address, the address is popped off the stack into the %ebx register
next.  Finally, a certain constant is added to that value. This constant
was determined at link time as the difference in addresses between the
line 'pop %ebx' and a certain data structure in the local module (the
so-called 'global offset table').  This difference can be computed at
link-time, as all addresses within a single module are shifted by the
same offset if the library is loaded at a different base address.

The end result of these contortions is that the %ebx register now
contains the correct absolute address of the global offset table,
although this address is unknown at link-time.  In the subsequent
code, global variables are found at fixed offsets from that reference
point.

It should be obvious that this PIC code is less efficient than the
position-dependent version;  not only is the code bigger, it also
takes longer to execute  (one sometimes overlooked, but rather crucial,
loss of efficiency results also from the fact that now the %ebx 
register is no longer available to the optimizing compiler to store
temporary values; as the Intel processor has only extremely few
registers to start with, this can lead to significant loss ...).

So, the efficiency loss of up to 10% that was reported has nothing
to do with shared libraries as such, but with the use of
position-independent code.  (The loss is smaller if global variables are
used sparingly, however ...)
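
One way to see the difference first-hand is to compile a tiny file both
ways and compare the assembly. The file below is a hypothetical example
(the names pic_demo.c, call_count and bump_counter are invented here). On a
32-bit x86 target, "gcc -O2 -S pic_demo.c" and "gcc -O2 -fPIC -S pic_demo.c"
should produce an absolute-address store in the first case and a
register-relative one in the second; the exact instructions depend on the
compiler version.

```c
/* pic_demo.c: one variable with static storage and a function that updates it. */
static int call_count;

/* Increment the counter and return its new value. */
int bump_counter(void)
{
    call_count += 1;   /* the store whose addressing mode differs with -fPIC */
    return call_count;
}
```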

By the way, Windows solves this problem in a rather different way:
here, the compiler generates normal code containing absolute addresses.
In addition, it creates a list of all positions in the code where such
an absolute address is part of an instruction.  The module loader (part
of the OS) then consults this table and updates all these locations
when loading the executable, so as to point to the actual base address.
This can result in inefficient use of memory, however, because if the
same library happens to be mapped into two processes at *different*
base addresses, the code cannot be shared any longer, but each of the
processes must get its own copy in memory ...
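
The load-time patching described above can be modelled in a few lines. This
is a toy sketch only, not the real Windows loader or its file format: it
assumes 32-bit absolute addresses embedded in a byte image plus a list of
the offsets where they occur, and the names apply_fixups and address_at are
invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Shift every embedded absolute address by (actual_base - preferred_base).
 * `fixups` lists the byte offsets in `image` where a 32-bit address lives.
 */
void apply_fixups(uint8_t *image, const size_t *fixups, size_t nfixups,
                  uint32_t preferred_base, uint32_t actual_base)
{
    const uint32_t delta = actual_base - preferred_base;
    size_t i;

    for (i = 0; i < nfixups; i++) {
        uint32_t addr;
        memcpy(&addr, image + fixups[i], sizeof addr);   /* unaligned-safe read */
        addr += delta;
        memcpy(image + fixups[i], &addr, sizeof addr);   /* patch in place */
    }
}

/* Read back the 32-bit address stored at `offset` (helper for inspection). */
uint32_t address_at(const uint8_t *image, size_t offset)
{
    uint32_t addr;
    memcpy(&addr, image + offset, sizeof addr);
    return addr;
}
```

Because the patch rewrites the code bytes themselves, two processes that
load the library at different bases end up with different images, which is
the sharing problem mentioned above.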


-- 
  Ulrich Weigand,
  IMMD 1, Universitaet Erlangen-Nuernberg,
  Martensstr. 3, D-91058 Erlangen, Phone: +49 9131 85-7688

------------------------------

From: [EMAIL PROTECTED] (Christopher Browne)
Crossposted-To: comp.os.linux.development.apps
Subject: Re: Linux Journal - worth or not?
Reply-To: [EMAIL PROTECTED]
Date: Fri, 30 Jul 1999 00:06:07 GMT

On Fri, 30 Jul 1999 01:11:17 +0800, Bonn <[EMAIL PROTECTED]> wrote:
>i started using linux for a while.  is it worthwhile to subscribe to the
>'Linux Journal'?
>any comment?

I've been subscribing for about 3 years now.

It's not always *tremendously* worthwhile, but there are almost always
some interesting articles, and usually something illuminating.
-- 
"Few people are capable of expressing with equanimity opinions which
differ from the prejudices of their social environment. Most people are
even incapable of forming such opinions." (Albert Einstein)
[EMAIL PROTECTED] <http://www.ntlug.org/~cbbrowne/linux.html>

------------------------------


** FOR YOUR REFERENCE **

The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:

    Internet: [EMAIL PROTECTED]

You can send mail to the entire list (and comp.os.linux.development.system) via:

    Internet: [EMAIL PROTECTED]

Linux may be obtained via one of these FTP sites:
    ftp.funet.fi                                pub/Linux
    tsx-11.mit.edu                              pub/linux
    sunsite.unc.edu                             pub/Linux

End of Linux-Development-System Digest
******************************
