Linux-Development-Sys Digest #331, Volume #8      Thu, 7 Dec 00 23:13:10 EST

Contents:
  ramdisk: how much mem do they use? (Jerome Corre)
  Re: A faster memcpy and bzero for x86 ("Ken Whaley")
  Re: A faster memcpy and bzero for x86 (Erik Hensema)
  Re: A faster memcpy and bzero for x86 (Linus Torvalds)
  Re: kernel header problems ("Joshua Schaeffer")
  Re: A faster memcpy and bzero for x86 ([EMAIL PROTECTED])
  Re: kernel header problems ([EMAIL PROTECTED])
  Re: ramdisk: how much mem do they use? ([EMAIL PROTECTED])
  Re: kernel header problems (Kaz Kylheku)
  Re: this sucks! ([EMAIL PROTECTED])
  PLEASE - kernel / BIOS wrt hd DMA ("Guennadi V. Liakhovetski")
  Re: Problem getting ip_forwarding to work ("David K. Means")
  Intel's SSE block copy paper (was: Re: A faster memcpy and bzero for x86) ("Ken Whaley")
  Write software modem for linux ([EMAIL PROTECTED])
  Re: PLEASE - kernel / BIOS wrt hd DMA (J Wendel)

----------------------------------------------------------------------------

From: Jerome Corre <[EMAIL PROTECTED]>
Subject: ramdisk: how much mem do they use?
Date: Thu, 07 Dec 2000 18:03:15 GMT

 hi

I have 32MB of RAM on my Linux box. I sometimes use ramdisks (/dev/ram0 or
/dev/ram1) to store a small filesystem, to prepare a boot disk, or when my
system boots using initrd.
I would like to know whether it is possible to find out how much memory the
ramdisks are using. I suppose they don't all use 4MB, as there are 16
ramdisks I can use and I only have 32MB of RAM. How can I find out how
much memory they use? And if I don't need what is left on one of them
after I unmount it, how can I free the memory that ramdisk uses?

thanks for any help

--
Jerome Corre



------------------------------

From: "Ken Whaley" <[EMAIL PROTECTED]>
Subject: Re: A faster memcpy and bzero for x86
Date: Thu, 7 Dec 2000 11:12:04 -0800

If you want to get very CPU implementation-specific, you can certainly
write a better memcpy and bzero.  The problems are: 1) it's very
implementation-specific! Changing parameters like the size of the L1
cache can drastically affect performance.  2) They can have better or worse
performance depending on the program context: is the data touched
before/after the memcpy/bzero?  E.g., Pentium III can bzero large areas
really fast using the SSE instructions (64-bit non-cached writes), but if you
want to touch the data after the bzero, then you may want to use cached writes.
If the data's in the L1 cache, then you *don't* want to use uncached writes
(you may get data corruption, depending on the details of the CPU...)

If you have a special case in your application, then go ahead and code
up routines that fit your special case.  That's what optimization
is all about: tailoring implementation to your specific needs.   The
general system routines are just that: general, with average performance
across the majority of different applications, and where portability is
also a very important issue.

"Grant Edwards" <[EMAIL PROTECTED]> wrote in message
news:3gOX5.3201$[EMAIL PROTECTED]...
> In article <[EMAIL PROTECTED]>, Johan Kullstam wrote:
>
> [regarding use of FPU to do memory move/fill]
>
> >> Now I'm just wondering why Linux doesn't already have these
> >> optimizations.
> >
> >you can't just clobber the FPU regs, something else might be using
> >them.  the kernel cannot use FPU without saving and restoring them
> >since they could be used in a userspace program (save/restore FPU every
> >time you enter the kernel is too expensive, FPU is only save/restore'd
> >at userspace program context switches).
>
> Are we talking about memcpy and bzero in glibc?  In that case
> it's not the kernel...
>
> --
> Grant Edwards                   grante             Yow!  ... Um...Um...
>                                   at
>                                visi.com



------------------------------

From: [EMAIL PROTECTED] (Erik Hensema)
Subject: Re: A faster memcpy and bzero for x86
Date: Thu, 7 Dec 2000 19:21:59 +0100
Reply-To: [EMAIL PROTECTED]

Boris Gjenero ([EMAIL PROTECTED]) wrote:
>Today I tried to see if I can make a memcpy that's faster than the
>normal one.  I quickly found out that if I just used processor
>instructions I couldn't improve things significantly.  However, if I
>used the math coprocessor fildq / fistpq (load/store 64 bit integer)
>instructions, that did improve performance.

[...]

>Now I'm just wondering why Linux doesn't already have these
>optimizations.

Because you're using the coprocessor, your algorithm is likely to be slower
on small amounts of data (memcpy on 16 bytes, for example).
When most operations are done on small amounts of data, your algorithm may
be slower than the normal algorithm.

Maybe you can write wrappers around memcpy() and bzero() and profile some
real-life applications. Record what amounts of data are involved and what
amount of time is spent inside the routines (the regular ones, and the ones
optimised by you).
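
A minimal sketch of such a wrapper, in C, is shown below. It only records
how often each size range is requested and how many bytes it moved; timing
is left out. The names profiled_memcpy, bucket_for, bucket_calls and
bucket_bytes, and the bucket boundaries, are hypothetical choices made for
this illustration and do not come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size buckets: <=16, <=64, <=256, <=1024, <=4096, larger. */
#define NBUCKETS 6
static const size_t bucket_limit[NBUCKETS - 1] = { 16, 64, 256, 1024, 4096 };

static unsigned long      bucket_calls[NBUCKETS];  /* calls per bucket */
static unsigned long long bucket_bytes[NBUCKETS];  /* bytes per bucket */

/* Map a copy length to the index of the bucket that counts it. */
static int bucket_for(size_t n)
{
    int i;

    for (i = 0; i < NBUCKETS - 1; i++)
        if (n <= bucket_limit[i])
            return i;
    return NBUCKETS - 1;
}

/* Record the request size, then hand the work to the regular memcpy(). */
static void *profiled_memcpy(void *dst, const void *src, size_t n)
{
    int b = bucket_for(n);

    assert(b >= 0 && b < NBUCKETS);
    bucket_calls[b]++;
    bucket_bytes[b] += n;
    return memcpy(dst, src, n);
}
```

Calling profiled_memcpy() in place of memcpy() in an application and
dumping the two arrays at exit would show whether small or large copies
dominate, which is the data the suggestion above asks for.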

-- 
Erik Hensema ([EMAIL PROTECTED])
This signature is generated by siggen.pl v0.1
Available soon at http://www.xs4all.nl/~hensema

------------------------------

From: [EMAIL PROTECTED] (Linus Torvalds)
Subject: Re: A faster memcpy and bzero for x86
Date: 7 Dec 2000 12:03:53 -0800

In article <3gOX5.3201$[EMAIL PROTECTED]>,
Grant Edwards <[EMAIL PROTECTED]> wrote:
>In article <[EMAIL PROTECTED]>, Johan Kullstam wrote:
>
>[regarding use of FPU to do memory move/fill]
>
>>> Now I'm just wondering why Linux doesn't already have these
>>> optimizations.
>>
>>you can't just clobber the FPU regs, something else might be using
>>them.  the kernel cannot use FPU without saving and restoring them
>>since they could be used in a userspace program (save/restore FPU every
>>time you enter the kernel is too expensive, FPU is only save/restore'd
>>at userspace program context switches).
>
>Are we talking about memcpy and bzero in glibc?  In that case
>it's not the kernel...

Note that even in user space memcpy() using MMX registers is NOT
necessarily a good idea at all.

Why?

It looks damn good in benchmarks. Especially for large memory areas that
are not in the cache.

But it tends to have horrible side-effects.  Like the fact that when
multiple processes (or threads) are running, it means that the FP state
has to be switched all the time.  Normally we can avoid this overhead,
because most programs do not actually tend to use the FP unit very much,
so with some simple lazy FP switching we can make thread and process
switches much faster. 

Using the FPU or MMX for memcpy makes that go away completely.  Suddenly
you get slower task switching, and people will blame the kernel.  Even
though the _real_ bug is an optimization that looks very good on
benchmarks, but does not necessarily actually win all that much in real
life. 

Basically, you should almost never use the i387 for memcpy(), unless you
know you can get it for free (ie you're already using the FPU). An i387
state save/restore is expensive. It's expensive even in user mode where
you don't do it explicitly, but the kernel does it for you.

The MMX stuff is similar. Only use it if you already know you're using
the MMX unit.  Because otherwise you _will_ slow the system down.

NOTE! If you absolutely want to do it anyway, make sure that the size
cutoff is large. It definitely is not worth a few FPU task switches to
do small memcpy's. But for really large memcpy's you might consider it
(ie if size is noticeably larger than a few kilobytes). Use regular
integer stuff for smaller areas.
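
The size cutoff described above can be sketched in C roughly as follows.
This is an illustration only: BIG_COPY_CUTOFF, big_copy, cutoff_memcpy and
big_path_calls are hypothetical names, the 16 KB threshold is an assumed
value that would have to be measured, and big_copy() is a stand-in that
simply calls memcpy() where an FPU/MMX loop would go.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed threshold: "noticeably larger than a few kilobytes". */
#define BIG_COPY_CUTOFF (16 * 1024)

static unsigned long big_path_calls;  /* how often the large path was taken */

/* Placeholder for an FPU/MMX copy loop; here it just calls memcpy(). */
static void *big_copy(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    big_path_calls++;
    return memcpy(dst, src, n);
}

/* Small copies stay on the regular integer path; only large ones switch. */
static void *cutoff_memcpy(void *dst, const void *src, size_t n)
{
    if (n < BIG_COPY_CUTOFF)
        return memcpy(dst, src, n);
    return big_copy(dst, src, n);
}
```

The counter makes it easy to check in a real application how rarely the
large path is actually reached.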

And it's insidious.  When benchmarking this thing, you usually (a) don't
have any other programs running and (b) even if you do, they haven't
been converted to using FPU memcpy yet anyway, so you'd see only half of
the true cost anyway. 

                        Linus

------------------------------

From: "Joshua Schaeffer" <[EMAIL PROTECTED]>
Subject: Re: kernel header problems
Date: Thu, 07 Dec 2000 20:58:08 GMT

What in the world is the sense of having a separate size_t type?



------------------------------

From: [EMAIL PROTECTED]
Subject: Re: A faster memcpy and bzero for x86
Date: Thu, 07 Dec 2000 21:10:59 -0000

On Thu, 7 Dec 2000 19:21:59 +0100 Erik Hensema <[EMAIL PROTECTED]> wrote:

| Because you're using the coprocessor, your algorithm is likely to be slower
| on small amounts of data (memcpy on 16 bytes, for example).
| When most operations are done on small amounts of data, your algorithm may
| be slower than the normal algorithm.
|
| Maybe you can write wrappers around memcpy() and bzero() and profile some
| real-life applications. Record what amounts of data are involved and what
| amount of time is spent inside the routines (the regular ones, and the ones
| optimised by you).

Or call them by different names and make them optional to applications.
Perhaps big_fpu_memcpy() and big_fpu_bzero().

-- 
=================================================================
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://phil.ipal.org/     |
=================================================================

------------------------------

From: [EMAIL PROTECTED]
Subject: Re: kernel header problems
Date: Thu, 07 Dec 2000 21:24:48 GMT

Joshua Schaeffer <[EMAIL PROTECTED]> wrote:
> What in the world is the sense of having a separate size_t type?

Symbolic types are mandated in many cases by the posix api in order to
allow it to be efficiently implemented on various systems, and to
avoid confusion. For example, in BSD 4.2 file modes were unsigned
shorts, in SysV V.3 they were int. Settling on mode_t allows code to
compile on either without warnings.

As to size_t and ssize_t, ssize_t was added to represent either a size
in bytes or an error code. That is, it's a signed size_t. 
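
As a small illustration of that "size or error" convention, here is a
sketch in C built on the POSIX read() call, which returns ssize_t. The
helper name read_fully is hypothetical and was made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read until cap bytes have arrived or end of file is reached.
 * Returns the number of bytes read (>= 0), or -1 with errno set.
 */
static ssize_t read_fully(int fd, void *buf, size_t cap)
{
    size_t got = 0;

    assert(buf != NULL);
    while (got < cap) {
        ssize_t n = read(fd, (char *)buf + got, cap - got);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error: the negative value says so */
        }
        if (n == 0)
            break;              /* end of file */
        got += (size_t)n;       /* n is positive here, so the cast is safe */
    }
    return (ssize_t)got;
}
```

The byte count is kept in a size_t while it can only grow, and the signed
type is used only where a failure has to be reported.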

-- 
Matt Gauthier <[EMAIL PROTECTED]>

------------------------------

From: [EMAIL PROTECTED]
Subject: Re: ramdisk: how much mem do they use?
Date: Thu, 07 Dec 2000 21:28:44 -0000

On Thu, 07 Dec 2000 18:03:15 GMT Jerome Corre <[EMAIL PROTECTED]> wrote:

| I have 32MB of RAM on my Linux box. I sometimes use ramdisks (/dev/ram0 or
| /dev/ram1) to store a small filesystem, to prepare a boot disk, or when my
| system boots using initrd.
| I would like to know whether it is possible to find out how much memory the
| ramdisks are using. I suppose they don't all use 4MB, as there are 16
| ramdisks I can use and I only have 32MB of RAM. How can I find out how
| much memory they use? And if I don't need what is left on one of them
| after I unmount it, how can I free the memory that ramdisk uses?

Take a look in the kernel source at drivers/block/rd.c around the function
called rd_ioctl() and see that there are a few ioctl() calls you could use,
such as BLKGETSIZE.  Then take a look at BLKFLSBUF.
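
A rough sketch in C of how those two ioctl() calls might be issued from
user space follows. It is Linux-specific and assumes <linux/fs.h> provides
BLKGETSIZE and BLKFLSBUF; the helper names sectors_to_bytes, blockdev_size
and blockdev_flush are hypothetical. Note that BLKGETSIZE reports the size
of the device in 512-byte sectors, which is not necessarily the amount of
RAM a ramdisk currently occupies.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* BLKGETSIZE, BLKFLSBUF */

/* BLKGETSIZE counts 512-byte sectors; convert that count to bytes. */
static unsigned long long sectors_to_bytes(unsigned long sectors)
{
    return (unsigned long long)sectors * 512ULL;
}

/* Store the device size in *bytes and return 0, or return -1 on failure. */
static int blockdev_size(const char *path, unsigned long long *bytes)
{
    unsigned long sectors = 0;
    int fd, rc;

    assert(path != NULL && bytes != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    rc = ioctl(fd, BLKGETSIZE, &sectors);
    close(fd);
    if (rc < 0)
        return -1;
    *bytes = sectors_to_bytes(sectors);
    return 0;
}

/* Ask the driver to flush its buffers; 0 on success, -1 on failure. */
static int blockdev_flush(const char *path)
{
    int fd, rc;

    assert(path != NULL);
    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;
    rc = ioctl(fd, BLKFLSBUF, 0);
    close(fd);
    return rc < 0 ? -1 : 0;
}
```

For example, blockdev_size("/dev/ram0", &n) followed, once the ramdisk is
unmounted, by blockdev_flush("/dev/ram0"). Both normally need root
privileges, and what BLKFLSBUF does to a ramdisk's memory depends on the
rd.c of the kernel in use.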

I wish all the ioctl() calls were documented in one place.  But given that
to not be the case, a couple of techniques to find what facilities are
available include looking in the source, and if you know of a command that
surely must be doing what you need to do, run it via strace and see just
what it really does (or look at its source).  This is one of the many
reasons having source code and open systems is a good thing.

Also, consider upgrading to 2.4 (I'm currently running -test10 on most of
my Linux machines) and trying out ramfs.  It won't work for making a
bootdisk, but mounting a loopback file is just as effective, and should
work fine with that file in ramfs.

-- 
=================================================================
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://phil.ipal.org/     |
=================================================================

------------------------------

From: [EMAIL PROTECTED] (Kaz Kylheku)
Subject: Re: kernel header problems
Reply-To: [EMAIL PROTECTED]
Date: Thu, 07 Dec 2000 21:31:55 GMT

On Thu, 07 Dec 2000 20:58:08 GMT, Joshua Schaeffer <[EMAIL PROTECTED]> wrote:
>What in the world is the sense of having a separate size_t type?

It's not a separate type, it's an alias for some existing unsigned type.
It provides abstraction, giving implementors the chance to choose a suitable
type to hold the result of the sizeof operator. It could just be fixed as, say,
unsigned int, but that would not be appropriate in every C environment.
E.g. what about implementations with 32 bit unsigned int, but 64 bit unsigned
long and 64 bit pointers? Now sizeof, malloc and so on can't measure objects
4GB or larger, even though such objects are within the addressing capabilities
of the implementation! Okay, so let's make unsigned long the standard for
expressing size.  Woe for the C implementor for a small embedded system,
where size calculations suddenly require 32 bit arithmetic and size_t variables
require at least 32 bits of storage.
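
To make the point concrete, here is a short C sketch that does a size
calculation entirely in size_t, so it adapts to whatever width the
implementation chose. The helper name array_bytes is hypothetical; SIZE_MAX
comes from <stdint.h> (C99).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX */

/*
 * Compute count * elem_size as a size_t.
 * Returns 0 and stores the product in *out, or -1 if it would overflow.
 */
static int array_bytes(size_t count, size_t elem_size, size_t *out)
{
    assert(out != NULL);
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return -1;   /* product does not fit in this implementation's size_t */
    *out = count * elem_size;
    return 0;
}
```

The same source gives a 32 bit limit where size_t is 32 bits wide and a
64 bit limit where it is 64 bits wide, with no change to the caller.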

------------------------------

From: [EMAIL PROTECTED]
Subject: Re: this sucks!
Date: Thu, 07 Dec 2000 22:16:51 -0000

On Fri, 01 Dec 2000 08:50:17 GMT [EMAIL PROTECTED] wrote:

| Ok, I've asked two really simple questions soo far in this group, but
| haven't received one single answer. I'm new to linux drivers, and I
| really can't figure out exactly WHAT you all Linux geeks thinks is
| soooo good with linux!? I've written drivers for
| Win95/98/ME/CE/NT4/2000 and that is heaven compared with this shit!
|
| Open source - so what!? A good documentation can't be replaced by some
| nerdish source-code comments!
|
| Will you please do two things right for once?
| 1. Tell me how to open a tty device from a kernel module.
| 2. Buy a belt to those too-short and too-often weared jeans of yours.

So point me to a good URL which explains everything I need to know to
write drivers for Win95/98/ME/CE/NT4/2000.

-- 
=================================================================
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://phil.ipal.org/     |
=================================================================

------------------------------

From: "Guennadi V. Liakhovetski" <[EMAIL PROTECTED]>
Crossposted-To: comp.os.linux.setup
Subject: PLEASE - kernel / BIOS wrt hd DMA
Date: Thu, 7 Dec 2000 22:13:35 +0000

Hello all

I've been fighting with this problem for a few weeks now. The problem
seems to be that my BIOS does not support DMA for IDE and the kernel
cannot bypass it. The chipset is ok (Intel 430FX), the disk too, afaik I
included all possible parameters in the kernel (2.2.17 + ide patch). But
the BIOS is old (AMI 1.00.04.CA0 for Intel Morrison64 aka Advanced/MN mobo
with a P-75 and onboard S3-Trio64) and no upgrades exist. So, the
question: is it either
1) there ARE situations when such BIOS fault cannot be fixed by the
software or
2) it IS always possible, so, something is wrong with the software
(kernel / its configuration)

Also, I read somewhere, that one often can flash a 'non-native'
BIOS... Does anybody know of identical mobos (430FX + S3-Trio64)? Note,
that Morrison32's BIOS (S3-Trio32) does not suit.

Thanks
Guennadi
___

Dr. Guennadi V. Liakhovetski
Department of Applied Mathematics
University of Sheffield, U.K.
email: [EMAIL PROTECTED]



------------------------------

From: "David K. Means" <[EMAIL PROTECTED]>
Crossposted-To: comp.protocols.ppp,comp.os.linux.networking
Subject: Re: Problem getting ip_forwarding to work
Date: Thu, 7 Dec 2000 15:37:56 -0800

"David Ronis" <[EMAIL PROTECTED]> wrote in message
news:mpPX5.6$[EMAIL PROTECTED]...
> I've been using pppd-2.3.11 on an i686-linux(2.2.17)-glibc(2.2) client
> to connect to a Sun also running 2.3.11.  I use a static IP number for
> the client and an options file as shown below.  This has worked for
> years.
>
> I'm trying to get a ppp server working on another linux box (same
> software as the client) that is permanently connected to the net at
> eth0 and has a ZyXEL omni 56K Plus modem connected to ttyS1.  My
> options file is appended below.
>
> I can connect to the server and seem to have set routing up correctly
> at the both ends:
>
> netstat -rn gives
>
> on my client:
>
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
> 132.206.205.91  0.0.0.0         255.255.255.255 UH        0 0          0 ppp0
> 127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
> 0.0.0.0         132.206.205.91  0.0.0.0         UG        0 0          0 ppp0
>
>
> and on the server:
>
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
> 132.206.205.86  0.0.0.0         255.255.255.255 UH        0 0          0 ppp0
> 132.206.205.0   0.0.0.0         255.255.255.0   U         0 0          0 eth0
> 127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
> 0.0.0.0         132.206.205.10  0.0.0.0         UG        0 0          0 eth0
>
> in addition /proc/sys/net/ipv4/ip_forward is 1 and rp_filter is 0
> (both for ppp0 and eth0).
  [snip]
I am not sure if it is required, since you do have a UH route,
 but you do not have a net-route (U) on the client machine:
  # route add -net 132.206.205.0 netmask 255.255.255.0 dev ppp0




------------------------------

From: "Ken Whaley" <[EMAIL PROTECTED]>
Subject: Intel's SSE block copy paper (was: Re: A faster memcpy and bzero for x86)
Date: Thu, 7 Dec 2000 17:53:15 -0800

Here's Intel's paper on block copy using the SSE
instructions.   rep movsb is still king for small (< 48 byte)
copies!

http://developer.intel.com/software/idap/media/pdf/copy.htm



------------------------------

From: [EMAIL PROTECTED]
Subject: Write software modem for linux
Date: Fri, 08 Dec 2000 02:09:43 GMT

Hi,
I am developing a software modem using DSP.
I would like to know where I can get information about the CCITT
V.22, V.22bis, V.32, V.42, V.90, etc. standards.
I found them on the ITU site, but they cost money.
Can anyone help me?



------------------------------

From: [EMAIL PROTECTED] (J Wendel)
Crossposted-To: comp.os.linux.setup
Subject: Re: PLEASE - kernel / BIOS wrt hd DMA
Date: Fri, 08 Dec 2000 04:06:42 GMT


Sorry I don't have any real help to offer, but 
I think you should send this problem to the Kernel Mailing List, it
sounds like Andre Hedrick ( the IDE maintainer ) would be interested.
You might also try the latest (2.4.0test12pre7) development kernel,
they've been fixing a lot of BIOS related problems.

Good Luck,

John





On Thu, 7 Dec 2000 22:13:35 +0000, "Guennadi V. Liakhovetski"
<[EMAIL PROTECTED]> wrote:

>Hello all
>
>I've been fighting with this problem for a few weeks now. The problem
>seems to be that my BIOS does not support DMA for IDE and the kernel
>cannot bypass it. The chipset is ok (Intel 430FX), the disk too, afaik I
>included all possible parameters in the kernel (2.2.17 + ide patch). But
>the BIOS is old (AMI 1.00.04.CA0 for Intel Morrison64 aka Advanced/MN mobo
>with a P-75 and onboard S3-Trio64) and no upgrades exist. So, the
>question: is it either
>1) there ARE situations when such BIOS fault cannot be fixed by the
>software or
>2) it IS always possible, so, something is wrong with the software
>(kernel / its configuration)
>
>Also, I read somewhere, that one often can flash a 'non-native'
>BIOS... Does anybody know of identical mobos (430FX + S3-Trio64)? Note,
>that Morrison32's BIOS (S3-Trio32) does not suit.
>
>Thanks
>Guennadi
>___
>
>Dr. Guennadi V. Liakhovetski
>Department of Applied Mathematics
>University of Sheffield, U.K.
>email: [EMAIL PROTECTED]
>
>


------------------------------


** FOR YOUR REFERENCE **

The service address, to which questions about the list itself and requests
to be added to or deleted from it should be directed, is:

    Internet: [EMAIL PROTECTED]

You can send mail to the entire list (and comp.os.linux.development.system) via:

    Internet: [EMAIL PROTECTED]

Linux may be obtained via one of these FTP sites:
    ftp.funet.fi                                pub/Linux
    tsx-11.mit.edu                              pub/linux
    sunsite.unc.edu                             pub/Linux

End of Linux-Development-System Digest
******************************
