Hi!

Even at the risk of making this mail unreadable, I want to answer 
everything inline. Please don't complain.

I feel the urgent need to correct at least my latest test results, as 
your last mail revealed serious errors in my timing tests and more or 
less opened my eyes.

On 09/18/2002 07:20 AM, Oleg Drokin wrote:
> Hello!
> 
> On Tue, Sep 17, 2002 at 07:39:39PM +0200, Manuel Krause wrote:
> 
>>>Copy the same amount of data from RAM/nowhere to the FS.
>>>E.g. make a file with file names and sizes, and write a script that
>>>writes the same amount of data from /dev/zero into the FS under those same 
>>>names and with the needed sizes. (Or just use RAMFS as your source if you 
>>>don't have much data and have plenty of RAM.)
>>
>>To be honest, this already exceeds my linux knowledge...
> 
> I meant something to this extent:
> You run a script that walks over your filesystem and creates a shell script
> that first creates the whole dir structure of the source dir and then, for
> each file, emits the command needed to recreate a file of the same size,
> e.g. for this directory contents:
> green@angband:~/z> ls -lR 
[...]
> Result of the work of the script would be:
> mkdir t
> mkdir t/z
> dd if=/dev/zero of=t/z/inode.c bs=69570 count=1
> dd if=/dev/zero of=t/z/stree.c bs=66478 count=1
> dd if=/dev/zero of=t/z/tail_conversion.c bs=10256 count=1
> 
> And you can run resulting script in target dir.

Yes, I saw this work in a nightmare last night. It's scheduled for some 
dark, moonless, cold winter night with snow flurries, sorry. Unless 
someone experienced would like to provide me with a basic script for 
it... ;-))
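For what it's worth, a minimal sketch of such a generator could look like
the function below (my own sketch, not from the thread; it assumes plain
`find` and `wc`, and empty files would produce a `bs=0` that `dd` rejects):

```shell
# Sketch: print, on stdout, a shell script that recreates the directory
# layout and file sizes of the given source dir, with zeros from
# /dev/zero in place of the real data.
make_recreate_script() {
    src="$1"
    # Recreate every subdirectory first (mkdir -p '.' is harmless).
    (cd "$src" && find . -type d) | while read -r d; do
        printf "mkdir -p '%s'\n" "$d"
    done
    # Then emit one dd command per file, using the file's size as the
    # block size. Caveat: an empty file would give bs=0, which dd rejects.
    (cd "$src" && find . -type f) | while read -r f; do
        size=$(( $(wc -c < "$src/$f") ))
        printf "dd if=/dev/zero of='%s' bs=%s count=1\n" "$f" "$size"
    done
}
```

Usage would be something like
`make_recreate_script /mnt/beta/z.Backup.3 > recreate.sh`, then run
recreate.sh inside the target dir.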

>>I was fiddling with some test directories containing 195.8MB I copied to 
>>and from /dev/shm with swap turned off.
>>
>># time cp -a /dev/shm/. /mnt/beta/z.Backup.3/
>>kernel 2.4.20-pre7  | kernel 2.4.20-pre6
>>real    0m9.006s    | real    0m6.740s
>>user    0m0.190s    | user    0m0.230s
>>sys     0m5.250s    | sys     0m4.780s
>># rm -r /dev/shm/*
>># time cp -a /mnt/beta/z.Backup.3/. /dev/shm/
>>kernel 2.4.20-pre7  | kernel 2.4.20-pre6
>>real    0m6.349s    | real    0m6.180s
>>user    0m0.210s    | user    0m0.220s
>>sys     0m2.450s    | sys     0m2.510s
> 
> This dataset is way too small and entirely fits into your RAM I presume.

Yes, it fits. I know the problem with this RAM-based test, though I 
could increase the test directory to somewhere closer to the OOM limit, 
having 512MB available.

> So to avoid any distortion of results you'd better have all periodic stuff
> disabled (though kupdated is still there), so it's better to run it several
> times.
> Also, since it fits into RAM, it must be flushed out, so I usually do this
> using such command:
> time sh -c "cp -a /testfs0/linux-2.4.18 /mnt/ ; umount /mnt"

Couldn't you have written these words to me some years earlier?! The 
effect is measurable, and for almost every fs interaction discussed so 
far it is huge, or at least relevant. So, after reviewing my 
partition-backup scripts: forget _all_ the results I posted to the list. 
They all lack the umount=flush component.

Now you've caught me as the "fool of the reiserfs list"... Quite 
embarrassing. Mmmh. Painful.

>># time dd if=/dev/zero bs=1M count=1000 of=/mnt/beta/testfile.zero
>>kernel 2.4.20-pre7  | kernel 2.4.20-pre6
>>real    1m11.390s   | real    1m42.011s
>>sys     0m11.230s   | sys     0m5.620s
> 
> Hm. While system time is less as expected, real time increased, that's strange.
> 
>># time dd of=/dev/null bs=1M if=/mnt/beta/testfile.zero
>>kernel 2.4.20-pre7  | kernel 2.4.20-pre6
>>real    1m16.738s   | real    1m39.094s
>>sys     0m5.460s    | sys     0m5.930s
> 
> And real time is bigger for reads too, so it seems data layout is different.
> 
> That's really strange. If you can reproduce this behaviour, I am interested
> in getting debugreiserfs -d output for each case after you umount this volume
> (I assume that the /mnt/beta/ filesystem contains nothing but this
> testfile.zero file).

No. /mnt/beta/ is my software storage partition and contains this:
  /dev/hda11   5550248   4089088   1461160  74% /mnt/beta

I have no means to host the complete "debugreiserfs -d /dev/hda11" 
output set on my web-space: 42MB for each of the four cases (-pre6 
without the 1GB file, -pre6 with it, -pre7 without it, -pre7 with it), 
if I read your wording correctly. As .tar.gz it's 4MB each, and even 
that set doesn't fit on my private t-online website. Maybe it would 
work if sent sequentially by mail.
Oh. O.k., you get a definite "No" on this, sorry. I just reviewed the 
debugreiserfs output and I would not send or publish it in any way. It 
is simply too sensitive, as it contains real file and directory names.
Would it be possible to provide the needed info without clear directory 
or file names in future? (With the names replaced by sequentially 
assigned numbers?)
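As an illustration of the idea, something like the awk filter below could
do such a replacement. This is purely a hypothetical sketch of mine: it
assumes the names appear in double quotes, which would have to be checked
against the real debugreiserfs -d output format first.

```shell
# Hypothetical sketch: replace every distinct double-quoted name in a
# dump with a sequential placeholder ("name1", "name2", ...), keeping
# the rest of each line intact. The same original name always maps to
# the same placeholder.
anonymize_names() {
    awk '{
        out = ""; rest = $0
        while (match(rest, /"[^"]+"/)) {
            name = substr(rest, RSTART, RLENGTH)
            if (!(name in map)) { n++; map[name] = "\"name" n "\"" }
            out = out substr(rest, 1, RSTART - 1) map[name]
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }'
}
```

One would then pipe the dump through it, e.g.
`debugreiserfs -d /dev/hda11 | anonymize_names > dump.anon`.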

////

O.k., too many words about unneeded things. I've redone the tests I 
posted, this time making sure to umount the partition involved in 
between in order to force the needed flush, and I captured that time, 
too. My previously posted values were not reproducible, if I read the 
new values correctly. But you need to read and interpret them yourself!

I took 3 timings per test and calculated the mean value.
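The averaging itself can be done with a tiny helper like the one below (my
own sketch, not from the thread; it assumes the times are given in the
"1m32.288s" form that `time` prints for the real/user/sys rows):

```shell
# Average a list of time(1)-style durations ("1m32.288s"), one per
# line on stdin, and print the mean in seconds.
mean_times() {
    # Splitting on 'm' and 's' gives $1 = minutes, $2 = seconds.
    awk -F'[ms]' '{ total += $1 * 60 + $2; n++ }
                  END { printf "%.3f s\n", total / n }'
}
```

E.g. `printf '1m32.288s\n1m30.531s\n' | mean_times` would print the mean
of those two runs in seconds.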

Comparison of "dd" actions:
---------------------------
reading command: time sh -c "dd if=/mnt/beta/testfile.zero bs=1M
  count=1000 of=/dev/null ; umount /mnt/beta"
writing command: time sh -c "dd if=/dev/zero bs=1M count=1000
  of=/mnt/beta/testfile.zero ; umount /mnt/beta"

mean-pre7 writing dd 1G | mean-pre6 writing dd 1G
real   1m32.288s        | real   1m30.531s
user   0m0.013s         | user   0m0.016s
sys    0m11.207s        | sys    0m5.036s

mean-pre7 reading dd 1G | mean-pre6 reading dd 1G
real    1m24.002s       | real   1m22.039s
user    0m0.010s        | user   0m0.013s
sys     0m6.470s        | sys    0m6.083s

related df values:
/dev/hda11   5550248   4089088   1461160  74% /mnt/beta
/dev/hda11   5550248   5114104    436144  93% /mnt/beta

Yes, that goes beyond the "sensible" filesystem fill level.

////

Comparison of "cp -a" actions:
------------------------------
reading command: time sh -c "cp -a /mnt/beta/z.Backup.3/. /mnt/ramfs/ ;
  umount /mnt/beta ; umount /mnt/ramfs"
writing command: time sh -c "cp -a /mnt/ramfs/. /mnt/beta/z.Backup.3/ ;
  umount /mnt/beta ; umount /mnt/ramfs"

mean-pre7 reading files | mean-pre6 reading files
real    0m38.641s       | real    0m39.110s
user    0m0.200s        | user    0m0.220s
sys     0m3.400s        | sys     0m3.200s

mean-pre7 writing files | mean-pre6 writing files
real    0m25.128s       | real    0m27.689s
user    0m0.200s        | user    0m0.160s
sys     0m4.860s        | sys     0m5.217s

directory content: 171.3MB

////

>>>Compare 2.4.20-pre[67] if you see any difference.
>>>Ah, also copy your data from original disk location to /dev/null and 
>>>measure
>>>time of that operation to know how much of total time is occupied by reads.
>>>Also you can calculate read and write throughput separately this way.
>>>And if reads are slower than writes - ...
>>
>>I'm definitely not sure if my lines above are something you meant.
> 
> Yes, kind of, though you have omitted timings of copying original data to
> /dev/shm/ that will give us read speed from original media.

I thought my posted set "# time cp -a /mnt/beta/z.Backup.3/. /dev/shm/" 
represented copying the original data from reiserfs partition /mnt/beta 
to /dev/shm ?!

> In fact, instead of turning off swap you can do
> mount none /mnt/ramfs -t ramfs
> command (if you have ramfs compiled in of course) and /mnt/ramfs is now
> kind of ram filesystem with very low overhead. It also cannot be swapped out
> so if you fill all of your RAM, your box will OOM ;)

What's more: after finding that RAMFS is now enabled internally in the 
related Config.in, having searched for it over and over before (me: am 
I really blind now?!), I saw huge throughput differences between shm 
and RAMFS.

> But the test itself is very small.
> Probably you need to run something like
> time find /source/that/needs/to/be/backed/up -type f -exec cat {} >/dev/null \;
> 
> to get read performance and implement a script like I mentioned in the beginning
> to measure writes.
> This way you do not need tons of RAM.

I see. :-)
I hope this mail brings some clarification and that I haven't forgotten 
too much for now.


Thanks, good night,

Manuel
