Re: df -h stats for same file systems display different results on AMD64 than on i386 (Source solved)

2006-01-17 Thread Daniel Ouellet

OK,

Here is the source of the problem: the DNS cache file generated by 
webazolver. According to the documentation of the webalizer software:


Cached DNS addresses have a TTL (time to live) of 3 days.  This may be
changed at compile time by editing the dns_resolv.h header file and
changing the value for DNS_CACHE_TTL.

The cache file is processed each night, and records older than 3 days 
are removed, but somehow that file becomes a sparse file in the process, 
and when copied elsewhere it shows its real size. In my case that file 
was using a bit over 4 million blocks more than it should have, which 
gave me the 4GB+ difference when mirroring the content.


So, as far as I can see, this process of expiring records from the 
cache file, which is reused continuously, doesn't really shrink the 
file; it just marks the records inside the file as stale, or something 
like that.
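
A quick way to check whether a file like this is sparse is to compare
its apparent size with its actual disk usage (a sketch; the path and
file name are only examples of where a webalizer DNS cache might live):

$ ls -l /var/www/sites/stats/dns_cache.db   # apparent size, in bytes
$ du -k /var/www/sites/stats/dns_cache.db   # actual disk usage, in KB

If ls -l reports far more bytes than du -k accounts for, the file has
holes.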


So, nothing to do with OpenBSD at all, though based on what I see of 
its usage I would think there is a bug in that portion of webalizer.


Now the source of the problem has been found, and many thanks to all 
who stuck with me along the way.


It always feels good to know in the end!

Thanks to Otto, Ted and Tom.

Daniel



Re: df -h stats for same file systems display different results on AMD64 than on i386 (Source solved)

2006-01-17 Thread Otto Moerbeek
On Tue, 17 Jan 2006, Daniel Ouellet wrote:

 OK,
 
 Here is the source of the problem: the DNS cache file generated by webazolver.
 According to the documentation of the webalizer software:
 
 Cached DNS addresses have a TTL (time to live) of 3 days.  This may be
 changed at compile time by editing the dns_resolv.h header file and
 changing the value for DNS_CACHE_TTL.
 
 The cache file is processed each night, and records older than 3 days are
 removed, but somehow that file becomes a sparse file in the process, and when
 copied elsewhere it shows its real size. In my case that file was using a bit
 over 4 million blocks more than it should have, which gave me the 4GB+
 difference when mirroring the content.
 
 So, as far as I can see, this process of expiring records from the cache
 file, which is reused continuously, doesn't really shrink the file; it just
 marks the records inside the file as stale, or something like that.
 
 So, nothing to do with OpenBSD at all, though based on what I see of its
 usage I would think there is a bug in that portion of webalizer.
 
 Now the source of the problem has been found, and many thanks to all who
 stuck with me along the way.

You are wrong in thinking sparse files are a problem. Having sparse
files is quite a nifty feature, I would say. 


-Otto



Re: df -h stats for same file systems display different results on AMD64 than on i386 (Source solved)

2006-01-17 Thread Joachim Schipper
On Tue, Jan 17, 2006 at 02:15:57PM +0100, Otto Moerbeek wrote:
 On Tue, 17 Jan 2006, Daniel Ouellet wrote:
 
  OK,
  
  Here is the source of the problem: the DNS cache file generated by
  webazolver. According to the documentation of the webalizer software:
  
  Cached DNS addresses have a TTL (time to live) of 3 days.  This may
  be changed at compile time by editing the dns_resolv.h header file
  and changing the value for DNS_CACHE_TTL.
  
  The cache file is processed each night, and records older than 3
  days are removed, but somehow that file becomes a sparse file in the
  process, and when copied elsewhere it shows its real size. In my case
  that file was using a bit over 4 million blocks more than it should
  have, which gave me the 4GB+ difference when mirroring the content.
  
  So, as far as I can see, this process of expiring records from the
  cache file, which is reused continuously, doesn't really shrink the
  file; it just marks the records inside the file as stale, or
  something like that.
  
  So, nothing to do with OpenBSD at all, though based on what I see of
  its usage I would think there is a bug in that portion of
  webalizer.
  
  Now the source of the problem has been found, and many thanks to all
  who stuck with me along the way.
 
 You are wrong in thinking sparse files are a problem. Having sparse
 files is quite a nifty feature, I would say. 

Are we talking about webazolver or OpenBSD?

I'd argue that relying on the OS handling sparse files this way instead
of handling your own log data in an efficient way *is* a problem, as
evidenced by Daniel's post. After all, it's reasonable to copy data to,
say, a different drive and expect it to take about as much space as the
original.

On the other hand, I agree with you that handling sparse files
efficiently is rather neat in an OS.

Joachim



Re: df -h stats for same file systems display different results on AMD64 than on i386 (Source solved)

2006-01-17 Thread Otto Moerbeek
On Tue, 17 Jan 2006, Joachim Schipper wrote:

 On Tue, Jan 17, 2006 at 02:15:57PM +0100, Otto Moerbeek wrote:

  You are wrong in thinking sparse files are a problem. Having sparse
  files is quite a nifty feature, I would say. 
 
 Are we talking about webazolver or OpenBSD?
 
 I'd argue that relying on the OS handling sparse files this way instead
 of handling your own log data in an efficient way *is* a problem, as
 evidenced by Daniel's post. After all, it's reasonable to copy data to,
 say, a different drive and expect it to take about as much space as the
 original.

Now that's a wrong assumption. A file is a row of bytes. The only
thing I can assume is that if I write a byte at a certain position, I
will get the same byte back when reading the file. Furthermore, the
file size (not the disk space used!) is the largest position written.
If I assume anything more, I'm assuming too much.

For an application, having sparse files is completely transparent. The
application doesn't even know the difference. How the OS stores the
file is up to the OS.

Again, assuming a copy of a file takes up as much space as the
original is wrong. 
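
A minimal demonstration (file names and sizes are arbitrary):

$ dd if=/dev/zero of=hole bs=1 count=1 seek=1048575  # one byte at offset 1MB-1
$ ls -l hole     # apparent size: 1048576 bytes
$ du -k hole     # disk usage: only a fragment or so
$ cp hole hole2  # cp reads the hole back as zeros and writes them out
$ du -k hole2    # the copy may well occupy the full 1MB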

 On the other hand, I agree with you that handling sparse files
 efficiently is rather neat in an OS.

-Otto



Re: df -h stats for same file systems display different results on AMD64 than on i386 (Source solved)

2006-01-17 Thread Daniel Ouellet

You are wrong in thinking sparse files are a problem. Having sparse
files is quite a nifty feature, I would say. 



Are we talking about webazolver or OpenBSD?

I'd argue that relying on the OS handling sparse files this way instead
of handling your own log data in an efficient way *is* a problem, as
evidenced by Daniel's post. After all, it's reasonable to copy data to,
say, a different drive and expect it to take about as much space as the
original.


Just as feedback: the original file on OpenBSD showed something like 
150MB. Using rsync to copy it over made it almost 5GB in size; I 
wouldn't call that good. But again, before I say a definite no, there is 
always something I may not understand, so I am willing to leave some 
room for that here. But not much! (:



On the other hand, I agree with you that handling sparse files
efficiently is rather neat in an OS.


I am not sure whether the OS handles it well or not. Again, no punch 
intended, but if it does, why copy all that empty data then? Obviously 
there is something I don't understand for sure.


However, here is something I didn't include in my previous email with 
all the stats that may be very interesting to know. I didn't think it 
was so important at the time, but if we are talking about handling this 
properly, it might be relevant.


The tests were done with three servers. The file showing ~150MB in size 
was on www1. Copying it to www2 with the -S switch in rsync nevertheless 
got it to ~5GB. Then copying the same file from www2 to www3 using the 
same rsync -S setup got that file back to the size it was on www1. So 
why not on www2 in that case? Is it the OS, or is it rsync? Was it 
handled properly or wasn't it? I am not sure. If it was, the www2 file 
should not have been ~5GB, should it?


So the picture was

www1-www2-www3

www1 cache DB shows ~150MB

rsync -e ssh -aSuqz --delete /var/www/sites/ [EMAIL PROTECTED]:/var/www/sites

www2 cache DB shows ~5GB

rsync -e ssh -aSuqz --delete /var/www/sites/ [EMAIL PROTECTED]:/var/www/sites

www3 cache DB shows ~150MB

Why not ~150MB on www2???

One thing I haven't tried, and regret not having done, is copying that 
file on www1 to a different name, then copying it back to its original 
name and checking the size at the end; and also transferring that file 
without the -S switch, to see whether the OS copies the empty data or 
not.


I guess the question would be, should it, or shouldn't it do it?

My own opinion right now is that the file should show the size it 
really is. So if it is 5GB and only 100MB of it is good, shouldn't it 
show as 5GB? I don't know; better minds than mine surely have the answer 
to this one. Right now, I certainly do not.
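
The untried experiment above is easy to sketch (assuming, for
illustration only, the cache file is named dns_cache.db; a plain cp is
expected, though not guaranteed, to expand the holes):

$ du -k dns_cache.db            # sparse original: small disk usage
$ cp dns_cache.db cache.tmp     # cp reads the holes back as zeros
$ du -k cache.tmp               # the copy will likely use the full apparent size
$ mv cache.tmp dns_cache.db     # and keep that size under the original name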




Re: df -h stats for same file systems display different results on AMD64 than on i386 (Source solved)

2006-01-17 Thread Joachim Schipper
On Tue, Jan 17, 2006 at 05:49:24PM +0100, Otto Moerbeek wrote:
 On Tue, 17 Jan 2006, Joachim Schipper wrote:
 
  On Tue, Jan 17, 2006 at 02:15:57PM +0100, Otto Moerbeek wrote:
 
   You are wrong in thinking sparse files are a problem. Having sparse
   files is quite a nifty feature, I would say. 
  
  Are we talking about webazolver or OpenBSD?
  
  I'd argue that relying on the OS handling sparse files this way instead
  of handling your own log data in an efficient way *is* a problem, as
  evidenced by Daniel's post. After all, it's reasonable to copy data to,
  say, a different drive and expect it to take about as much space as the
  original.
 
 Now that's a wrong assumption. A file is a row of bytes. The only
 thing I can assume is that if I write a byte at a certain position, I
 will get the same byte back when reading the file. Furthermore, the
 file size (not the disk space used!) is the largest position written.
 If I assume anything more, I'm assuming too much.
 
 For an application, having sparse files is completely transparent. The
 application doesn't even know the difference. How the OS stores the
 file is up to the OS.
 
 Again, assuming a copy of a file takes up as much space as the
 original is wrong. 
 
  On the other hand, I agree with you that handling sparse files
  efficiently is rather neat in an OS.

Okay - I understand your logic, and yes, I do know about sparse files
and how they are typically handled. And yes, you are right that
there are very good reasons for handling sparse files this way.

And yes, applications are right to make use of this feature where
applicable.

However, in this case, it's a simple log file, and what the application
did, while very much technically correct, clearly violated the principle
of least astonishment, for no real reason I can see. Sure, trying to
make efficient use of every single byte may not be very efficient - but
just zeroing out the first five GB of the file is more than a little
hackish, and not really necessary.

Joachim



Re: df -h stats for same file systems display different results on AMD64 than on i386 (Source solved)

2006-01-17 Thread Matthias Kilian
On Tue, Jan 17, 2006 at 02:36:44PM -0500, Daniel Ouellet wrote:
 [...] But having a file that is, let's say, 1MB of valid data grow very
 quickly to 4 and 6GB, and take time to rsync between servers, where in one
 instance it filled the file system and created other problems (: I wouldn't
 call that a feature.

As Otto noted, you have to distinguish between file size (that's what
stat(2) and friends report, and at the same time it's the number
of bytes you can read sequentially from the file) and a file's
disk usage.

For more explanations, see the RATIONALE section at

http://www.opengroup.org/onlinepubs/009695399/utilities/du.html

(You may have to register, but it doesn't hurt)

See also the reference to lseek(2) mentioned there.


 But at the same time, I wasn't using the -S switch in rsync,
 so my own stupidity there. However, why rsync spends lots of time
 processing empty blocks is something I still don't understand.

Please note that -S in rsync does not *guarantee* that source and
destination files are *identical* in terms of holes or disk usage.

For example:

$ dd if=/dev/zero of=foo bs=1m count=42
$ rsync -S foo host:
$ du foo
$ ssh host du foo

Got it? The local foo is *not* sparse (no holes), but the remote
one has been optimized by rsync's -S switch.

We recently had a very controversial (and flaming) discussion at our
local UG on such optimizations (or heuristics, as in GNU cp).
IMO, if they have to be explicitly enabled (like `-S' for rsync),
that's OK. The other direction (a copy is *not* sparse by default)
is exactly what I would expect.

Telling whether a sequence of zeroes is a hole or just a (real) block
of zeroes isn't possible in userland -- it's a filesystem implementation
detail.

To copy the *exact* contents of an existing filesystem including
all holes to another disk (or system), you *have* to use
filesystem-specific tools, such as dump(8) and restore(8). Period.
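
For example, a filesystem can be duplicated hole-for-hole with a
dump/restore pipeline along these lines (the mount points are only
examples; the target filesystem is assumed to be mounted at /mnt):

# dump -0af - /var/www/sites | (cd /mnt && restore -rf -)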


 I did research on Google for sparse files and tried to get more
 information about them. In some cases, I would assume, like round-robin
 database type of stuff, where you have a fixed file that you write into
 at various places, it would be good and useful; but a sparse file that
 keeps growing uncontrolled over time, I may be wrong, but I don't call
 that a useful feature.

Sparse files for databases under heavy load (many insertions and
updates) are the death of performance -- you'll get files with blocks
spread all over your filesystem.

OTOH, *sparse* databases such as quota files (potentially large, but
growing very slowly) are good candidates for sparse files.

Ciao,
Kili



Re: df -h stats for same file systems display different results on AMD64 than on i386 (Source solved)

2006-01-17 Thread Daniel Ouellet

Hi all,

First let me start with my apology to some of you for having wasted 
your time!


As much as this was/is interesting and puzzling to me, and as much as I 
am obviously trying to get my head around this issue and the usage of 
sparse files, the big picture of it is obviously something missing from 
my understanding at this time.


I am doing more research on my own, so let's kill this thread, and sorry 
to have wasted any of your time with my lack of understanding of this 
aspect!


I am not trying to be a fucking idiot on the list, but it's obvious 
that I don't understand this at this time.


So, let's drop it and I will continue my homework!

Big thanks to all who tried to help me as well!

Daniel



Re: df -h stats for same file systems display different results on AMD64 than on i386

2006-01-16 Thread Otto Moerbeek
On Sun, 15 Jan 2006, Daniel Ouellet wrote:

 Otto Moerbeek wrote:
  On Sun, 15 Jan 2006, Daniel Ouellet wrote:

  Since the bsize and fsize differ, it is expected that the used kbytes of the
  file systems differ. Also, the inode table size will not be the same.
 
 Not sure that I would agree fully with that, but I defer to your judgment.
 Yes, there will and should be a difference in usage: if you have lots of
 small files, you waste more space if your fsize is bigger, unless I
 misunderstand that part. Would that mean df -h would take the number of
 inodes in use * the fsize to display the results for humans then?

I do not understand what you mean. Of course df does not do such a
calculation, because it would not mean anything. The number of inodes
allocated has little to do with total space in use. fsize is fragment
size, which is something different from file size.

  You're comparing apples and oranges.
 
 I don't disagree to some extent, as you know better, but I am still trying
 to understand it. Shouldn't df -h display the same results to a human? I am
 not arguing, but rather trying to understand. If it is designed to be
 human-readable, why would a human need to know or consider the fragment
 size in use to compare the results?

The human thing is only doing the conversion to megabytes and such. It
does not compensate for space wasted due to blocks not being fully used
and such.

Now I agree that the difference you are seeing is larger than I would
expect. I would run a ls -laR or du -k on the filesystems and diff the
results to see if the contents are really the same. My bet is that
you'll discover some files that are not on the system with a smaller
usage.  It is also perfectly possible that files having holes (also
called sparse files) play a role: they take less space than their
length, but depending on how you copy them, the copy does take the
full amount of blocks. 
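
Concretely, such a comparison could look like this (a sketch; each
listing would be generated on its own host and the files brought to one
place for the diff):

$ du -k /var/www/sites > /tmp/du.www1   # on www1
$ du -k /var/www/sites > /tmp/du.www2   # on www2
$ diff /tmp/du.www1 /tmp/du.www2        # lines differing only in the KB
                                        # column point at sparse files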

  BTW, you don't say which version(s) you are running. That's bad, since
  some bugs were fixed in the -h display. Run df without -h to see the
  real numbers.
 
 All run 3.8. Sorry about that.
 
 the 4.6GB have 4870062 * 1024 = 4,986,943,488
 www1# df
 Filesystem  1K-blocks      Used     Avail Capacity  Mounted on
 /dev/wd0a      256814     41464    202510    17%    /
 /dev/wd0h     1048158        54    995698     0%    /home
 /dev/wd0d     1030550         2    979022     0%    /tmp
 /dev/wd0g     5159638    310910   4590748     6%    /usr
 /dev/wd0e    25799860   4870062  19639806    20%    /var
 /dev/wd0f     1030550      1546    977478     0%    /var/qmail

The above display used df -k, while the one below does not. Probably
you've set some alias for df or so, or you are using the BLOCKSIZE env
var. Why are you making things more difficult than needed for us (and
yourself)?
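
To see the effect of BLOCKSIZE (a sketch; /var is just an example mount
point):

$ df /var                    # 512-byte blocks unless BLOCKSIZE is set
$ env BLOCKSIZE=1k df /var   # 1K-blocks, same numbers as df -k
$ df -k /var                 # explicit 1K-blocks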

 
 the 8.1GB have 15967148 * 512 = 8,175,179,776
 # df
 Filesystem  512-blocks      Used     Avail Capacity  Mounted on
 /dev/wd0a       513628     65588    422360    13%    /
 /dev/wd0h      1861628        52   1768496     0%    /home
 /dev/wd0d      2061100         4   1958044     0%    /tmp
 /dev/wd0g      9904156    424544   8984408     5%    /usr
 /dev/wd0e     33022236   1537612  29833516     5%    /var
 /dev/wd1b     16412252   1937920  13653720    12%    /var/mysql
 /dev/wd0f      2061100         4   1958044     0%    /var/qmail
 /dev/wd1a     41280348  15967148  23249184    41%    /var/www/sites
 
 The funny part is that the /var above includes more files than the
 /var/www/sites below and still shows less space in use.
 
  To check if the inode/block/fragment free numbers add up, you could
  use dumpfs, but that is a hell of a lot of work. 
  -Otto
  
 
 It's not a huge deal and the systems work well; I am just puzzled by the
 results and want to understand them, that's all.



Re: df -h stats for same file systems display different results on AMD64 than on i386

2006-01-16 Thread Ted Unangst
run du on both filesystems and compare the results.



Re: df -h stats for same file systems display different results on AMD64 than on i386

2006-01-16 Thread Daniel Ouellet

Ted Unangst wrote:

run du on both filesystems and compare the results.



OK, just because I am more curious than convinced there is a problem, 
and because I am still puzzled by what Otto and Ted said, here is what I 
did, and the answers to Otto's questions as well.


- Both systems run 3.8 (www1 was running 3.6 and was upgraded to 3.7, 
then 3.8, all good, following the strict steps from Nick in the FAQ). 
www2 was a full clean install from scratch, a complete wipe, not an 
upgrade.


- There aren't any hard or soft links in that section.

- On the block size, Otto asked if I had played with it. No, I never 
did; I always did a fresh default install until this time, when I tried 
the upgrade procedure from Nick, as www1 was a lot of work to set up 
fresh. But I may just redo it to see.


Then, as there are a lot of files and comparing them manually is really 
a lot of work, I used rsync 2.6.6 to mirror them:


www1# pkg_info | grep rsync
rsync-2.6.6 mirroring/synchronization over low bandwidth links

www2# pkg_info | grep rsync
rsync-2.6.6 mirroring/synchronization over low bandwidth links

Then I used rsync, changing the setup to allow logging in and running 
it as root so that there would be no restrictions of any kind, to be 
sure I had a fully identical copy of all the files, just like this:


rsync -e ssh -auqz --delete /var/www/sites/ [EMAIL PROTECTED]:/var/www/sites

Then I did:
du -h /var/www/sites on www1 and got 3.9G  /var/www/sites
du -h /var/www/sites on www2 and got 7.7G  /var/www/sites

Also remember that www1 was the one upgraded, and www2 a fresh new install.

Now I continue to look but I am not sure what else I can do to be 100% 
sure that all the files are identical before comparing them.


I am still comparing the results from du, but that's huge!

So, maybe this test is not a valid one, but then why not?

It is interesting, however, to say the least.

Now, to push the issue even more, I did this with a third server, again 
from www1:


rsync on it
www3# pkg_info | grep rsync
rsync-2.6.6 mirroring/synchronization over low bandwidth links

Then mirror it:
rsync -e ssh -auqz --delete /var/www/sites/ [EMAIL PROTECTED]:/var/www/sites

Then, same as before, I did:
du -h /var/www/sites on www1 and got 3.9G  /var/www/sites
du -h /var/www/sites on www3 and got 7.7G  /var/www/sites

The only difference in hardware setup is that www2 and www3 both have 
their own drive for that mount point, where www1 does not, and they run 
i386 as opposed to AMD64.


www1# df | grep /var/www/sites
www1# df | grep /var
/dev/wd0e    25799860   4905500  19604368    20%    /var
/dev/wd0f     1030550      1552    977472     0%    /var/qmail

www2# df | grep /var/www/sites
/dev/wd1a    20640174   8495528  11112638    43%    /var/www/sites

www3# df | grep /var/www/sites
/dev/wd1a    20640174   8024648  11583518    41%    /var/www/sites

So www1 has to have even more stuff on the drive compared to www2 and 
www3, but shows much less.


So copying from www1 to www2 and to www3 gives the same results, and 
www2 and www3 match very well, but www1 still shows much less for sure.


I am very puzzled, to say the least.



Re: df -h stats for same file systems display different results on AMD64 than on i386

2006-01-16 Thread Daniel Ouellet

Otto Moerbeek wrote:

Now I agree that the difference you are seeing is larger than I would
expect. I would run a ls -laR or du -k on the filesystems and diff the
results to see if the contents are really the same. My bet is that
you'll discover some files that are not on the system with a smaller
usage.  It is also perfectly possible that files having holes (also
called sparse files) play a role: they take less space than their
length, but depending on how you copy them, the copy does take the
full amount of blocks. 


So, I now did the ls -laR on each system:

ls -laR /var/www/sites > /tmp/wwwx, where x was the server number

Then I compared the results with diff www1 www2 as well as diff www1 
www3:


diff www1 www3 > /tmp/check

Then I looked at the check file: nothing there that can explain the 
difference. The only differences in file size are the log files, as 
they obviously change live, but the rest is the same.


I still can't explain it, even using the ls -laR options and looking at 
the diff between them.


In any case definitely nothing in the order of ~ 4GB.

I guess the next test would be to wipe it out and reinstall fresh to be sure.

PS: I did reboot the servers in case there were any open files or 
anything like that; still the same results.


Now, the only thing in what Otto said that makes me think is this 
statement: "but depending on how you copy them".


I use rsync all the time to do this!

Like: rsync -e ssh -auqz --delete 

Would this explain it? I can't imagine it would, but I don't know 
everything for sure.




Re: df -h stats for same file systems display different results on AMD64 than on i386

2006-01-16 Thread Daniel Ouellet

Just a bit more information on this.

As I couldn't tell whether this was an AMD64 issue, as illogical as 
that might be, I decided to put it to the test. So I pulled out another 
AMD64 server; it's running 3.8, same fsize and bsize, one drive, etc.


I used rsync to mirror the content, and the results are consistent with 
the i386. So that proves it's not that, at a minimum.


So, the source is 4.4GB and expands to 7.7GB.

I have no logical explanation whatsoever.

So, I will stop here and just drop it, as I have nothing else logical I 
can think of to explain why that might be.


I will just have to put it in the unknown pile and leave it alone.

One thing for sure, I will be wiping that box out to be sure next.

Unless someone has an idea; again, not a problem, but something very 
weird that I was trying to understand and find a logical explanation 
for.


I will just have to take it as such and leave it alone.

So, case closed, as I have no more ideas or clues.

This was done on a brand new server and using rsync to copy identical 
files from the source to the destination.


==
Source:
www1# disklabel wd0
# Inside MBR partition 3: type A6 start 63 size 78156162
# /dev/rwd0c:
type: ESDI
disk: ESDI/IDE disk
label: Maxtor 6E040L0
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 16383
total sectors: 78165360
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0   # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#             size    offset  fstype [fsize bsize  cpg]
  a:        524097        63  4.2BSD   2048 16384  328 # Cyl     0*-   519
  b:       8388576    524160    swap                   # Cyl   520 -  8841
  c:      78165360         0  unused      0     0      # Cyl     0 - 77544
  d:       2097648   8912736  4.2BSD   2048 16384  328 # Cyl  8842 - 10922
  e:      52429104  11010384  4.2BSD   2048 16384  328 # Cyl 10923 - 62935
  f:       2097648  63439488  4.2BSD   2048 16384  328 # Cyl 62936 - 65016
  g:      10486224  65537136  4.2BSD   2048 16384  328 # Cyl 65017 - 75419
  h:       2132865  76023360  4.2BSD   2048 16384  328 # Cyl 75420 - 77535*

www1# df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/wd0a      251M   40.5M    198M    17%    /
/dev/wd0h     1024M   54.0K    972M     0%    /home
/dev/wd0d     1006M    8.5M    948M     1%    /tmp
/dev/wd0g      4.9G    304M    4.4G     6%    /usr
/dev/wd0e     24.6G    4.7G   18.7G    20%    /var
/dev/wd0f     1006M    1.5M    955M     0%    /var/qmail

=

Destination.
# disklabel wd0
# Inside MBR partition 3: type A6 start 63 size 156296322
# /dev/rwd0c:
type: ESDI
disk: ESDI/IDE disk
label: Maxtor 6L080M0
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 16383
total sectors: 156301488
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0   # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#             size    offset  fstype [fsize bsize  cpg]
  a:       1048257        63  4.2BSD   2048 16384  328 # Cyl      0*-   1039
  b:      16777152   1048320    swap                   # Cyl   1040 -  17683
  c:     156301488         0  unused      0     0      # Cyl      0 - 155060
  d:      10486224  17825472  4.2BSD   2048 16384  328 # Cyl  17684 -  28086
  e:      83885760  28311696  4.2BSD   2048 16384  328 # Cyl  28087 - 111306
  f:       4194288 112197456  4.2BSD   2048 16384  328 # Cyl 111307 - 115467
  g:       2097648 116391744  4.2BSD   2048 16384  328 # Cyl 115468 - 117548
  h:      20971440 118489392  4.2BSD   2048 16384  328 # Cyl 117549 - 138353
  i:      16835553 139460832  4.2BSD   2048 16384  328 # Cyl 138354 - 155055*

# df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/wd0a      502M   49.3M    427M    10%    /
/dev/wd0i      7.9G    2.0K    7.5G     0%    /home
/dev/wd0d      4.9G    2.0K    4.7G     0%    /tmp
/dev/wd0h      9.8G    958M    8.4G    10%    /usr
/dev/wd0e     39.4G    7.7G   29.7G    21%    /var
/dev/wd0f      2.0G    252K    1.9G     0%    /var/log
/dev/wd0g     1006M    2.0K    956M     0%    /var/qmail



Re: df -h stats for same file systems display different results on AMD64 than on i386

2006-01-16 Thread Otto Moerbeek
On Mon, 16 Jan 2006, Daniel Ouellet wrote:

 Just a bit more information on this.
 
 As I couldn't tell whether this was an AMD64 issue, as illogical as that
 might be, I decided to put it to the test. So I pulled out another AMD64
 server; it's running 3.8, same fsize and bsize, one drive, etc.
 
 I used rsync to mirror the content, and the results are consistent with the
 i386. So that proves it's not that, at a minimum.
 
 So, the source is 4.4GB and expands to 7.7GB.
 
 I have no logical explanation whatsoever.

You have been told an explanation (sparse files).

-Otto

 
 So, I will stop here and just drop it, as I have nothing else logical I can
 think of to explain why that might be.
 
 I will just have to put it in the unknown pile and leave it alone.
 
 One thing for sure, I will be wiping that box out to be sure next.
 
 Unless someone has an idea; again, not a problem, but something very weird
 that I was trying to understand and find a logical explanation for.
 
 I will just have to take it as such and leave it alone.
 
 So, case closed, as I have no more ideas or clues.
 
 This was done on a brand new server, using rsync to copy identical files
 from the source to the destination.
 
 ==
 Source:
 www1# disklabel wd0
 # Inside MBR partition 3: type A6 start 63 size 78156162
 # /dev/rwd0c:
 type: ESDI
 disk: ESDI/IDE disk
 label: Maxtor 6E040L0
 flags:
 bytes/sector: 512
 sectors/track: 63
 tracks/cylinder: 16
 sectors/cylinder: 1008
 cylinders: 16383
 total sectors: 78165360
 rpm: 3600
 interleave: 1
 trackskew: 0
 cylinderskew: 0
 headswitch: 0   # microseconds
 track-to-track seek: 0  # microseconds
 drivedata: 0
 
 16 partitions:
 #             size    offset  fstype [fsize bsize  cpg]
   a:        524097        63  4.2BSD   2048 16384  328 # Cyl     0*-   519
   b:       8388576    524160    swap                   # Cyl   520 -  8841
   c:      78165360         0  unused      0     0      # Cyl     0 - 77544
   d:       2097648   8912736  4.2BSD   2048 16384  328 # Cyl  8842 - 10922
   e:      52429104  11010384  4.2BSD   2048 16384  328 # Cyl 10923 - 62935
   f:       2097648  63439488  4.2BSD   2048 16384  328 # Cyl 62936 - 65016
   g:      10486224  65537136  4.2BSD   2048 16384  328 # Cyl 65017 - 75419
   h:       2132865  76023360  4.2BSD   2048 16384  328 # Cyl 75420 - 77535*
 www1# df -h
 Filesystem     Size    Used   Avail Capacity  Mounted on
 /dev/wd0a      251M   40.5M    198M    17%    /
 /dev/wd0h     1024M   54.0K    972M     0%    /home
 /dev/wd0d     1006M    8.5M    948M     1%    /tmp
 /dev/wd0g      4.9G    304M    4.4G     6%    /usr
 /dev/wd0e     24.6G    4.7G   18.7G    20%    /var
 /dev/wd0f     1006M    1.5M    955M     0%    /var/qmail
 
 =
 
 Destination.
 # disklabel wd0
 # Inside MBR partition 3: type A6 start 63 size 156296322
 # /dev/rwd0c:
 type: ESDI
 disk: ESDI/IDE disk
 label: Maxtor 6L080M0
 flags:
 bytes/sector: 512
 sectors/track: 63
 tracks/cylinder: 16
 sectors/cylinder: 1008
 cylinders: 16383
 total sectors: 156301488
 rpm: 3600
 interleave: 1
 trackskew: 0
 cylinderskew: 0
 headswitch: 0   # microseconds
 track-to-track seek: 0  # microseconds
 drivedata: 0
 
 16 partitions:
 #             size    offset  fstype [fsize bsize  cpg]
   a:       1048257        63  4.2BSD   2048 16384  328 # Cyl      0*-   1039
   b:      16777152   1048320    swap                   # Cyl   1040 -  17683
   c:     156301488         0  unused      0     0      # Cyl      0 - 155060
   d:      10486224  17825472  4.2BSD   2048 16384  328 # Cyl  17684 -  28086
   e:      83885760  28311696  4.2BSD   2048 16384  328 # Cyl  28087 - 111306
   f:       4194288 112197456  4.2BSD   2048 16384  328 # Cyl 111307 - 115467
   g:       2097648 116391744  4.2BSD   2048 16384  328 # Cyl 115468 - 117548
   h:      20971440 118489392  4.2BSD   2048 16384  328 # Cyl 117549 - 138353
   i:      16835553 139460832  4.2BSD   2048 16384  328 # Cyl 138354 - 155055*
 # df -h
 Filesystem     Size    Used   Avail Capacity  Mounted on
 /dev/wd0a      502M   49.3M    427M    10%    /
 /dev/wd0i      7.9G    2.0K    7.5G     0%    /home
 /dev/wd0d      4.9G    2.0K    4.7G     0%    /tmp
 /dev/wd0h      9.8G    958M    8.4G    10%    /usr
 /dev/wd0e     39.4G    7.7G   29.7G    21%    /var
 /dev/wd0f      2.0G    252K    1.9G     0%    /var/log
 /dev/wd0g     1006M    2.0K    956M     0%    /var/qmail



Re: df -h stats for same file systems display different results on AMD64 than on i386

2006-01-16 Thread Daniel Ouellet

Otto Moerbeek wrote:

On Mon, 16 Jan 2006, Daniel Ouellet wrote:


Just a bit more information on this.

As I couldn't tell whether this was an AMD64 issue, as illogical as that
might be, I decided to put it to the test. So I pulled out another AMD64
server; it's running 3.8, same fsize and bsize, one drive, etc.

I used rsync to mirror the content, and the results are consistent with
the i386. So that proves it's not that, at a minimum.

So, the source is 4.4GB and expands to 7.7GB.

I have no logical explanation whatsoever.


You have been told an explanation (sparse files).

-Otto



Yes, you are right, and thanks to Tom Cosgrove, who reminded me of that 
overlooked part in your previous answer. I did this again with:


rsync -e ssh -aSuqz --delete /var/www/sites/ [EMAIL PROTECTED]:/var/www/sites

instead of:

rsync -e ssh -auqz --delete /var/www/sites/ [EMAIL PROTECTED]:/var/www/sites

and it shrank to:

/dev/wd1a     19.7G    3.9G   14.8G    21%    /var/www/sites

from:

/dev/wd1a     19.7G    8.0G   10.7G    43%    /var/www/sites

Thanks for your patience with me!

That's a lot of wasted space for sure.

I am learning... sometimes slowly, but I am.

My next step is now trying to see where that's from.
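
One way to do that is to compare each file's allocated blocks with its
apparent size, e.g. with the perl in base (a sketch; the 8KB slack
threshold is arbitrary, and file names with spaces would break it):

$ find /var/www/sites -type f | \
      perl -lne '($sz,$bl) = (stat)[7,12]; print if $bl * 512 + 8192 < $sz'

Any file printed occupies noticeably fewer disk blocks than its length,
i.e. it has holes.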

Thanks

Daniel



df -h stats for same file systems display different results on AMD64 than on i386

2006-01-15 Thread Daniel Ouellet
Here is something I can't quite get my head around, and I don't really 
understand why it is so, other than maybe the fsize of each mount point 
not being processed properly on AMD64, but that's just an idea. See 
lower below for why I think that might be the case. In any case, I would 
welcome a logical explanation of why this happens.


I mirror a mount point across three servers, one AMD64 and two i386. 
Then I run df -h on each one, but I get very different results on AMD64 
than on i386, and I can't understand why.


When I do df -i, however, I get the same number of inodes, so there is 
the same number of files. I even used rsync to make a perfect mirror of 
them, and still I get very different results.


AMD64 gives me 4.6GB while i386 gives me 8.1GB. The funny part is that 
the AMD64 should give me more, as its file system includes a bit more 
stuff.


The AMD64 mount point is for /var/www while the mirrored one is for 
/var/www/sites, and the AMD64 does include all the sites files.


However, if I log in with WinSCP and run its calculate function on both 
servers at the location /var/www/sites, I do get the same results.

dev.
52584 files, 2799 folders
location /var/www
7,685 MB (8,059,054,473)

www2
52584 files, 2799 folders
location /var/www
7,683 MB (8,056,394,923)

The difference in size is the log files, which are processed not in 
sync with each other but locally on each server.


I can't explain this one.

This is really weird.

I thought about deleting the file system and recreating it with the 
additional mount point to see, but the results should be good as they 
are now, since /var/www/sites is inside /var/www on the AMD64.


i386 display:
# df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/wd0a      247M   27.8M    206M    12%    /
/dev/wd0h      4.6G    3.2M    4.3G     0%    /home
/dev/wd0d      495M    1.0K    470M     0%    /tmp
/dev/wd0g      4.4G    206M    3.9G     5%    /usr
/dev/wd0e     12.6G    745M   11.2G     6%    /var
/dev/wd1b      7.8G    2.0K    7.4G     0%    /var/mysql
/dev/wd0f      991M    1.1M    940M     0%    /var/qmail
/dev/wd1a     19.7G    8.1G   10.6G    43%    /var/www/sites


AMD64 display:
www1# df -h
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/wd0a      251M   40.5M    198M    17%    /
/dev/wd0h     1024M   54.0K    972M     0%    /home
/dev/wd0d     1006M    2.0K    956M     0%    /tmp
/dev/wd0g      4.9G    304M    4.4G     6%    /usr
/dev/wd0e     24.6G    4.6G   18.7G    20%    /var
/dev/wd0f     1006M    1.5M    955M     0%    /var/qmail


I also thought about files still being open, but I rebooted the systems 
to be safe and still get the same results.


Maybe the disklabel is not seen right, or not calculated right, on 
AMD64. I am not sure I understand this right, but if the file system 
uses an fsize of 2048 on AMD64 and displays almost half the size of the 
i386, which uses an fsize of 1024, maybe it's the fsize part that is 
missing from the calculation.


So far I couldn't come up with a different explanation.

www1# disklabel wd0
# Inside MBR partition 3: type A6 start 63 size 78156162
# /dev/rwd0c:
type: ESDI
disk: ESDI/IDE disk
label: Maxtor 6E040L0
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 16383
total sectors: 78165360
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0   # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#             size    offset  fstype [fsize bsize  cpg]
  a:        524097        63  4.2BSD   2048 16384  328 # Cyl     0*-   519
  b:       8388576    524160    swap                   # Cyl   520 -  8841
  c:      78165360         0  unused      0     0      # Cyl     0 - 77544
  d:       2097648   8912736  4.2BSD   2048 16384  328 # Cyl  8842 - 10922
  e:      52429104  11010384  4.2BSD   2048 16384  328 # Cyl 10923 - 62935
  f:       2097648  63439488  4.2BSD   2048 16384  328 # Cyl 62936 - 65016
  g:      10486224  65537136  4.2BSD   2048 16384  328 # Cyl 65017 - 75419
  h:       2132865  76023360  4.2BSD   2048 16384  328 # Cyl 75420 - 77535*



oppose to i386:
# disklabel wd0
# Inside MBR partition 3: type A6 start 63 size 58621122
# /dev/rwd0c:
type: ESDI
disk: ESDI/IDE disk
label: QUANTUM FIREBALL
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 16383
total sectors: 58633344
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0   # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#             size    offset  fstype [fsize bsize  cpg]
  a:        524097        63  4.2BSD   1024  8192   86 # Cyl     0*-   519
  b:       8388576    524160    swap                   # Cyl   520 -  8841
  c:      58633344         0  unused      0     0      # Cyl     0 - 58167
  d:       1048320   8912736  4.2BSD   1024  8192   86 # Cyl  8842 -  9881
  e:      27263376   9961056  4.2BSD   1024  8192   86 # Cyl  9882 - 36928
  f:       2097648  37224432  4.2BSD   1024  8192   86 # Cyl 36929 - 39009
  g:       9436896  39322080  4.2BSD   1024  8192   86 # Cyl 39010 - 48371
  h:       9874368  48758976  4.2BSD   1024  8192   86 # Cyl 48372 - 58167

Re: df -h stats for same file systems display different results on AMD64 than on i386

2006-01-15 Thread Otto Moerbeek
On Sun, 15 Jan 2006, Daniel Ouellet wrote:

[snip lots of talk by a confused person]

 16 partitions:
 #             size    offset  fstype [fsize bsize  cpg]
   a:        524097        63  4.2BSD   2048 16384  328 # Cyl     0*-   519
   b:       8388576    524160    swap                   # Cyl   520 -  8841
   c:      78165360         0  unused      0     0      # Cyl     0 - 77544
   d:       2097648   8912736  4.2BSD   2048 16384  328 # Cyl  8842 - 10922
   e:      52429104  11010384  4.2BSD   2048 16384  328 # Cyl 10923 - 62935
   f:       2097648  63439488  4.2BSD   2048 16384  328 # Cyl 62936 - 65016
   g:      10486224  65537136  4.2BSD   2048 16384  328 # Cyl 65017 - 75419
   h:       2132865  76023360  4.2BSD   2048 16384  328 # Cyl 75420 - 77535*

 16 partitions:
 #             size    offset  fstype [fsize bsize  cpg]
   a:        524097        63  4.2BSD   1024  8192   86 # Cyl     0*-   519
   b:       8388576    524160    swap                   # Cyl   520 -  8841
   c:      58633344         0  unused      0     0      # Cyl     0 - 58167
   d:       1048320   8912736  4.2BSD   1024  8192   86 # Cyl  8842 -  9881
   e:      27263376   9961056  4.2BSD   1024  8192   86 # Cyl  9882 - 36928
   f:       2097648  37224432  4.2BSD   1024  8192   86 # Cyl 36929 - 39009
   g:       9436896  39322080  4.2BSD   1024  8192   86 # Cyl 39010 - 48371
   h:       9874368  48758976  4.2BSD   1024  8192   86 # Cyl 48372 - 58167

Since the bsize and fsize differ, it is expected that the used kbytes of the
file systems differ. Also, the inode table size will not be the same.

You're comparing apples and oranges.

BTW, you don't say which version(s) you are running. That's bad, since
some bugs were fixed in the -h display. Run df without -h to see the
real numbers.

To check if the inode/block/fragment free numbers add up, you could
use dumpfs, but that is a hell of a lot of work. 
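
For a quick look, though, the superblock summary at the top of the
dumpfs output already shows bsize and fsize (the device name here is
just an example):

# dumpfs /dev/rwd0e | head -20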

-Otto



Re: df -h stats for same file systems display different results on AMD64 than on i386

2006-01-15 Thread Daniel Ouellet

Otto Moerbeek wrote:

On Sun, 15 Jan 2006, Daniel Ouellet wrote:

[snip lots of talk by a confused person]


16 partitions:
#             size    offset  fstype [fsize bsize  cpg]
  a:        524097        63  4.2BSD   2048 16384  328 # Cyl     0*-   519
  b:       8388576    524160    swap                   # Cyl   520 -  8841
  c:      78165360         0  unused      0     0      # Cyl     0 - 77544
  d:       2097648   8912736  4.2BSD   2048 16384  328 # Cyl  8842 - 10922
  e:      52429104  11010384  4.2BSD   2048 16384  328 # Cyl 10923 - 62935
  f:       2097648  63439488  4.2BSD   2048 16384  328 # Cyl 62936 - 65016
  g:      10486224  65537136  4.2BSD   2048 16384  328 # Cyl 65017 - 75419
  h:       2132865  76023360  4.2BSD   2048 16384  328 # Cyl 75420 - 77535*



16 partitions:
#             size    offset  fstype [fsize bsize  cpg]
  a:        524097        63  4.2BSD   1024  8192   86 # Cyl     0*-   519
  b:       8388576    524160    swap                   # Cyl   520 -  8841
  c:      58633344         0  unused      0     0      # Cyl     0 - 58167
  d:       1048320   8912736  4.2BSD   1024  8192   86 # Cyl  8842 -  9881
  e:      27263376   9961056  4.2BSD   1024  8192   86 # Cyl  9882 - 36928
  f:       2097648  37224432  4.2BSD   1024  8192   86 # Cyl 36929 - 39009
  g:       9436896  39322080  4.2BSD   1024  8192   86 # Cyl 39010 - 48371
  h:       9874368  48758976  4.2BSD   1024  8192   86 # Cyl 48372 - 58167


Since the bsize and fsize differ, it is expected that the used kbytes of the
file systems differ. Also, the inode table size will not be the same.


Not sure that I would agree fully with that, but I defer to your 
judgment. Yes, there will and should be a difference in usage: if you 
have lots of small files, you waste more space if your fsize is bigger, 
unless I misunderstand that part. Would that mean df -h would take the 
number of inodes in use * the fsize to display the results for humans 
then?



You're comparing apples and oranges.


I don't disagree to some extent, as you know better, but I am still 
trying to understand it. Shouldn't df -h display the same results to a 
human? I am not arguing, but rather trying to understand. If it is 
designed to be human-readable, why would a human need to know or 
consider the fragment size in use to compare the results?



BTW, you don't say which version(s) you are running. That's bad, since
some bugs were fixed in the -h display. Run df without -h to see the
real numbers.


All run 3.8. Sorry about that.

the 4.6GB have 4870062 * 1024 = 4,986,943,488
www1# df
Filesystem  1K-blocks      Used     Avail Capacity  Mounted on
/dev/wd0a      256814     41464    202510    17%    /
/dev/wd0h     1048158        54    995698     0%    /home
/dev/wd0d     1030550         2    979022     0%    /tmp
/dev/wd0g     5159638    310910   4590748     6%    /usr
/dev/wd0e    25799860   4870062  19639806    20%    /var
/dev/wd0f     1030550      1546    977478     0%    /var/qmail


the 8.1GB have 15967148 * 512 = 8,175,179,776
# df
Filesystem  512-blocks      Used     Avail Capacity  Mounted on
/dev/wd0a       513628     65588    422360    13%    /
/dev/wd0h      1861628        52   1768496     0%    /home
/dev/wd0d      2061100         4   1958044     0%    /tmp
/dev/wd0g      9904156    424544   8984408     5%    /usr
/dev/wd0e     33022236   1537612  29833516     5%    /var
/dev/wd1b     16412252   1937920  13653720    12%    /var/mysql
/dev/wd0f      2061100         4   1958044     0%    /var/qmail
/dev/wd1a     41280348  15967148  23249184    41%    /var/www/sites

The funny part is that the /var above includes more files than the 
/var/www/sites below and still shows less space in use.



To check if the inode/block/fragment free numbers add up, you could
use dumpfs, but that is a hell of a lot of work. 


-Otto



It's not a huge deal and the systems work well; I am just puzzled by 
the results and want to understand them, that's all.