Re: [lustre-discuss] free space on ldiskfs vs. zfs

2015-08-25 Thread Götz Waschk
Dear All,

I'm sorry, I cannot provide verbose zpool information anymore. I was a
bit in a hurry to put the file system into production and that's why I
have reformatted the servers with ldiskfs.

On Tue, Aug 25, 2015 at 5:54 AM, Alexander I Kulyavtsev a...@fnal.gov wrote:
 I was assuming the question was about total space, as I struggled for some
 time to understand why I have 99 TB of total space per OSS after installing
 ZFS Lustre, while the ldiskfs OSTs have 120 TB on the same hardware. The 20%
 difference was partially (10%) accounted for by the different RAID-6 / raidz2
 configuration, but I was not able to explain the other 10%.

 For the question in the original post, I cannot get 24 TB out of the Available
 field of the df output: it shows 207693094400 KiB available on his ZFS Lustre
 and 198082192080 KiB on the ldiskfs Lustre. At the same time, the difference
 in total space is
 233548424256 - 207693153280 = 25855270976 KiB ≈ 24.1 TiB.

 Götz, could you please tell us what you meant by available?


I was comparing the Lustre file system sizes of the two configurations, i.e.
the space available for user data. I expected it to be the same, about 218T,
for both file systems.

I understand that you have the same issue.

Regards, Götz Waschk
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] free space on ldiskfs vs. zfs

2015-08-24 Thread Christopher J. Morrone
If you provide the zpool list -v output it might give us a little 
clearer view of what you have going on.


Chris

On 08/19/2015 06:18 AM, Götz Waschk wrote:

Dear Lustre experts,

I have configured two different Lustre instances, both running Lustre 2.5.3:
one with ldiskfs on hardware RAID-6 and one using ZFS with RAID-Z2, on the
same type of hardware. I was wondering why I have 24 TB less space available
when the same amount of space should go to parity:

  # lfs df
UUID   1K-blocksUsed   Available Use% Mounted on
fs19-MDT_UUID   50322916  47269646494784   1%
/testlustre/fs19[MDT:0]
fs19-OST_UUID51923288320   12672 51923273600   0%
/testlustre/fs19[OST:0]
fs19-OST0001_UUID51923288320   12672 51923273600   0%
/testlustre/fs19[OST:1]
fs19-OST0002_UUID51923288320   12672 51923273600   0%
/testlustre/fs19[OST:2]
fs19-OST0003_UUID51923288320   12672 51923273600   0%
/testlustre/fs19[OST:3]
filesystem summary:  207693153280   50688 207693094400   0% /testlustre/fs19
UUID   1K-blocksUsed   Available Use% Mounted on
fs18-MDT_UUID   47177700  48215243550028   1%
/lustre/fs18[MDT:0]
fs18-OST_UUID58387106064  6014088200 49452733560  11%
/lustre/fs18[OST:0]
fs18-OST0001_UUID58387106064  5919753028 49547068928  11%
/lustre/fs18[OST:1]
fs18-OST0002_UUID58387106064  5944542316 49522279640  11%
/lustre/fs18[OST:2]
fs18-OST0003_UUID58387106064  5906712004 49560109952  11%
/lustre/fs18[OST:3]
filesystem summary:  233548424256 23785095548 198082192080  11% /lustre/fs18

fs18 is using ldiskfs, while fs19 is ZFS:
# zpool list
NAME  SIZE  ALLOC   FREECAP  DEDUP  HEALTH  ALTROOT
lustre-ost165T  18,1M  65,0T 0%  1.00x  ONLINE  -
# zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
lustre-ost1   13,6M  48,7T   311K  /lustre-ost1
lustre-ost1/ost1  12,4M  48,7T  12,4M  /lustre-ost1/ost1


Any idea where my 6 TB per OST went?

Regards, Götz Waschk
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] free space on ldiskfs vs. zfs

2015-08-24 Thread Alexander I Kulyavtsev
Same question here.

6 TB out of 65 TB is roughly 10%. In our case about the same fraction was missing.

My speculation was that somewhere between zpool and Linux a value reported in
TB gets interpreted as TiB and then converted back to TB, or an unneeded
MB-to-MiB conversion is applied twice, etc.

Here are my numbers:
We have 12 * 4 TB drives per pool, which is 48 TB (decimal).
The zpool was created as raidz2 10+2.
zpool reports 43.5T.
The pool size should be either 48 TB = 12 * 4 TB or 40 TB = 10 * 4 TB, depending
on whether zpool shows the space before or after parity.
According to the Oracle ZFS documentation, zpool list returns the total space
without deducting overheads, so zpool should report 48 TB rather than 43.5 TB.

In my case, it looked like a TB vs. TiB conversion or interpretation issue:

48*1000*1000*1000*1000/1024/1024/1024/1024 = 43.65574568510055541992
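
To double-check the unit arithmetic, here is a small Python sketch; only the
43.5T figure comes from zpool, the rest is plain unit conversion:

# Unit check: 12 * 4 TB (decimal) drives expressed in binary units.
TB = 1000**4       # decimal terabyte, as printed on the drive label
TiB = 1024**4      # binary tebibyte, as used by zpool/zfs output

raw = 12 * 4 * TB                  # 48 TB of raw capacity in the pool
print(raw / TiB)                   # ~43.66 TiB, close to the 43.5T that zpool reports
data = 10 * 4 * TB                 # capacity excluding the two raidz2 parity drives
print(data / TiB)                  # ~36.38 TiB of expected data space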


At disk level:

~/sas2ircu 0 display

Device is a Hard disk
  Enclosure # : 2
  Slot #  : 12
  SAS Address : 5003048-0-015a-a918
  State   : Ready (RDY)
  Size (in MB)/(in sectors)   : 3815447/7814037167
  Manufacturer: ATA 
  Model Number: HGST HUS724040AL
  Firmware Revision   : AA70
  Serial No   : PN2334PBJPW14T
  GUID: 5000cca23de6204b
  Protocol: SATA
  Drive Type  : SATA_HDD

One disk size is about 4 TB (decimal):

3815447*1024*1024 = 4000786153472
7814037167*512  = 4000787029504
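
The same drive size in both unit systems, again just a small Python check on
the numbers above (the "MB" field is treated as MiB, which is what the
multiplication above assumes):

# Convert the sas2ircu size fields to decimal TB and binary TiB.
mb_count, sectors = 3815447, 7814037167
print(mb_count * 1024**2)          # 4000786153472 bytes from the MB (really MiB) field
print(sectors * 512)               # 4000787029504 bytes from the sector count
print(sectors * 512 / 1000**4)     # ~4.00 TB (decimal)
print(sectors * 512 / 1024**4)     # ~3.64 TiB (binary)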

The vdev presents the whole disk to the zpool. There is some overhead; some
space is left in partition sdq9.

[root@lfs1 scripts]# head -4 /etc/zfs/vdev_id.conf
alias s0  /dev/disk/by-path/pci-:03:00.0-sas-0x50030480015aa90c-lun-0
alias s1  /dev/disk/by-path/pci-:03:00.0-sas-0x50030480015aa90d-lun-0
alias s2  /dev/disk/by-path/pci-:03:00.0-sas-0x50030480015aa90e-lun-0
alias s3  /dev/disk/by-path/pci-:03:00.0-sas-0x50030480015aa90f-lun-0
...
alias s12  /dev/disk/by-path/pci-:03:00.0-sas-0x50030480015aa918-lun-0
...

[root@lfs1 scripts]# ls -l  /dev/disk/by-path/
...
lrwxrwxrwx 1 root root  9 Jul 23 16:27 pci-:03:00.0-sas-0x50030480015aa918-lun-0 -> ../../sdq
lrwxrwxrwx 1 root root 10 Jul 23 16:27 pci-:03:00.0-sas-0x50030480015aa918-lun-0-part1 -> ../../sdq1
lrwxrwxrwx 1 root root 10 Jul 23 16:27 pci-:03:00.0-sas-0x50030480015aa918-lun-0-part9 -> ../../sdq9

Pool report:

[root@lfs1 scripts]# zpool list
NAMESIZE  ALLOC   FREE  EXPANDSZ   FRAGCAP  DEDUP  HEALTH  ALTROOT
zpla-  43.5T  10.9T  32.6T -16%24%  1.00x  ONLINE  -
zpla-0001  43.5T  11.0T  32.5T -17%25%  1.00x  ONLINE  -
zpla-0002  43.5T  10.8T  32.7T -17%24%  1.00x  ONLINE  -
[root@lfs1 scripts]# 

[root@lfs1 ~]# zpool list -v zpla-0001
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAGCAP  DEDUP  HEALTH  ALTROOT
zpla-0001  43.5T  11.0T  32.5T -17%25%  1.00x  ONLINE  -
  raidz2  43.5T  11.0T  32.5T -17%25%
s12  -  -  - -  -  -
s13  -  -  - -  -  -
s14  -  -  - -  -  -
s15  -  -  - -  -  -
s16  -  -  - -  -  -
s17  -  -  - -  -  -
s18  -  -  - -  -  -
s19  -  -  - -  -  -
s20  -  -  - -  -  -
s21  -  -  - -  -  -
s22  -  -  - -  -  -
s23  -  -  - -  -  -
[root@lfs1 ~]# 

[root@lfs1 ~]# zpool get all zpla-0001
NAME   PROPERTYVALUE   SOURCE
zpla-0001  size43.5T   -
zpla-0001  capacity25% -
zpla-0001  altroot -   default
zpla-0001  health  ONLINE  -
zpla-0001  guid547290297520142 default
zpla-0001  version -   default
zpla-0001  bootfs  -   default
zpla-0001  delegation  on  default
zpla-0001  autoreplace off default
zpla-0001  cachefile   -   default
zpla-0001  failmodewaitdefault
zpla-0001  listsnapshots   off default
zpla-0001  autoexpand  off default
zpla-0001  dedupditto  0   default
zpla-0001  dedupratio  1.00x   

Re: [lustre-discuss] free space on ldiskfs vs. zfs

2015-08-24 Thread Christopher J. Morrone
I could be wrong, but I don't think that the original poster was asking 
why the SIZE field of zpool list was wrong, but rather why the AVAIL 
space in zfs list was lower than he expected.


I would find it easier to answer the question if I knew his drive count 
and drive size.


Chris

On 08/24/2015 02:12 PM, Alexander I Kulyavtsev wrote:

Same question here.

6 TB out of 65 TB is roughly 10%. In our case about the same fraction was missing.


Re: [lustre-discuss] free space on ldiskfs vs. zfs

2015-08-24 Thread Alexander I Kulyavtsev
Hmm,
I was assuming the question was about total space, as I struggled for some time
to understand why I have 99 TB of total space per OSS after installing ZFS
Lustre, while the ldiskfs OSTs have 120 TB on the same hardware. The 20%
difference was partially (10%) accounted for by the different RAID-6 / raidz2
configuration, but I was not able to explain the other 10%.

For the question in the original post, I cannot get 24 TB out of the Available
field of the df output: it shows 207693094400 KiB available on his ZFS Lustre
and 198082192080 KiB on the ldiskfs Lustre. At the same time, the difference
in total space is
233548424256 - 207693153280 = 25855270976 KiB ≈ 24.1 TiB.

Götz, could you please tell us what you meant by available?

Also,
in my case the output of Linux df on the OSS looks strange for the ZFS pool:
the pool mountpoint is reported as 25T in size (why?), while the formatted OST
dataset, which takes all the space in this pool, shows 33T:

[root@lfs1 ~]# df -h  /zpla-  /mnt/OST
Filesystem Size  Used Avail Use% Mounted on
zpla-   25T  256K   25T   1% /zpla-
zpla-/OST   33T  8.3T   25T  26% /mnt/OST
[root@lfs1 ~]# 

in bytes:

[root@lfs1 ~]# df --block-size=1  /zpla-  /mnt/OST
Filesystem 1B-blocks  Used  Available Use% Mounted on
zpla- 26769344561152262144 26769344299008   1% /zpla-
zpla-/OST 35582552834048 9093386076160 26489164660736  26% /mnt/OST
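
As far as I understand, the Size that df shows for a ZFS dataset is essentially
that dataset's Used + Avail rather than the pool size, which would explain the
different totals for the two mountpoints. A quick Python check against the byte
figures above:

# df "Size" for a ZFS dataset is (roughly) that dataset's Used + Avail, not the pool size.
TiB = 1024**4
pool_mnt = 262144 + 26769344299008            # Used + Avail of the pool mountpoint, in bytes
ost_mnt  = 9093386076160 + 26489164660736     # Used + Avail of the OST dataset, in bytes
print(pool_mnt / TiB)   # ~24.3 TiB, shown by df -h as 25T
print(ost_mnt / TiB)    # ~32.4 TiB, shown by df -h as 33T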

The same OST as reported by Lustre:
[root@lfsa scripts]# lfs df 
UUID   1K-blocksUsed   Available Use% Mounted on
lfs-MDT_UUID   974961920  275328   974684544   0% /mnt/lfsa[MDT:0]
lfs-OST_UUID 34748586752  8880259840 25868324736  26% /mnt/lfsa[OST:0]
...

Compare:

[root@lfs1 ~]# zpool list
NAMESIZE  ALLOC   FREE  EXPANDSZ   FRAGCAP  DEDUP  HEALTH  ALTROOT
zpla-  43.5T  10.9T  32.6T -16%24%  1.00x  ONLINE  -
zpla-0001  43.5T  11.0T  32.5T -17%25%  1.00x  ONLINE  -
zpla-0002  43.5T  10.8T  32.7T -17%24%  1.00x  ONLINE  -
I realize zpool list reports the raw disk space, including parity blocks (48 TB,
shown as 43.5T), and everything else (like metadata and space for xattr inodes).

I cannot explain the difference between 40 TB (decimal) of data space (10 * 4 TB
drives) and the 35,582,552,834,048 bytes shown by df for the OST.
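
For what it is worth, here is that back-of-the-envelope comparison in Python,
using only the numbers above; the roughly 11% printed at the end is exactly the
part I cannot account for:

# Compare the expected raidz2 10+2 data space (4 TB drives) with what df reports for the OST.
TB, TiB = 1000**4, 1024**4
expected = 10 * 4 * TB                 # 40 TB (decimal) of non-parity capacity
df_ost   = 35582552834048              # bytes, from "df --block-size=1" for the OST dataset
lfs_ost  = 34748586752 * 1024          # bytes, from the 1K-blocks column of "lfs df"
print(expected / TiB)                  # ~36.4 TiB expected
print(df_ost / TiB)                    # ~32.4 TiB seen by df
print(lfs_ost / TiB)                   # ~32.4 TiB seen by Lustre
print(1 - df_ost / expected)           # ~0.11, i.e. about 11% missing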

Best regards, Alex.

On Aug 24, 2015, at 7:52 PM, Christopher J. Morrone morro...@llnl.gov wrote:

 I could be wrong, but I don't think that the original poster was asking 
 why the SIZE field of zpool list was wrong, but rather why the AVAIL 
 space in zfs list was lower than he expected.
 
 I would find it easier to answer the question if I knew his drive count 
 and drive size.
 
 Chris
 

[lustre-discuss] free space on ldiskfs vs. zfs

2015-08-19 Thread Götz Waschk
Dear Lustre experts,

I have configured two different Lustre instances, both running Lustre 2.5.3:
one with ldiskfs on hardware RAID-6 and one using ZFS with RAID-Z2, on the
same type of hardware. I was wondering why I have 24 TB less space available
when the same amount of space should go to parity:

 # lfs df
UUID   1K-blocksUsed   Available Use% Mounted on
fs19-MDT_UUID   50322916  47269646494784   1%
/testlustre/fs19[MDT:0]
fs19-OST_UUID51923288320   12672 51923273600   0%
/testlustre/fs19[OST:0]
fs19-OST0001_UUID51923288320   12672 51923273600   0%
/testlustre/fs19[OST:1]
fs19-OST0002_UUID51923288320   12672 51923273600   0%
/testlustre/fs19[OST:2]
fs19-OST0003_UUID51923288320   12672 51923273600   0%
/testlustre/fs19[OST:3]
filesystem summary:  207693153280   50688 207693094400   0% /testlustre/fs19
UUID   1K-blocksUsed   Available Use% Mounted on
fs18-MDT_UUID   47177700  48215243550028   1%
/lustre/fs18[MDT:0]
fs18-OST_UUID58387106064  6014088200 49452733560  11%
/lustre/fs18[OST:0]
fs18-OST0001_UUID58387106064  5919753028 49547068928  11%
/lustre/fs18[OST:1]
fs18-OST0002_UUID58387106064  5944542316 49522279640  11%
/lustre/fs18[OST:2]
fs18-OST0003_UUID58387106064  5906712004 49560109952  11%
/lustre/fs18[OST:3]
filesystem summary:  233548424256 23785095548 198082192080  11% /lustre/fs18
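
Here is a quick arithmetic check of that 24 TB figure (strictly speaking TiB),
in Python, using only the lfs df numbers above:

# Per-OST and total difference between the ldiskfs (fs18) and ZFS (fs19) OSTs.
KiB, TiB = 1024, 1024**4
ldiskfs_ost = 58387106064      # 1K-blocks per fs18 OST
zfs_ost     = 51923288320      # 1K-blocks per fs19 OST
per_ost = (ldiskfs_ost - zfs_ost) * KiB
print(per_ost / TiB)           # ~6.0 TiB less per OST
print(4 * per_ost / TiB)       # ~24.1 TiB less in total over the four OSTs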

fs18 is using ldiskfs, while fs19 is ZFS:
# zpool list
NAME  SIZE  ALLOC   FREECAP  DEDUP  HEALTH  ALTROOT
lustre-ost165T  18,1M  65,0T 0%  1.00x  ONLINE  -
# zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
lustre-ost1   13,6M  48,7T   311K  /lustre-ost1
lustre-ost1/ost1  12,4M  48,7T  12,4M  /lustre-ost1/ost1


Any idea where my 6 TB per OST went?

Regards, Götz Waschk
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org