Re: Large single raid and XFS or two small ones and EXT3?

2006-06-26 Thread Justin Piszcz



On Sun, 25 Jun 2006, Bill Davidsen wrote:


Justin Piszcz wrote:



On Sat, 24 Jun 2006, Neil Brown wrote:


On Friday June 23, [EMAIL PROTECTED] wrote:


The problem is that there is no cost effective backup available.



One-liner questions:
- How does Google make backups?



No, Google ARE the backups :-)


- Aren't tapes dead yet?



LTO-3 does 300Gig, and LTO-4 is planned.
They may not cope with tera-byte arrays in one hit, but they still
have real value.


- What about a NUMA principle applied to storage?



You mean an Hierarchical Storage Manager?  Yep, they exist.  I'm sure
SGI, EMC and assorted other TLAs could sell you one.



LTO3 is 400GB native and we've seen very good compression, so 800GB-1TB per tape.


The problem is that in small business use, LTO3 is costly in the 1-10TB range, and
takes a lot of media changes as well. A TB of RAID-5 is ~$500, and at that
small size the cost of drives and media is disproportionately high. Using more
drives is cost-effective, but they are not good for long-term off-site
storage, because they're large and fragile.


No obvious solutions in that price and application range that I see.

--
bill davidsen [EMAIL PROTECTED]
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979



In the 1-10TB range you are probably correct; as the numbers increase,
however, many LTO2/LTO3 drives + robotics become justifiable.



Re: Large single raid and XFS or two small ones and EXT3?

2006-06-26 Thread Bill Davidsen

Adam Talbot wrote:

Not exactly sure how to tune for stripe size. 
What would you advise?

-Adam
 



See the -R option of mke2fs. I don't have a number for the performance
impact of this, but I bet someone else on the list will. Depending on
what posts you read, reports range from measurable to significant,
though rarely quantified.
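
As a minimal sketch of that knob (device name illustrative; assuming the
64K-chunk array and 4K filesystem blocks discussed later in this thread):

  # stride = chunk size / block size = 64K / 4K = 16
  mke2fs -j -b 4096 -R stride=16 /dev/md0

Newer e2fsprogs spell this -E stride=16 instead of -R.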


Note, next month I will set up either a 2x750 RAID-1 or 4x250 RAID-5
array, and if I go RAID-5 I will have the chance to run some metrics
before putting the hardware into production service. I'll report on the
-R option if I have any data.




Bill Davidsen wrote:
 


Adam Talbot wrote:

   


OK, this topic I really need to get in on.
I have spent the last few weeks benchmarking my new 1.2TB, 6-disk, RAID6
array. I wanted real numbers, not "this FS is faster because..." I have
moved over 100TB of data on my new array running the benchmark
testing.  I have yet to have any major problems with ReiserFS, EXT2/3,
JFS, or XFS.  I have done extensive testing on all, including just
trying to break the file system with billions of 1KB files, or a 1TB
file. I was able to cause some problems with EXT3 and ReiserFS with the
1KB and 1TB tests, respectively, but both were fixed with an fsck. My
basic test is to move all data from my old server to my new server
(whitequeen2) and clock the transfer time.  Whitequeen2 has very little
storage.  The NAS's 1.2TB of storage is attached via iSCSI and a
crossover cable to the back of whitequeen2.  The data is 100GB of users'
files (1KB~2MB), 50GB of MP3s (1MB~5MB), and the rest is movies and
system backups (600MB~2GB).  Here is a copy of my current data sheet,
including specs on the servers and copy times; my numbers are not
perfect, but they should give you a clue about speeds...  XFS wins.


 


In many (most?) cases I'm a lot more concerned about filesystem
stability than performance. That is, I want the fastest reliable
filesystem. With ext2 and ext3 I've run multiple multi-TB machines
spread over four time zones, and not had a f/s problem updating ~1TB/day.

   


Did you tune the extN filesystems to the stripe size of the raid?

   




--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Large single raid and XFS or two small ones and EXT3?

2006-06-25 Thread Joshua Baker-LePain

On Sat, 24 Jun 2006 at 3:52pm, Adam Talbot wrote


nas tmp # time tar cf - . | (cd /data ; tar xf - )


A (bit) cleaner way to accomplish the same thing:

tar cf - --totals . | tar xC /data -f -
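
(--totals makes tar print the total number of bytes written when it
finishes, which is handy for turning the elapsed time into MB/sec.)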

--
Joshua Baker-LePain
Department of Biomedical Engineering
Duke University


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-25 Thread Adam Talbot
ACK!
At one point someone stated that they were having problems with XFS
crashing under high NFS loads...  Did it look something like this?
-Adam

Starting XFS recovery on filesystem: md0 (logdev: internal)
Filesystem md0: XFS internal error xlog_valid_rec_header(1) at line
3478 of file fs/xfs/xfs_log_recover.c.  Caller 0x802114fc

Call Trace: 80211437{xlog_valid_rec_header+231}
   802114fc{xlog_do_recovery_pass+172}
8020f0c8{xlog_find_tail+2344}
   802217e1{kmem_alloc+97}
80211bb0{xlog_recover+192}
   8020c564{xfs_log_mount+1380}
80213968{xfs_mountfs+2712}
   8016aa3a{set_blocksize+138}
80224d1d{xfs_setsize_buftarg_flags+61}
   802192b4{xfs_mount+2724}
8022ae00{linvfs_fill_super+0}
   8022aeb8{linvfs_fill_super+184}
8024a62e{strlcpy+78}
   80169db2{sget+722} 8016a460{set_bdev_super+0}
   8022ae00{linvfs_fill_super+0}
8022ae00{linvfs_fill_super+0}
   8016a5bc{get_sb_bdev+268}
8016a84b{do_kern_mount+107}
   8017eed3{do_mount+1603}
8011a2f9{do_page_fault+1033}
   80145f66{find_get_pages+22}
8014d57a{invalidate_mapping_pages+202}
   80149f99{__alloc_pages+89}
8014a234{__get_free_pages+52}
   8017f257{sys_mount+151} 8010a996{system_call+126}
XFS: log mount/recovery failed: error 990
XFS: log mount failed


Adam Talbot wrote:
 Trying to test for tuning with different chunk sizes.  Just finished the
 16K chunk and am about 20% done with the 32K test.  Here are the numbers
 on the 16K chunk; I will send 32, 96, 128, 192 and 256 as I get them.  But
 keep in mind each one of these tests takes about 4~6 hours, so it is a
 slow process...  I have settled on XFS as the file system type; it seems
 to be able to beat anything else out there.
 -Adam

 XFS
 Config=NAS+NFS
 RAID6 16K chunk
 nas tmp # time tar cf - . | (cd /data ; tar xf - )
 real    252m40.143s
 user    1m4.720s
 sys     25m6.270s
 /dev/md/0 1.1T  371G  748G  34% /data
 4.207 hours @ 90,167M/hour or 1502M/min or 25.05M/sec




 David Greaves wrote:
   
 Adam Talbot wrote:
 OK, this topic I really need to get in on.
 I have spent the last few weeks benchmarking my new 1.2TB, 6-disk, RAID6
 array.
 Very interesting. Thanks.

 Did you get around to any 'tuning'?
 Things like raid chunk size, external logs for xfs, blockdev readahead
 on the underlying devices and the raid device?

 David


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-25 Thread David Rees

On 6/23/06, Nix [EMAIL PROTECTED] wrote:

On 23 Jun 2006, PFC suggested tentatively:
   - ext3 is slow if you have many files in one directory, but has
   more mature tools (resize, recovery etc)

This is much less true if you turn on the dir_index feature.


However, even with dir_index, deleting large files is still much
slower with ext2/3 than xfs or jfs.

-Dave


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-25 Thread Chris Allen

Adam Talbot wrote:

ACK!
At one point some one stated that they were having problems with XFS
crashing under high NFS loads...  Did it look something like this?
-Adam

  
  


Nope, it looked like the trace below - and I could make it happen
consistently by thrashing XFS.

Not even sure it was over NFS - this could well have been a local test.


--

do_IRQ: stack overflow: 304
Unable to handle kernel paging request at virtual address a554b923
printing eip:
c011b202
*pde = 
Oops:  [#1]
SMP
Modules linked in: nfsd(U) lockd(U) md5(U) ipv6(U) autofs4(U) sunrpc(U) 
xfs(U) exportfs(U) video(U) button(U) battery(U) ac(U) uhci_hcd(U) 
ehci_hcd(U) i2c_i801(U) i2c_core(U) shpchp(U) e1000(U) floppy(U) 
dm_snapshot(U) dm_zero(U) dm_mirror(U) ext3(U) jbd(U) raid5(U) xor(U) 
dm_mod(U) ata_piix(U) libata(U) aar81xx(U) sd_mod(U) scsi_mod(U)

CPU:10
EIP:0060:[c011b202]Tainted: P  VLI
EFLAGS: 00010086   (2.6.11-2.6.11)
EIP is at activate_task+0x34/0x9b
eax: e514b703   ebx:    ecx: 028f8800   edx: c0400200
esi: 028f8800   edi: 000f4352   ebp: f545d02c   esp: f545d018
ds: 007b   es: 007b   ss: 0068
Process  (pid: 947105536, threadinfo=f545c000 task=f5a27000)
Stack: badc0ded c3630160 f7ae4a80 c0400200 f7ae4a80 c3630160 f545d074 
c011b785
   c0220f39 0001 0086  0001 0003 
f7ae4a80
  0082 0001 000a  c02219da f7d7cf60 c035d914 


Call Trace:
[c011b785] try_to_wake_up+0x24a/0x2aa
[c0220f39] scrup+0xcf/0xd9
[c02219da] set_cursor+0x4f/0x60
[c01348b0] autoremove_wake_function+0x15/0x37
[c011d197] __wake_up_common+0x39/0x59
[c011d1e9] __wake_up+0x32/0x43
[c0121e2c] release_console_sem+0xad/0xb5
[c0121c48] vprintk+0x1e7/0x29e
[c0121a5d] printk+0x1b/0x1f
[c010664b] do_IRQ+0x7f/0x86
[c0104a3e] common_interrupt+0x1a/0x20
[c024b5fa] cfq_may_queue+0x0/0xcd
[c02425e4] get_request+0xf2/0x2b7
[c02430cc] __make_request+0xbe/0x472
[c024375b] generic_make_request+0x91/0x234
[f881be38] compute_blocknr+0xe5/0x16e [raid5]
[c013489b] autoremove_wake_function+0x0/0x37
[f881d0c2] handle_stripe+0x736/0x109e [raid5]
[f881b45a] get_active_stripe+0x1fb/0x36c [raid5]
[f881deed] make_request+0x2e1/0x30d [raid5]
[c013489b] autoremove_wake_function+0x0/0x37
[c024375b] generic_make_request+0x91/0x234
[c03054e1] schedule+0x431/0xc5e
[c024a3f4] cfq_sort_rr_list+0x9b/0xe6
[c0148c27] buffered_rmqueue+0xc4/0x1fb
[c013489b] autoremove_wake_function+0x0/0x37
[c0243944] submit_bio+0x46/0xcc
[c0147aae] mempool_alloc+0x6f/0x108
[c013489b] autoremove_wake_function+0x0/0x37
[c0166696] bio_add_page+0x26/0x2c
[f9419fe7] _pagebuf_ioapply+0x175/0x2e3 [xfs]
[f941a185] pagebuf_iorequest+0x30/0x133 [xfs]
[f9419643] xfs_buf_get_flags+0xe8/0x147 [xfs]
[f9419d45] pagebuf_iostart+0x76/0x82 [xfs]
[f9419707] xfs_buf_read_flags+0x65/0x89 [xfs]
[f940c105] xfs_trans_read_buf+0x122/0x334 [xfs]
[f93d9dc2] xfs_btree_read_bufs+0x7d/0x97 [xfs]
[f93c0d7a] xfs_alloc_lookup+0x326/0x47b [xfs]
[f93bc96b] xfs_alloc_fixup_trees+0x14f/0x320 [xfs]
[f93d99d9] xfs_btree_init_cursor+0x1d/0x17f [xfs]
[f93bdc38] xfs_alloc_ag_vextent_size+0x377/0x456 [xfs]
[f93bcbdb] xfs_alloc_read_agfl+0x9f/0xb9 [xfs]
[f93bccf5] xfs_alloc_ag_vextent+0x100/0x102 [xfs]
[f93be929] xfs_alloc_fix_freelist+0x2ca/0x478 [xfs]
[f93bf087] xfs_alloc_vextent+0x182/0x570 [xfs]
[f93cdff3] xfs_bmap_alloc+0x111e/0x18e9 [xfs]
[c013489b] autoremove_wake_function+0x0/0x37
[c024375b] generic_make_request+0x91/0x234
[f891eb40] EdmaReqQueueInsert+0x70/0x80 [aar81xx]
[c011cf79] scheduler_tick+0x236/0x40f
[c011cf79] scheduler_tick+0x236/0x40f
[f93d833e] xfs_bmbt_get_state+0x13/0x1c [xfs]
[f93cfebf] xfs_bmap_do_search_extents+0xc3/0x476 [xfs]
[f93d1b9f] xfs_bmapi+0x72a/0x1670 [xfs]
[f93d833e] xfs_bmbt_get_state+0x13/0x1c [xfs]
[f93ffdf7] xlog_grant_log_space+0x329/0x350 [xfs]
[f93fb3d0] xfs_iomap_write_allocate+0x2d1/0x572 [xfs]
[c0243944] submit_bio+0x46/0xcc
[c0147aae] mempool_alloc+0x6f/0x108
[f93fa368] xfs_iomap+0x3ef/0x50c [xfs]
[f94173fd] xfs_map_blocks+0x39/0x71 [xfs]
[f94183b3] xfs_page_state_convert+0x4b9/0x6ab [xfs]
[f9418b1d] linvfs_writepage+0x57/0xd5 [xfs]
[c014e71d] pageout+0x84/0x101
[c014ea1b] shrink_list+0x281/0x454
[c014db1b] __pagevec_lru_add+0xac/0xbb
[c014ed82] shrink_cache+0xe7/0x26c
[c014f33f] shrink_zone+0x76/0xbb
[c014f3e5] shrink_caches+0x61/0x6f
[c014f4b8] try_to_free_pages+0xc5/0x18d
[c0148fbb] __alloc_pages+0x1cc/0x407
[c014674a] generic_file_buffered_write+0x148/0x60c
[c0180ee8] __mark_inode_dirty+0x28/0x199
[f941f444] xfs_write+0xa36/0xd03 [xfs]
[f941b89d] linvfs_write+0xe9/0x102 [xfs]
[c013489b] autoremove_wake_function+0x0/0x37
[c014294d] audit_syscall_entry+0x10b/0x15e
[f941b7b4] linvfs_write+0x0/0x102 [xfs]
[c0161a27] vfs_write+0x9e/0x110
[c0161b44] sys_write+0x41/0x6a
[c0104009] syscall_call+0x7/0xb
Code: 89 45 f0 89 55 ec 89 cb e8 24 57 ff ff 89 c6 89 d7 85 db 75 27 ba 
00 02 40 c0 b8 00 f0 ff ff 21 e0 8b 40 10 8b 04 85 20 50 40 c0 2b 74 
02 20 1b 7c 02 24 8b 45 ec 03 70 20 13 78 24 89 

Re: Large single raid and XFS or two small ones and EXT3?

2006-06-25 Thread Bill Davidsen

Adam Talbot wrote:


OK, this topic I really need to get in on.
I have spent the last few weeks benchmarking my new 1.2TB, 6-disk, RAID6
array. I wanted real numbers, not "this FS is faster because..." I have
moved over 100TB of data on my new array running the benchmark
testing.  I have yet to have any major problems with ReiserFS, EXT2/3,
JFS, or XFS.  I have done extensive testing on all, including just
trying to break the file system with billions of 1KB files, or a 1TB
file. I was able to cause some problems with EXT3 and ReiserFS with the
1KB and 1TB tests, respectively, but both were fixed with an fsck. My
basic test is to move all data from my old server to my new server
(whitequeen2) and clock the transfer time.  Whitequeen2 has very little
storage.  The NAS's 1.2TB of storage is attached via iSCSI and a
crossover cable to the back of whitequeen2.  The data is 100GB of users'
files (1KB~2MB), 50GB of MP3s (1MB~5MB), and the rest is movies and
system backups (600MB~2GB).  Here is a copy of my current data sheet,
including specs on the servers and copy times; my numbers are not
perfect, but they should give you a clue about speeds...  XFS wins.
 



In many (most?) cases I'm a lot more concerned about filesystem 
stability than performance. That is, I want the fastest reliable 
filesystem. With ext2 and ext3 I've run multiple multi-TB machines 
spread over four time zones, and not had a f/s problem updating ~1TB/day.



The computer: whitequeen2
AMD Athlon64 3200 (2.0GHz)
1GB Corsair DDR 400 (2x 512MB running in dual DDR mode)
Foxconn 6150K8MA-8EKRS motherboard
Off-brand case/power supply
2x OS disks, software RAID array, RAID 1, Maxtor 51369U3, FW DA620CQ0
Intel Pro/1000 NIC
CentOS 4.3, x86_64, 2.6.9
   Main app server: Apache, Samba, NFS, NIS

The computer: nas
AMD Athlon64 3000 (1.8GHz)
256MB Corsair DDR 400 (2x 128MB running in dual DDR mode)
Foxconn 6150K8MA-8EKRS motherboard
Off-brand case/power supply and drive cages
2x OS disks, software RAID array, RAID 1, Maxtor 51369U3, FW DA620CQ0
6x disks, software RAID array, RAID 6, Maxtor 7V300F0, FW VA111900
Gentoo Linux, x86_64, 2.6.16-gentoo-r9
   System built very light, only built as an iSCSI-based NAS.

EXT3
Config=APP+NFS--NAS+iSCSI
RAID6 64K chunk
[EMAIL PROTECTED] tmp]# time tar cf - . | (cd /data ; tar xf - )
real    371m29.802s
user    1m28.492s
sys     46m48.947s
/dev/sdb1 1.1T  371G  674G  36% /data
6.192 hours @ 61,262M/hour or 1021M/min or 17.02M/sec


EXT2
Config=APP+NFS--NAS+iSCSI
RAID6 64K chunk
[EMAIL PROTECTED] tmp]# time tar cf - . | ( cd /data/ ; tar xf - )
real    401m48.702s
user    1m25.599s
sys     30m22.620s
/dev/sdb1 1.1T  371G  674G  36% /data
6.692 hours @ 56,684M/hour or 945M/min or 15.75M/sec


Did you tune the extN filesystems to the stripe size of the raid?

--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Large single raid and XFS or two small ones and EXT3?

2006-06-25 Thread Bill Davidsen

Justin Piszcz wrote:



On Sat, 24 Jun 2006, Neil Brown wrote:


On Friday June 23, [EMAIL PROTECTED] wrote:


The problem is that there is no cost effective backup available.



One-liner questions:
- How does Google make backups?



No, Google ARE the backups :-)


- Aren't tapes dead yet?



LTO-3 does 300Gig, and LTO-4 is planned.
They may not cope with tera-byte arrays in one hit, but they still
have real value.


- What about a NUMA principle applied to storage?



You mean an Hierarchical Storage Manager?  Yep, they exist.  I'm sure
SGI, EMC and assorted other TLAs could sell you one.



LTO3 is 400GB native and we've seen very good compression, so
800GB-1TB per tape.


The problem is that in small business use, LTO3 is costly in the 1-10TB
range, and takes a lot of media changes as well. A TB of RAID-5 is
~$500, and at that small size the cost of drives and media is
disproportionately high. Using more drives is cost-effective, but they
are not good for long-term off-site storage, because they're large and
fragile.


No obvious solutions in that price and application range that I see.

--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Large single raid and XFS or two small ones and EXT3?

2006-06-25 Thread Adam Talbot
Not exactly sure how to tune for stripe size. 
What would you advise?
-Adam


Bill Davidsen wrote:
 Adam Talbot wrote:

 OK, this topic I really need to get in on.
 I have spent the last few weeks benchmarking my new 1.2TB, 6-disk, RAID6
 array. I wanted real numbers, not "this FS is faster because..." I have
 moved over 100TB of data on my new array running the benchmark
 testing.  I have yet to have any major problems with ReiserFS, EXT2/3,
 JFS, or XFS.  I have done extensive testing on all, including just
 trying to break the file system with billions of 1KB files, or a 1TB
 file. I was able to cause some problems with EXT3 and ReiserFS with the
 1KB and 1TB tests, respectively, but both were fixed with an fsck. My
 basic test is to move all data from my old server to my new server
 (whitequeen2) and clock the transfer time.  Whitequeen2 has very little
 storage.  The NAS's 1.2TB of storage is attached via iSCSI and a
 crossover cable to the back of whitequeen2.  The data is 100GB of users'
 files (1KB~2MB), 50GB of MP3s (1MB~5MB), and the rest is movies and
 system backups (600MB~2GB).  Here is a copy of my current data sheet,
 including specs on the servers and copy times; my numbers are not
 perfect, but they should give you a clue about speeds...  XFS wins.
  


 In many (most?) cases I'm a lot more concerned about filesystem
 stability than performance. That is, I want the fastest reliable
 filesystem. With ext2 and ext3 I've run multiple multi-TB machines
 spread over four time zones, and not had a f/s problem updating ~1TB/day.

 The computer: whitequeen2
 AMD Athlon64 3200 (2.0GHz)
 1GB Corsair DDR 400 (2x 512MB running in dual DDR mode)
 Foxconn 6150K8MA-8EKRS motherboard
 Off-brand case/power supply
 2x OS disks, software RAID array, RAID 1, Maxtor 51369U3, FW DA620CQ0
 Intel Pro/1000 NIC
 CentOS 4.3, x86_64, 2.6.9
    Main app server: Apache, Samba, NFS, NIS

 The computer: nas
 AMD Athlon64 3000 (1.8GHz)
 256MB Corsair DDR 400 (2x 128MB running in dual DDR mode)
 Foxconn 6150K8MA-8EKRS motherboard
 Off-brand case/power supply and drive cages
 2x OS disks, software RAID array, RAID 1, Maxtor 51369U3, FW DA620CQ0
 6x disks, software RAID array, RAID 6, Maxtor 7V300F0, FW VA111900
 Gentoo Linux, x86_64, 2.6.16-gentoo-r9
    System built very light, only built as an iSCSI-based NAS.

 EXT3
 Config=APP+NFS--NAS+iSCSI
 RAID6 64K chunk
 [EMAIL PROTECTED] tmp]# time tar cf - . | (cd /data ; tar xf - )
 real    371m29.802s
 user    1m28.492s
 sys     46m48.947s
 /dev/sdb1 1.1T  371G  674G  36% /data
 6.192 hours @ 61,262M/hour or 1021M/min or 17.02M/sec


 EXT2
 Config=APP+NFS--NAS+iSCSI
 RAID6 64K chunk
 [EMAIL PROTECTED] tmp]# time tar cf - . | ( cd /data/ ; tar xf - )
 real    401m48.702s
 user    1m25.599s
 sys     30m22.620s
 /dev/sdb1 1.1T  371G  674G  36% /data
 6.692 hours @ 56,684M/hour or 945M/min or 15.75M/sec

 Did you tune the extN filesystems to the stripe size of the raid?




Re: Large single raid and XFS or two small ones and EXT3?

2006-06-24 Thread Adam Talbot
OK, this topic I really need to get in on.
I have spent the last few weeks benchmarking my new 1.2TB, 6-disk, RAID6
array. I wanted real numbers, not "this FS is faster because..." I have
moved over 100TB of data on my new array running the benchmark
testing.  I have yet to have any major problems with ReiserFS, EXT2/3,
JFS, or XFS.  I have done extensive testing on all, including just
trying to break the file system with billions of 1KB files, or a 1TB
file. I was able to cause some problems with EXT3 and ReiserFS with the
1KB and 1TB tests, respectively, but both were fixed with an fsck. My
basic test is to move all data from my old server to my new server
(whitequeen2) and clock the transfer time.  Whitequeen2 has very little
storage.  The NAS's 1.2TB of storage is attached via iSCSI and a
crossover cable to the back of whitequeen2.  The data is 100GB of users'
files (1KB~2MB), 50GB of MP3s (1MB~5MB), and the rest is movies and
system backups (600MB~2GB).  Here is a copy of my current data sheet,
including specs on the servers and copy times; my numbers are not
perfect, but they should give you a clue about speeds...  XFS wins.

The computer: whitequeen2
AMD Athlon64 3200 (2.0GHz)
1GB Corsair DDR 400 (2x 512MB running in dual DDR mode)
Foxconn 6150K8MA-8EKRS motherboard
Off-brand case/power supply
2x OS disks, software RAID array, RAID 1, Maxtor 51369U3, FW DA620CQ0
Intel Pro/1000 NIC
CentOS 4.3, x86_64, 2.6.9
   Main app server: Apache, Samba, NFS, NIS

The computer: nas
AMD Athlon64 3000 (1.8GHz)
256MB Corsair DDR 400 (2x 128MB running in dual DDR mode)
Foxconn 6150K8MA-8EKRS motherboard
Off-brand case/power supply and drive cages
2x OS disks, software RAID array, RAID 1, Maxtor 51369U3, FW DA620CQ0
6x disks, software RAID array, RAID 6, Maxtor 7V300F0, FW VA111900
Gentoo Linux, x86_64, 2.6.16-gentoo-r9
   System built very light, only built as an iSCSI-based NAS.

NFS mount from whitequeen (old server) goes to /mnt/tmp
Target iSCSI to NAS, or when running on local NAS, is /data

Raw dump to /dev/null (speed mark: how fast is the old whitequeen? Read test)
Config=APP+NFS--/dev/null
[EMAIL PROTECTED] tmp]# time tar cf - . | cat - > /dev/null
real    216m30.621s
user    1m24.222s
sys     15m20.031s
3.6 hours @ 105,371M/hour or 1756M/min or *29.27M/sec*

XFS
Config=APP+NFS--NAS+iSCSI
RAID6 64K chunk
[EMAIL PROTECTED] tmp]# time tar cf - . | (cd /data ; tar xf - )
real    323m9.990s
user    1m28.556s
sys     31m6.405s
/dev/sdb1 1.1T  371G  748G  34% /data
5.399 hours @ 70,260M/hour or 1171M/min or 19.52M/sec

Pass 2 of XFS (are my numbers repeatable? Yes)
real    320m11.615s
user    1m26.997s
sys     31m11.987s

XFS (direct NFS connection, no app server; max real-world speed of my array?)
Config=NAS+NFS
RAID6 64K chunk
nas tmp # time tar cf - . | (cd /data ; tar xf - )
real    241m8.698s
user    1m2.760s
sys     25m9.770s
/dev/md/0 1.1T  371G  748G  34% /data
4.417 hours @ 85,880M/hour or 1431M/min or *23.86M/sec*


EXT3
Config=APP+NFS--NAS+iSCSI
RAID6 64K chunk
[EMAIL PROTECTED] tmp]# time tar cf - . | (cd /data ; tar xf - )
real    371m29.802s
user    1m28.492s
sys     46m48.947s
/dev/sdb1 1.1T  371G  674G  36% /data
6.192 hours @ 61,262M/hour or 1021M/min or 17.02M/sec


EXT2
Config=APP+NFS--NAS+iSCSI
RAID6 64K chunk
[EMAIL PROTECTED] tmp]# time tar cf - . | ( cd /data/ ; tar xf - )
real    401m48.702s
user    1m25.599s
sys     30m22.620s
/dev/sdb1 1.1T  371G  674G  36% /data
6.692 hours @ 56,684M/hour or 945M/min or 15.75M/sec


JFS
Config=APP+NFS--NAS+iSCSI
RAID6 64K chunk
[EMAIL PROTECTED] tmp]# time tar cf - . | (cd /data ; tar xf - )
real    337m52.125s
user    1m26.526s
sys     32m33.983s
/dev/sdb1 1.1T  371G  748G  34% /data
5.625 hours @ 67,438M/hour or 1124M/min or 18.73M/sec


ReiserFS
Config=APP+NFS--NAS+iSCSI
RAID6 64K chunk
[EMAIL PROTECTED] tmp]# time tar cf - . | (cd /data ; tar xf - )
real    334m33.615s
user    1m31.098s
sys     48m41.193s
/dev/sdb1 1.1T  371G  748G  34% /data
5.572 hours @ 68,078M/hour or 1135M/min or 18.91M/sec


Word count
[EMAIL PROTECTED] tmp]# ls | wc
66612  301527 5237755

Actual size = 379,336M


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-24 Thread David Greaves
Adam Talbot wrote:
 OK, this topic I really need to get in on.
 I have spent the last few weeks benchmarking my new 1.2TB, 6-disk, RAID6
 array.
Very interesting. Thanks.

Did you get around to any 'tuning'?
Things like raid chunk size, external logs for xfs, blockdev readahead
on the underlying devices and the raid device?
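
For concreteness, a sketch of those knobs (device names illustrative):

  # readahead, in 512-byte sectors, on a component disk and on the array
  blockdev --setra 1024 /dev/sda
  blockdev --setra 4096 /dev/md0
  # an external XFS log on a separate spindle
  mkfs.xfs -l logdev=/dev/sdc1,size=32m /dev/md0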

David


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread PFC


	- XFS is faster and fragments less, but make sure you have a good UPS
	- ReiserFS 3.6 is mature and fast, too; you might consider it
	- ext3 is slow if you have many files in one directory, but has more
mature tools (resize, recovery etc)


I'd go with XFS or Reiser.



Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Francois Barre

2006/6/23, PFC [EMAIL PROTECTED]:


- XFS is faster and fragments less, but make sure you have a good UPS

Why a good UPS? XFS has a good strong journal; I never had an issue
with it yet... And believe me, I did have some dirty things happening
here...


- ReiserFS 3.6 is mature and fast, too, you might consider it
- ext3 is slow if you have many files in one directory, but has more
mature tools (resize, recovery etc)

XFS tools are kind of mature also. Online grow, dump, ...



I'd go with XFS or Reiser.

I'd go with XFS. But I may be kind of fanatic...


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Gordon Henderson
On Fri, 23 Jun 2006, Chris Allen wrote:

 Strange that, whatever the filesystem, you get equal numbers of people
 saying they have never lost a single byte and people who have had
 horrible corruption and would never touch it again. We stopped using XFS
 about a year ago because we were getting kernel stack space panics under
 heavy load over NFS. It looks like the time has come to give it another
 try.

I had a bad experience with XFS a year or so ago, and after getting told
to RTFM by the XFS users list when I'd already RTFMd, I gave up on it
(and them).

However, I've just decided to give it a go again (for the single reason
that it's faster at deleting large swathes of files than ext3, which this
server might have to do from time to time), and so far, so good.

Looking back, what I think I was really having problems with at the time
were two issues: one was that I was using LVM too, and it really wasn't
production-ready, and the other was that the default kernel stack size was
4KB at the time - which was what was causing me problems under heavy NFS
load...

I'm trying it now on a 3.5TB RAID-6 server with a relatively light NFS
(and Samba) load, but will be rolling it out on an identical server soon
which is expected to have a relatively high load, so here's hoping...

Gordon


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Francois Barre

Strange that, whatever the filesystem, you get equal numbers of people
saying they have never lost a single byte and people who have had
horrible corruption and would never touch it again.

[...]

Losing data is worse than losing anything else. You can buy yourself
another hard drive, you can buy yourself another CPU, but you can't buy
back the data you lost... And as far as I know, real life does not
implement an Undo button.
So, as a matter of fact, I have started to think that choosing an FS is
much more a matter of personal belief than any kind of scientific,
statistical, or even empirical benchmarking. Something like a new kind of
religion...

For example, back in reiser3.6's first steps in life, I
experienced a handful of oopses, and fuzzy things that made my box
think it was running Redmond stuff... So I neglected Reiser.
Then Reiser4 concepts came to my ears, several years later, and I
thought that, well, you know, Hans Reiser has great ideas and
promising theories, let's have a closer look at it.
So I came back and tested reiser3.6. Which just worked flawlessly.
And you know what? I never had time to play with Reiser4 yet.

So I finally chose XFS for all my more-than-2GB partitions, with regard
to the contents of the thread I started back in January: "Linux MD raid5
and reiser4... Any experience?".

Anyway, I'm torn between two points of view regarding the FS
experience in Linux:
- maybe an FS cannot be generic, and cannot cover all usage scenarios.
Some are good for doing some stuff, some are better for others...
And you'll have to choose with regard to your own usage forecasts.
- or maybe there's too much choice in there: whenever a big problem
arises, it's easier to switch filesystems than to go bug hunting... At
least that's the way I reacted a couple of times. And because data
loss is such a sensitive topic, when trust is broken, you just want to
change all the stuff around, and start hating what you were fond of a
minute ago...


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Al Boldi
Chris Allen wrote:
 Francois Barre wrote:
  2006/6/23, PFC [EMAIL PROTECTED]:
  - XFS is faster and fragments less, but make sure you have a
  good UPS
 
  Why a good UPS ? XFS has a good strong journal, I never had an issue
  with it yet... And believe me, I did have some dirty things happening
  here...
 
  - ReiserFS 3.6 is mature and fast, too, you might consider it
  - ext3 is slow if you have many files in one directory, but
  has more
  mature tools (resize, recovery etc)
 
  XFS tools are kind of mature also. Online grow, dump, ...
 
  I'd go with XFS or Reiser.
 
  I'd go with XFS. But I may be kind of fanatic...

 Strange that, whatever the filesystem, you get equal numbers of people
 saying they have never lost a single byte and people who have had
 horrible corruption and would never touch it again. We stopped using XFS
 about a year ago because we were getting kernel stack space panics under
 heavy load over NFS. It looks like the time has come to give it another
 try.

If you are keen on data integrity then don't touch any fs w/o data=ordered.

ext3 is still king wrt data=ordered, albeit slow.
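
For reference, ext3's journalling mode is chosen at mount time; a quick
sketch (device and mount point illustrative):

  mount -t ext3 -o data=writeback /dev/md0 /data  # metadata-only journalling
  mount -t ext3 -o data=ordered   /dev/md0 /data  # default: data hits disk before metadata commits
  mount -t ext3 -o data=journal   /dev/md0 /data  # data goes through the journal too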

Now XFS is fast, but doesn't support data=ordered.  It seems that their 
solution to the problem is to pass the burden onto hw by using barriers.  
Maybe XFS can get away with this.  Maybe.

Thanks!

--
Al



Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Martin Schröder

2006/6/23, Francois Barre [EMAIL PROTECTED]:

Losing data is worse than losing anything else. You can buy yourself


That's why RAID is no excuse for backups.

Best
   Martin


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Francois Barre

That's why RAID is no excuse for backups.


Of course yes, but...
(I work in the car industry.) RAID is your active (if not pro-active)
safety system, like a car's ESP; if something goes wrong, it
gracefully and automagically re-aligns to the *safe way*. Whereas
backup is your airbag. It's always too late when you use it.
And I've never seen anyone trying to recover something from a backup
without praying...

So, one day or another, I'll develop the strongest backup technology
ever, using marble-based disks and a redundant cluster of Egyptian
scribes.


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Chris Allen



Martin Schröder wrote:

2006/6/23, Francois Barre [EMAIL PROTECTED]:

Losing data is worse than losing anything else. You can buy yourself


That's why RAID is no excuse for backups.



We have 50TB of stored data now and maybe 250TB this time next year.
We mirror the most recent 20TB to a secondary array and rely on
the RAID for the rest.

I can't think of a practical tape backup strategy given tape sizes at
the moment...





Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Bill Davidsen

Martin Schröder wrote:


2006/6/23, Francois Barre [EMAIL PROTECTED]:


Losing data is worse than losing anything else. You can buy yourself



That's why RAID is no excuse for backups. 



The problem is that there is no cost effective backup available. When a
tape was the same size as a disk and 10% of the cost, backups were
practical. Today anything larger than a hobby-size disk is just not easy
to back up. Anything large enough to be useful is expensive; small media,
or anything you can't take off-site and lock in a vault, aren't backups
so much as copies, which may protect against some problems but
provide little to no protection against site disasters.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Francois Barre

The problem is that there is no cost effective backup available.


One-liner questions:
- How does Google make backups?
- Aren't tapes dead yet?
- What about a NUMA principle applied to storage?


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Andreas Dilger
On Jun 23, 2006  17:01 +0300, Al Boldi wrote:
 Chris Allen wrote:
  Francois Barre wrote:
   2006/6/23, PFC [EMAIL PROTECTED]:
   - ext3 is slow if you have many files in one directory, but
   has more mature tools (resize, recovery etc)

Please use mke2fs -O dir_index or tune2fs -O dir_index when testing
ext3 performance for many-files-in-dir.  This is now the default in
e2fsprogs-1.39 and later.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Russell Cattelan

Al Boldi wrote:


Chris Allen wrote:
 


Francois Barre wrote:
   


2006/6/23, PFC [EMAIL PROTECTED]:
 


   - XFS is faster and fragments less, but make sure you have a
good UPS
   


Why a good UPS ? XFS has a good strong journal, I never had an issue
with it yet... And believe me, I did have some dirty things happening
here...

 


   - ReiserFS 3.6 is mature and fast, too, you might consider it
   - ext3 is slow if you have many files in one directory, but
has more
mature tools (resize, recovery etc)
   


XFS tools are kind of mature also. Online grow, dump, ...

 


   I'd go with XFS or Reiser.
   


I'd go with XFS. But I may be kind of fanatic...
 


 Strange that, whatever the filesystem, you get equal numbers of people
 saying they have never lost a single byte and people who have had
 horrible corruption and would never touch it again. We stopped using XFS
 about a year ago because we were getting kernel stack space panics under
 heavy load over NFS. It looks like the time has come to give it another
 try.
   



If you are keen on data integrity then don't touch any fs w/o data=ordered.

ext3 is still king wrt data=ordered, albeit slow.

Now XFS is fast, but doesn't support data=ordered.  It seems that their 
solution to the problem is to pass the burden onto hw by using barriers.  
Maybe XFS can get away with this.  Maybe.


Thanks!

--
 

When you refer to data=ordered, are you talking about ext3 user data
journaling?

While user data journaling seems like a good idea, it is unclear what
benefits it really provides. By writing all user data twice, the write
performance of the file system is effectively halved. Granted, the log is
one area of the disk, so some performance advantage shows up due to less
head seeking for those writes.

As far as metadata journaling goes, it is a fundamental requirement that
the journal is synced to disk to a given point in order to release the
pinned metadata, thus allowing the metadata to be synced to disk.

The way most file systems guarantee file system consistency is to either
sync all outstanding metadata changes to disk or to sync a record of what
in-core changes have been made to disk.

In the XFS case, since it logs metadata deltas to the log, it can record
more change operations in a smaller number of disk blocks; ext3, on the
other hand, writes the entire metadata block to the log.

As far as barriers go, I assume you are referring to IDE write barriers?

The need for barrier support in the file system is a result of cheap IDE
disks providing large write caches but not having enough reserve power to
guarantee that the cache will be synced to disk in the event of a power
failure.
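
(If barriers aren't available, the blunt workaround is to turn the drive's
write-back cache off entirely - a sketch, device name illustrative:

  hdparm -W0 /dev/hda   # disable the on-drive write cache

at a real cost in write throughput.)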

Originally, when XFS was written, the disks/RAIDs used by SGI systems
were pretty much exclusively enterprise-level devices that would
guarantee the write caches would be flushed in the event of a power
failure.

Note: ext3, XFS, and reiser all use write barriers now for IDE disks.






Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Christian Pedaschus

Andreas Dilger wrote:

On Jun 23, 2006  17:01 +0300, Al Boldi wrote:
  

Chris Allen wrote:


Francois Barre wrote:
  

2006/6/23, PFC [EMAIL PROTECTED]:


- ext3 is slow if you have many files in one directory, but
has more mature tools (resize, recovery etc)
  


Please use mke2fs -O dir_index or tune2fs -O dir_index when testing
ext3 performance for many-files-in-dir.  This is now the default in
e2fsprogs-1.39 and later.
  

for ext3 use (on unmounted disks):
tune2fs -O has_journal -o journal_data /dev/{disk}
tune2fs -O dir_index /dev/{disk}

if data is on the drive, you need to run a fsck afterwards (see the sketch
below); it uses a good bit of RAM, but it makes ext3 a good bit faster.
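
A sketch of that fsck step (partition name illustrative); -D makes e2fsck
rebuild existing directories so they actually pick up the hashed indexes:

  e2fsck -fD /dev/{partition}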

and my main point for using ext3 is still: it's a very mature fs;
nobody will tell you such horrible stories about data loss with ext3
as with any other filesystem.
and there are undelete tools for ext3.

so if you're after data integrity (I guess you are, else you would not use
RAID, right? ;)), use ext3, and if you need the last single kB/s get a
faster drive, or use lots of them with a good RAID combo, and/or use a
separate disk for the journal (man 8 tune2fs)

my 0.5 cents,
greets chris

PS: but you know, filesystem choice is not pure science, it's
half-religion :D



Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Christian Pedaschus

Christian Pedaschus wrote:

for ext3 use (on unmounted disks):
tune2fs -O has_journal -o journal_data /dev/{disk}
tune2fs -O dir_index /dev/{disk}

if data is on the drive, you need to run a fsck afterwards; it uses a
good bit of RAM, but it makes ext3 a good bit faster.

and my main point for using ext3 is still: it's a very mature fs;
nobody will tell you such horrible stories about data loss with ext3
as with any other filesystem.
and there are undelete tools for ext3.

so if you're after data integrity (I guess you are, else you would not use
RAID, right? ;)), use ext3, and if you need the last single kB/s get a
faster drive, or use lots of them with a good RAID combo, and/or use a
separate disk for the journal (man 8 tune2fs)

my 0.5 cents,
greets chris

PS: but you know, filesystem choice is not pure science, it's
half-religion :D

Oops, should be:

tune2fs -O has_journal -o journal_data /dev/{partition}
tune2fs -O dir_index /dev/{partition}

;)


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Tom Vier
On Fri, Jun 23, 2006 at 11:21:34AM -0500, Russell Cattelan wrote:
 When you refer to data=ordered, are you talking about ext3 user data
 journaling?

iirc, data=ordered just writes new data out before updating block pointers,
the file's length in its inode, and the block usage bitmap. That way you
don't get junk or zeroed data at the tail of the file. However, I think to
prevent data leaks (from deleted files), data=writeback requires a write to
the journal, indicating what blocks are being added, so that on recovery
they can be zeroed if the transaction wasn't completed.

 While user data journaling seems like a good idea, it is unclear what
 benefits it really provides.

Data gets committed sooner (until pressure or timeouts force the data to be
written to its final spot - then you lose throughput and there's a net delay).
I think for bursts of small file creation, data=journal is a win. I don't
know how lazy ext3 is about writing the data to its final position. It
probably does it when the commit timeout hits 0 or the journal is full.
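
(The commit interval is a mount option; a sketch with illustrative names,
assuming an ext3 mount:

  mount -t ext3 -o data=journal,commit=5 /dev/md1 /data

where commit= is the journal flush interval in seconds, 5 being the
default.)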

 As far as barriers go, I assume you are referring to IDE write barriers?

 The need for barrier support in the file system is a result of cheap IDE
 disks providing large write caches but not having enough reserve power to
 guarantee that the cache will be synced to disk in the event of a power
 failure.

It's needed on any drive (including SCSI) that has write-back cache enabled.
Most SCSI drives (in my experience) come from the factory with the cache set
to write-through, in case the fs/OS doesn't use ordered tags, cache flushes, or
force-unit-access writes.

 Note: ext3, XFS, and reiser all use write barriers now for IDE disks.

What I've found very disappointing is that my RAID1 doesn't support them!

Jun 22 10:53:49 zero kernel: Filesystem md1: Disabling barriers, not
supported by the underlying device

I'm not sure if it's the SATA drives that don't support write barriers, or if
it's just the md1 layer. I need to investigate that. I think reiserfs also
complained that trying to enable write barriers failed on that md1 (I've
been playing with various fses on it).
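
(For reference, the mount-side knobs - a sketch with illustrative names,
since exact defaults vary by kernel version:

  mount -t ext3 -o barrier=1 /dev/md1 /mnt          # ext3: request barriers
  mount -t reiserfs -o barrier=flush /dev/md1 /mnt  # reiserfs: flush barriers

XFS enables barriers by default on recent kernels and takes nobarrier to
turn them off.)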

-- 
Tom Vier [EMAIL PROTECTED]
DSA Key ID 0x15741ECE


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Nix
On 23 Jun 2006, Francois Barre uttered the following:
 The problem is that there is no cost effective backup available.
 
 One-liner questions:
 - How does Google make backups?

Replication across huge numbers of cheap machines on a massively
distributed filesystem.

-- 
`NB: Anyone suggesting that we should say Tibibytes instead of
 Terabytes there will be hunted down and brutally slain.
 That is all.' --- Matthew Wilcox


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Nix
On 23 Jun 2006, PFC suggested tentatively:
   - ext3 is slow if you have many files in one directory, but has
   more mature tools (resize, recovery etc)

This is much less true if you turn on the dir_index feature.

-- 
`NB: Anyone suggesting that we should say Tibibytes instead of
 Terabytes there will be hunted down and brutally slain.
 That is all.' --- Matthew Wilcox


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Nix
On 23 Jun 2006, Christian Pedaschus said:
 and my main point for using ext3 is still: it's a very mature fs;
 nobody will tell you such horrible stories about data loss with ext3
 as with any other filesystem.

Actually I can, but it required bad RAM *and* a broken disk controller
*and* an electrical storm *and* heavy disk loads (only read loads,
but I didn't have noatime active so read implied write).

In my personal experience it's since weathered, with no problems at all,
machines with `only' RAM so bad that md5sums of 512KB files wouldn't come
out the same way twice (some file data got corrupted, unsurprisingly,
but the metadata was fine).

Definitely an FS to be relied upon.

-- 
`NB: Anyone suggesting that we should say Tibibytes instead of
 Terabytes there will be hunted down and brutally slain.
 That is all.' --- Matthew Wilcox


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-23 Thread Neil Brown
On Friday June 23, [EMAIL PROTECTED] wrote:
  The problem is that there is no cost effective backup available.
 
 One-liner questions:
 - How does Google make backups?

No, Google ARE the backups :-)

 - Aren't tapes dead yet?

LTO-3 does 300Gig, and LTO-4 is planned.
They may not cope with tera-byte arrays in one hit, but they still
have real value.

 - What about a NUMA principle applied to storage?

You mean an Hierarchical Storage Manager?  Yep, they exist.  I'm sure
SGI, EMC and assorted other TLAs could sell you one.

NeilBrown


Large single raid and XFS or two small ones and EXT3?

2006-06-22 Thread Chris Allen

Dear All,

I have a Linux storage server containing 16x750GB drives - so 12TB raw 
space.


If I make them into a single RAID5 array, then it appears my only
choice for a filesystem is XFS - as EXT3 won't really handle partitions
over 8TB.

Alternatively, I could split each drive into 2 partitions and have 2 RAID5
arrays, then put an EXT3 on each one.
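
(As a sketch of the split layout - device names illustrative, assuming the
drives appear as sda..sdp with two partitions each:

  mdadm --create /dev/md0 --level=5 --raid-devices=16 /dev/sd[a-p]1
  mdadm --create /dev/md1 --level=5 --raid-devices=16 /dev/sd[a-p]2

with an EXT3 filesystem on each md device.)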

Can anybody advise the pros and cons of these two approaches with
regard to stability, reliability and performance? The store is to be used
for files which will have an even split of:

33% approx 2MB in size
33% approx 50KB in size
33% approx 2KB in size


Also:

- I am running a 2.6.15-1 stock FC5 kernel. Would there be any RAID
benefits in me upgrading to the latest 2.6.16 kernel? (I don't want to do
this unless there is a very good reason to.)

- I am running mdadm 2.3.1. Would there be any benefits for me in
upgrading to mdadm v2.5?

- I have read good things about bitmaps. Are these production ready? Any
advice/caveats?
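
(For reference, a write-intent bitmap can be added to an existing array; a
sketch, array name illustrative:

  mdadm --grow /dev/md0 --bitmap=internal

so that after a crash or unclean shutdown only the dirty regions are
resynced.)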


Many thanks for reading,

Chris Allen.


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-22 Thread Gordon Henderson
On Thu, 22 Jun 2006, Chris Allen wrote:

 Dear All,

 I have a Linux storage server containing 16x750GB drives - so 12TB raw
 space.

Just one thing - do you want to use RAID-5 or RAID-6?

I just ask, as with that many drives (and that much data!) the
possibility of a 2nd drive failure is increasing, and personally,
wherever I can, I take the hit these days and have used RAID-6 for
some time... drives are cheap, even the 750GB behemoths!

 If I make them into a single RAID5 array, then it appears my only
 choice for a filesystem is XFS - as  EXT3 won't really handle partitions
 over 8TB.

I can't help with this though - I didn't realise ext3 had such a
limitation!

Gordon


Re: Large single raid and XFS or two small ones and EXT3?

2006-06-22 Thread Chris Allen



H. Peter Anvin wrote:

Gordon Henderson wrote:

On Thu, 22 Jun 2006, Chris Allen wrote:


Dear All,

I have a Linux storage server containing 16x750GB drives - so 12TB raw
space.


Just one thing - Do you want to use RAID-5 or RAID-6 ?

I just ask, as with that many drives (and that much data!) the
possibilities of a 2nd drive failure is increasing, and personally,
wherever I can, I take the hit these days, and have used RAID-6 for
some time... drives are cheap, even the 750GB behemoths!


If I make them into a single RAID5 array, then it appears my only
choice for a filesystem is XFS - as  EXT3 won't really handle 
partitions

over 8TB.


I can't help with this though - I didn't realise ext3 had such a
limitation though!



16 TB (2^32 blocks) should be the right number.

It should be, but mkfs.ext3 won't let me create a filesystem bigger than
8TB. It appears that the only way round this is through kernel patches,
and, as this is a production machine, I'd rather stick to mainstream
releases and go for one of the above solutions.
