Re: [Lustre-discuss] Quota question

2008-06-05 Thread Enrico Morelli
On Wed, 4 Jun 2008 12:42:47 +0200
Enrico Morelli [EMAIL PROTECTED] wrote:

 On Wed, 04 Jun 2008 14:11:47 +0400
 Andrew Perepechko [EMAIL PROTECTED] wrote:
 
  Enrico,
  
  what version of Lustre are you using?
  
 
 1.6.4.3
 
  Andrew.
  
  On Wednesday 04 June 2008 12:51:26 Enrico Morelli wrote:
   Dear all,
  
   I've a problem that I don't understand.
  
    Disk quotas for user hsco1 (uid 1250):
         Filesystem    kbytes   quota    limit   grace   files   quota   limit   grace
     /lustre_homes/  13204576    3000     3500           60309       0       0
      spfs-MDT_UUID     14760       0   102400           60309       0       0
      spfs-OST_UUID   3648336       0  3788800
    spfs-OST0001_UUID  2050408      0  2150400
    spfs-OST0002_UUID  3660516      0  3788800
    spfs-OST0003_UUID  3830556*     0        1
  
  
    For me the user is under quota (13 GB of data vs. a 30 GB quota),
    but for the system he is over quota. Where is the problem?
  
   Thanks
  
  
 
 
I don't understand the last row displayed by lfs quota:
spfs-OST0003_UUID  3830556*       0       1

This is my df:
/dev/data_se/lustre_mgs_mdt
  45871740   1515804  41734496   4% /lustre_mgs_mdt
/dev/data_se/data_local_a
 516061624 237605788 252241436  49% /data_local_a
/dev/data_se/data_local_b
 516061624 228373408 261473816  47% /data_local_b
/dev/data_se/data_local_c
 516061624 283749608 206097616  58% /data_local_c
/dev/data_se/data_local_d
 278673480 167750860  96766844  64% /data_local_d
[EMAIL PROTECTED]:/spfs
 1826858352 917479664 816579452  53% /lustre_homes


The manual says:
Note - Values appended with “*” show the limit that has been
over-used (exceeding the quota), and receives this message Disk quota
exceeded.
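
A rough sketch of the commands involved, assuming the 1.6-era lfs
syntax (the quotacheck step is only a guess at a possible fix; it has
to be run as root and rescans the whole filesystem):

lfs quota -u hsco1 /lustre_homes    # per-user usage with the per-MDT/OST breakdown
lfs quotacheck -ug /lustre_homes    # rebuild the quota files if the accounting looks stale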

How can I solve this problem?

--
Coltivate Linux che tanto Windows si pianta da solo.
(Cultivate Linux: Windows crashes all by itself.)

ENRICO MORELLI | email: [EMAIL PROTECTED] | phone: +39 055 4574269 | fax: +39 055 4574253
University of Florence - CERM - via Sacconi, 6 - 50019 Sesto Fiorentino (FI) - ITALY
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] ldiskfs kernel bug in ldiskfs_mb_use_best_found

2008-06-05 Thread Alex Zhuravlev

https://bugzilla.lustre.org/show_bug.cgi?id=15932

thanks, Alex

Aaron Knister wrote:
 For my own curiosity's sake, is there a bugzilla id associated with it? 
 I'm curious to know what the problem is/was.
 
 On Jun 5, 2008, at 8:50 AM, Alex Zhuravlev wrote:
 
 We've been investigating an issue that could be the source of your problem.
 Fortunately, we can reproduce it.

 thanks, Alex

 Aaron Knister wrote:
 Unfortunately the monkeys maintaining this setup (I left the company
 last week) have managed to destroy one of the 13 terabyte arrays...so
 they've got bigger issues right now. Thanks for your help though.

 On Jun 5, 2008, at 3:18 AM, Alex Zhuravlev wrote:

 hmm. could you provide me with ldiskfs.ko then?

 thanks, Alex

 Aaron Knister wrote:
 I'm not sure how to obtain that. I'm running whatever was distributed
 with 1.6.4.3 for rhel5-x86_64.

 On Jun 2, 2008, at 8:08 AM, Alex Zhuravlev wrote:

 could you also send me your mballoc.c please?

 thanks, Alex

 Aaron Knister wrote:
 Thanks so much for looking into this. Here's what I got from dmesg.
 Interestingly enough every time it panics the CPU listed is #4...do
 you think that points to a hardware problem?

 ----------- [cut here ] --------- [please bite here ] ---------
 Kernel BUG at
 ...build/BUILD/lustre-ldiskfs-3.0.4/ldiskfs/mballoc.c:1334
 invalid opcode:  [1] SMP
 last sysfs file: /devices/pci:00/:00:02.0/:01:00.0/:02:02.0/:04:00.1/irq
 CPU 4
 Modules linked in: obdfilter(U) fsfilt_ldiskfs(U) ost(U) mgc(U)
 ldiskfs(U) crc16(U) lustre(U) lov(U) lquota(U) mdc(U) ko2iblnd(U)
 ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) netconsole(U) autofs4(U)
 hidp(U) rfcomm(U) l2cap(U) bluetooth(U) sunrpc(U) ip6t_REJECT(U)
 xt_tcpudp(U) ip6table_filter(U) ip6_tables(U) x_tables(U) ipv6(U)
 ib_iser(U) libiscsi(U) scsi_transport_iscsi(U) rdma_ucm(U) ib_ucm(U)
 ib_srp(U) ib_sdp(U) rdma_cm(U) ib_cm(U) iw_cm(U) ib_addr(U)
 ib_local_sa(U) ib_ipoib(U) ib_sa(U) ib_uverbs(U) ib_umad(U)
 dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) button(U)
 battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U)
 parport(U) ib_mthca(U) ib_mad(U) i2c_i801(U) e1000(U) ide_cd(U)
 ib_core(U) shpchp(U) i2c_core(U) cdrom(U) sg(U) pcspkr(U)
 dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U) ata_piix(U) ahci(U)
 megaraid_sas(U) sata_sil(U) libata(U) sd_mod(U) scsi_mod(U) ext3(U)
 jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U)
 Pid: 5717, comm: ll_ost_io_11 Tainted: GF
 2.6.18-53.1.13.el5_lustre.1.6.4.3smp #1
 RIP: 0010:[8885a01f]  [8885a01f]
 :ldiskfs:ldiskfs_mb_use_best_found+0xef/0x520
 RSP: 0018:81040e4dd320  EFLAGS: 00010246
 RAX:  RBX: 81040e4dd3d0 RCX: 007f
 RDX: 810406833000 RSI: 8103a825 RDI: 1000
 RBP: 0800 R08: 0020 R09: 0010
 R10: 81037d136ff8 R11: 8000 R12: 81040e4dd470
 R13: 0001 R14: 0020 R15: 0800
 FS:  2aac0220() GS:81042fc20b40()
 knlGS:
 CS:  0010 DS:  ES:  CR0: 8005003b
 CR2: 2e2ad0a0 CR3: 00041c6d1000 CR4: 06e0
 Process ll_ost_io_11 (pid: 5717, threadinfo 81040e4dc000, task
 81041228a040)
 Stack:  81040e4dd4c0 81040e4dd470 000b
 0001
 0001 0020 81040e4dd4c0 8885cd4e
 81038c2b0a38 0010 81040e4dd4a0 
 Call Trace:
 [8885cd4e] :ldiskfs:ldiskfs_mb_regular_allocator+0x66e/0xd50
 [888443c2] :ldiskfs:ldiskfs_mark_inode_dirty+0x132/0x160
 [88859664] :ldiskfs:ldiskfs_mb_initialize_context+0x144/0x160
 [8885e1e6] :ldiskfs:ldiskfs_mb_new_blocks+0x166/0x280
 [888bf091] :fsfilt_ldiskfs:ldiskfs_ext_new_extent_cb+0x301/0x640
 [8000a63f] get_page_from_freelist+0x223/0x3cf
 [882390b0] :ib_mthca:mthca_arbel_post_send+0x5a2/0x5b4
 [88856d27] :ldiskfs:ldiskfs_ext_walk_space+0x1b7/0x250
 [888bed90] :fsfilt_ldiskfs:ldiskfs_ext_new_extent_cb+0x0/0x640
 [80054ccc] mwait_idle+0x0/0x4a
 [888ba9ae] :fsfilt_ldiskfs:fsfilt_map_nblocks+0xfe/0x150
 [888eee08] :obdfilter:filter_direct_io+0x478/0xce0
 [888f14de] :obdfilter:filter_commitrw_write+0x184e/0x2570
 [8005a2ea] cache_alloc_refill+0x106/0x186
 [8003ce86] lock_timer_base+0x1b/0x3c
 [8889eb46] :ost:ost_brw_write+0x21b6/0x28c0
 [80088431] default_wake_function+0x0/0xe
 [888a283e] :ost:ost_handle+0x2a8e/0x58d8
 [88603c82] :obdclass:class_handle2object+0xd2/0x160
 [8869f230] :ptlrpc:lustre_swab_ptlrpc_body+0x0/0x90
 [8869cde5] :ptlrpc:lustre_swab_buf+0xc5/0xf0
 [886a4a3b] :ptlrpc:ptlrpc_server_handle_request+0xb0b/0x1270
 [80060f29] thread_return+0x0/0xeb
 [8006b6c9] do_gettimeofday+0x50/0x92
 [8855d056] 

[Lustre-discuss] MDS crash and the Dilger Procedure

2008-06-05 Thread Jakob Goldbach
Hi,

I just had to go through the Dilger procedure after the MDS crashed
while mounting the MDT. The system is running fine now - glad that I
just learned about this procedure. 

Trace attached. 


[   78.216468] Lustre: OBD class driver, [EMAIL PROTECTED]
[   78.217547] Lustre Version: 1.6.4.3
[   78.218393] Build Version: 1.6.4.3-1970010101-PRISTINE-.home.goldbach.Build-lustre-server-kernel.linux-2.6.18.8-2.6.18.8-bnx2-1.6.7b-cciss-3.6.18-5-lustre-1.6.4.3
[   78.299294] Lustre: Added LNI [EMAIL PROTECTED] [8/256]
[   78.300363] Lustre: Accept secure, port 988
[   78.393841] Lustre: Lustre Client File System; [EMAIL PROTECTED]
[   78.453938] kjournald starting.  Commit interval 5 seconds
[   78.461664] LDISKFS FS on dm-0, internal journal
[   78.462675] LDISKFS-fs: recovery complete.
[   78.469931] LDISKFS-fs: mounted filesystem with ordered data mode.
[   78.529838] kjournald starting.  Commit interval 5 seconds
[   78.536134] LDISKFS FS on dm-0, internal journal
[   78.537334] LDISKFS-fs: mounted filesystem with ordered data mode.
[   78.594288] Lustre: MGS MGS started
[   78.595297] Lustre: Server MGS on device /dev/lustre_pool/mgt has started
[   78.676788] kjournald starting.  Commit interval 5 seconds
[   78.681816] LDISKFS FS on dm-1, internal journal
[   78.682987] LDISKFS-fs: recovery complete.
[   78.688720] LDISKFS-fs: mounted filesystem with ordered data mode.
[   78.757422] kjournald starting.  Commit interval 5 seconds
[   78.765206] LDISKFS FS on dm-1, internal journal
[   78.766255] LDISKFS-fs: mounted filesystem with ordered data mode.
[   78.864505] Lustre: Enabling user_xattr
[   78.907525] Lustre: 1361:0:(mds_fs.c:446:mds_init_server_data()) RECOVERY: service iloapp3-MDT, 3 recoverable clients, last_transno 158566404
[   78.952173] Lustre: MDT iloapp3-MDT now serving dev (iloapp3-MDT/1455817d-55a2-a694-4403-7abfdae1606f), but will be in recovery until 3 clients reconnect, or if no clients reconnect for 4:10; during that time new clients will not be allowed to connect. Recovery progress can be monitored by watching /proc/fs/lustre/mds/iloapp3-MDT/recovery_status.
[   78.978179] Lustre: 1361:0:(mds_lov.c:858:mds_notify()) MDS iloapp3-MDT: in recovery, not resetting orphans on iloapp3-OST_UUID
[   79.012936] BUG: scheduling while atomic: mount.lustre/0x8101/1361
[   79.014247] 
[   79.014247] Call Trace:
[   79.015240]  [8025973a] __sched_text_start+0x7a/0x769
[   79.016616]  [8023ab95] lock_timer_base+0x1b/0x3c
[   79.017943]  [8022f226] del_timer+0x4e/0x57
[   79.018971]  [802130a8] sync_buffer+0x0/0x3f
[   79.020003]  [8025a59a] io_schedule+0x28/0x34
[   79.021235]  [802130e3] sync_buffer+0x3b/0x3f
[   79.022490]  [8025a8f5] __wait_on_bit+0x40/0x6f
[   79.023780]  [802130a8] sync_buffer+0x0/0x3f
[   79.024955]  [8025a990] out_of_line_wait_on_bit+0x6c/0x78
[   79.026373]  [80286a91] wake_bit_function+0x0/0x23
[   79.027743]  [80222c9f] __bread+0x62/0x77
[   79.028724]  [88318de2] :ldiskfs:read_block_bitmap+0xa2/0xf0
[   79.030245]  [88319695] :ldiskfs:ldiskfs_free_blocks_sb+0x115/0x510
[   79.031856]  [88319b21] :ldiskfs:ldiskfs_free_blocks+0x91/0xe0
[   79.033392]  [8831ed1a] :ldiskfs:ldiskfs_free_data+0x8a/0x110
[   79.034944]  [8831f19c] :ldiskfs:ldiskfs_truncate+0x20c/0x650
[   79.036261]  [802dbeab] start_this_handle+0x355/0x405
[   79.037705]  [8831fbb4] :ldiskfs:ldiskfs_delete_inode+0x84/0xf0
[   79.039043]  [8831fb30] :ldiskfs:ldiskfs_delete_inode+0x0/0xf0
[   79.040636]  [8022c804] generic_delete_inode+0x8e/0x10b
[   79.042042]  [883ce891] :mds:mds_obd_destroy+0xa11/0xad0
[   79.043435]  [8022a2d7] mntput_no_expire+0x19/0x8b
[   79.044727]  [880f361b] :obdclass:llog_lvfs_close+0x6b/0x130
[   79.046201]  [880f46c1] :obdclass:llog_lvfs_destroy+0x841/0xa10
[   79.047756]  [880f0a0f] :obdclass:llog_cat_id2handle+0x4cf/0x5f0
[   79.049289]  [8021557d] cache_grow+0x2ee/0x343
[   79.050501]  [880fa9c5] :obdclass:cat_cancel_cb+0x405/0x630
[   79.051984]  [880f0129] :obdclass:llog_process+0xa09/0xe20
[   79.053398]  [8020c894] dput+0x23/0x152
[   79.054539]  [880fa5c0] :obdclass:cat_cancel_cb+0x0/0x630
[   79.055959]  [880fa3b3] :obdclass:llog_obd_origin_setup+0x773/0x980
[   79.057507]  [8021819f] vsnprintf+0x55e/0x5a3
[   79.058716]  [880fb37d] :obdclass:llog_setup+0x78d/0x860
[   79.060114]  [8840ea94] :osc:osc_llog_init+0x104/0x390
[   79.061491]  [880f9855] :obdclass:__llog_ctxt_put+0x25/0xe0
[   79.062957]  [880f8979] :obdclass:obd_llog_init+0x179/0x210
[   79.064362]  [882632ca] :lov:lov_llog_init+0x2ca/0x400
[   79.065556]  [880f8979] :obdclass:obd_llog_init+0x179/0x210
[   79.067021]  [8022a2d7] mntput_no_expire+0x19/0x8b
[   

[Lustre-discuss] lustre and multi path

2008-06-05 Thread Brock Palen
Our new Lustre hardware arrived from Sun today.  Looking at the dual
MDS and FC disk array for it, we will need multipath.
Has anyone ever used multipath with Lustre?  Are there any issues?  If
we set up regular multipath via LVM, Lustre won't care, as far as I can
tell from browsing the archives.

What about multipath without LVM?  Our StorageTek array has dual
controllers with dual ports going to dual-port FC cards in the
MDSs.  Each MDS has a connection to both controllers, so we will need
multipath to get any advantage from this.

Comments?
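
For what it's worth, what I have in mind is plain dm-multipath
underneath the Lustre targets, roughly like the sketch below (the
device names, fsname and mount point are placeholders, not our real
ones):

modprobe dm-multipath
/etc/init.d/multipathd start
multipath -ll                # check that both paths to each LUN show up

# Lustre would then only ever see the single device-mapper device, e.g.:
mkfs.lustre --fsname=testfs --mgs --mdt /dev/mapper/mpath0
mount -t lustre /dev/mapper/mpath0 /mnt/mdt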


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
[EMAIL PROTECTED]
(734)936-1985



___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] lustre and multi path

2008-06-05 Thread Stuart Marshall
Hi Brock,

We have Sun OSSes, MDSes, and FC-attached arrays all connected via an FC
switch.  We use Sun's RDAC driver (rdac-LINUX-09.01.B2.74-source.tar).  I
think we tried to get the native RHEL4 multipath working but did not succeed
with our configuration.

Stuart

On Thu, Jun 5, 2008 at 3:57 PM, Brock Palen [EMAIL PROTECTED] wrote:

 Our new Lustre hardware arrived from Sun today.  Looking at the dual
 MDS and FC disk array for it, we will need multipath.
 Has anyone ever used multipath with Lustre?  Are there any issues?  If
 we set up regular multipath via LVM, Lustre won't care, as far as I can
 tell from browsing the archives.

 What about multipath without LVM?  Our StorageTek array has dual
 controllers with dual ports going to dual-port FC cards in the
 MDSs.  Each MDS has a connection to both controllers, so we will need
 multipath to get any advantage from this.

 Comments?


 Brock Palen
 www.umich.edu/~brockp
 Center for Advanced Computing
 [EMAIL PROTECTED]
 (734)936-1985



 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] MDS crash and the Dilger Procedure

2008-06-05 Thread Andreas Dilger
On Jun 05, 2008  23:28 +0200, Jakob Goldbach wrote:
 I just had to go through the Dilger procedure after the MDS crashed
 while mounting the MDT. The system is running fine now - glad that I
 just learned about this procedure. 

Hmm, looking at the size of this stack, I wonder if it is a stack
overflow problem related to bug 15575?  That was fixed in 1.6.5.
At least the second stack looks to be related, and the first one
is very deep...
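
A quick way to confirm what the MDS is actually running before
deciding whether the 1.6.5 fix applies (a trivial check, assuming an
RPM-based install):

cat /proc/fs/lustre/version    # version of the loaded Lustre modules
rpm -qa | grep -i lustre       # installed Lustre packages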

It does also appear that it is doing a reschedule while atomic in
the first stack, though given the number of functions there it will
be hard to determine what is making this atomic...  Boy, do I love
Linux stack traces with every function ever called on the stack...

 [   79.012936] BUG: scheduling while atomic: mount.lustre/0x8101/1361
 [   79.014247] 
 [   79.014247] Call Trace:
 [   79.015240]  [8025973a] __sched_text_start+0x7a/0x769
 [   79.016616]  [8023ab95] lock_timer_base+0x1b/0x3c
 [   79.017943]  [8022f226] del_timer+0x4e/0x57
 [   79.018971]  [802130a8] sync_buffer+0x0/0x3f
 [   79.020003]  [8025a59a] io_schedule+0x28/0x34
 [   79.021235]  [802130e3] sync_buffer+0x3b/0x3f
 [   79.022490]  [8025a8f5] __wait_on_bit+0x40/0x6f
 [   79.023780]  [802130a8] sync_buffer+0x0/0x3f
 [   79.024955]  [8025a990] out_of_line_wait_on_bit+0x6c/0x78
 [   79.026373]  [80286a91] wake_bit_function+0x0/0x23
 [   79.027743]  [80222c9f] __bread+0x62/0x77
 [   79.028724]  [88318de2] :ldiskfs:read_block_bitmap+0xa2/0xf0
 [   79.030245]  [88319695] :ldiskfs:ldiskfs_free_blocks_sb+0x115/0x510
 [   79.031856]  [88319b21] :ldiskfs:ldiskfs_free_blocks+0x91/0xe0
 [   79.033392]  [8831ed1a] :ldiskfs:ldiskfs_free_data+0x8a/0x110
 [   79.034944]  [8831f19c] :ldiskfs:ldiskfs_truncate+0x20c/0x650
 [   79.036261]  [802dbeab] start_this_handle+0x355/0x405
 [   79.037705]  [8831fbb4] :ldiskfs:ldiskfs_delete_inode+0x84/0xf0
 [   79.039043]  [8831fb30] :ldiskfs:ldiskfs_delete_inode+0x0/0xf0
 [   79.040636]  [8022c804] generic_delete_inode+0x8e/0x10b
 [   79.042042]  [883ce891] :mds:mds_obd_destroy+0xa11/0xad0
 [   79.044727]  [880f361b] :obdclass:llog_lvfs_close+0x6b/0x130

The fput->dput->iput->generic_drop_inode() part of the chain
_appears_ to be where the stack is going, and following it down further

ldiskfs_delete_inode->
  ldiskfs_truncate->
    ldiskfs_free_data->
      ldiskfs_free_blocks->
        ldiskfs_free_blocks_sb->
          read_block_bitmap->
            __bread()

is calling might_sleep(), so the fact that we schedule later on
shouldn't come as a surprise, unless, it seems, the kernel is compiled
without CONFIG_DEBUG_SPINLOCK_SLEEP enabled.  In that case, this kind
of problem would be hit only when we actually DO sleep waiting for IO,


 [   79.046201]  [880f46c1] :obdclass:llog_lvfs_destroy+0x841/0xa10

It seems we are going into llog_lvfs_destroy() while processing the
configuration log, which is a bit strange, unless this is due to the
MDS->LOV->OSC hitting an empty log during startup and deleting it.
That in itself shouldn't be harmful...

There is (as can be seen below) a twisty maze of callbacks related to
the distributed logging code, but it doesn't appear to be holding a
spinlock that would make it atomic.  Looking at my kernel, in_atomic()
is set for a number of reasons (IRQ, NMI, etc.), but the most common
reason is a held spinlock, unless CONFIG_PREEMPT is enabled.

 [   79.047756]  [880f0a0f] :obdclass:llog_cat_id2handle+0x4cf/0x5f0
 [   79.050501]  [880fa9c5] :obdclass:cat_cancel_cb+0x405/0x630
 [   79.051984]  [880f0129] :obdclass:llog_process+0xa09/0xe20

In 1.6.5 (due to bug 15575) a new thread is started at this point to process
the per-OSC log.

 [   79.054539]  [880fa5c0] :obdclass:cat_cancel_cb+0x0/0x630
 [   79.055959]  [880fa3b3] :obdclass:llog_obd_origin_setup+0x773/0x980
 [   79.058716]  [880fb37d] :obdclass:llog_setup+0x78d/0x860
 [   79.060114]  [8840ea94] :osc:osc_llog_init+0x104/0x390
 [   79.061491]  [880f9855] :obdclass:__llog_ctxt_put+0x25/0xe0
 [   79.062957]  [880f8979] :obdclass:obd_llog_init+0x179/0x210
 [   79.064362]  [882632ca] :lov:lov_llog_init+0x2ca/0x400
 [   79.065556]  [880f8979] :obdclass:obd_llog_init+0x179/0x210
 [   79.068313]  [883941ad] :mds:mds_llog_init+0x1ad/0x270
 [   79.070777]  [880f8979] :obdclass:obd_llog_init+0x179/0x210
 [   79.073520]  [880f8dc5] :obdclass:llog_cat_initialize+0x3b5/0x670
 [   79.075090]  [88277c61] :lov:lov_get_info+0x9f1/0xaa0
 [   79.076405]  [8828f6c1] :lov:qos_add_tgt+0x681/0x730
 [   79.077573]  [8839d5ac] :mds:mds_lov_update_desc+0xbcc/0xd30
 [   79.079050]  [883a03ec] :mds:mds_notify+0x36c/0x690
 [   79.080630]  [8826c657] :lov:lov_notify+0xa97/0xfd0
 [   79.081820]  

Re: [Lustre-discuss] lustre and multi path

2008-06-05 Thread Klaus Steden

Hi Brock,

I've got a Sun StorageTek array hooked up to one of our clusters, and I'm
using labels instead of multi-pathing. We've got it hooked up in a similar
fashion to Stuart's; it's a bit slow and sloppy when initializing, but it
works well enough and there are no problems once OSTs are online.
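
For reference, mounting by label (the label written by mkfs.lustre,
e.g. fsname-OST0000) looks roughly like the sketch below; the device,
label and mount point are placeholders:

e2label /dev/sdc                             # show the Lustre target label on that device
mount -t lustre -L testfs-OST0000 /mnt/ost0  # mount by label instead of by device path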

Klaus

On 6/5/08 3:57 PM, Brock Palen [EMAIL PROTECTED] did etch on stone
tablets:

 Our new Lustre hardware arrived from Sun today.  Looking at the dual
 MDS and FC disk array for it, we will need multipath.
 Has anyone ever used multipath with Lustre?  Are there any issues?  If
 we set up regular multipath via LVM, Lustre won't care, as far as I can
 tell from browsing the archives.

 What about multipath without LVM?  Our StorageTek array has dual
 controllers with dual ports going to dual-port FC cards in the
 MDSs.  Each MDS has a connection to both controllers, so we will need
 multipath to get any advantage from this.
 
 Comments?
 
 
 Brock Palen
 www.umich.edu/~brockp
 Center for Advanced Computing
 [EMAIL PROTECTED]
 (734)936-1985
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] lustre and multi path

2008-06-05 Thread Klaus Steden

Hi Brock,

Yeah, that's likely to be an issue if each host has more than one path ...

What about using HA to force one path to be inactive at the device level? I
know QLogic FC cards support this functionality, although it requires
changing the options used by the driver kernel module ... mind you,
comparing that to a solution using the multipath daemon, that's six of one,
half a dozen of the other, I'd think.

Klaus

On 6/5/08 5:09 PM, Brock Palen [EMAIL PROTECTED] did etch on stone
tablets:

 This would be for the MDS/MGS only, but that's good to know.  Problem
 is our two MDS servers (active/passive) will have two connections
 each to the same LUN, so there could be issues.
 
 Brock Palen
 www.umich.edu/~brockp
 Center for Advanced Computing
 [EMAIL PROTECTED]
 (734)936-1985
 
 
 
 On Jun 5, 2008, at 7:52 PM, Klaus Steden wrote:
 
 Hi Brock,
 
 I've got a Sun StorageTek array hooked up to one of our clusters, and
 I'm using labels instead of multi-pathing. We've got it hooked up in a
 similar fashion to Stuart's; it's a bit slow and sloppy when
 initializing, but it works well enough and there are no problems once
 OSTs are online.
 
 Klaus
 
 On 6/5/08 3:57 PM, Brock Palen [EMAIL PROTECTED] did etch on stone
 tablets:
 
 Our new Lustre hardware arrived from Sun today.  Looking at the dual
 MDS and FC disk array for it, we will need multipath.
 Has anyone ever used multipath with Lustre?  Are there any issues?  If
 we set up regular multipath via LVM, Lustre won't care, as far as I can
 tell from browsing the archives.

 What about multipath without LVM?  Our StorageTek array has dual
 controllers with dual ports going to dual-port FC cards in the
 MDSs.  Each MDS has a connection to both controllers, so we will need
 multipath to get any advantage from this.
 
 Comments?
 
 
 Brock Palen
 www.umich.edu/~brockp
 Center for Advanced Computing
 [EMAIL PROTECTED]
 (734)936-1985
 
 
 
 ___
 Lustre-discuss mailing list
 Lustre-discuss@lists.lustre.org
 http://lists.lustre.org/mailman/listinfo/lustre-discuss
 
 
 
 

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss