Re: [Ocfs2-users] Unstable Cluster
Sérgio Surkamp ser...@gruposinternet.com.br 2011-12-09 11:17:

Hi. Why are you using OCFS2 version 1.5.0 in production? As far as I know, the 1.5 series is for developers only.

I think that's just the version tag they give it on the mainline kernel. It's not just for developers; it just may not be as well supported by some commercial Linux vendors if/when something goes wrong.

Brian

On Fri, 9 Dec 2011 00:42:25 -0800, Tony Rios t...@tonyrios.com wrote:

I managed to get ahold of the kernel panic message because it's happening on any new machines I try to introduce to the cluster:

[ 66.276054] OCFS2 1.5.0
snip/

signature.asc Description: Digital signature
___
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
Re: [Ocfs2-users] Diagnosing some OCFS2 error messages
Patrick J. LoPresti lopre...@gmail.com 2010-06-13 19:14:

Hello. I am experimenting with OCFS2 on SUSE Linux Enterprise Server 11 Service Pack 1. I am performing various stress tests. My current exercise involves writing to files using a shared-writable mmap() from two nodes. (Each node mmaps and writes to different files; I am not trying to access the same file from multiple nodes.) Both nodes are logging messages like these:

[94355.116255] (ocfs2_wq,5995,6):ocfs2_block_check_validate:443 ERROR: CRC32 failed: stored: 2715161149, computed 575704001. Applying ECC.
[94355.116344] (ocfs2_wq,5995,6):ocfs2_block_check_validate:457 ERROR: Fixed CRC32 failed: stored: 2715161149, computed 2102707465
[94355.116348] (ocfs2_wq,5995,6):ocfs2_validate_extent_block:903 ERROR: Checksum failed for extent block 2321665
[94355.116352] (ocfs2_wq,5995,6):__ocfs2_find_path:1861 ERROR: status = -5
[94355.116355] (ocfs2_wq,5995,6):ocfs2_find_leaf:1958 ERROR: status = -5
[94355.116358] (ocfs2_wq,5995,6):ocfs2_find_new_last_ext_blk:6655 ERROR: status = -5
[94355.116361] (ocfs2_wq,5995,6):ocfs2_do_truncate:6900 ERROR: status = -5
[94355.116364] (ocfs2_wq,5995,6):ocfs2_commit_truncate:7559 ERROR: status = -5
[94355.116370] (ocfs2_wq,5995,6):ocfs2_truncate_for_delete:597 ERROR: status = -5
[94355.116373] (ocfs2_wq,5995,6):ocfs2_wipe_inode:770 ERROR: status = -5
[94355.116376] (ocfs2_wq,5995,6):ocfs2_delete_inode:1062 ERROR: status = -5

...although the particular extent block number varies somewhat. In addition, when I run fsck.ocfs2 -y -f /dev/md0, I get an I/O error:

dp-1:~ # fsck.ocfs2 -y -f /dev/md0
fsck.ocfs2 1.4.3
Checking OCFS2 filesystem in /dev/md0:
  Label: NONE
  UUID: 29BB12B5AA4C449E9DDE906405F5BDE4
  Number of blocks: 3221225472
  Block size: 4096
  Number of clusters: 12582912
  Cluster size: 1048576
  Number of slots: 4
/dev/md0 was run with -f, check forced.
Pass 0a: Checking cluster allocation chains
Pass 0b: Checking inode allocation chains
Pass 0c: Checking extent block allocation chains
Pass 1: Checking inodes and blocks.
extent.c: I/O error on channel reading extent block at 2321665 in owner 9704867 for verification
pass1: I/O error on channel while iterating over the blocks for inode 9704867
fsck.ocfs2: I/O error on channel while performing pass 1

This looks like a straightforward I/O error, right? The only problem is that there is nothing in any log (dmesg, /var/log/messages, the event log on the hardware RAID) to indicate any hardware problem. That is, when fsck.ocfs2 reports this I/O error, no other errors are logged anywhere as far as I can tell. Shouldn't the kernel log a message if a block device gets an I/O error? I am using a pair of hardware RAID chassis accessed via iSCSI, and then using Linux md (RAID-0) to stripe between them.

Questions:

1) I would like to confirm this I/O error for myself using dd. How do I map the numbers above (extent block at 2321665 in owner 9704867) to an actual offset on the block device so I can try to read the blocks by hand?

2) Is there any plausible explanation for these errors other than bad hardware?

Thanks! - Pat

I don't believe OCFS2 can currently support any logical volume manager setup other than a simple concatenation (and even then only with extreme caution). Striping done by a lower software layer would somehow need to be coordinated among all the nodes in the cluster, else all the fs consistency guarantees provided by the SCSI layer are lost.

Brian
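On Pat's question 1 above: a byte offset can be derived by multiplying the extent block number by the filesystem block size (4096 here, per the fsck.ocfs2 output). This is a hedged sketch only, assuming the reported extent block number is an absolute block number from the start of /dev/md0:

```shell
# Map "extent block at 2321665" to a byte offset (block size 4096 per fsck).
BLOCK=2321665
BLKSZ=4096
OFFSET=$((BLOCK * BLKSZ))
echo "byte offset: $OFFSET"

# Then try reading just that one block by hand (commented out here;
# run it against the real device to see whether it returns EIO):
# dd if=/dev/md0 bs=$BLKSZ skip=$BLOCK count=1 | od -A x -t x1 | head
```

If the dd read succeeds without error, that would point away from a plain media error and toward something else (e.g. the md striping concern raised in the reply).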
Re: [Ocfs2-users] Support and Stability
Michael Austin onedbg...@gmail.com 2010-05-24 13:32:

I would like to get some feedback on the overall perception of the support and stability of OCFS2 (latest). This tool looks like a perfect fit for a production system I am planning, but, due to its open source roots, there are some concerns about support and stability. The app will be deemed mission critical with very little tolerance for any downtime (24x365). Thanks. M. Austin, Consultant

It pains me to say it, but I can't recommend it for something like a mail setup that has heavy writes of tiny files. There's a fragmentation issue that burned us badly recently, and before that a locking issue (search the archives). Even then, I have to say that the Oracle devs were responsive to us even without a service contract, for which I'm very grateful. You might have better luck with a supported distro; I've always used mainline kernels with Debian.

That said, I had been using an earlier version for a web server backend (a couple of TB, mostly read) and a video streaming library (_many_ TB and _lots_ of read traffic) for a long time without any reports of problems. I don't work there anymore, but from what I hear everything's still humming along without interruption (read: overall cluster interruption) for almost 3 years now. That's even with crummy server rooms that try to bake their inhabitants from time to time :)

I will also say, just offhand, that OCFS2 is still the best OSS shared disk cluster fs I've tried. I've tested GFS2 off and on for a couple of years and it still has a rather trivial deadlock case:

# cssh node1 node2 node3
# mkdir /cluster/$HOSTNAME
# touch /cluster/$HOSTNAME/test
# rm -rf /cluster/*

Cheers, Brian
Re: [Ocfs2-users] slowdown - fragmentation?
Brian Kroth bpkr...@gmail.com 2010-04-26 09:17:

Hello all, I've got a moderately active mail system running on OCFS2. It's been on a fresh volume for about 9 months now, though only ever with one node active at a time. For the most part it's been very happy; recently, however, we started experiencing very noticeable, periodic, but short-lived slowdowns. From all of the graphs and measurements I've been doing, the IO system seems bored (not much more than ~200 IOPS, and typically less, on a 14-disk RAID 50 with 15K disks). We haven't changed anything recently and I'm having trouble nailing down the cause. Given all the talk about ENOSPC and fragmentation of late, it's climbing my list of worries. Can someone please take a look at my stat_sysdir output and give me a quick opinion on whether or not they think that might be an issue? Thanks for your help, Brian

We're running Debian Lenny with a 2.6.30 kernel in VMware ESX 4.

# dpkg -l | grep -i ocfs2
ii ocfs2-tools 1.4.2-1

First, sorry for spamming everyone's mailbox by attaching that dump last time. I should have posted it somewhere. Unfortunately, we've gotten confirmation that we hit the infamous ENOSPC bug [1], in error messages stating the inability to store new mail messages. We performed the recommended reduction in slots from 8 to 4 and things are temporarily up and running again. However, I'm curious if you have any thoughts on how much time or how many files we've bought ourselves by doing this. Ideally we'd like to be able to hold out for another month, when there's a mandatory holiday so that no one can use the system anyway, before we do our total overhaul fix.

Here are the results of two stat_sysdir.sh dumps: http://cae.wisc.edu/~bpkroth/public/ocfs2/

Thanks very much, Brian

[1] http://oss.oracle.com/bugzilla/show_bug.cgi?id=1189
Re: [Ocfs2-users] Recommended settings for mkfs.ocfs2
lenny-backports has a 2.6.32-based kernel that might already have the free space fix in it. I haven't checked yet. Also, you don't really explain what you're trying to use the data store for (e.g. lots of small files, video files, heavy writes, heavy reads, random, sequential, etc.). It may affect the options you want to give to mkfs.

Brian

Andrew Robert Nicols andrew.nic...@luns.net.uk 2010-04-19 10:06:

I'm planning to deploy a new data store using a pair of servers running Debian Lenny and using DRBD to replicate data between them. My intention is to use ocfs2 as the file system on these so that we can operate in a dual-primary mode. The RAID device I'm using gives 15TB of usable space and, having had a brief look through the ocfs2-users archive, I see that in January an issue with "space left on device" was fixed, but this isn't available in the stock Lenny kernel yet (2.6.26). From the notes on the bug, I see that altering the number of slots on the file system can help to alleviate the issue, but what are the recommendations on which mkfs.ocfs2 options work best? I've already created the file system with 2 node slots, but I see from comment #13 that Sunil recommends against adding slots. I am free to re-create the file system if need be.

To summarise:
* 2 node cluster
* 15TB storage
* ocfs2-tools version 1.4.1
* 2.6.26 kernel

Any advice would be gratefully received,

Andrew Nicols
--
Systems Developer
e: andrew.nic...@luns.net.uk
im: a.nic...@jabber.lancs.ac.uk
t: +44 (0)1524 5 10147
Lancaster University Network Services is a limited company registered in England and Wales. Registered number: 04311892. Registered office: University House, Lancaster University, Lancaster, LA1 4YW
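As a concrete (and hedged) sketch of the kind of mkfs.ocfs2 invocation being discussed — the label and DRBD device path are placeholders, and the flags should be checked against the mkfs.ocfs2(8) man page for the ocfs2-tools version actually in use:

```text
# -N 2 : two node slots, matching the two-node DRBD pair; extra slots
#        reserve extra per-slot system files and contiguous space.
# -T   : fs-type heuristic ("mail" = many small files / heavy metadata;
#        "datafiles" = a few large, mostly preallocated files).
# -L   : volume label.
mkfs.ocfs2 -L datastore -N 2 -T mail /dev/drbd0
```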
Re: [Ocfs2-users] Using multiple clusters on the same host
I believe that, due to the way the fencing works, you only need a single cluster to host multiple volumes. Just make sure that all of the hosts involved are specified in the same cluster.conf file. For example, nodes a, b, c could mount volume1, while b, c, d mount volume2, and e, f, g mount volume3, so long as the cluster.conf file holds nodes a-g and is consistent across all nodes.

Brian

Daniel Bidwell bidw...@andrews.edu 2010-03-18 08:46:

Is it possible to use multiple ocfs2 clusters on the same host? I would like a given host to have access to several different clustered file systems. The older documentation says that you can only have one cluster per host. This restriction appears to have been removed from the latest documentation, but I don't see any mention of how to configure multiple clusters.

--
Daniel R. Bidwell | bidw...@andrews.edu
Andrews University | Information Technology Services
If two always agree, one of them is unnecessary
Friends don't let friends do DOS
In theory, theory and practice are the same. In practice, however, they are not.
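The single-cluster layout described above might look something like this in /etc/ocfs2/cluster.conf — a hedged sketch only; the cluster name, node names, and IPs are made up, and the exact stanza format should be checked against the O2CB documentation for your version:

```text
cluster:
        node_count = 7
        name = prodcluster

node:
        ip_port = 7777
        ip_address = 192.168.0.101
        number = 0
        name = nodea
        cluster = prodcluster

node:
        ip_port = 7777
        ip_address = 192.168.0.102
        number = 1
        name = nodeb
        cluster = prodcluster

(one "node:" stanza each for nodec through nodeg, numbers 2-6)
```

The same file is distributed to every node; which volumes a given node actually mounts is independent of the node list.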
Re: [Ocfs2-users] No Space left on the device.
I also have a mail volume hosted on OCFS2 and I'm somewhat concerned about /when/ we will run into this problem and what we can do to avoid too much hurt when it happens. Are there any tips on reading the output of stat_sysdir.sh? The man page wasn't especially helpful, but I'm guessing I'm looking at the Contig column for enough clusters > 511. I can post the output if you'd prefer.

As mentioned in the bug (I didn't think it was a proper place for discussion), I'm also curious more generally about backporting these fixes to the 2.6.32 kernel, since it's been designated long term stable. Is that responsibility just on the individual distro's kernel maintainer, or are the OCFS2 devs planning on submitting fixes to the mainline 2.6.32 tree?

Thanks, Brian

Brad Plant bpl...@iinet.net.au 2010-03-04 16:17:

Hi Aravind, Sounds like you might have hit the free space fragmentation issue: http://oss.oracle.com/bugzilla/show_bug.cgi?id=1189 I'm sure that if you post the output of stat_sysdir.sh (http://oss.oracle.com/~seeda/misc/stat_sysdir.sh) one of the ocfs2 devs will be able to confirm this. *If* it is this problem, removing some node slots will help. That is, of course, if you have more node slots than you need. I think 8 are created by default.

Cheers, Brad

On Thu, 4 Mar 2010 10:28:49 +0530 (IST) Aravind Divakaran aravind.divaka...@yukthi.com wrote:

Hi All, For my mail server I am using the ocfs2 filesystem configured on a SAN. Now my mail delivery application is sometimes complaining "No space left on the device", even though there are enough space and inodes. Can anyone help me solve this issue?

Rgds, Aravind M D
Re: [Ocfs2-users] No Space left on the device.
Joel Becker joel.bec...@oracle.com 2010-03-05 13:48:

On Fri, Mar 05, 2010 at 08:33:34AM -0600, Brian Kroth wrote:
As mentioned in the bug (didn't think it was a proper place for discussion) I'm also curious more generally about backporting these fixes to the 2.6.32 kernel since it's been designated long term stable. Is that responsibility just on the individual distro's kernel maintainer or are the OCFS2 devs planning on submitting fixes to the mainline 2.6.32 tree?

Who 'designated' it long term stable? I'm just wondering who we should send our patches to ;-)

Joel

Fair enough. Here's the most authoritative source [1] [2] I can find, though a quick google on "long term stable kernel" produces a number of other results [3].

[1] http://lwn.net/Articles/370236/
[2] http://www.kroah.com/log/linux/stable-status-01-2010.html
[3] http://www.fabian-fingerle.de/2010-02-23.233

Thanks, Brian
Re: [Ocfs2-users] fencing question
That seems unwise. Presumably the connection to the disk or to the other network nodes was lost due to some failure, in which case you don't want nodes operating on the disk unless they can agree on what's safe. If there was a planned outage of the disk or network connection, then the related volumes should probably have been unmounted first.

Brian

Charlie Sharkey charlie.shar...@bustech.com 2010-02-25 14:09:

Hi, I have a question on fencing. Besides setting the O2CB_HEARTBEAT_THRESHOLD parameter to some large value, is there any way to set up ocfs2 to only fence when it loses connections to ALL mounted disk volumes rather than to ANY one volume?

thanks, charlie
[Ocfs2-users] invalid opcode bug in dlmglue?
We've gotten a couple of dumps like this in the last couple of days while migrating some new users to our mail store, which involves untarring/moving large quantities of files. We've gracefully rebooted the node after every instance and it seems to do fine with normal mail operations. I'm wondering if you have any thoughts on the messages? Running in ESX 3.5. The kernel is Debian 2.6.30 based. Storage backend is iSCSI EqualLogic. Only one node currently has the FS mounted.

Thanks, Brian

Feb 4 09:34:41 iris kernel: [528465.147544] [ cut here ]
Feb 4 09:34:41 iris kernel: [528465.148706] kernel BUG at fs/ocfs2/dlmglue.c:2470!
Feb 4 09:34:41 iris kernel: [528465.148818] invalid opcode: [#1] SMP
Feb 4 09:34:41 iris kernel: [528465.148983] last sysfs file: /sys/devices/system/clocksource/clocksource0/available_clocksource
Feb 4 09:34:41 iris kernel: [528465.149113] Modules linked in: ocfs2 jbd2 quota_tree ocfs2_stack_o2cb ocfs2_stackglue netconsole vmsync vmmemctl ocfs2_dlmfs ocfs2_dlm ocfs2_nodemanager configfs usbhid hid uhci_hcd ohci_hcd ehci_hcd usbcore psmouse evdev serio_raw parport_pc parport snd_pcsp snd_pcm snd_timer snd soundcore snd_page_alloc container button ac i2c_piix4 processor i2c_core intel_agp shpchp agpgart pci_hotplug ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot dm_mod sd_mod crc_t10dif ide_cd_mod cdrom ata_generic libata ide_pci_generic floppy mptspi mptscsih mptbase scsi_transport_spi scsi_mod vmxnet piix ide_core thermal fan thermal_sys [last unloaded: scsi_wait_scan]
Feb 4 09:34:41 iris kernel: [528465.150763]
Feb 4 09:34:41 iris kernel: [528465.150945] Pid: 32114, comm: rm Not tainted (2.6.30-vmwareguest-smp-64g.20090711 #1) VMware Virtual Platform
Feb 4 09:34:41 iris kernel: [528465.151104] EIP: 0060:[f887783d] EFLAGS: 00010246 CPU: 2
Feb 4 09:34:41 iris kernel: [528465.151446] EIP is at ocfs2_dentry_lock+0x26/0xf7 [ocfs2]
Feb 4 09:34:41 iris kernel: [528465.151520] EAX: f6548800 EBX: c8c53c6c ECX: EDX:
Feb 4 09:34:41 iris kernel: [528465.151586] ESI: 17395beb EDI: f6538000 EBP: 0005 ESP: dfad5e88
Feb 4 09:34:41 iris kernel: [528465.151651] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Feb 4 09:34:41 iris kernel: [528465.151722] Process rm (pid: 32114, ti=dfad4000 task=e17b9610 task.ti=dfad4000)
Feb 4 09:34:41 iris kernel: [528465.151814] Stack:
Feb 4 09:34:41 iris kernel: [528465.151886] 0001 c75fe8dc c8c53c6c 17395beb c8c53c6c f888fdcc
Feb 4 09:34:41 iris kernel: [528465.152084] 17395beb f88925c9 f6538000 f0987900 c75fe940 c75fe5c0
Feb 4 09:34:41 iris kernel: [528465.152303] e6472e10
Feb 4 09:34:41 iris kernel: [528465.152566] Call Trace:
Feb 4 09:34:41 iris kernel: [528465.152580] [f888fdcc] ? ocfs2_remote_dentry_delete+0xe/0x95 [ocfs2]
Feb 4 09:34:41 iris kernel: [528465.152872] [f88925c9] ? ocfs2_unlink+0x3fe/0xa26 [ocfs2]
Feb 4 09:34:41 iris kernel: [528465.152960] [c019ab62] ? vfs_unlink+0x5c/0x95
Feb 4 09:34:41 iris kernel: [528465.153165] [c019be31] ? do_unlinkat+0x93/0xfc
Feb 4 09:34:41 iris kernel: [528465.153240] [c0114001] ? smp_reschedule_interrupt+0x13/0x1c
Feb 4 09:34:41 iris kernel: [528465.153336] [c0107eda] ? reschedule_interrupt+0x2a/0x30
Feb 4 09:34:41 iris kernel: [528465.153413] [c01077d4] ? sysenter_do_call+0x12/0x28
Feb 4 09:34:41 iris kernel: [528465.153538] Code: e9 19 fe ff ff 55 57 56 53 83 ec 0c 83 fa 01 8b 50 58 19 ed 83 e5 fe 83 c5 05 89 54 24 04 8b 40 54 85 d2 8b b8 98 01 00 00 75 04 0f 0b eb fe 8d 9f 9c 00 00 00 89 d8 e8 2a 1e ab c7 8b 87 a4 00
Feb 4 09:34:41 iris kernel: [528465.154876] EIP: [f887783d] ocfs2_dentry_lock+0x26/0xf7 [ocfs2] SS:ESP 0068:dfad5e88
Feb 4 09:34:41 iris kernel: [528465.155307] ---[ end trace 62c828cac153c25f ]---
Re: [Ocfs2-users] invalid opcode bug in dlmglue?
Excellent. Thanks for the quick response. Also, any idea when the tools might support indexed dirs? I suspect we'll have some downtime coming up in a couple of months and I'm wondering if we can use the opportunity to turn on that feature for quicker lookup times.

Thanks, Brian

Sunil Mushran sunil.mush...@oracle.com 2010-02-04 09:16:

Fixed. http://oss.oracle.com/bugzilla/show_bug.cgi?id=1137

You probably already have this patch. If not, add it.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=a5a0a630922a2f6a774b6dac19f70cb5abd86bb0

You are definitely missing this patch.
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=a1b08e75dff3dc18a88444803753e667bb1d126e

Brian Kroth wrote:
We've gotten a couple of dumps like this in the last couple of days while migrating some new users to our mail store, which involves untarring/moving large quantities of files. We've gracefully rebooted the node after every instance and it seems to do fine with normal mail operations. I'm wondering if you have any thoughts on the messages? Running in ESX 3.5. The kernel is Debian 2.6.30 based. Storage backend is iSCSI EqualLogic. Only one node currently has the FS mounted.

Thanks, Brian

[quoted oops log snipped; it is identical to the one in the original message above]
[Ocfs2-users] esx elevator=noop
http://lonesysadmin.net/2008/02/21/elevatornoop/

I ran across this recently. It describes how, when operating in a virtual environment with shared storage, to let the storage and hypervisor arrange disk write operations in a more globally optimal way, rather than having all the guests try to do it and muck it up. However, this is contrary to the ocfs2 recommendation of using the deadline elevator. I'm just wondering if you have any comments one way or the other? My concern would be that while noop might make things globally optimal, it would still allow starvation in a single guest, which might lead to ocfs2 fencing.

Thanks, Brian
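For reference, the two elevators under discussion are usually selected in one of two ways — a sketch only; the device name sdb and the grub config path are placeholders, and guest distros vary:

```text
# At boot, for all block devices, via the kernel command line
# (e.g. in grub's menu.lst), as the linked article suggests for guests:
kernel /vmlinuz-2.6.30 root=/dev/sda1 ro elevator=noop

# Or at runtime, per device, via sysfs (current choice shown in brackets):
cat /sys/block/sdb/queue/scheduler
echo noop > /sys/block/sdb/queue/scheduler
```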
Re: [Ocfs2-users] esx elevator=noop
At least in ESX you can set up various reservations and priority weightings for various resources to ensure that some machines are considered more important than others. As to what actually happens in practice, who knows.

Thanks for the info, Brian

Herbert van den Bergh herbert.van.den.be...@oracle.com 2010-01-15 11:20:

I would assume that all bets are off if you're running a cluster inside a VM. There are no guarantees for either I/O or CPU scheduling.

Thanks, Herbert.

On 01/15/2010 10:50 AM, Sunil Mushran wrote:
The deadline recommendation was for early EL4 kernels that had a bug in cfq. That bug was fixed years ago. I am unsure how using noop in the guest would trigger starvation. Not that I am recommending it; I have not thought about this much.

On Jan 15, 2010, at 9:55 AM, Brian Kroth bpkr...@gmail.com wrote:
http://lonesysadmin.net/2008/02/21/elevatornoop/ I ran across this recently which describes, when operating in a virtual environment with shared storage, how to try and let the storage and hypervisor deal with arranging disk write operations in a more globally optimal way rather than having all the guests try to do it and muck it up. However, this is contrary to ocfs2 recommendation of using the deadline elevator. I'm just wondering if you have any comments one way or the other? My concern would be that while noop might make things globally optimal it would still allow starvation in a single guest which might lead to ocfs2 fencing. Thanks, Brian
Re: [Ocfs2-users] Combining OCFS2 with Linux software RAID-0?
Luis Freitas lfreita...@yahoo.com 2009-12-11 05:40:

Patrick, Depending on what you are using, you could use the volume manager to do the striping, but you need to use CLVM. So if you can, go for Heartbeat2+CLVM+OCFS2, all integrated. I'm not sure, but I think Heartbeat2+OCFS2 is only available on the vanilla kernels, not on the enterprise ones. Maybe SUSE has support; I don't know, you will have to check.

Best Regards, Luis Freitas

Just to elaborate on these comments: last time I checked, CLVM required the openais/cman cluster stack, which neither heartbeat nor ocfs2 uses (by default). The userspace stack option for ocfs2 in recent mainline kernels added support for the openais stack, and pacemaker is required to make heartbeat work with that rather than use its own cluster stack.

Now, you can do a basic LVM linear span, concatenation, or whatever you want to call it without any cluster stack, so long as it's not striped and so long as you heed Sunil's warning about fat-fingering changes to the thing while more than one host is using it. That means that if you want to add another LUN to the span you can't do it on the fly. You have to do something like this:

# On all nodes:
umount /ocfs2
# On all nodes but one:
vgchange -an ocfs2span
# Or, to be extra safe:
halt -p
# On the remaining node:
vgextend ocfs2span /dev/newlun
lvextend -l+100%FREE /dev/mapper/ocfs2span-lv
tunefs.ocfs2 -S /dev/mapper/ocfs2span-lv
# You might actually need the fs mounted for that last bit, I forget.
# Probably a fsck somewhere in there would be wise as well.
# Bring the other nodes back up.

Brian

--- On Wed, 12/9/09, Patrick J. LoPresti lopre...@gmail.com wrote:

From: Patrick J. LoPresti lopre...@gmail.com
Subject: [Ocfs2-users] Combining OCFS2 with Linux software RAID-0?
To: ocfs2-users@oss.oracle.com, linux-r...@vger.kernel.org
Date: Wednesday, December 9, 2009, 9:03 PM

Is it possible to run an OCFS2 file system on top of Linux software RAID? Here is my situation. I have four identical disk chassis that perform hardware RAID internally. Each chassis has a pair of fiber channel ports, and I can assign the same LUN to both ports. I want to connect all of these chassis to two Linux systems. I want the two Linux systems to share a file system that is striped across all four chassis for performance.

I know I can use software RAID (mdadm) to do RAID-0 striping across the four chassis on a single machine; I have tried this, it works fine, and the performance is tremendous. I also know I can use OCFS2 to create a single filesystem on a single chassis that is shared between my two Linux systems. What I want is to combine these two things.

Suse's documentation (http://www.novell.com/documentation/sles11/stor_admin/?page=/documentation/sles11/stor_admin/data/raidyast.html) says:

IMPORTANT: Software RAID is not supported underneath clustered file systems such as OCFS2, because RAID does not support concurrent activation. If you want RAID for OCFS2, you need the RAID to be handled by the storage subsystem.

Because my disk chassis already perform hardware RAID-5, I only need Linux to do the striping (RAID-0) in software. So for me, there is no issue about which node should rebuild the RAID, etc. I understand that Linux md stores metadata on the partitions and is not cluster aware, but will this create problems for OCFS2 even if it is just RAID 0? Has anybody tried something like this? Are there alternative RAID-0 solutions for Linux that would be expected to work?

Thank you. - Pat
Re: [Ocfs2-users] OCFS2 and VMware ESX
Late to the party... Here's what I did to get OCFS2 going with RDMs _and_ VMotion (with some exceptions). Almost surely not supported, but it works: First, create the RDM as a physical passthru using either the gui or the cli. It need not be on a separate virtual controller. However, to have VMotion working it must _not_ use bus sharing. For the other nodes setup another passthru RDM that uses the same path as the previous one. You must do this on the cli since the GUI will hide the LUN path once it's been configured any where else. # Query the path: cd /vmfs/volumes/whereever/node1 vmkfstools -q node1_1.vmdk # Make a passthru VMDK for that RDM cd /vmfs/volumes/wherever/node2 vmkfstools -z /vmfs/devices/disks/vmhba37\:25\:0\:0 node2.vmdk Now you can add that disk to the VM either by editing the vmx like David said, or with the GUI. Here's the catch - ESX locks that /vmfs/devices/disk/whatever path when a node starts using it, so you can't 1) Run more than one of these vm nodes on the same esx node 2) Migrate one of these vm nodes to an esx node that is already running another of the vm nodes. So, if your esx cluster is greater than your ocfs2 cluster, you're ok. Else, you need to stick to either the bus sharing method which means no vmotion, or the cluster in a box method which means all on one esx node (which kinda defeats the point in my opinion). I have noticed significant performance benefits from moving to RDMs vs sw iscsi virtualized in the guest os. It's the only reason I'd risk all this. As to snapshots I've been told by the vmware techs not to use them for production level things (or at least not for very long) as they can really kill performance. We do snapshots on the raid device serving the lun and restrict its visibility to a particular machine (actually another vm) in order to do all of the backups from there. Actually it had to be on a separate VM outside the normal cluster else the machine would refuse to mount it. 
Works out better this way anyways, since then we're not burdening the production VM with the backup work (though it still hits the same storage device). The script that does this also deals with all of the necessary fs fixups to have multiple snapshots mounted at once, though I think the 1.4.2 version of ocfs2-tools provides a cloned fs option for doing all that now. Brian David Murphy da...@icewatermedia.com 2009-10-22 09:10: With RDM, versus the method Kent described, it's a bit more complicated and will prevent snapshots and vmotion. Basically follow what he said, but instead of making a vmdk disk choose RDM and select a LUN. Then make sure that machine is NOT powered on, log into the ESX host, and move the RDM file to, say, /vmfs/volumes/volume_name/RawDeviceMaps (you need to make that folder). Next manually edit the VMX for that host and change its path to the RDM to wherever you moved it. Now you can create new clones of your base template and add the RDM drive to it (as Kent mentioned, it's VERY important), pointing to the RawDeviceMaps folder and the correct RDM file for that LUN. This approach has many issues, so I'm planning on moving away from it. 1) You can't clone 2) You can't snapshot 3) You can't vmotion 4) If you delete a host that has that drive attached you completely destroy the RDM file. (BAD JOJO) If you do need to have a cluster in such an environment I would suggest a combination of the 2 approaches. 1) Build a new LUN, make it VMFS, and let the ESX hosts discover it. 2) Create the VMDKs on that LUN, not in your main VMFS for VMs 3) Make sure you set any OCFS drive to a separate controller and physical, persistent (so it won't snapshot it) You should retain snap/vmotion. But be aware: I am not sure if cloning will make a new vmdk on the VMFS volume you make for the ocfs drives. So I would have a base template I clone, then add that drive to the clone (to guarantee the drive's location). 
It's a bit more work than just saving the VMDK to the VM's folder on your main VMFS, but it separates the OCFS drives onto another LUN. So you could easily stop your cluster, take a snapshot of the LUN for backups, and bring them back up, limiting your downtime window. Might be overkill depending on the company's backup stance. Hope it helps David From: ocfs2-users-boun...@oss.oracle.com [mailto:ocfs2-users-boun...@oss.oracle.com] On Behalf Of Rankin, Kent Sent: Monday, July 28, 2008 9:13 PM To: Haydn Cahir; ocfs2-users@oss.oracle.com Subject: Re: [Ocfs2-users] OCFS2 and VMware ESX What I did a few days ago was to create a vmware disk for each OCFS2 filesystem, and store it with one of the VM nodes. Then, add that disk to each
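For reference, Brian's vmkfstools steps above can be condensed into a script shape like this. This is a dry-run sketch only, not something vetted on a live ESX host: the datastore paths and the vmhba LUN ID are the hypothetical ones from the example, and DRY_RUN just prints each command for review.

```shell
#!/bin/sh
# Dry-run sketch of sharing one physical passthru RDM between two VM
# nodes. Paths and the vmhba LUN ID are hypothetical examples.
DRY_RUN=yes
run() { if [ "$DRY_RUN" = yes ]; then echo "$@"; else "$@"; fi; }

LUN=/vmfs/devices/disks/vmhba37:25:0:0

# Node 1 already owns the physical passthru RDM; query it to confirm
# which LUN path backs it.
run vmkfstools -q /vmfs/volumes/wherever/node1/node1_1.vmdk

# Node 2: make a second passthru VMDK against the SAME LUN path.
# This has to happen on the CLI because the GUI hides a claimed LUN.
run vmkfstools -z "$LUN" /vmfs/volumes/wherever/node2/node2.vmdk
```

Set DRY_RUN to anything but "yes" to actually execute the commands on the host.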
Re: [Ocfs2-users] more ocfs2_delete_inode dmesg questions
So, we've found that this has actually been causing some dropped mail and backlogs. Here's the situation: MX servers filter into the main mail server, all running sendmail. The main mail server has an OCFS2 spool volume which will periodically throw those error messages in dmesg that I listed earlier. Sendmail returns one of these two sequences: Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 0: fl=0x0, mode=20666: CHR: dev=0/13, ino=679, nlink=1, u/gid=0/0, size=0 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 1: fl=0x802, mode=140777: SOCK smtp/25-mx2/54625 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 2: fl=0x1, mode=20666: CHR: dev=0/13, ino=679, nlink=1, u/gid=0/0, size=0 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 3: fl=0x2, mode=140777: SOCK localhost-[[UNIX: /dev/log]] Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 4: fl=0x802, mode=140777: SOCK smtp/25-mx2/54625 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 5: fl=0x0, mode=100640: dev=8/33, ino=580655, nlink=1, u/gid=0/25, size=164461 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 6: fl=0x8000, mode=100644: dev=8/1, ino=175982, nlink=1, u/gid=107/25, size=12288 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 7: fl=0x8000, mode=100644: dev=8/1, ino=175982, nlink=1, u/gid=107/25, size=12288 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 8: fl=0x8000, mode=100644: dev=8/1, ino=175663, nlink=1, u/gid=107/25, size=49152 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 9: fl=0x802, mode=140777: SOCK smtp/25-mx2/54625 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 10: fl=0x8000, mode=100644: dev=8/1, ino=175663, nlink=1, u/gid=107/25, size=49152 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 11: fl=0x8000, mode=100644: dev=8/1, ino=175662, nlink=1, u/gid=107/25, size=2621440 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 12: fl=0x8000, mode=100644: dev=8/1, ino=175662, nlink=1, u/gid=107/25, size=2621440 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 13: 
fl=0x8000, mode=100644: dev=8/1, ino=175622, nlink=1, u/gid=107/25, size=2543616 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: 14: fl=0x8000, mode=100644: dev=8/1, ino=175622, nlink=1, u/gid=107/25, size=2543616 Oct 6 16:46:02 iris sm-mta[14407]: n96Lk23h014407: SYSERR(root): queueup: cannot create queue file ./qfn96Lk23h014407, euid=0, fd=-1, fp=0x0: File exists This results in dropped mail. Oct 6 16:26:09 iris sm-mta[29393]: n96LQ4uK029393: SYSERR(root): collect: bfcommit(./dfn96LQ4uK029393): already on disk, size=0: File exists This results in a failure being propagated to the MX, which then backlogs mail for a while before it retries. The debugfs.ocfs2 command allows me to identify the file, which turns out to be a 0-byte file on the spool volume. If it's a data file I can usually examine the corresponding control file, or vice versa, but before too long both are removed, or rather replaced by new files. I'm not sure if that's just sendmail shoving lots of files around and therefore needing to reuse inodes, or sendmail cleaning up after itself, or ocfs2 cleaning up after itself. For now I've moved the spool volume back to a local disk, but may try to test this out some more on our test setup. Any thoughts? Thanks, Brian Brian Kroth bpkr...@gmail.com 2009-08-25 08:52: Sunil Mushran sunil.mush...@oracle.com 2009-08-24 18:12: So a delete was called for some inodes that had not been orphaned. The pre-checks detected the same and correctly aborted the deletes. No harm done. Very good to hear. No, the messages do not pinpoint the device. It's something we discussed adding, but have not done it as yet. Next time this happens and you can identify the volume, do: # debugfs.ocfs2 -R findpath 613069 /dev/sdX This will tell you the pathname for the inode#. Then see if you can remember performing any op on that file. Anything. It may help us narrow down the issue. Sunil Will do. 
Thanks again, Brian Brian Kroth wrote: I recently brought up a mail server with two ocfs2 volumes on it, one large one for the user maildirs, and one small one for queue/spool directories. More information on the specifics below. When flushing the queues from the MXs I saw the messages listed below fly by, but since then nothing. A couple of questions: - Should I be worried about these? They seemed similar yet different to a number of other out of space and failure to delete reports of late. - How can I tell which volume has the problem inodes? - Is there anything to be done about them? Here's the snip from the tail of dmesg: [ 34.578787] netconsole: network logging started [ 36.695679] ocfs2: Registered cluster interface o2cb [ 43.354897] OCFS2 1.5.0 [ 43.373100] ocfs2_dlm: Nodes in domain (94468EF57C9F4CA18C8D218C63E99A9C): 1 [ 43.386623] kjournald2 starting: pid 2328
Re: [Ocfs2-users] OCFS2 1.4 Problem on SuSE
Angelo McComis ang...@mccomis.com 2009-09-29 11:19: I'm sorry -- it's lvm2, and yes. :-) On Tue, Sep 29, 2009 at 10:41 AM, Charlie Sharkey charlie.shar...@bustech.com wrote: It was mentioned: - Checked our lvm configuration - seems to be good as well. Is lvm supported by ocfs2 ? I didn't think this part was true. The issue being that all nodes need to be aware of possible metadata changes to the volume group and logical volumes. clvm (which I believe is supported by Novell) can handle that locking between nodes so that they have a consistent view of the metadata, but last I checked it used a different cluster stack that wasn't quite supported by ocfs2 yet, and running both side by side would run into some fencing issues. Alternatively, I think you can (read: unsupported, but does work) do simple LVM configurations like linear spans, since they don't have any striping metadata that needs to be updated. The trick is that you need to take everything offline when you want to make any changes to the volume group or logical volume. Brian ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
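To make the "take everything offline" caveat concrete, a grow of a linear-span volume would look roughly like this. Untested sketch: the VG/LV names, mountpoint, and size are made up, DRY_RUN only prints the commands, and you should check tunefs.ocfs2(8) on your version for the exact resize flag.

```shell
#!/bin/sh
# Untested sketch of growing an OCFS2 volume on a linear LVM span.
# Hypothetical names/sizes; DRY_RUN=yes prints instead of executing.
DRY_RUN=yes
run() { if [ "$DRY_RUN" = yes ]; then echo "$@"; else "$@"; fi; }

LV=/dev/vg0/ocfs2vol

# 1. Unmount on EVERY node -- with plain (non-clustered) LVM, no other
#    node may touch the device while its metadata changes.
run umount /cluster

# 2. Append space to the logical volume (linear only, no striping).
run lvextend -L +50G "$LV"

# 3. Grow the filesystem into the new space, then sanity-check it.
run tunefs.ocfs2 -S "$LV"
run fsck.ocfs2 -f "$LV"

# 4. On the other nodes, re-read the LVM metadata (vgscan; vgchange -ay)
#    before mounting again anywhere.
```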
Re: [Ocfs2-users] Clear Node
You can do what Sunil mentioned using heartbeat [1]. However, MySQL also has replication built into it, and you can also use heartbeat to automatically turn a slave into a master very quickly without any need for shared storage. This way you could also use the slave to do load balancing of reads and provide backups without interrupting access on the master when all the tables and databases get locked. Brian [1] http://linux-ha.org Sunil Mushran sunil.mush...@oracle.com 2009-08-25 17:14: Can you describe the mount lock? You don't have to limit the mount to just one node. Have both nodes mount the volume but run mysql on one node only. Sunil James Devine wrote: I am trying to make a mysql standby setup with 2 machines, one primary and one hot standby, which both share disk for the data directory. I used tunefs.ocfs2 to change the number of open slots to 1 since only one machine should be accessing it at a time. This way it is fairly safe to assume one shouldn't clobber the other's data. Only problem is, if one node dies, the mount lock still persists. Is there a way to clear that lock so the other node can mount the share? ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
Re: [Ocfs2-users] more ocfs2_delete_inode dmesg questions
Sunil Mushran sunil.mush...@oracle.com 2009-08-24 18:12: So a delete was called for some inodes that had not been orphaned. The pre-checks detected the same and correctly aborted the deletes. No harm done. Very good to hear. No, the messages do not pinpoint the device. It's something we discussed adding, but have not done it as yet. Next time this happens and you can identify the volume, do: # debugfs.ocfs2 -R findpath 613069 /dev/sdX This will tell you the pathname for the inode#. Then see if you can remember performing any op on that file. Anything. It may help us narrow down the issue. Sunil Will do. Thanks again, Brian Brian Kroth wrote: I recently brought up a mail server with two ocfs2 volumes on it, one large one for the user maildirs, and one small one for queue/spool directories. More information on the specifics below. When flushing the queues from the MXs I saw the messages listed below fly by, but since then nothing. A couple of questions: - Should I be worried about these? They seemed similar yet different to a number of other out of space and failure to delete reports of late. - How can I tell which volume has the problem inodes? - Is there anything to be done about them? Here's the snip from the tail of dmesg: [ 34.578787] netconsole: network logging started [ 36.695679] ocfs2: Registered cluster interface o2cb [ 43.354897] OCFS2 1.5.0 [ 43.373100] ocfs2_dlm: Nodes in domain (94468EF57C9F4CA18C8D218C63E99A9C): 1 [ 43.386623] kjournald2 starting: pid 2328, dev sdb1:36, commit interval 5 seconds [ 43.395413] ocfs2: Mounting device (8,17) on (node 1, slot 0) with ordered data mode. [ 44.984201] eth1: no IPv6 routers present [ 54.362580] warning: `ntpd' uses 32-bit capabilities (legacy support in use) [ 1601.560932] ocfs2_dlm: Nodes in domain (10BBA4EB7687450496F7FCF0475F9372): 1 [ 1601.581106] kjournald2 starting: pid 7803, dev sdc1:36, commit interval 5 seconds [ 1601.593065] ocfs2: Mounting device (8,33) on (node 1, slot 0) with ordered data mode. 
[ 3858.778792] (26441,0):ocfs2_query_inode_wipe:882 ERROR: Inode 613069 (on-disk 613069) not orphaned! Disk flags 0x1, inode flags 0x80 [ 3858.779005] (26441,0):ocfs2_delete_inode:1010 ERROR: status = -17 [ 4451.007580] (5053,0):ocfs2_query_inode_wipe:882 ERROR: Inode 613118 (on-disk 613118) not orphaned! Disk flags 0x1, inode flags 0x80 [ 4451.007711] (5053,0):ocfs2_delete_inode:1010 ERROR: status = -17 [ 4807.908463] (11859,0):ocfs2_query_inode_wipe:882 ERROR: Inode 612899 (on-disk 612899) not orphaned! Disk flags 0x1, inode flags 0x80 [ 4807.908611] (11859,0):ocfs2_delete_inode:1010 ERROR: status = -17 [ 5854.377155] (31074,1):ocfs2_query_inode_wipe:882 ERROR: Inode 612867 (on-disk 612867) not orphaned! Disk flags 0x1, inode flags 0x80 [ 5854.377302] (31074,1):ocfs2_delete_inode:1010 ERROR: status = -17 [ 6136.297464] (3463,0):ocfs2_query_inode_wipe:882 ERROR: Inode 612959 (on-disk 612959) not orphaned! Disk flags 0x1, inode flags 0x80 [ 6136.297555] (3463,0):ocfs2_delete_inode:1010 ERROR: status = -17 [19179.000100] NOHZ: local_softirq_pending 80 There's actually three nodes, all VMs, that are setup for the ocfs2 cluster volumes, but only one has it mounted. The others are available as cold standbys that may eventually be managed by heartbeat, so there shouldn't be any locking contention going on. All nodes are running 2.6.30 with ocfs2-tools 1.4.2. 
Here are the commands used to make the volumes: mkfs.ocfs2 -v -L ocfs2mailcluster2 -N 8 -T mail /dev/sdb1 mkfs.ocfs2 -v -L ocfs2mailcluster2spool -N 8 -T mail /dev/sdc1 The features they were set up with: tunefs.ocfs2 -Q "Label: %V\nFeatures: %H %O\n" /dev/sdb1 Label: ocfs2mailcluster2 Features: sparse inline-data unwritten tunefs.ocfs2 -Q "Label: %V\nFeatures: %H %O\n" /dev/sdc1 Label: ocfs2mailcluster2spool Features: sparse inline-data unwritten And their mount options: mount | grep cluster /dev/sdb1 on /cluster type ocfs2 (rw,noexec,nodev,_netdev,relatime,localflocks,heartbeat=local) /dev/sdc1 on /cluster-spool type ocfs2 (rw,noexec,nodev,_netdev,relatime,localflocks,heartbeat=local) localflocks because I ran into a problem with them previously, and since it's a single-active-node model currently there's no reason for them anyways. Let me know if you need any other information. Thanks, Brian ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
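Since the errors in the thread above log only inode numbers, Sunil's debugfs.ocfs2 findpath suggestion can be scripted against the dmesg output. A sketch, untested against a live volume: the device name is a stand-in, and it only prints the debugfs.ocfs2 commands rather than running them.

```shell
#!/bin/sh
# Sketch: map the inode numbers from "not orphaned" dmesg errors back to
# pathnames with debugfs.ocfs2. DEVICE is a stand-in -- substitute the
# volume you identified.
DEVICE=/dev/sdc1

# Pull the inode numbers out of lines like:
#   (26441,0):ocfs2_query_inode_wipe:882 ERROR: Inode 613069 (on-disk 613069) not orphaned!
extract_inodes() {
    grep 'not orphaned' | sed -n 's/.*ERROR: Inode \([0-9]*\) .*/\1/p' | sort -u
}

# Demo on two lines from the dmesg snippet in this thread; feed it
# "dmesg" output instead in real use. Prints one debugfs.ocfs2 command
# per unique inode, for review before running anything.
sample='[ 3858.778792] (26441,0):ocfs2_query_inode_wipe:882 ERROR: Inode 613069 (on-disk 613069) not orphaned! Disk flags 0x1, inode flags 0x80
[ 4451.007580] (5053,0):ocfs2_query_inode_wipe:882 ERROR: Inode 613118 (on-disk 613118) not orphaned! Disk flags 0x1, inode flags 0x80'

printf '%s\n' "$sample" | extract_inodes | while read ino; do
    echo debugfs.ocfs2 -R "findpath $ino" "$DEVICE"
done
```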
Re: [Ocfs2-users] Ghost files in OCFS2 filesystem
I didn't see this in the bug list. Which mainline release is this fixed in? Thanks, Brian Sunil Mushran sunil.mush...@oracle.com 2009-08-20 17:46: Yes, this is a known issue in OCFS2 1.4.1 and 1.4.2. That is assuming no process in the cluster has that file open. We have the fix. It will be available with 1.4.3 which is in testing. This was discussed in the email announcing the 1.4.2 release. http://oss.oracle.com/pipermail/ocfs2-announce/2009-June/28.html 1. Oracle# 7643059 - Orphan files not getting deleted When one unlinks a file, its inode is moved to the orphan directory and deleted when it is no longer in-use across the cluster. As part of the scheme, the node that unlinks the file, informs interested nodes of the same, asking the last node to stop using that inode to recover the space allocated to it. However, this scheme fails if memory pressure forces a node to forget to delete the inode on close. This issue was introduced in OCFS2 1.4.1. While we have fixed this issue, the fix did not make it into this release. Users running into this issue can call Oracle Support and ask for an interim release with this fix. Workaround: If on 1.4.2, mount the fs on another node. Chances are it will delete the orphans. If not or if you are on 1.4.1, umount vol on all nodes and run fsck.ocfs2 -f. You could ping support to get an interim fix. But we are close to releasing 1.4.3. So maybe better if you wait for that. Sunil Shave, Chris wrote: Hi, I have encountered an issue on an Oracle RAC cluster using ocfs2, OS is RH Linux 5.3. One of the ocfs2 filesystems appears to be 97% full, yet when I look at the files in there they only equal about 13gig (filesystems is 40gig in size). 
I have seen this sort of thing in HP-UX, but that involved a process whose output file was deleted while the process hadn't been stopped properly; once we killed the offending process the space was released. But I can't seem to find any process on this Linux server that is using or writing files to that filesystem. File system: Filesystem Size Used Avail Use% Mounted on /dev/emcpoweri1 40G 39G 1.9G 96% /oraexport File listing: Other than directories, these are the only files in that filesystem, nothing in lost+found either. [r...@aumel21db01cn01]# ll total 11926784 -rw-rw 1 oracle oinstall 12210978816 Aug 21 05:23 0.Full.090821.dmp -rw-rw-r-- 1 oracle oinstall 1920327 Aug 21 05:23 0.Full.090821.log Any assistance with what is going on would be greatly appreciated. ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
Re: [Ocfs2-users] OCFS2 on email storage
I just did this: mkfs.ocfs2 -v -L ocfs2mail -N 8 -T mail /dev/sdb1 The tools happen to choose -b 4096 -C 4096 for you at that point. Brian Sérgio Surkamp ser...@gruposinternet.com.br 2009-08-12 12:03: On Wed, 12 Aug 2009 15:05:44 +0800, Thomas G. Lau thomas@ntt.com.hk wrote: Dear all, Anyone using OCFS2 on an email storage system? (postfix/qmail). Wondering if any of you suffer any problem?! Also, what parameters did you tune for the email system only? Thanks. ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users Hello Thomas, We are using it as a general purpose shared filesystem (email, shared files, some web related files, etc...) and the only drawback versus a local filesystem is that it's a bit slower due to the cluster stack. As for tuning, we have reduced the block size and pre-allocated size to reduce the on-disk usage, since email messages usually are tiny files (let's say about 8K per message). Our setup: mkfs.ocfs2 -b 4096 -C 4096 -N 4 /dev/sdX If you plan to use it for a large number of accounts, keep your eye on the 32000 sub-directories limitation. Regards, -- Sérgio Surkamp | Gerente de Rede | ser...@gruposinternet.com.br Grupos Internet S.A. | R. Lauro Linhares, 2123 Torre B - Sala 201, Trindade - Florianópolis - SC | +55 48 3234-4109 | http://www.gruposinternet.com.br ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
Re: [Ocfs2-users] [Ocfs2-announce] OCFS2 1.4.2-1 and OCFS2 Tools 1.4.2-1 released
Sunil Mushran sunil.mush...@oracle.com 2009-06-16 16:38: LOOKING AHEAD We are aiming to release OCFS2 1.6 later this year. This release will include the features that we have worked on over the past year. These are: 1. Extended Attributes (unlimited number of attributes) 2. POSIX ACLs 3. Security Attributes 4. Metadata Checksums with ECC (inodes and directories are checksummed) 5. JBD2 Support 6. Indexed Directories (subdirs number increased from 32000 to 2147483647) 7. REFLINK (inode snapshotting) Just looking over these as well: http://kernelnewbies.org/Linux_2_6_29#head-2febaacb9f9bef03ee54da9a2b026fdea824a996 http://kernelnewbies.org/Linux_2_6_30#head-1a54a63244fb0d85375f8ecbe651cf94dac38c6c For those of us wanting to play with the 2.6.30 kernel, does the new ocfs2-tools release support any of these features? In particular are ACLs, extended attributes, indexed directories, or optimized inode allocations supported by the tools? Last I had heard they weren't quite ready for some of those yet. Thanks, Brian ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
Re: [Ocfs2-users] Filesystem corruption and OCFS2 errors
Sunil Mushran sunil.mush...@oracle.com 2009-05-20 15:36: Well, as long as the LVM mappings remain consistent on all nodes, it will work. The problem is that if someone changes the setup on a node, you will encounter the problem you just did. The only safe way is to have the lvm clustered too. Whereas clvm is clustered, we would prefer supporting it if we can run the fs and it, using one clusterstack. SLES11 HAE will have support for this. We hope to have the same by (RH)EL6. That's what's always held me back from doing this as well. Will the common stack be the openais stack (ie: the so called user stack), the o2cb stack, or something completely different? Thanks, Brian ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
Re: [Ocfs2-users] [Fwd: Re: Unable to fix corrupt directories with fsck.ocfs2]
Luis Freitas lfreita...@yahoo.com 2009-05-20 10:46: I am not aware of any filesystem that can withstand an online fsck. Sun ZFS can do online correction, but it doesn't have a fsck tool. I hear btrfs will support this. It may be a feature that's easier to accomplish with copy-on-write. Brian ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
Re: [Ocfs2-users] OCFS2 FS with BACKUP Tools/Vendors
We've used Veritas NetBackup before without problems but are currently toying with rsyncing to ZFS (running on OpenSolaris) with fs compression and daily (ZFS) snapshots, and then possibly dumping to tape. It's working really well so far. All of this actually happens from a SAN-based snapshot of the OCFS2 volume so that we can have a point-in-time backup and not hassle the production machines with backups (even though the underlying storage is still being taxed). Brian Bumpass, Brian brian.bump...@wachovia.com 2009-04-02 14:30: My apology up front if this has been discussed already. I've reviewed the archives back to Nov. 2005 and found little of anything. I need some information concerning support for OCFS2 by backup products. Currently we use IBM/Tivoli's TSM tool. They don't support OCFS2 filesystems, and it looks like they have no intent to support the FS in upcoming releases. Note: they do support their own NAS FS, GPFS, but this costs extra. Additionally, in the small testing I have done, a file under an OCFS2 FS backs up and recovers quite nicely. I have not tested using ACL lists, but don't really care about those. This issue comes down to support. So... I guess what I am looking for is some indication of what the user community's experience with OCFS2 and backups has been along similar issues. Sorry... The environment being supported is SLES 10 SP2 64-bit on DELL and HP hardware. Thanks in advance, -B ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
Re: [Ocfs2-users] ocfs2 backup strategies
Uwe Schuerkamp uwe.schuerk...@nionex.net 2009-03-13 10:42: Hi folks, I was wondering what a good backup strategy for ocfs2-based clusters looks like. Background: We're running a cluster of 8 SLES 10 sp2 machines sharing a common SAN-based FS (/shared) which is about 350g in size at the moment. We've already taken care of the usual optimizations concerning mount options on the cluster nodes (noatime and so on), but our backup software (bacula 2.2.8) slows to a crawl when encountering directories in this filesystem that contain quite a few small files. Data rates usually average in the tens of MB/sec doing normal backups of local filesystems on remote machines in the same LAN, but with the ocfs2 fs bacula is hard pressed not to fall below 1 MB/sec sustained throughput, which obviously isn't enough to back up 350g of data in a sensible timeframe. I've already tried disabling compression, rsync'ing to another server and so on, but so far nothing has helped with improving data rates. How would reducing the number of cluster nodes help with backups? Is there a dirty read option in ocfs2 that would allow reading the files without locking them first, or something similar? I don't think bacula is the culprit as it easily manages larger backups in the same environment; even reading off smb shares is orders of magnitude faster in this case, so my guess is I'm missing out on some non-obvious optimization that would improve ocfs2 cluster performance. Thanks in advance for any pointers all the best, Uwe This clearly may not work for all cases and I'm sure is totally unsupported, but our SAN (Equallogic) has the ability to take RW snapshots, which is where we do our backups from. There was a thread a while back about the proper way to do this. Basically, after taking the snapshot you need to fix up the filesystem in a couple of different ways (fsck, relabel, reuuid, etc.) so that the machine can mount several of these at once. If anyone's interested I can post these scripts. 
Since there's only one machine handling the snapshots and it's outside of the real ocfs2 cluster, while we're doing the fixups we also convert the snapshot to a local fs and finally remount it ro. This prevents all network locking from happening (since it's unnecessary) while the backups run. We're doing this with a 2TB mail volume (~700G of _many_ small files) and haven't noticed any problems with it. I think you could probably achieve something similar by taking the number of active nodes in the cluster down to 1 during your backup window, but that has its own problems to be concerned with. I think a simple umount /shared on all but that one node would do it. Brian ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
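For anyone curious before the actual scripts get posted, the fixup sequence described above has roughly this shape. It's an untested sketch under assumptions: the device path, label, and mountpoint are made up, the --cloned-volume shortcut is only what newer ocfs2-tools reportedly provide, and DRY_RUN just prints each command.

```shell
#!/bin/sh
# Sketch of fixing up a SAN snapshot of an OCFS2 LUN so it can be
# mounted on the backup host alongside other snapshots. Hypothetical
# device/mountpoint; DRY_RUN=yes prints commands instead of running them.
DRY_RUN=yes
run() { if [ "$DRY_RUN" = yes ]; then echo "$@"; else "$@"; fi; }

SNAP=/dev/mapper/snap-mail
MNT=/backup/mail

# 1. Check/replay the snapshot before touching it.
run fsck.ocfs2 -fy "$SNAP"

# 2. New UUID and label so it can't be confused with the live volume
#    (newer tools reportedly wrap this as "tunefs.ocfs2 --cloned-volume").
run tunefs.ocfs2 -U "$SNAP"
run tunefs.ocfs2 -L mail-backup "$SNAP"

# 3. Flip it to a local (non-clustered) fs so no network locking runs
#    during the backup, then mount read-only.
run tunefs.ocfs2 -M local "$SNAP"
run mount -o ro "$SNAP" "$MNT"
```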
Re: [Ocfs2-users] quota/acl support
Attempting this again since I got a DSN earlier. Brian Kroth bpkr...@gmail.com 2009-02-25 10:25: I'm doing some research on the possibility of using OCFS2 to serve users' home directories and other shared space. I noticed that quota and posix acl support was added in 2.6.29 but the tools are not there yet. When can we expect that? Also, are the quotas implemented on a directory or volume level? Thanks, Brian ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
[Ocfs2-users] quota/acl support
I'm doing some research on the possibility of using OCFS2 to serve users' home directories and other shared space. I noticed that quota and posix acl support was added in 2.6.29 but the tools are not there yet. When can we expect that? Also, are the quotas implemented on a directory or volume level? Thanks, Brian ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
Re: [Ocfs2-users] ocfs2 hangs during webserver usage
I've run a web cluster with OCFS2 for almost two years now and found the local log files option to work just fine. You can use various tools to merge them, though the things that come with awstats have suited my tastes. As for monitoring multiple nodes logs during troubleshooting, I've used swatch and cssh. This may depend upon your setup, but in general the individual node that a client is connected to doesn't change frequently. Here are some other possible solutions to the problems you mentioned. David Johle djo...@industrialinfo.com 2009-01-28 13:53: 1) Lots of VirtualHosts, and I believe the correlation of a log with a particular host is lost when using syslog as there aren't enough facilities to allocate one per VH. Include the vhost name in the logging output. You already should be doing this if you're using mod_vhost_alias. If you really want separate logs per vhost have your syslog server split them back out. I know syslog-ng can do this. 2) A single syslog server is a single point of failure (for logging at least). I guess I could set up multiple syslog destinations and have each server send duplicate syslogs out. One method might be to use heartbeat (linux-ha.org) and rotate the service ip that syslog clients send to based on the health of a couple of syslog servers. 3) More network overhead, especially in the case of multiple log servers. You already have that to some extent if you're logging to OCFS2. 4) Doesn't address combined logging of non-Apache processes (e.g. Tomcat), which may eventually have the same issue. I did see mod_log_spread, which sounds like a promising alternative to apache syslogging as it addresses #1-3 above. http://www.backhand.org/mod_log_spread At 12:44 PM 1/28/2009, Sean Gray wrote: Why not just setup a syslog server and send all your apache logs to a central repository. 
Here is a quick tutorial http://www.oreillynet.com/pub/a/sysadmin/2006/10/12/httpd-syslog.html ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
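On the "split them back out" idea with syslog-ng mentioned above, a minimal config fragment along these lines is the general shape. A sketch only, with made-up paths; to split per-vhost rather than per-host you'd still need the vhost name in the message (or in $PROGRAM).

```
source s_net { udp(ip(0.0.0.0) port(514)); };

destination d_apache {
    # One file per sending web node and program; $HOST and $PROGRAM are
    # syslog-ng macros expanded per message.
    file("/var/log/web/$HOST/$PROGRAM.log" create_dirs(yes));
};

log { source(s_net); destination(d_apache); };
```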
Re: [Ocfs2-users] OCFS2 replication
Our iSCSI SAN (Equallogic) does block-level replication, so we were thinking of trying to set something up soon so that we could have some nodes in another building connected via fiber to provide site-level failover. I'll report back our experiences when we do that, but I imagine it would be similar to drbd with a nice interconnect. Brian On Jan 22, 2009, at 2:08 PM, CA Lists li...@creativeanvil.com wrote: Can't say I've replicated it between two sites, but definitely between two physical servers. I used drbd in my particular case. Here's a small blog entry I put together a while back about what I did. Hopefully it's helpful: http://www.creativeanvil.com/blog/2008/how-to-create-an-iscsi-san-using-heartbeat-drbd-and-ocfs2/ Joe Koenig Creative Anvil, Inc. Phone: 314.692.0338 1346 Baur Blvd. Olivette, MO 63132 j...@creativeanvil.com http://www.creativeanvil.com David Schüler wrote: What about drbd? From: ocfs2-users-boun...@oss.oracle.com [mailto:ocfs2-users-boun...@oss.oracle.com] On Behalf Of Garcia, Raymundo Sent: Thursday, 22 January 2009 20:46 To: ocfs2-users@oss.oracle.com Subject: Re: [Ocfs2-users] OCFS2 replication RSYNC is not real time... any other suggestion? I tried RSYNC already... From: ocfs2-users-boun...@oss.oracle.com [mailto:ocfs2-users-boun...@oss.oracle.com] On Behalf Of Sérgio Surkamp Sent: Thursday, January 22, 2009 12:32 PM Cc: ocfs2-users@oss.oracle.com Subject: Re: [Ocfs2-users] OCFS2 replication Try rsync. Garcia, Raymundo wrote: Hello... I am trying to replicate an OCFS2 filesystem at site A to another OCFS2-based partition at site B. I have tried several products (InMage, SteelEye, etc.) without any luck. Those programs help me replicate the filesystem but not the OCFS2 mount. I assume this is because most software-based replication systems work on the block level instead of the file level. I wonder if anyone has tried to replicate OCFS2 between 2 sites. 
Thanks Raymundo Garcia The information contained in this message may be confidential and legally protected under applicable law. The message is intended solely for the addressee(s). If you are not the intended recipient, you are hereby notified that any use, forwarding, dissemination, or reproduction of this message is strictly prohibited and may be unlawful. If you are not the intended recipient, please contact the sender by return e-mail and destroy all copies of the original message. ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users Regards, -- Sérgio Surkamp | Gerente de Rede | ser...@gruposinternet.com.br Grupos Internet S.A. | R. Lauro Linhares, 2123 Torre B - Sala 201, Trindade - Florianópolis - SC | +55 48 3234-4109 | http://www.gruposinternet.com.br ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
Re: [Ocfs2-users] flock errors in dmesg
Thanks for the info, you guys. I ended up having to take everything down, fsck, and remount with localflocks. Hopefully that will prevent it from happening again while we wait for the kernel fix to go stable. Nothing should be doing flock anymore anyway. Luckily we can take SAN-based snapshots of pre- and post-fsck to see what's changed. That script is still running; I'll report back if I have questions about that. So far it looks to be just the dovecot.index.cache files that got nuked, which is to be expected since that's what dovecot was flocking. Thanks again, Brian

On Jan 15, 2009, at 11:58 AM, Coly Li coly...@suse.de wrote:

Brian Kroth wrote: I've been working on creating a mail cluster using ocfs2. Dovecot was configured to use flock since the kernel we're running is a Debian-based 2.6.26, which supports cluster-aware flock. User space is 1.4.1. During testing everything seemed fine, but when we got a real load on things we got a whole bunch of these messages in dmesg on the node that was hosting imap. Note that it's maildir and only one node is hosting imap, so we don't actually need flock. I think we're going to switch back to dotlocking, but I was hoping someone could interpret these error messages for me? Are they dangerous?

This is a known issue; the patch was merged in 2.6.29-rc1. Here is the patch for your reference.

Author: Sunil Mushran sunil.mush...@oracle.com

ocfs2/dlm: Fix race during lockres mastery

dlm_get_lock_resource() is supposed to return a lock resource with a proper master. If multiple concurrent threads attempt to look up the lockres for the same lockid while the lock mastery is underway, one or more threads are likely to return a lockres without a proper master. This patch makes the threads wait in dlm_get_lock_resource() while the mastery is underway, ensuring all threads return the lockres with a proper master. This issue is known to be limited to users using the flock() syscall.
For all other fs operations, the ocfs2 dlmglue layer serializes the dlm op for each lockid. Users encountering this bug will see flock() return EINVAL and dmesg have the following error: ERROR: Dlm error DLM_BADARGS while calling dlmlock on resource LOCKID: bad api args

Reported-by: Coly Li co...@suse.de
Signed-off-by: Sunil Mushran sunil.mush...@oracle.com
Signed-off-by: Mark Fasheh mfas...@suse.com
---
7b791d68562e4ce5ab57cbacb10a1ad4ee33956e

 fs/ocfs2/dlm/dlmmaster.c | 9 +++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index cbf3abe..54e182a 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -732,14 +732,21 @@ lookup:
 	if (tmpres) {
 		int dropping_ref = 0;
 
+		spin_unlock(&dlm->spinlock);
+
 		spin_lock(&tmpres->spinlock);
+		/* We wait for the other thread that is mastering the resource */
+		if (tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
+			__dlm_wait_on_lockres(tmpres);
+			BUG_ON(tmpres->owner == DLM_LOCK_RES_OWNER_UNKNOWN);
+		}
+
 		if (tmpres->owner == dlm->node_num) {
 			BUG_ON(tmpres->state & DLM_LOCK_RES_DROPPING_REF);
 			dlm_lockres_grab_inflight_ref(dlm, tmpres);
 		} else if (tmpres->state & DLM_LOCK_RES_DROPPING_REF)
 			dropping_ref = 1;
 		spin_unlock(&tmpres->spinlock);
-		spin_unlock(&dlm->spinlock);
 
 		/* wait until done messaging the master, drop our ref to allow
 		 * the lockres to be purged, start over. */

Thanks, Brian

[snip]

-- Coly Li SuSE Labs

___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
Re: [Ocfs2-users] [Ocfs2-devel] Transport endpoint is not connected while mounting....
Add one for --srcport as well and I think you'll be ok. Actually, since my cluster traffic all goes over a separate switch, I usually just allow all traffic in/out of eth1. Brian

Bret Palsson b...@getjive.com 2009-01-15 08:12: So it looks like iptables is what is stopping it from working. After disabling iptables completely for 1 minute and then trying to mount on node 1, it worked fine. So my new question is: why did `iptables -A INPUT -ptcp --dport -j ACCEPT ; service iptables save` not allow ocfs2 to talk? What do people add to their iptables? -Bret

On Jan 14, 2009, at 4:50 PM, Sunil Mushran wrote: It's part and parcel of the fs. If you want mainline linux, go to http://kernel.org.

Bret Palsson wrote: Can I get the source for DLM 1.5.0 and build it on my other machines? If so, where do I grab it? Thanks, Bret

On Jan 14, 2009, at 4:28 PM, Sunil Mushran wrote: I hate cut-pastes because I have no idea whether I can trust them or not. A misspelled 0 or 1 makes a whole world of difference. But the following seems to indicate that the configuration is bad.

(3130,1):o2net_connect_expired:1659 ERROR: no connection established with node 0 after 30.0 seconds, giving up and returning errors.
(4670,1):dlm_request_join:1033 ERROR: status = -107
(4670,1):dlm_try_to_join_domain:1207 ERROR: status = -107
(4670,1):dlm_join_domain:1485 ERROR: status = -107
(4670,1):dlm_register_domain:1732 ERROR: status = -107
(4670,1):o2cb_cluster_connect:302 ERROR: status = -107
(4670,1):ocfs2_dlm_init:2753 ERROR: status = -107
(4670,1):ocfs2_mount_volume:1274 ERROR: status = -107
ocfs2: Unmounting device (253,2) on (node 0)

Why is the mount failing on node 0? I thought it was mounted on node 0? Maybe best if you file a bugzilla and attach the /var/log/messages of both nodes. Indicate the time you did the mount.
Sunil

Bret Palsson wrote:

Output of Node 0 {
OCFS2 Node Manager 1.4.1 Tue Dec 16 19:18:05 PST 2008 (build 0f78045c75c0174e50e4cf0934bf9eae)
OCFS2 DLM 1.4.1 Tue Dec 16 19:18:05 PST 2008 (build 4ce8fae327880c466761f40fb7619490)
OCFS2 DLMFS 1.4.1 Tue Dec 16 19:18:05 PST 2008 (build 4ce8fae327880c466761f40fb7619490)
OCFS2 User DLM kernel interface loaded
SELinux: initialized (dev ocfs2_dlmfs, type ocfs2_dlmfs), not configured for labeling
eth3: no IPv6 routers present
OCFS2 1.4.1 Tue Dec 16 19:18:02 PST 2008 (build 3fc82af4b5669945497b322b6aabd031)
ocfs2_dlm: Nodes in domain (8B2CCF82F1BA4A70B587580B23D9D7F7): 0
kjournald starting. Commit interval 5 seconds
ocfs2: Mounting device (253,3) on (node 0, slot 0) with ordered data mode.
SELinux: initialized (dev dm-3, type ocfs2), not configured for labeling
ocfs2_dlm: Nodes in domain (222B65A090D6477481AD30DE9FCE7961): 0
kjournald starting. Commit interval 5 seconds
ocfs2: Mounting device (253,2) on (node 0, slot 0) with ordered data mode.
SELinux: initialized (dev dm-2, type ocfs2), not configured for labeling
ocfs2_dlm: Nodes in domain (0425C0367AF547E989864A46F3DBD6E6): 0
kjournald starting. Commit interval 5 seconds
ocfs2: Mounting device (253,4) on (node 0, slot 0) with ordered data mode.
SELinux: initialized (dev dm-4, type ocfs2), not configured for labeling
}

Output of Node 1 {
OCFS2 Node Manager 1.5.0
OCFS2 DLM 1.5.0
ocfs2: Registered cluster interface o2cb
OCFS2 DLMFS 1.5.0
OCFS2 User DLM kernel interface loaded
device eth0 entered promiscuous mode
OCFS2 1.5.0
}

On Jan 14, 2009, at 3:58 PM, Sunil Mushran wrote: What about the dmesg on node 1? Now ideally we want the fs versions to be the same on all nodes. However, as we have not changed the protocol since 1.4.1, this should still work.

Bret Palsson wrote:

node 0 (and FS): OCFS2 1.4.1, kernel 2.6.18-92.1.22.el5xen
node 1: OCFS2 1.5, kernel 2.6.28-vs2.3.0.36.4

Output of Node 1 {
OCFS2 Node Manager 1.5.0
OCFS2 DLM 1.5.0
ocfs2: Registered cluster interface o2cb
OCFS2 DLMFS 1.5.0
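Coming back to the iptables question earlier in the thread: a minimal rule set might look like the sketch below. It assumes the default o2cb port 7777 (the port is whatever is configured per node in /etc/ocfs2/cluster.conf) and a dedicated interconnect on eth1; adjust both to your setup. Every node's configured port must be reachable from every other node.

```
# Allow OCFS2/o2cb cluster traffic on the interconnect interface.
iptables -A INPUT -i eth1 -p tcp --dport 7777 -j ACCEPT
# Per Brian's --srcport note, connections also originate from the
# configured port, so accept those too:
iptables -A INPUT -i eth1 -p tcp --sport 7777 -j ACCEPT
service iptables save
# Or, on a dedicated cluster switch, simply trust the interface:
# iptables -A INPUT -i eth1 -j ACCEPT
```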
[Ocfs2-users] flock errors in dmesg
I've been working on creating a mail cluster using ocfs2. Dovecot was configured to use flock since the kernel we're running is a Debian-based 2.6.26, which supports cluster-aware flock. User space is 1.4.1. During testing everything seemed fine, but when we got a real load on things we got a whole bunch of these messages in dmesg on the node that was hosting imap. Note that it's maildir and only one node is hosting imap, so we don't actually need flock. I think we're going to switch back to dotlocking, but I was hoping someone could interpret these error messages for me? Are they dangerous? Thanks, Brian

[257387.675734] (21573,0):ocfs2_file_lock:1587 ERROR: status = -22
[257387.675734] (21573,0):ocfs2_do_flock:79 ERROR: status = -22
[257392.121692] (21360,0):dlm_send_remote_lock_request:333 ERROR: status = -40
[257392.121938] (21360,0):dlmlock_remote:269 ERROR: dlm status = DLM_BADARGS
[257392.122023] (21360,0):dlmlock:747 ERROR: dlm status = DLM_BADARGS
[257392.122079] (21360,0):ocfs2_lock_create:998 ERROR: DLM error -22 while calling ocfs2_dlm_lock on resource F0008fca8f0e7e038e3
[257392.10] (21360,0):ocfs2_file_lock:1587 ERROR: status = -22
[257392.122326] (21360,0):ocfs2_do_flock:79 ERROR: status = -22
[257479.277941] (21950,0):dlm_send_remote_lock_request:333 ERROR: status = -40
[257479.277941] (21950,0):dlmlock_remote:269 ERROR: dlm status = DLM_BADARGS
[257479.277941] (21950,0):dlmlock:747 ERROR: dlm status = DLM_BADARGS
[257479.277941] (21950,0):ocfs2_lock_create:998 ERROR: DLM error -22 while calling ocfs2_dlm_lock on resource F00085d6ff2e7e0a106
[257479.277941] (21950,0):ocfs2_file_lock:1587 ERROR: status = -22
[257479.277941] (21950,0):ocfs2_do_flock:79 ERROR: status = -22
[257480.407024] (21947,0):dlm_send_remote_lock_request:333 ERROR: status = -40
[257480.407024] (21947,0):dlmlock_remote:269 ERROR: dlm status = DLM_BADARGS
[257480.407024] (21947,0):dlmlock:747 ERROR: dlm status = DLM_BADARGS
[257480.407024] (21947,0):ocfs2_lock_create:998 ERROR: DLM error -22 while calling ocfs2_dlm_lock on resource F000955ae83e7e0a13d
[257480.407024] (21947,0):ocfs2_file_lock:1587 ERROR: status = -22
[257480.407024] (21947,0):ocfs2_do_flock:79 ERROR: status = -22
[257483.221066] (21972,1):dlm_send_remote_lock_request:333 ERROR: status = -40
[257483.221066] (21972,1):dlmlock_remote:269 ERROR: dlm status = DLM_BADARGS
[257483.221066] (21972,1):dlmlock:747 ERROR: dlm status = DLM_BADARGS
[257483.221066] (21972,1):ocfs2_lock_create:998 ERROR: DLM error -22 while calling ocfs2_dlm_lock on resource F000955ae84e7e0a23c
[257483.221066] (21972,1):ocfs2_file_lock:1587 ERROR: status = -22
[257483.221066] (21972,1):ocfs2_do_flock:79 ERROR: status = -22
[257725.200695] (12536,0):dlm_send_remote_lock_request:333 ERROR: status = -40
[257725.200695] (12536,0):dlmlock_remote:269 ERROR: dlm status = DLM_BADARGS
[257725.200695] (12536,0):dlmlock:747 ERROR: dlm status = DLM_BADARGS
[257725.200695] (12536,0):ocfs2_lock_create:998 ERROR: DLM error -22 while calling ocfs2_dlm_lock on resource F000758938de7e0de1f
[257725.200695] (12536,0):ocfs2_file_lock:1587 ERROR: status = -22
[257725.200695] (12536,0):ocfs2_do_flock:79 ERROR: status = -22
[257959.288124] (18619,1):dlm_send_remote_lock_request:333 ERROR: status = -40
[257959.288124] (18619,1):dlmlock_remote:269 ERROR: dlm status = DLM_BADARGS
[257959.288124] (18619,1):dlmlock:747 ERROR: dlm status = DLM_BADARGS
[257959.288124] (18619,1):ocfs2_lock_create:998 ERROR: DLM error -22 while calling ocfs2_dlm_lock on resource F000585c3e9e7e0e40d
[257959.288124] (18619,1):ocfs2_file_lock:1587 ERROR: status = -22
[257959.288124] (18619,1):ocfs2_do_flock:79 ERROR: status = -22

___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
Re: [Ocfs2-users] how to format
Brett Worth br...@worth.id.au 2009-01-14 20:00: Christophe BOUDER wrote: but i can't format my new big device to use more than 16TB for it. You should consider increasing the block size to perhaps 16k. That should increase the size to 64TB.

I think he meant cluster size. http://oss.oracle.com/projects/ocfs2/dist/documentation/v1.2/ocfs2_faq.html Look at questions 64 and 32. I know the documentation is for the previous version, but I believe the principles still apply. Note that since the cluster size is the smallest allocatable unit for a single file, if your volume is meant for many small files you'll end up wasting a lot of space. Perhaps the data-in-inode feature helps with that, though. I've used an increased cluster size on media volumes before and had no troubles. Brian

___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
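To make the arithmetic behind those FAQ answers concrete: the volume-size ceiling comes from OCFS2's 32-bit cluster addressing, so it scales with the cluster size, not the block size. This sketch assumes the 2**32 cluster limit, which matches the 16TB-at-4KB figure in the thread:

```python
# Max OCFS2 volume size is bounded by the 32-bit cluster count:
# at most 2**32 addressable clusters, so the ceiling scales with
# cluster size rather than block size.
def max_volume_bytes(cluster_size_bytes):
    """Upper bound on volume size, assuming 2**32 addressable clusters."""
    return cluster_size_bytes * 2**32

for kb in (4, 16, 64, 1024):
    tb = max_volume_bytes(kb * 1024) // 2**40
    print(f"{kb:>5} KB clusters -> {tb} TB max volume")
```

So formatting with a larger cluster size (for example something like `mkfs.ocfs2 -b 4K -C 1M ...`) lifts the ceiling well past 16TB, at the cost of the per-file allocation granularity Brian mentions.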
[Ocfs2-users] mem usage
I've got a question about tuning mem usage which may or may not be ocfs2 related. I have some VMs that share an iSCSI device formatted with ocfs2. They're all running a Debian-based 2.6.26 kernel. We basically just dialed the kernel down to 100HZ rather than the default 1000HZ. Everything else is the same. All the machines have 2 CPUs and 2GB of RAM. Over time I would expect that the amount of free mem decreases towards 0 and the amount of (fs) cached mem increases. I think one can simulate this by doing the following:

echo 1 > /proc/sys/vm/drop_caches
ls -lR /ocfs2/ > /dev/null

When I do this on a physical machine with a large ext3 volume, the cached field steadily increases as I expected. However, on the ocfs2 volume what I actually see is that the free mem and cache mem remain fairly constant.

# free
             total       used       free     shared    buffers     cached
Mem:       2076376     989256    1087120          0     525356      33892
-/+ buffers/cache:     430008    1646368
Swap:      1052216          0    1052216

# top -n1 | head -n5
top - 11:11:19 up 2 days, 20:26, 2 users, load average: 1.45, 1.39, 1.26
Tasks: 140 total, 1 running, 139 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.8%us, 3.7%sy, 0.0%ni, 79.5%id, 15.2%wa, 0.1%hi, 0.6%si, 0.0%st
Mem: 2076376k total, 989772k used, 1086604k free, 525556k buffers
Swap: 1052216k total, 0k used, 1052216k free, 33920k cached

I've tried to trick the machines into allowing for more cached inodes by decreasing vfs_cache_pressure, but it doesn't seem to have had much effect. I also get the same results if only one machine has the ocfs2 fs mounted. I have also tried mounting it with the localalloc=16 option that I found in a previous mailing list post. The ocfs2 filesystem is 2TB and has about 600GB of maildirs on it (many small files). The ext3 machine is about 200GB and has a couple of workstation images on it (a mix of file sizes). I haven't yet been able to narrow down whether this is VM vs. physical, ocfs2 vs. ext3, iSCSI vs. local, or something else.
Has anyone else seen similar results or have some advice as to how to improve the situation? Thanks, Brian ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
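One way to chase this further is to sample /proc/meminfo (and /proc/slabinfo) before and after the ls -lR, rather than eyeballing free. A small hypothetical helper, shown here against the numbers from the free output above:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Key: value kB' lines into a dict of kB values."""
    out = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            out[key.strip()] = int(fields[0])
    return out

# On a live system: info = parse_meminfo(open("/proc/meminfo").read())
sample = "MemTotal: 2076376 kB\nMemFree: 1087120 kB\nBuffers: 525356 kB\nCached: 33892 kB"
info = parse_meminfo(sample)
print(info["Buffers"], info["Cached"])  # -> 525356 33892
```

Diffing Buffers/Cached (and the dentry and inode-cache rows of /proc/slabinfo) across the scan would show whether the entries are being cached and then reclaimed, or never cached at all.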
Re: [Ocfs2-users] mem usage
I had read in the past that lots of RAM would be helpful for caching inodes, locks, and whatnot, so it concerned me that the machines didn't appear to be using it. As for my goals, the machines will be hosting maildirs, so I think that caching directories would be of most use. Although I agree the ls -lR is perhaps not the best test, I saw similar results with other VMs that didn't have ocfs2, so I'm leaning towards an iSCSI/VM problem and not ocfs2. However, getting it to cache more directories would still be of interest to me. I'll have to report back later about the number of disk reads. Thanks, Brian

Herbert van den Bergh herbert.van.den.be...@oracle.com 2009-01-05 10:09: The ls -lR command will only access directory entries and inodes. These are cached in slabs (see /proc/slabinfo). Not sure what happens to the disk pages that they live in, but those disk pages can be discarded immediately after the slab caches have been populated. So it's probably not a good test of filesystem data caching. You may want to explain what you hope to achieve with this: more cache hits on directory and inode entries, or on file data? Are you seeing more disk reads in this configuration than in the one you're comparing with? Thanks, Herbert.

Brian Kroth wrote: [snip]

___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users
[Ocfs2-users] mailcluster advice?
I'm working on setting up a mail cluster (imap, pop, mx) using ocfs2. Does anyone have any advice or experiences they'd like to share? I've done web and video clusters before with great success; however, those were structured in such a way that there was generally only ever one write node active at a time and many read nodes. Mail won't have that feature. It's all maildir, so there's no file locking, but one does need to be concerned about ocfs2 lock contention. Currently our thought is to have one node each for mx, pop, and imap. Ignoring pop for the moment, that means there's potentially a source of contention on $mail_folder/new when sendmail is pushing mail and dovecot is trying to move mail to $mail_folder/cur. After that, imap connections should be on their own as far as other nodes are concerned. There may be multiple connections from different clients to a single folder, but that's still confined to the node serving imap, so in theory obtaining locks on the directory should be quick. I'm ignoring pop because the bulk of the users use imap and those that use pop almost exclusively use pop, so the analysis remains the same. Thoughts? Also, does ocfs2 support dnotify or inotify? Thanks, Brian ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/ocfs2-users