Re: [Ocfs2-users] [Ocfs2-devel] size increase

2015-03-17 Thread Sunil Mushran
This is because you are specifying a 128k cluster size. Refer to man mkfs.ocfs2 for more. On Mar 17, 2015 8:04 PM, Umarzuki Mochlis umarz...@gmail.com wrote: Hi, What I meant by total size is output of 'du -hs' I can see output of fdisk on mpath1 of ocfs2 LUN similar to logical volume of

Re: [Ocfs2-users] OCFS2 “Heartbeat generation mismatch on device” error when mounting iscsi target

2015-02-09 Thread Sunil Mushran
On February 9, 2015 at 8:09:06 PM, Sunil Mushran (sunil.mush...@gmail.com) wrote: On node 2, do: ps aux | grep o2hb I suspect you have multiple o2hb threads running. If so, restart the o2cb cluster on that node. On Mon, Feb 9, 2015 at 10:08 AM, Danijel Krmar

Re: [Ocfs2-users] How to unlock a bloked resource? Thanks

2014-09-10 Thread Sunil Mushran
What is the output of the commands? The protocol is supposed to do the unlocking on its own. See what it is blocked on. It could be that the node that has the lock cannot unlock it because it cannot flush the journal to disk. On Tue, Sep 9, 2014 at 7:55 PM, Guozhonghua guozhong...@h3c.com wrote:

Re: [Ocfs2-users] FSCK may be failing and corrupting my disk???

2014-03-24 Thread Sunil Mushran
inode? On 03/22/2014 09:40 PM, Sunil Mushran wrote: Cloning the inode means inode + data. Let it finish. On Sat, Mar 22, 2014 at 3:44 PM, Eric Raskin eras...@paslists.com wrote: Hi: I am running a two-node Oracle VM Server 2.2.2 installation. We were having some strange problems

Re: [Ocfs2-users] FSCK may be failing and corrupting my disk???

2014-03-22 Thread Sunil Mushran
Cloning the inode means inode + data. Let it finish. On Sat, Mar 22, 2014 at 3:44 PM, Eric Raskin eras...@paslists.com wrote: Hi: I am running a two-node Oracle VM Server 2.2.2 installation. We were having some strange problems creating new virtual machines, so I shut down the systems

Re: [Ocfs2-users] How to break out the unstop loop in the recovery thread? Thanks a lot.

2013-11-01 Thread Sunil Mushran
It is encountering SCSI errors reading the device. Fixing that will fix the issue. If you want to stop the logging, I don't believe there is a method right now. But it could be trivially added: allow the user to disable mlog(ML_ERROR) logging. On Thu, Oct 31, 2013 at 7:38 PM, Guozhonghua

Re: [Ocfs2-users] How do I check fragmentation amount?

2013-11-01 Thread Sunil Mushran
debugfs.ocfs2 -R "frag filespec" DEVICE will show you the fragmentation level on an inode basis. You could run that for all inodes and figure out the value for the entire volume. On Fri, Nov 1, 2013 at 3:00 PM, Andy ary...@allantgroup.com wrote: How can I check the amount of fragmentation on
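As a rough sketch of "run that for all inodes", the Python below builds one debugfs.ocfs2 invocation per file. It assumes the frag command and its filespec are passed to -R as a single string, as in the reply above. The device name and file paths are hypothetical, and the code only constructs the command lines; it does not run them or parse their output, since the output format is not shown in this thread.

```python
from typing import Iterable, List


def frag_commands(device: str, filespecs: Iterable[str]) -> List[List[str]]:
    """Build one `debugfs.ocfs2 -R "frag <filespec>" DEVICE` argv per file."""
    return [["debugfs.ocfs2", "-R", f"frag {spec}", device] for spec in filespecs]


# Hypothetical device and paths, for illustration only.
commands = frag_commands("/dev/sdX1", ["/dir/file1", "/dir/file2"])
for argv in commands:
    print(" ".join(argv))
```

Each argv list could be handed to subprocess.run on a node that has ocfs2-tools installed; combining the per-inode results into a volume-wide figure is left open here.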

Re: [Ocfs2-users] OCFS2 tuning, fragmentation and localalloc option. Cluster hanging during mix read+write workloads

2013-08-06 Thread Sunil Mushran
If the storage connectivity is not stable, then dlm issues are to be expected. In this case, the processes are all trying to take the readlock. One possible scenario is that the node holding the writelock is not able to relinquish the lock because it cannot flush the updated inodes to disk. I

Re: [Ocfs2-users] High inodes usage

2013-07-03 Thread Sunil Mushran
How did you figure this out? Also, which version of the kernel are you using? On Wed, Jul 3, 2013 at 1:05 AM, Nicolas Michel be.nicolas.mic...@gmail.com wrote: Hello guys, I'm using OCFS2 for a shared storage (on SAN). I just saw that the inode usage is really high although these filesystems

Re: [Ocfs2-users] High inodes usage

2013-07-03 Thread Sunil Mushran
it is not causing any problem but I found it weird). 2013/7/3 Sunil Mushran sunil.mush...@gmail.com That is old. It just could be a minor bug in that release. Is it causing you any problems? On Wed, Jul 3, 2013 at 12:31 PM, Nicolas Michel be.nicolas.mic...@gmail.com wrote: Hello Sunil, I

Re: [Ocfs2-users] Problems with volumes coming from RHEL5 going to OEL6

2013-06-21 Thread Sunil Mushran
Can you dump the following using the 1.8 binary. debugfs.ocfs2 -R stats /dev/mapper/. On Fri, Jun 21, 2013 at 6:17 AM, Ulf Zimmermann u...@openlane.com wrote: We have a production cluster of 6 nodes, which are currently running RHEL 5.8 with OCFS2 1.4.10. We snapclone these volumes to

Re: [Ocfs2-users] Unable to set the o2cb heartbeat to global

2013-06-04 Thread Sunil Mushran
Support for global heartbeat was added in ocfs2-tools-1.8. On Tue, Jun 4, 2013 at 8:31 AM, Vineeth Thampi vineeth.tha...@gmail.com wrote: Hi, I have added heartbeat mode as global, but when I do a mkfs and mount, and then check the mount, it says I am in local mode. Even

Re: [Ocfs2-users] What is the overhead/disk loss of formatting an ocfs2 filesystem?

2013-04-15 Thread Sunil Mushran
-N 16 means 16 journals. I think it defaults to 256M journals. So that's 4G. Do you plan to mount it on 16 nodes? If not, reduce that. Another option is a smaller journal. But you have to be careful as a small journal could limit your write thruput. On Mon, Apr 15, 2013 at 1:37 PM, Jerry Smith
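The arithmetic in the reply above (one journal per node slot) can be sketched as follows. The 256M default is taken from the reply, which itself hedges it with "I think", so treat it as an assumption; the function name is hypothetical.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def journal_overhead_bytes(node_slots: int, journal_mib: int = 256) -> int:
    """Space taken by journals, assuming one journal per node slot."""
    return node_slots * journal_mib * MIB


print(journal_overhead_bytes(16) / GIB)  # 16 slots x 256M journals
print(journal_overhead_bytes(4) / GIB)   # fewer slots, same journal size
```

With 16 slots this gives the 4G mentioned in the reply; dropping to 4 slots at the same journal size would reduce it to 1G.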

Re: [Ocfs2-users] Significant Slowdown when writing and deleting files at the same time

2013-03-29 Thread Sunil Mushran
Are you mounting -o writeback? On Fri, Mar 29, 2013 at 12:28 PM, Andy ary...@allantgroup.com wrote: I have been having performance issues from time to time on our production ocfs2 volumes, so I set up a test system to try to reproduce what I was seeing on the production systems. This is

Re: [Ocfs2-users] [OCFS2] Crash at o2net_shutdown_sc()

2013-03-01 Thread Sunil Mushran
[ 1481.620253] o2hb: Unable to stabilize heartbeart on region 1352E2692E704EEB8040E5B8FF560997 (vdb) What this means is that the device is suspect. o2hb writes are not hitting the disk. vdb is accepting and acknowledging the write but spitting out something else during the next read. Heartbeat

Re: [Ocfs2-users] OCFS ..Inode contains a hole at offset...

2013-02-20 Thread Sunil Mushran
This is probably a directory. debugfs.ocfs2 -R 'stat 52663' /dev/ will dump the inode. Are you sure fsck is fixing it? Does the output show this block getting fixed? If not, you may want to run fsck.ocfs2 v1.8. I think a fix code was added for it. On Wed, Feb 20, 2013 at 1:01 AM, Fiorenza

Re: [Ocfs2-users] ocfs cluster node keeps rebooting

2013-01-14 Thread Sunil Mushran
1.2.5 is a 6+ year old release. You may want to use something more current. On Mon, Jan 14, 2013 at 12:06 PM, Bill Zha lfl200...@yahoo.com wrote: Hi Sunil and All, We have a 10 Redhat4.2-node OCFS cluster running on version 1.2.5-6. One of the nodes started to reboot almost every day since

Re: [Ocfs2-users] asynchronous hwclocks

2013-01-03 Thread Sunil Mushran
The fs does not care about time. It should have no effect on the cluster. However the apps may care and may behave erratically. On Jan 3, 2013, at 3:13 PM, Medienpark, Jakob Rößler roess...@medienpark.net wrote: Hello list, today I noticed huge differences between the hardware clocks in

Re: [Ocfs2-users] Is this a valid configuration?

2012-12-05 Thread Sunil Mushran
This is normal. My only concern is the use of very old kernel/fs versions. On Wed, Dec 5, 2012 at 3:08 AM, Neil campbell.n...@hotmail.com wrote: Anyone? On 2012-11-28 00:47:56 + neil campbell campbell.n...@hotmail.com wrote: Hi list, I am running

Re: [Ocfs2-users] ls taking ages on a directory containing 900000 files

2012-12-04 Thread Sunil Mushran
strace -p PID -ttt -T Attach and get some timings. The simplest guess is that the system lacks memory to cache all the inodes and thus has to hit disk (and more importantly take cluster locks) for the same inode repeatedly. The user guide has a section in NOTES explaining this. On Tue, Dec 4,
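To make sense of the timings that strace -ttt -T produces, a small parser along these lines may help. It assumes the usual layout: an epoch timestamp at the start of each line (from -ttt) and the time spent in the call in angle brackets at the end (from -T). The sample lines are made up for illustration and are not taken from this thread.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable

# -ttt puts an epoch timestamp first; -T appends the call duration, e.g. <0.000150>.
LINE_RE = re.compile(r"^(\d+\.\d+)\s+(\w+)\(.*<(\d+\.\d+)>\s*$")


def time_per_syscall(lines: Iterable[str]) -> Dict[str, float]:
    """Sum the -T durations per syscall name."""
    totals: Dict[str, float] = defaultdict(float)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # lines without a duration (signals, unfinished calls) are skipped
            totals[match.group(2)] += float(match.group(3))
    return dict(totals)


# Made-up sample lines in strace's format.
sample = [
    '1354630000.100000 lstat("/dir/f1", {st_mode=S_IFREG|0644, ...}) = 0 <0.250000>',
    '1354630000.400000 lstat("/dir/f2", {st_mode=S_IFREG|0644, ...}) = 0 <0.500000>',
    '1354630000.950000 write(1, "f1\\n", 3) = 3 <0.000100>',
]
totals = time_per_syscall(sample)
for name, seconds in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {seconds:.6f}s")
```

If per-file stat calls dominate the totals, that would be consistent with the guess above about inodes not staying cached.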

Re: [Ocfs2-users] ls taking ages on a directory containing 900000 files

2012-12-04 Thread Sunil Mushran
amaury.franc...@digora.com Siège Social – 66 rue du Marché Gare – 67200 STRASBOURG Tel: 0 820 200 217 - +33 (0)3 88 10 49 20 From: Sunil Mushran [mailto:sunil.mush...@gmail.com] Sent: Tuesday 4

Re: [Ocfs2-users] Huge Problem ocfs2

2012-11-09 Thread Sunil Mushran
IO error on channel means the system cannot talk to the block device. The problem is in the block layer. Maybe a loose cable or a setup problem. dmesg should show errors. On Fri, Nov 9, 2012 at 10:46 AM, Laurentiu Gosu l...@easic.ro wrote: Hi, I'm using ocfs2 cluster in a production

Re: [Ocfs2-users] Huge Problem ocfs2

2012-11-09 Thread Sunil Mushran
(2, \r\n, 2 On 10.11.2012 02:06, Sunil Mushran wrote: It's either that or a checksum problem. Disable metaecc. Not sure which kernel you are running. We had fixed a few problems a few years ago around this. If your kernel is older, then it could be a known issue. On Fri, Nov 9, 2012

Re: [Ocfs2-users] Huge Problem ocfs2

2012-11-09 Thread Sunil Mushran
at ocfs2_validate_meta_ecc in order to bypass the ECC checks? On 10.11.2012 03:55, Sunil Mushran wrote: If the global bitmap is gone, then the fs is unusable. But you can extract data using the rdump command in debugfs.ocfs2. The success depends on how much of the device is still usable. On Fri, Nov 9

Re: [Ocfs2-users] HA-OCFS2?

2012-09-13 Thread Sunil Mushran
cfs != storage You need to get a highly available storage that is concurrently accessible from multiple nodes. ocfs2 will allow multiple nodes to concurrently access the same storage. With posix semantics. If a node dies, the remaining nodes will pause to recover and then continue functioning.

Re: [Ocfs2-users] Ocfs2-users Digest, Vol 105, Issue 4

2012-09-12 Thread Sunil Mushran
On Wed, Sep 12, 2012 at 9:45 AM, Asanka Gunasekera asanka_gunasek...@yahoo.co.uk wrote: Load O2CB driver on boot (y/n) [y]: Cluster stack backing O2CB [o2cb]: Cluster to start on boot (Enter none to clear) [ocfs2]: Specify heartbeat dead threshold (>=7) [31]: Specify network idle timeout in
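For readers wondering what the heartbeat dead threshold means in seconds, here is a small sketch. It relies on the relation given in the OCFS2 documentation as I recall it, threshold = (timeout in seconds / 2) + 1, with o2hb heartbeating every 2 seconds; check the documentation for your release before relying on it. The function names are hypothetical.

```python
HEARTBEAT_INTERVAL_SECS = 2  # assumed o2hb heartbeat interval


def threshold_to_timeout_secs(threshold: int) -> int:
    """Convert a heartbeat dead threshold to a timeout in seconds."""
    if threshold < 7:
        raise ValueError("the configure prompt sets a minimum of 7")
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECS


def timeout_secs_to_threshold(timeout_secs: int) -> int:
    """Convert a desired timeout in seconds to a threshold value."""
    return timeout_secs // HEARTBEAT_INTERVAL_SECS + 1


print(threshold_to_timeout_secs(31))    # the default shown in the prompt
print(timeout_secs_to_threshold(120))   # threshold for a 120 second timeout
```

Under that relation the default of 31 corresponds to a 60 second timeout.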

Re: [Ocfs2-users] test inode bit failed -5

2012-08-31 Thread Sunil Mushran
nfsd encountered an error reading the device. So something in the io path below the fs encountered a problem. If it just happened once, then you can ignore it. On Fri, Aug 31, 2012 at 2:23 AM, Hideyasu Kojima hid.koj...@ms.scsk.jp wrote: Hi, I am using an ocfs2 cluster as NFS server. Only once, I got

Re: [Ocfs2-users] Issue with files and folder ownership

2012-08-29 Thread Sunil Mushran
AM, Sunil Mushran sunil.mush...@gmail.com wrote: Isn't the mount point local to the machine? I use iSCSI for the block device and I mount the device (/dev/sdc1) at /var/lib/nova/instances. I've formatted /dev/sdc1 in OCFS2 FS. Should I use Pacemaker to manage OCFS2? Thanks, -Emilien

Re: [Ocfs2-users] Issue with OCFS2 mount

2012-08-29 Thread Sunil Mushran
Forgot to add that this issue is limited to metaecc. So you could avoid the issue in your same setup by not enabling metaecc on the volume. And last I checked mkfs did not enable it by default. On Mon, Aug 27, 2012 at 10:35 AM, Sunil Mushran sunil.mush...@gmail.com wrote: So you are running

Re: [Ocfs2-users] Issue with OCFS2 mount

2012-08-24 Thread Sunil Mushran
What is the version of the kernel, ocfs2 and ocfs2 tools? uname -a modinfo ocfs2 mkfs.ocfs2 --version On Fri, Aug 24, 2012 at 1:09 PM, Rory Kilkenny rory.kilke...@ticoon.com wrote: We have an HP P2000 G3 Storage array, fiber connected. The storage array has a RAID5 array broken into 2

Re: [Ocfs2-users] OCFS2 and util_file

2012-08-23 Thread Sunil Mushran
You are probably mounting the volume with the datavolume option. Instead use the init.ora param, filesystemio_options, to force O_DIRECT and mount the volume without the datavolume option. This is documented in the user's guide. On Thu, Aug 23, 2012 at 8:14 AM, Maki, Nancy nancy.m...@suny.edu

Re: [Ocfs2-users] OCFS2 and util_file

2012-08-23 Thread Sunil Mushran
On Thu, Aug 23, 2012 at 10:58 AM, Maki, Nancy nancy.m...@suny.edu wrote: By default we mount all our OCFS2 volumes with datavolume. To be more specific, the volume that we are having the issue with is not a database volume but a shared drive for developers to read and write other types of

Re: [Ocfs2-users] null pointer dereference

2012-08-21 Thread Sunil Mushran
You may want to run a full fsck on the fs. fsck.ocfs2 -fy /dev/ On Tue, Aug 21, 2012 at 12:49 AM, Pawel pzl...@mp.pl wrote: Hi, After upgrading ocfs2 my cluster is unstable. At least once per week I can see: kernel panic: Null pointer dereference at 00048 o2dlm_blocking_ast_wrapper +

Re: [Ocfs2-users] ocfs2 problem journal size

2012-08-02 Thread Sunil Mushran
The 4 journal inodes got zeroed out. Do you know how/why? Have you tried running fsck with -fy (enable writes). fsck.ocfs2 does have a check for bad journals that it will regenerate. JOURNAL_FILE_INVALID OCFS2 uses JDB for journalling and some journal files exist in the system directory. Fsck

Re: [Ocfs2-users] ocfs2 problem journal size

2012-08-02 Thread Sunil Mushran
oh crap. The dlm lock needs to lock the journals. So you need to recreate the journal inodes with i_size 0. dd a good journal inode and edit it using binary editor. Change the inode num to the block number, zero out the i_size and next_free_extent. Repeat for the 4 inodes. Hopefully some one on

Re: [Ocfs2-users] ocfs2-tools git: broken after commit deb5ade9145f8809f1fde19cf53bdfdf1fb7963e

2012-07-26 Thread Sunil Mushran
be: else -tmp = g_list_append(elem, cfs); +g_list_append(elem, cfs); Attached patch. Thanks. Acked-by: Sunil Mushran sunil.mush...@gmail.com

Re: [Ocfs2-users] Removing a node from cluster.conf (on a specific node)

2012-04-29 Thread Sunil Mushran
Online add/remove of nodes and of global heartbeat devices has been in mainline for over a year. I think 2.6.38+ and tools 1.8. The ocfs2-tools tree hosted on oss.oracle.com/git has a 1.8.2 tag that can be used safely. It has been fully tested. The user's guide has been moved to man pages

Re: [Ocfs2-users] Permission denied on ocfs2 cluster

2012-03-16 Thread Sunil Mushran
[mailto:ocfs2-users-boun...@oss.oracle.com] On Behalf Of зоррыч Sent: Thursday, March 15, 2012 11:26 PM To: 'Sunil Mushran' Cc: ocfs2-users@oss.oracle.com Subject: Re: [Ocfs2-users] Permission denied on ocfs2 cluster [root@noc-1-synt /]# ls -lh | grep ocfs drwxr-xr-x. 3 root root 3.9K Mar 15 02

Re: [Ocfs2-users] Permission denied on ocfs2 cluster

2012-03-15 Thread Sunil Mushran
strace may show more. I would first confirm that my perms are correct. On 03/15/2012 07:58 AM, ?? wrote: I am testing the scheme of drbd and ocfs2 If you attempt to write to the cluster error: [root@noc-1-m77 share]# mkdir 12 mkdir: cannot create directory `12': Permission denied

Re: [Ocfs2-users] ocfs2-1.4.7 is not binding in scientific linux 6.2

2012-03-12 Thread Sunil Mushran
ocfs2 1.4 will not build with 2.6.32. A better solution is to just enable ocfs2 in the 2.6.32 kernel src tree and build. On 03/11/2012 07:37 AM, зоррыч wrote: Hi. I use scientific linux 6.2: [root@noc-1-m77 ocfs2-1.4.7]# cat /etc/redhat-release Scientific Linux release 6.2 (Carbon)

Re: [Ocfs2-users] ocfs2console hangs on startup

2012-03-10 Thread Sunil Mushran
ocfs2console has been obsoleted. Just use the utilities directly. To detect ocfs2 volumes, use blkid. You can use it to restrict the lookup paths. Refer to its manpage. On 03/09/2012 06:15 PM, John Major wrote: Hi, Hope this is the right place to ask this. I have set up 2 ubuntu lts machines

Re: [Ocfs2-users] OCFS2 1.2/1.6

2012-03-02 Thread Sunil Mushran
The file system on-disk image has not changed. So the 1.6 file system software can mount the volume created with 1.2 mkfs. What you cannot do is concurrently mount the same volume with nodes running 1.2 and 1.6 versions of the file system software. It is not mixed mode. The 1.6 fs software will

Re: [Ocfs2-users] Ocfs2-users Digest, Vol 98, Issue 9

2012-03-02 Thread Sunil Mushran
On 02/29/2012 04:10 PM, David Johle wrote: I too have seen some serious performance issues under 1.4, especially with writes. I'll share some info I've gathered on this topic, take it however you wish... In the past I never really thought about running benchmarks against the shared block

Re: [Ocfs2-users] Concurrent write performance issues with OCFS2

2012-02-28 Thread Sunil Mushran
In 1.4, the local allocator window is small. 8MB. Meaning the node has to hit the global bitmap after every 8MB. In later releases, the window is much larger. Second, a single node is not a good baseline. A better baseline is multiple nodes writing concurrently to the block device. Not fs. Use
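To get a feel for what an 8MB local allocator window means, the sketch below counts how often a node would have to return to the global bitmap while writing a given amount of data. This is a deliberately simplified model (one trip per exhausted window, nothing else considered); the 256MB comparison value is hypothetical, since the reply only says later releases use a "much larger" window.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def global_bitmap_trips(bytes_written: int, window_bytes: int = 8 * MIB) -> int:
    """Number of local-alloc windows needed, rounded up."""
    return (bytes_written + window_bytes - 1) // window_bytes


print(global_bitmap_trips(40 * GIB))             # 8MB window, as in 1.4
print(global_bitmap_trips(40 * GIB, 256 * MIB))  # hypothetical larger window
```

In this model a 40G write needs 5120 trips with an 8MB window, against 160 with the hypothetical 256MB window.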

Re: [Ocfs2-users] A Billion Files on OCFS2 -- Best Practices?

2012-02-01 Thread Sunil Mushran
On 02/01/2012 07:02 AM, Mark wrote: One more thing. When I straced one of the application processes (these are the processes that create the files) I saw this: % time seconds usecs/call calls errors syscall --- -- -- -- --- 68.94 3.002017

Re: [Ocfs2-users] Extend space on ocfs mount point

2012-02-01 Thread Sunil Mushran
I am not aware of any downsides in resizing. On 02/01/2012 09:57 AM, Kalra, Pratima wrote: We have a ucm installation on ocfs mount point and we need to increase the space on that mount point from 20gb to 30 gb. Is this possible without resulting in any after effects? Pratima.

Re: [Ocfs2-users] A Billion Files on OCFS2 -- Best Practices?

2012-02-01 Thread Sunil Mushran
debugfs.ocfs2 -R stats /dev/mapper/... I want to see the features enabled. The main issue with large metadata is the fsck timing. The recently tagged 1.8 release of the tools has much better fsck performance. On 02/01/2012 05:25 AM, Mark Hampton wrote: We have an application that has many

Re: [Ocfs2-users] A Billion Files on OCFS2 -- Best Practices?

2012-02-01 Thread Sunil Mushran
On 02/01/2012 10:24 AM, Mark Hampton wrote: Here's what I got from debugfs.ocfs2 -R stats. I have to type it out manually, so I'm only including the features lines: Feature Compat: 3 backup-super strict-journal-super Feature Incompat: 16208 sparse extended-slotmap inline-data metaecc

Re: [Ocfs2-users] Bad magic number in inode

2012-02-01 Thread Sunil Mushran
inode#11 is in the system directory. fsck cannot fix this automatically. If the corruption is limited, there is a chance the inodes could be recreated manually. But do look at backups to restore. On 02/01/2012 10:20 AM, Werner Flamme wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi,

Re: [Ocfs2-users] Help ! OCFS2 unstable on Disparate Hardware

2012-01-27 Thread Sunil Mushran
Symmetric clustering works best when the nodes are comparable because all nodes have to work in sync. NFS may be more suitable for your needs. On 01/26/2012 05:51 PM, Jorge Adrian Salaices wrote: I have been working on trying to convince Mgmt at work that we want to go to OCFS2 away from NFS

Re: [Ocfs2-users] One node, two clusters?

2011-12-22 Thread Sunil Mushran
You don't need to have two clusters for this. This can be accomplished with one cluster with the default local heartbeat. Create one cluster.conf with all the nodes. All nodes, except the one machine, will mount from just one san. The common node will mount from both sans. If you look at the
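A single cluster.conf listing every node, as suggested above, might be generated like this. The stanza layout (a cluster: stanza plus one node: stanza per node, with tab-indented parameters) is written from memory of the o2cb documentation, so verify it against the man pages of your ocfs2-tools version. The node names, addresses and port are hypothetical.

```python
from typing import List, Tuple


def render_cluster_conf(cluster: str, nodes: List[Tuple[str, str]], port: int = 7777) -> str:
    """Render one cluster stanza followed by one node stanza per (name, ip)."""
    stanzas = [f"cluster:\n\tnode_count = {len(nodes)}\n\tname = {cluster}\n"]
    for number, (name, ip) in enumerate(nodes):
        stanzas.append(
            "node:\n"
            f"\tip_port = {port}\n"
            f"\tip_address = {ip}\n"
            f"\tnumber = {number}\n"
            f"\tname = {name}\n"
            f"\tcluster = {cluster}\n"
        )
    return "\n".join(stanzas)


# Hypothetical nodes: two that each mount from one SAN, one that mounts from both.
nodes = [("node-a", "10.0.0.1"), ("node-b", "10.0.0.2"), ("common", "10.0.0.3")]
conf = render_cluster_conf("ocfs2", nodes)
print(conf)
```

The same file would be copied to every node; which volumes each node mounts is decided separately, outside cluster.conf.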

Re: [Ocfs2-users] One node, two clusters?

2011-12-22 Thread Sunil Mushran
On 12/22/2011 10:39 AM, Kushnir, Michael (NIH/NLM/LHC) [C] wrote: Is there a separate DLM instance for each ocfs2 volume? I have two sub-clusters in the same cluster... A 10 node Hadoop cluster sharing a SATA RAID10 and a Two node web server cluster sharing a SSD RAID0. One server mounts

Re: [Ocfs2-users] reflink status

2011-12-17 Thread Sunil Mushran
First we have to get the new syscall added to the kernel. The first attempt failed because people overloaded the call with extraneous stuff. Recently there is another attempt to go back to the original proposal. Hopefully, next kernel release. The reflink utility should work. So what if it is based

Re: [Ocfs2-users] reflink status

2011-12-17 Thread Sunil Mushran
On 12/17/2011 12:05 PM, richard -rw- weinberger wrote: The reflink utility should work. So what if it is based on an older coreutils. It is derived from the hard link (ln) utility. So, building it from http://oss.oracle.com/git/?p=jlbec/reflink.git;a=shortlog via reflink.spec is the way to go?

Re: [Ocfs2-users] OCFS2 cluster won't come up and stay up

2011-12-01 Thread Sunil Mushran
of reset it so we can get these servers back online and talking again in the meanwhile? Tony On Dec 1, 2011, at 5:05 PM, Sunil Mushran wrote: Node 3 is joining the domain. It is having problems getting the superblock cluster lock. Create a bugzilla on oss.oracle.com and attach the /var/logs

Re: [Ocfs2-users] Monitoring progress of fsck.ocfs2

2011-11-18 Thread Sunil Mushran
Do: cat /proc/PID/stack It is probably stuck in the block layer. On 11/18/2011 08:33 AM, Nick Khamis wrote: Hello Everyone, I just ran fsck.ocfs2 on /dev/drbd0 which is a one gig partition on a vm with limited resource (100meg of ram). I am worried that the process crashed because it has

Re: [Ocfs2-users] Number of Nodes defined

2011-11-17 Thread Sunil Mushran
the node slots to 2 and everything worked until this morning when the same error returned No space left on device. The OS is still showing available disk space but as the error suggests i can't write to the partition. Any idea what could be happening? On 11/16/2011 05:45 PM, Sunil Mushran

Re: [Ocfs2-users] [Ocfs2-devel] vmstore option - mkfs

2011-11-16 Thread Sunil Mushran
fstype is a handy way to format the volume with parameters that are thought to be useful for that use-case. The result of this is printed during format by way of the parameters selected. man mkfs.ocfs2 has a blurb about the features it enables by default. On 11/16/2011 08:45 AM, Artur Baruchi

Re: [Ocfs2-users] [Ocfs2-devel] vmstore option - mkfs

2011-11-16 Thread Sunil Mushran
Yes. But this is just the features. It also selects the appropriate cluster size, block size, journal size, etc. All the params selected are printed by mkfs. You also have the option of running with the --dry-run option to see the params. On 11/16/2011 09:41 AM, Artur Baruchi wrote: I just found

Re: [Ocfs2-users] Number of Nodes defined

2011-11-16 Thread Sunil Mushran
man tunefs.ocfs2 It cannot be done in an active cluster. But it can be done without having to reformat the volume. On 11/16/2011 10:08 AM, David wrote: I wasn't able to find any documentation that answers whether or not the number of nodes defined for a cluster, can be reduced on an active

Re: [Ocfs2-users] Number of Nodes defined

2011-11-16 Thread Sunil Mushran
what the impact to the fs would be when making a change to an existing fs such as reducing the node slots. Anyway, thank you for the feedback, I was able to make the changes with no impact to the fs. David On 11/16/2011 12:12 PM, Sunil Mushran wrote: man tunefs.ocfs2 It cannot

Re: [Ocfs2-users] dlm locking

2011-11-14 Thread Sunil Mushran
o2image is only useful for debugging. It allows us to get a copy of the file system on which we can test fsck inhouse. The files in lost+found have to be resolved manually. If they are junk, delete them. If useful, move it to another directory. On 11/11/2011 05:36 PM, Nick Khamis wrote: All

Re: [Ocfs2-users] OCFS2 and db_block_size

2011-11-14 Thread Sunil Mushran
We talk about this in the user's guide. 1. Always use 4K blocksize. 2. Never set the cluster size less than the database block size. Having a smaller cluster size could mean that a db block may not be contiguous. And you don't want that for performance and other reasons. Having a still larger
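The two rules above can be expressed as a small check. This is only a sketch of the stated guidance (4K block size, cluster size not smaller than the database block size); the function name and the example sizes are hypothetical.

```python
from typing import List


def check_ocfs2_db_layout(block_size: int, cluster_size: int, db_block_size: int) -> List[str]:
    """Return the rules from the reply that a proposed layout violates."""
    problems = []
    if block_size != 4096:
        problems.append("use a 4K file system block size")
    if cluster_size < db_block_size:
        problems.append("cluster size is smaller than the database block size")
    return problems


# 4K clusters with an 8K database block: a db block may not be contiguous.
print(check_ocfs2_db_layout(4096, 4096, 8192))
# 128K clusters with an 8K database block: both rules satisfied.
print(check_ocfs2_db_layout(4096, 131072, 8192))
```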

Re: [Ocfs2-users] dlm locking

2011-11-10 Thread Sunil Mushran
Do: fsck.ocfs2 -f /dev/... Without -f, it only replays the journal. On 11/09/2011 05:49 PM, Nick Khamis wrote: Hello Sunil, This is only on the prototype so it's not crucial however, it would be nice to figure out why for future reference: fsck.ocfs2 /dev/drbd0 fsck.ocfs2 1.6.4 Checking

Re: [Ocfs2-users] dlm locking

2011-11-10 Thread Sunil Mushran
The ro issue was different. It appears the volume has more problems. If you want to me to look at the issue, I'll need the image of the volume. # o2image /dev/device /tmp/o2image.out On 11/10/2011 01:55 PM, Nick Khamis wrote: Hello Sunil, Thank you so much for your time, and I do not want to

Re: [Ocfs2-users] mixing ocfs2 versions in a cluster

2011-11-09 Thread Sunil Mushran
I would recommend upgrading all the nodes to 1.2.9 as it contains fixes to known bugs in the versions you are running. Mixing versions is never recommended mainly because it is hard to test all possible combinations. It is alright to do so on an interim basis. But never recommended as a stable

Re: [Ocfs2-users] dlm locking

2011-11-09 Thread Sunil Mushran
This has nothing to do with the dlm. The error states that the fs encountered a bad inode on disk. Possible disk corruption. On encountering this, the fs goes read-only and asks the user to run fsck. On 11/09/2011 11:51 AM, Nick Khamis wrote: Hello Everyone, For the first time I experienced a dlm

Re: [Ocfs2-users] Error building ocfs2-tools

2011-10-28 Thread Sunil Mushran
On 10/27/2011 07:10 PM, Tim Serong wrote: Damn. It was in Pacemaker's include/crm/ais.h, back before June 27 last year(!), when it was moved to Pacemaker's configure.ac: https://github.com/ClusterLabs/pacemaker/commit/8e939b0ad779c65d445e2fa150df1cc046428a93#include/crm/ais.h This means it

Re: [Ocfs2-users] Error building ocfs2-tools

2011-10-27 Thread Sunil Mushran
ocfs2-tools-1.4.4 is too old. Build 1.6.4. The source tarball is on oss.oracle.com. On 10/27/2011 12:45 PM, Nick Khamis wrote: Hello Everyone, I am building ocfs2-tools from source. Modified /ocfs2_controld/Makefile to point to the correct pacemaker 1.1.6 headers: PCMK_INCLUDES =

Re: [Ocfs2-users] Error building ocfs2-tools

2011-10-27 Thread Sunil Mushran
I don't remember that resource. If it did exist, it would have existed in pacemaker. ocfs2-tools does not carry any pacemaker bits. It carries bits that allow it to work with pacemaker and cman. On 10/27/2011 02:27 PM, Nick Khamis wrote: Hello Sunil, Thank you so much for your response. I just

Re: [Ocfs2-users] Error building ocfs2-tools

2011-10-27 Thread Sunil Mushran
On 10/27/2011 05:26 PM, Tim Serong wrote: That ought to work... But where did PCMK_SERVICE_ID come from in that context? AFAICT it's always been CRM_SERVICE there. See current head: http://oss.oracle.com/git/?p=ocfs2-tools.git;a=blob;f=ocfs2_controld/pacemaker.c;hb=HEAD#l158 CRM_SERVICE

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-23 Thread Sunil Mushran
be started and stopped once the volume gets mounted/umounted. br, Laurentiu. On 10/19/2011 02:28, Sunil Mushran wrote: Manual delete will only work if there are no references. In your case there are references. You may want to start both nodes from scratch. Do not start/stop heartbeat manually. Also

Re: [Ocfs2-users] OCFS2 slow with multiple writes

2011-10-21 Thread Sunil Mushran
option as data needs to be flushed to the FS before journal commit, but why is that blocking a new separate file from being written to the file system? Regards, Prakash On Oct 20, 2011, at 6:25 PM, Sunil Mushran wrote: Use writeback. Ordered data requires the data to be flushed before

Re: [Ocfs2-users] OCFS2 slow with multiple writes

2011-10-20 Thread Sunil Mushran
Use writeback. Ordered data requires the data to be flushed before journal commit. And flushing 40G takes time. mount -o data=writeback DEVICE PATH On 10/20/2011 03:05 PM, Prakash Velayutham wrote: Hi, OS - SLES 11.1 with HAE OCFS2 - 1.4.3-0.16.7 Cluster stack - Pacemaker I have Heartbeat

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
ls -lR /sys/kernel/config/cluster What does this return? On 10/18/2011 02:05 PM, Laurentiu Gosu wrote: Hi, I have a 2 nodes ocfs2 cluster running UEK 2.6.32-100.0.19.el5, ocfs2console-1.6.3-2.el5, ocfs2-tools-1.6.3-2.el5. My problem is that all the time when i try to run /etc/init.d/o2cb

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
19 00:12 ipv4_address -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port -rw-r--r-- 1 root root 4096 Oct 19 00:12 local -rw-r--r-- 1 root root 4096 Oct 19 00:12 num On 10/19/2011 00:12, Sunil Mushran wrote: ls -lR /sys/kernel/config/cluster What does this return? On 10/18/2011 02:05 PM

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
somehow..? Laurentiu. On 10/19/2011 00:17, Sunil Mushran wrote: What does this return? cat /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev Also, do: ls -lR /sys/kernel/debug/ocfs2 ls -lR /sys/kernel/debug/o2dlm On 10/18/2011 02:14 PM, Laurentiu Gosu

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
: total 0 ls -lR /sys/kernel/debug/o2dlm /sys/kernel/debug/o2dlm: total 0 ocfs2_hb_ctl -I -d /dev/dm-2 ocfs2_hb_ctl: Device name specified was not found while reading uuid There is no /dev/dm-2 mounted. On 10/19/2011 00:27, Sunil Mushran wrote: mount -t debugfs debugfs /sys/kernel/debug

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
Device FS Nodes /dev/mapper/volgr1-lvol0 ocfs2 ro02xsrv001 ro02xsrv001 = the other node in the cluster. By the way, there is no /dev/md-2 ls /dev/dm-* /dev/dm-0 /dev/dm-1 On 10/19/2011 00:37, Sunil Mushran wrote: So it is not mounted. But we still have a hb thread

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
See if this cleans it up. ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D On 10/18/2011 02:44 PM, Laurentiu Gosu wrote: ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D 0C4AB55FE9314FA5A9F81652FDB9B22D: 0 refs On 10/19/2011 00:43, Sunil Mushran wrote: ocfs2_hb_ctl -l -u

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
:( On 10/19/2011 00:50, Sunil Mushran wrote: See if this cleans it up. ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D On 10/18/2011 02:44 PM, Laurentiu Gosu wrote: ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D 0C4AB55FE9314FA5A9F81652FDB9B22D: 0 refs On 10/19/2011 00:43, Sunil Mushran

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
from?? ocfs2_hb_ctl -I -u 918673F06F8F4ED188DDCE14F39945F6 918673F06F8F4ED188DDCE14F39945F6: 1 refs On 10/19/2011 01:04, Sunil Mushran wrote: Let's do it by hand. rm -rf /sys/kernel/config/cluster/.../heartbeat/0C4AB55FE9314FA5A9F81652FDB9B22D On 10/18/2011 02:52 PM, Laurentiu Gosu wrote

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
/2011 03:24 PM, Laurentiu Gosu wrote: Yes, i did reformat it(even more than once i think, last week). This is a pre-production system and i'm trying various options before moving into real life. On 10/19/2011 01:19, Sunil Mushran wrote: Did you reformat the volume recently? or, when did you format

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
to sleep now, i have to be up in a few hours. We can continue tomorrow if it's ok with you. Thank you for your help. Laurentiu. On 10/19/2011 01:33, Sunil Mushran wrote: One way this can happen is if one starts the hb manually and then force formats on that volume. The format will generate

Re: [Ocfs2-users] Partition table crash, where can I find debug message?

2011-10-12 Thread Sunil Mushran
Not sure what you mean by a partition table crash. Is it that someone overwrote the partition table on the iscsi server? That's what it looks like. If mount cannot detect the fs type, then it means at least superblock corruption. And such corruptions are typically caused by external entities. Stray dd

Re: [Ocfs2-users] Partition table crash, where can I find debug message?

2011-10-12 Thread Sunil Mushran
it. Given it was under heavy usage because of the many VMs running on it, I guess this may be the cause. Now I am trying to recover it *From:* Sunil Mushran [mailto:sunil.mush...@oracle.com] *Sent:* Wednesday, October 12, 2011 10:08 AM *To:* Frank Zhang *Cc

Re: [Ocfs2-users] Partition table crash, where can I find debug message?

2011-10-12 Thread Sunil Mushran
extent of the corruption... (not crash) On 10/12/2011 10:51 AM, Sunil Mushran wrote: Hard to say. You'll need to investigate the extent of the crash. On 10/12/2011 10:49 AM, Frank Zhang wrote: Sorry, it's not power outage, it's just a normal reboot. Is this serious to corrupt the super

Re: [Ocfs2-users] one node kernel panic

2011-10-07 Thread Sunil Mushran
arise? (2011/10/05 1:45), Sunil Mushran wrote: int sigprocmask(int how, sigset_t *set, sigset_t *oldset) { int error; spin_lock_irq(current->sighand->siglock); /* CRASH */ if (oldset) *oldset = current->blocked; ... } current->sighand is NULL. So definitely a race. Generic kernel issue. Ping

Re: [Ocfs2-users] Kernel Panic / Fencing

2011-10-06 Thread Sunil Mushran
I am unclear. What happens when a server is rebooted (or crashes). Crash the network? Can you expand on this? On 10/06/2011 05:52 PM, Tony Rios wrote: Hey all, I'm running a current version of Ubuntu and we are using OCFS2 across a cluster of 9 web servers. Everything works perfectly, so

Re: [Ocfs2-users] Fwd: OCFS drives not syncing

2011-10-05 Thread Sunil Mushran
On 10/05/2011 08:46 AM, Bradlee Landis wrote: Sorry Sunil, my email replied to you instead of the list. On Wed, Oct 5, 2011 at 10:09 AM, Sunil Mushran sunil.mush...@oracle.com wrote: ocfs2 is a shared disk cluster file system. It requires a shared disk. However, if you are only going to

Re: [Ocfs2-users] one node kernel panic

2011-10-04 Thread Sunil Mushran
int sigprocmask(int how, sigset_t *set, sigset_t *oldset) { int error; spin_lock_irq(current->sighand->siglock); /* CRASH */ if (oldset) *oldset = current->blocked; ... } current->sighand is NULL. So definitely a race. Generic kernel issue. Ping your kernel

Re: [Ocfs2-users] dlm_lockres_release:507 ERROR: Resource W0000000000000001b027d69b591f15 not on the Tracking list

2011-09-30 Thread Sunil Mushran
On 09/30/2011 06:49 AM, Herman L wrote: On Thursday, September 29, 2011 2:04 PM Sunil Mushran wrote: On 09/29/2011 08:56 AM, Herman L wrote: On Wednesday, September 21, 2011 4:00 PM, Sunil Mushran wrote: On 09/21/2011 12:37 PM, Herman L wrote: On 09/19/2011 08:35 AM, Herman L wrote: Hi all

Re: [Ocfs2-users] Problem with tunefs.ocfs2, similar to fsck.ocfs2 on EL5

2011-09-27 Thread Sunil Mushran
On 09/27/2011 09:12 AM, Ulf Zimmermann wrote: -----Original Message----- From: Sunil Mushran [mailto:sunil.mush...@oracle.com] Sent: Monday, September 26, 2011 10:09 AM To: Ulf Zimmermann Cc: ocfs2-users@oss.oracle.com Subject: Re: [Ocfs2-users] Problem with tunefs.ocfs2, similar

Re: [Ocfs2-users] Problem with tunefs.ocfs2, similar to fsck.ocfs2 on EL5

2011-09-26 Thread Sunil Mushran
I'll look at the tunefs issue. But the other one does not make sense. strict_jbd is a compat flag. Mount should work. What is the mount error? As in, in dmesg. On 09/25/2011 04:43 AM, Ulf Zimmermann wrote: As tunefs.ocfs2 wasn't working for us, I tried to mkfs.ocfs2 the volumes again with

Re: [Ocfs2-users] dlm_lockres_release:507 ERROR: Resource W0000000000000001b027d69b591f15 not on the Tracking list

2011-09-21 Thread Sunil Mushran
! Herman From: Sunil Mushran To: Herman L Sent: Monday, September 19, 2011 12:57 PM Subject: Re: [Ocfs2-users] dlm_lockres_release:507 ERROR: Resource W0000000000000001b027d69b591f15 not on the Tracking list I've no idea of the state of the source that you are using. The message

Re: [Ocfs2-users] dlm_lockres_release:507 ERROR: Resource W0000000000000001b027d69b591f15 not on the Tracking list

2011-09-19 Thread Sunil Mushran
I've no idea of the state of the source that you are using. The message is a warning indicating a race. While it probably did not affect the functioning, there is no guarantee that that would be the case the next time around. The closest relevant patch is over 2 years old.

Re: [Ocfs2-users] 11gr1 RAC + ocfs2 node2 is down and not able to mount the ocfs2 FS on node1

2011-09-19 Thread Sunil Mushran
The connect is failing. One of the main reasons is a firewall. See if iptables is running; check on both nodes. If so, shut it down or add a rule to allow traffic on the o2cb port. On 09/18/2011 08:57 PM, veeraa bose wrote: Hi All, we are having two node 11gr1 RAC (we have used ocfs2 for
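One way to confirm whether a firewall is blocking the interconnect is to try a plain TCP connect to the o2cb port from each node to the other. The following is a minimal sketch, not part of the original reply. It assumes the port is 7777 (the usual `ip_port` value in /etc/ocfs2/cluster.conf; use whatever your configuration says), and the host names `node1` and `node2` are hypothetical placeholders.

```python
import socket

# Assumption: 7777 is the ip_port set in /etc/ocfs2/cluster.conf.
# Replace it with the value from your own cluster configuration.
O2CB_PORT = 7777


def can_connect(host: str, port: int = O2CB_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical node names; run this on each node against its peer.
    for node in ("node1", "node2"):
        state = "reachable" if can_connect(node) else "NOT reachable"
        print(f"{node}:{O2CB_PORT} is {state}")
```

A failed connect while the cluster stack is online on the peer points towards filtering between the nodes. A refused connection can also simply mean that o2cb is not running on the target, so check that first.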

Re: [Ocfs2-users] fsck doesn't fix bad chain

2011-09-17 Thread Sunil Mushran
Can you save the o2image of the volume when it is in that state? We'll need that for analysis. On 09/16/2011 05:41 AM, Andre Nathan wrote: Hello For a while I had seen errors like this in the kernel logs: OCFS2: ERROR (device drbd5): ocfs2_validate_gd_parent: Group descriptor

Re: [Ocfs2-users] Linux kernel crash due to ocfs2

2011-09-16 Thread Sunil Mushran
, George On Thu, 2011-09-15 at 09:45 -0700, Sunil Mushran wrote: I was hoping to get a readable stack. Please could you provide a link to the coredump. On 09/15/2011 02:51 AM, Betzos Giorgos wrote: Hello, I am sorry for the delay in responding. Unfortunately, it faulted again. Here

Re: [Ocfs2-users] Syslog reports (ocfs2_wq, 15527, 2):ocfs2_orphan_del:1841 ERROR: status = -2

2011-09-15 Thread Sunil Mushran
04096 21-Nov-2008 10:54 .. [root@ausracdbd01 tmp]# From: Sunil Mushran [mailto:sunil.mush...@oracle.com] Sent: Thursday, September 15, 2011 10:04 AM To: Daniel Keisling Cc: ocfs2-users@oss.oracle.com Subject: Re: [Ocfs2-users] Syslog
