Re: [Ocfs2-users] [Ocfs2-devel] size increase

2015-03-17 Thread Sunil Mushran
This is because you are specifying a 128k cluster size. Refer to man mkfs.ocfs2 for more. On Mar 17, 2015 8:04 PM, "Umarzuki Mochlis" wrote: > Hi, > > What I meant by total size is output of 'du -hs' > > I can see output of fdisk on mpath1 of ocfs2 LUN similar to logical > volume of ext4 partitio

Re: [Ocfs2-users] OCFS2 “Heartbeat generation mismatch on device” error when mounting iscsi target

2015-02-09 Thread Sunil Mushran
t generation mismatch” error > message. > > -- > Danijel Krmar > A51 D.O.O. > Novi Sad > https://www.activecollab.com/ > > On February 9, 2015 at 8:09:06 PM, Sunil Mushran (sunil.mush...@gmail.com) > wrote: > > On node 2, do: > ps aux | grep o2hb > > I susp

Re: [Ocfs2-users] OCFS2 “Heartbeat generation mismatch on device” error when mounting iscsi target

2015-02-09 Thread Sunil Mushran
On node 2, do: ps aux | grep o2hb I suspect you have multiple o2hb threads running. If so, restart the o2cb cluster on that node. On Mon, Feb 9, 2015 at 10:08 AM, Danijel Krmar < danijel.kr...@activecollab.com> wrote: > As said in the title, when I want to mount an iSCSI target on one machine I >
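The check suggested in the reply above can be sketched in Python as a rough illustration. It counts o2hb entries in captured `ps aux` output and skips the `grep` process itself. The function name, the sample lines and the `[o2hb-...]` thread-name layout are assumptions made for this example, not output taken from the thread.

```python
def count_o2hb_threads(ps_lines):
    """Count ps aux entries whose COMMAND mentions o2hb, ignoring grep itself."""
    count = 0
    for line in ps_lines:
        # ps aux prints 11 columns; the last one (COMMAND) may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        command = fields[10]
        if "o2hb" in command and not command.startswith("grep"):
            count += 1
    return count


# Hypothetical sample, invented for illustration only.
sample = [
    "root      2301  0.0  0.0      0     0 ?        S<   10:02   0:01 [o2hb-0C4AB55FE9]",
    "root      2977  0.0  0.0      0     0 ?        S<   10:05   0:00 [o2hb-0C4AB55FE9]",
    "root      3105  0.0  0.0   6388   700 pts/0    S+   10:09   0:00 grep o2hb",
]
print(count_o2hb_threads(sample))
```

How many threads count as too many depends on the setup, so the number is only a starting point for the diagnosis described in the reply.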

Re: [Ocfs2-users] How to unlock a bloked resource? Thanks

2014-09-10 Thread Sunil Mushran
What is the output of the commands? The protocol is supposed to do the unlocking on its own. See what it is blocked on. It could be that the node that has the lock cannot unlock it because it cannot flush the journal to disk. On Tue, Sep 9, 2014 at 7:55 PM, Guozhonghua wrote: > Hi All: > > > >

Re: [Ocfs2-users] OCFS2 slow when using 'find' and 'du' commands

2014-05-22 Thread Sunil Mushran
Is this slow the second time you run the command or only the first? How much memory do you have? -mmin needs the inode. And reading inodes from disk is expensive. One reason could be that the system does not have enough memory to cache the inodes and thus is triggering lots of disk reads. On Thu

Re: [Ocfs2-users] OCFS2 and PHP is it related to ocfs2 ?

2014-05-02 Thread Sunil Mushran
0xa0 > [] user_path_at+0x11/0x20 > [] SyS_faccessat+0x9c/0x220 > [] SyS_access+0x18/0x20 > [] system_call_fastpath+0x1a/0x1f > [] 0x > > > *Sent:* Friday, 02 May 2014 at 17:16 > *From:* "Sunil Mushran" > *To:* molo@web.de > *Cc:* Ocfs2-

Re: [Ocfs2-users] OCFS2 and PHP is it related to ocfs2 ?

2014-05-02 Thread Sunil Mushran
Dump some kernel/user stacks to see if we can narrow down the loop it is spinning in. cat /proc/PID/stack will show the kernel stack; pstack should show the user stack. On May 2, 2014 8:12 AM, wrote: > its PHP-FPM > > root 1951 1.5 0.1 362344 7704 ?Ss 17:09 0:01 php-fpm: > mast

Re: [Ocfs2-users] OCFS2 and PHP is it related to ocfs2 ?

2014-05-02 Thread Sunil Mushran
Which process is pegging the CPU? On May 2, 2014 6:12 AM, wrote: > We have two nodes which are serving PHP webpages with PHP5-FPM. Both Nodes > are configured with drbd in dual primary mode. > In our tests, if one of these two nodes get 10-20 Pagerefresh's at the > same time, the CPU are 100% in

Re: [Ocfs2-users] FSCK may be failing and corrupting my disk???

2014-03-24 Thread Sunil Mushran
r than cloning a bad inode? > > > On 03/22/2014 09:40 PM, Sunil Mushran wrote: > > Cloning the inode means inode + data. Let it finish. > > > On Sat, Mar 22, 2014 at 3:44 PM, Eric Raskin wrote: > >> Hi: >> >> I am running a two-node Oracle VM Server 2.2

Re: [Ocfs2-users] FSCK may be failing and corrupting my disk???

2014-03-22 Thread Sunil Mushran
Cloning the inode means inode + data. Let it finish. On Sat, Mar 22, 2014 at 3:44 PM, Eric Raskin wrote: > Hi: > > I am running a two-node Oracle VM Server 2.2.2 installation. We were > having some strange problems creating new virtual machines, so I shut down > the systems and unmounted the

Re: [Ocfs2-users] How do I check fragmentation amount?

2013-11-01 Thread Sunil Mushran
debugfs.ocfs2 -R "frag filespec" DEVICE will show you the fragmentation level on an inode basis. You could run that for all inodes and figure out the value for the entire volume. On Fri, Nov 1, 2013 at 3:00 PM, Andy wrote: > How can I check the amount of fragmentation on an OCFS2 volume? >

Re: [Ocfs2-users] How to break out the unstop loop in the recovery thread? Thanks a lot.

2013-11-01 Thread Sunil Mushran
It is encountering SCSI errors reading the device. Fixing that will fix the issue. If you want to stop the logging, I don't believe there is a method right now. But it could be trivially added. Allow user to disable mlog(ML_ERROR) logging. On Thu, Oct 31, 2013 at 7:38 PM, Guozhonghua wrote: >

Re: [Ocfs2-users] OCFS2 tuning, fragmentation and localalloc option. Cluster hanging during mix read+write workloads

2013-08-06 Thread Sunil Mushran
If the storage connectivity is not stable, then dlm issues are to be expected. In this case, the processes are all trying to take the readlock. One possible scenario is that the node holding the writelock is not able to relinquish the lock because it cannot flush the updated inodes to disk. I would

Re: [Ocfs2-users] Problems with volumes coming from RHEL5 going to OEL6

2013-07-09 Thread Sunil Mushran
nil, any suggestions on this? > > *From:* ocfs2-users-boun...@oss.oracle.com [mailto: > ocfs2-users-boun...@oss.oracle.com] *On Behalf Of *Ulf Zimmermann > *Sent:* Saturday, June 22, 2013 15:20 > *To:* Sunil Mushran > > *Cc:* ocfs2-users@oss.oracle.co

Re: [Ocfs2-users] High inodes usage

2013-07-03 Thread Sunil Mushran
o I suppose it is not causing any problem but I found it > weird). > > > 2013/7/3 Sunil Mushran > >> That is old. It could just be a minor bug in that release. Is it causing >> you any problems? >> >> >> On Wed, Jul 3, 2013 at 12:31 PM, Nicolas Michel <

Re: [Ocfs2-users] High inodes usage

2013-07-03 Thread Sunil Mushran
I'm not > at work but it's a SLES 10 SP2, so a pretty old kernel I suppose. > > Nicolas > > > 2013/7/3 Sunil Mushran > >> How did you figure this out? Also, which version of the kernel are you >> using? >> >> >> On Wed, Jul 3, 2013 at 1:05

Re: [Ocfs2-users] High inodes usage

2013-07-03 Thread Sunil Mushran
How did you figure this out? Also, which version of the kernel are you using? On Wed, Jul 3, 2013 at 1:05 AM, Nicolas Michel wrote: > Hello guys, > > I'm using OCFS2 for a shared storage (on SAN). I just saw that the inode > usage is really high although these filesystems are used for Oracle DAT

Re: [Ocfs2-users] Problems with volumes coming from RHEL5 going to OEL6

2013-06-21 Thread Sunil Mushran
Can you dump the following using the 1.8 binary. debugfs.ocfs2 -R "stats" /dev/mapper/. On Fri, Jun 21, 2013 at 6:17 AM, Ulf Zimmermann wrote: > We have a production cluster of 6 nodes, which are currently running > RHEL 5.8 with OCFS2 1.4.10. We snapclone these volumes to multiple > desti

Re: [Ocfs2-users] Unable to set the o2cb heartbeat to global

2013-06-04 Thread Sunil Mushran
Support for global heartbeat was added in ocfs2-tools-1.8. On Tue, Jun 4, 2013 at 8:31 AM, Vineeth Thampi wrote: > Hi, > > I have added heartbeat mode as global, but when I do a mkfs and mount, and > then check the mount, it says I am in local mode. Even > /sys/kernel/config/cluster/ocfs2/heartb

Re: [Ocfs2-users] What is the overhead/disk loss of formatting an ocfs2 filesystem?

2013-04-15 Thread Sunil Mushran
-N 16 means 16 journals. I think it defaults to 256M journals. So that's 4G. Do you plan to mount it on 16 nodes? If not, reduce that. The other option is a smaller journal. But you have to be careful as a small journal could limit your write throughput. On Mon, Apr 15, 2013 at 1:37 PM, Jerry Smith wr
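The arithmetic in the reply above can be written out as a small Python sketch. It assumes one journal per node slot and takes the 256M default from the reply, which itself hedges it with "I think". The function name is hypothetical.

```python
def journal_overhead_mb(node_slots, journal_size_mb=256):
    """Space taken by journals, assuming one journal per node slot."""
    return node_slots * journal_size_mb


# 16 slots at the assumed 256M default: 16 * 256 = 4096 MB, i.e. 4G.
print(journal_overhead_mb(16))
# Fewer slots shrink the overhead proportionally.
print(journal_overhead_mb(4))
```

Either lever mentioned in the reply, fewer node slots or a smaller journal, reduces the product. The reply's caution about write throughput applies to the second.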

Re: [Ocfs2-users] Significant Slowdown when writing and deleting files at the same time

2013-03-29 Thread Sunil Mushran
Are you mounting -o writeback? On Fri, Mar 29, 2013 at 12:28 PM, Andy wrote: > I have been having performance issues from time to time on our > production ocfs2 volumes, so I set up a test system to try to reproduce > what I was seeing on the production systems. This is what I found out: > > I

Re: [Ocfs2-users] [OCFS2] Crash at o2net_shutdown_sc()

2013-03-01 Thread Sunil Mushran
[ 1481.620253] o2hb: Unable to stabilize heartbeart on region 1352E2692E704EEB8040E5B8FF560997 (vdb) What this means is that the device is suspect. o2hb writes are not hitting the disk. vdb is accepting and acknowledging the write but spitting out something else during the next read. Heartbeat de

Re: [Ocfs2-users] OCFS ..Inode contains a hole at offset...

2013-02-20 Thread Sunil Mushran
This is probably a directory. debugfs.ocfs2 -R 'stat <52663>' /dev/ will dump the inode. Are you sure fsck is fixing it? Does the output show this block getting fixed? If not, you may want to run fsck.ocfs2 v1.8. I think fix code was added for it. On Wed, Feb 20, 2013 at 1:01 AM, Fiorenza M

Re: [Ocfs2-users] ocfs cluster node keeps rebooting

2013-01-14 Thread Sunil Mushran
1.2.5 is a 6+ year old release. You may want to use something more current. On Mon, Jan 14, 2013 at 12:06 PM, Bill Zha wrote: > Hi Sunil and All, > > We have a 10 Redhat4.2-node OCFS cluster running on version 1.2.5-6. One > of the nodes started to reboot almost every day since last week. The >

Re: [Ocfs2-users] asynchronous hwclocks

2013-01-03 Thread Sunil Mushran
The fs does not care about time. It should have no effect on the cluster. However the apps may care and may behave erratically. On Jan 3, 2013, at 3:13 PM, "Medienpark, Jakob Rößler" wrote: > Hello list, > > today I noticed huge differences between the hardware clocks in our cluster. > Some

Re: [Ocfs2-users] Is this a valid configuration?

2012-12-05 Thread Sunil Mushran
This is normal. My only concern is the use of very old kernel/fs versions. On Wed, Dec 5, 2012 at 3:08 AM, Neil wrote: > Anyone? > > > > On 2012-11-28 00:47:56 + neil campbell > wrote: > > > > > > > Hi list, > > > > I am running OCFS2 1.2.9-9.bug13439173 on RHEL 4

Re: [Ocfs2-users] "ls" taking ages on a directory containing 900000 files

2012-12-04 Thread Sunil Mushran
4("TEW_STRESS_TEST_VM.1K_100P_1F.P022_F01583.txt", > > {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001413> > > > > > > > > We are using a 32 bits architecture, can it be the cause of the kernel > > not having enough memory ? Any pos

Re: [Ocfs2-users] "ls" taking ages on a directory containing 900000 files

2012-12-04 Thread Sunil Mushran
strace -p PID -ttt -T Attach and get some timings. The simplest guess is that the system lacks memory to cache all the inodes and thus has to hit disk (and more importantly take cluster locks) for the same inode repeatedly. The user guide has a section in NOTES explaining this. On Tue, Dec 4, 2
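To make the "get some timings" step concrete, here is a minimal Python sketch that totals the per-call elapsed time `strace -T` appends in angle brackets, grouped by syscall name. The function name and the sample lines are made up for illustration. Real traces also contain unfinished/resumed lines and signal lines, which this sketch simply skips.

```python
import re

# strace -T appends "<seconds>" to each completed syscall line.
_ELAPSED = re.compile(r"<(\d+\.\d+)>\s*$")
# With -ttt each line starts with an epoch timestamp, then the syscall name.
_SYSCALL = re.compile(r"^(?:\d+(?:\.\d+)?\s+)?(\w+)\(")


def time_per_syscall(lines):
    """Return {syscall name: total seconds} for lines carrying a -T timing."""
    totals = {}
    for line in lines:
        elapsed = _ELAPSED.search(line)
        name = _SYSCALL.match(line)
        if not (elapsed and name):
            continue  # not a completed syscall line
        key = name.group(1)
        totals[key] = totals.get(key, 0.0) + float(elapsed.group(1))
    return totals


# Hypothetical trace lines, invented for illustration only.
sample = [
    '1354633200.100000 lstat64("a.txt", {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001413>',
    '1354633200.102000 lstat64("b.txt", {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.000587>',
    '1354633200.103000 getdents64(3, /* 100 entries */, 32768) = 3200 <0.000250>',
]
for syscall, seconds in sorted(time_per_syscall(sample).items()):
    print(syscall, round(seconds, 6))
```

If one call such as a stat variant dominates the total, that would be consistent with the inode-cache explanation in the reply, though the trace alone does not prove it.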

Re: [Ocfs2-users] Huge Problem ocfs2

2012-11-09 Thread Sunil Mushran
's enough to just force the return value > of 0 at "ocfs2_validate_meta_ecc" in order to bypass the ECC checks? > > > > > On 10.11.2012 03:55, Sunil Mushran wrote: > > If the global bitmap is gone, then the fs is unusable. But you can extract > data using > the rdump command in

Re: [Ocfs2-users] Huge Problem ocfs2

2012-11-09 Thread Sunil Mushran
close(3)= 0 > write(2, "tunefs.ocfs2", 12tunefs.ocfs2)= 12 > write(2, ": ", 2: ) = 2 > write(2, "I/O error on channel", 20I/O error on channel)= 20 > write(2, " ", 1 )

Re: [Ocfs2-users] Huge Problem ocfs2

2012-11-09 Thread Sunil Mushran
It's either that or a checksum problem. Disable metaecc. Not sure which kernel you are running. We fixed a few problems around this a few years ago. If your kernel is older, then it could be a known issue. On Fri, Nov 9, 2012 at 12:50 PM, Marian Serban wrote: > Hi Sunil, > > Thank you for answ

Re: [Ocfs2-users] Huge Problem ocfs2

2012-11-09 Thread Sunil Mushran
IO error on channel means the system cannot talk to the block device. The problem is in the block layer. Maybe a loose cable or a setup problem. dmesg should show errors. On Fri, Nov 9, 2012 at 10:46 AM, Laurentiu Gosu wrote: > Hi, > I'm using ocfs2 cluster in a production environment since al

Re: [Ocfs2-users] HA-OCFS2?

2012-09-13 Thread Sunil Mushran
cfs != storage You need to get a highly available storage that is concurrently accessible from multiple nodes. ocfs2 will allow multiple nodes to concurrently access the same storage. With posix semantics. If a node dies, the remaining nodes will pause to recover and then continue functioning. Th

Re: [Ocfs2-users] Ocfs2-users Digest, Vol 105, Issue 4

2012-09-12 Thread Sunil Mushran
On Wed, Sep 12, 2012 at 9:45 AM, Asanka Gunasekera < asanka_gunasek...@yahoo.co.uk> wrote: > Load O2CB driver on boot (y/n) [y]: > Cluster stack backing O2CB [o2cb]: > Cluster to start on boot (Enter "none" to clear) [ocfs2]: > Specify heartbeat dead threshold (>=7) [31]: > Specify network idle ti

Re: [Ocfs2-users] test inode bit failed -5

2012-08-31 Thread Sunil Mushran
nfsd encountered an error reading the device. So something in the io path below the fs encountered a problem. If it just happened once, then you can ignore it. On Fri, Aug 31, 2012 at 2:23 AM, Hideyasu Kojima wrote: > Hi > I am using an ocfs2 cluster as an NFS server. > > Only once, I got the below error, an

Re: [Ocfs2-users] Issue with OCFS2 mount

2012-08-29 Thread Sunil Mushran
Forgot to add that this issue is limited to metaecc. So you could avoid the issue in your same setup by not enabling metaecc on the volume. And last I checked mkfs did not enable it by default. On Mon, Aug 27, 2012 at 10:35 AM, Sunil Mushran wrote: > So you are running into a bug that has b

Re: [Ocfs2-users] Issue with files and folder ownership

2012-08-29 Thread Sunil Mushran
Aug 29, 2012 at 7:25 AM, Sunil Mushran wrote: > >> Isn't the mount point local to the machine? > > > I use iSCSI for the Block device and I mount the device (/dev/sdc1) at > /var/lib/nova/instances. > > I've formatted /dev/sdc1 in OCFS2 FS. > > Should

Re: [Ocfs2-users] Issue with files and folder ownership

2012-08-28 Thread Sunil Mushran
Permissions on the mount point should be local to a machine. AFAIK. On Mon, Aug 27, 2012 at 3:08 AM, Emilien Macchi wrote: > Hi, > > > I'm working on a two nodes cluster with the goal to store virtual machines > managed by OpenStack services and KVM Hypervisor. I also use iSCSI > Multi-Pathing f

Re: [Ocfs2-users] Issue with OCFS2 mount

2012-08-27 Thread Sunil Mushran
Oracle > version:1.5.0 > description:OCFS2 1.5.0 > srcversion: B13569B35F99D43FA80D129 > depends:jbd2,ocfs2_stackglue,quota_tree,ocfs2_nodemanager > vermagic: 2.6.34.7-0.7-desktop SMP preempt mod_unload modversions > > # mkfs.ocfs2 --version >

Re: [Ocfs2-users] Issue with OCFS2 mount

2012-08-24 Thread Sunil Mushran
What is the version of the kernel, ocfs2 and ocfs2 tools? uname -a modinfo ocfs2 mkfs.ocfs2 --version On Fri, Aug 24, 2012 at 1:09 PM, Rory Kilkenny wrote: > We have an HP P2000 G3 Storage array, fiber connected. The storage > array has a RAID5 array broken into 2 physical OCFS2 volumes (A & B

Re: [Ocfs2-users] OCFS2 and util_file

2012-08-23 Thread Sunil Mushran
On Thu, Aug 23, 2012 at 10:58 AM, Maki, Nancy wrote: > By default we mount all our OCFS2 volumes with datavolume. To be more > specific, the volume that we are having the issue with is not a database > volume but a shared drive for developers to read and write other types of > files. Would it b

Re: [Ocfs2-users] OCFS2 and util_file

2012-08-23 Thread Sunil Mushran
You are probably mounting the volume with the datavolume option. Instead, use the init.ora param filesystemio_options to force odirect, and mount the volume without the datavolume option. This is documented in the user's guide. On Thu, Aug 23, 2012 at 8:14 AM, Maki, Nancy wrote: > We are getting

Re: [Ocfs2-users] null pointer dereference

2012-08-21 Thread Sunil Mushran
You may want to run a full fsck on the fs. fsck.ocfs2 -fy /dev/ On Tue, Aug 21, 2012 at 12:49 AM, Pawel wrote: > Hi, > After upgrading ocfs2 my cluster is unstable. > > At least once per week I can see: > kernel panic: Null pointer dereference at 00048 > o2dlm_blocking_ast_wrapper + 0x8/0x

Re: [Ocfs2-users] ocfs2 problem journal size

2012-08-02 Thread Sunil Mushran
oh crap. The dlm lock needs to lock the journals. So you need to recreate the journal inodes with i_size 0. dd a good journal inode and edit it using binary editor. Change the inode num to the block number, zero out the i_size and next_free_extent. Repeat for the 4 inodes. Hopefully some one on t

Re: [Ocfs2-users] ocfs2 problem journal size

2012-08-02 Thread Sunil Mushran
The 4 journal inodes got zeroed out. Do you know how/why? Have you tried running fsck with -fy (enable writes)? fsck.ocfs2 does have a check for bad journals that it will regenerate. JOURNAL_FILE_INVALID OCFS2 uses JBD for journalling and some journal files exist in the system directory. Fsck ha

Re: [Ocfs2-users] ocfs2-tools git: broken after commit deb5ade9145f8809f1fde19cf53bdfdf1fb7963e

2012-07-26 Thread Sunil Mushran
e. Good commit must be: > else > -tmp = g_list_append(elem, cfs); > +g_list_append(elem, cfs); > > Attached patch. > > Thanks. Acked-by: Sunil Mushran ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-users

Re: [Ocfs2-users] Removing a node from cluster.conf (on a specific node)

2012-04-29 Thread Sunil Mushran
Online add/remove of nodes and of global heartbeat devices has been in mainline for over a year. I think 2.6.38+ and tools 1.8. The ocfs2-tools tree hosted on oss.oracle.com/git has a 1.8.2 tag that can be used safely. It has been fully tested. The user's guide has been moved to man pages bundle

Re: [Ocfs2-users] Permission denied on ocfs2 cluster

2012-03-16 Thread Sunil Mushran
om > [mailto:ocfs2-users-boun...@oss.oracle.com] On Behalf Of зоррыч > Sent: Thursday, March 15, 2012 11:26 PM > To: 'Sunil Mushran' > Cc: ocfs2-users@oss.oracle.com > Subject: Re: [Ocfs2-users] Permission denied on ocfs2 cluster > > [root@noc-1-synt /]# ls -lh | grep o

Re: [Ocfs2-users] Permission denied on ocfs2 cluster

2012-03-15 Thread Sunil Mushran
DONLY) = -1 ENOENT > (No such file or directory) > open("/usr/share/locale/en.UTF-8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT > (No such file or directory) > open("/usr/share/locale/en.utf8/LC_MESSAGES/libc.mo", O_RDONLY) = -1 ENOENT > (No such file or directory)

Re: [Ocfs2-users] Permission denied on ocfs2 cluster

2012-03-15 Thread Sunil Mushran
strace may show more. I would first confirm that my perms are correct. On 03/15/2012 07:58 AM, ?? wrote: > I am testing the scheme of drbd and ocfs2 > > If you attempt to write to the cluster error: > > [root@noc-1-m77 share]# mkdir 12 > > mkdir: cannot create directory `12': Permission denied

Re: [Ocfs2-users] ocfs2-1.4.7 is not binding in scientific linux 6.2

2012-03-12 Thread Sunil Mushran
ocfs2 1.4 will not build with 2.6.32. A better solution is to just enable ocfs2 in the 2.6.32 kernel src tree and build. On 03/11/2012 07:37 AM, зоррыч wrote: > Hi. > > I use scientific linux 6.2: > > [root@noc-1-m77 ocfs2-1.4.7]# cat /etc/redhat-release > > Scientific Linux release 6.2 (Carbon) >

Re: [Ocfs2-users] ocfs2console hangs on startup

2012-03-10 Thread Sunil Mushran
ocfs2console has been obsoleted. Just use the utilities directly. To detect ocfs2 volumes, use blkid. You can use it to restrict the lookup paths. Refer to its manpage. On 03/09/2012 06:15 PM, John Major wrote: > Hi, > > Hope this is the right place to ask this. > > I have set up 2 ubuntu lts machine

Re: [Ocfs2-users] Ocfs2-users Digest, Vol 98, Issue 9

2012-03-02 Thread Sunil Mushran
On 02/29/2012 04:10 PM, David Johle wrote: > I too have seen some serious performance issues under 1.4, especially > with writes. I'll share some info I've gathered on this topic, take > it however you wish... > > In the past I never really thought about running benchmarks against > the shared blo

Re: [Ocfs2-users] OCFS2 1.2/1.6

2012-03-02 Thread Sunil Mushran
The file system on-disk image has not changed. So the 1.6 file system software can mount the volume created with 1.2 mkfs. What you cannot do is concurrently mount the same volume with nodes running 1.2 and 1.6 versions of the file system software. It is not mixed mode. The 1.6 fs software will r

Re: [Ocfs2-users] Concurrent write performance issues with OCFS2

2012-02-28 Thread Sunil Mushran
In 1.4, the local allocator window is small. 8MB. Meaning the node has to hit the global bitmap after every 8MB. In later releases, the window is much larger. Second, a single node is not a good baseline. A better baseline is multiple nodes writing concurrently to the block device. Not fs. Use dd.
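As a back-of-the-envelope illustration of the first point above, the Python sketch below estimates how often a node writing sequentially would have to return to the global bitmap for a given local allocator window. The model (one visit per exhausted window) is a simplification assumed here, the function name is hypothetical, and the 256 MiB comparison value is an arbitrary example rather than a figure from the thread.

```python
import math

MIB = 1024 * 1024


def global_bitmap_visits(bytes_written, window_bytes=8 * MIB):
    """Rough count of global bitmap visits: one per local window used up."""
    return math.ceil(bytes_written / window_bytes)


one_gib = 1024 * MIB
# 1 GiB with the 8MB window mentioned for 1.4: 1024 / 8 = 128 visits.
print(global_bitmap_visits(one_gib))
# Same write with a hypothetical 256 MiB window: 1024 / 256 = 4 visits.
print(global_bitmap_visits(one_gib, 256 * MIB))
```

With several nodes writing concurrently, each of those visits involves cluster-wide coordination, which is why a small window hurts more under concurrent load.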

Re: [Ocfs2-users] Bad magic number in inode

2012-02-01 Thread Sunil Mushran
inode#11 is in the system directory. fsck cannot fix this automatically. If the corruption is limited, there is a chance the inodes could be recreated manually. But do look at backups to restore. On 02/01/2012 10:20 AM, Werner Flamme wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > Hi,

Re: [Ocfs2-users] A Billion Files on OCFS2 -- Best Practices?

2012-02-01 Thread Sunil Mushran
On 02/01/2012 10:24 AM, Mark Hampton wrote: > Here's what I got from debugfs.ocfs2 -R "stats". I have to type it out > manually, so I'm only including the "features" lines: > > Feature Compat: 3 backup-super strict-journal-super > Feature Incompat: 16208 sparse extended-slotmap inline-data

Re: [Ocfs2-users] A Billion Files on OCFS2 -- Best Practices?

2012-02-01 Thread Sunil Mushran
debugfs.ocfs2 -R "stats" /dev/mapper/... I want to see the features enabled. The main issue with large metadata is the fsck timing. The recently tagged 1.8 release of the tools has much better fsck performance. On 02/01/2012 05:25 AM, Mark Hampton wrote: > We have an application that has many pro

Re: [Ocfs2-users] Extend space on ocfs mount point

2012-02-01 Thread Sunil Mushran
I am not aware of any downsides in resizing. On 02/01/2012 09:57 AM, Kalra, Pratima wrote: > We have a ucm installation on ocfs mount point and we need to increase > the space on that mount point from 20gb to 30 gb. Is this possible > without resulting in any after effects? > Pratima. ___

Re: [Ocfs2-users] A Billion Files on OCFS2 -- Best Practices?

2012-02-01 Thread Sunil Mushran
On 02/01/2012 07:02 AM, Mark wrote: > One more thing. When I straced one of the application processes (these are > the > processes that create the files) I saw this: > > % time seconds usecs/callcalls errors syscall > --- -- -- -- --- >68.94 3.

Re: [Ocfs2-users] Help ! OCFS2 unstable on Disparate Hardware

2012-01-27 Thread Sunil Mushran
Symmetric clustering works best when the nodes are comparable because all nodes have to work in sync. NFS may be more suitable for your needs. On 01/26/2012 05:51 PM, Jorge Adrian Salaices wrote: > I have been working on trying to convince Mgmt at work that we want to > go to OCFS2 away from NFS

Re: [Ocfs2-users] One node, two clusters?

2011-12-22 Thread Sunil Mushran
On 12/22/2011 10:39 AM, Kushnir, Michael (NIH/NLM/LHC) [C] wrote: > Is there a separate DLM instance for each ocfs2 volume? > > I have two "sub-clusters" in the same cluster... A 10 node Hadoop cluster > sharing a SATA RAID10 and a Two node web server cluster sharing a SSD RAID0. > One server mou

Re: [Ocfs2-users] One node, two clusters?

2011-12-22 Thread Sunil Mushran
You don't need to have two clusters for this. This can be accomplished with one cluster with the default local heartbeat. Create one cluster.conf with all the nodes. All nodes, except the one machine, will mount from just one san. The common node will mount from both sans. If you look at the clus

Re: [Ocfs2-users] reflink status

2011-12-17 Thread Sunil Mushran
On 12/17/2011 12:05 PM, richard -rw- weinberger wrote: >> The reflink utility should work. So what if it is based on an older >> coreutils? It is derived from the hard link (ln) utility. > So, building it from http://oss.oracle.com/git/?p=jlbec/reflink.git;a=shortlog > via reflink.spec is the way to g

Re: [Ocfs2-users] reflink status

2011-12-17 Thread Sunil Mushran
First we have to get the new syscall added to the kernel. The first attempt failed because people overloaded the call with extraneous stuff. Recently there has been another attempt to go back to the original proposal. Hopefully, next kernel release. The reflink utility should work. So what if it is based o

Re: [Ocfs2-users] OCFS2 cluster won't come up and stay up

2011-12-01 Thread Sunil Mushran
rt of reset it so we can get these servers back online and > talking again in the meanwhile? > Tony > > On Dec 1, 2011, at 5:05 PM, Sunil Mushran wrote: > >> Node 3 is joining the domain. It is having problems getting the superblock >> cluster lock. >> Create a b

Re: [Ocfs2-users] Monitoring progress of fsck.ocfs2

2011-11-18 Thread Sunil Mushran
Do: cat /proc/PID/stack It is probably stuck in the block layer. On 11/18/2011 08:33 AM, Nick Khamis wrote: > Hello Everyone, > > I just ran fsck.ocfs2 on /dev/drbd0 which is a one gig partition on a > vm with limited resource (100meg of ram). > I am worried that the process crashed because it ha

Re: [Ocfs2-users] Number of Nodes defined

2011-11-17 Thread Sunil Mushran
e partition. > > Any idea what could be happening? > > On 11/16/2011 05:45 PM, Sunil Mushran wrote: >> Reducing node-slots frees up the journal and distributes the metadata >> that that slot was tracking to the remaining slots. I am not aware of >> any reason why the

Re: [Ocfs2-users] Number of Nodes defined

2011-11-16 Thread Sunil Mushran
anything indicating > what the impact to the fs would be when making a change to an existing fs > such as reducing the node slots. > > Anyway, thank you for the feedback, I was able to make the changes with no > impact to the fs. > > David > > On 11/16/2011 12:12 PM, Sunil

Re: [Ocfs2-users] Number of Nodes defined

2011-11-16 Thread Sunil Mushran
man tunefs.ocfs2 It cannot be done in an active cluster. But it can be done without having to reformat the volume. On 11/16/2011 10:08 AM, David wrote: > I wasn't able to find any documentation that answers whether or not the > number of nodes defined for a cluster, can be reduced on an active >

Re: [Ocfs2-users] [Ocfs2-devel] vmstore option - mkfs

2011-11-16 Thread Sunil Mushran
NCOMPAT_REFCOUNT_TREE, > + OCFS2_FEATURE_RO_COMPAT_UNWRITTEN}, /* FS_VMSTORE */ > > These options are the ones that, when choosing for vmstore, are > enabled by default. Is this correct? > > Thanks. > > Att. > Artur Baruchi > > > > On Wed, Nov 16, 2011 at 3:26 PM, Sunil Mushran > wrote: >

Re: [Ocfs2-users] [Ocfs2-devel] vmstore option - mkfs

2011-11-16 Thread Sunil Mushran
fstype is a handy way to format the volume with parameters that are thought to be useful for that use-case. The result of this is printed during format by way of the parameters selected. man mkfs.ocfs2 has a blurb about the features it enables by default. On 11/16/2011 08:45 AM, Artur Baruchi wrot

Re: [Ocfs2-users] OCFS2 and db_block_size

2011-11-14 Thread Sunil Mushran
We talk about this in the user's guide. 1. Always use 4K blocksize. 2. Never set the cluster size less than the database block size. Having a smaller cluster size could mean that a db block may not be contiguous. And you don't want that for performance and other reasons. Having a still larger clus
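The two rules stated above can be expressed as a small Python check. This is only a sketch of the rules as worded in the reply; the function name and the message strings are invented for illustration, and sizes are in bytes.

```python
def check_sizes(block_size, cluster_size, db_block_size):
    """Return a list of rule violations; an empty list means both rules hold."""
    problems = []
    # Rule 1 from the reply: always use a 4K file system block size.
    if block_size != 4096:
        problems.append("block size should be 4K")
    # Rule 2: the cluster size must not be smaller than the database block size,
    # otherwise a single db block may not be contiguous on disk.
    if cluster_size < db_block_size:
        problems.append("cluster size is smaller than the database block size")
    return problems


print(check_sizes(4096, 8192, 8192))   # both rules satisfied
print(check_sizes(4096, 4096, 8192))   # violates rule 2
```

The reply goes on to discuss larger cluster sizes, but that part is cut off in the archive, so the sketch does not try to encode an upper bound.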

Re: [Ocfs2-users] dlm locking

2011-11-14 Thread Sunil Mushran
all the common problems > * What can I do with the files in lost+found > > Thanks Again, > > Nick. > > On Fri, Nov 11, 2011 at 8:02 PM, Sunil Mushran > wrote: >> So it detected one cluster that was doubly allocated. It fixed it. >> Details below. The other fixe

Re: [Ocfs2-users] dlm locking

2011-11-10 Thread Sunil Mushran
The ro issue was different. It appears the volume has more problems. If you want me to look at the issue, I'll need the image of the volume. # o2image /dev/device /tmp/o2image.out On 11/10/2011 01:55 PM, Nick Khamis wrote: > Hello Sunil, > > Thank you so much for your time, and I do not want t

Re: [Ocfs2-users] dlm locking

2011-11-10 Thread Sunil Mushran
Do: fsck.ocfs2 -f /dev/... Without -f, it only replays the journal. On 11/09/2011 05:49 PM, Nick Khamis wrote: > Hello Sunil, > > This is only on the prototype so it's not crucial however, it would be > nice to figure out why for > future reference: > > fsck.ocfs2 /dev/drbd0 > fsck.ocfs2 1.6.4 >

Re: [Ocfs2-users] dlm locking

2011-11-09 Thread Sunil Mushran
This has nothing to do with the dlm. The error states that the fs encountered a bad inode on disk. Possible disk corruption. On encountering it, the fs goes read-only and asks the user to run fsck. On 11/09/2011 11:51 AM, Nick Khamis wrote: > Hello Everyone, > > For the first time I experienced a dlm l

Re: [Ocfs2-users] mixing ocfs2 versions in a cluster

2011-11-09 Thread Sunil Mushran
I would recommend upgrading all the nodes to 1.2.9 as it contains fixes to known bugs in the versions you are running. Mixing versions is never recommended mainly because it is hard to test all possible combinations. It is alright to do so on an interim basis. But never recommended as a stable setu

Re: [Ocfs2-users] mount.ocfs2: Device name specified was not found while opening device

2011-11-03 Thread Sunil Mushran
The device is missing. IOW, "ls /dev/Data-1/sto2data-1" is failing. You need to fix that. On 11/03/2011 06:15 AM, Anderson J. Dominitini wrote: > Hi guys > > I added a new storage in my cluster with five new partition. In my > headnode are all ok, all partition were mounted. But in nodes, I h

Re: [Ocfs2-users] Error building ocfs2-tools

2011-10-28 Thread Sunil Mushran
On 10/27/2011 07:10 PM, Tim Serong wrote: > Damn. It was in Pacemaker's include/crm/ais.h, back before June 27 last > year(!), when it was moved to Pacemaker's configure.ac: > > https://github.com/ClusterLabs/pacemaker/commit/8e939b0ad779c65d445e2fa150df1cc046428a93#include/crm/ais.h > > This mea

Re: [Ocfs2-users] Error building ocfs2-tools

2011-10-27 Thread Sunil Mushran
On 10/27/2011 05:26 PM, Tim Serong wrote: > That ought to work... But where did PCMK_SERVICE_ID come from in that > context? AFAICT it's always been CRM_SERVICE there. See current head: > > http://oss.oracle.com/git/?p=ocfs2-tools.git;a=blob;f=ocfs2_controld/pacemaker.c;hb=HEAD#l158 > > CRM_SER

Re: [Ocfs2-users] Error building ocfs2-tools

2011-10-27 Thread Sunil Mushran
I don't remember that resource. If it did exist, it would have existed in pacemaker. ocfs2-tools does not carry any pacemaker bits. It carries bits that allows it to work with pacemaker & cman. On 10/27/2011 02:27 PM, Nick Khamis wrote: > Hello Sunil, > > Thank you so much for your response. I jus

Re: [Ocfs2-users] Error building ocfs2-tools

2011-10-27 Thread Sunil Mushran
ocfs2-tools-1.4.4 is too old. Build 1.6.4. The source tarball is on oss.oracle.com. On 10/27/2011 12:45 PM, Nick Khamis wrote: > Hello Everyone, > > I am building ocfs2-tools from source. Modified > /ocfs2_controld/Makefile to point to the correct pacemaker 1.1.6 > headers: > > PCMK_INCLUDES = -I

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-23 Thread Sunil Mushran
0C4AB55FE9314FA5A9F81652FDB9B22D ocfs2 ocfs2_hb_ctl: File not found by ocfs2_lookup while stopping heartbeat I can still kill the ref using device name (-d). On 10/23/2011 17:57, Sunil Mushran wrote: I think it stops by uuid. So try doing this the next time. You are encountering some issue that we have

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-23 Thread Sunil Mushran
ld be started and stopped once the volume gets mounted/umounted. br, Laurentiu. On 10/19/2011 02:28, Sunil Mushran wrote: Manual delete will only work if there are no references. In your case there are references. You may want to start both nodes from scratch. Do not start/stop heartbeat manually.

Re: [Ocfs2-users] OCFS2 slow with multiple writes

2011-10-21 Thread Sunil Mushran
ake longer with "ordered" option as data needs to be flushed to > the FS before journal commit, but why is that blocking a new separate file > from being written to the file system? > > Regards, > Prakash > > On Oct 20, 2011, at 6:25 PM, Sunil Mushran wrote: > >>

Re: [Ocfs2-users] OCFS2 slow with multiple writes

2011-10-20 Thread Sunil Mushran
Use writeback. Ordered data requires the data to be flushed before journal commit. And flushing 40G takes time. mount -t ocfs2 -o data=writeback DEVICE PATH On 10/20/2011 03:05 PM, Prakash Velayutham wrote: > Hi, > > OS - SLES 11.1 with HAE > OCFS2 - 1.4.3-0.16.7 > Cluster stack - Pacemaker > > I have Hea
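
A minimal sketch of the writeback mount suggested above, assuming the filesystem type is passed with -t and the journaling mode with -o. The function name is hypothetical, and it only prints the command so it can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the mount command for an OCFS2
# volume using data=writeback. DEVICE and PATH are supplied by the caller.
ocfs2_writeback_mount_cmd() {
    device=$1
    mountpoint=$2
    if [ -z "$device" ] || [ -z "$mountpoint" ]; then
        echo "usage: ocfs2_writeback_mount_cmd DEVICE PATH" >&2
        return 1
    fi
    # -t selects the filesystem type; -o carries mount options.
    printf 'mount -t ocfs2 -o data=writeback %s %s\n' "$device" "$mountpoint"
}
```

Calling it as ocfs2_writeback_mount_cmd /dev/sdb1 /mnt/ocfs2 (both placeholder values) prints the command line to paste or pipe to a root shell.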

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
going to sleep now, I have to be up in a few hours. We can continue tomorrow if it's OK with you. Thank you for your help. Laurentiu. On 10/19/2011 01:33, Sunil Mushran wrote: One way this can happen is if one starts the hb manually and then force-formats that volume. The format will

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
/2011 03:24 PM, Laurentiu Gosu wrote: Yes, I did reformat it (even more than once, I think, last week). This is a pre-production system and I'm trying various options before moving into real life. On 10/19/2011 01:19, Sunil Mushran wrote: Did you reformat the volume recently? or, when did you f

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
from?? ocfs2_hb_ctl -I -u 918673F06F8F4ED188DDCE14F39945F6 918673F06F8F4ED188DDCE14F39945F6: 1 refs On 10/19/2011 01:04, Sunil Mushran wrote: Let's do it by hand. rm -rf /sys/kernel/config/cluster/.../heartbeat/*0C4AB55FE9314FA5A9F81652FDB9B22D * On 10/18/2011 02:52 PM, Laurentiu Gosu

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
> No improvement :( > > > On 10/19/2011 00:50, Sunil Mushran wrote: >> See if this cleans it up. >> ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D >> >> On 10/18/2011 02:44 PM, Laurentiu Gosu wrote: >>> ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
See if this cleans it up. ocfs2_hb_ctl -K -u 0C4AB55FE9314FA5A9F81652FDB9B22D On 10/18/2011 02:44 PM, Laurentiu Gosu wrote: > ocfs2_hb_ctl -I -u 0C4AB55FE9314FA5A9F81652FDB9B22D > 0C4AB55FE9314FA5A9F81652FDB9B22D: 0 refs > > > On 10/19/2011 00:43, Sunil Mushran wrote: >
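
A small sketch for scripting the check quoted above, assuming the "UUID: N refs" line format shown in this thread. The function name is hypothetical; it only parses a line of text and does not call ocfs2_hb_ctl itself.

```shell
#!/bin/sh
# Hypothetical helper: extract the reference count from a line such as
#   0C4AB55FE9314FA5A9F81652FDB9B22D: 0 refs
# Prints the count, or returns 1 if the line is not in that form.
hb_ref_count() {
    line=$1
    count=${line#*: }      # drop everything up to and including ": "
    count=${count% refs}   # drop the trailing " refs"
    case $count in
        ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric results
    esac
    printf '%s\n' "$count"
}
```

It could be fed from the real tool, for example hb_ref_count "$(ocfs2_hb_ctl -I -u UUID)", where UUID is the region shown under the cluster's heartbeat directory.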

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
> mounted.ocfs2 -f > Device FS Nodes > /dev/mapper/volgr1-lvol0 ocfs2 ro02xsrv001 > > ro02xsrv001 = the other node in the cluster. > > By the way, there is no /dev/dm-2 > ls /dev/dm-* > /dev/dm-0 /dev/dm-1 > > > On 10/19/2011 00:37, Sunil M

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
/ocfs2: > total 0 > > ls -lR /sys/kernel/debug/o2dlm > /sys/kernel/debug/o2dlm: > total 0 > > ocfs2_hb_ctl -I -d /dev/dm-2 > ocfs2_hb_ctl: Device name specified was not found while reading uuid > > There is no /dev/dm-2 mounted. > > > On 10/19/2011 00:27, Sun

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
e or directory > > I think i have to enable debug first somehow..? > > Laurentiu. > > On 10/19/2011 00:17, Sunil Mushran wrote: >> What does this return? >> cat >> /sys/kernel/config/cluster/CLUSTER/heartbeat/918673F06F8F4ED188DDCE14F39945F6/dev >> >> Al

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
19 00:12 local > -rw-r--r-- 1 root root 4096 Oct 19 00:12 num > > /sys/kernel/config/cluster/CLUSTER/node/ro02xsrv002: > total 0 > -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_address > -rw-r--r-- 1 root root 4096 Oct 19 00:12 ipv4_port > -rw-r--r-- 1 root root 4096 Oct 19 0

Re: [Ocfs2-users] Unable to stop cluster as heartbeat region still active

2011-10-18 Thread Sunil Mushran
ls -lR /sys/kernel/config/cluster What does this return? On 10/18/2011 02:05 PM, Laurentiu Gosu wrote: > Hi, > I have a 2 nodes ocfs2 cluster running UEK 2.6.32-100.0.19.el5, > ocfs2console-1.6.3-2.el5, ocfs2-tools-1.6.3-2.el5. > My problem is that all the time when i try to run /etc/init.d/o2cb
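
A rough sketch for narrowing that listing down to heartbeat regions, assuming the configfs layout quoted elsewhere in this thread (cluster/CLUSTER/heartbeat/UUID/dev). The function name is hypothetical, and the root directory is a parameter so the function can be tried against a copy of the tree.

```shell
#!/bin/sh
# Hypothetical helper: print "REGION DEV" for every heartbeat region found
# under a configfs cluster root (default: /sys/kernel/config/cluster).
list_hb_regions() {
    root=${1:-/sys/kernel/config/cluster}
    for region in "$root"/*/heartbeat/*/; do
        # An unmatched glob stays literal; skip anything that is not a directory.
        [ -d "$region" ] || continue
        name=$(basename "$region")
        dev=""
        if [ -r "${region}dev" ]; then
            dev=$(cat "${region}dev")
        fi
        printf '%s %s\n' "$name" "$dev"
    done
}
```

A region that is listed here but has no mounted volume behind it is the kind of leftover reference this thread is about.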

Re: [Ocfs2-users] Partition table crash, where can I find debug message?

2011-10-12 Thread Sunil Mushran
extent of the corruption... (not crash) On 10/12/2011 10:51 AM, Sunil Mushran wrote: Hard to say. You'll need to investigate the extent of the crash. On 10/12/2011 10:49 AM, Frank Zhang wrote: Sorry, it's not power outage, it's just a normal reboot. Is this serious to co

Re: [Ocfs2-users] Partition table crash, where can I find debug message?

2011-10-12 Thread Sunil Mushran
outage yesterday so they rebooted it. Given that it was under heavy usage because of the many VMs running on it, I guess this may be the cause. Now I am trying to recover it. From: Sunil Mushran [mailto:sunil.mush...@oracle.com] Sent: Wednesday, Octobe

Re: [Ocfs2-users] Partition table crash, where can I find debug message?

2011-10-12 Thread Sunil Mushran
Not sure what you mean by a partition table crash. Is it that someone overwrote the partition table on the iscsi server? That's what it looks like. If mount cannot detect the fs type, then it means at least superblock corruption. And such corruptions are typically caused by external entities. Stray dd
