Re: [Ocfs2-users] Server crash

2010-09-22 Thread Sunil Mushran
> On 21.09.2010 22:47, Sunil Mushran wrote: >> There should have been another message, possibly just above the "cut >> here", >> saying that there were not enough credits, or something >> about >> a running or

Re: [Ocfs2-users] Server crash

2010-09-21 Thread Sunil Mushran
There should have been another message, possibly just above the "cut here", saying that there were not enough credits, or something about a running or committing transaction. On 09/21/2010 12:38 AM, Georg Höllrigl wrote: > Hello, > > I got a crashed server when using ocfs2 on SLES10 wit

Re: [Ocfs2-users] General protection fault

2010-09-17 Thread Sunil Mushran
It is oopsing 6-8 secs after the mount. ;) The stack trace does not show ocfs2/drbd. It is pointing to slub. But you have to read that with a pinch of salt. This just could be a case of some memory being scribbled. On 09/17/2010 11:39 AM, Andre Nathan wrote: > Hello > > I have an active-active DR

Re: [Ocfs2-users] tunefs.ocfs2 -Q question

2010-09-16 Thread Sunil Mushran
On 09/16/2010 09:06 AM, Enrique Sanchez wrote: > Currently a large number of OCFS2 clusters and while the paperwork > trolls have been pretty good at keeping us in pretty decent shape, > I've been trying to create dynamic documentation of the several > servers we maintain, right now I am hitting th

Re: [Ocfs2-users] No space left on device

2010-09-07 Thread Sunil Mushran
Which kernel are you using? We have fixed this issue in mainline. We will soon have the same fix for production kernels. On 09/07/2010 02:06 PM, Todd Freeman wrote: > From reading the archives I can see this issue has been hit before but > I haven't found a resolution. > > I have a 50gb partit

Re: [Ocfs2-users] Extremely high memory usage and iowait times

2010-09-07 Thread Sunil Mushran
tion. This partition is running multipath'd (each node had two > Ethernet adapters) back into the same EL SANs. > > Anything else needed? > > -dan > > > > -Original Message- > From: Sunil Mushran [mailto:sunil.mush...@oracle.com] > Sent: Tuesday, September 07,

Re: [Ocfs2-users] Extremely high memory usage and iowait times

2010-09-07 Thread Sunil Mushran
Can you describe your usage and your setup a bit? On 09/07/2010 11:55 AM, Dan Lark wrote: > I know this has been discussed before, but I am seeing high iowait times and > the occasional deadlock between my two node OCFS2 cluster. I will be turning > on "noatime" for the mount on both nodes in a

Re: [Ocfs2-users] Servers reboot - may be OCFS2 related

2010-09-03 Thread Sunil Mushran
The stack points to netlink. ocfs2 does not use netlink. That it reproduced with ocfs2 may just mean that the particular load triggers it. That's it. On 09/03/2010 05:21 AM, Proskurin Kirill wrote: Hello. What we have: 2x Debian 5.0 x64 - 2.6.32-20~bpo50+1 from backports DRBD + OCFS2 1.4.1-1 I

Re: [Ocfs2-users] Doubts about OCFS2 Performance

2010-07-28 Thread Sunil Mushran
Can you attach some info to the bz? For example, an iostat run for a minute or more, and top output showing active processes. If the slowdown is being observed by any one process, maybe also the output of strace -p pid -T -ttt -o /tmp/out. On 07/28/2010 05:28 AM, Jeronimo Bezerra wrote: > Guys, > > comments or advi

Re: [Ocfs2-users] Too much journaling or not ?

2010-07-27 Thread Sunil Mushran
Have you tried mounting with data=writeback? On Jul 27, 2010, at 9:31 PM, wanchat padungrat wrote: > Dear all, > > Not really sure whether this is a bug or not, but we found that sometimes OCFS2 > on our system does a lot of journaling. > > (Please see screen shot below) > > As you can see, the IO

Re: [Ocfs2-users] some beginner questions

2010-07-14 Thread Sunil Mushran
ocfs2 is a shared-disk cluster file system, meaning it expects the disk/volume to be accessible by all nodes, e.g. via Fibre Channel, iSCSI, etc. On 07/14/2010 12:03 PM, Alexander Nagel wrote: > Hi, > > I'm new to the ocfs2 filesystem and I have some questions about it. > > I installed three servers according to the user g

Re: [Ocfs2-users] dlmfs_unlink errors

2010-07-12 Thread Sunil Mushran
You are attempting to remove a lock resource that is still active. This is an app bug. On 07/06/2010 11:51 AM, Charlie Sharkey wrote: I'm seeing an occasional error on the nodes from a two node cluster. Is this something I should be concerned about ? Sles10 sp2 2.6.16.60-0.34-smp x86_64

Re: [Ocfs2-users] Not able to flashcopy ocfs2 mount point

2010-06-30 Thread Sunil Mushran
tunefs.ocfs2 --cloned-volume The man page has the details. Ensure you run the command only on the cloned volume. On Jun 30, 2010, at 3:19 AM, Devender Narula wrote: Hi guys i got one Ocfs2 mount point running on RHEL 5.4 .. we are about to configuring flashcopy to take backup of it .

Re: [Ocfs2-users] df showing wrong size

2010-06-28 Thread Sunil Mushran
What version are you running? On 06/28/2010 03:46 PM, Garcia, Raymundo wrote: > So.. if it is not cleaned up.. what can we do? > > -Original Message- > From: Sunil Mushran [mailto:sunil.mush...@oracle.com] > Sent: Monday, June 28, 2010 11:04 AM > To: Patrick J. LoP

Re: [Ocfs2-users] df showing wrong size

2010-06-28 Thread Sunil Mushran
On 06/28/2010 09:37 AM, Patrick J. LoPresti wrote: > On Mon, Jun 28, 2010 at 9:29 AM, Sunil Mushran > wrote: > >> ocfs2 is a journaled file system. But it is also a clustered file system. >> So it cannot arbitrarily delete orphaned files because they could still be >

Re: [Ocfs2-users] df showing wrong size

2010-06-28 Thread Sunil Mushran
On 06/28/2010 07:57 AM, Patrick J. LoPresti wrote: > On Sun, Jun 27, 2010 at 11:17 PM, Garcia, Raymundo > wrote: > >> Hello… it was put under my attention that a partition we have in one of our >> production system was displaying wrong size with df command…. 123 GB… but in >> fact the size of

Re: [Ocfs2-users] orphan nodes

2010-06-20 Thread Sunil Mushran
No, only a fsck will remove the two inodes from the orphan dir. And until that's run, that message will be printed every 10 mins on some node in the cluster. The bug that led to this problem was fixed in 2.6.34. http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3939fda4b3

Re: [Ocfs2-users] Info on Version Upgrade

2010-06-16 Thread Sunil Mushran
As far as ocfs2 is concerned, the current version of ocfs2 1.2 is ocfs2 1.2.9. You will find the packages for your kernel on oss.oracle.com. The news section has the list of changes/bugs fixed. asmlib also has some updates. You can review the fixes to see whether an upgrade is warranted. For all

Re: [Ocfs2-users] Non-clean fsck on almost-new filesystem

2010-06-16 Thread Sunil Mushran
For issues on sles, file a bz with Novell. Make sure you list out the fs features that had been enabled. On 06/15/2010 07:08 PM, Patrick J. LoPresti wrote: > My O/S is Suse Linux Enterprise Server 11 Service Pack 1. > > My SCSI device is a hardware iSCSI RAID chassis. I have done a > variety of re

Re: [Ocfs2-users] Diagnosing some OCFS2 error messages

2010-06-14 Thread Sunil Mushran
- lopre...@gmail.com wrote: > Hello. I am experimenting with OCFS2 on Suse Linux Enterprise Server > 11 Service Pack 1. > > I am performing various stress tests. My current exercise involves > writing to files using a shared-writable mmap() from two nodes. > (Each > node mmaps and writes

Re: [Ocfs2-users] Diagnosing some OCFS2 error messages

2010-06-14 Thread Sunil Mushran
- bpkr...@gmail.com wrote: > Patrick J. LoPresti 2010-06-13 19:14: > > Hello. I am experimenting with OCFS2 on Suse Linux Enterprise > Server > > 11 Service Pack 1. > > > > I am performing various stress tests. My current exercise involves > > writing to files using a shared-writable mmap

Re: [Ocfs2-users] Problems building ocfs2-1.4.7 against Centos 5.3 and 2.6.30 kernel

2010-06-12 Thread Sunil Mushran
Yes, that's part of the tools. Tools are not kernel dependent. Just install the one for rhel5 that is downloadable from oss.oracle.com. On Jun 12, 2010, at 8:50 AM, Jeffrey Layton wrote: Got it - thanks! I'm looking for /etc/init.d/o2cb and I don't see it. Does that come with ocfs2tools?

Re: [Ocfs2-users] OCFS2 and huge (> 50TB) partitions

2010-06-11 Thread Sunil Mushran
We could remove this check. If you want this in your sles kernel, the quickest route will be via Novell. We'll need both sles and mainline patched. On Jun 11, 2010, at 5:54 PM, "Patrick J. LoPresti" wrote: > Hello. I am experimenting with OCFS2 on a brand new 10GigE iSCSI SAN. > It looks

Re: [Ocfs2-users] Heartbeat threshold and data corruption

2010-06-07 Thread Sunil Mushran
Other than a longer pause, there are no negative side effects. Processes that have the locks at the required level will continue to do io. Processes that need to upconvert a lock and need the "dead" node to respond will have to wait for the dead threshold to expire before recovery can clean out

Re: [Ocfs2-users] ocfs2 tools, bug 1255

2010-06-04 Thread Sunil Mushran
You mean the next version of 1.4 tools? Not anytime soon I'm afraid. That patch needs to be pushed to the 1.4 branch before you can build it. On 06/04/2010 10:42 AM, Ulf Zimmermann wrote: > Sunil, Tao, > > Can you tell me an estimation when a new tools package with this patch will > be available

Re: [Ocfs2-users] Tracking down hangs

2010-06-04 Thread Sunil Mushran
On 06/04/2010 07:17 AM, Andrew Robert Nicols wrote: > If the hang is only short, could it be that we're just missing the relevant > busy locks by running scanlocks too late? > Then it is not a hang. It is just slow. A hang is more permanent and is typically due to a bug in some component. A bu

Re: [Ocfs2-users] Tracking down hangs

2010-06-03 Thread Sunil Mushran
If scanlocks is clean, it is not a dlm issue. Have you tried mounting with data=writeback? With drbd, a 1G write becomes a 2G write. With ordered mode, a journal checkpoint, which is done when relinquishing a write lock, will wait on the data flush. That could be the cause of the slowdown.

Re: [Ocfs2-users] Tracking down hangs

2010-06-03 Thread Sunil Mushran
It is not a dlm issue. On 06/03/2010 07:38 AM, Andrew Robert Nicols wrote: On Thu, Jun 03, 2010 at 11:12:49AM +0100, Andrew Robert Nicols wrote: What's the best place to start looking for the cause of these hangs? I've attached the dmesg output which includes some call traces for hung threa

Re: [Ocfs2-users] debugfs.ocfs2 and Feature Incompat

2010-06-01 Thread Sunil Mushran
t was the "bottleneck" in the first place. On 06/01/2010 11:00 AM, Stefan Priebe - allied internet ag wrote: > > No not but bonnie is still as slow as before when creating and > deleting files. I thought this should be "fixed" when using indexed-dirs. > > Stef

Re: [Ocfs2-users] OCFS2 performance - disk random access time problem

2010-06-01 Thread Sunil Mushran
The kernel is old. We fixed this issue in 2.6.30. We have also backported it to the 1.4 production tree. The problem was that the inodes being created did not have locality leading to a directory having inodes that were spaced far apart from each other. The one place where it really affected perfo

Re: [Ocfs2-users] debugfs.ocfs2 and Feature Incompat

2010-06-01 Thread Sunil Mushran
.32.12 kernel. Also the disk was formatted using: > mkfs.ocfs2 --fs-feature-level=max-features -L ocfs2disk -N 10 -T mail > -v /dev/sdb > > So I thought that I can now use features like unwritten, inline-data > and indexed-dirs. > > Stefan > > Sunil Mushran wrote: >>

Re: [Ocfs2-users] debugfs.ocfs2 and Feature Incompat

2010-06-01 Thread Sunil Mushran
On 06/01/2010 04:06 AM, Stefan Priebe - allied internet ag wrote: > Feature Compat: 3 backup-super strict-journal-super > Feature Incompat: 8016 sparse extended-slotmap inline-data > metaecc xattr indexed-dirs refcount > Tunefs Incomplete: 0 > Feature RO co
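The "Feature Incompat" number in the quoted debugfs.ocfs2 output is a bitmask, and the names after it are the decoded bits. A minimal sketch of that decoding follows. It assumes the value is printed in decimal and uses bit values as I recall them from the kernel's ocfs2_fs.h, neither of which is stated in this thread. The labels for flags that do not appear in the quoted output are my own placeholders.

```python
# Bit values are an assumption (recalled from fs/ocfs2/ocfs2_fs.h), not taken
# from this thread. Labels for bits absent from the quoted output are
# illustrative placeholders, not guaranteed debugfs.ocfs2 strings.
INCOMPAT_FLAGS = {
    0x0002: "heartbeat-device",   # placeholder label
    0x0004: "resize-inprog",      # placeholder label
    0x0008: "local",              # placeholder label
    0x0010: "sparse",
    0x0020: "tunefs-inprog",      # placeholder label
    0x0040: "inline-data",
    0x0080: "userspace-stack",    # placeholder label
    0x0100: "extended-slotmap",
    0x0200: "xattr",
    0x0400: "indexed-dirs",
    0x0800: "metaecc",
    0x1000: "refcount",
}


def decode_incompat(value):
    """Return (feature names set in value, leftover bits not in the table)."""
    names = [name for bit, name in sorted(INCOMPAT_FLAGS.items()) if value & bit]
    known = sum(bit for bit in INCOMPAT_FLAGS if value & bit)
    return names, value & ~known


if __name__ == "__main__":
    names, unknown = decode_incompat(8016)
    print(" ".join(names), "| unknown bits:", hex(unknown))
```

Under these assumptions, 8016 decodes to exactly the seven features listed in the quoted output with no bits left over, which is a reasonable sanity check on the table.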

Re: [Ocfs2-users] Failover testing problem and a heartbeat question

2010-05-26 Thread Sunil Mushran
On 05/26/2010 01:39 PM, Daniel McDonald wrote: > >> ocfs2 does not reset without a log message. Do you have netconsole >> setup? Messages logged a tick before reset can only be captured by >> netconsole/kdump etc. >> > Unfortunately no. Here are the two lines in /var/log/message prior to the

Re: [Ocfs2-users] Failover testing problem and a heartbeat question

2010-05-26 Thread Sunil Mushran
When a node dies, the cluster ops pause for the node to be first declared dead followed by recovery. Threshold governs the time it takes to declare the node dead. The higher the value, the longer the pause. ocfs2 does not reset without a log message. Do you have netconsole setup? Messages logged a
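To make the threshold/pause relationship concrete, here is a small sketch. It assumes the rule documented in the OCFS2 FAQ and user's guide as I recall it (a node is declared dead after `(threshold - 1) * 2` seconds, with the disk heartbeat firing every 2 seconds); that formula is not stated in this message, so treat it as an assumption.

```python
# Assumption: o2cb disk heartbeat interval of 2 seconds and the
# (threshold - 1) * 2 rule from the OCFS2 documentation, not from this thread.
HEARTBEAT_INTERVAL_SECS = 2


def dead_timeout_secs(threshold):
    """Seconds before a silent node is declared dead for a given threshold."""
    if threshold < 2:
        raise ValueError("threshold must be at least 2")
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECS


def threshold_for_timeout(timeout_secs):
    """Smallest threshold whose dead timeout is at least timeout_secs."""
    intervals = -(-timeout_secs // HEARTBEAT_INTERVAL_SECS)  # ceiling division
    return intervals + 1


if __name__ == "__main__":
    for threshold in (7, 31, 61):
        print(threshold, "->", dead_timeout_secs(threshold), "seconds")
```

A higher threshold tolerates longer I/O stalls before a node is evicted, at the cost of a longer cluster-wide pause when a node really does die.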

Re: [Ocfs2-users] OCFS2 ERROR: status = - 107

2010-05-26 Thread Sunil Mushran
-107 means the node lost connection with the other node. The messages below appear to be cut-and-pasted and not in sequence, so I cannot tell for sure what happened next. What should have happened is that the node would then go into quorum mode followed by recovery mode. Sunil On 05/26/2010 05:38 AM, Fran

Re: [Ocfs2-users] Support and Stability

2010-05-24 Thread Sunil Mushran
Fragmentation has been atop our dev priority list for some time now. That is, both reducing it and handling it better when the volume does get fragmented. Just last week we pushed patches for the same into the newly created 2.6.35. http://oss.oracle.com/pipermail/ocfs2-devel/2010-May/006511.html As alwa

Re: [Ocfs2-users] fsck.ocfs2 using huge amount of memory?

2010-05-20 Thread Sunil Mushran
http://oss.oracle.com/projects/ocfs2-tools/news/article_8.html We did make a related change in fsck in that release. Do you mind creating a bugzilla for this? Do mention the arch. I can then send you a debug version of the tool that'll tell us why it is behaving like that on your machine. On 05/2

Re: [Ocfs2-users] tunefs.ocfs2 resize issue,

2010-05-20 Thread Sunil Mushran
for that. The server is a Red Hat Enterprise Linux Server release 5.4 and we're using the following rpms: ocfs2-tools-1.4.3-1.el5 ocfs2-2.6.18-164.2.1.el5-1.4.4-1.el5 ocfs2console-1.4.3-1.el5 On 20/05/2010 16:32, Sunil Mushran wrote: Versions? On May 20, 2010, at 6:47 AM, ste

Re: [Ocfs2-users] tunefs.ocfs2 resize issue,

2010-05-20 Thread Sunil Mushran
Versions? On May 20, 2010, at 6:47 AM, stephane lomine wrote: Hello. Sorry if I ask a question already seen before, but I'm new to ocfs2 and was not able to find a proper answer on the web. We're using ocfs2 on a two-node system with SAN disks. We need to extend 3 of our partitions so we a

Re: [Ocfs2-users] dying ocfs2_wq thread

2010-05-17 Thread Sunil Mushran
The fs is oopsing when trying to remove an entry from the orphan dir. It could be that the orphaned inode is corrupted. You could try running fsck.ocfs2 -fy /dev/sdX. Better if you ping Novell support for assistance. On May 17, 2010, at 6:01 AM, Georg Höllrigl wrote: > Hi Folks, > > I´

Re: [Ocfs2-users] ocfs2-tools 1.6

2010-05-11 Thread Sunil Mushran
On 05/10/2010 11:53 PM, Stefan Priebe - allied internet ag wrote: > Do you mean the trunk at all? > http://oss.oracle.com/git/?p=ocfs2-tools.git;a=summary > Yes. >>> 2.) Is there a release date for 1.6? >>> >> We are planning to do a beta release for el6. No dates as yet. >> Ping Nove

Re: [Ocfs2-users] Kernel panic when deleting a file

2010-05-11 Thread Sunil Mushran
e cluster > Start ocfs2 > Mount ocfs2 disks > Start oracle cluster > > > This plan will need to be done in a quiet period of the system, otherwise the > running node will have to much pressure on it. > > Regards > Morten K > > -Opprinnelig mel

Re: [Ocfs2-users] Kernel panic when deleting a file

2010-05-10 Thread Sunil Mushran
On 05/10/2010 05:18 PM, Sunil Mushran wrote: > Upgrade to el5 u4 kernel. 2.6.18-164.el5. ocfs2 1.2 is provided for > all el5 kernels up to u4. Correction. It is provided on el5 up to u3. 2.6.18-128.el5.

Re: [Ocfs2-users] Kernel panic when deleting a file

2010-05-10 Thread Sunil Mushran
.oracle.com] *On behalf of* Sunil Mushran *Sent:* 7 May 2010 19:07 *To:* ocfs2-users@oss.oracle.com *Subject:* Re: [Ocfs2-users] Kernel panic when deleting a file Unsure why you have to build the packages when they are downloadable. Please use the packages provided on oss. On 05/07/2010 06:26 AM, Kr

Re: [Ocfs2-users] ocfs2-tools 1.6

2010-05-10 Thread Sunil Mushran
On 05/10/2010 01:33 PM, Stefan Priebe - allied internet ag wrote: > 1.) Is ocfs2-tools 1.6 considered as stable? > Mostly. We push changes to the tree only after it passes all the tests. > 2.) Is there a release date for 1.6? > We are planning to do a beta release for el6. No dates as ye

Re: [Ocfs2-users] Kernel panic when deleting a file

2010-05-07 Thread Sunil Mushran
Unsure why you have to build the packages when they are downloadable. Please use the packages provided on oss. On 05/07/2010 06:26 AM, Kristiansen Morten wrote: Hi again, This time I'll try to attach a jpeg. Regards Morten K -Original Message- From: ocfs2-users-boun...@oss.oracle.com [ma

Re: [Ocfs2-users] compile error on sles 11

2010-05-04 Thread Sunil Mushran
On 05/04/2010 02:42 AM, Werner Flamme wrote: > > Thank you. Took a while to get this posting out of spam quarantine :-( > We had problems here for the last weeks :-( > > So, on the SLES10 where I succeeded in building the packages for ocfs2 > and ocfs2-tools from source, I entered this command and

Re: [Ocfs2-users] Hardware error or ocfs2 error?

2010-04-29 Thread Sunil Mushran
Cannot say for sure. It could be a deadlock (bug) too. As in, I don't want to blame any one entity without knowing more. If it were up to me, I'd start with the dlm. See which node holds the lock that others are waiting on. Then see why that node is unable to downconvert that lock. As in, if the

Re: [Ocfs2-users] compile error on sles 11

2010-04-27 Thread Sunil Mushran
Werner Flamme wrote: > this is what I do. But since the version on the RAC server is newer than > mine, I cannot mount the filesystem (I quoted the error in a previous > mail). That's why I try to compile the sources from Oracle. I'd rather > deinstall the SLES version and switch over to the new

Re: [Ocfs2-users] compile error on sles 11

2010-04-27 Thread Sunil Mushran
Werner Flamme wrote: > Unfortunately, I do not run the Oracle RAC. The RAC runs with Oracle > Unbreakable Linux. I am sorry for that decision, but I can't change it. > > SAP does not forbid to use Oracle's version of Linux. SAP only says that > SAP systems are fully supported on RHEL or SLES only (

Re: [Ocfs2-users] compile error on sles 11

2010-04-26 Thread Sunil Mushran
Werner Flamme wrote: > For RHEL there are readily built packages, why would I build my own > packages then? The build worked fine with another SLES10. > > Yes, ocfs2 is included by SLES. Not SLES 11, of course - it is in the > separately sold "High Availability" package. But the version included in

Re: [Ocfs2-users] compile error on sles 11

2010-04-20 Thread Sunil Mushran
The 1.4 tree is only meant to be built against EL5 U2+, not SLES or any other kernel tree. SLES9/10/11 already include ocfs2. Werner Flamme wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > Hi, > > next VM, next trouble :-( > > Now I work inside a VM with SLES 11. Configuring ocfs2-1.4.7

Re: [Ocfs2-users] Problem after upgrade to 1.4.7-1 (Bad magic number in superblock while opening context for device)

2010-04-20 Thread Sunil Mushran
Your setup may not have persistent device naming. You can use blkid or mounted.ocfs2 to discover the ocfs2 devices. Marcus Alves Grando wrote: > Hello Guys, > > After upgrade to 1.4.7-1 my FS does not mount anymore. Just after > upgrade rpm, FS mount and works fine, but after reboot server, it do

Re: [Ocfs2-users] OCFS2 1.4.7-1 and OCFS2 Tools 1.4.4-1 released

2010-04-19 Thread Sunil Mushran
> [mailto:ocfs2-users-boun...@oss.oracle.com] On Behalf Of Sunil Mushran > Sent: Monday, April 19, 2010 1:08 PM > To: ocfs2-annou...@oss.oracle.com; ocfs2-users > Subject: [Ocfs2-users] OCFS2 1.4.7-1 and OCFS2 Tools 1.4.4-1 released > > All, > > We are pleased to announce the

Re: [Ocfs2-users] processes in "D" State

2010-04-19 Thread Sunil Mushran
res: M095a03 Owner: 18 State: 0x0 > Last Used: 0 ASTs Reserved: 0Inflight: 0Migration Pending: No > Refs: 3Locks: 1On Lists: None > Reference Map: > Lock-Queue Node Level Conv Cookie Refs AST BAST > Pending-Action &

[Ocfs2-users] OCFS2 1.4.7-1 and OCFS2 Tools 1.4.4-1 released

2010-04-19 Thread Sunil Mushran
All, We are pleased to announce the release of OCFS2 1.4.7-1 and OCFS2 Tools 1.4.4-1 for Oracle's and Red Hat's Enterprise Linux 5 Update 2 and higher. Oracle's Unbreakable Linux Network users who are subscribing to the "OCFS2 1.4 packages for Enterprise Linux 5" channel can upgrade to this relea

Re: [Ocfs2-users] memory leak

2010-04-15 Thread Sunil Mushran
Joel Becker wrote: > On Thu, Apr 15, 2010 at 12:31:02PM +0200, Kristiansen Morten wrote: >> I discovered our four node cluster running on RedHat EL5, Ocfs2 1.2.6 and >> Oracle 10.2.0.3 have memory leak. I suspect ocfs2, but I could be wrong. I >> suspect ocfs2 because when we run RMAN backup the

Re: [Ocfs2-users] new ocfs2 release?

2010-04-15 Thread Sunil Mushran
ons but I feel this > could become an issue again. > > David > > -Original Message- > From: ocfs2-users-boun...@oss.oracle.com > [mailto:ocfs2-users-boun...@oss.oracle.com] On Behalf Of Sunil Mushran > Sent: Thursday, April 15, 2010 12:14 PM > To: li...@svrinfor

Re: [Ocfs2-users] new ocfs2 release?

2010-04-15 Thread Sunil Mushran
We are hoping to release it any day now. Have you filed a bug about your issue? I have no recollection of any reports of such an issue. Orphan scanning has not changed in 1.4.7. File a bz. We'll need to get more information to understand the problem you are encountering. Mailing List SVR wrote: >

Re: [Ocfs2-users] fsck.ocfs2 question

2010-04-12 Thread Sunil Mushran
No, it should not be running for so long. Attach strace to it. It could be caught in a loop. Schildwachter, Xavier wrote: > > We set up four nodes connected to an iSCSI SAN with 3 ocfs2 volumes. > > Last Friday, one of the volumes was set read-only because of the > following error: > > Apr 9 01:4

Re: [Ocfs2-users] Ocfs2-users Digest, Vol 76, Issue 17

2010-04-08 Thread Sunil Mushran
Ping Oracle Support. They will be able to answer questions on OracleVM. You are seeing 4 paths because they might be multipathed. Check your iscsi configuration. Use blkid to determine which two paths are the same, e.g. /dev/sdf1: LABEL="label1" UUID="908a0229-88c3-4a0d-b6bc-38c43c6b1461" TYPE="ocfs2"
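The blkid check can be scripted: device paths that report the same UUID are the same volume seen over different paths. The sketch below parses blkid-style text and groups ocfs2 devices by UUID. The first sample line is the one from the message; the second device (`/dev/sdh1`) is made up for illustration.

```python
import re
from collections import defaultdict


def group_paths_by_uuid(blkid_output):
    """Map each ocfs2 UUID to the list of device paths that report it."""
    groups = defaultdict(list)
    for line in blkid_output.splitlines():
        device, sep, attrs = line.partition(":")
        if not sep:
            continue  # not a "device: KEY=value ..." line
        fields = dict(re.findall(r'(\w+)="([^"]*)"', attrs))
        if fields.get("TYPE") == "ocfs2" and "UUID" in fields:
            groups[fields["UUID"]].append(device.strip())
    return dict(groups)


if __name__ == "__main__":
    # /dev/sdh1 is a hypothetical second path to the same volume.
    sample = (
        '/dev/sdf1: LABEL="label1" UUID="908a0229-88c3-4a0d-b6bc-38c43c6b1461" TYPE="ocfs2"\n'
        '/dev/sdh1: LABEL="label1" UUID="908a0229-88c3-4a0d-b6bc-38c43c6b1461" TYPE="ocfs2"\n'
    )
    for uuid, paths in group_paths_by_uuid(sample).items():
        print(uuid, "->", ", ".join(paths))
```

Any UUID that maps to more than one path is a candidate for multipath configuration rather than a separate volume.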

Re: [Ocfs2-users] Kernel Panic, Server not coming back up

2010-04-05 Thread Sunil Mushran
beat=local) > > My concern was with writing status files for the other nodes to see. If > the partition was mounted read-only, would that cause another node to think > that the read-only node has failed? > > Thanks, > > Kevin > > On Mon, 05 Apr 2010 14:03:52 -0700, Sunil M

Re: [Ocfs2-users] Kernel Panic, Server not coming back up

2010-04-05 Thread Sunil Mushran
ocfs2 can handle multiple writers to the same file. Cannot say whether it is due to the io load. All I can say is that it is an io error. Unsure what /data partitions are. Sunil ke...@utahsysadmin.com wrote: > Sunil, > > Thanks for the response. Could this be triggered by both servers trying t

Re: [Ocfs2-users] Kernel Panic, Server not coming back up

2010-04-05 Thread Sunil Mushran
It is having problems doing ios to the virtual devices. -5 is EIO. ke...@utahsysadmin.com wrote: > I have a relatively new test environment setup that is a little different > from your typical scenario. This is my first time using OCFS2, but I > believe it should work the way I have it setup. > >
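Kernel messages of the form "status = -N" carry a negated errno value, which is how -5 maps to EIO. A small sketch for looking such codes up with Python's standard `errno` module follows; note that errno numbering is platform-specific, so run it on Linux to match what the kernel logs.

```python
import errno
import os


def describe_status(status):
    """Return (symbolic name, message) for a kernel-style negative status."""
    code = abs(status)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    for status in (-5, -107):
        name, message = describe_status(status)
        print(status, name, message)
```

On Linux, -107 (seen elsewhere in this listing as a lost connection between nodes) resolves to ENOTCONN the same way.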

Re: [Ocfs2-users] Ftp server... single file seems locked

2010-04-02 Thread Sunil Mushran
Attach it to a bugzilla. Don't cut-paste it. It is impossible to decipher. Jason Price wrote: > Ok. I have netconsole setup, and echo 't' > /proc/sysrq-trigger did > not cause the box to crash. > > Here's the output sent over to the other host: > > My source host setup was: > > dmesg -n 8 > > ser

Re: [Ocfs2-users] Ftp server... single file seems locked

2010-04-02 Thread Sunil Mushran
If "fs_locks -B" is empty, then the processes are not waiting on a cluster lock. Process pegged at 100% cpu means it is actively waiting to acquire a spinlock. Is the other process running? Unfortunately in EL5 there is no clean way to get the kernel stack for a process. "echo t >/proc/sysrq-t

Re: [Ocfs2-users] error while upgrading 1.2 to 1.4

2010-03-30 Thread Sunil Mushran
Where are you downloading from? Florin Andrei wrote: > ## > There was a package dependency problem. The message was: > > Unresolvable chain of dependencies: > ocfs2-2.6.18-128.el5-1.4.4-1.el5 requires ocfs2-tools >= 1.4.3 > > > The following packages we

Re: [Ocfs2-users] log files appended with NULL when node lost power

2010-03-30 Thread Sunil Mushran
Yes and without the trailing nulls. Florin Andrei wrote: > So what would happen if we use 1.4 instead? Would the logger keep > logging data and the user would be able to read the new lines normally > while node A is down? > > > On 03/30/2010 11:42 AM, Sunil Mushran wrote: >

Re: [Ocfs2-users] log files appended with NULL when node lost power

2010-03-30 Thread Sunil Mushran
What you are seeing is the result of writeback data journaling in ocfs2 1.2. In ocfs2 1.4, we default to ordered data journaling. Refer to the 1.4 user's guide for more. Florin Andrei wrote: > A and B are identical machines. Network has lots of redundancy. They > both access same OCFS2 volumes ove

Re: [Ocfs2-users] Probles resizing a partition

2010-03-30 Thread Sunil Mushran
What's the current size of the partition? The error indicates that the partition has not been resized. Mattia Gandolfi wrote: > Hi all, > > I'm a new OCFS2 user, I'm trying to implement a 2-nodes cluster with a > shared fs, and I'm facing an issue while trying to resize an existing > ocfs2 files

Re: [Ocfs2-users] Odd error on FC12 with ocfs2

2010-03-30 Thread Sunil Mushran
; mtime: 0x4a0b2372 -- Wed May 13 14:45:54 2009 > dtime: 0x0 -- Wed Dec 31 18:00:00 1969 > ctime_nsec: 0x -- 0 > atime_nsec: 0x0000 -- 0 > mtime_nsec: 0x -- 0 > Last Extblk: 0 >

Re: [Ocfs2-users] Odd error on FC12 with ocfs2

2010-03-29 Thread Sunil Mushran
No On Mar 29, 2010, at 8:10 PM, Angelo McComis wrote: > Does it matter that the nodes are numbered 1-6 instead of 0-5? > > > > On Mon, Mar 29, 2010 at 4:25 PM, Sunil Mushran > wrote: >> Enable some debugging. >> >> #debugfs.ocfs2 -l TCP allow >> .

Re: [Ocfs2-users] Odd error on FC12 with ocfs2

2010-03-29 Thread Sunil Mushran
102.141 > Connection to 192.168.102.141 port [tcp/cbt] succeeded! > > -Original Message- > From: Sunil Mushran [mailto:sunil.mush...@oracle.com] > Sent: Monday, March 29, 2010 5:08 PM > To: David Murphy > Cc: ocfs2-users@oss.oracle.com > Subject: Re: [Ocfs2

Re: [Ocfs2-users] Odd error on FC12 with ocfs2

2010-03-29 Thread Sunil Mushran
= -107 > ocfs2: Unmounting device (253,1) on (node 0) > > > > So clearly ocfs2 the service things it can connect to the node, but nmap > sees the connection just fine. And Web2 can see the port on web1 just fine, > so there is no firewall blocking the connections.

Re: [Ocfs2-users] node B reboots when node A is isolated from the network

2010-03-29 Thread Sunil Mushran
If node A is a lower node number than node B, then the behavior is correct. In a 2 node cluster, if the two nodes cannot talk to each other, the higher node number will fence itself. Also, when a node mounts a volume, it initiates connections to other live nodes. If any connection fails, the mount
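The two-node rule described here can be written down in a few lines. This sketch models only what the message states (when two nodes lose contact, the higher node number fences itself); quorum decisions in larger clusters are not covered, and the function names are my own.

```python
def node_that_fences(node_a, node_b):
    """In a 2-node cluster that lost its interconnect, pick the self-fencing node."""
    if node_a == node_b:
        raise ValueError("node numbers must be distinct")
    return max(node_a, node_b)


def surviving_node(node_a, node_b):
    """The lower node number keeps running."""
    if node_a == node_b:
        raise ValueError("node numbers must be distinct")
    return min(node_a, node_b)


if __name__ == "__main__":
    # Node A = 0, node B = 1: isolating A from the network makes B reboot.
    print("fences:", node_that_fences(0, 1), "survives:", surviving_node(0, 1))
```

This is why pulling the network cable on the lower-numbered node causes the other, still-connected node to reboot: the rule depends only on node numbers, not on which node actually lost its link.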

Re: [Ocfs2-users] Extended Attributes support

2010-03-29 Thread Sunil Mushran
I checked. The core support for xattr in tools was added in 1.4.2. We plan on pinging ubuntu to pick up tools 1.6 (when it is ready) that has xattr support enabled by default. Frank Lahm wrote: > Hi, > > although official Oracle documentation indeed suggests that EA should > not work if ocfs-tool

Re: [Ocfs2-users] download for 2.6.18-164.15.1.el5

2010-03-26 Thread Sunil Mushran
http://oss.oracle.com/projects/ocfs2/files/RedHat/RHEL5/ Devender Narula wrote: > Hi Guys > > i need ocfs2 software for 2.6.18-164.15.1.el5 RHEL 64 bit.. Can > anybody please tell me from where i can download it. > > thanks > > Devender

Re: [Ocfs2-users] ENOSPC

2010-03-26 Thread Sunil Mushran
Please file a bugzilla. This may involve more people. And it is easier to track that way. David Johle wrote: > See attachment for requested output. > > At 08:54 PM 3/24/2010, Sunil Mushran wrote: >> Quite a bit of work is ongoing on this front. I'll list all that work

Re: [Ocfs2-users] Odd error on FC12 with ocfs2

2010-03-25 Thread Sunil Mushran
hmm.. o2cb_ctl makes no connections. It just reads the cluster.conf and populates configfs. AFAIK. David Murphy wrote: > > We had 6 nodes running CentOS 5.4 using 1.4.3 ocfs2-tools. > > > > I decided to rebuild one node with FC12. > > > > > > Which is working fine, however > > > > Nmap 1

Re: [Ocfs2-users] ENOSPC

2010-03-24 Thread Sunil Mushran
Quite a bit of work is ongoing on this front. I'll list all that work in another email. Meanwhile make a bz with the stat_sysdir output. We'll need that to determine the best way forward. David Johle wrote: > So in light of prior issues with lock contention and such due to > writing apache logs

Re: [Ocfs2-users] Problems mounting shared filesystem

2010-03-22 Thread Sunil Mushran
The network connect is failing. Could be because of a firewall, or bad ip address, some switch issue. Mount the volume on node 2. Then enable tracing and tail messages file. # debugfs.ocfs2 -l TCP allow # tail -f /var/log/messages Then from node 4, ping node 2 using netcat. # nc -z 192.168.1.2 77
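Where netcat is not installed, the same reachability test can be approximated with a plain TCP connect. The sketch below is a rough stand-in for `nc -z`, not part of the ocfs2 tools. The host and port are parameters because the port in the message above is cut off; use the address and port from your own cluster.conf.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    import sys

    # Usage: python check_port.py <host> <port>
    host, port = sys.argv[1], int(sys.argv[2])
    print("reachable" if can_connect(host, port) else "not reachable")
```

A failure here while the volume is mounted on the target node points at a firewall, address, or switch problem rather than at ocfs2 itself.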

Re: [Ocfs2-users] Can't delete LV snapshot after mounting

2010-03-19 Thread Sunil Mushran
at appears to be the > reason, why I can't remove the snapshot. > > Is there any (safe) way get rid of it after unmounting? > > Thank's > Armin > > > > >> -Original Message- >> From: Sunil Mushran [mailto:sunil.mush...@oracle.com] >>

Re: [Ocfs2-users] Quota support

2010-03-19 Thread Sunil Mushran
ocfs2-tools-1.6 will have support for it. It has not been released as yet. But you are free to build it for your use. http://oss.oracle.com/git/?p=ocfs2-tools.git;a=summary Redshift wrote: > Hello, > > I am using the linux-image-2.6.32-trunk-amd64 Debian kernel with OCFS2 > "1.5.0" (at least, th

Re: [Ocfs2-users] NFS in "D" State

2010-03-18 Thread Sunil Mushran
C: 214-587-3882 > > -Original Message- > From: Sunil Mushran [mailto:sunil.mush...@oracle.com] > Sent: Thursday, March 18, 2010 1:25 PM > To: Jaquays, Michael A. > Cc: ocfs2-users@oss.oracle.com > Subject: Re: [Ocfs2-users] NFS in "D" State > > I am assum

Re: [Ocfs2-users] fsck.ocfs2 can't fix an orphaned inode

2010-03-18 Thread Sunil Mushran
One option is to provide me with the o2image of the volume. # o2image -r /dev/sda1 - | bzip2 > sda1.out.bz2 File a bugzilla and add the link to that image. (The bz cannot handle large files.) The other option is to file a bz and attach the stat_sysdir output. http://oss.oracle.com/~smushran/.debu

Re: [Ocfs2-users] NFS in "D" State

2010-03-18 Thread Sunil Mushran
I am assuming you are mounting the nfs mounts with the nordirplus mount option. If not, that is known to deadlock an nfsd thread, leading to what you are seeing. There are two possible reasons for this error. One is a dlm issue. The other is a local deadlock like the above. To see if the dlm is the cause f

Re: [Ocfs2-users] OCFS2 Multipath Configuration

2010-03-18 Thread Sunil Mushran
Yeah.. mounted is a bit dumb. In the next release, it will recognize /dev/mapper devices. We still need to teach it to handle multipathing fully. David Johle wrote: > I'm not sure about why mounted.ocfs2 is showing both the dm and the > sd devices for the same volume. But this could all be very

Re: [Ocfs2-users] Can't delete LV snapshot after mounting

2010-03-18 Thread Sunil Mushran
I am queasy about recommending such a setup to anyone. It is one thing to handle a workload. The problem is about handling user/admin errors. You are essentially running a local volume manager that is unaware of the other node. Any reconfig that is not coordinated will lead to corruption. Below that yo

Re: [Ocfs2-users] Disk access hang

2010-03-11 Thread Sunil Mushran
:1224 > Recovering node 9 from slot 7 on device (152,0) > > But the ocfs2 disk was unavailable anyway. > > Any other hint? > > Regards, > > G. > > On Wed, Mar 10, 2010 at 8:56 PM, Sunil Mushran > wrote: >> Were the first set of messages on all nodes? On that

Re: [Ocfs2-users] POSIX locks supported?

2010-03-10 Thread Sunil Mushran
To get this feature, one needs both a kernel >= 2.6.28 and a functioning userspace cluster stack. ocfs2 1.4 on (rh)el5/sles10 does not satisfy either of the two. As of now, there are two such stacks. Pacemaker from Novell and the "new" CMAN (new is my term... unsure how RH will be marketing it) f

Re: [Ocfs2-users] Disk access hang

2010-03-10 Thread Sunil Mushran
Were the first set of messages on all nodes? On that node, at least, the o2hb node down event fired. It should have fired on all nodes. This is the dlm eviction message. If they all fired, then look for a node to have a message that reads "Node x is the Recovery Master for the Dead Node y". That sho

Re: [Ocfs2-users] Filesystem ready check...

2010-03-04 Thread Sunil Mushran
It is waiting for the heartbeat timeout to trigger denoting a node death. Then it initiates recovery. With the default settings, this takes about a minute plus. This is with the o2cb cluster stack. Note not all fs ops will hang during the detection phase. Only those ops will hang that directly or

Re: [Ocfs2-users] support xattr, quota on RHEL5 of OCFS2 1.4

2010-03-03 Thread Sunil Mushran
ocfs2 release 1.4 does not have these features. These features (except quotas) will be in ocfs2 release 1.6. The 1.4 user's guide lists the features included in that release. Elia Pinto wrote: > Hi > > I'd like to know if OCFS2 1.4 on RHEL5 with the REDHAT latest kernel > supports disk quotas,

Re: [Ocfs2-users] I have questions regarding Fencing

2010-03-02 Thread Sunil Mushran
On Mar 2, 2010, at 2:52 AM, Brad Plant wrote: > On Tue, 2 Mar 2010 21:51:45 +1100 > Brad Plant wrote: > >> On Tue, 2 Mar 2010 05:09:21 -0500 >> Enrique Sanchez wrote: >> > During my test (take Node0 down cold turkey) Node1 hung pretty > badly, > is this something expected??

Re: [Ocfs2-users] prefetch?

2010-03-01 Thread Sunil Mushran
Dmitry Rybin wrote: > I have storage 2TB with 4K cluster size. Collected statistic says, > that 4K max reading block from disc. > But extX - variable read size up to 50K with cluster size 4K. > > ocfs2 have no prefetch? > > Centos 5.4, ocfs2-1.4.4-1 > Is the workload the same on the two volumes

Re: [Ocfs2-users] ocfs2 filesystem R/O in one node but not in the 2nd.

2010-02-19 Thread Sunil Mushran
The volume goes ro when it detects an on-disk corruption. I would imagine the node detected the problem with the group allocator fixed by fsck. dmesg will tell you more. If it mentions block#1709568, then fsck took care of it. Sunil Enrique Sanchez wrote: > Hello folks, > > I have a Filesystem th

Re: [Ocfs2-users] Kernel Panic ocfs2_inode_lock_full

2010-02-18 Thread Sunil Mushran
> -Original Message- > From: Sunil Mushran [mailto:sunil.mush...@oracle.com] > Sent: Thursday, February 18, 2010 11:47 AM > To: Jaquays, Michael A. > Cc: ocfs2-users@oss.oracle.com > Subject: Re: [Ocfs2-users] Kernel Panic ocfs2_inode_lock_full > > Yes, this is a k

Re: [Ocfs2-users] OCFS2 + ISCSI SAN slowdown

2010-02-18 Thread Sunil Mushran
That's a poor workload for a clustered file system because it has to master/take clustered locks for each inode. Actually multiple locks per inode. Those locks are used once and then freed. A local fs only takes a read hit. loebber...@eplan.de wrote: > Hi Guy, > > if you have several OCFS2 partit

Re: [Ocfs2-users] Kernel Panic ocfs2_inode_lock_full

2010-02-18 Thread Sunil Mushran
Yes, this is a known issue. Only occurs when nfs is in the equation. This issue has been fixed in mainline quite some time ago. We are in the process of backporting that to 1.4. michael.a.jaqu...@verizon.com wrote: > All, > > I have a 3 node cluster that is experiencing kernel panics once every fe

Re: [Ocfs2-users] 2.6.33-rc8 bug fixes -> 2.6.32

2010-02-12 Thread Sunil Mushran
Brad Plant wrote: > I noticed that there were a lot of bug fixes in 2.6.33-rc8. Just wondering if > any of these are also applicable to 2.6.32 and if they'll be merged into the > long term stable branch? > > If there are no 2.6.32 merge plans, are any of the 2.6.33 commits beneficial > to 2.6.32
