Re: [Ocfs2-users] OCFS2 and berkeley database files

2006-12-06 Thread Sunil Mushran
ocfs2 supports private mmap r/w and shared mmap readonly. Shared mmap writeable is the only piece missing. We should have that by 1.4. Alexei_Roudnev wrote: There was a clear answer, WHY it did not work on OCFSv2: - BerkeleyDB and LDAP use mmap on the files; - OCFSv2 doesn't implement it (becau

Re: [Ocfs2-users] OCFS2 and berkeley database files

2006-12-06 Thread Sunil Mushran
ssage - From: "Sunil Mushran" <[EMAIL PROTECTED]> To: "Alexei_Roudnev" <[EMAIL PROTECTED]> Cc: <[EMAIL PROTECTED]>; "Michael Wood" <[EMAIL PROTECTED]>; Sent: Wednesday, December 06, 2006 1:47 PM Subject: Re: [Ocfs2-users] OCFS2 and berkeley data

Re: [Ocfs2-users] DMesg error on startup ...

2006-12-07 Thread Sunil Mushran
4o2cb. *From:* [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] *On Behalf Of *Yuval Baruch *Sent:* Thursday, December 07, 2006 4:06 PM *To:* Alexei_Roudnev *Cc:* Sunil Mushran; ocfs2-users@oss.oracle.com; [EMAIL PROTECTED] *Subject:* Re: [Ocfs2-users] DMesg error on startup ... thanks for you offer

Re: [Ocfs2-users] OCFS2 1.2.4: what is the release date?

2006-12-12 Thread Sunil Mushran
Sorry for the delay. We are investigating a problem we encountered when we tested the patch with 2.6.19 (mainline). We are still investigating whether it is limited to mainline or is something that can also occur on the supported kernels. (It could just be that we were lucky in our testing.) I

[Ocfs2-users] ocfs2 1.2.4-0.1 (Preliminary drop)

2006-12-13 Thread Sunil Mushran
All, http://oss.oracle.com/~smushran/.ocfs2-1.2.4-0.1/ This is for users interested in the preliminary patch. The main issue (not releasing the memory) has been ironed out but there are some small races we still need to plug. If users want packages for different kernels/arches, the tarball has

Re: [Ocfs2-users] Kernel Panic - OCFS2

2006-12-14 Thread Sunil Mushran
Increase default disk heartbeat timeout. For more refer to the FAQ. HAWKER, Dan wrote: Hi All, Have just installed OCFS2 as a test on a server I am planning on rolling out a clusterFS onto. The server(s) will be running FC5 and are standard HP DL360 G5's, have a Qlogic iSCSI HBA inside (QLA401

Re: [Ocfs2-users] re: different availability requirements for multiple ocfs2 volumes?

2006-12-14 Thread Sunil Mushran
Peter Santos wrote: The backup volume availability requirements shouldn't be so strict because if the volume is not available, I would just rather have my RMAN backup fail, and not have the machine reboot .. just because of this volume alone. My question is, is

Re: [Ocfs2-users] another node is heartbeating in our slot!

2006-12-18 Thread Sunil Mushran
As per the config, your node names are 'san' and 'mail'. Are the names the same as the hostname? Do on both nodes: # for i in /config/cluster/san/node/*/local ; do LOCAL=`cat $i`; if [ $LOCAL -eq 1 ] ; then echo $i; fi; done; You should see /config/cluster/san/node/mail/local on mail and /conf
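The one-liner in this message can be unpacked into a small function. This is only a sketch, assuming the configfs layout quoted above (`/config/cluster/<cluster>/node/<node>/local`); the function name `local_nodes` and its argument are illustrative and not part of the o2cb tools.

```shell
# Print every "local" attribute file whose value is 1.
# $1 is the cluster's node directory, e.g. /config/cluster/san/node
# (path taken from the message; the function name is hypothetical).
local_nodes() {
    for f in "$1"/*/local; do
        [ -r "$f" ] || continue              # skip an unmatched glob or unreadable file
        if [ "$(cat "$f")" -eq 1 ] 2>/dev/null; then
            echo "$f"
        fi
    done
}
```

Running `local_nodes /config/cluster/san/node` on each node should print exactly one path, and that path should name the node it was run on.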

Re: [Ocfs2-users] re: is it possible for the o2cb stack to monitor multiple "clusternames" on the same box

2006-12-20 Thread Sunil Mushran
Currently it supports only one cluster. Peter Santos wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Folks, When I installed ocfs2 the first time and setup oracle to work with it, the clustername defaulted to "ocfs2". We are testing adding new nodes, but we would like to

Re: [Ocfs2-users] re: is it possible for the o2cb stack to monitor multiple "clusternames" on the same box

2006-12-20 Thread Sunil Mushran
never be part of that domain. Sunil Mushran wrote: Currently it supports only one cluster. Peter Santos wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Folks, When I installed ocfs2 the first time and setup oracle to work with it, the clustername defaulted to "ocfs2"

Re: [Ocfs2-users] o2net_connect_expired

2006-12-28 Thread Sunil Mushran
In short, node 1's attempt to connect to node 0 failed. See the messages file on node 0. If there is no evidence of a connect request, check if you have some sort of firewall setup. If so, open up the required port(s). You could also use tcpdump to trap traffic on the required port / ethernet int

Re: [Ocfs2-users] Problem installing OCFS 1.2.3

2007-01-04 Thread Sunil Mushran
Refer to the FAQ. The module's kernel version should match the kernel version. Lin Shen (lshen) wrote: Hi, The kernel I'm using is: [EMAIL PROTECTED] Desktop]# uname -a Linux cfs2 2.6.9-42.7.ELsmp #1 SMP Tue Sep 5 18:29:39 EDT 2006 i686 i686 i386 GNU/Linux So I installed ocfs2-2.6.9-42.ELsm

Re: [Ocfs2-users] Problem installing OCFS 1.2.3

2007-01-04 Thread Sunil Mushran
depmod -a ? Lin Shen (lshen) wrote: Switched the kernel to 2.6.9-42.Elsmp, still got the same error. [EMAIL PROTECTED] Desktop]# uname -a Linux cfs2 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:27:17 EDT 2006 i686 i686 i386 GNU/Linux -Original Message- From: Sunil Mushran [mailto:[EMAIL

Re: [Ocfs2-users] Problem installing OCFS 1.2.3

2007-01-04 Thread Sunil Mushran
: I don't see configfs.ko. [EMAIL PROTECTED] Desktop]# ls /lib/modules/2.6.9-42.ELsmp/kernel/fs/ocfs2/ ocfs2_dlmfs.ko ocfs2_dlm.ko ocfs2.ko ocfs2_nodemanager.ko -Original Message- From: Sunil Mushran [mailto:[EMAIL PROTECTED] Sent: Thursday, January 04, 2007 12:08 PM To

Re: [Ocfs2-users] Problem installing OCFS 1.2.3

2007-01-04 Thread Sunil Mushran
as TIPC? o2net code is pretty well contained and isolated. while we have discussed tipc, not sure if we ever gave it a serious look. lin -Original Message----- From: Sunil Mushran [mailto:[EMAIL PROTECTED] Sent: Thursday, January 04, 2007 1:21 PM To: Lin Shen (lshen)

Re: [Ocfs2-users] Problem installing OCFS 1.2.3

2007-01-04 Thread Sunil Mushran
theoretically yes... but for practical usage go with at least iscsi Lin Shen (lshen) wrote: So w/o shared disk, is it possible to make OCFS2 to work by utilizing GNBD or etc? lin -Original Message- From: Sunil Mushran [mailto:[EMAIL PROTECTED] Sent: Thursday, January 04, 2007 2

Re: [Ocfs2-users] update on o2net_idle_timer

2007-01-04 Thread Sunil Mushran
That and also we've seen similar issues with Broadcom TG3 drivers. We use Intel E1000 mostly and thus did not experience the same issue. As far as the configurable net timeouts go, the patch was added into mainline on Dec 4th. So it will be available with ocfs2 1.4. We are still seeing if we ha

Re: [Ocfs2-users] configfs module question

2007-01-04 Thread Sunil Mushran
Ah yes... you want o2cb service script to be smarter. :) Edit /etc/init.d/o2cb and remove load_module configfs from the following line. It wouldn't make it smarter... just would get it to work. LOAD_ACTIONS=("load_module configfs" "mount_fs configfs "'$(configfs_path)'

Re: [Ocfs2-users] Kernel panic - not syncing: ocfs2 is very sorry

2007-01-05 Thread Sunil Mushran
A lot of ink has been spilled on this subject. ;) Check out the heartbeat section in the FAQ. One easy solution is to increase the hb timeout to 60 secs... O2CB_HEARTBEAT_THRESHOLD = 31 We are leaning towards making that number the default in the 1.4 release. George Liu wrote: Both systems cras
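The figure quoted here (31 for a 60 sec timeout) fits the relation threshold = timeout / 2 + 1, i.e. a 2 sec heartbeat interval. The sketch below assumes that relation; the helper name `hb_threshold` is made up for illustration, so check the FAQ for the authoritative formula.

```shell
# Convert a desired disk heartbeat timeout (in seconds) to a threshold,
# assuming threshold = timeout / 2 + 1 (inferred from "60 secs -> 31").
hb_threshold() {
    echo $(( $1 / 2 + 1 ))
}

hb_threshold 60     # prints 31
hb_threshold 120    # prints 61
```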

Re: [Ocfs2-users] mount error

2007-01-09 Thread Sunil Mushran
You are using two different versions of ocfs2 on the two nodes. Different enough that they are not network compatible. It is working as designed. Consulente3 wrote: Hi, I'm new to ocfs2, and in my test's environment, i have: 2 node, becks and vaix becks can mount ocfs2 fs, but vaix can't. Whe

Re: [Ocfs2-users] block size - cluster size

2007-01-16 Thread Sunil Mushran
Arkadiy Kulev wrote: Hello ocfs2-users, I have 2 questions: 1. I forgot the block size and cluster size that I chose during formatting of the drive. Is there any way I can find it out afterwards? # debugfs.ocfs2 -R "stats" /dev/sdX Look for Cluster Size bits and Block Size bits. 9
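Since debugfs.ocfs2 reports these sizes as bit counts, the size in bytes is 2 raised to that number. A small sketch of the conversion; `bits_to_bytes` is a hypothetical helper, not a tool shipped with ocfs2-tools.

```shell
# Convert a "bits" value (log2 of the size) into bytes.
bits_to_bytes() {
    echo $(( 1 << $1 ))
}

bits_to_bytes 9     # prints 512
bits_to_bytes 12    # prints 4096
```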

Re: [Ocfs2-users] OCFS2 crash

2007-01-16 Thread Sunil Mushran
Looks to be running out of lowmem. # date # cat /proc/meminfo # cat /proc/slabinfo Run a script that dumps the above every 1 to 5 mins. That should help explain the cause. Brian Sieler wrote: Using 2-node clustered file system on DELL/EMC SAN/RHEL 2.6.9-34.0.2.ELsmp x86_64. Config: O2CB_HEAR
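A script along the lines suggested here could look like the following. It is a sketch only: the function name `snapshot_mem` and the log path `/tmp/memstats.log` are assumptions, and reading /proc/slabinfo may require root on some systems.

```shell
# Append a timestamp plus the contents of the given files to a log.
# Usage: snapshot_mem LOGFILE FILE...
snapshot_mem() {
    log=$1; shift
    {
        date
        for src in "$@"; do
            echo "== $src"
            cat "$src" 2>/dev/null || echo "(unreadable: $src)"
        done
    } >> "$log"
}

# Example loop, commented out so that sourcing this file has no side effects:
# while true; do
#     snapshot_mem /tmp/memstats.log /proc/meminfo /proc/slabinfo
#     sleep 300    # every 5 minutes
# done
```

Appending to a single log keeps the samples in time order, which makes it easier to see which slab grows before the crash.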

Re: [Ocfs2-users] OCFS2 crash

2007-01-17 Thread Sunil Mushran
Could be. But I cannot say for sure until I get the slab/mem data. Brian Sieler wrote: Does this appear to be the same issue as the "OOM Killer" issue previously reported that would be fixed with ocfs2 1.2.4? On 1/16/07, Sunil Mushran <[EMAIL PROTECTED]> wrote: Looks

[Ocfs2-users] ocfs2-1.2.4 RC2 released

2007-01-17 Thread Sunil Mushran
All, http://oss.oracle.com/~smushran/.ocfs2-1.2.4-0.2/ The final 1.2.4 should look very close to this drop. We still have one slippery issue open that we are working on. But, other than that, this drop is looking good. The list of patches added post 1.2.4-0.1 is as follows: r2948: fs - Allow d

Re: [Ocfs2-users] ocfs2 keeps fencing all my nodes

2007-01-18 Thread Sunil Mushran
1. In SLES10, the /config has been moved to /sys/kernel/config. That's how it is on mainline. 2. To monitor heartbeat do: # watch -d -n2 debugfs.ocfs2 -R "hb" /dev/sdX This command will work if you have ocfs2-tools 1.2.2. (Not sure whether sles10 ships with 1.2.2 or 1.2.1.) If 1.2.1, do: # watc

Re: [Ocfs2-users] does nfs work with ocfs2? "fh buffer is too small for encoding"

2007-01-19 Thread Sunil Mushran
http://oss.oracle.com/projects/ocfs2/dist/documentation/ocfs2_faq.html#NFS John Lange wrote: I am attempting to export a 4 node ocfs2 file system via NFS without much success. When a client mounts the export file system, /var/log/messages spews thousands of errors as follows: kernel: (11514,0)

Re: [Ocfs2-users] ocfs2_cdsl_follow_link errors

2007-01-22 Thread Sunil Mushran
#define EACCES 13 /* Permission denied */ The messages are harmless. Patch to silence them has already been checked into the 1.2 repo and mainline git. Matthew Flusche wrote: I’m seeing the following errors in my two node cluster. Is this anything to be concerned with? Host information: Red

Re: [Ocfs2-users] kernel panic - not syncing

2007-01-22 Thread Sunil Mushran
o2net timeout cannot cause the o2hb panic. The two are totally different. From the outputs, I would guess o2hb is timing out but I cannot say for sure until I see the full logs. Andy Phillips wrote: It's worth pointing out that the o2net idle timer is triggering on the network heartbeat, whi

Re: [Ocfs2-users] kernel panic - not syncing

2007-01-22 Thread Sunil Mushran
rror messages usually follow. If I'm wrong, please email me directly and help sort out my understanding. Andy On Mon, 2007-01-22 at 10:38 -0800, Sunil Mushran wrote: o2net timeout cannot cause the o2hb panic. The two are totally different. From the outputs, I would guess o2hb is timing

Re: [Ocfs2-users] ocfs2 kernel bug in Fedora Core 4 update kernel

2007-01-23 Thread Sunil Mushran
This was the lvb issue that was fixed long ago. In the 1.2 tree, it was fixed in 1.2.2. 2.6.18 should definitely have the fix for this. davide rossetti wrote: OS: Fedora Core release 4 (Stentz) KERNEL: Linux rack1.ape 2.6.17-1.2142_FC4smp #1 SMP Tue Jul 11 22:57:02 EDT 2006 i686 i686 i386 GNU/

Re: [Ocfs2-users] ocfs2 kernel bug in Fedora Core 4 update kernel

2007-01-23 Thread Sunil Mushran
ECTED]> davide rossetti wrote: On 1/23/07, *Sunil Mushran* <[EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>> wrote: This was the lvb issue that was fixed long ago. In the 1.2 tree, it was fixed in 1.2.2. 2.6.18 should definitely have the fix for this. it seems it

Re: [Ocfs2-users] kernel panic - not syncing

2007-01-23 Thread Sunil Mushran
o2hb is timing out because the io to the device is taking too much time. Not much one can do other than increase the time out. Say 2mins. O2CB_HEARTBEAT_THRESHOLD = 61 Consulente3 wrote: I can reproduce it, every time on heavy IO I have read this FAQ: I encounter "Kernel panic - not syncing:

Re: [Ocfs2-users] unable to configure O2CB_HEARTBEAT_THRESHOLD

2007-01-24 Thread Sunil Mushran
The o2cb script fix is in ocfs2-tools 1.2.2 released Oct 2006. Ping SUSE for the update. [EMAIL PROTECTED] wrote: Using SuSE SP2 Linux running V1.0.8 of OCFS2 and the tools/console that comes with SP2 distribution. I am unable to set the O2CB_HEARTBEAT_THRESHOLD parameter in the /etc/sysc

Re: [Ocfs2-users] ocfs2 kernel bug in Fedora Core 4 update kernel

2007-01-24 Thread Sunil Mushran
This is not a fs issue. As in the file must be alright. This is a dlm issue. The fs is asking the dlm to free the lock and the dlm is stuck. How many nodes do you have? We've fixed a bunch of dlm bugs since what you appear to be running. davide rossetti wrote: I rebooted the two faulty nodes. no

[Ocfs2-users] OCFS2 1.2.4-2 released

2007-02-02 Thread Sunil Mushran
All, We are pleased to announce the release of OCFS2 1.2.4-2. This release addresses the lowmem consumption issue that has plagued many users. It also addresses a few races in the dlm relating to the lockres migration. The complete list of changes post 1.2.3 is available here: http://oss.oracle

Re: [Ocfs2-users] OCFS2 mount problem

2007-02-05 Thread Sunil Mushran
It could be that the device name is not the same across the two nodes. Do: # mounted.ocfs2 -d on both nodes. Match the device using the uuid. As in, you should see a device with the same uuid on both nodes. If not, then the device is not shared. If you do see the device on both nodes but with di
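Matching by UUID can be scripted once the output of `mounted.ocfs2 -d` from each node has been saved to a file. A rough sketch, assuming the output is whitespace-separated with one header line and the UUID in the third column (verify this against your tools version); `shared_uuids` and the file names are hypothetical.

```shell
# Print the UUIDs that appear in both saved outputs.
# Usage: shared_uuids NODE1_OUTPUT NODE2_OUTPUT
# Assumes one header line and the UUID in column 3.
shared_uuids() {
    {
        awk 'NR > 1 { print $3 }' "$1" | sort -u
        awk 'NR > 1 { print $3 }' "$2" | sort -u
    } | sort | uniq -d
}
```

For example, `shared_uuids node1.txt node2.txt` prints one line per volume visible on both nodes; empty output would suggest the device is not shared.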

Re: [Ocfs2-users] OCFS2 mount problem

2007-02-05 Thread Sunil Mushran
The device needs to be shared. As in, both nodes need to be able to see the same device concurrently. Refer to iscsi, fiber channel, aoe, etc. aibolit 66 wrote: -Original Message- From: Sunil Mushran <[EMAIL PROTECTED]> To: aibolit 66 <[EMAIL PROTECTED]> Date: Mon, 05 Feb 2

Re: [Ocfs2-users] OCFS2 1.2.4-2 released

2007-02-06 Thread Sunil Mushran
m: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sunil Mushran Sent: Friday, February 02, 2007 8:29 PM To: ocfs2-announce@oss.oracle.com; ocfs2-users Subject: [Ocfs2-users] OCFS2 1.2.4-2 released All, We are pleased to announce the release of OCFS2 1.2.4-2. This release addresses the lowmem

Re: [Ocfs2-users] OCFS2 1.2.4-2 released

2007-02-06 Thread Sunil Mushran
That's the source. Randy Ramsdell wrote: Mark Fasheh wrote: On Tue, Feb 06, 2007 at 10:18:51AM -0500, Randy Ramsdell wrote: Is source available? http://oss.oracle.com/projects/ocfs2/dist/files/source/v1.2/ocfs2-1.2.4.tar.gz --Mark -- Mark Fasheh Senior Software

Re: [Ocfs2-users] ocfs2-tools-1.2.2 compile.

2007-02-06 Thread Sunil Mushran
The following patch will address this issue. The fix will be provided with the next tools release. Index: libocfs2/include/ocfs2.h === --- libocfs2/include/ocfs2.h(revision 1269) +++ libocfs2/include/ocfs2.h(revision 1270) @

Re: [Ocfs2-users] Network 10 sec timeout setting?

2007-02-07 Thread Sunil Mushran
Means there was a network hiccup that caused Node 1 to fence itself. The problem is that our default timeout is too low. We have already addressed this in mainline and are looking to add that patch into 1.2.5. I am unclear as to your last qs. Randy Ramsdell wrote: Hi, Ok I'll try this again si

Re: [Ocfs2-users] 1.3.3 mount problem

2007-02-07 Thread Sunil Mushran
The datavolume code is not in mainline. But you should be able to get Oracle RDBMS to work with it. Ensure the init.ora parameter filesystemio_options is set to directio. Ivo Maya wrote: Hi, I need to mount ocfs2 with datavolume option on open SuSE 10.2 Machines. ocfs2 is 1.3.3 version and doe

Re: [Ocfs2-users] Network 10 sec timeout setting?

2007-02-07 Thread Sunil Mushran
There are two heartbeats in OCFS2. One on disk and the other on the network. Randy Ramsdell wrote: Sunil Mushran wrote: Means there was a network hiccup that caused Node 1 to fence itself. The problem is that our default timeout is too low. We have already addressed this in mainline and are

Re: [Ocfs2-users] 1.2.4 symbols

2007-02-09 Thread Sunil Mushran
What does dmesg say? Randy Ramsdell wrote: Hi, Everything compiled correctly for the ocfs2 package, but so far the modules will not load with the "well known" module symbol error. FATAL: Error inserting ocfs2 (/lib/modules/2.6.16.27-0.6-smp/kernel/fs/ocfs2/ocfs2.ko): Unknown symbol in module,

Re: [Ocfs2-users] 1.2.4 symbols

2007-02-09 Thread Sunil Mushran
Appears ocfs2 fs module is not compatible with the other modules ocfs2_dlm.ko, ocfs2_nodemanager.ko, ocfs2_dlmfs.ko, configfs.ko. When you build the modules, ensure you copy all of them in /lib/modules/... before running depmod -a. Randy Ramsdell wrote: Sunil Mushran wrote: What does

Re: [Ocfs2-users] Does OCFS2 support these features?

2007-02-09 Thread Sunil Mushran
No. Lin Shen (lshen) wrote: Hi, Can someone let me know if OCFS2 support the following features. 1. Quota. 2. POSIX ACL 3. Clustered volume manager Lin ___ Ocfs2-users mailing list Ocfs2-users@oss.oracle.com http://oss.oracle.com/mailman/listinfo/o

Re: [Ocfs2-users] 1.2.4 symbols

2007-02-09 Thread Sunil Mushran
I do 2. You could also look into adding the modules to /lib/modules/`uname -r`/update. I believe depmod searches that path before the others. Randy Ramsdell wrote: Sunil Mushran wrote: Appears ocfs2 fs module is not compatible with the other modules ocfs2_dlm.ko

Re: [Ocfs2-users] 1.2.4 symbols

2007-02-09 Thread Sunil Mushran
If a vendor is distributing ocfs2, then the vendor controls the location. The spec file in our tree is only relevant if we build the package for that distro. Randy Ramsdell wrote: Joel Becker wrote: On Fri, Feb 09, 2007 at 02:55:37PM -0500, Randy Ramsdell wrote: I found the issue. T

Re: [Ocfs2-users] Re: Am I doing something wrong?

2007-02-09 Thread Sunil Mushran
Are you using 1.2.4? If not, upgrade. We fixed a NFS -ESTALE (-116) issue in it. Though that the error occurs while mounting is a bit puzzling. Brandon Lamb wrote: On 2/9/07, Brandon Lamb <[EMAIL PROTECTED]> wrote: On 2/9/07, Brandon Lamb <[EMAIL PROTECTED]> wrote: > Hey > > So I installed a ne

Re: [Ocfs2-users] ocfs2console

2007-02-13 Thread Sunil Mushran
It's probably because you are missing some package. See the FAQ for the list of packages it is dependent on. Randy Ramsdell wrote: Hi, I see that the ocfs2console app for 1.2.2 doesn't have the same menu items as does the 1.1.0 package. Is the propagate config, check and repair going to be adde

Re: [Ocfs2-users] segfault on 1.2.4

2007-02-14 Thread Sunil Mushran
I meant the solution mentioned by Mark is listed on the ocfs2 home page. Randy Ramsdell wrote: I configured a new cluster using 1.2.4-2. This was a custom install by compiling source. Any ideas or questions? The segmentation fault produced with "rm -rf" or any other rm switch : NOTE: This doe

Re: [Ocfs2-users] segfault on 1.2.4

2007-02-14 Thread Sunil Mushran
Yes, this is mentioned on the ocfs2 home page. Randy Ramsdell wrote: I configured a new cluster using 1.2.4-2. This was a custom install by compiling source. Any ideas or questions? The segmentation fault produced with "rm -rf" or any other rm switch : NOTE: This does not happen on a local fi

Re: [Ocfs2-users] Moving data from ext3 to ocfs2 (san to san)

2007-02-15 Thread Sunil Mushran
Use cp. Or use dd with bs=1M. LOPEZ DIAZ, JORGE wrote: One server with two hba's, one to old SAN and the other to new one, mounting each fs on different points and "cp" command with convenient options? May be too slow. Jorge López Díaz Técnico de Gestión Inform

Re: [Ocfs2-users] Re: [Linux-HA] OCFS2 - Memory hog?

2007-02-15 Thread Sunil Mushran
Fixed in 1.2.4. SUSE has the patch-fix. The patch has also been added to mainline. John Lange wrote: Yes, the clients are doing lots of creates. But my question is, if this is a memory leak, why does ocfs2 eat up the memory as soon as the clients start accessing the filesystem. Within about 5-1

Re: [Ocfs2-users] 2 OCFS2 clusters that affect each other

2007-02-15 Thread Sunil Mushran
Do you have the full oops trace? Nathan Ehresman wrote: I have a strange OCFS2 problem that has been plaguing me. I have 2 separate OCFS2 clusters, each consisting of 3 machines. One is an Oracle RAC, the other is used as a shared DocumentRoot for a web cluster. All 6 machines are in an IBM

Re: [Ocfs2-users] ocfs2 with user based heartbeat

2007-02-16 Thread Sunil Mushran
That's probably dlm communication. You should be able to confirm that using ethereal/wireshark. http://oss.oracle.com/~smushran/.debug/wireshark/ Sebastian Reitenbach wrote: Hi list, I just have a quick question. We are experimenting with ocfs2 and linux heartbeat, using user based heartbeat

Re: [Ocfs2-users] relation between ocfs2 1.2.4-2 and kernel.org GIT HEAD

2007-02-20 Thread Sunil Mushran
OCFS2 has two trees. The 1.2 tree and the git tree. All new development happens on git head. All bug fixes are typically worked on the tree that it was detected on. Later, the bug fix is applied to the other tree. As most of our users are using the 1.2 tree, almost all bug fixes flow from the 1.2

Re: [Ocfs2-users] Performance Problems while reading

2007-02-21 Thread Sunil Mushran
And you are convinced that drbd's primary-primary is not the cause for the slowdown? Egon Burgener wrote: Hi all We are using a 2 node cluster with drbd 8 (primary/primary state) and ocfs2. Reading a file on one node while it will be written on the other node is very slow. Reading a file on

Re: {Spam?} Re: [Ocfs2-users] Performance Problems while reading

2007-02-23 Thread Sunil Mushran
Egon Burgener wrote: And you are convinced that drbd's primary-primary is not the cause for the slowdown? Yes, writing a file is fast. Reading a file has no influence on drbd. We noticed, that reading a big file on one node while the other node opened that file in RW mode but without wr

Re: [Ocfs2-users] OCFS 1.2.4 memory problems still?

2007-02-23 Thread Sunil Mushran
Start monitoring /proc/slabinfo and /proc/meminfo. Dump it to a file every 5-10 mins. Which version of the rhel4 kernel are you on: uname -a? Cline, Ernie wrote: I have a 2 node cluster of HP DL380G4s. These machines are attached via scsi to an external HP disk enclosure. They run 32bit RH A

Re: [Ocfs2-users] OCFS 1.2.4 memory problems still?

2007-02-23 Thread Sunil Mushran
t run over the weekend. -Original Message- From: Sunil Mushran [mailto:[EMAIL PROTECTED] Sent: Friday, February 23, 2007 2:25 PM To: Cline, Ernie Cc: ocfs2-users@oss.oracle.com Subject: Re: [Ocfs2-users] OCFS 1.2.4 memory problems still? Start monitoring /proc/slabinfo and /proc/meminfo.

Re: [Ocfs2-users] High on buffers and deep on swap

2007-02-23 Thread Sunil Mushran
Hmmm... the last time I saw your numbers, ocfs2's footprint was 15M. You'll have to do better than that. In any case, Luis's problem is the relationship between swap and cached buffers. Which kernel is this? John Lange wrote: It seems that ocfs has an unfixed memory leak even in the most recent vers

Re: [Ocfs2-users] dlm timeouts and following errors -112

2007-02-26 Thread Sunil Mushran
Yes, the messages are related. -112 is EHOSTDOWN. Sebastian Reitenbach wrote: Hi list, I am experimenting with ocfs2 (rpm package: 1.2.2-0.2), using linux-ha 2.0.8 (all running on a SLES 10 x86-64, rpm packages from linux-ha.org) for the heartbeat. The three nodes are connected on a gigabit s

Re: [Ocfs2-users] High on buffers and deep on swap

2007-02-26 Thread Sunil Mushran
er is 2.6.9-34.ELsmp (RedHat 4.0) as this is our production cluster and we did not update it recently. Regards, Luis */Sunil Mushran <[EMAIL PROTECTED]>/* wrote: Hmmm... the last time I saw your numbers, ocfs2's foot print was 15M. You'll have to do better than that.

Re: [Ocfs2-users] Problems with ocfs2 when rebooting the first node.

2007-02-26 Thread Sunil Mushran
Check out this bug: http://oss.oracle.com/bugzilla/show_bug.cgi?id=854 José Costa wrote: Hello, I'm using 2.6.16.41-SLES10_SP1_BRANCH_20070220135926-smp with OCFS2 1.2.4. If I start the node1 and then the node2... everything works. If I reboot the node1, it gives this error to node2 and I ca

Re: [Ocfs2-users] OCFS2 and LVM2

2007-02-27 Thread Sunil Mushran
Alexei_Roudnev wrote: (Btw, some Oracle engineers are saying that they are going to drop OCFSv2 certification.) I would rather you not use this channel to spread FUD.

Re: [Ocfs2-users] growing a ocfs2 filesystem

2007-02-28 Thread Sunil Mushran
Means that the volume is in use on at least one node in the cluster. If you were using the native o2cb heartbeat, the following command would have shown the heartbeating node. # watch -d -n2 "debugfs.ocfs2 -R \"hb\" /dev/sdX" Ping SUSE to find out the details when using ocfs2 with linux-ha. Seb

Re: [Ocfs2-users] Few panics with OCFSv2, SLES9 Sp3, kernel 282

2007-03-01 Thread Sunil Mushran
This specific issue was long addressed in ocfs2 1.2.2. The fix is available on sles9 with 2.6.5-7.283. Alexei_Roudnev wrote: Saw it few times until I unmounted FS on all nodes, run fsck (show nothing) and then mounted back: Do we have any errors/bugs, explaining this: Mar 1 06:26:42 spproddoc0

Re: [Ocfs2-users] data volume option, is it present in current version of ocfs2

2007-03-02 Thread Sunil Mushran
If you want to use the mainline version of ocfs2, use a raw partition for the voting disk and ocr. For the db, specify filesystemio_options=directio in init.ora. nirmal tom wrote: hi, i am trying to install oracle rac on fedora core 6 through iscsi. when i try to mount, mount.ocfs2: Invalid ar

[Ocfs2-users] OCFS2 Tools 1.2.3 released

2007-03-02 Thread Sunil Mushran
All, We are pleased to announce the release of OCFS2 Tools 1.2.3. This release is fully compatible with the OCFS2 1.2.x and the OCFS2 bundled with the mainline Linux kernel 2.6.20 (and earlier). The summary of changes in this release is as follows: * Backup super block support added * Local mou

Re: [Ocfs2-users] Strange problems (deadlock) in ocfs2 (rpm 1.2.4-2 and svn 2982) - dlm related?

2007-03-05 Thread Sunil Mushran
How many nodes in the cluster? Marcus Alves Grando wrote: Hi list, I have some problems testing ocfs2. My test consist in: #server1: dd if=/dev/random of=/ocfs2_1/test & #server1: dd if=/dev/random of=/ocfs2_2/test & #server1: dd if=/dev/random of=/ocfs2_3/test & ... #server1: dd if=/dev/rando

Re: [Ocfs2-users] Strange problems (deadlock) in ocfs2 (rpm 1.2.4-2 and svn 2982) - dlm related?

2007-03-05 Thread Sunil Mushran
N-COLUMN Stop the tcpdumps and make them available to me via some ftp site or whatever. Also, file a bugzilla for tracking purposes. Marcus Alves Grando wrote: Sunil Mushran wrote: How many nodes in the cluster? Four. Marcus Alves Grando wrote: Hi list, I have some problems testing ocfs2.

Re: [Ocfs2-users] ocfs2 is still eating memory

2007-03-05 Thread Sunil Mushran
Well, kswapd is supposed to flush the caches. As in, the vm controls the lifetime of the inodes in the inode_cache not ocfs2. All ocfs2 can do is free the memory associated with the inode when asked to. And it does that when you manually flush the cache. Qs is why the vm is not doing it on its ow

Re: [Ocfs2-users] ocfs2 is still eating memory

2007-03-08 Thread Sunil Mushran
't have much hopes that it works., More likely it will work well since SLES10 SP3, as usual (all, SLES9 and SLES8, became 100% reliable starting with SP3 approximately). - Original Message - From: "John Lange" <[EMAIL PROTECTED]> To: "Sunil Mushran" <[

Re: [Ocfs2-users] ocfs2 is still eating memory

2007-03-08 Thread Sunil Mushran
If you are running a prod shop, you should look into buying support. John Lange wrote: On Mon, 2007-03-05 at 13:46 -0800, Sunil Mushran wrote: Well, kswapd is supposed to flush the caches. As in, the vm controls the lifetime of the inodes in the inode_cache not ocfs2. All ocfs2 can do

Re: [Ocfs2-users] ocfs2 is still eating memory

2007-03-09 Thread Sunil Mushran
hit harsh and disrespectful to say the least. Which is never really appreciated. A little bit of respect and more constructive feedback usually goes a very long way. Everyone is trying their best. -Original Message- From: "Alexei_Roudnev" <[EMAIL PROTECTED]> To: &q

Re: [Ocfs2-users] ocfs2 cluster becomes unresponsive

2007-03-09 Thread Sunil Mushran
File a bugzilla with the messages from all three nodes. Appears node 2 went down but kept heartbeating. Strange. The messages from node 2 may shed more light. Andy Kipp wrote: We are running OCFS2 on SLES9 machines using a FC SAN. Without warning both nodes will become unresponsive. Can not acces

Re: [Ocfs2-users] ocfs2 cluster becomes unresponsive

2007-03-10 Thread Sunil Mushran
The config error I would imagine would be that you defined two different clusters, each not having the other node, and that the two nodes have the same node number in both clusters. If so, the disk hb would have detected this error. It would have spewed error messages indicating that "some other n

Re: [Ocfs2-users] ocfs2 cluster becomes unresponsive

2007-03-13 Thread Sunil Mushran
Have you tried to do alt-sysrq-t on the "dead" node? The stack traces will be of great help. Also, even though this could be the same as #819, I would still recommend filing a new bug with all the messages files. Even though that will take some of your time, it will be much easier to keep track

Re: [Ocfs2-users] data volume option, is it present in current version of ocfs2

2007-03-13 Thread Sunil Mushran
bversion from development holds support for it? Any other ways of using ocfs2? thanks for the response regards, Nirmal Tom. From: Sunil Mushran <[EMAIL PROTECTED]> To: nirmal tom <[EMAIL PROTECTED]> CC: ocfs2-users@oss.oracle.com Subject: Re: [Ocfs2-users] data volume option,

Re: [Ocfs2-users] data volume option, is it present in current versionof ocfs2

2007-03-13 Thread Sunil Mushran
ginal Message - From: "Sunil Mushran" <[EMAIL PROTECTED]> To: "nirmal tom" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, March 13, 2007 9:28 AM Subject: Re: [Ocfs2-users] data volume option, is it present in current versionof ocfs2 Just specify block devices for vot

Re: [Ocfs2-users] ocfs2 v.1.2.5 question

2007-03-15 Thread Sunil Mushran
Yes. 1.2.5 will have the configurable network timeout. Randy Ramsdell wrote: Hi, We are planning an upgrade of an ocfs2 cluster and I wanted to clarify something first. Is 1.2.5 going to include a variable to set the network timeout? This seems to be important as we have had to move processes

Re: [Ocfs2-users] ocfs2 is still eating memory

2007-03-16 Thread Sunil Mushran
mand for memory, why shouldn't caches stay in memory? That's the entire point of caches. If we start OOM killing processes due to the caches taking all the memory, that's absolutely a bug. Here is what Sunil Mushran from Oracle had to say about the issue: Well, kswapd

Re: [Ocfs2-users] re: o2hb_do_disk_heartbeat:963 ERROR: Device "sdb1" another node is heartbeating in our slot!

2007-03-16 Thread Sunil Mushran
Peter Santos wrote: "Mar 16 13:38:02 dbo3 kernel: (3712,3):o2hb_do_disk_heartbeat:963 ERROR: Device "sdb1": another node is heartbeating in our slot!" Usually there are a number of other errors, but this one was it. If this was one isolated er

Re: [Ocfs2-users] ocfs2 v.1.2.5 question

2007-03-19 Thread Sunil Mushran
e. Stephan Hendl wrote: fine, we appreciate this variable as well. Will there be a procedure for a rolling upgrade? We are using 1.2.4 in a production environment. From 1.2.3 to 1.2.4 the whole cluster had to be offline during upgrade... ;-( Stephan Sunil Mushran <[EMAIL PROTECTED]> sc

Re: [Ocfs2-users] OCFS2 + DRBD 0.81 Casch

2007-03-20 Thread Sunil Mushran
The patch fix for this missed the 2.6.20 window. The following link has all the relevant patches atop 2.6.20. http://git.kernel.org/?p=linux/kernel/git/mfasheh/ocfs2.git;a=log;h=2.6.20_fixes Apply all in order starting from one after the official 2.6.20. Incidentally the fix you require is the

Re: [Ocfs2-users] make error on 2.6.20

2007-03-22 Thread Sunil Mushran
The kernel has changed a lot. 1.2 will not build against 2.6.20. We have updated autogen to handle rhel5, but that's still 2.6.18. What are you trying to achieve? Why not use the ocfs2 modules shipped natively with fc6? If you want to run a database, specify filesystemio_options=odirect in init.ora.

Re: [Ocfs2-users] kernel: Kernel BUG at aops:249

2007-03-22 Thread Sunil Mushran
Please file a bug in oss.oracle.com/bugzilla with the full stack trace. The kernel BUG line looks wrong or at least incomplete. Is this the full trace? If you have hit this on more than one occasion, please do mention that in the bugzilla. Maybe even upload the stack trace from those instances t

Re: [Ocfs2-users] ocfs2_file_sendfile: 372 ERROR

2007-03-26 Thread Sunil Mushran
#define ECONNRESET 104 /* Connection reset by peer */
#define EPIPE 32 /* Broken pipe */
Harmless. But do file a bug. oss.oracle.com/bugzilla. We should not be printing the ERROR. It should be handled by the userspace. Stephan Hendl wrote: Hi, I'm using a 4 node cluster
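The two error numbers quoted in that reply can be checked from userspace. The following is only a minimal sketch using Python's standard errno module; the helper name describe_errno is made up for illustration, and the value 104 for ECONNRESET is specific to Linux.

```python
import errno
import os


def describe_errno(name):
    """Return (numeric code, system message) for a symbolic errno name.

    describe_errno is a hypothetical helper for illustration only.
    """
    code = getattr(errno, name)
    return code, os.strerror(code)


# Print the codes behind the two defines quoted in the reply.
for name in ("EPIPE", "ECONNRESET"):
    code, message = describe_errno(name)
    print(f"{name} = {code}: {message}")
```

Both codes describe a peer that closed or reset the connection, which is consistent with the reply calling the message harmless.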

Re: [Ocfs2-users] OCFS2 has a likely memory leak. Bug 864

2007-03-27 Thread Sunil Mushran
You'll run into the size-256 slab explosion on sles9 sp3. That issue was addressed in 1.2.4. sp3 ships 1.2.3. Alexei_Roudnev wrote: OCFSv2 @ SLES9 Sp3 build 283 is relatively stable. I am running your test on 2 hosts now (create files from 2 hosts, and delete them with some delay from host1 by

Re: [Ocfs2-users] 1.2.4 still eating memory??

2007-03-29 Thread Sunil Mushran
How long does it take for the node to die? File a new bugzilla with the following info.
date >> /tmp/info.txt
iostat -x 1 3 >> /tmp/info.txt
vmstat 1 3 >> /tmp/info.txt
top -b -n 1 | head -50 >> /tmp/info.txt
ps -elf >> /tmp/info.txt
cat /pro

Re: [Ocfs2-users] HowTo recover ocfs2 in a 10g four node cluster

2007-03-30 Thread Sunil Mushran
ocfs2 init script is mounting devices listed in /etc/fstab. Check the device names. If you are mounting by device name, the name may have changed. If so, fix the device name and also look up mount by label in the docs. John E wrote: Hi All, I needed to rebuild the operating system on one of the
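The advice in that reply (check the device names in /etc/fstab and prefer mounting by label) can be sketched as a small check. This is only an illustration, not an official tool: the function name unstable_ocfs2_entries, the sample fstab text, and the label and device names in it are all hypothetical.

```python
def unstable_ocfs2_entries(fstab_text):
    """Return (device, mount_point) pairs for ocfs2 entries that are
    mounted by raw device name instead of LABEL= or UUID=.

    Hypothetical helper: raw names such as /dev/sdb1 can change after
    an OS rebuild, while a label or UUID stays with the volume.
    """
    flagged = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 3:
            continue  # not a complete fstab entry
        device, mount_point, fs_type = fields[:3]
        if fs_type == "ocfs2" and not device.startswith(("LABEL=", "UUID=")):
            flagged.append((device, mount_point))
    return flagged


# Made-up fstab content; the label and devices are assumptions.
sample = """\
LABEL=oradata /u01 ocfs2 _netdev 0 0
/dev/sdb1     /u02 ocfs2 _netdev 0 0
/dev/sda1     /    ext3  defaults 1 1
"""
print(unstable_ocfs2_entries(sample))
```

On a real node the text would come from reading /etc/fstab; any entry the function reports is a candidate for the changed-device-name problem described in the reply.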

Re: [Ocfs2-users] Catatonic nodes under SLES10

2007-04-02 Thread Sunil Mushran
In ocfs2, the default network timeouts are too low. The patch fix to make this timeout configurable is available in sles10 sp1. David Miller wrote: Good afternoon all; I'm planning on implementing a shared storage solution for a primary and backup oracle server in the near future. We can't a

Re: [Ocfs2-users] Ocfs2 1.2.5 ?

2007-04-02 Thread Sunil Mushran
The packages have been spun and have been in testing since last week. We should get the green light sometime later this week. Randy Ramsdell wrote: Hi, We were having issues with an ocfs2 cluster and I was really thinking the instability was related to ocfs2. In retrospect, it was an ocfs2 prob

Re: [Ocfs2-users] OCFS2 Fencing, then panic

2007-04-03 Thread Sunil Mushran
This is a known issue on SLES10. Ping Novell for the update. Eli Criffield wrote: Whenever I mount my shared ocfs2 volume on the second node, the primary kernel panics. I have SLES10 xen guests both able to access the same /dev/sdc1. My /etc/ocfs2/cluster.conf cluster: node_cou

Re: [Ocfs2-users] JBD: no valid journal superblock found

2007-04-04 Thread Sunil Mushran
You may want all nodes to be on the same kernel version. 2.6.5-7.283. But that's not the problem. There was a bug in tunefs.ocfs2 1.1.4. The jbd superblock was not endian correct. Was fixed shortly thereafter. You can fix this issue as follows: 1. Use yast to upgrade to ocfs2-tools 1.2.1 (or hig

Re: [Ocfs2-users] fcntl exclusive lock implementation in ocfs2

2007-04-05 Thread Sunil Mushran
ocfs2 currently lets vfs handle fcntl locking. Jeff Fookson wrote: I am currently testing ocfs2 for use in a two-node cluster that will run the Cyrus imapd and am having issues that seem to be related to occasionally long times being needed while the software blocks waiting to get a writelock v

Re: [Ocfs2-users] OCFS2 Fencing, then panic

2007-04-06 Thread Sunil Mushran
You will have to provide more information. If you have a netconsole server configured, it would have the details. Else, I would recommend you configure one to catch the messages during the fence. We have to see the reason for the fence to determine the actual problem. enohi ibekwe wrote: Is this als

[Ocfs2-users] OCFS2 1.2.5 and OCFS2 TOOLS 1.2.4 Release

2007-04-06 Thread Sunil Mushran
All, We are pleased to announce the release of OCFS2 1.2.5 and OCFS2 Tools 1.2.4. This release provides the long awaited configurable network timeout feature. This feature was originally written by Andrew Beekhof and ported to the OCFS2 1.2 tree by Jeff Mahoney, both of Novell/SUSE. The complet

Re: [Ocfs2-users] Get error on mount.ocfs2 "No such file or directory while mounting ...."

2007-04-09 Thread Sunil Mushran
mkdir /u01 or /u02 As in, it appears you are missing the mount directory. Zosen Wang wrote: I am trying to install a 2-node RAC on Linux 2.6.9-22.EL using Jeffery Hunter's paper. I am having a problem mounting the ocfs2 file system. The following is the mount command error output: [EMAIL PROTECTED] ~]
