Re: [Ocfs2-users] Number of Nodes defined
Reducing node slots frees up that slot's journal and distributes the metadata the slot was tracking to the remaining slots. I am not aware of any reason why there should be an impact.

On 11/16/2011 03:07 PM, David wrote:
> I did read the man page for tunefs.ocfs2, but I didn't see anything indicating
> what the impact to the fs would be when making a change to an existing fs,
> such as reducing the node slots.
>
> Anyway, thank you for the feedback. I was able to make the changes with no
> impact to the fs.
>
> David
>
> On 11/16/2011 12:12 PM, Sunil Mushran wrote:
>> man tunefs.ocfs2
>>
>> It cannot be done in an active cluster. But it can be done without having to
>> reformat the volume.
>>
>> On 11/16/2011 10:08 AM, David wrote:
>>> I wasn't able to find any documentation that answers whether or not the
>>> number of nodes defined for a cluster can be reduced on an active
>>> cluster, as seen via:
>>>
>>> tunefs.ocfs2 -Q "%B %T %N\n"
>>>
>>> Does anyone know if this can be done, or do I have to copy the data off
>>> of the fs, make the changes, reformat the fs, and copy the data back?
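For anyone finding this thread later, the change being discussed boils down to the commands below. This is a minimal sketch; the device path /dev/sdb1, the mount point /mnt/ocfs2, and the target of 4 slots are all hypothetical, so substitute your own values (and see man tunefs.ocfs2 for details).

# Query block size, cluster size, and current node slots
tunefs.ocfs2 -Q "%B %T %N\n" /dev/sdb1

# The volume must be unmounted on every node first; slot changes
# cannot be made on an active cluster.
umount /mnt/ocfs2        # repeat on each node

# Reduce the node-slot count (hypothetically, to 4)
tunefs.ocfs2 -N 4 /dev/sdb1

# Verify the new slot count
tunefs.ocfs2 -Q "%N\n" /dev/sdb1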
Re: [Ocfs2-users] Number of Nodes defined
I did read the man page for tunefs.ocfs2, but I didn't see anything indicating what the impact to the fs would be when making a change to an existing fs, such as reducing the node slots.

Anyway, thank you for the feedback. I was able to make the changes with no impact to the fs.

David

On 11/16/2011 12:12 PM, Sunil Mushran wrote:
> man tunefs.ocfs2
>
> It cannot be done in an active cluster. But it can be done without
> having to reformat the volume.
>
> On 11/16/2011 10:08 AM, David wrote:
>> I wasn't able to find any documentation that answers whether or not the
>> number of nodes defined for a cluster can be reduced on an active
>> cluster, as seen via:
>>
>> tunefs.ocfs2 -Q "%B %T %N\n"
>>
>> Does anyone know if this can be done, or do I have to copy the data off
>> of the fs, make the changes, reformat the fs, and copy the data back?
Re: [Ocfs2-users] Ocfs2-users Digest, Vol 95, Issue 11
Are you by chance writing your Apache access and error log files to the shared volume? I was having an issue much like this about two years ago:

http://www.mail-archive.com/ocfs2-users@oss.oracle.com/msg02927.html

I updated to the 1.4.x release and it got better, but eventually the problem came back. The final solution (final as in the problem hasn't returned yet) was to move the separate access_log and error_log files to local disks. At the start of each day, a script combines the prior day's logs from each node and places the resulting file back onto the shared volume for further use by statistics analyzers and the like.

OCFS2 has proven to be an excellent shared filesystem over the years, but the biggest shortcoming I've found is poor performance under heavy concurrent writes. Of course, managing write contention is one of the most difficult tasks for shared resources, so it's somewhat expected.

At 12:00 PM 11/16/2011, Andy Herrero wrote:
>Message: 1
>Date: Wed, 16 Nov 2011 12:36:37 +0100
>From: Andy Herrero
>Subject: [Ocfs2-users] Hangups
>To:
>Message-ID: <4ec3a045.4060...@internetborder.se>
>Content-Type: text/plain; charset="ISO-8859-1"
>
>Hi all!
>
>I have a four-node OCFS2 cluster running RHEL 5.7 (x86_64) on
>ProLiant DL360 G5s with 6 GB RAM each: two web servers running Apache
>and two MySQL servers. I've shut down OCFS2 on the DB servers since
>it's never really been used there, so it's only live on the web servers.
>
>The OCFS2 volume is on a 14-disk RAID-10 connected via dedicated
>QLogic iSCSI NICs through a Gigabit switch. The servers are connected
>to the same switch on a separate VLAN for the heartbeat. They also
>have dedicated Gbps NICs for frontend and backend traffic.
>
>The problem is that for the last couple of weeks the write performance
>has intermittently slowed to a crawl, and right now the system is
>pretty much unusable. httpd has its DocumentRoot on /san and these
>processes often go into D-state. "iostat -dmx 1" often reports ~99
>%util and writing anything hangs disturbingly often. My biggest
>problem right now is that I've got very little to work with (that is,
>no juicy kernel panics and such) and the only OCFS-related stuff in
>the logs looks like this:
[snip]
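The daily consolidation mentioned above can be as simple as the cron-driven sketch below. Every hostname and path here is hypothetical; adapt it to however your web servers rotate their logs.

#!/bin/sh
# Run shortly after midnight on one node (e.g. from cron).
# Assumes each web server keeps its rotated Apache logs locally in
# /var/log/httpd/archive/access_log.YYYY-MM-DD (hypothetical layout).

DAY=$(date -d yesterday +%F)
OUT="/san/logs/access_log.$DAY"      # destination on the shared OCFS2 volume

# Collect yesterday's log from each web server...
for host in web01 web02; do
    scp "$host:/var/log/httpd/archive/access_log.$DAY" "/tmp/access_log.$DAY.$host"
done

# ...and combine them into a single file for the statistics analyzers.
cat /tmp/access_log.$DAY.* > "$OUT"
rm -f /tmp/access_log.$DAY.*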
Re: [Ocfs2-users] Number of Nodes defined
man tunefs.ocfs2

It cannot be done in an active cluster. But it can be done without having to reformat the volume.

On 11/16/2011 10:08 AM, David wrote:
> I wasn't able to find any documentation that answers whether or not the
> number of nodes defined for a cluster can be reduced on an active
> cluster, as seen via:
>
> tunefs.ocfs2 -Q "%B %T %N\n"
>
> Does anyone know if this can be done, or do I have to copy the data off
> of the fs, make the changes, reformat the fs, and copy the data back?
[Ocfs2-users] Number of Nodes defined
I wasn't able to find any documentation that answers whether or not the number of nodes defined for a cluster can be reduced on an active cluster, as seen via:

tunefs.ocfs2 -Q "%B %T %N\n"

Does anyone know if this can be done, or do I have to copy the data off of the fs, make the changes, reformat the fs, and copy the data back?
Re: [Ocfs2-users] [Ocfs2-devel] vmstore option - mkfs
That's fine...

Using vmstore:

vegeta:~ # mkfs.ocfs2 -T vmstore -n /dev/drbd1
mkfs.ocfs2 1.4.3
Dry run
Filesystem Type of vmstore
Label:
Features: sparse backup-super unwritten inline-data strict-journal-super metaecc xattr indexed-dirs refcount
Block size: 4096 (12 bits)
Cluster size: 131072 (17 bits)
Volume size: 16105472000 (122875 clusters) (3932000 blocks)
Cluster groups: 4 (tail covers 26107 clusters, rest cover 32256 clusters)
Extent allocator size: 8388608 (2 groups)
Journal size: 134217728
Node slots: 8

Default mkfs:

vegeta:~ # mkfs.ocfs2 -n /dev/drbd1
mkfs.ocfs2 1.4.3
Dry run
Label:
Features: sparse backup-super unwritten inline-data strict-journal-super metaecc xattr indexed-dirs
Block size: 4096 (12 bits)
Cluster size: 4096 (12 bits)
Volume size: 16105598976 (3932031 clusters) (3932031 blocks)
Cluster groups: 122 (tail covers 29055 clusters, rest cover 32256 clusters)
Extent allocator size: 4194304 (1 groups)
Journal size: 100659200
Node slots: 8

Thanks for your help.

Att.
Artur Baruchi

On Wed, Nov 16, 2011 at 3:45 PM, Sunil Mushran wrote:
> Yes. But this is just the features. It also selects the appropriate cluster
> size, block size, journal size, etc. All the params selected are printed by
> mkfs. You also have the option of running with the -n (dry-run) option to
> see the params.
>
> On 11/16/2011 09:41 AM, Artur Baruchi wrote:
>> I just found this:
>>
>> +	{OCFS2_FEATURE_COMPAT_BACKUP_SB | OCFS2_FEATURE_COMPAT_JBD2_SB,
>> +	 OCFS2_FEATURE_INCOMPAT_SPARSE_ALLOC |
>> +	 OCFS2_FEATURE_INCOMPAT_INLINE_DATA |
>> +	 OCFS2_FEATURE_INCOMPAT_XATTR |
>> +	 OCFS2_FEATURE_INCOMPAT_REFCOUNT_TREE,
>> +	 OCFS2_FEATURE_RO_COMPAT_UNWRITTEN},	/* FS_VMSTORE */
>>
>> These options are the ones that are enabled by default when choosing
>> vmstore. Is this correct?
>>
>> Thanks.
>>
>> Att.
>> Artur Baruchi
>>
>> On Wed, Nov 16, 2011 at 3:26 PM, Sunil Mushran wrote:
>>> fstype is a handy way to format the volume with parameters that are
>>> thought to be useful for that use case. The result of this is printed
>>> during format by way of the parameters selected. man mkfs.ocfs2 has a
>>> blurb about the features it enables by default.
>>>
>>> On 11/16/2011 08:45 AM, Artur Baruchi wrote:
>>>> Hi.
>>>>
>>>> I tried to find some information about the vmstore option when
>>>> formatting a device, but didn't find anything about it (no
>>>> documentation; I did some greps inside the source code, but nothing
>>>> turned up). My doubts about this:
>>>>
>>>> - What kind of optimization does this option apply to my file system
>>>>   for storing VM images? I mean, what exactly does this option do?
>>>> - Where in the source code can I find the part that makes this
>>>>   optimization?
>>>>
>>>> Thanks in advance.
>>>>
>>>> Att.
>>>> Artur Baruchi
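The two dry runs make the difference concrete: the vmstore type adds the refcount feature and raises the cluster size from 4 KB to 128 KB. On a volume that is already formatted, the enabled features can be read back directly; a small sketch, assuming a reasonably recent ocfs2-tools with the %M/%H/%O query specifiers (the device path is just the one from the example above):

# Print the feature flags of an existing OCFS2 volume
tunefs.ocfs2 -Q "Compat:    %M\nIncompat:  %H\nRO Compat: %O\n" /dev/drbd1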
Re: [Ocfs2-users] [Ocfs2-devel] vmstore option - mkfs
Yes. But this is just the features. It also selects the appropriate cluster size, block size, journal size, etc. All the params selected are printed by mkfs. You also have the option of running with the -n (dry-run) option to see the params.

On 11/16/2011 09:41 AM, Artur Baruchi wrote:
> I just found this:
>
> +	{OCFS2_FEATURE_COMPAT_BACKUP_SB | OCFS2_FEATURE_COMPAT_JBD2_SB,
> +	 OCFS2_FEATURE_INCOMPAT_SPARSE_ALLOC |
> +	 OCFS2_FEATURE_INCOMPAT_INLINE_DATA |
> +	 OCFS2_FEATURE_INCOMPAT_XATTR |
> +	 OCFS2_FEATURE_INCOMPAT_REFCOUNT_TREE,
> +	 OCFS2_FEATURE_RO_COMPAT_UNWRITTEN},	/* FS_VMSTORE */
>
> These options are the ones that are enabled by default when choosing
> vmstore. Is this correct?
>
> Thanks.
>
> Att.
> Artur Baruchi
>
> On Wed, Nov 16, 2011 at 3:26 PM, Sunil Mushran wrote:
>> fstype is a handy way to format the volume with parameters that are
>> thought to be useful for that use case. The result of this is printed
>> during format by way of the parameters selected. man mkfs.ocfs2 has a
>> blurb about the features it enables by default.
>>
>> On 11/16/2011 08:45 AM, Artur Baruchi wrote:
>>> Hi.
>>>
>>> I tried to find some information about the vmstore option when
>>> formatting a device, but didn't find anything about it (no
>>> documentation; I did some greps inside the source code, but nothing
>>> turned up). My doubts about this:
>>>
>>> - What kind of optimization does this option apply to my file system
>>>   for storing VM images? I mean, what exactly does this option do?
>>> - Where in the source code can I find the part that makes this
>>>   optimization?
>>>
>>> Thanks in advance.
>>>
>>> Att.
>>> Artur Baruchi
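Worth noting: cluster size and block size are fixed at format time, while feature flags (and things like the journal size) can be adjusted afterwards with tunefs.ocfs2. A hedged sketch of enabling one of the vmstore features on an already formatted volume, assuming the volume is unmounted on all nodes, that your ocfs2-tools is new enough to toggle refcount, and the same hypothetical device as above:

# Turn on the refcount-tree feature (used by reflink) on an existing volume
tunefs.ocfs2 --fs-features=refcount /dev/drbd1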
Re: [Ocfs2-users] [Ocfs2-devel] vmstore option - mkfs
I just found this:

+	{OCFS2_FEATURE_COMPAT_BACKUP_SB | OCFS2_FEATURE_COMPAT_JBD2_SB,
+	 OCFS2_FEATURE_INCOMPAT_SPARSE_ALLOC |
+	 OCFS2_FEATURE_INCOMPAT_INLINE_DATA |
+	 OCFS2_FEATURE_INCOMPAT_XATTR |
+	 OCFS2_FEATURE_INCOMPAT_REFCOUNT_TREE,
+	 OCFS2_FEATURE_RO_COMPAT_UNWRITTEN},	/* FS_VMSTORE */

These options are the ones that are enabled by default when choosing vmstore. Is this correct?

Thanks.

Att.
Artur Baruchi

On Wed, Nov 16, 2011 at 3:26 PM, Sunil Mushran wrote:
> fstype is a handy way to format the volume with parameters that are
> thought to be useful for that use case. The result of this is printed
> during format by way of the parameters selected. man mkfs.ocfs2 has a
> blurb about the features it enables by default.
>
> On 11/16/2011 08:45 AM, Artur Baruchi wrote:
>> Hi.
>>
>> I tried to find some information about the vmstore option when
>> formatting a device, but didn't find anything about it (no
>> documentation; I did some greps inside the source code, but nothing
>> turned up). My doubts about this:
>>
>> - What kind of optimization does this option apply to my file system
>>   for storing VM images? I mean, what exactly does this option do?
>> - Where in the source code can I find the part that makes this
>>   optimization?
>>
>> Thanks in advance.
>>
>> Att.
>> Artur Baruchi
Re: [Ocfs2-users] [Ocfs2-devel] vmstore option - mkfs
fstype is a handy way to format the volume with parameters that are thought to be useful for that use case. The result of this is printed during format by way of the parameters selected. man mkfs.ocfs2 has a blurb about the features it enables by default.

On 11/16/2011 08:45 AM, Artur Baruchi wrote:
> Hi.
>
> I tried to find some information about the vmstore option when
> formatting a device, but didn't find anything about it (no
> documentation; I did some greps inside the source code, but nothing
> turned up). My doubts about this:
>
> - What kind of optimization does this option apply to my file system
>   for storing VM images? I mean, what exactly does this option do?
> - Where in the source code can I find the part that makes this
>   optimization?
>
> Thanks in advance.
>
> Att.
> Artur Baruchi
[Ocfs2-users] vmstore option - mkfs
Hi.

I tried to find some information about the vmstore option when formatting a device, but didn't find anything about it (no documentation; I did some greps inside the source code, but nothing turned up). My doubts about this:

- What kind of optimization does this option apply to my file system for storing VM images? I mean, what exactly does this option do?
- Where in the source code can I find the part that makes this optimization?

Thanks in advance.

Att.
Artur Baruchi
[Ocfs2-users] Hangups
Hi all!

I have a four-node OCFS2 cluster running RHEL 5.7 (x86_64) on ProLiant DL360 G5s with 6 GB RAM each: two web servers running Apache and two MySQL servers. I've shut down OCFS2 on the DB servers since it's never really been used there, so it's only live on the web servers.

The OCFS2 volume is on a 14-disk RAID-10 connected via dedicated QLogic iSCSI NICs through a Gigabit switch. The servers are connected to the same switch on a separate VLAN for the heartbeat. They also have dedicated Gbps NICs for frontend and backend traffic.

The problem is that for the last couple of weeks the write performance has intermittently slowed to a crawl, and right now the system is pretty much unusable. httpd has its DocumentRoot on /san and these processes often go into D-state. "iostat -dmx 1" often reports ~99 %util and writing anything hangs disturbingly often. My biggest problem right now is that I've got very little to work with (that is, no juicy kernel panics and such) and the only OCFS-related stuff in the logs looks like this:

---CUT---
Nov 14 14:13:49 web02 kernel: INFO: task httpd:3959 blocked for more than 120 seconds.
Nov 14 14:13:49 web02 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 14 14:13:49 web02 kernel: httpd D 810001004420 0 3959 3149 3962 3954 (NOTLB)
Nov 14 14:13:49 web02 kernel: 81018c407ea8 0082 81019d600140 8875b54b
Nov 14 14:13:49 web02 kernel: 81018421 000a 81018eb957a0 80314b60
Nov 14 14:13:49 web02 kernel: 00b90056ee85 0018d3ad 81018eb95988 0003
Nov 14 14:13:49 web02 kernel: Call Trace:
Nov 14 14:13:49 web02 kernel: [] :ocfs2:ocfs2_inode_lock_full+0x5fb/0xfe2
Nov 14 14:13:49 web02 kernel: [] cp_new_stat+0xe5/0xfd
Nov 14 14:13:49 web02 kernel: [] __mutex_lock_slowpath+0x60/0x9b
Nov 14 14:13:49 web02 kernel: [] .text.lock.mutex+0xf/0x14
Nov 14 14:13:49 web02 kernel: [] generic_file_llseek+0x2a/0x8b
Nov 14 14:13:49 web02 kernel: [] sys_lseek+0x40/0x60
Nov 14 14:13:49 web02 kernel: [] system_call+0x7e/0x83
---CUT---

Also, I see these kinds of errors regarding eth4 (the heartbeat network):

# grep 'e1000e.*Detected' /var/log/messages
Nov 13 13:25:00 web02 kernel: e1000e :0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 09:10:55 web02 kernel: e1000e :0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 21:54:49 web02 kernel: e1000e :0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 21:58:35 web02 kernel: e1000e :0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 01:42:33 web02 kernel: e1000e :0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 13:34:01 web02 kernel: e1000e :0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 20:55:24 web02 kernel: e1000e :0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 16 02:42:55 web02 kernel: e1000e :0b:00.0: eth4: Detected Hardware Unit Hang:

However, they happen relatively seldom, and I've tried flood-pinging hosts over that VLAN without ever experiencing timeouts or packet loss, so I really don't think that's the issue.

I managed to catch an strace of a dd run to the volume; it hung on the 8th write of the 10 MB test for approximately 1 minute and 30 seconds. Things are looking surprisingly cheerful right now, as only maybe 1 out of 10 tries hangs like this.

---CUT---
# strace dd if=/dev/zero of=/san/testfile bs=1M count=10
...
open("/san/testfile", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 1
...
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\
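Since stalls like the one strace caught above come and go, it can help to log write latency continuously instead of trying to catch one by hand. A rough sketch of that idea follows; the mount point, device path, log file, and 30-second threshold are all hypothetical, and the lock dump assumes debugfs.ocfs2 from ocfs2-tools with its -R (run one command) option.

#!/bin/sh
# Append a 10 MB burst to the shared volume once a minute and record how
# long it took; slow iterations show exactly when the stalls happen.
while true; do
    START=$(date +%s)
    dd if=/dev/zero of=/san/stalltest bs=1M count=10 conv=fsync 2>/dev/null
    ELAPSED=$(( $(date +%s) - START ))
    echo "$(date '+%F %T') 10 MB write took ${ELAPSED}s"

    # On a suspiciously slow burst, dump the busy cluster locks for later
    # analysis (hypothetical device; "fs_locks -B" lists only busy locks).
    if [ "$ELAPSED" -gt 30 ]; then
        debugfs.ocfs2 -R "fs_locks -B" /dev/sdb >> /var/log/ocfs2-stalls.log
    fi
    sleep 60
done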