See attachment for requested output.

At 08:54 PM 3/24/2010, Sunil Mushran wrote:
Quite a bit of work is ongoing on this front. I'll list all that work
in another email.

Meanwhile make a bz with the stat_sysdir output. We'll need that
to determine the best way forward.

David Johle wrote:
In light of prior issues with lock contention caused by writing Apache logs to shared files, I have started storing them locally on each node. I made a script to combine them nightly, before the statistics generator kicks off the previous day's traffic analysis.

This script, using logresolvemerge.pl, writes its output back to the shared volume for easy reference later. I figured I would not have issues with this, as it's a large amount of sequential writes from a single node at an off-peak time. However, it's been getting hung with high CPU from the merger.
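
Purely as an illustration of the nightly merge step described above, and not a description of how logresolvemerge.pl works internally, here is a minimal Python sketch that interleaves per-node access logs by request time. It assumes each input is already in time order and carries the usual bracketed Apache timestamp; log_time and merge_logs are hypothetical names.

```python
import heapq
import re
from datetime import datetime

# Matches the bracketed timestamp of the Apache common/combined log format,
# e.g. [24/Mar/2010:20:54:00 -0500]. Assumed format, not taken from the thread.
TS_RE = re.compile(r"\[([^\]]+)\]")


def log_time(line):
    """Return the request time of one access-log line as an aware datetime."""
    match = TS_RE.search(line)
    if match is None:
        raise ValueError("no timestamp in line: %r" % line)
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")


def merge_logs(streams):
    """Lazily merge already-sorted per-node logs into one time-ordered stream."""
    return heapq.merge(*streams, key=log_time)
```

Because heapq.merge is lazy, only one pending line per input is held in memory, so the output is still a single sequential write stream.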

I'm pretty sure I'm running into the famous "free space fragmentation" problem, but wanted to confirm that this was the case or see if there was additional troubleshooting I can do.

Here's the disk, plenty of overall free space:

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/mpath1   209725440  85311460 124413980  41% /san/live-websites


While the merge was pegging a CPU core at 100%, the merged file was not growing and there was not much I/O actually happening to the shared volume, so I ran strace to see what it was doing and got this:

# strace -p 16844
Process 16844 attached - interrupt to quit
read(3, "1\" 200 936 \"http://www.industria"..., 4096) = 4096
write(1, ".NET CLR 1.1.4322; .NET CLR 2.0."..., 4096) = -1 ENOSPC (No space left on device)
read(4, "oration&locationName=South+Jerse"..., 4096) = 4096
write(1, "ivers=8&ngPipelines=600&kvtl230="..., 4096) = -1 ENOSPC (No space left on device)
read(4, "1\" 200 936 \"http://www.industria"..., 4096) = 4096
write(1, "gan+Boulevard&locationCSZ=Salem%"..., 4096) = -1 ENOSPC (No space left on device)
read(3, "HTTP/1.0\" 200 4096 \"-\" \"WinampMP"..., 4096) = 4096
write(1, "elta=.375&zoomlevel=6&label=Sout"..., 4096) = -1 ENOSPC (No space left on device)
read(4, "HTTP/1.0\" 200 4096 \"-\" \"WinampMP"..., 4096) = 4096
write(1, "ident/4.0; .NET CLR 1.1.4322; .N"..., 4096) = -1 ENOSPC (No space left on device)
read(3, "0 36516 \"-\" \"Mozilla/5.0 (compat"..., 4096) = 4096
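
The trace shows one pattern repeating: each write() to fd 1 fails with ENOSPC, yet the process moves on to the next read(), which would be consistent with a CPU-bound loop whose output never grows. As an illustration only, and not how logresolvemerge.pl is actually implemented, a copy loop that stops at the first ENOSPC could be sketched in Python like this (copy_checked is a hypothetical name):

```python
import errno
import os


def copy_checked(src_fd, dst_fd, bufsize=4096):
    """Copy src_fd to dst_fd, aborting on the first write that hits ENOSPC.

    Returns the number of bytes written on success.
    """
    copied = 0
    while True:
        chunk = os.read(src_fd, bufsize)
        if not chunk:          # end of input
            return copied
        view = memoryview(chunk)
        while view:            # handle short writes
            try:
                written = os.write(dst_fd, view)
            except OSError as exc:
                if exc.errno == errno.ENOSPC:
                    # Stop instead of reading further input that cannot be stored.
                    raise RuntimeError(
                        "destination out of space after %d bytes" % copied
                    ) from exc
                raise
            view = view[written:]
            copied += written
```

Failing fast like this does not address the underlying allocation problem; it only keeps the job from spinning while every write is rejected.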


Now I'm really worried about cluster stability, since other routine writes might start failing soon. I know the typical workaround is to reduce the number of node slots, but I don't have any excess slots to spare. Are there any other tricks to reduce free space fragmentation?

Attachment: statsys.output.bz2

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users