Re: [Ocfs2-users] ENOSPC

David Johle Thu, 08 Apr 2010 07:52:04 -0700

Just realized I never replied to this, but I had created the bugzilla 
report a little while back.  It can be found at:
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1237



At 08:54 PM 3/24/2010, Sunil Mushran wrote:
>Quite a bit of work is ongoing on this front. I'll list all that work
>in another email.
>
>Meanwhile make a bz with the stat_sysdir output. We'll need that
>to determine the best way forward.
>
>David Johle wrote:
>>So, in light of prior issues with lock contention caused by 
>>writing Apache logs to shared files, I have started storing them 
>>locally on each node.  I made a script to combine them nightly 
>>before the statistics generator kicks off for the previous day's 
>>traffic analysis.
>>
>>This script, using logresolvemerge.pl, writes its output back to 
>>the shared volume for easy reference later.  I figured I would not 
>>have issues with this, as it's a large amount of sequential writes 
>>from a single node at an off-peak time.  However, it's been getting 
>>hung with high CPU usage from the merger.
>>
>>I'm pretty sure I'm running into the famous "free space 
>>fragmentation" problem, but wanted to confirm that this was the 
>>case or see if there was additional troubleshooting I can do.
>>
>>Here's the disk, plenty of overall free space:
>>
>>Filesystem           1K-blocks      Used Available Use% Mounted on
>>/dev/mapper/mpath1   209725440  85311460 124413980  41% /san/live-websites
>>
>>
>>While the merge was using 100% of a CPU core, the merged file was 
>>not growing and there was not much I/O actually going to the 
>>shared volume, so I ran strace to see what it was doing and got this:
>>
>># strace -p 16844
>>Process 16844 attached - interrupt to quit
>>read(3, "1\" 200 936 \"http://www.industria"..., 4096) = 4096
>>write(1, ".NET CLR 1.1.4322; .NET CLR 2.0."..., 4096) = -1 ENOSPC (No space left on device)
>>read(4, "oration&locationName=South+Jerse"..., 4096) = 4096
>>write(1, "ivers=8&ngPipelines=600&kvtl230="..., 4096) = -1 ENOSPC (No space left on device)
>>read(4, "1\" 200 936 \"http://www.industria"..., 4096) = 4096
>>write(1, "gan+Boulevard&locationCSZ=Salem%"..., 4096) = -1 ENOSPC (No space left on device)
>>read(3, "HTTP/1.0\" 200 4096 \"-\" \"WinampMP"..., 4096) = 4096
>>write(1, "elta=.375&zoomlevel=6&label=Sout"..., 4096) = -1 ENOSPC (No space left on device)
>>read(4, "HTTP/1.0\" 200 4096 \"-\" \"WinampMP"..., 4096) = 4096
>>write(1, "ident/4.0; .NET CLR 1.1.4322; .N"..., 4096) = -1 ENOSPC (No space left on device)
>>read(3, "0 36516 \"-\" \"Mozilla/5.0 (compat"..., 4096) = 4096
>>
>>
>>Now I'm really worried about cluster stability, since other 
>>routine writes might start failing soon.  I know the typical 
>>workaround is to reduce the number of node slots, but I don't have 
>>any slots to spare.  Are there any other tricks to reduce free 
>>space fragmentation?
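The strace output quoted above shows every write() returning ENOSPC while the merger keeps reading and burning CPU. This does nothing about the fragmentation itself, but as a rough illustration, here is a minimal Python sketch of a copy loop that stops at the first ENOSPC instead of carrying on. It is not how logresolvemerge.pl works; `copy_stream` and `OutOfSpace` are hypothetical names made up for this example.

```python
import errno


class OutOfSpace(Exception):
    """Hypothetical exception: the destination reported ENOSPC."""


def copy_stream(src, dst, chunk_size=4096):
    """Copy binary file object src to dst; return the bytes handed to dst.

    Aborts on the first ENOSPC rather than continuing to read input.
    With a buffered dst the error may only surface at flush(), so the
    flush is done inside the same try block.
    """
    written = 0
    try:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            written += len(chunk)
        dst.flush()
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            raise OutOfSpace(f"ENOSPC after {written} bytes accepted") from exc
        raise  # any other I/O error is passed through unchanged
    return written
```

A nightly wrapper script could catch `OutOfSpace`, log it, and exit non-zero so that the job fails visibly instead of hanging at full CPU.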

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
