On Apr 24, 2007  08:42 -0600, Daniel Leaberry wrote:
> We're running 1.6b7 and have noticed the following two problems. I'm 
> wondering if they're correlated.
> 
> 1. We get files that are 0 bytes. They have nothing in them.

This may be related to the recently reported bug 12181, though I can't
say for certain. The fix for that bug will be in 1.6.0 and later, in
1.4.10.1, and in 1.4.11 and later.

It can also happen if the clients are evicted while they are
writing to the file.
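
To get an idea of how widespread the empty files are, something like the
following could be run against a client mount. This is only a minimal
sketch: the function name find_empty_files is made up for illustration,
and it uses nothing Lustre-specific, just ordinary stat calls.

```python
import os


def find_empty_files(root):
    """Return a sorted list of regular files under root whose size is 0."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.isfile(path) and os.path.getsize(path) == 0:
                    empty.append(path)
            except OSError:
                # The file vanished or could not be stat'ed; skip it.
                continue
    return empty if not empty else sorted(empty)
```

Comparing the modification times of the files it reports against the
eviction messages in the client logs may show whether the two coincide.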

> 2. We get these errors across our 30 nodes
> LustreError: 7030:0:(dir.c:330:ll_readdir()) error reading dir 
> 167108765/2378987153 page 13: rc -5
> LustreError: 7029:0:(dir.c:330:ll_readdir()) error reading dir 
> 171699532/2388399554 page 9: rc -5
> LustreError: 7027:0:(dir.c:330:ll_readdir()) error reading dir 
> 171403580/2387428410 page 2: rc -5
> LustreError: 6990:0:(dir.c:330:ll_readdir()) error reading dir 
> 171011300/2386583645 page 8: rc -5
> LustreError: 7027:0:(dir.c:330:ll_readdir()) error reading dir 
> 172286916/2390172901 page 13: rc -5
> LustreError: 6990:0:(dir.c:330:ll_readdir()) error reading dir 
> 172030180/2388919021 page 13: rc -5
> LustreError: 7027:0:(dir.c:330:ll_readdir()) error reading dir 
> 172321971/2390308492 page 3: rc -5
> LustreError: 7027:0:(dir.c:330:ll_readdir()) error reading dir 
> 163603484/1208913504 page 8: rc -5
> LustreError: 6990:0:(dir.c:330:ll_readdir()) error reading dir 
> 172748079/2390802528 page 13: rc -5
> LustreError: 9133:0:(dir.c:330:ll_readdir()) error reading dir 
> 172818070/2390892206 page 2: rc -5
> LustreError: 9171:0:(dir.c:330:ll_readdir()) error reading dir 
> 168359805/2380837293 page 8: rc -5
> LustreError: 9187:0:(dir.c:330:ll_readdir()) error reading dir 
> 163706128/1209056171 page 7: rc -5
> LustreError: 9199:0:(dir.c:330:ll_readdir()) error reading dir 
> 165116087/1211142674 page 0: rc -5
> LustreError: 9217:0:(dir.c:330:ll_readdir()) error reading dir 
> 162005170/1206582728 page 12: rc -5
> LustreError: 9216:0:(dir.c:330:ll_readdir()) error reading dir 
> 162686166/1207618778 page 12: rc -5
> LustreError: 6990:0:(dir.c:330:ll_readdir()) error reading dir 
> 163079284/1208141145 page 3: rc -5

These messages report I/O errors (rc -5 is -EIO) while the client is
reading directory pages from the MDS. This isn't a problem I've seen
before, so it's hard to say what the root cause is.
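
To narrow it down, it may help to tally the messages and see whether they
cluster on a few directories or are spread evenly. The following is a
rough sketch, not a supported tool: the regular expression is modeled
only on the lines quoted above, the function name is hypothetical, and
treating the two numbers after "dir" as a single directory identifier
is an assumption.

```python
import re
from collections import Counter

# Modeled on the quoted messages; \s+ tolerates a line wrapped after "dir".
READDIR_ERR = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:\(dir\.c:\d+:ll_readdir\(\)\) "
    r"error reading dir\s+(?P<dir>\d+/\d+) page (?P<page>\d+): rc (?P<rc>-?\d+)"
)


def summarize_readdir_errors(log_text):
    """Count ll_readdir errors per directory identifier and per return code."""
    by_dir = Counter()
    by_rc = Counter()
    for match in READDIR_ERR.finditer(log_text):
        by_dir[match.group("dir")] += 1
        by_rc[int(match.group("rc"))] += 1
    return by_dir, by_rc


if __name__ == "__main__":
    sample = (
        "LustreError: 7030:0:(dir.c:330:ll_readdir()) error reading dir "
        "167108765/2378987153 page 13: rc -5\n"
        "LustreError: 7029:0:(dir.c:330:ll_readdir()) error reading dir "
        "171699532/2388399554 page 9: rc -5\n"
    )
    dirs, codes = summarize_readdir_errors(sample)
    for ident, count in dirs.most_common():
        print(ident, count)
    print(dict(codes))
```

Running it over the syslog from each of the 30 nodes and checking the
MDS log for errors at the same timestamps would show whether the
failures originate on the server side.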

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
