Hi Colin,

The blinking file may be caused by corruption of the file's inode. I ran into this once.

As for debugging ocfs2, there are several ways. One is debugfs.ocfs2:
http://oss.oracle.com/projects/ocfs2-tools/dist/documentation/v1.4/debugfs.ocfs2.html

debugfs.ocfs2 -l [tracebit ... [allow|off|deny]] ...

This turns many kinds of tracing on and off, and the traces show helpful
information in the system log.

But I guess what Sunil means is a debug version of ocfs2, not how to debug it?
Since this is a production system, I am afraid a debug version would not be
allowed on it.

Regards,
Tao

Wang2, Colin (NSN - CN/Cheng Du) wrote:
> Hi Sunil,
>
> Please see my answers inline.
>
> BRs,
> Colin
>
> -----Original Message-----
> From: ext Sunil Mushran <sunil.mush...@oracle.com>
> To: Wang2, Colin (NSN - CN/Cheng Du) <colin.wa...@nsn.com>
> Cc: ocfs2-users@oss.oracle.com
> Subject: Re: [Ocfs2-users] ocfs2_encode_fh:152 ERROR: fh buffer is
> too small for encoding
> Date: Wed, 11 Nov 2009 19:55:57 -0800
>
> Wang2, Colin (NSN - CN/Cheng Du) wrote:
> > Based on your questions,
> > 1. The error is a time issue. And it's a production system, so it's hard to
> > install a debug version.
> > I would appreciate it if you could share some documentation about the
> > debug version so I can test it when I have the chance.
>
> The error is not necessarily an ocfs2 issue. ocfs2 has 64-bit inode numbers
> and requires the large filehandle. I am unsure what you mean by document
> about debug version.
> Colin:
> I mean the method to debug ocfs2.
>
> > 2. Confirmed with the onsite engineer.
> > I think it's file data corruption, not file system corruption. Here are
> > the scenarios.
> > The system has 2 nodes with an ocfs2 filesystem, and an NFS export on one
> > node.
> > Suppose:
> > Node names: db1, db2
> > Node that currently exports NFS: db1
> > Node that mounts the exported NFS: app1
> > A. Read/write file corruption.
> > Shut down app1.
> > When checking the file with the ls command, it's blinking on db1; it's OK
> > on db2.
> > Removing it on db2 failed too.
> > Can't unmount and stop ocfs2 on db2.
> > Fail over NFS to db1 and reboot db2.
> > It's OK to delete on db1.
> > Reboot app1, and it can use the exported fs.
> > I don't know what the error is. Why is the file blinking? Is the inode
> > missing?
>
> I did not follow what you meant by "blinking". Secondly, if you
> have exported a volume, then that volume cannot be umounted.
> That goes for all fs.
> Colin:
> When I run the "ls -l" command, the bad file is shown in red and
> blinking. I use xterm. I don't know what causes this.
>
> > B. Read-only file corruption.
> > Update the file, maybe from db1, maybe from db2.
> > app1 reports a corrupted file.
> > Fail over NFS from db1 to db2.
> > Reboot app1, and it's OK now.
> > I think this scenario is caused by the exported NFS fs not locking the
> > relevant file, so partial content of a file updated on another node
> > (like db2) is not synchronized to db1 and then to app1, and app1 reports
> > corruption.
> >
> > I think this scenario can be prevented by updating the file from
> > db1 (the node currently exporting NFS) rather than db2.
>
> So when you write to a file on node db2, the next read on db1 will
> show that new data. However, there is no guarantee that app1 (which
> has nfs mounted the volume on db2) will see the same data. The only
> way this will work is if the application is doing O_DIRECT I/O. This is an
> inherent limitation in nfs.
> Colin:
> Thanks, got it. But I think we must accept the current situation, because
> direct I/O would reduce our performance.
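
For reference, the direct I/O Sunil mentions means reading or writing a file opened with O_DIRECT, so the data is not served from the client's page cache. Below is a minimal Python sketch of such a read on Linux. The names read_direct and BLOCK and the 4096-byte alignment are assumptions made for illustration and do not come from this thread. The sketch falls back to a normal buffered read when the filesystem rejects O_DIRECT, and it has not been tried against an NFS mount of an ocfs2 volume.

```python
import errno
import mmap
import os

# Assumed alignment for O_DIRECT; the real requirement depends on the
# filesystem and device, so treat 4096 as an illustrative value.
BLOCK = 4096


def _pread(path, flags, buf, offset):
    """Open path with the given flags and read into buf at offset."""
    fd = os.open(path, flags)
    try:
        return os.preadv(fd, [buf], offset)
    finally:
        os.close(fd)


def read_direct(path, length=BLOCK, offset=0):
    """Read up to length bytes from path, using O_DIRECT when possible.

    Hypothetical helper: length and offset must be multiples of BLOCK.
    """
    if length <= 0 or length % BLOCK or offset % BLOCK:
        raise ValueError("length and offset must be multiples of BLOCK")
    # An anonymous mmap is page-aligned, which O_DIRECT generally needs.
    buf = mmap.mmap(-1, length)
    try:
        try:
            n = _pread(path, os.O_RDONLY | os.O_DIRECT, buf, offset)
        except OSError as exc:
            if exc.errno != errno.EINVAL:
                raise
            # Filesystem refused direct I/O: fall back to a buffered read.
            n = _pread(path, os.O_RDONLY, buf, offset)
        return buf[:n]  # slicing an mmap copies the data out as bytes
    finally:
        buf.close()
```

A call such as read_direct("/path/to/file") returns at most the first 4096 bytes. Skipping the cache on every read is where the performance cost Colin mentions comes from.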
>
> BRs,
> Colin
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Ocfs2-users mailing list
> Ocfs2-users@oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-users