Hi Steve

I will try the script. In the meantime, I tried to hammer it this
afternoon by copying a 500MB ISO and repeatedly running cmp; I saw no errors.
It also occurred to me that when I copied the 500GB in, I used a mirroring
application, which would have flagged any bad copies as files needing an
update. I saw none.
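
A minimal sketch of that kind of repeated check, for anyone who wants to
reproduce it. The helper name verify_repeatedly and the example paths are
hypothetical, not something from the thread; it only relies on `cmp -s`
returning a non-zero status when two files differ.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): compare SRC and DST COUNT times
# and report how many of the comparisons failed.
verify_repeatedly() {
    src=$1
    dst=$2
    count=${3:-10}
    failures=0
    i=0
    while [ "$i" -lt "$count" ]; do
        # cmp -s prints nothing; exit status 0 means byte-identical files
        if ! cmp -s "$src" "$dst"; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures of $count comparisons failed"
    # Succeed only if every comparison passed
    [ "$failures" -eq 0 ]
}

# Example (paths are placeholders):
#   cp /tmp/test.iso /mnt/pvfs2/test.iso
#   verify_repeatedly /tmp/test.iso /mnt/pvfs2/test.iso 20
```

Repeating the comparison on a single client may not exercise the same path as
several clients working at once, so a clean result here does not rule out the
multi-node case.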

I also see no errors if I have just one machine copying/verifying
to the PVFS2 area.  Was your error-free run a case where several
machines were accessing/writing to the one PVFS2 area?

Regards
Mark






-------Original Message-------



From: Mark Van De Vyver

Date: 02/03/2007 19:17:40

To: Steve

Cc: [email protected]

Subject: Re: [Pvfs2-users] PVFS 2.6.2 intermittent cmp/diff failure



Hi Steve,

I don't have access to the cluster now, but the following script has a
few fixes.

I haven't yet tested copying from a non-PVFS area to PVFS with PVFS 2.6.2.
I saw something similar in PVFS 1.5.1 when copying from a tmpfs area to
PVFS2.

Running `mount` should show you whether the DVD is auto-mounted and under
what directory, in which case my mount below is redundant and you'll
need to replace the '/media/cdrom/' references.
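
A small sketch of that check, assuming the usual Linux `mount` output format
("DEVICE on DIR type FSTYPE (OPTIONS)"). The helper name find_mount_point is
hypothetical, and the device name /dev/hdb is only the example used in this
message; yours may differ.

```shell
#!/bin/sh
# Hypothetical helper: read `mount`-style lines from standard input and print
# the directory the given device is mounted on. Exits non-zero if the device
# is not listed. Mount points containing spaces are not handled.
find_mount_point() {
    awk -v dev="$1" '
        $1 == dev && $2 == "on" { print $3; found = 1; exit }
        END { exit !found }
    '
}

# Example:
#   if dir=$(mount | find_mount_point /dev/hdb); then
#       echo "already mounted on $dir"
#   else
#       echo "not mounted"
#   fi
```

If the helper prints a directory, that directory would replace '/media/cdrom'
in the script and the explicit mount step could be skipped.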



# untested script start
mkdir -p /media/cdrom
# you may have to insert your system's device name here
mount /dev/hdb /media/cdrom

# list the files on the disc, stripping the mount-point prefix
for fn in `ls /media/cdrom/*.* | sed -e 's/\/media\/cdrom\///g'`
do
  if [ -f "/mnt/pvfs2/${fn}" ]
  then
    # This should 'fail' more frequently than the cmp in the else clause
    cmp "/media/cdrom/${fn}" "/mnt/pvfs2/${fn}"
    if [ $? != 0 ]
    then
      echo "Preexisting copy not exact - more frequent and random?"
    fi
  else
    # File not yet in the PVFS2 area: copy it, then verify immediately
    cp "/media/cdrom/${fn}" "/mnt/pvfs2/${fn}"
    cmp "/media/cdrom/${fn}" "/mnt/pvfs2/${fn}"
    if [ $? != 0 ]
    then
      echo "Initial copy not exact - less frequent and random"
    fi
  fi
done
# untested script end
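
When one of the comparisons above fails, it may help to record where the files
differ and by how much, so runs can be compared afterwards. The following is
only a sketch: the helper name log_mismatch and the output format are
hypothetical. It uses `cmp -l`, which prints one line per differing byte
(1-based offset, then the two byte values in octal).

```shell
#!/bin/sh
# Hypothetical helper: if SRC and DST differ, print the first differing byte
# offset and the number of differing bytes, and return 1. Returns 0 silently
# when the files are identical.
log_mismatch() {
    src=$1
    dst=$2
    if cmp -s "$src" "$dst"; then
        return 0
    fi
    # One pass over cmp -l output: remember the first offset, count the lines.
    # If the files differ only in length, cmp -l lists no bytes and the
    # count is 0.
    summary=$(cmp -l "$src" "$dst" 2>/dev/null |
        awk 'NR == 1 { first = $1 }
             END { print "first_offset=" first " differing_bytes=" NR }')
    echo "MISMATCH $dst $summary"
    return 1
}

# Example, in place of a bare cmp in the loop above:
#   log_mismatch "/media/cdrom/${fn}" "/mnt/pvfs2/${fn}" >> /tmp/pvfs2-cmp.log
```

Because the reported failures are intermittent, the second read done by
`cmp -l` may not see the same bytes as the first `cmp -s`, so the logged
numbers are a hint rather than a precise record.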



Thanks

Mark



On 3/2/07, Steve <[EMAIL PROTECTED]> wrote:

> Well, I thought I'd try a manual cp.
>
> I have never mounted a DVD under Linux, only CD-ROMs. I mounted a movie DVD
> and got an I/O error when trying to copy. I mounted a data DVD burned under
> Windows and the mount failed with a wrong-filesystem error.
>
> What's your mount command syntax?
>
> BTW, do you get the same result if you copy your files to a local non-PVFS2
> disk and then use your script?

>

>

>

>

>

> -------Original Message-------

>

>

>

> From: Mark Van De Vyver

>

> Date: 02/03/2007 09:40:30

>

> To: Steve

>

> Cc: [email protected]

>

> Subject: Re: [Pvfs2-users] PVFS 2.6.2 intermittent cmp/diff failure

>

>

>

> Thanks Steve,
>
> I don't see any problem until I run the diff or cmp, and even then these
> indicate the files are identical if the cmp is run _immediately_ after the
> file copy. cmp and diff only indicate a difference when a file is 'checked'
> after some other files have been copied and checked.
>
> The files are from the NYSE Trade and Quote (TAQ) DVDs, so they are text
> stored as binary.
>
> You might be able to try the following with a dozen or so large binary
> files; I have approx. 300-400GB stored in the PVFS area.
>
> Ideally the following should be run on two or more PVFS2 servers at the
> same time. Apply this to several DVDs that have not been copied to the PVFS
> area, then reapply the script to the same DVDs after they have been copied.
>
> The following is a slightly simplified version of my script - here I don't
> delete and re-copy when an existing file fails the cmp verification:

>

>

>

> # untested script start
> # list the files on the disc, stripping the /dvd/ prefix
> for fn in `ls /dvd/*large.bin | sed -e 's/\/dvd\///g'`
> do
>   if [ -f "/mnt/pvfs2/${fn}" ]
>   then
>     # This should 'fail' more frequently than the cmp in the else clause
>     cmp "/dvd/${fn}" "/mnt/pvfs2/${fn}"
>     if [ $? != 0 ]
>     then
>       echo "Preexisting copy not exact - more frequent and random?"
>     fi
>   else
>     cp "/dvd/${fn}" "/mnt/pvfs2/${fn}"
>     cmp "/dvd/${fn}" "/mnt/pvfs2/${fn}"
>     if [ $? != 0 ]
>     then
>       echo "Initial copy not exact - less frequent and random"
>     fi
>   fi
> done
> # untested script end

>

>

>

> Regards

>

> Mark

>

>

>

> On 3/2/07, Steve <[EMAIL PROTECTED]> wrote:

>

> > My setup is a little different in that at the moment I have 2 I/O services
> > running on one box, a metadata server on another and a client/Samba server
> > on a third. I have moved in the data via Samba. We have copied in MP3s and
> > AVI/MPGs as well as large ISOs plus software EXEs. Surely after several
> > weeks of use we would notice some problem?
> >
> > I do have another box set up as a client that happens to have a DVD-ROM
> > drive in it.
> >
> > What type of files? A VOB?
> >
> > What sequence of commands would I need to run to test your problem?
> >
> > If I get a little spare time I could try for you.

>

> >

>

> >

>

> >

>

> > Steve

>

> >

>

> >

>

> >

>

> > -------Original Message-------

>

> >

>

> >

>

> >

>

> > From: Mark Van De Vyver

>

> >

>

> > Date: 02/03/2007 08:18:11

>

> >

>

> > To: Steve

>

> >

>

> > Subject: Re: [Pvfs2-users] PVFS 2.6.2 intermittent cmp/diff failure

>

> >

>

> >

>

> >

>

> > Hi Steve,

>

> >

>

> >

>

> >

>

> > > Not sure if this helps any, but I have copied over 500GB of media files
> > > to PVFS2 running on old Dells (533 to 866 MHz CPUs) with very little RAM,
> > > running on caos3 beta 3. Although I haven't done any checks other than
> > > using the media, I haven't noticed any problems.
> >
> > The failures might be spurious....?
> >
> > > Could you have problems with the DVD device?
> >
> > I doubt it - but it may not be impossible?
> >
> > This happens with the DVD drives on all three nodes, and when I just have
> > one node 'working' the diff/cmp failures either don't occur or occur very,
> > very rarely. Start all three nodes 'working' and I see roughly 1 out of 2
> > binary files fail the initial diff/cmp check, but very, very few (one every
> > couple of DVDs) fail the cmp/diff check immediately after the copy is
> > done.....

>

> >

>

> >

>

> >

>

> > Thanks

>

> >

>

> > Mark

>

> >

>

> >

>

> >

>

> > >

>

> >

>

> > > -------Original Message-------

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > > From: Mark Van De Vyver

>

> >

>

> > >

>

> >

>

> > > Date: 02/03/2007 03:26:40

>

> >

>

> > >

>

> >

>

> > > To: [email protected]

>

> >

>

> > >

>

> >

>

> > > Subject: [Pvfs2-users] PVFS 2.6.2 intermittent cmp/diff failure

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > > Hi,

>

> >

>

> > >

>

> >

>

> > > This is a follow-up to an earlier email where I reported that PVFS
> > > 1.5.1 failed to copy binary files from several DVDs.
> > >
> > > I'm running a 3-node Rocks 4.2.1 cluster (CentOS 4.4, x86_64); the nodes
> > > are connected via an unmanaged switch.
> > >
> > > I have reinstalled the Rocks cluster (all nodes), including the PVFS2
> > > Roll.
> > >
> > > The cluster is set up with the frontend as the metadata server, and the
> > > other two nodes are PVFS2 I/O servers and clients. The /mnt/pvfs2 area is
> > > on a 3-disk RAID 0 partition formatted as ext3.
> > >
> > > After installing, I ran the test steps in the "PVFS2 Quick Start Guide".
> > > The test steps ran without error.
> > >
> > > I upgraded to PVFS 2.6.2 on all nodes and re-ran the test steps; again,
> > > no errors or problems.

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > > I built PVFS 2.6.2 with the following:
> > >
> > > ./configure --with-kernel=</path/to/kernel26/> \
> > >     --enable-kernel-sendfile --prefix=/usr/local/pvfs2/
> > >
> > > Then type:
> > >
> > > make all
> > > make kmod_install
> > > make install

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > > On each node I have a script that lists the files on the DVD disc loaded
> > > on that node. Each file is copied if it does not exist on the HDD (PVFS
> > > area) and the copy is immediately verified:
> > >
> > > cp /dvd/file1 /mnt/pvfs2/file1
> > > cmp /dvd/file1 /mnt/pvfs2/file1
> > >
> > > `cmp` does not report any error. This has been done for 60-70 DVDs.

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > > If I insert a DVD that has previously been copied, my script finds that
> > > a file exists in the PVFS area and does a `cmp` with the DVD file; if the
> > > file fails this comparison, the file is deleted, copied, and verified
> > > (cmp).
> > >
> > > I notice that, frequently and randomly, the previously copied files will
> > > fail the _initial_ `cmp` check if more than one node is 'active', i.e.
> > > processing a DVD. Once deleted and copied, the second `cmp` check is
> > > passed.

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > > Some details:
> > >
> > > The files do not fail the `cmp` check immediately after being copied -
> > > only when checking a previously copied file.
> > > The `cmp` result indicates a different byte at which the files differ.
> > > Re-inserting the same DVD several times results in different files
> > > failing the first `cmp` check.
> > > The second check (immediately after the copy is finished) is always
> > > passed.
> > > This occurs rarely, if at all (i.e. I haven't noticed it), when only
> > > one node is processing a DVD.
> > > This only occurs with binary files - which are relatively large
> > > (200MB - 2GB).
> > > This never occurs with text files - which are also small (100s of KB).
> > > The pvfs2-client.log file is empty on each node.
> > > I have tried using diff and experience the same results.

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > > This is similar to an error I was seeing in PVFS 1.5.1 - hence the
> > > upgrade. I've also changed my previous script, which `dd` copied the
> > > DVD to memory (approx 8GB), then wrote this ISO file to the PVFS2 area
> > > - this worked fine for initial copies, but failed for re-copies. At
> > > that time I wasn't verifying the copy, so it was the copy to the
> > > PVFS2 area that failed.....

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > > Finally, on one occasion when manually running `cmp` on a file I
> > > noticed the following sequence:
> > >
> > > cmp file1 file2 (pass)
> > > cmp file1 file2 (pass)
> > > diff file1 file2 (fail)
> > > cmp file1 file2 (fail)

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > > Is this known behavior with a known workaround/configuration setting?
> > > The behavior I see made me guess a caching or network issue (there are
> > > no other machines on the cluster network).
> > > Can anyone suggest PVFS configuration settings that will make PVFS more
> > > robust?
> > >
> > > I'm not a programmer or Linux guru - I just spent this summer
> > > converting from WinXP...
> > > I'm happy to explore some possible fixes, but don't assume too much :)


>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > > Thanks in advance

>

> >

>

> > >

>

> >

>

> > > Mark

>

> >

>

> > >

>

> >

>

> > > _______________________________________________

>

> >

>

> > >

>

> >

>

> > > Pvfs2-users mailing list

>

> >

>

> > >

>

> >

>

> > > [email protected]

>

> >

>

> > >

>

> >

>

> > > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> > >

>

> >

>

> >

>

> >

>

>

>



_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
