Hello,

Some more info about the issue mentioned below:
I can now reproduce this problem consistently just by creating a file on specific machines. It seems to depend on whether the machine runs only the client or both the server and the client. My PVFS configuration is as follows:
running only client : deva02 and deva03
running both client and server : deva{04,11}

So if I create a file on any machine in the second group (running both client and server), it is not accessible from the first group: running ls on that file gives an "Invalid argument" error.
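
For anyone trying to reproduce this, here is a minimal sketch of a per-node check. It assumes Python is available on each node; the function name `probe` is hypothetical and not part of PVFS. It only calls `os.stat` on a path and reports the hostname and the errno name, which makes it easy to tell "Invalid argument" (EINVAL) apart from "No such file or directory" (ENOENT) when comparing nodes.

```python
import errno
import os
import socket
import sys


def probe(path):
    """Stat `path` and return (hostname, ok, errno_name_or_None).

    `probe` is a hypothetical helper for illustration only.
    """
    host = socket.gethostname()
    try:
        os.stat(path)
    except OSError as exc:
        # errno.errorcode maps numeric codes to names such as "EINVAL" or "ENOENT"
        return host, False, errno.errorcode.get(exc.errno, str(exc.errno))
    return host, True, None


if __name__ == "__main__":
    # Usage: run on each node with the same path(s) and compare the output lines
    for p in sys.argv[1:]:
        print(*probe(p))
```

Running the same script with the same path on a node from each group should show whether the failure follows the client-only nodes or the client-plus-server nodes.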

Guys, any clue what's happening?

Thanks
Vikrant

Hi,

This is the layout of the file system:
11 clients on deva{02-11}
8 servers on deva{04-11}, each node has four cores.

I created this file on deva02 and tried to look for it on deva04. This is the error I get:

[EMAIL PROTECTED]:/mnt/pvfs2/vsk/fl5l2$ ls test.jou
ls: test.jou: Invalid argument

On deva02 it lists the file correctly. I have waited much longer than 30 seconds for this (many minutes, and now days). This does not always happen, and usually things work fine. I'm not sure what particular steps lead to this situation. I had to dig into the console history to get this output.

This is with proprietary code, but I will try to send you some sample MPI code that shows a similar problem soon. We have used this code successfully with a previous installation of pvfs2-1.5, so it looks like an installation issue or a bug in the current release. Would the config files and configure options for this installation help you identify whether it's an installation issue?

Thanks
Vikrant

Sam Lang wrote:

Hi Vikrant,

Along with the MPI code, if you could send us the output of your shell commands and the errors you see, that would also be helpful in debugging.

Thanks,

-sam

On Dec 20, 2006, at 11:44 AM, Robert Latham wrote:

On Wed, Dec 20, 2006 at 06:55:22PM +0530, Vikrant Kumar wrote:
With MPI applications it fails at certain times in MPI_File_open on some
nodes, which again looks similar to the above problem.

Can you guys suggest how to isolate the problem?
Let me know what information you require.

Oh, one more thing that would help is if you can send us the MPI code
you are using.  If we can reproduce the problem on our end, that will
make debugging and fixing a lot easier.

==rob

--Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users




