On Jul 18, 2007, at 6:19 AM, Florin Isaila wrote:

Hi everybody,

Many thanks for your feedback.

Dries, PVFS2 was not configured for native IB, but used TCP/IP over IB.
The access pattern of each process is nested strided, but ROMIO's
two-phase collective I/O accesses contiguous portions of the files, so
that at the file-system level the accesses are contiguous. We do not
have access to the kernel logs. Thank you for your program, we will
try it.
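
To make the pattern concrete, here is a minimal MPI-IO sketch (not BTIO
itself; it uses a simpler block-of-columns decomposition of a 2-D array
rather than BTIO's nested-strided 3-D decomposition, and the sizes and
file name are invented). The collective write is where ROMIO's two-phase
optimization merges the interleaved per-process requests into large
contiguous file accesses:

    #include <mpi.h>
    #include <stdlib.h>

    /* Sketch only: each process owns a block of columns of a 2-D global
     * array, which gives it a strided file view.  The collective
     * MPI_File_write_all lets ROMIO's two-phase optimization merge the
     * interleaved requests into contiguous file accesses.  Assumes the
     * number of processes divides N. */
    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, nprocs;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        enum { N = 1024 };                          /* global N x N array of doubles */
        int gsizes[2] = { N, N };
        int lsizes[2] = { N, N / nprocs };          /* all rows, a block of columns */
        int starts[2] = { 0, rank * (N / nprocs) };

        MPI_Datatype filetype;
        MPI_Type_create_subarray(2, gsizes, lsizes, starts,
                                 MPI_ORDER_C, MPI_DOUBLE, &filetype);
        MPI_Type_commit(&filetype);

        double *buf = malloc((size_t)lsizes[0] * lsizes[1] * sizeof(*buf));
        /* ... fill buf with this rank's portion of the array ... */

        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "pvfs2:/mnt/pvfs2/strided.out",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
        MPI_File_set_view(fh, 0, MPI_DOUBLE, filetype, "native", MPI_INFO_NULL);

        /* Collective write: this is where two-phase I/O kicks in. */
        MPI_File_write_all(fh, buf, lsizes[0] * lsizes[1], MPI_DOUBLE,
                           MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Type_free(&filetype);
        free(buf);
        MPI_Finalize();
        return 0;
    }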

Murali, these results are not for the ppc64 cluster you connected to,
but for an Intel Xeon dual-core 64-bit cluster.

Julian, we have used local scratch space for each PVFS2 server. We have
just used the default PVFS2 block size so far.

Sam, when you say that "IO operations in PVFS don't require updates at
the metadata servers at all", do you mean the individual write
operations?

Yes.

I do not fully understand the TroveSyncMeta attribute.
It should control when metadata is committed to disk, right?

Yes, you're right. The backend for our metadata uses Berkeley DB, which provides an in-memory cache. So updates to the DB don't necessarily get written to disk unless a sync of the DB is done. The TroveSyncMeta option essentially enables that sync to be done for each update. Disabling the option has the drawback that server failures may cause loss of data.
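
For reference, that option sits in the <StorageHints> section of the
server config file. A rough sketch of the setting being discussed (exact
syntax and defaults may vary between PVFS2 versions):

    <StorageHints>
        # "yes": sync the Berkeley DB metadata to disk on every update (safer, slower)
        # "no":  rely on the DB cache; a server crash can lose recent metadata updates
        TroveSyncMeta yes
    </StorageHints>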

When is a metadata transaction triggered, and what is the difference
between the yes/no values?

We don't store the entire file size in the file's metadata, so each IO operation (a read or a write) doesn't require an update to the metadata. This is why I don't think your slow write performance is related to metadata updates.

To answer your question, metadata operations occur for pretty much anything that isn't an IO operation.

-sam


We will keep looking into it.

Best regards
Florin


On 7/18/07, Murali Vilayannur <[EMAIL PROTECTED]> wrote:
Sam,
> In the <StorageHints> context:
>
> TroveMethod alt-aio

Ah! thanks!
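
(For anyone reading along in the archive: I believe that hint goes in
the same <StorageHints> section of the server config, roughly as below;
whether alt-aio is available depends on how PVFS2 was built.)

    <StorageHints>
        TroveMethod alt-aio
    </StorageHints>
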
>
> Unless the data he's just written is sitting in the kernel buffers, I
> would expect reads to have the same problem as writes if aio is the
> cause.  What makes you suspect AIO libraries for his platform?

Oh, I didn't realize his reads were better. I just jumped to a conclusion
because Florin gave me access to his cluster machine, and when I set it up
I ran configure with --disable-aio-threaded-callbacks, since without that
his pvfs2 setup just sat there and did nothing. :)
No I/Os were being completed.
If it is the same cluster he is talking about here, then I assumed it was
most likely due to that.
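
(For the record, the build on that machine was configured roughly like
the following; the install prefix is just a placeholder.)

    ./configure --disable-aio-threaded-callbacks --prefix=/usr/local/pvfs2
    make && make install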

>
>
> Do you mean configure is disabling threaded callbacks for his build,
> or that we should ask it to? AIO results we've seen without threaded
> callbacks are worse than with them.

Without disabling it, his setup does not even work. It is a ppc64-based
cluster with a fairly ancient glibc, if I am not mistaken.
Yeah, that's what I thought too.

> :-)  I hear you.  Are you running over ext3 Murali?  I've seen
> results that suggest xfs might be better for large IOs and multiple
> threads.

On my home machine, yes.
On my laptop, no. I effectively run over NTFS, since my "virtual disk
files" are hosted on NTFS ;)
XFS rocks for such workloads indeed.
Thanks,
Murali
>
> -sam
>
> > Thanks,
> > Murali
> >
> > On 7/17/07, Florin Isaila <[EMAIL PROTECTED]> wrote:
> >> Hi Sam, we start the pvfs2 servers on different machines than the
> >> compute nodes (picking the nodes from the list provided by the batch
> >> system). Was that your question?
> >>
> >> And I should have said, all the measurements are done with collective
> >> I/O of ROMIO.
> >>
> >> On 7/17/07, Sam Lang <[EMAIL PROTECTED]> wrote:
> >> >
> >> > Ah, I read your email wrong. Hmm...so writes really tank. Are you
> >> > using the storage nodes as servers, or other compute nodes?
> >> >
> >> > -sam
> >> >
> >> > On Jul 17, 2007, at 11:15 AM, Sam Lang wrote:
> >> >
> >> > >
> >> > > Hi Florin,
> >> > >
> >> > > Just one clarification question...are those bandwidth numbers,
> >> > > not seconds as the plot label suggests?
> >> > >
> >> > > -sam
> >> > >
> >> > > On Jul 17, 2007, at 11:03 AM, Florin Isaila wrote:
> >> > >
> >> > >> Hi everybody,
> >> > >>
> >> > >> I have a question about the PVFS2 write performance.
> >> > >>
> >> > >> We did some measurements with BTIO over PVFS2 on Lonestar at TACC
> >> > >> (http://www.tacc.utexas.edu/services/userguides/lonestar/)
> >> > >>
> >> > >> and we get pretty bad write results with classes B and C:
> >> > >>
> >> > >> http://www.arcos.inf.uc3m.es/~florin/btio.htm
> >> > >>
> >> > >> We used 16 I/O servers, the default configuration parameters,
> >> > >> and up to 100 processes. We realized that all I/O servers were
> >> > >> also used as metadata servers, but BTIO uses just one file.
> >> > >>
> >> > >> The times are in seconds, contain only I/O time (no compute time),
> >> > >> and are aggregated per BTIO run (BTIO performs several writes).
> >> > >>
> >> > >> TroveSyncMeta was set to yes (by default). Could this cause the
> >> > >> I/O to be serialized? It looks as if there were some serialization.
> >> > >>
> >> > >> Or could the fact that all nodes were also launched as metadata
> >> > >> managers affect the performance?
> >> > >>
> >> > >> Any clue why this happens?
> >> > >>
> >> > >> Many thanks
> >> > >> Florin
> >> > >>
> >> > >
> >> >
> >> >
> >>
> >
>
>
