Hi,
I looked a bit around the implementation of the data sync mode.
Currently PINT_flow_setinfo is called, which sets the sync mode for each 
write operation of a flow. That means transferring 100 MByte in 256 KByte 
blocks triggers a sync per block, which ends up in quite a lot of syncs.

Maybe it would be nice if the client could specify in the I/O request 
(PVFS_servreq_io) whether the data should be synced, instead of setting it 
per file system. Maybe the kernel interface could benefit from this to save 
sync operations, or it could be useful elsewhere? Of course, this value could 
be filled by default with the file system's TroveSyncData option. 
In MPI there is an explicit sync via MPI_File_sync; maybe we could rely on 
that for MPI apps? 

Independent of these questions, Rob mentioned that the sync policy should 
maybe be changed, too, for example to sync the data only once at the end of 
the flow, and to coalesce data syncs the way metadata syncs are coalesced.
I think the coalescing of operations should maybe be handled by the Trove 
module, because it knows which coalescing method is best for the 
implementation; or should this be handled by an upper layer (e.g. job)?

In case an I/O scheduler is added to the Trove layer, maybe small write 
requests could be combined like in ROMIO. The policy might also depend on 
the server's I/O load and pending I/O jobs.

I will take care of the modifications and evaluate possible policies if 
nobody else is currently working on these issues.

Thanks,
Julian
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
