Hi,

I looked a bit around the implementation of the data sync mode. Currently PINT_flow_setinfo is called, which sets the sync mode for each write operation of a flow. That means if 100 MByte are transferred in 256 KByte blocks, a sync happens for every block, which ends up being quite a lot of syncs.
Maybe it would be nice if the client could specify in the I/O request (PVFS_servreq_io) whether the data should be synced, instead of setting it per filesystem. Maybe the kernel interface could take advantage of this to save sync operations, or it could be useful elsewhere? Of course, this value could be filled by default from the filesystem's TroveSyncData option. In MPI there is the explicit sync via MPI_File_sync; maybe we could rely on this for MPI apps?

Independent of these questions, Rob mentioned that the sync policy maybe should be changed, too. For example, the data could be synced only at the end of the flow, and data syncs could be coalesced the way the metadata syncs are. I think the coalescing of operations should perhaps be handled by the Trove module, because it knows which coalescing method is best for its implementation; or should this be handled by an upper layer (e.g. job)? In case an I/O scheduler gets added to the Trove layer, maybe small write requests could be combined like in ROMIO. The policy might also depend on the server's I/O load and pending I/O jobs.

I will take care of the modifications and evaluate possible policies if nobody else is currently working on these issues.

Thanks,
Julian

_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
