I'm not familiar with the callback structures in dbpf (is there a
particular example I should look at?), but I agree in principle w/ what
you are saying.
Adding a dbpf-null-aio.c as an alternative to dbpf-alt-aio.c just looked
like the quickest way to drop an alternative in there that didn't modify
any existing code. There is no technical motivation for it to be done
at that level.
Is this something that we should reorganize after your O_DIRECT changes
go in, or is there already a better way available to plug in something
like this?
-Phil
Sam Lang wrote:
Hi Phil,
It's good to get this functionality into the code base -- we've had a
number of attempts at this sort of thing, but none of them got committed
to HEAD, and having it in there in whatever state is better than not. I
have a concern (and overall design gripe) with the use of AIO interfaces
for this sort of thing, when we already have callback structures in dbpf.
We now have two levels of indirection, with threads being created and
managed in both. Obviously, that's more code to manage in different
locations that do more or less the same thing, making it harder for
others to understand and augment.
In general I don't think the aio callback structures are needed at all,
but it's admittedly much easier to implement against those functions than the
dbpf ones, if only because of the disorderly op mgmt code in dbpf bstream.
I don't know what our long term plans will be for the trove code, but I
would vote for trying to move towards a simpler centralized location for
management of the IO threads and queues, and different callbacks for IO
impls. I've done a prototype of this for queue/thread management and
O_DIRECT, and I think it would clean things up quite a bit to go that
route.
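
To make the idea above concrete, here is a rough sketch of one shared queue that dispatches to a per-implementation callback table. It is not the prototype mentioned above and not dbpf code: every name in it (io_ops, io_op, io_queue, io_queue_post, io_queue_service, null_ops) is hypothetical, and the worker threads are left out, so io_queue_service() stands in for what a single pool of threads would do.

```c
#include <stddef.h>

/* Hypothetical per-implementation callback table (not dbpf's actual API). */
struct io_ops {
    const char *name;
    long (*write)(void *ctx, const void *buf, long offset, long len);
    long (*read)(void *ctx, void *buf, long offset, long len);
};

enum io_kind { IO_READ, IO_WRITE };

/* One queued I/O request; result is filled in when it is serviced. */
struct io_op {
    enum io_kind kind;
    void *buf;
    long offset;
    long len;
    long result;
};

#define IO_QUEUE_MAX 16

/* The single, central queue. Which I/O implementation is used is
 * decided only by the ops table it points at. */
struct io_queue {
    const struct io_ops *ops;
    void *ctx;
    struct io_op *pending[IO_QUEUE_MAX];
    size_t count;
};

/* Queue an operation; returns 0 on success, -1 if the queue is full. */
static int io_queue_post(struct io_queue *q, struct io_op *op)
{
    if (q->count == IO_QUEUE_MAX)
        return -1;
    q->pending[q->count++] = op;
    return 0;
}

/* Service everything queued, in FIFO order, through the callbacks.
 * Returns the number of operations serviced. */
static size_t io_queue_service(struct io_queue *q)
{
    size_t i, n = q->count;
    for (i = 0; i < n; i++) {
        struct io_op *op = q->pending[i];
        op->result = (op->kind == IO_WRITE)
            ? q->ops->write(q->ctx, op->buf, op->offset, op->len)
            : q->ops->read(q->ctx, op->buf, op->offset, op->len);
    }
    q->count = 0;
    return n;
}

/* A "null" implementation: claims success without doing any I/O. */
static long null_ops_write(void *ctx, const void *buf, long offset, long len)
{
    (void)ctx; (void)buf; (void)offset;
    return len;
}

static long null_ops_read(void *ctx, void *buf, long offset, long len)
{
    (void)ctx; (void)buf; (void)offset;
    return len;
}

static const struct io_ops null_ops = { "null", null_ops_write, null_ops_read };
```

With this shape, an O_DIRECT or null variant would only supply another io_ops table; the queue and thread management would stay in one place.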
-sam
On Apr 17, 2008, at 4:10 PM, Phil Carns wrote:
There is a new trove method available in trunk now called "null-aio".
It can be selected by putting "TroveMethod null-aio" in the
<StorageHints> section of the file system configuration file.
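
As a minimal fragment, the setting described above would look roughly like this. Only the section name and the "TroveMethod null-aio" line come from this message; any other directives already in that section are omitted here, and the surrounding layout of the configuration file is assumed to be unchanged.

```
<StorageHints>
    TroveMethod null-aio
</StorageHints>
```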
This is only useful for debugging purposes, because it deliberately
skips doing any file I/O on the server side. Please use with caution!
It does all metadata operations the same as any other method, but file
reads will return garbage and file writes are thrown away. Writing
beyond eof triggers a truncate to mimic the appropriate resulting
bstream size.
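
The read/write behavior just described can be sketched roughly as follows. This is an illustration only, not the code in dbpf-null-aio.c: the null_bstream struct and both functions are hypothetical, the truncate is modeled by updating a size field, and clamping reads at the logical end of file is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bstream: only its logical size is tracked. */
struct null_bstream {
    int64_t size;
};

/* Write: the payload is thrown away. If the write ends past the current
 * end of file, the logical size grows, which is where the real method
 * would issue a truncate. Returns the number of bytes "written". */
static int64_t null_bstream_write(struct null_bstream *bs, const void *buf,
                                  int64_t offset, int64_t len)
{
    (void)buf;
    assert(bs != NULL && offset >= 0 && len >= 0);
    if (offset + len > bs->size)
        bs->size = offset + len;
    return len;
}

/* Read: buf is never filled in, so the caller sees whatever was already
 * there (garbage). The byte count is clamped to the logical size; that
 * clamping is an assumption of this sketch. */
static int64_t null_bstream_read(const struct null_bstream *bs, void *buf,
                                 int64_t offset, int64_t len)
{
    int64_t avail;

    (void)buf;
    assert(bs != NULL && offset >= 0 && len >= 0);
    if (offset >= bs->size)
        return 0;
    avail = bs->size - offset;
    return len < avail ? len : avail;
}
```

The point of keeping the size accurate is that metadata (file size in particular) still looks right to clients even though no data is stored.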
This might be useful once in a while for narrowing down performance
problems between network and storage. It takes the storage out of the
loop and shows approximately what the network is capable of by itself.
Of course it will only work for benchmarks that don't verify data
correctness (or otherwise rely on data read off of PVFS).
We used to have a compile time option (--disable-disk-io) for this
same purpose, but that actually hasn't worked in a while. Nowadays
it's easier to just do this as a trove method that can be selected at
runtime without recompiling.
-Phil
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers