On Fri, 25 Apr 2008, Dirk-Willem van Gulik wrote:
> - any religion on how we serialize tables throughout ?
>   -> mod_disk_cache -- while there is some binary
>      we are fairly careful to write most things
>      out with 'key' ':' 'value' CR LF (and sort of
>      hope the key never has a ':').
>   -> mod_memcached and lots of others do
>      'key' \0 'value' \0
>      ...
>      '\0'
My mod_disk_cache jumbopatch changes the on-disk headers to
key\0value\0-style with great success, i.e. it works fine with the
lockless read-while-caching design I hacked up. The original (as in
httpd proper) has lots of issues: mainly it is unnecessarily
inefficient both to store and to recall, and it carries no size
information, so you can't easily decide whether the on-disk file is
incomplete or corrupted.
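
As a rough illustration of the key\0value\0 idea with size information
(this is NOT the format the patch actually uses; struct kv, kv_pack,
kv_complete and kv_get are made-up names, and httpd would work on an
apr_table_t rather than a plain array), a sketch in plain C could be:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key/value pair, standing in for an apr_table_t entry. */
struct kv { const char *key; const char *val; };

/* Layout: uint32 payload length, then key\0value\0 ... and a final \0.
 * Returns the total number of bytes written, or 0 if buf is too small. */
size_t kv_pack(const struct kv *kvs, size_t n, unsigned char *buf, size_t cap)
{
    size_t off = sizeof(uint32_t);
    uint32_t len;
    size_t i;

    if (cap < off + 1) return 0;
    for (i = 0; i < n; i++) {
        size_t kl = strlen(kvs[i].key) + 1;
        size_t vl = strlen(kvs[i].val) + 1;
        assert(kl > 1);  /* an empty key would look like the terminator */
        if (cap - off < kl + vl + 1) return 0;
        memcpy(buf + off, kvs[i].key, kl); off += kl;
        memcpy(buf + off, kvs[i].val, vl); off += vl;
    }
    buf[off++] = '\0';
    len = (uint32_t)(off - sizeof(uint32_t));
    memcpy(buf, &len, sizeof len);  /* native byte order, same-host only */
    return off;
}

/* Returns 1 if 'have' bytes hold a whole block, 0 if truncated/corrupt. */
int kv_complete(const unsigned char *buf, size_t have)
{
    uint32_t len;

    if (have < sizeof len) return 0;
    memcpy(&len, buf, sizeof len);
    if (len == 0 || have - sizeof len < len) return 0;
    return buf[sizeof len + len - 1] == '\0';
}

/* Linear lookup; assumes the block was written by kv_pack() and has
 * already passed kv_complete(). Returns NULL if the key is absent. */
const char *kv_get(const unsigned char *buf, const char *key)
{
    const char *p = (const char *)buf + sizeof(uint32_t);

    while (*p != '\0') {
        const char *val = p + strlen(p) + 1;
        if (strcmp(p, key) == 0) return val;
        p = val + strlen(val) + 1;
    }
    return NULL;
}
```

The length prefix is what lets a reader tell a half-written file from
a finished one without parsing it, and neither keys nor values need
any escaping as long as they contain no \0.
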
For reference, my patch is currently being tracked at
https://issues.apache.org/bugzilla/show_bug.cgi?id=39380 - it's a bit
out of date but since 2.2.8 is unusable due to the
32bit-lfs-brigade-brokenness it can wait until 2.2.9. I haven't gotten
around to creating a real in-tree fork of mod_disk_cache to accommodate
it yet, but some if not all ideas in there should be usable for a wider
audience than mostly-large-file archive sites. Anyhow, feel free to
peek at it if you want to see code backing my ramblings ;)
> And the latter makes it a superset of array serialization. I am
> tempted to go for the latter - and let it long term migrate into apr.
> - it is useful to store things like timestamps, expiredates - we now
>   do this 'raw' in most modules.
>   -> wrap those in htonl/htons()
>   -> serialize the integers to ASCII.
This seems unnecessary to me. The on-disk info is only meant to be
read by that specific machine IMHO. It should be tuned to be easy to
load/store; even though it's tempting to spend cycles on making it
universally portable, I suspect it's a bad idea in the long run.
The only thing that I think can be of interest is to make it handle
32/64bit transitions gracefully. I've achieved this in my
mod_disk_cache jumbopatch by using the APR 32/64bit types explicitly
instead of types that might vary in size.
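
To illustrate the fixed-size-types point (a sketch only: struct
disk_hdr, its fields and HDR_FORMAT are invented here and do not match
the patch; in httpd the types would be apr_uint32_t/apr_int64_t rather
than the <stdint.h> ones used to keep this standalone):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HDR_FORMAT 1u  /* hypothetical format version tag */

/* Every field has an explicit width, and the 32-bit fields are paired
 * up so the 64-bit ones need no padding in front of them. */
struct disk_hdr {
    uint32_t format;     /* HDR_FORMAT */
    uint32_t name_len;   /* length of the cached URL */
    int64_t  expire;     /* microseconds since the epoch, like apr_time_t */
    int64_t  file_size;  /* body size, can exceed 4 GB */
};

/* Compile-time check: the build fails if the layout ever changes size. */
typedef char disk_hdr_size_check[sizeof(struct disk_hdr) == 24 ? 1 : -1];

/* Write the header in native byte order (same-machine use only). */
void hdr_store(unsigned char *out, const struct disk_hdr *h)
{
    assert(out != NULL && h != NULL);
    memcpy(out, h, sizeof *h);
}

/* Returns 1 on success, 0 if the data is too short or the wrong format. */
int hdr_load(struct disk_hdr *h, const unsigned char *in, size_t have)
{
    if (have < sizeof *h) return 0;
    memcpy(h, in, sizeof *h);
    return h->format == HDR_FORMAT;
}
```

With a layout like this, a cache written by a 32-bit build should
still be readable after switching to a 64-bit build on the same
machine, which would not hold if the struct used long, size_t or
time_t directly.
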
> or is that generally felt as over the top ? I am tempted to do so -
> as the cost is very low - and it does help with distributed cases.
> But those who care the most about that are also the most likely to
> be careful about using the same endian/operating system throughout.
I'm not convinced this is useful, but then I'm rather biased due to
our use case of mod_disk_cache ;)
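
For comparison, the htonl() wrapping from the quoted proposal could
look roughly like the sketch below. ts_put and ts_get are made-up
names, a 64-bit timestamp is assumed (as with apr_time_t), and since
htonl() only covers 32 bits the value is split into two halves:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 64-bit timestamp big-endian as two htonl()-converted halves. */
void ts_put(unsigned char out[8], int64_t t)
{
    uint64_t u = (uint64_t)t;
    uint32_t hi = htonl((uint32_t)(u >> 32));
    uint32_t lo = htonl((uint32_t)(u & 0xffffffffu));

    assert(out != NULL);
    memcpy(out, &hi, 4);
    memcpy(out + 4, &lo, 4);
}

/* Read the timestamp back, whatever the host byte order is. */
int64_t ts_get(const unsigned char in[8])
{
    uint32_t hi, lo;

    memcpy(&hi, in, 4);
    memcpy(&lo, in + 4, 4);
    return (int64_t)(((uint64_t)ntohl(hi) << 32) | ntohl(lo));
}
```

The per-field cost is a couple of byte swaps on little-endian hosts
and nothing on big-endian ones, but every field has to go through a
helper like this instead of one memcpy() of the whole header.
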
/Nikke - at least managed to fill the 4 gigabits available to the
computer club during the Ubuntu release using
donated five-year-old leftover servers ;)
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se | [EMAIL PROTECTED]
---------------------------------------------------------------------------
"Read my lips and come to grips with reality." --Jafar
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=