On Fri, 25 Apr 2008, Dirk-Willem van Gulik wrote:

-       any religion on how we serialize tables throughout ?

        ->   mod_disk_cache -- while there is some binary
                we are fairly careful to write most things
                out with 'key' ':' 'value' CR LF (and sort of
                hope the key never has a ':').

        ->   mod_memcached and lots of others do
                        'key' \0 'value' \0
                        ...
                        '\0'

My mod_disk_cache jumbopatch changes the on-disk headers to key\0value\0 style with great success, i.e. it works fine with the lockless read-while-caching design I hacked up. The original (as in httpd proper) has lots of issues: it is unnecessarily inefficient both to store and to recall, and it carries no size information, so you can't easily decide whether the on-disk file is incomplete or corrupted.
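
As a rough illustration of the key\0value\0 layout discussed here (with a lone \0 as end marker), a minimal sketch follows. It is not code from the patch: kv_pair, kv_serialize and kv_count are hypothetical names, and real httpd code would walk an apr_table_t instead of a plain array. Note that an empty key cannot be stored, since it doubles as the end marker.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pair type for illustration; httpd would use apr_table_t. */
typedef struct {
    const char *key;
    const char *val;
} kv_pair;

/* Serialize n pairs as key\0value\0 ... followed by one final \0.
 * Always returns the number of bytes required; the buffer is only
 * written when it is non-NULL and large enough. */
size_t kv_serialize(const kv_pair *pairs, size_t n, char *buf, size_t buflen)
{
    size_t need = 1;            /* the terminating empty key */
    size_t off = 0;
    size_t i;

    for (i = 0; i < n; i++)
        need += strlen(pairs[i].key) + 1 + strlen(pairs[i].val) + 1;
    if (buf == NULL || buflen < need)
        return need;

    for (i = 0; i < n; i++) {
        size_t klen = strlen(pairs[i].key) + 1;   /* include the \0 */
        size_t vlen = strlen(pairs[i].val) + 1;
        memcpy(buf + off, pairs[i].key, klen);
        off += klen;
        memcpy(buf + off, pairs[i].val, vlen);
        off += vlen;
    }
    buf[off] = '\0';
    return need;
}

/* Walk a serialized buffer. Returns the number of pairs, or -1 if the
 * data ends before the end marker (truncated or corrupted). */
int kv_count(const char *buf, size_t len)
{
    size_t off = 0;
    int count = 0;

    while (off < len) {
        const char *end;

        if (buf[off] == '\0')
            return count;                       /* empty key: end marker */
        end = memchr(buf + off, '\0', len - off);
        if (end == NULL)
            return -1;                          /* unterminated key */
        off = (size_t)(end - buf) + 1;
        end = memchr(buf + off, '\0', len - off);
        if (end == NULL)
            return -1;                          /* unterminated value */
        off = (size_t)(end - buf) + 1;
        count++;
    }
    return -1;                                  /* no end marker seen */
}
```

Calling kv_serialize() with a NULL buffer first gives the size to allocate; storing that size alongside the data is one way to notice a short read before parsing.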

For reference, my patch is currently being tracked at https://issues.apache.org/bugzilla/show_bug.cgi?id=39380. It's a bit out of date, but since 2.2.8 is unusable due to the 32bit-lfs-brigade-brokenness it can wait until 2.2.9. I haven't gotten around to creating a real in-tree fork of mod_disk_cache to accommodate it yet, but some if not all of the ideas in there should be usable for a wider audience than mostly-large-file archive sites. Anyhow, feel free to peek at it if you want to see code backing my ramblings ;)

And the latter makes it a superset of array serialization. I am tempted
        to go for the latter - and let it long term migrate into apr.

-       it is useful to store things like timestamps, expiredates - we now
        do this 'raw' in most modules.

        ->   wrap those in htonl/htons()

        ->   serialize the integers to ASCII.

This seems unnecessary to me. The on-disk info is only meant to be read by that specific machine IMHO. It should be tuned to be easy to load/store; even though it's tempting to spend cycles on making it universally portable, I suspect it's a bad idea in the long run.
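
For comparison, the two portable options quoted above might look roughly like the sketch below. It assumes a 64-bit timestamp (as apr_time_t is); htonl() only covers 32 bits, so the byte-order variant spells out the equivalent shifts by hand. All function names are hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Option 1: store in fixed big-endian ("network") byte order,
 * i.e. what htonl() does for 32-bit values, extended to 64 bits. */
void put_be64(unsigned char out[8], uint64_t v)
{
    int i;

    for (i = 7; i >= 0; i--) {
        out[i] = (unsigned char)(v & 0xff);   /* lowest byte goes last */
        v >>= 8;
    }
}

uint64_t get_be64(const unsigned char in[8])
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return v;
}

/* Option 2: store as decimal ASCII.
 * Returns the number of characters written, or -1 if buf is too small. */
int put_ascii64(char *buf, size_t buflen, int64_t v)
{
    int n = snprintf(buf, buflen, "%" PRId64, v);

    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}

int64_t get_ascii64(const char *buf)
{
    return (int64_t)strtoll(buf, NULL, 10);
}
```

Either way every load and store pays for a conversion, which is the cost being weighed against simply writing the native value.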

The only thing that I think can be of interest is to make it handle 32/64bit transitions gracefully. I've achieved this in my mod_disk_cache jumbopatch by using the APR 32/64bit types explicitly instead of types that might vary in size.
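
To make the fixed-width idea concrete, here is a minimal sketch of a header whose layout is the same on 32-bit and 64-bit builds, plus the kind of size check that stored lengths make possible. It is not the layout from the patch: disk_hdr, its fields and the single-file layout are assumptions for illustration, and the stdint.h types stand in for apr_uint32_t/apr_int64_t so the example compiles without APR.

```c
#include <stdint.h>

/* Hypothetical on-disk header. Every field has an explicit width, so
 * sizeof(disk_hdr) does not change between 32-bit and 64-bit builds
 * the way a struct containing long, size_t or time_t could. */
typedef struct {
    uint32_t format;      /* on-disk format version */
    uint32_t header_len;  /* bytes of key\0value\0 data that follow */
    int64_t  expire;      /* microseconds since the epoch */
    int64_t  body_len;    /* expected size of the cached body */
} disk_hdr;

/* Returns 1 if a file of file_size bytes holds exactly the fixed header,
 * the serialized headers and the body; 0 if it is short, too long or
 * the header is nonsensical. Assumes all three live in one file. */
int hdr_is_complete(const disk_hdr *h, int64_t file_size)
{
    int64_t fixed = (int64_t)sizeof(*h) + (int64_t)h->header_len;

    if (h->body_len < 0 || h->body_len > INT64_MAX - fixed)
        return 0;                   /* corrupt length, avoid overflow */
    return file_size == fixed + h->body_len;
}
```

A reader that sees hdr_is_complete() fail can treat the entry as still being written or as damaged, rather than guessing from the content.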

or is that generally felt as over the top ? I am tempted to do so - as the cost is very low - and it does help with distributed cases. But those who care the most about that are also the most likely to be careful about using
        the same endian/operating system throughout.

I'm not convinced this is useful, but then I'm rather biased due to our usecase of mod_disk_cache ;)


/Nikke - at least managed to fill the 4 gigabits available to the
         computer club during the Ubuntu release using
         donated five-year-old leftover servers ;)
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |     [EMAIL PROTECTED]
---------------------------------------------------------------------------
 "Read my lips and come to grips with reality." --Jafar
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
