I understand this is the least of our worries right now, but I'd like to
get the indexer cache working so I can get karma for PhD. :-D

The main troubles with serialize are that it 1. wastes space by not
preserving internal references and 2. requires lots of memory. I recall
SQLite being proposed as a possible solution, but since the entire index
is loaded into memory anyway I don't think that's necessary: all we need
is an easily parseable file format.

So, we have a file with one ID per line, and the fields of each entry
separated by tabs (neither newlines nor tabs should ever occur in the
fields; if they do, I'll use a control character or something). children
is collapsed into a list of IDs.
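
To make the format concrete, here is a rough sketch of what the writer
could look like. The field names (filename, title), the comma used to
join child IDs, and the function name write_index are all placeholders
I made up for illustration, not the indexer's actual layout. It also
assumes children is an array keyed by child ID.

```php
<?php
// Hypothetical entry layout (an assumption, not the real indexer's):
//   $IDs[$id] = array('filename' => ..., 'title' => ...,
//                     'children' => array($childId => ..., ...));
function write_index(array $IDs, $path)
{
    $lines = array();
    foreach ($IDs as $id => $entry) {
        // Collapse children into a comma-separated list of their IDs.
        // The comma is an assumed separator that never appears in an ID.
        $children = implode(',', array_keys($entry['children']));
        // One entry per line, fields separated by tabs.
        $lines[] = implode("\t", array(
            $id,
            $entry['filename'],
            $entry['title'],
            $children,
        ));
    }
    // Returns the number of bytes written, or false on failure.
    return file_put_contents($path, implode("\n", $lines) . "\n");
}
```

An entry without children simply ends in an empty last field.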

Reassembling is as simple as parsing the file line by line, constructing
the ID array from the fields. Then the children are reconstituted by
running through $IDs a second time, replacing each child ID with a
reference to the appropriate entry.
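
Roughly, the two passes might look like the sketch below. Again, the
field names (filename, title), the comma-separated child list, and the
function name read_index are assumptions of mine for the sake of the
example; it expects exactly four tab-separated fields per line and does
no error handling.

```php
<?php
// Assumed line layout: id <TAB> filename <TAB> title <TAB> child1,child2
function read_index($path)
{
    $IDs = array();

    // First pass: build one entry per line; children is still a plain
    // list of ID strings at this point.
    $flags = FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES;
    foreach (file($path, $flags) as $line) {
        list($id, $filename, $title, $children) = explode("\t", $line);
        $IDs[$id] = array(
            'filename' => $filename,
            'title'    => $title,
            'children' => $children === '' ? array() : explode(',', $children),
        );
    }

    // Second pass: replace each child ID with a reference to its entry.
    // Note: an ID that is missing from the file ends up as a null entry.
    foreach ($IDs as $id => $entry) {
        $resolved = array();
        foreach ($entry['children'] as $childId) {
            $resolved[$childId] = &$IDs[$childId];
        }
        $IDs[$id]['children'] = $resolved;
    }

    return $IDs;
}
```

Because the children are references rather than copies, each entry is
stored only once no matter how many parents point at it, which is the
space saving serialize was not giving us.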

Thoughts?

-- 
 Edward Z. Yang                        GnuPG: 0x869C48DA
 HTML Purifier <http://htmlpurifier.org> Anti-XSS Filter
 [[ 3FA8 E9A9 7385 B691 A6FC B3CB A933 BE7D 869C 48DA ]]
