Unfortunately, all the nasty varint stuff makes it hard to have a struct for things. The header tries to give a psuedo-code struct-ish description.
It may make sense going forward to start having some sort of centralized set of decode functions. Something like decodeInteriorNodeHeader(pData, nData, &iHeight, &nTerm, &pTerm, &nDoclist, &pDoclist). These could probably be small enough to act as a form of documentation. Something to think about. -scott On Thu, Sep 18, 2008 at 2:51 PM, Michael Nordman <[EMAIL PROTECTED]> wrote: > > > On Thu, Sep 18, 2008 at 2:39 PM, Scott Hess <[EMAIL PROTECTED]> wrote: >> >> I saw the LGTM, but I want to make sure my responses address your >> questions, >> first. > > Yes they do... thnx. > >> >> ======================================================================== >> >> http://mondrian.corp.google.com/file/8305435///depot/googleclient/gears/opensource/third_party/sqlite_google/ext/fts2/fts2.c?a=1 >> File >> //depot/googleclient/gears/opensource/third_party/sqlite_google/ext/fts2/fts2.c >> (snapshot 1) >> ------------------------------------ >> Line 5122: if( pRootData==NULL || nRootData<1 || pRootData[0]!='\0' ){ >> On Thu Sep 18 12:28:14 2008 PDT, michaeln wrote: >> > what is expected in the first byte(s) of pRootData, ditto for pLeafData >> below... >> > why is non-zero not valid... just curious? >> >> The node format (documented in the page-header comments) has the height of >> the >> node from the leaf level as byte 0. So leaf nodes always lead with 0, >> interior >> nodes always lead with non-0, and you can assert that you are descending >> the >> tree if you want (I don't think any current code does). >> >> I think it probably makes more sense to assume understanding of the >> storage >> formats, rather than have too many inline comments. > > Absolutely... just i don't have an understanding of the storage format is > why i asked :) > Something that could help is to define structs that codify the node data > formats and > to use those structs when reasoning about the contents of the node data. > >> >> It would be nicer if this >> wasn't such a big file. >> >> These checks could generally be made more specific, based on the other >> stuff >> which must be there. For instance, nRootData must be 5 or greater for a >> leaf >> node. But then each instance would be different, so I want to think about >> a >> better way to work that in than to have complicated thinking at each >> occurance >> of this kind of check. >> ------------------------------------ >> Line 5171: sqlite3_reset(s); >> On Thu Sep 18 12:37:11 2008 PDT, michaeln wrote: >> > why finalize vs reset here... just curious... oh i see, has todo with >> > this >> being >> > recursive... do we really need to reset here? >> >> The optimize() implementation re-purposed the LeavesReader code. The >> prepared-statement cache stores MERGE_COUNT variants for the idx parameter >> to >> sql_get_leaf_statement(). The cached statements will be destroyed in the >> vtable >> destructor. idx==-1 is used for cases beyond MERGE_COUNT, in which case >> the >> statement isn't cached, so it needs to be destroyed here. >> >> The sqlite3_reset() is because if we do a successful sqlite3_step() and >> get data >> and the data is corrupt, we need to reset the statement so that it doesn't >> hold >> the lock. The current code has only made this kind of change in fairly >> local >> circumstances, future changes are going to need to do a more thorough job >> of >> flushing things like that out. For Gears, SQLITE_CORRUPT has consequences >> such >> that database locks should clear reasonably quickly. >> ------------------------------------ >> Line 5176: pReader->pStmt = s; >> On Thu Sep 18 12:38:13 2008 PDT, michaeln wrote: >> > when idx != -1 is pReader->pStmt expected to be 's' already... just >> > curious? >> >> In this case, we're going to return SQLITE_OK, so pReader needs the >> statement >> for future work in all cases. It moved down here so that I don't leave a >> reference to a potentially-finalized statement in pReader for the other >> cases. >> ------------------------------------ >> Line 5202: return SQLITE_CORRUPT_BKPT; >> On Thu Sep 18 12:36:30 2008 PDT, michaeln wrote: >> > this function may read nicer w/o the else clause, but i see your doing >> > this to >> > be consistent with what you did in the function with the goto's in it >> >> Actually, I'm doing this because it's not C++, and I wanted pLeafData and >> nLeafData to be defined close to their use :-). > > (D)oh... C... right > >> >> ------------------------------------ >> Line 5295: sqlite3_reset(s); /* So we don't leave a lock. */ >> On Thu Sep 18 12:39:09 2008 PDT, michaeln wrote: >> > ah... a bug fix? >> >> Adding the corruption code called out that we could leave the statement in >> the >> middle of a query. This was true before if we saw SQLITE_CORRUPT from a >> sqlite3_* call, but now that I'm generating them things like this are >> popping >> up. >> >> Probably I need to think about a better way to handle cached prepared >> statements >> which systemically resolves this kind of problem. >> ======================================================================== >> >> -- >> To respond, reply to this email or visit >> http://mondrian.corp.google.com/8305435 > >
