"Wesley W. Terpstra" <[EMAIL PROTECTED]> wrote in message [EMAIL PROTECTED]">news:[EMAIL PROTECTED]... > On Tue, Nov 19, 2002 at 10:38:27AM -0500, David Abrahams wrote: > > "Wesley W. Terpstra" <[EMAIL PROTECTED]> writes: > > > So... I am beginning to lean towards the "don't do that" approach where I > > > simply don't allow the user to call member methods on items in the > > > container. (And not let them take pointers) This allows at least the above > > > optimization and a few others (like *i = *j; -- no deserialize&serialize) > > > and probably more I don't forsee yet. > > > > I haven't been paying attention, but IIUC what you're proposing, these > > things are no longer conforming iterators. > > > > The way to make random access iterators over disk storage is to build > > an iterator which stores its value_type internally.

IMHO "stores a pointer to the value from the cache internally" would be better.

> > You can even
> > arrange for it to construct the value_type in its internal storage on
> > demand, so that it doesn't store anything until it is dereferenced.
>
> I assume you mean they are not iterators because operator -> is broken?
> Yes I agree.
> Aside from that however, I believe they do conform to iterators.
>
> What you are proposing however is flawed for several reasons.
>
> If I stored the value_type internally, this will break:
>
> map::iterator i = ...;
> map::reference x = *i;
> ++i;
> x = ...; // what is x now pointing at? the wrong record.

With the approach above, just the next record.

> Also, if you have two iterators pointing at the same thing, but keeping
> distinct value_types internally, expressions like:
> i->set_member_a(j->set_member_b(3) + 2);
> will break -- only one of the changes will make it to disk.
>
> ---
>
> I know that this could be solved with some sort of:
>
> struct Address
> {
>     sectorptr_t sector;
>     sectorlen_t record;
> };
>
> struct Object
> {
>     Observable observable;
>     T object;
> };
>
> std::map<Address, Object> which I keep in for each database.

Isn't that a cache? Looks familiar :)

> Then, every time you want to dereference an iterator, you lookup the address
> in the table (deserializing if necessary), reference the observable and
> return the object.
>
> When the observable is not_observed, you remove the Object from the table
> and reserialize to disk.
>
> The whole question revolves around:
> is the overhead of such a table justified by the benefit of allowing
> member methods to be called on objects within the container.

No overhead. Rather, you have overhead from constant deserializing.
Just make a list of all possible operations on an object and you will
understand (I hope :) ) that a cache class not only speeds up your
serialization/deserialization but also solves the problem of
"pointer <-> object on disk" identity.
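
To make the identity point concrete, here is a minimal sketch of such an
object cache. It is only an illustration under stated assumptions: the
names ObjectCache, Record and identity_demo are hypothetical, Address
uses plain ints in place of sectorptr_t/sectorlen_t, a std::map stands in
for the real disk, and an explicit flush() replaces the
Observable/not_observed eviction from the quoted design.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

// Hypothetical stand-in for the quoted Address (ints instead of
// sectorptr_t / sectorlen_t).
struct Address {
    int sector;
    int record;
    bool operator<(const Address& o) const {
        return std::tie(sector, record) < std::tie(o.sector, o.record);
    }
};

// Hypothetical record type; a real one would be deserialized from bytes.
struct Record {
    std::string name;
    int a = 0;
    int b = 0;
};

class ObjectCache {
public:
    // The "disk" is simulated by a map so the sketch is self-contained.
    std::map<Address, Record> disk;
    int deserialize_count = 0;

    // Return the single live object for this address, loading it on
    // first use. std::map never moves its elements, so the reference
    // stays valid until flush().
    Record& get(const Address& addr) {
        auto it = live_.find(addr);
        if (it == live_.end()) {
            ++deserialize_count;  // one "deserialize" per address
            it = live_.emplace(addr, disk[addr]).first;
        }
        return it->second;
    }

    // Write every cached object back and drop it from the cache.
    void flush() {
        for (const auto& entry : live_) disk[entry.first] = entry.second;
        live_.clear();
    }

private:
    std::map<Address, Record> live_;
};

// Two "iterators" at the same address get the same object, so an
// expression like i->set_member_a(j->set_member_b(3) + 2) keeps both
// changes. Returns the record as it ends up on the simulated disk.
Record identity_demo() {
    ObjectCache cache;
    Address addr{1, 7};

    Record& i = cache.get(addr);
    Record& j = cache.get(addr);
    assert(&i == &j);                      // one object per address
    assert(cache.deserialize_count == 1);  // loaded only once

    j.b = 3;
    i.a = j.b + 2;
    cache.flush();
    return cache.disk[addr];
}
```

With this shape, dereferencing costs one map lookup, and the object is
deserialized once per address rather than once per access.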
A cache of pure data buffers is much simpler, but it is not needed once
you have an object cache.

> There are significant costs:
> the overhead of redundant cache
> (it is already cached at the sector level)
> the overhead of indexing the map
> (considerable if you are just deserializing an int)

In my practice, storing an int is 1% of the cases and storing a complex
object is 99%. Constant serializing/deserializing is a poor idea for a
big object. But I see ... you are fighting for simplicity.

> My current answer is "not justified". But, I am open to persuasion,
> especially in the form of an optimized solution.

OK, let's go through the problems in order.

Example:

    // legacy code {
    class MyClass
    {
    public:
        string name;
    };

    void do_some_changes( MyClass & value )
    {
        value.name = "...";
        ...
    }
    // } legacy code

How are you going to load an object, call do_some_changes on it and
save it? Most probably:

    MyClass x = db[ 444 ];   // copy #1
    do_some_changes( x );
    db[ 444 ] = x;           // copy #2

Problem #1:
The user may want to write just:

    do_some_changes( db[444] ); // note! it can even compile on some compilers

This is wrong unless you put serialization in the destructor, but in
that case you serialize objects frequently, which is slow, and
serialization/deserialization can't be synchronized:

    construct instance1 -> deserialize1
    change instance1
    construct instance2 -> deserialize2
    change instance2
    destruct instance2
    destruct instance1

The changes from instance2 are lost, because instance1 is written back
last and overwrites them.

Problem #2:
Why do you think every object can be copied?

Problem #3:
Even if it can, why do you think that copying an object is cheap?

Problem #4:
The user has some template algorithm that works with a generic STL
container. The algorithm expects "pointer <-> disk buffer" identity when
performing const operations on your container, which is the norm for a
std container. I'm not sure whether the standard requires it, but it
seems logical to me.

-------------------

MHO: It would be better to implement the buffer (POD?) disk container
and the object disk container separately. It looks like they are pretty
different things.
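
A minimal sketch of Problem #1, in case the sequence is hard to follow.
It assumes a proxy that deserializes a private copy in its constructor
and serializes it back in its destructor. CopyProxy, Disk and
lost_update_demo are hypothetical names, the "disk" is just a std::map,
and the city field is added to MyClass only so that the two instances
change different members.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical record type; "city" is an extra field for illustration.
struct MyClass {
    std::string name;
    std::string city;
};

// A std::map stands in for real disk storage.
using Disk = std::map<int, MyClass>;

// Deserializes a private copy on construction and serializes it back
// in the destructor.
class CopyProxy {
public:
    CopyProxy(Disk& disk, int key)
        : disk_(disk), key_(key), value_(disk[key]) {}
    ~CopyProxy() { disk_[key_] = value_; }
    MyClass& get() { return value_; }

private:
    Disk& disk_;
    int key_;
    MyClass value_;
};

// Replays: construct 1, change 1, construct 2, change 2,
// destruct 2, destruct 1. Returns what ends up on the "disk".
MyClass lost_update_demo() {
    Disk disk;
    disk[444] = MyClass{"old", "old"};
    {
        CopyProxy instance1(disk, 444);
        instance1.get().name = "new name";

        CopyProxy instance2(disk, 444);
        instance2.get().city = "new city";
    }   // instance2 is destroyed first, then instance1 overwrites it

    assert(disk[444].city == "old");  // the change from instance2 is gone
    return disk[444];
}
```

Only the change made through instance1 survives; handing out one shared
object per key would avoid the overwrite.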

regards,
bohdan