On Wed, Sep 18, 2013 at 12:55 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Mon, Sep 16, 2013 at 6:59 AM, Heikki Linnakangas <hlinnakan...@vmware.com> wrote:
>
>> Here's a rebased version of the patch, including the above-mentioned
>> fixes. Nothing else new.
>
> I've applied this to 0892ecbc015930d, the last commit to which it applies
> cleanly.
>
> When I test this by repeatedly incrementing a counter in a randomly chosen
> row, then querying the whole table and comparing the results to what my
> driver knows they should be, I get discrepancies.
>
> No crash/recovery needs to be done to get the behavior.
>
> The number of rows is correct, so one version of every row is visible, but
> it is sometimes the wrong version.
>
> The discrepancy arises shortly after the first time this type of message
> appears:
>
> 6930 UPDATE 2013-09-18 12:36:34.519 PDT:LOG: started new XID range, XIDs 1000033-, MultiXIDs 1-, tentative LSN 0/FA517F8
> 6930 UPDATE 2013-09-18 12:36:34.519 PDT:STATEMENT: update foo set count=count+1 where index=$1
> 6928 UPDATE 2013-09-18 12:36:34.521 PDT:LOG: closed old XID range at 1000193 (LSN 0/FA58A08)
> 6928 UPDATE 2013-09-18 12:36:34.521 PDT:STATEMENT: update foo set count=count+1 where index=$1
>
> I'll work on getting the driver to shutdown the database the first time it
> finds a problem so that autovac doesn't destroy evidence.

I have uploaded the script to reproduce the problem, and a tarball of the data directory. (When started, it will go through recovery. Table "foo" is in the jjanes database and role.)

https://drive.google.com/folderview?id=0Bzqrh1SO9FcEek51NGEzRmFDVEE&usp=sharing

The row with index=8499 should have a count of 8, but really has a count of 4, and can only be found by seq scan; an index scan finds no such row.

select ctid,* from foo where index=8499;
select ctid,* from foo where index+0=8499;

select * from heap_page_items(get_raw_page('foo',37)) where lp=248 \x\g\x
Expanded display is on.
-[ RECORD 1 ]---------
lp          | 248
lp_off      | 8160
lp_flags    | 1
lp_len      | 32
t_xmin      | 2
t_xmax      | 0
t_field3    | 0
t_ctid      | (37,248)
t_infomask2 | 32770
t_infomask  | 10496
t_hoff      | 24
t_bits      |
t_oid       |

So the xmax is 0 when it really should not be.

What I really want to do is find the not-visible ctids which would have 8499 for index, but I don't know how to do that.

Cheers,

Jeff
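
For anyone reading the t_infomask and t_infomask2 values in the heap_page_items output, here is a rough sketch of how they could be decoded. It assumes the flag bit values from stock PostgreSQL's htup_details.h (9.3 era); the patch under test may assign or interpret some bits differently, so treat the result as a hint only. The function names decode_infomask and decode_infomask2 are made up for this illustration.

```python
# Flag bits as defined in stock PostgreSQL 9.3's src/include/access/htup_details.h.
# Assumption: the patched server still uses these values.
INFOMASK_FLAGS = {
    0x0001: "HEAP_HASNULL",
    0x0002: "HEAP_HASVARWIDTH",
    0x0004: "HEAP_HASEXTERNAL",
    0x0008: "HEAP_HASOID",
    0x0010: "HEAP_XMAX_KEYSHR_LOCK",
    0x0020: "HEAP_COMBOCID",
    0x0040: "HEAP_XMAX_EXCL_LOCK",
    0x0080: "HEAP_XMAX_LOCK_ONLY",
    0x0100: "HEAP_XMIN_COMMITTED",
    0x0200: "HEAP_XMIN_INVALID",
    0x0400: "HEAP_XMAX_COMMITTED",
    0x0800: "HEAP_XMAX_INVALID",
    0x1000: "HEAP_XMAX_IS_MULTI",
    0x2000: "HEAP_UPDATED",
    0x4000: "HEAP_MOVED_OFF",
    0x8000: "HEAP_MOVED_IN",
}

INFOMASK2_FLAGS = {
    0x2000: "HEAP_KEYS_UPDATED",
    0x4000: "HEAP_HOT_UPDATED",
    0x8000: "HEAP_ONLY_TUPLE",
}

# The low 11 bits of t_infomask2 hold the number of attributes.
HEAP_NATTS_MASK = 0x07FF


def decode_infomask(value):
    """Return the names of the flags set in t_infomask, lowest bit first."""
    return [name for bit, name in sorted(INFOMASK_FLAGS.items()) if value & bit]


def decode_infomask2(value):
    """Return (attribute count, flag names) for t_infomask2."""
    flags = [name for bit, name in sorted(INFOMASK2_FLAGS.items()) if value & bit]
    return value & HEAP_NATTS_MASK, flags


if __name__ == "__main__":
    # Values taken from the heap_page_items record for lp=248 above.
    print(decode_infomask(10496))
    print(decode_infomask2(32770))
```

Under those assumptions, 10496 decodes to HEAP_XMIN_COMMITTED, HEAP_XMAX_INVALID and HEAP_UPDATED, and 32770 decodes to two attributes plus HEAP_ONLY_TUPLE. A heap-only tuple has no index entry of its own and is reached through its HOT chain, which may be related to the row being visible to a seq scan but not to an index scan.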