[HACKERS] Re: WAL and indexes (Re: [HACKERS] WAL status todo)

2000-10-17 Thread Mikheev, Vadim

 Just a confirmation.
 Do you also plan an overwriting storage manager for 7.2?

Yes, if I get enough time.

Vadim




RE: [HACKERS] Re: WAL and indexes (Re: [HACKERS] WAL status todo)

2000-10-17 Thread Mikheev, Vadim

 I'm still nervous about how we're going to test the WAL code 
 adequately for the lesser-used index types. Any ideas out there?

First, it seems we'll have to follow what you've proposed for
their redo/undo: log each *fact* of changing a page, so that we know
whether the update op was done entirely or not (and rebuild the index
if it was not), plus log information about where to find the tuple
pointing to the heap (for undo).

This is much easier to do than logging suitable for recovery.
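
A minimal sketch of the idea above, not the actual PostgreSQL code: every page change of an index update is logged as a bare fact, a final record marks the update as complete, and recovery flags any index with an incomplete update for rebuilding. The record layout and the names `IndexLogRecord`, `op_id`, `page_changed`, `op_done` and `indexes_to_rebuild` are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexLogRecord:
    index_name: str  # hypothetical: which index the record belongs to
    op_id: int       # hypothetical: id of one logical index update
    kind: str        # "page_changed" or "op_done"


def indexes_to_rebuild(log):
    """Return the names of indexes that have an update with no "op_done" record."""
    unfinished = set()
    for rec in log:
        key = (rec.index_name, rec.op_id)
        if rec.kind == "page_changed":
            unfinished.add(key)       # the update has started touching pages
        elif rec.kind == "op_done":
            unfinished.discard(key)   # the update was done entirely
    return {index_name for index_name, _ in unfinished}


log = [
    IndexLogRecord("rtree_idx", 1, "page_changed"),
    IndexLogRecord("rtree_idx", 1, "op_done"),
    IndexLogRecord("hash_idx", 2, "page_changed"),  # crash before "op_done"
]
print(indexes_to_rebuild(log))  # {'hash_idx'}
```

The sketch never looks inside a page, which is why this style of logging is simpler than logging that could replay the change itself.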

Vadim



Re: [HACKERS] Re: WAL and indexes (Re: [HACKERS] WAL status todo)

2000-10-16 Thread Alfred Perlstein

* Mikheev, Vadim [EMAIL PROTECTED] [001016 09:33] wrote:
 I don't understand why WAL needs to log internal operations of any of
 the index types.  Seems to me that you could treat indexes as black
 boxes that are updated as side effects of WAL log items for heap tuples:
 when adding a heap tuple as a result of a WAL item, you just call the
 usual index insert routines, and when deleting a heap tuple as a result
 
 On recovery backend *can't* use any usual routines:
 system catalogs are not available.
 
 of undoing a WAL item, you mark the tuple invalid but don't physically
 remove it till VACUUM (thus no need to worry about its index entries).
 
 One of the purposes of WAL is immediate removal of tuples
 inserted by aborted xactions. I want to make VACUUM
 *optional* in future - space must be available for
 reuse without VACUUM. And this is the first, very small,
 step in this direction.

Why would vacuum become optional?  Would WAL offer an option to
not reclaim free space?  We're hoping that vacuum becomes unneeded
when postgresql is run with some flag indicating that we're
uninterested in time travel.

How much longer do you estimate until you can make it work that way?

thanks,
-Alfred



[HACKERS] Re: WAL and indexes (Re: [HACKERS] WAL status todo)

2000-10-16 Thread Tom Lane

"Mikheev, Vadim" [EMAIL PROTECTED] writes:
 I don't understand why WAL needs to log internal operations of any of
 the index types.  Seems to me that you could treat indexes as black
 boxes that are updated as side effects of WAL log items for heap tuples:
 when adding a heap tuple as a result of a WAL item, you just call the
 usual index insert routines, and when deleting a heap tuple as a result

 On recovery backend *can't* use any usual routines:
 system catalogs are not available.

OK, good point, but that just means you can't use the catalogs to
discover what indexes exist for a given table.  You could still create
log entries that look like "insert indextuple X into index Y" without
any further detail.

 the index is corrupt and rebuild it from scratch, using Hiroshi's
 index-rebuild code.

 How fast is rebuilding of index for table with 10^7 records?

It's not fast, of course.  But the point is that you should seldom
have to do it.

 I agree to consider rtree/hash/gist as experimental
 index access methods BUT we have to have at least
 *one* reliable index AM with short down time/
 fast recovery.

With all due respect, I wonder just how "reliable" btree WAL undo/redo
will prove to be ... let alone the other index types.  I worry that
this approach is putting too much emphasis on making it fast, and not
enough on making it right.

regards, tom lane



Re: [HACKERS] Re: WAL and indexes (Re: [HACKERS] WAL status todo)

2000-10-16 Thread Alfred Perlstein

* Tom Lane [EMAIL PROTECTED] [001016 09:47] wrote:
 "Mikheev, Vadim" [EMAIL PROTECTED] writes:
  I don't understand why WAL needs to log internal operations of any of
  the index types.  Seems to me that you could treat indexes as black
  boxes that are updated as side effects of WAL log items for heap tuples:
  when adding a heap tuple as a result of a WAL item, you just call the
  usual index insert routines, and when deleting a heap tuple as a result
 
  On recovery backend *can't* use any usual routines:
  system catalogs are not available.
 
 OK, good point, but that just means you can't use the catalogs to
 discover what indexes exist for a given table.  You could still create
 log entries that look like "insert indextuple X into index Y" without
 any further detail.

One thing you guys may wish to consider is selectively fsyncing on
system catalogs and marking them dirty when opened for write:

postgres:  i need to write to a critical table...
opens table, marks dirty
completes operation and marks undirty and fsync

-or-

postgres:  i need to write to a critical table...
opens table, marks dirty
crash, burn, smoke (whatever)

Now you may still have the system tables broken, however the chances
of that may be significantly reduced depending on how often writes
must be done to them.

It's a hack, but depending on the amount of writes done to critical
tables it may reduce the window for these inconvenient situations
significantly.
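
The two sequences above could be sketched roughly as follows. This is only an illustration with ordinary files, not how PostgreSQL handles its catalogs: the ".dirty" marker file and the names `write_critical`, `is_suspect` and `crash_before_finish` are assumptions made up for the example.

```python
import os
import tempfile


def marker_path(table_path):
    # Hypothetical convention: a ".dirty" file next to the table.
    return table_path + ".dirty"


def write_critical(table_path, data, crash_before_finish=False):
    """Mark the table dirty, write to it, fsync, then mark it undirty."""
    marker = marker_path(table_path)
    with open(marker, "w") as m:       # opens table, marks dirty
        m.flush()
        os.fsync(m.fileno())
    with open(table_path, "ab") as f:
        f.write(data)
        if crash_before_finish:
            raise RuntimeError("simulated crash")  # crash, burn, smoke
        f.flush()
        os.fsync(f.fileno())           # completes operation and fsyncs
    os.remove(marker)                  # marks undirty


def is_suspect(table_path):
    """A leftover marker means a write did not complete."""
    return os.path.exists(marker_path(table_path))


with tempfile.TemporaryDirectory() as tmp:
    table = os.path.join(tmp, "critical_table")
    write_critical(table, b"row1")
    clean_after_ok = is_suspect(table)
    try:
        write_critical(table, b"row2", crash_before_finish=True)
    except RuntimeError:
        pass
    suspect_after_crash = is_suspect(table)

print(clean_after_ok, suspect_after_crash)  # False True
```

After a restart, only tables whose marker survived would need checking, which is where the smaller window comes from.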

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
"I have the heart of a child; I keep it in a jar on my desk."



[HACKERS] Re: WAL and indexes (Re: [HACKERS] WAL status todo)

2000-10-16 Thread Tom Lane

"Mikheev, Vadim" [EMAIL PROTECTED] writes:
 And how could I use such records on recovery
 being unable to know what data columns represent
 keys, what functions should be used for ordering?

Um, that's not built into the index either, is it?  OK, you win ...

I'm still nervous about how we're going to test the WAL code adequately
for the lesser-used index types.  Any ideas out there?

regards, tom lane



[HACKERS] Re: WAL and indexes (Re: [HACKERS] WAL status todo)

2000-10-16 Thread Mikheev, Vadim

 One of the purposes of WAL is immediate removal of tuples
 inserted by aborted xactions. I want to make VACUUM
 *optional* in future - space must be available for
 reuse without VACUUM. And this is the first, very small,
 step in this direction.

Why would vacuum become optional?  Would WAL offer an option to
not reclaim free space?  We're hoping that vacuum becomes unneeded

Reclaiming free space is an issue for the storage manager, as
I have said here many times. WAL is just a Write-Ahead Log
(first write to the log, then to the data files, to be able
to recover using the log data), and in the matter of space it can
only help to delete tuples inserted by aborted transactions.
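
As a toy illustration of that ordering (log first, data files second, recover from the log), here is a minimal in-memory sketch. It is not PostgreSQL's WAL implementation; the class `TinyWAL` and the `crash_before_data_write` switch are made up for the example, and the "log" is just a list standing in for a durable file.

```python
class TinyWAL:
    """Toy write-ahead log: every change is logged before it is applied."""

    def __init__(self):
        self.log = []    # stand-in for the durable log file
        self.data = {}   # stand-in for the data files

    def update(self, key, value, crash_before_data_write=False):
        self.log.append((key, value))   # 1. first write to the log
        if crash_before_data_write:
            return                      # simulated crash: data file not updated
        self.data[key] = value          # 2. then write to the data file

    def recover(self):
        # Redo every logged change; reapplying one that already reached
        # the data file is harmless because it sets the same value.
        for key, value in self.log:
            self.data[key] = value


wal = TinyWAL()
wal.update("a", 1)
wal.update("b", 2, crash_before_data_write=True)
print(wal.data)   # {'a': 1}
wal.recover()
print(wal.data)   # {'a': 1, 'b': 2}
```

Nothing in this sketch reclaims space from old row versions; that remains a separate job.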

when postgresql is run with some flag indicating that we're
uninterested in time travel.

Time travel went away ~3 years ago; vacuum has been needed all
these years and will still be needed to reclaim space in 7.1.

How much longer do you estimate until you can make it work that way?

Hopefully in 7.2

Vadim




Re: [HACKERS] Re: WAL and indexes (Re: [HACKERS] WAL status todo)

2000-10-16 Thread Bruce Momjian

 "Mikheev, Vadim" [EMAIL PROTECTED] writes:
  And how could I use such records on recovery
  being unable to know what data columns represent
  keys, what functions should be used for ordering?
 
 Um, that's not built into the index either, is it?  OK, you win ...
 
 I'm still nervous about how we're going to test the WAL code adequately
 for the lesser-used index types.  Any ideas out there?

Wait for bug reports?  :-)

-- 
  Bruce Momjian                      |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                  |  (610) 853-3000
  +  If your life is a hard drive,   |  830 Blythe Avenue
  +  Christ can be your backup.      |  Drexel Hill, Pennsylvania 19026



Re: [HACKERS] WAL status todo

2000-10-15 Thread Martin A. Marques

On Sat, 14 Oct 2000, Vadim Mikheev wrote:
 Well, hopefully WAL will be ready for alpha testing in a few days.
 Unfortunately, at the moment I have to step aside from the main stream
 to implement the new file naming, the biggest todo for integrating WAL
 into the system.

 I would really appreciate any help with the following issues (testing
 can start regardless of their status, but they must be resolved anyway):

I have downloaded the source via CVSup. Where can I find the WAL and the 
TOAST code?

Thanks!!

-- 
"And I'm happy, because you make me feel good, about me." - Melvin Udall
-
Martín Marqués  email:  [EMAIL PROTECTED]
Santa Fe - Argentina    http://math.unl.edu.ar/~martin/
Administrador de sistemas en math.unl.edu.ar
-



[HACKERS] Re: WAL status todo

2000-10-15 Thread Mikheev, Vadim

 Well, hopefully WAL will be ready for alpha testing in a few days.
 Unfortunately, at the moment I have to step aside from the main stream
 to implement the new file naming, the biggest todo for integrating WAL
 into the system.

 I would really appreciate any help with the following issues (testing
 can start regardless of their status, but they must be resolved anyway):

 I have downloaded the source via CVSup. Where can I find the WAL
 and the TOAST code?

The HEAP/BTREE-related WAL code is in src/backend/access/{heap|nbtree}/,
#ifdef-ed with XLOG.

Vadim




[HACKERS] WAL status todo

2000-10-14 Thread Vadim Mikheev

Well, hopefully WAL will be ready for alpha testing in a few days.
Unfortunately, at the moment I have to step aside from the main stream
to implement the new file naming, the biggest todo for integrating WAL
into the system.

I would really appreciate any help with the following issues (testing
can start regardless of their status, but they must be resolved anyway):

1. BTREE: sometimes WAL can't guarantee the right order of items on leaf
pages after recovery - a new flag, BTP_REORDER, was introduced to mark
such pages. Btree should be changed to handle this case in normal
processing mode.
2. HEAP: like 1., this issue is a result of the attempt to go without
compensation records (i.e. without logging undo operations): it's possible
that sometimes in redo there will be no space for new records, because in
recovery we don't undo changes for aborted xactions immediately - a
function like BTREE's _bt_cleanup_page_ is required for HEAP, as well as
a general inspection of all places where HEAP's redo ops try to insert
records (initially I thought that in recovery we would undo changes
immediately after reading the abort record from the log - this wouldn't
work for BTREE: splits must be redone before undo).
3. There is no redo/undo for HASH, RTREE & GIST yet. It would be *really
really great* if someone could implement it using BTREE's redo/undo code
as a prototype.
These are the most complex parts of this todo.

Probably, something else will follow later.

Regards,
Vadim