Re: [HACKERS] win32 tablespace handing

2004-10-06 Thread Zeugswetter Andreas SB SD
hardlinks and junctions don't work across physical disks, only symlinks. Where did you read this? I just looked and can see no such restriction. There is no such restriction for junctions, I just tried it to be safe. Andreas ---(end of

Re: [HACKERS] AIX and V8 beta 3

2004-10-04 Thread Zeugswetter Andreas SB SD
Have you tried using cc_r for that compile line? Does that help? Alas, that is not an option available. cc_r is specific to the AIX xlc compiler; we're using GCC, and xlc is not available to us. What is missing is a -lpthread . Andreas ---(end of

Re: [HACKERS] psql questions: SQL, progname, copyright dates

2004-09-10 Thread Zeugswetter Andreas SB SD
I think adding 'S' to \df confuses more than it helps. Why that? Imho it would be consistent. Andreas ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not

Re: [HACKERS] APR 1.0 released

2004-09-10 Thread Zeugswetter Andreas SB SD
Well, it's easily changed, if all that's needed is a search-and-replace. Suggestions for a better name? MINGW32 I think that is a bad idea. That symbol sure suggests, that you are using mingw. Are you expecting someone who creates a VisualStudio project to define MINGW32 ? Andreas

Re: [HACKERS] APR 1.0 released

2004-09-10 Thread Zeugswetter Andreas SB SD
Personally I don't think that any rename()-usleep loop is necessary. I'll check the archives. I agree the rename loop seems unnecessary. I kept it in case we hadn't dealt with all the failure places. Should we remove them now or wait for 8.1? Seems we should keep them in and see if we

Re: [HACKERS] PITR: XLog File compression on Archive

2004-08-24 Thread Zeugswetter Andreas SB SD
PS: but something you *could* do in 8.0 is replace cp by gzip to archive compressed files that way. How about replacing the page image records with a same size dummy record that only contains a dummy header and all 0's. If the archiver cares about space he will use some sort of compression

Re: [HACKERS] 8.0 beta 1 on linux-mipsel R5900

2004-08-24 Thread Zeugswetter Andreas SB SD
Think harder... one processor != one process... Well sure, but you don't want a spinlock in that case. Actually you do, when the normal case is that you don't have to block. You want it to fall through as quickly as possible in the success case (the blocking case is going to suck no

Re: [HACKERS] PITR: XLog File compression on Archive

2004-08-24 Thread Zeugswetter Andreas SB SD
Re-thinking the whole purpose of the additional full page images appended to the xlog records, I now understand and agree with Tom's comment in the docs that we don't need to include those additional full page images for PITR - they only exist to correctly recover the database in the event of

Re: [HACKERS] Point in Time Recovery

2004-07-30 Thread Zeugswetter Andreas SB SD
I was wondering about this point - might it not be just as reasonable for the copied file to *be* an exact image of pg_control? Then a very simple variant of pg_controldata (or maybe even just adding switches to pg_controldata itself) would enable the relevant info to be extracted

Re: [HACKERS] Sketch of extending error handling for subtransactions

2004-07-26 Thread Zeugswetter Andreas SB SD
I was just looking around the net to see exactly what Oracle's PL/SQL syntax is. It doesn't seem too unreasonable syntax-wise: BEGIN ... controlled statements ... EXCEPTION WHEN exception_name THEN ... error handling

Re: [HACKERS] Fixing PKs and Uniques in tablespaces

2004-07-23 Thread Zeugswetter Andreas SB SD
Also, since I checked and it seems that our syntax for putting tables an d indexes in tablespaces at creation time is identical to oracle's, perhaps we should copy them on constraints as well. Since we're getting close to beta, can we have consensus on what I'm to do about this? The

Re: [HACKERS] localhost redux

2004-07-20 Thread Zeugswetter Andreas SB SD
I don't know if the problem is isolated to just me, but I wanted to suggest that we use a parameter for that, which can be configured in the postgresql.conf, with a default value if it's not set, set to localhost. I think you should first trace down what the problem really is --- is

Re: [HACKERS] Point in Time Recovery

2004-07-20 Thread Zeugswetter Andreas SB SD
Hang on, are you supposed to MOVE or COPY away WAL segments? Copy. pg will delete them once they are archived. Copy. pg will recycle them once they are archived. Andreas ---(end of broadcast)--- TIP 9: the planner will ignore your desire to

Re: [HACKERS] localhost redux

2004-07-20 Thread Zeugswetter Andreas SB SD
If localhost isn't being resolved correctly are you seeing error messages like this on the server log (from pqcomm.c)? could not translate host name \%s\, service \%s\ to address: %s After poking around, I found the following: The Windows pdc (==name server :-( ) does really not

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Zeugswetter Andreas SB SD
I'm aiming for the minimum feature set - which means we do need to take care over whether that set is insufficient and also to pull any part that doesn't stand up to close scrutiny over the next few days. As you can see, we are still chewing on NT. What PITR features are missing? I

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Zeugswetter Andreas SB SD
then on restore once all the files are restored move the pg_control.backup to its original name. That gives us the checkpoint wal/offset but how do we get the start/stop information. Is that not required? The checkpoint wal/offset is in pg_control, that is sufficient start information.

Re: [HACKERS] Point in Time Recovery

2004-07-16 Thread Zeugswetter Andreas SB SD
Do we need a checkpoint after the archiving starts but before the backup begins? No. Actually yes. Sorry, I did incorrectly not connect 'archiving' with the backed up xlogs :-( So yes, you need one checkpoint after archiving starts. Imho turning on xlog archiving should issue such a

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Zeugswetter Andreas SB SD
Other db's have commands for: start/end external backup I see that the analogy to external backup was not good, since you are correct that dba's would expect that to stop all writes, so they can safely split their mirror or some such. Usually the expected time from start until end external

Re: [HACKERS] Point in Time Recovery

2004-07-15 Thread Zeugswetter Andreas SB SD
Sorry for the stupid question, but how do I get this patch if I do not receive the patches mails ? The web interface html'ifies it, thus making it unusable. Thanks Andreas ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ?

Re: [HACKERS] Point in Time Recovery

2004-07-14 Thread Zeugswetter Andreas SB SD
The recovery mechanism doesn't rely upon you knowing 1 or 3. The recovery reads pg_control (from the backup) and then attempts to de-archive the appropriate xlog segment file and then starts rollforward Unfortunately this only works if pg_control was the first file to be backed up (or by

Re: [HACKERS] Portals and nested transactions

2004-07-14 Thread Zeugswetter Andreas SB SD
My answers: Q1: Should Portals successfully created within the failed subxact be closed? Or should they remain open? no for protocol level I can understand a yes to this one for sql level, because it will be hard to clean up by hand :-( But I like the analogy to hold cursors, so I would

Re: [HACKERS] Point in Time Recovery

2004-07-13 Thread Zeugswetter Andreas SB SD
The starting a new timeline thought works for xlogs, but not for clogs. No matter how far you go into the future, there is a small (yet vanishing) possibility that there is a yet undiscovered committed transaction in the future. (Because transactions are ordered in the clog because xids are

Re: [HACKERS] Nested Transactions, Abort All

2004-07-09 Thread Zeugswetter Andreas SB SD
My proposal would be: 1. Begin main transaction: BEGIN { TRANSACTION | WORK } 2. Commit main (all) transaction: COMMIT { TRANSACTION | WORK } 3. Rollback main (all) transaction: ROLLBACK { TRANSACTION } 4. Begin inner transaction: BEGIN NESTED { TRANSACTION | WORK } 5. Commit inner

Re: [HACKERS] Nested Transactions, Abort All

2004-07-09 Thread Zeugswetter Andreas SB SD
As far as implementing only savepoints, look at this: BEGIN; BEGIN; INSERT INTO ...; COMMIT; BEGIN; INSERT INTO ...; COMMIT; BEGIN; INSERT INTO ...; COMMIT; With savepoints, it looks pretty strange: BEGIN;

Re: [HACKERS] Nested Transactions, Abort All

2004-07-09 Thread Zeugswetter Andreas SB SD
I'd opt for BEGIN as a start of a subtransaction (no need for special semantics in plpgsql), the corresponding END simply changes the transaction context to the parent level. But 'BEGIN' in plpgsql does not start a [sub]transaction, it starts a statement block. Are we intending to change

Re: [HACKERS] Nested Transactions, Abort All

2004-07-09 Thread Zeugswetter Andreas SB SD
But 'BEGIN' in plpgsql does not start a [sub]transaction, it starts a statement block. Are we intending to change that ? I think not. There are two possibilities: Either BEGIN *does* start a subtransaction, or BEGIN does not. I don't see how two nesting level hierarchies in a

Re: [HACKERS] Point in Time Recovery

2004-07-07 Thread Zeugswetter Andreas SB SD
Well, Tom does seem to have something with regard to StartUpIds. I feel it is easier to force a new timeline by adding a very large number to the LogId IF, and only if, we have performed an archive recovery. That way, we do not change at all the behaviour of the system for people that choose

Re: [HACKERS] Point in Time Recovery

2004-07-06 Thread Zeugswetter Andreas SB SD
- by time - but the time stamp on each xlog record only specifies to the second, which could easily be 10 or more commits (we hope) Should we use a different datatype than time_t for the commit timestamp, one that offers more fine grained differentiation between checkpoints? Imho

Re: subtransactions and FETCH behaviour (was Re: [HACKERS] PREPARE and transactions)

2004-07-06 Thread Zeugswetter Andreas SB SD
Well, the proposal of implementing it like holdable cursors means using a Materialize node which, if I understand correctly, means taking the whole result set and storing it on memory (or disk). Would it help to hold the lock for a record that is the current cursor position, iff this record

Re: [HACKERS] nested-xacts cursors (was Re: Performance with new nested-xacts code)

2004-07-05 Thread Zeugswetter Andreas SB SD
begin; declare cursor c ...; fetch 1 from c; -- returns tuple 1 begin; fetch 1 from c; -- returns tuple 2 rollback; fetch 1 from c; -- returns tuple 1 again This is mightly ugly but I think it's the most usable of the options seen so far.

Re: [HACKERS] signal 11 on AIX: 7.4.2

2004-06-18 Thread Zeugswetter Andreas SB SD
My only guess is that getaddrinfo in your libc has a bug somehow that is corrupting the stack (hance the improper backtrace), then crashing. It could be libc on AIX, I suppose, but it strikes me as sort of odd that nobody else ever seens this. Unless nobody else is using AIX 5.1, which

Re: [HACKERS] Tablespaces

2004-06-11 Thread Zeugswetter Andreas SB SD
With the rule system and two underlying tables one could make it work by hand I think. The rule system could be used to do this, but there was some discussion of using inherited tables to handle it. However neither handles the really hard part of detecting queries that use only a part

Re: [HACKERS] Why hash indexes suck

2004-06-07 Thread Zeugswetter Andreas SB SD
We could safely sort on the hash value, but I'm not sure how effective that would be, considering that we're talking about values that already hashed into the same bucket --- there's likely not to be very many distinct hash values there. I think we can safely put that on the todo list. The

Re: [HACKERS] Fast index build vs. PITR

2004-06-01 Thread Zeugswetter Andreas SB SD
I think an actually implementable version of this would be: 1. Don't log any index operations at all in WAL. 2. When recovering from WAL, restore all the table contents by WAL replay. (This would of course include the system catalog contents that describe the indexes.) Then sit there

Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow

2004-05-27 Thread Zeugswetter Andreas SB SD
What would be useful for that (as of 7.5) is a dummy ALTER TABLE TYPE, viz ALTER TABLE tab ALTER COLUMN col TYPE same-type-it-already-has I think we should add special syntax for this purpose, since I would like to (or someone else later on) see all possible cases of alter column

Re: [HACKERS] Timezone fun (bugs and a request)

2004-05-25 Thread Zeugswetter Andreas SB SD
LOG: database system was shut down at 2004-05-25 15:15:44 GMT-12 For comparison, 7.4.1 on the same system says: LOG: database system was shut down at 2004-05-25 16:03:43 NZST Can we keep the zic database convention unchanged but change the display format in the logs to be

Re: [HACKERS] Table Spaces

2004-05-18 Thread Zeugswetter Andreas SB SD
If you run NTFS, it's still possible to use arbitrary links. In the Windows world, they are called junctions. Microsoft does not provide a junction tool for some reason (perhaps because it's limited to NTFS). A good tool, free and with source, can be found here

Re: [HACKERS] Call for 7.5 feature completion

2004-05-17 Thread Zeugswetter Andreas SB SD
It is too late to think about pushing back another month. We had this discussion already. June 1 is it. I thought the outcome of that discussion was June 15 ? Can we try to do the 2PC patch now instead of waiting for subtransactions ? Andreas ---(end of

Re: [HACKERS] relcache refcount

2004-05-14 Thread Zeugswetter Andreas SB SD
BTW, what are your plans for state saving/reversion for the lock manager and buffer manager? The lock state, in particular, makes these other problems look trivial by comparison. Why can't we keep all locks until main tx end ? Locks are not self conflicting are they ? So the only reason to

Re: [HACKERS] threads stuff/UnixWare

2004-05-13 Thread Zeugswetter Andreas SB SD
I know, this sucks, but, I don't see any other way, other than linking *ALL* libpq-using programs (including initdb and friends) with -K pthread. How about making a libpq.so (without pthread) and a thread safe libpq_r.so ? Andreas ---(end of

Re: [HACKERS] XLog: how to log?

2004-05-12 Thread Zeugswetter Andreas SB SD
I think this argument is largely a red herring ... but if it makes you feel better, we could change the contents of the commit timestamp to be gettimeofday() output (seconds+microseconds) instead of just time() output. That should be precise enough for practical purposes. I am saying

Re: [HACKERS] ALTER TABLE TODO items

2004-05-07 Thread Zeugswetter Andreas SB SD
FireBird: ALTER COLUMN column TYPE type DB2: ALTER COLUMN column SET DATA TYPE type. Oracle: MODIFY column type MSSQL:ALTER COLUMN column type constraints MySQL:Both Oracle and MSSQL Sap: MODIFY column type Spec: Nothing (obvious) on changing column types

Re: [HACKERS] ANALYZE locks pg_listener in EXCLUSIVE for long time?

2004-05-03 Thread Zeugswetter Andreas SB SD
* Is it really a good idea for database-wide ANALYZE to run as a single transaction? Holding all those locks is a recipe for deadlocks, even if they're as inoffensive as AccessShareLocks normally are. Wasn't one idea behind that change also to not make the planner create a plan from mixed

Re: [HACKERS] PITR logging control program

2004-04-30 Thread Zeugswetter Andreas SB SD
Basically it is updating the logs as soon as it receives the notifications. Writing 16 MB of xlogs could take some time. In my experience with archiving logs, 16 Mb is on the contrary way too small for a single log. The overhead of starting e.g. a tape session is so high that you cannot

Re: [HACKERS] pre-loading a user table.

2004-04-02 Thread Zeugswetter Andreas SB SD
Why do you think it useful to preload something during InitPostgres, anyway? Any heavily used table will certainly be present in shared buffers already, and even more surely present in kernel buffers. And if you really want it preloaded you can issue dummy selects with a client right after

Re: [HACKERS] [pgsql-hackers-win32] fsync with sync, and Win32 unlink

2004-03-11 Thread Zeugswetter Andreas SB SD
Consider either a box with many different postgresql instances, or one that run both postgresql and other software. Issuing sync() in that sitaution will cause sync of a lot of data that probably doesn't need syncing. But it'd probably be a very good thing on a dedicated server, giving the

Re: [HACKERS] PITR Functional Design v2 for 7.5

2004-03-10 Thread Zeugswetter Andreas SB SD
The only way we can support file-level hot backup is in conjunction with PITR-style WAL log archiving. It is okay for the data area dump to be inconsistent, so long as your recovery process includes replay of WAL starting at some checkpoint before the filesystem dump started, and extending

Re: [HACKERS] PITR Functional Design v2 for 7.5

2004-03-10 Thread Zeugswetter Andreas SB SD
To clarify: I'd expect a cluster to be workable, if I - disable VACUUM until backup completed - issue CHECKPOINT - backup clog (CHECKPOINT and backup clog are the backup checkpoint) - backup all datafiles (which include at least all completed transaction data at checkpoint time)

Re: [pgsql-hackers-win32] [HACKERS] Tablespaces

2004-03-05 Thread Zeugswetter Andreas SB SD
First of all, symlinks are a pretty popular feature. Even Windows supports what would be needed. Second of all, PostgreSQL will still run on OSes without symlinks, tablespaces won't be available, but PostgreSQL will still run. Since we are all using PostgreSQL without My idea for

Re: [HACKERS] Thread safe connection-name mapping in ECPG. Is it

2004-03-04 Thread Zeugswetter Andreas SB SD
- Dispose names of connectiong and replace them with a pointer. You cannot dispose the names, you can only add something to also allow pointers. The names are in the ESQL/C standard. Andreas ---(end of broadcast)--- TIP 2: you can get off all

Re: [pgsql-hackers-win32] [HACKERS] Tablespaces

2004-03-03 Thread Zeugswetter Andreas SB SD
For tablespaces on OS's that don't support it, I think we will have to store the path name in the file and read it via the backend. Somehow we should cache those lookups. My feeling is that we need not support tablespaces on OS's without symlinks. To create symlinked directories

Re: [HACKERS] Tablespaces

2004-02-27 Thread Zeugswetter Andreas SB SD
Ahh. I forgot to detail my ideas on this. It seems to me that we cannot drop a table space until the directory is empty. Agreed. How would it get to be empty? Are you thinking of some sort of connect database to tablespace and disconnect database from tablespace commands that would

Re: [HACKERS] Thread safe connection-name mapping in ECPG. Is it required?

2004-02-27 Thread Zeugswetter Andreas SB SD
I'm not sure I understand you correctly. The SQL standard says you can call your statement as this: exec sql at CONNECTION select 1; Here CONNECTION of course is a string, the name of the connection. So, yes, we have to maintain that list to make sure we get the right connection. I

Re: [HACKERS] Thread safe connection-name mapping in ECPG. Is it

2004-02-27 Thread Zeugswetter Andreas SB SD
I am asking for CONNECTION being a variable of data type 'connection *' rather than 'const char *'. That would avoid name lookups. Is that out of spec? Yes, but the preprocessor could still add an optimization ala 'connection *' for the hardcoded cases (exec sql set connection 'myconn1';

Re: [HACKERS] Tablespaces

2004-02-27 Thread Zeugswetter Andreas SB SD
I do not intend to undertake raw disk tablespaces for 7.5. I'd be interested if anyone could provide some real world benchmarking of file system vs. raw disk. Postgres benefits a lot from kernel file system cache at the moment. Yes, and don't forget that pg also relys on the OS for grouping

Re: [HACKERS] OIDs, CTIDs, updateable cursors and friends

2004-02-19 Thread Zeugswetter Andreas SB SD
I believe the ODBC driver uses CTID for this sort of problem. CTID is guaranteed to exist and to be fast to access (since it's a physical locator). Against this you have the problem that concurrent updates of the record will move it, leaving your CTID invalid. However, that IIRC the ctid

Re: [HACKERS] [PATCHES] NO WAIT ...

2004-02-19 Thread Zeugswetter Andreas SB SD
The question is whether we should have a GUC variable to control no waiting on locks or add NO WAIT to specific SQL commands. Does anyone want to vote _against_ the GUC idea for nowait locking. (We already have two voting for such a variable.) I vote against. We got bit by both the

Re: [HACKERS] [PATCHES] NO WAIT ...

2004-02-19 Thread Zeugswetter Andreas SB SD
I personally think a wait period in seconds would be more useful. Milli second timeouts tend to be misused with way too low values in this case, imho. I understand, but GUC lost the vote. I have updated the TODO list to indicate this. Tatsuo posted a patch to add NO WAIT to the LOCK

Re: [HACKERS] Transaction aborts on syntax error.

2004-02-13 Thread Zeugswetter Andreas SB SD
In both cases, the transaction either commits or rollback occurs. No other option is possible at the end of the transaction, but in the first style of transaction semantics you get a mid-way decision point. This only refers to retryable errors, since errors like access rights violations and

Re: [HACKERS] Transaction aborts on syntax error.

2004-02-13 Thread Zeugswetter Andreas SB SD
It seems to me, that leaving all this to the client (which implicitly inserts savepoints) can never be as efficient as a serverside feature. I think this is an overly narrow view of efficiency. With client control, the client can insert savepoints whereever it needs them, Yes, but not

Re: [HACKERS] Transaction aborts on syntax error.

2004-02-12 Thread Zeugswetter Andreas SB SD
Improving on not ideal would be good, and would get even closer to full Oracle/SQLServer migration/compatibility. However, since I've never looked at that section of code, I couldn't comment on any particular approach nor implement such a change, so I'll shut up and be patient. Imagine

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-05 Thread Zeugswetter Andreas SB SD
I don't think the bgwriter is going to be able to keep up with I/O bound backends, but I do think it can scan and set those booleans fast enough for the backends to then perform the writes. As long as the bgwriter does not do sync writes (which it does not, since that would need a whole lot

Re: [HACKERS] [pgsql-hackers-win32] Sync vs. fsync during checkpoint

2004-02-05 Thread Zeugswetter Andreas SB SD
People keep saying that the bgwriter mustn't write pages synchronously because it'd be bad for performance, but I think that analysis is faulty. Performance of what --- the bgwriter? Nonsense, the *point* Imho that depends on the workload. For a normal OLTP workload this is certainly

Re: [pgsql-hackers-win32] [HACKERS] [PATCHES] fork/exec patch

2003-12-18 Thread Zeugswetter Andreas SB SD
How about the typical answer on Windows ? Create an invisible Window with an Event Handler and pass it a windows message ? Andreas ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister

Re: [HACKERS] 7.4 include file conflict

2003-12-18 Thread Zeugswetter Andreas SB SD
Does anyone know how Informix, where this file comes from, handles this? Informix puts those files in $INFORMIXDIR/incl/esql (e.g. /usr/informix/incl/esql), so imho a /usr/postgres installation could have them somewhere under /usr/postgres Andreas ---(end of

Re: [HACKERS] fsync method checking

2003-12-17 Thread Zeugswetter Andreas SB SD
Ideally that path isn't taken very often. But I'm currently having a discussion off-list with a CMU student who seems to be seeing a case where it happens a lot. (She reports that both WALWriteLock and WALInsertLock are causes of a lot of process blockages, which seems to mean that a lot

Re: [HACKERS] fsync method checking

2003-12-12 Thread Zeugswetter Andreas SB SD
Running the attached test program shows on BSD/OS 4.3: write 0.000360 write fsync 0.001391 I think the write fsync pays for the previous write test (same filename). write, close fsync 0.001308 open o_fsync, write0.000924 I have

Re: [HACKERS] pg_restore and create FK without verification check

2003-11-26 Thread Zeugswetter Andreas SB SD
Q2: New situation: Why is it not a good idea to backup the database files of a cluster incl. all c_log and x_log (log files last) to get a physicaly hot backup. In principle it is the same situation like a server which is crashing (not a once but during some time). After restoring, it

Re: [HACKERS] A rough roadmap for internationalization fixes

2003-11-26 Thread Zeugswetter Andreas SB SD
There are no such libraries. I keep hearing ICU, but that is much too bloated. At least it is kind of standard and also something what will be maintained for foreseeable future, it also has a compatible license and is available on all platforms of interest to postgresql. And it is used

Re: [HACKERS] A rough roadmap for internationalization fixes

2003-11-25 Thread Zeugswetter Andreas SB SD
Have you looked at what is available from http://oss.software.ibm.com/icu/ ? Seems they have a compatible license, but use some C++. Andreas ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate

splitting WAL (was RE: [HACKERS] Providing anonymous mmap as an option of sharing memory)

2003-11-25 Thread Zeugswetter Andreas SB SD
In case of WAL per database, the operations done on a shared catalog from a backend would need flushing system WAL and database WAL to ensure such transaction commit. Otherwise only flushing database WAL would do. I don't think that is a good idea. If you want databases separated you should

Re: [HACKERS] Sponsoring enterprise features

2003-11-24 Thread Zeugswetter Andreas SB SD
Main needs partitioning is useful for: - partition elimination for queries (e.g. seq scans only scan relevant partitions) - deleting/detaching huge parts of a table in seconds - attaching huge parts to a table in seconds (that may have been loaded with a fast loading utility (e.g. loading

Re: [HACKERS] Background writer process

2003-11-19 Thread Zeugswetter Andreas SB SD
1. Open WAL files with O_SYNC|O_DIRECT or O_SYNC(Not sure if Without grouping WAL writes that does not fly. Iff however such grouping is implemented that should deliver optimal performance. I don't think flushing WAL to the OS early (before a tx commits) is necessary, since writing 8k or

Re: [HACKERS] Background writer process

2003-11-18 Thread Zeugswetter Andreas SB SD
If the background writer uses fsync, it can write and allow the buffer to be reused and fsync later, while if we use O_SYNC, we have to wait for the O_SYNC write to happen before reusing the buffer; that will be slower. You can forget O_SYNC for datafiles for now. There would simply be too

Re: [HACKERS] Background writer process

2003-11-17 Thread Zeugswetter Andreas SB SD
1. Open WAL files with O_SYNC|O_DIRECT or O_SYNC(Not sure if Without grouping WAL writes that does not fly. Iff however such grouping is implemented that should deliver optimal performance. I don't think flushing WAL to the OS early (before a tx commits) is necessary, since writing 8k or

Re: [HACKERS] 7.4RC2 regression failur and not running stats collector process on Solaris

2003-11-12 Thread Zeugswetter Andreas SB SD
LOG: could not bind socket for statistics collector: Cannot assign requested address Hmm ... that's sure the problem, but what can we do about it? ISTM that any non-broken system ought to be able to resolve localhost. Actually it's worse than that: your system resolved localhost and

Re: [HACKERS] Experimental patch for inter-page delay in VACUUM

2003-11-10 Thread Zeugswetter Andreas SB SD
that works well enough to make it uncommon for backends to have to write dirty buffers for themselves. If we can, then doing all the writes O_SYNC would not be a problem. One problem with O_SYNC would be, that the OS does not group writes any more. So the code would need to eighter do it's

Re: [HACKERS] Experimental patch for inter-page delay in VACUUM

2003-11-10 Thread Zeugswetter Andreas SB SD
One problem with O_SYNC would be, that the OS does not group writes any more. So the code would need to eighter do it's own sorting and grouping (256k) or use aio, or you won't be able to get the maximum out of the disks. Or just run multiple writer processes, which I believe is

Re: [HACKERS] Experimental patch for inter-page delay in VACUUM

2003-11-05 Thread Zeugswetter Andreas SB SD
The only idea I have come up with is to move all buffer write operations into a background writer process, which could easily keep track of every file it's written into since the last checkpoint. I fear this approach. It seems to limit a lot of design flexibility later. But I can't

Re: [HACKERS] Experimental ARC implementation

2003-11-05 Thread Zeugswetter Andreas SB SD
My plan is to create another background process very similar to the checkpointer and to let that run forever basically looping over that BufferSync() with a bool telling that it's the bg_writer. Why not use the checkpointer itself inbetween checkpoints ? use a min and a max dirty setting

Re: [HACKERS] Experimental ARC implementation

2003-11-05 Thread Zeugswetter Andreas SB SD
Why not use the checkpointer itself inbetween checkpoints ? use a min and a max dirty setting like Informix. Start writing when more than max are dirty stop when at min. This avoids writing single pages (which is slow, since it cannot be grouped together by the OS). Current approach

Re: [HACKERS] Experimental patch for inter-page delay in VACUUM

2003-11-04 Thread Zeugswetter Andreas SB SD
Or... It seems to me that we have been observing something on the order of 10x-20x slowdown for vacuuming a table. I think this is WAY overcompensating for the original problems, and would cause it's own problem as mentioned above. Since the granularity of delay seems to be the

Re: [HACKERS] 2-phase commit

2003-10-10 Thread Zeugswetter Andreas SB SD
I was wondering whether we need to keep WAL online for 2PC, or whether only something like clog is sufficient. What if: 1. phase 1 commit must pass the slave xid that will be used for 2nd phase (it needs to return some sort of identification anyway) 2. the coordinator

Re: [HACKERS] 2-phase commit

2003-10-09 Thread Zeugswetter Andreas SB SD
Why would you spent time on implementing a mechanism whose ultimate benefit is supposed to be increasing reliability and performance, when you already realize that it will have to lock up at the slightest sight of trouble? There are better mechanisms out there that you can use instead.

Re: [HACKERS] new initdb.c available

2003-10-08 Thread Zeugswetter Andreas SB SD
. using wb for writing out on Windows is so that we don't get Windows' gratuitous addition of carriage returns. I will document that. Please use the #define PG_BINARY_W from c.h which is defined with the correct letters for all platforms (wb on Windows). That is how Peter's comment was

Re: [HACKERS] 2-phase commit

2003-09-29 Thread Zeugswetter Andreas SB SD
The simplest senario(though there could be varations) is [At participant(master)'s side] Because the commit operations is done, does nothing. [At coordinator(slave)' side] 1) After a while 2) re-establish the communication path between the

Re: [HACKERS] 2-phase commit

2003-09-29 Thread Zeugswetter Andreas SB SD
I don't think there is any way to handle cases where the master or slave just disappears. The other machine isn't under the server's control, so it has no way of it knowing. I think we have to allow the administrator to set a timeout, or ask to wait indefinately, and allow them to call an

Re: [HACKERS] 2-phase commit

2003-09-29 Thread Zeugswetter Andreas SB SD
Master Slave -- - commit ready-- --OK commit done-XX is the commit done message needed ? Of course ... how else will the Slave commit? From my understanding, the concept is that the

Re: [HACKERS] 2-phase commit

2003-09-29 Thread Zeugswetter Andreas SB SD
Or the slave could reject the request. Huh? The slave has that option?? In what circumstance? I thought the slave could reject if someone local already had the row locked. No, not at all. The slave would need to reject phase 1 commit ready for this. Andreas

Re: [HACKERS] Threads vs Processes

2003-09-26 Thread Zeugswetter Andreas SB SD
When the address-of operator is applied to a thread-local variable, it is evaluated at run-time and returns the address of the current thread's instance of that variable. An address so obtained may be used by any thread. When a thread terminates, any pointers to thread-local variables in

Re: [HACKERS] More Prelimiary DBT-2 Test Results with PostgreSQL 7.3.4

2003-09-25 Thread Zeugswetter Andreas SB SD
Are those response times in the right unit? 7-10s? No problem: http://developer.osdl.org/markw/misc/plana.out Ok, I guess I misunderstood you. These queries are taking 0.5ms - 300ms except for the last aggregate query which takes just over 1s. Yes, but because this is a benchmark he

Re: [HACKERS] AIX 4.2.1 CVS head and SSL

2003-09-15 Thread Zeugswetter Andreas SB SD
I get the following errors gmake -C ecpglib all gmake[4]: Entering directory `/usr/local/postgres/pgsql/src/interfaces/ecpg/ecpglib' ../../../../src/backend/port/aix/mkldexport.sh libecpg.a libecpg.exp gcc -O2 -pipe -Wall -Wmissing-prototypes -Wmissing-declarations -Wl,-bnoentry -

Re: [HACKERS] Win32 native port

2003-09-12 Thread Zeugswetter Andreas SB SD
Below is the email that prompted me to add the derived files to WIN32_DEV CVS. However, most people don't want them in there, so I have removed them, and updated the web page to recommend the nightly snapshots (which have the derived files), and mentioned the tools that will be needed

Re: [HACKERS] Vote: Adding flex/bison derived files in WIN32_DEV

2003-09-11 Thread Zeugswetter Andreas SB SD
I'm confused. Right on the MinGW download page is a link for bison-1.875. Yep, but I had problems with it. Author confirmed that there could be some problems creating processes (for example M4). However if You make it work, I'll be interested to know how. Check the MinGW mailing list

Re: [HACKERS] 2-phase commit

2003-09-10 Thread Zeugswetter Andreas SB SD
From our previous discussion of 2-phase commit, there was concern that the failure modes of 2-phase commit were not solvable. However, I think multi-master replication is going to have similar non-solvable failure modes, yet people still want multi-master replication. No. The real

Re: [HACKERS] Index creation takes for ever

2003-09-08 Thread Zeugswetter Andreas SB SD
I don't think so, because the patch does nothing to keep the sort order once the index is initially created. As Tom mentioned, we might not want to keep the tid's in order after the index is created because he wants the most recent tid's first, so the expired ones migrate to the end.

Re: [HACKERS] Correlation in cost_index()

2003-08-14 Thread Zeugswetter Andreas SB SD
In both cases ANALYZE will calculate correlation 1.0 for column X, and something near zero for column Y. We would like to come out with index correlation 1.0 for the left-hand case and something much less (but, perhaps, not zero) for the right-hand case. I don't really see a way to do this

Re: [HACKERS] Changing behavior of BEGIN...sleep...do something...COMMIT

2003-04-04 Thread Zeugswetter Andreas SB SD
Also, per other discussions, we are removing backend autocommit support in 7.4. It was the wrong way to do it. Somehow I did not see that conclusion made. I thought, at least for JDBC, it is already successfully used ? I think the backend autocommit is useful. Maybe only the

Re: [HACKERS] Nested transactions: low level stuff

2003-04-01 Thread Zeugswetter Andreas SB SD
In fact, I had proposed a simpler UNDO capability that revisited tuples and set their XID to a fixed aborted XID to clean up aborted subtransactions, but most now like the multiple XID solution. I think for the implicit subtransactions that we will want (with error codes comming) using a

Re: [HACKERS] Roadmap for FE/BE protocol redesign

2003-03-14 Thread Zeugswetter Andreas SB SD
for that, we get what exactly? Fetching one row at a time is *guaranteed* to be inefficient. The correct response if that bothers you is to fetch multiple rows at a time, not to make a less robust protocol. I don't feel strongly either way on this one, but IIRC the SQL standard for

  1   2   3   >