Re: [GENERAL] Stored procedures and schema renames

2014-04-22 Thread Florian Weimer

On 04/10/2014 03:29 PM, Rob Sargent wrote:


Code for db functions should be in a repository.  Easy to edit and rerun.


Well, not necessarily inside the transaction that renames the schema.

I've settled for this inside the transaction (running as a superuser):

   UPDATE pg_proc
      SET proconfig = '{"search_path=symboldb, public"}'
      WHERE pronamespace = (SELECT oid FROM pg_namespace
                            WHERE nspname = 'symboldb');


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Stored procedures and schema renames

2014-04-10 Thread Florian Weimer
I would like to rename schemas without breaking the stored procedures in 
them.  Currently, this does not work if the stored procedure definition 
contains a schema self-reference because that does not get renamed.  I 
tried SET search_path FROM CURRENT, but that seems to expand the 
search path on function definition time, which does not address this 
issue either.


If backend code changes are required, what would be a reasonable way to 
approach this?  Would adding a CURRENT_SCHEMA pseudo-schema which can be 
used in stored procedures work?
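
A minimal reproduction sketch of the problem (schema, table, and function 
names are invented):

CREATE SCHEMA app;
CREATE TABLE app.t (id integer);
CREATE FUNCTION app.count_t() RETURNS bigint
  LANGUAGE sql AS $$ SELECT count(*) FROM app.t $$;

ALTER SCHEMA app RENAME TO app2;
SELECT app2.count_t();   -- fails: the stored body still references app.t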


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres as In-Memory Database?

2014-04-07 Thread Florian Weimer

On 04/02/2014 12:32 AM, Stefan Keller wrote:


It also mentions an insert-only technique: "This approach has been
adopted before in POSTGRES [21] in 1987 and was called time-travel."
I would be interested in what time-travel is and whether it is still used
by Postgres.


Back in the old days, PostgreSQL never deleted any tuples.  Rows were 
deleted by writing the deletion time into a column.  As a result, you 
could go back to old data just by telling PostgreSQL to report rows 
which were visible at a given time.


Obviously, this approach precluded use of PostgreSQL in many scenarios. 
 For example, you wouldn't want to use it as your web application 
session store.
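
Roughly the same effect can be emulated today with explicit validity
columns; a sketch (table and column names are invented):

CREATE TABLE account_history (
    account_id integer NOT NULL,
    balance    numeric NOT NULL,
    valid_from timestamptz NOT NULL DEFAULT now(),
    valid_to   timestamptz            -- NULL while the row is current
);

-- "delete" by closing the validity interval instead of removing the row
UPDATE account_history SET valid_to = now()
  WHERE account_id = 1 AND valid_to IS NULL;

-- report the rows visible at a given point in time
SELECT * FROM account_history
  WHERE valid_from <= '2014-01-01'
    AND (valid_to IS NULL OR valid_to > '2014-01-01');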


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 9.3.2 server creates hundreds of thousands of temporary files

2014-02-03 Thread Florian Weimer

On 01/22/2014 06:56 PM, Tom Lane wrote:

Florian Weimer fwei...@redhat.com writes:

I've got a query which causes PostgreSQL to create hundreds of thousands
of temporary files, many of them empty.  The process also needs a lot of
memory.  I suspect this is due to bookkeeping for those files.



The query looks like this:
[ huge hash join ]



I tracked this down to a lower-than-usual setting of work_mem (1MB)
after the upgrade to 9.3.


The system is trying to do the join with only 1MB of workspace, so
yes, you end up with lots and lots of small temporary files.


Is this a bug?


No.


It's still quite surprising that this temporarily needs multiple 
gigabytes of RAM, much more than what's required in terms of work_mem 
to make this query run quickly.


Is there an easy way to check if there is a memory leak in the file 
descriptor switching code confined to the current transaction, or 
something like that?  It seems a bit unlikely that the per-file 
bookkeeping overhead is larger than 10 KB.


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Re: PostgreSQL specific datatypes very confusing for beginners who use wrappers around JDBC

2014-01-29 Thread Florian Weimer

On 01/29/2014 09:07 AM, Craig Ringer wrote:


A challenge I've found when approaching this from the ORM side has been
getting people to care. The sentiment has tended to be along the lines
of: No other DBMS does this or requires this, why do we have to jump
through hoops just to make PostgreSQL happy?


Is this true?  Can you use other JDBC drivers (except SQLite) to insert 
Java Strings into NUMERIC columns and Java ints into text columns?



Look at the example that started this thread, though. The stack is:

PostgreSQL
  PgJDBC
    Java JDBC API
      EBean ORM
        Play! Framework

and *every level* needs to have a clue about this or a way to pass the
information through transparently.

Now think about Hibernate, Sequel ORM, Django ORM, ActiveRecord,
EclipseLink, OpenJPA, DataMapper, Entity Framework, EBean, TopLink,
iBatis/MyBatis, Doctrine, Propel, CakePHP, Zend, SQLAlchemy, 

Wouldn't it be nice if we could find a solution to this user pain point
in one place?


What about using types on the PostgreSQL side which match the 
application types?


In any case, you *can* use strings everywhere if you use the 
stringtype=unspecified connection parameter:


http://jdbc.postgresql.org/documentation/92/connect.html#connection-parameters
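
For example, a connection URL along these lines (host and database name
are placeholders):

jdbc:postgresql://dbhost/mydb?stringtype=unspecified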

--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] 9.3.2 server creates hundreds of thousands of temporary files

2014-01-22 Thread Florian Weimer
I've got a query which causes PostgreSQL to create hundreds of thousands 
of temporary files, many of them empty.  The process also needs a lot of 
memory.  I suspect this is due to bookkeeping for those files.


The query looks like this:

SELECT ef.arch::text, en.name, file_id, f.name
  FROM symboldb.package_set_member psm
  JOIN symboldb.file f USING (package_id)
  JOIN symboldb.elf_file ef USING (contents_id)
  JOIN symboldb.elf_needed en USING (contents_id)
  WHERE psm.set_id = 36;

The query plan looks like this:

 Hash Join  (cost=3153618.94..7866324.29 rows=86653612 width=84)
   Hash Cond: (f.contents_id = ef.contents_id)
   ->  Hash Join  (cost=2349021.27..5624009.06 rows=5647704 width=68)
         Hash Cond: (psm.package_id = f.package_id)
         ->  Bitmap Heap Scan on package_set_member psm  (cost=495.42..5385.79 rows=33870 width=4)
               Recheck Cond: (set_id = 36)
               ->  Bitmap Index Scan on package_set_member_set_id_idx  (cost=0.00..486.95 rows=33870 width=0)
                     Index Cond: (set_id = 36)
         ->  Hash  (cost=1252310.60..1252310.60 rows=45263060 width=72)
               ->  Seq Scan on file f  (cost=0.00..1252310.60 rows=45263060 width=72)
   ->  Hash  (cost=592698.57..592698.57 rows=10958808 width=28)
         ->  Hash Join  (cost=51493.85..592698.57 rows=10958808 width=28)
               Hash Cond: (en.contents_id = ef.contents_id)
               ->  Seq Scan on elf_needed en  (cost=0.00..202333.08 rows=10958808 width=20)
               ->  Hash  (cost=30622.82..30622.82 rows=1272082 width=8)
                     ->  Seq Scan on elf_file ef  (cost=0.00..30622.82 rows=1272082 width=8)


This happens with postgresql-server-9.3.2-2.fc20.x86_64.

I tracked this down to a lower-than-usual setting of work_mem (1MB)
after the upgrade to 9.3.
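
Restoring a larger value just for the session that runs the query is
enough; a sketch (the ALTER DATABASE form assumes the database is also
named symboldb):

SET work_mem = '8MB';                            -- session-local override
-- or persistently for one database:
-- ALTER DATABASE symboldb SET work_mem = '8MB';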


With work_mem set to 8MB, the query completes in a reasonable time 
frame, with just a few thousand temporary files.  EXPLAIN ANALYZE gives 
these numbers:


 Hash Join  (cost=3153618.94..7866324.29 rows=86653612 width=84) (actual time=56041.178..87956.190 rows=511599 loops=1)
   Output: (ef.arch)::text, en.name, f.file_id, f.name
   Hash Cond: (f.contents_id = ef.contents_id)
   ->  Hash Join  (cost=2349021.27..5624009.06 rows=5647704 width=68) (actual time=46754.959..75014.287 rows=3820442 loops=1)
         Output: f.file_id, f.name, f.contents_id
         Hash Cond: (psm.package_id = f.package_id)
         ->  Bitmap Heap Scan on symboldb.package_set_member psm  (cost=495.42..5385.79 rows=33870 width=4) (actual time=2.124..8.425 rows=32872 loops=1)
               Output: psm.package_id
               Recheck Cond: (psm.set_id = 36)
               ->  Bitmap Index Scan on package_set_member_set_id_idx  (cost=0.00..486.95 rows=33870 width=0) (actual time=2.089..2.089 rows=33112 loops=1)
                     Index Cond: (psm.set_id = 36)
         ->  Hash  (cost=1252310.60..1252310.60 rows=45263060 width=72) (actual time=46699.578..46699.578 rows=45327619 loops=1)
               Output: f.file_id, f.name, f.package_id, f.contents_id
               Buckets: 8192  Batches: 2048 (originally 1024)  Memory Usage: 8193kB
               ->  Seq Scan on symboldb.file f  (cost=0.00..1252310.60 rows=45263060 width=72) (actual time=0.003..34658.946 rows=45327619 loops=1)
                     Output: f.file_id, f.name, f.package_id, f.contents_id
   ->  Hash  (cost=592698.57..592698.57 rows=10958808 width=28) (actual time=9261.212..9261.212 rows=10974399 loops=1)
         Output: ef.arch, ef.contents_id, en.name, en.contents_id
         Buckets: 16384  Batches: 128  Memory Usage: 5462kB
         ->  Hash Join  (cost=51493.85..592698.57 rows=10958808 width=28) (actual time=449.725..7130.583 rows=10974399 loops=1)
               Output: ef.arch, ef.contents_id, en.name, en.contents_id
               Hash Cond: (en.contents_id = ef.contents_id)
               ->  Seq Scan on symboldb.elf_needed en  (cost=0.00..202333.08 rows=10958808 width=20) (actual time=0.008..1633.980 rows=10974399 loops=1)
                     Output: en.name, en.contents_id
               ->  Hash  (cost=30622.82..30622.82 rows=1272082 width=8) (actual time=449.424..449.424 rows=1275149 loops=1)
                     Output: ef.arch, ef.contents_id
                     Buckets: 32768  Batches: 8  Memory Usage: 6253kB
                     ->  Seq Scan on symboldb.elf_file ef  (cost=0.00..30622.82 rows=1272082 width=8) (actual time=0.032..223.748 rows=1275149 loops=1)
                           Output: ef.arch, ef.contents_id
 Total runtime: 87983.826 ms

Is this a bug?

--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Testing an extension without installing it

2013-12-06 Thread Florian Weimer
Is it possible to test an C extension module (.so file) without 
installing a SHAREDIR/extension/extension_name.control file?


My test suite already runs initdb and the database as a non-postgres 
user, but I don't see a way to override the extension control file location.


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] What is the relationship between checkpoint and wal

2013-08-26 Thread Florian Weimer

On 08/26/2013 11:37 AM, Luca Ferrari wrote:

On Mon, Aug 26, 2013 at 4:57 AM, 高健 luckyjack...@gmail.com wrote:

But why writes the entire content of each disk page to WAL ?



The documentation states that "The row-level change data normally
stored in WAL will not be enough to completely restore such a page
during post-crash recovery."  I guess that a mixed page (i.e., a page
that contains old and new data) cannot be safely recovered with
deltas, so you need to have a clean page image from which to start
recovery.


Correct, the full-page image is needed for restoring the known state of 
a page.


It also speeds up recovery because you can just fire off writes to the 
pages under recovery, followed by incremental updates to the cached 
copy.  Without full page writes, you'd have to read the current version 
of the page from the disk first, often resulting in somewhat random read 
activity during recovery.
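
For reference, the behaviour under discussion is controlled by the
full_page_writes parameter:

SHOW full_page_writes;   -- 'on' by default; turning it off trades crash
                         -- safety for a smaller WAL volume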


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Problem creating index

2013-08-26 Thread Florian Weimer

On 08/26/2013 04:27 PM, Torello Querci wrote:

Create index statement that I use is:

CREATE INDEX dati_impianto_id_tipo_dato_id_data_misurazione_idx
   ON dati
   USING btree
   (impianto_id , tipo_dato_id , data_misurazione  DESC);


What are the data types of these columns?
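
One way to check, assuming the table is called dati as above:

SELECT column_name, data_type
  FROM information_schema.columns
 WHERE table_name = 'dati'
   AND column_name IN ('impianto_id', 'tipo_dato_id', 'data_misurazione');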

--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bottlenecks with large number of relation segment files

2013-08-06 Thread Florian Weimer

On 08/06/2013 12:28 PM, KONDO Mitsumasa wrote:

(2013/08/05 20:38), Florian Weimer wrote:

On 08/05/2013 10:42 AM, John R Pierce wrote:

On 8/5/2013 1:01 AM, KONDO Mitsumasa wrote:

When we open file, ext3 or ext4 file system seems to sequential search
inode for opening file in file directory.


no, ext3/4 uses H-tree structures to search directories over 1 block
long quite efficiently.


And the Linux dentry cache is rather aggressive, so most of the time,
only the
in-memory hash table will be consulted.  (The dentry cache only gets
flushed on
severe memory pressure.)



Are you really? When I put large number of files in same directory and
open, it is very very slow. But open directory is not.


The first file name resolution is slow, but subsequent resolutions 
typically happen from the dentry cache.  (The cache is not populated 
when the directory is opened.)


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Bottlenecks with large number of relation segment files

2013-08-05 Thread Florian Weimer

On 08/05/2013 10:42 AM, John R Pierce wrote:

On 8/5/2013 1:01 AM, KONDO Mitsumasa wrote:

When we open file, ext3 or ext4 file system seems to sequential search
inode for opening file in file directory.


no, ext3/4 uses H-tree structures to search directories over 1 block
long quite efficiently.


And the Linux dentry cache is rather aggressive, so most of the time, 
only the in-memory hash table will be consulted.  (The dentry cache only 
gets flushed on severe memory pressure.)


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] Floating point error

2013-03-06 Thread Florian Weimer

On 03/05/2013 07:23 PM, Tom Lane wrote:

Maciek Sakrejda m.sakre...@gmail.com writes:

Thank you: I think this is what I was missing, and what wasn't clear
from the proposed doc patch. But then how can pg_dump assume that it's
always safe to set extra_float_digits = 3?


It's been proven (don't have a link handy, but the paper is at least
a dozen years old) that 3 extra digits are sufficient to accurately
reconstruct any IEEE single or double float value, given properly
written conversion functions in libc.  So that's where that number comes
from.  Now, if either end is not using IEEE floats, you may or may not
get equivalent results --- but it's pretty hard to make any guarantees
at all in such a case.


There's also gdtoa, which returns the shortest decimal representation 
that rounds back to the same binary value.  It would print 0.1 as 0.1, 
but 0.1 + 0.2 as 0.30000000000000004.
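
A quick illustration in psql (the exact trailing digits depend on the
server version and C library):

SET extra_float_digits = 0;
SELECT 0.1::float8 + 0.2::float8;   -- displays as 0.3
SET extra_float_digits = 3;
SELECT 0.1::float8 + 0.2::float8;   -- displays the extra digits,
                                    -- e.g. 0.300000000000000044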


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Pipelining INSERTs using libpq

2012-12-23 Thread Florian Weimer

On 12/21/2012 03:29 PM, Merlin Moncure wrote:

How you attack this problem depends a lot on if all your data you want
to insert is available at once or you have to wait for it from some
actor on the client side.  The purpose of asynchronous API is to allow
client side work to continue while the server is busy with the query.


The client has only very little work to do until the next INSERT.


So they would only help in your case if there was some kind of other
processing you needed to do to gather the data and/or prepare the
queries.  Maybe then you'd PQsend multiple insert statements with a
single call.


I want to use parameterized queries, so I'll have to create an INSERT 
statement which inserts multiple rows.  Given that it's still 
stop-and-wait (even with PQsendParams), I can get through at most one 
batch per RTT, so the number of rows would have to be rather large for a 
cross-continental bulk load.  It's probably doable for local bulk loading.
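
A sketch of such a batched, parameterized INSERT (table and column names
are invented; the statement text would be generated to match the batch
size and sent with PQsendQueryParams):

INSERT INTO measurements (sensor_id, reading)
VALUES ($1, $2),
       ($3, $4),
       ($5, $6);   -- three rows per round trip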


Does the wire protocol support pipelining?  The server doesn't have to 
do much to implement it. It just has to avoid discarding unexpected 
bytes after the current frame and queue it for subsequent processing 
instead.


(Sorry if this message arrives twice.)
--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Pipelining INSERTs using libpq

2012-12-21 Thread Florian Weimer
I would like to pipeline INSERT statements.  The idea is to avoid 
waiting for server round trips if the INSERT has no RETURNING clause and 
runs in a transaction.  In my case, the failure of an individual INSERT 
is not particularly interesting (it's a can't happen scenario, more or 
less).  I believe this is how the X toolkit avoided network latency issues.


I wonder what's the best way to pipeline requests to the server using 
the libpq API.  Historically, I have used COPY FROM STDIN instead, but 
that requires (double) encoding and some client-side buffering plus 
heuristics if multiple tables are filled.


It does not seem possible to use the asynchronous APIs for this purpose, 
or am I missing something?


--
Florian Weimer / Red Hat Product Security Team


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is it possible to speed up addition of not null?

2012-02-02 Thread Florian Weimer
* hubert depesz lubaczewski:

 I tried with some indexes, but I can't get the time down to something
 reasonable, so here is my question: is there any way I could make the
 not null constraint *fast*?

You could patch pg_attribute directly.  I'm not sure if that's still safe
in current versions, though.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Puzzling full database lock

2012-02-02 Thread Florian Weimer
* Christopher Opena:

 We've been running into some very strange issues of late with our
 PostgreSQL database(s).  We have an issue where a couple of queries push
 high CPU on a few of our processors and the entire database locks (reads,
 writes, console cannot be achieved unless the high CPU query procs are
 killed).

Does the kernel log something?  Does dmesg display anything
illuminating?

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Is it possible to speed up addition of not null?

2012-02-02 Thread Florian Weimer
* hubert depesz lubaczewski:

 procedure would look like:
 1. update pg_attribute set attnotnull = true where attrelid = 
 'my_table'::regclass and attname = 'not-null-column';
 2. delete from my_table where not-null-column is null; -- this shouldn't
do anything, as I know that there are no null values, but just in
case
 3. pg_reorg of the table.

You could install a trigger before step 1 which prevents INSERTs and
UPDATEs which would add even more rows violating the constraint.

I'm not sure if the DELETE will actually do anything, given that
pg_attribute says that the column cannot be NULL.
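
A sketch of such a guard trigger (my_table and some_column stand in for
the real names):

CREATE FUNCTION forbid_new_nulls() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF NEW.some_column IS NULL THEN
        RAISE EXCEPTION 'some_column must not be NULL';
    END IF;
    RETURN NEW;
END;
$$;

CREATE TRIGGER forbid_new_nulls
    BEFORE INSERT OR UPDATE ON my_table
    FOR EACH ROW EXECUTE PROCEDURE forbid_new_nulls();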

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Incomplete startup packet help needed

2012-01-24 Thread Florian Weimer
* David Johnston:

 Immediately upon starting the server I get an incomplete startup
 packet log message.  Just prior there is an autovacuum launcher
 started message.

Like this?

2012-01-23 10:42:55.245 UTC 11545   LOG:  database system is ready to accept 
connections
2012-01-23 10:42:55.245 UTC 11549   LOG:  autovacuum launcher started
2012-01-23 10:42:55.268 UTC 11551 [unknown] [unknown] LOG:  incomplete startup 
packet

I think it's harmless, it's been there for years.  It might be related
to the init script that starts the database server.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] On duplicate ignore

2012-01-20 Thread Florian Weimer
* Lincoln Yeoh:

If you use serializable transactions in PostgreSQL 9.1, you can
implement such constraints in the application without additional
locking.  However, with concurrent writes and without an index, the rate
of detected serialization violations and resulting transactions aborts
will be high.

 Would writing application-side code to handle those transaction aborts
 in 9.1 be much easier than writing code to handle transaction
 aborts/DB exceptions due to unique constraint violations? These
 transaction aborts have to be handled differently (e.g. retried for X
 seconds/Y tries) from other sort of transaction aborts (not retried).

There's a separate error code, so it's easier to deal with in theory.
However, I don't think that's sufficient justification for removing the
unique constraints.

 Otherwise I don't see the benefit of this feature for this
 scenario. Unless of course you get significantly better performance by
 not having a unique constraint.

Performance is worse.

 If insert performance is not an issue and code simplicity is
 preferred, one could lock the table (with an exclusive lock mode),
 then do the selects and inserts, that way your code can assume that
 any transaction aborts are due to actual problems rather than
 concurrency. Which often means less code to write :).

Choosing the right lock is a bit tricky because you usually want to
block INSERTs only.  Explicit locks on a hash of the unique column,
using pg_advisory_xact_lock, are often an alternative.
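
A sketch of that approach for the duplicate-email case (table and column
names are invented; pg_advisory_xact_lock needs 9.1, and hashtext is an
undocumented helper, so any stable integer hash would do):

BEGIN;
-- serialize writers for this particular key only
SELECT pg_advisory_xact_lock(hashtext('users.email=someone@example.org'));
INSERT INTO users (email)
SELECT 'someone@example.org'
 WHERE NOT EXISTS (SELECT 1 FROM users WHERE email = 'someone@example.org');
COMMIT;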

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] On duplicate ignore

2012-01-20 Thread Florian Weimer
* Lincoln Yeoh:

 Is there a simple way to get postgresql to retry a transaction, or
 does the application have to actually reissue all the necessary
 statements again?

The application has to re-run the transaction, which might result in the
execution of different statements.  In the INSERT-or-UPDATE case, the
new attempt will have to use an UPDATE instead of an INSERT, so replaying
the statements verbatim will not work.

 I'd personally prefer to use locking and selects to avoid transaction
 aborts whether due to unique constraint violations or due to
 serialization violations.

Once you address the restart issue, transactional code is simpler and
easier to check for correctness.

Restarting transactions has other benefits, too.  For instance, you can
restart your PostgreSQL server process, and your applications will just
keep running.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] On duplicate ignore

2012-01-19 Thread Florian Weimer
* Gnanakumar:

 Just create a unique index on EMAIL column and handle error if it comes

 Thanks for your suggestion.  Of course, I do understand that this could be
 enforced/imposed at the database-level at any time.  But I'm trying to find
 out whether this could be solved at the application layer itself.  Any
 thoughts/ideas?

If you use serializable transactions in PostgreSQL 9.1, you can
implement such constraints in the application without additional
locking.  However, with concurrent writes and without an index, the rate
of detected serialization violations and resulting transactions aborts
will be high.
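
A sketch of the pattern (table and column names are invented; the
application has to retry on SQLSTATE 40001, serialization_failure):

BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT 1 FROM users WHERE email = 'someone@example.org';
-- only if the SELECT returned no row:
INSERT INTO users (email) VALUES ('someone@example.org');
COMMIT;   -- may abort with serialization_failure under concurrency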

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] On duplicate ignore

2012-01-19 Thread Florian Weimer
* Scott Marlowe:

 On Thu, Jan 19, 2012 at 7:54 AM, Florian Weimer fwei...@bfk.de wrote:
 * Gnanakumar:

 Just create a unique index on EMAIL column and handle error if it comes

 Thanks for your suggestion.  Of course, I do understand that this could be
 enforced/imposed at the database-level at any time.  But I'm trying to find
 out whether this could be solved at the application layer itself.  Any
 thoughts/ideas?

 If you use serializable transactions in PostgreSQL 9.1, you can
 implement such constraints in the application without additional
 locking.  However, with concurrent writes and without an index, the rate
 of detected serialization violations and resulting transactions aborts
 will be high.

 No, you sadly can't.  PostgreSQL doesn't yet support proper predicate
 locking to allow the application to be sure that the OP's original
 statement, and ones like it, don't have a race condition.  A unique
 index is the only way to be sure.

Huh?  This was one of the major new features in PostgreSQL 9.1.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Adding German Character Set to PostgresSQL

2012-01-03 Thread Florian Weimer
* Hagen Finley:

 Yes but I couldn't input your second line - the ('ä,ß,ö') was not
 possible via the psql client - just got beeped when I tried to type or
 paste those characters.

If you start cat instead of psql, can you enter those characters?
What about python or the shell itself? What terminal do you use?

This doesn't appear to be an issue with PostgreSQL as such, rather
something related to terminal configuration and perhaps readline.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] fsync on ext4 does not work

2011-12-20 Thread Florian Weimer
* Havasvölgyi Ottó:

 2011/12/19 Florian Weimer fwei...@bfk.de

 * Havasvölgyi Ottó:

  Even though the TPS in pgbench is about 700 with 1 client.
  I have tried other sync methods (fdatasync, open_sync), but all are
 similar.
  Should I disable write cache on HDD to make it work?

 Did you mount your ext4 file system with the nobarrier option?

 By default, ext4 is supposed to cope properly with hard disk caches,
 unless the drive is lying about completing writes (but in that case,
 disabling write caching is probably not going to help much with
 reliability, either).


 It is mounted with defaults, no other option yet, so it should flush.
 These HDDs are 7200 rpm SATA with some low level software RAID1.
 I cannot understand why disabling HDD write cache does not help either.
 Could you explain please?

The drive appears to be fundamentally broken.  Disabling the cache won't
change that.

But you mention software RAID1---perhaps your version of the RAID code
doesn't pass down the barriers to the disk?

 There is also an InnoDB transaction log on this partition, but its commit
 time is quite longer. On the same workload PgSql's commit is about 1 ms,
 but InnoDB's is about 4-7 ms. I think 4-7 is also too short to flush
 something to such disk, am I right?

Yes, it's still too low, unless multiple commits are grouped together.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] fsync on ext4 does not work

2011-12-19 Thread Florian Weimer
* Havasvölgyi Ottó:

 Even though the TPS in pgbench is about 700 with 1 client.
 I have tried other sync methods (fdatasync, open_sync), but all are similar.
 Should I disable write cache on HDD to make it work?

Did you mount your ext4 file system with the nobarrier option?

By default, ext4 is supposed to cope properly with hard disk caches,
unless the drive is lying about completing writes (but in that case,
disabling write caching is probably not going to help much with
reliability, either).

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] heavy swapping, not sure why

2011-08-30 Thread Florian Weimer
* Scott Marlowe:

 On a machine with lots of memory, I've run into pathological behaviour
 with both the RHEL 5 and Ubuntu 10.04 kernels where the kswapd starts
 eating up CPU and swap io like mad, while doing essentially nothing.
 Setting swappiness to 0 delayed this behaviour but did not stop it.
 Given that I'm on a machine with 128G ram, I just put /sbin/swapoff
-a in /etc/rc.local and voila, problem solved.

Was this NUMA machine?  Some older kernels can only migrate pages
between nodes through swap.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Generic terminal-based data browsing entry

2011-07-06 Thread Florian Weimer
I'm looking for a simple application which supports table-based and
card-based browsing and runs on a character terminal.  Field editing
should support an external editor for large values.  (The databases are
not directly reachable from a graphic terminal, and there is no HTTP
server running on them.)

psql is mostly fine for browsing, but I have some tables which have text
fields with a bad length distribution, and the nice and generally useful
padding in psql's output drastically inflates the output, to a point at
which it becomes completely unusable.  And of course, data entry using
psql leaves something to be desired.
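
Two psql settings that mitigate the padding problem somewhat, although
they do not help with data entry:

\pset format wrapped   -- wrap wide values to the terminal width
\x                     -- expanded, card-style output, one field per line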

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] LOCK TABLE permission requirements

2011-06-30 Thread Florian Weimer
* Josh Kupershmidt:

 On Wed, Jun 29, 2011 at 7:48 AM, Florian Weimer fwei...@bfk.de wrote:
 I've been looking around in the 9.0 documentation, but couldn't find the
 permission requirements for LOCK TABLE (in particular, LOCK TABLE IN
 SHARE MODE).  From the source, you need at least one of UPDATE, DELETE
 or TRUNCATE.

 Is there a reason why the INSERT privilege is not sufficient for LOCK
 TABLE, or is this just an oversight?

 The comments on this thread outline some reasons the permissions for
 LOCK TABLE are setup the way they are:
   http://archives.postgresql.org/pgsql-hackers/2010-11/msg01819.php

 Basically, if you have UPDATE, DELETE, or TRUNCATE privileges you can
 potentially lock out competing sessions on a table, similar to what
 some forms of LOCK TABLE would do; just having INSERT privileges
 doesn't necessarily give you that power.

This makes sense, thanks.  Doesn't REFERENCES have the same potential?
Then it could be added to the list.

In my pre-9.1 world, I need to acquire a table lock to avoid incorrectly
serialized transactions.  The application is only doing SELECTs and
INSERTs, so I don't want to grant it UPDATE privileges, but REFERENCES
would be fine.  Right now, I'm using a separate, empty table and lock
that, but that's a bit of a kludge.
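
The kludge looks roughly like this (table name is invented):

CREATE TABLE serialization_lock ();   -- stays empty, only used for locking

BEGIN;
LOCK TABLE serialization_lock IN SHARE MODE;
-- the SELECTs and INSERTs that must not interleave go here
COMMIT;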

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] LOCK TABLE permission requirements

2011-06-29 Thread Florian Weimer
I've been looking around in the 9.0 documentation, but couldn't find the
permission requirements for LOCK TABLE (in particular, LOCK TABLE IN
SHARE MODE).  From the source, you need at least one of UPDATE, DELETE
or TRUNCATE.

Is there a reason why the INSERT privilege is not sufficient for LOCK
TABLE, or is this just an oversight?

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: Fwd: Re: [GENERAL] SSDD reliability

2011-05-05 Thread Florian Weimer
* Greg Smith:

 Intel claims their Annual Failure Rate (AFR) on their SSDs in IT
 deployments (not OEM ones) is 0.6%.  Typical measured AFR rates for
 mechanical drives is around 2% during their first year, spiking to 5%
 afterwards.  I suspect that Intel's numbers are actually much better
 than the other manufacturers here, so a SSD from anyone else can
 easily be less reliable than a regular hard drive still.

I'm a bit concerned with usage-dependent failures.  Presumably, two SSDs
in a RAID-1 configuration wear down in the same way, and it would
be rather inconvenient if they failed at the same point.  With hard
disks, this doesn't seem to happen; even bad batches fail pretty much
randomly.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SSDs with Postgresql?

2011-04-28 Thread Florian Weimer
* Michael Nolan:

 If you archive your WAL files, wouldn't that give you a pretty good
 indication of write activity?

WAL archiving may increase WAL traffic considerably, I think.  Fresh
table contents (after CREATE TABLE or TRUNCATE) are written to the log
if WAL archiving is active.  This would have a significant performance
impact on some of our loads.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SSDs with Postgresql?

2011-04-28 Thread Florian Weimer
* Greg Smith:

 To convert the internal numbers returned by that into bytes, you'll
 need to do some math on them.  Examples showing how that works and
 code in a few languages:

Thanks for the pointers.

Those examples are slightly incongruent, but I think I've distilled
something that should be reasonably accurate.  The numbers look as if
they are valid, they match my expectations for different databases
with different loads.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SSDs with Postgresql?

2011-04-21 Thread Florian Weimer
* Greg Smith:

 The fact that every row update can temporarily use more than 8K means
 that actual write throughput on the WAL can be shockingly large.  The
 smallest customer I work with regularly has a 50GB database, yet they
 write 20GB of WAL every day.  You can imagine how much WAL is
 generated daily on systems with terabyte databases.

Interesting.  Is there an easy way to monitor WAL traffic?  It
does not have to be fine-grained, but it might be helpful to know if
we're doing 10 GB, 100 GB or 1 TB of WAL traffic on a particular
database, should the question of SSDs ever come up.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] SSDs with Postgresql?

2011-04-21 Thread Florian Weimer
* Adrian Klaver:

 Interesting.  Is there an easy way to monitor WAL traffic?  It
 does not have to be fine-grained, but it might be helpful to know if
 we're doing 10 GB, 100 GB or 1 TB of WAL traffic on a particular
 database, should the question of SSDs ever come up.

 They are found in $DATA/pg_xlog so checking the size of that
 directory regularly would get you the information.

But log files are recycled, so looking at the directory alone does not
seem particularly helpful.
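
One low-tech alternative is to sample the WAL insert position and
compute the difference (sketch; later releases, 9.2 if I remember
correctly, added pg_xlog_location_diff to do the math server-side):

SELECT pg_current_xlog_location();   -- e.g. '2F/8A123456', sample periodically
-- on newer servers, the byte difference between two samples:
-- SELECT pg_xlog_location_diff(pg_current_xlog_location(), '2F/8A123456');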

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Changing SHMMAX

2011-02-02 Thread Florian Weimer
* Adarsh Sharma:

 Please guide  me how to change it  permanently and what is the correct
 value for it.
 I am going for 8GB .

Usually, you can put these lines

kernel.shmall = 90
kernel.shmmax = 90

into /etc/sysctl.conf.  Run sysctl -p to activate them.  However,
this is a bit distribution-specific.
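
Illustrative values for an 8 GB limit, assuming 4 kB pages (shmall is
counted in pages, so 8 GB / 4 kB = 2097152; adjust to your page size):

kernel.shmmax = 8589934592
kernel.shmall = 2097152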

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] why sometimes checkpoint is too slow????

2011-01-19 Thread Florian Weimer
* Edmundo Robles L.:

 why  sometimes checkpoint  is too slow

Checkpoints are deliberately throttled to spread out the disk write
load.  Is that what you are observing?

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] encode(bytea_value, 'escape') in PostgreSQL 9.0

2010-12-06 Thread Florian Weimer
* Tom Lane:

 Florian Weimer fwei...@bfk.de writes:
 The old 'escape' encoding used by PostgreSQL 8.4 and prior was pretty
 helpful for getting human-readable strings in psql.  It seems this
 functionality was removed in PostgreSQL 9.0.  Was this an accident or
 a deliberate decision?  Could we get it back, please?

 I think you're looking for set bytea_output = escape.

To me, this seems problematic as a general recommendation because
programs won't use this, and it's confusing to have different output
in psql than what your program sees.  That's why I don't want to put
it into .psqlrc.  The separate command will raise a few eyebrows here
and there. 8-/

Put differently, I think it's rather odd that in 9.0, both
encode(bytea_value, 'escape') and encode(bytea_value, 'hex') output
hexadecimal values.  There's also an explicit way to request such
output, so I don't think that encode() should obey the bytea_output
setting.  In 8.4's psql, a BYTEA column and its escape-encoded TEXT
were displayed differently, so there is precedent.
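
For reference, the session setting Tom suggested looks like this:

SET bytea_output = 'escape';
SELECT 'abc'::bytea;   -- shown as abc rather than \x616263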

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] encode(bytea_value, 'escape') in PostgreSQL 9.0

2010-12-06 Thread Florian Weimer
* Tom Lane:

 Florian Weimer fwei...@bfk.de writes:
 Put differently, I think it's rather odd that in 9.0, both
 encode(bytea_value, 'escape') and encode(bytea_value, 'hex') output
 hexadecimal values.

 I don't believe that; encode produces text not bytea, so its result
 is not affected by this setting.

And you are right, as usual.  It turns out that we've got a
double-encoding issue in the loader.  How embarrassing.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] encode(bytea_value, 'escape') in PostgreSQL 9.0

2010-12-03 Thread Florian Weimer
The old 'escape' encoding used by PostgreSQL 8.4 and prior was pretty
helpful for getting human-readable strings in psql.  It seems this
functionality was removed in PostgreSQL 9.0.  Was this an accident or
a deliberate decision?  Could we get it back, please?

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL hanging on new connections?

2010-11-30 Thread Florian Weimer
* hubert depesz lubaczewski:

 Now, the question is: why did it hang? Is there anything we can do to
 make it *not* hang?

It might be some general system overload issue.  Try running echo w > 
/proc/sysrq-trigger as root the next time it happens.  This will dump
kernel backtraces to dmesg, which might hint to what's going on with
the system.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Streaming processing of result sets

2010-11-24 Thread Florian Weimer
Unless you use COPY, libpq loads the complete query result into
memory.  In some cases, this is not desirable.  I know that with
non-MVCC databases, it is important to load the result from the
database server in a non-blocking fashion because you can easily stall
other transactions or even deadlock if you block during result
processing (waiting for another network connection, for instance).  Is
this true for PostgreSQL as well, or can clients block without causing
too much trouble?

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] alter table add column - specify where the column will go?

2010-11-24 Thread Florian Weimer
* Grzegorz Jaśkiewicz:

 just never use SELECT *, but always call columns by names. You'll
 avoid having to depend on the order of columns, which is never
 guaranteed, even if the table on disk is one order, the return columns
 could be in some other.

This can't be true because several SQL features rely on deterministic
column order.  Here's an example:

SELECT 1 AS a, 2 AS b UNION SELECT 3 AS b, 4 AS a;

 a | b 
---+---
 1 | 2
 3 | 4
(2 rows)

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] alter table add column - specify where the column will go?

2010-11-24 Thread Florian Weimer
* Grzegorz Jaśkiewicz:

 2010/11/24 Florian Weimer fwei...@bfk.de:
 * Grzegorz Jaśkiewicz:

 just never use SELECT *, but always call columns by names. You'll
 avoid having to depend on the order of columns, which is never
 guaranteed, even if the table on disk is one order, the return columns
 could be in some other.

 This can't be true because several SQL features rely on deterministic
 column order.  Here's an example:

 SELECT 1 AS a, 2 AS b UNION SELECT 3 AS b, 4 AS a;

  a | b
 ---+---
  1 | 2
  3 | 4
 (2 rows)

 Yes, most DBs do a good job to keep it consistent, but they don't have
 to. So unless you specify column names explicitly (like you did in the
 example above), there's no guarantees.

If the database looked at the column names, the result would be
(1, 2), (4, 3), not (1, 2), (3, 4).

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Simple schema diff script in Perl

2010-09-17 Thread Florian Weimer
 sqlt-diff --to MySQL Schema1=MySQL Schema2=MySQL > SchemaUpdate.mysql.sql

 i can't guess where is the database name or user to use, if it work
 with dumps i need to give the dump files and the database type...

My version says:

| Currently (v0.0900), only MySQL is supported by this code.

I don't know if there is a newer version.

I can see that such a tool could be useful.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] \dt+ sizes don't include TOAST data

2010-01-21 Thread Florian Weimer
The sizes displayed by \dt+ in version 8.4.2 do not take TOAST tables
into account, presumably because the pg_relation_size does not reflect
that, either.  I think this is a bit surprising.  From a user
perspective, these are part of the table storage (I understand that
the indices might be a different story, but TOAST table are a fairly
deep implementation detail and should perhaps be hidden here).
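
The difference is visible with the size functions (table name is a
placeholder):

SELECT pg_size_pretty(pg_relation_size('mytable'))       AS main_fork_only,
       pg_size_pretty(pg_total_relation_size('mytable')) AS with_toast_and_indexes;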

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Possible causes for database corruption and solutions

2009-12-16 Thread Florian Weimer
* Craig Ringer:

 On 16/12/2009 3:54 PM, Florian Weimer wrote:
 * Michael Clark:

 The solution to the problem seemed to be to change the value for the
 wal_sync_method setting to fsync_writethrough from the default of fsync.
 I was curious if there were perhaps any other reasons that we should look
 at?  Or if there may be other alternatives to changing the wal_sync_method
 setting.

 Fsync and related settings only matter if the operating system (not
 just the database) crashes.  Does this happen frequently for you?

 When you're dealing with end users who have machines running
 god-knows-what kinds of awful hardware drivers

Even Mac OS X?  There should be less variety.

 and with no power protection, then I expect it does. Add laptop
 users with ageing/flakey batteries, laptops let go flat after they
 go into powersave suspend, etc, and you're sure to see plenty of
 cases of apparent crashes.

I hope that Mac OS X turns off write caches on low battery.

Improperly disconnected external drives are quite common and the
effect mimics operating system crashes, but is it common to store
PostgreSQL databases there?  I don't think so.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Possible causes for database corruption and solutions

2009-12-15 Thread Florian Weimer
* Michael Clark:

 The solution to the problem seemed to be to change the value for the
 wal_sync_method setting to fsync_writethrough from the default of fsync.
 I was curious if there were perhaps any other reasons that we should look
 at?  Or if there may be other alternatives to changing the wal_sync_method
 setting.

Fsync and related settings only matter if the operating system (not
just the database) crashes.  Does this happen frequently for you?

 I should note, our product runs on OS X, and I would say about 95% of the
 corruptions happen in a bytea column in a given table which tends to hold
 largish data (like email bodies which may or may not have embedded
 attachments).

That's not surprising if 95% of the data are stored that way.

 With wal_sync_method set to fsync it takes 2 seconds.
 With wal_sync_method set to fsync_writethrough it takes 3 minutes and 51
 seconds.

fsync_writethrough seems to be global in effect (not file specific),
so it's going to hurt if there is other I/O activity on the box.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Working around spurious unique constraint errors due to SERIALIZABLE bug

2009-07-20 Thread Florian Weimer
* Craig Ringer:

 The test program, attached, demonstrates what I should've known in the
 first place. In SERIALIZABLE isolation, the above is *guaranteed* to
 fail every time there's conflict, because concurrent transactions cannot
 see changes committed by the others. So is a SELECT test then separate
 INSERT, by the way. 

Yes, I forgot to mention that you can't use SERIALIZABLE if you use
this approach.

 Given that, it seems to me you'll have to rely on Pg's internal
 lower-level synchonization around unique indexes. Try the insert and see
 if it fails, then ROLLBACK TO SAVEPOINT (or use a PL/PgSQL exception
 block). As you noted, this does mean that certain side-effects may
 occur, including:

- advancement of sequences due to nextval(...) calls

- triggers that've done work that can't be rolled back, eg
  dblink calls, external file writes, inter-process communication etc

It's also the cost of producing the input data for the INSERT.

 (You might want to use the two-argument form of the advisory locking
 calls if your IDs are INTEGER size not INT8, and use the table oid for
 the first argument.)

Locking on a hash value could also be an option (it's how concurrent
hash tables are sometimes implemented).

 Also: Is this really a phantom read? Your issue is not that you read a
 record that then vanishes or no longer matches your filter criteria;
 rather, it's that a record is created that matches your criteria after
 you tested for it.

It's the INSERT which performs the phantom read.

And is SQL's definition of serializability really different from the
textbook one?

 Certainly that wouldn't be possible if the concurrent transactions were
 actually executed serially, but does the standard actually require that
 this be the case? If it does, then compliant implementations would have
 to do predicate locking. Ouch. Does anybody do that?

You don't need predicate locking here.  You just have to lock on the
gap in the index you touched.  I think some implementations do this
(InnoDB calls it next-key locking).

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Working around spurious unique constraint errors due to SERIALIZABLE bug

2009-07-20 Thread Florian Weimer
* Albe Laurenz:

 The original question asked was how can I tell an error that is caused
 by incomplete isolation from another error?

 If you have a code segment like
SELECT COUNT(id) INTO i2 FROM a WHERE id = i;
IF i2 = 0 THEN
   INSERT INTO a (id) VALUES (i);
END IF;

 Then you can be certain that any unique_violation thrown here must
 be a serialization problem (if the only unique contraint is on id).

I want to put this into a library, so I'd like something foolproof.
Right now, user code sets a flag which basically says that the wrapper
should retry the transaction a few times if a unique_violation is
detected, but I'd like to get rid of that because it's one thing less
the programmer needs to worry about.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Working around spurious unique constraint errors due to SERIALIZABLE bug

2009-07-16 Thread Florian Weimer
SERIALIZABLE isolation level doesn't really conform to the spec
because it doesn't deal with phantoms.  The only case I've come across
where this actually matters is when you're implementing some sort of
insert into table if not yet present operation.  This will typically
result in a unique constraint violation.[*]

Usually, constraint violations are programming errors, but not this
one.  It's more like a detected deadlock.  Is there a way to tell this
type of constraint violation from other types, so that the transaction
can be restarted automatically (as if there was a deadlock)?
Theoretically, PostgreSQL should detect that the conflicting row
wasn't there when the snapshot for the transaction was taken, and
somehow export this piece of information, but I'm not sure if it's
available to the client.

[*] One way to work around this is to batch inserts and eventually
perform them in a background task which doesn't run in parallel, but
this approach isn't always possible.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Working around spurious unique constraint errors due to SERIALIZABLE bug

2009-07-16 Thread Florian Weimer
* Albe Laurenz:

SELECT COUNT(id) INTO i2 FROM a WHERE id = i;
IF i2 = 0 THEN
   /* This INSERT will never throw an exception if the
  transactions are truly serialized */
   INSERT INTO a (id) VALUES (i);
   RETURN TRUE;
ELSE
   RETURN FALSE;
END IF;

 This is what you are talking about, right?

Yes.

 I am not sure what exactly you mean by retrying the transaction in
 Session A. Even on a second try A would not be able to insert the
 duplicate key. But at least there would not be an error:

I often need to obtain the automatically generated primary key in both
cases (with and without INSERT).

 The best way to work around a problem like this is to write
 code that does not assume true serializability, for example:

 BEGIN
INSERT INTO a (id) VALUES (i);
RETURN TRUE;
 EXCEPTION
WHEN unique_violation THEN
   RETURN FALSE;
 END;

Oh, since when does this use an implicit savepoint?  I haven't
noticed this before.

The drawback is that some of the side effects of the INSERT occur
before the constraint check fails, so it seems to me that I still need
to perform the select.

My main concern is that the unique violation could occur for another
reason (which would be a bug), and I want to avoid an endless loop in
such cases.  But if it's possible to isolate this type of error
recovery to a single statement, this risk is greatly reduced.
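
For the record, here is a rough sketch of what I mean by confining the
recovery to a single statement, using the table from the example above
(the function name is invented).  Note that under SERIALIZABLE a FALSE
result may still force a retry of the whole transaction, because the
surrounding snapshot cannot see the row that won the race:

CREATE FUNCTION insert_if_absent(i integer) RETURNS boolean AS $$
BEGIN
    INSERT INTO a (id) VALUES (i);
    RETURN TRUE;          -- we created the row
EXCEPTION
    WHEN unique_violation THEN
        RETURN FALSE;     -- the row already exists or was inserted concurrently
END;
$$ LANGUAGE plpgsql;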

-- 
Florian Weimer        fwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How to store text files in the postgresql?

2009-06-07 Thread Florian Weimer
* DimitryASuplatov:

 I`ve also worked out how to do this simply from bash

 ./bin/psql mypdb <<EOF
 insert into pdb values ('`cat /file/name`'); 
 EOF

This doesn't work if the file contains embedded ' characters (and
backslashes and NULs are also problematic).  You will also get errors
if the file encoding does not match the database encoding.

You probably should use a BYTEA column and a little Perl script which
uses bind_param to specify a type of PG_BYTEA for the parameter.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Apparent race in information_schema.tables

2009-05-11 Thread Florian Weimer
This query:

SELECT 1 FROM information_schema.tables
 WHERE table_schema = $1 AND table_name = $2;

fails sporadically with the error "relation with OID number does not
exist".  The query is run by a non-superuser, and the table/schema
combination exists in the database.  The query may have been PREPAREd
(it's submitted using DBD::Pg with the default flags)---I would have
to turn on logging to discover this, and I'm somewhat reluctant to do
so.

I guess the OID refers to a temporary table, because I can't find it in
pg_class.  The cause is probably a race between listing the tables and
applying the permission checks to them (which I don't need anyway, I
think), since access to the system catalogs is not subject to MVCC.
That suggests the problem should go away when I query pg_tables
instead, but I haven't tried that yet.

This happens with 8.2.6 and 8.2.13, and only while there is
significant CREATE TEMPORARY TABLE/DROP TABLE activity in another
backend process.
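
For reference, this is the pg_tables variant I have in mind (just a
sketch; unlike information_schema.tables, it does not call the
per-relation privilege functions, which is what I suspect triggers the
error):

SELECT 1 FROM pg_catalog.pg_tables
 WHERE schemaname = $1 AND tablename = $2;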

-- 
Florian Weimer        fwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Memory on 32bit machine

2008-01-07 Thread Florian Weimer
 Or would you rather vote for 64bit because there are no problems
 anymore and postgresql runs fine on 64bit debian.

We haven't run into any trouble with iptables on Debian etch, running
on amd64 hardware.  But we use only fairly standard iptables
functionality.

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] [Slony1-general] Any big slony and WAL shipping users?

2007-12-28 Thread Florian Weimer
* Andrew Sullivan:

 (For instance, a DNS company runs completely different name server code on
 completely different hardware and OS platforms in order to make sure not to
 be vulnerable to day-0 exploits.  That kind of thing.)

This only helps against crasher bugs.  For code injection, it's
devastating if the attacker can compromise one node, and by
diversifying, he or she can choose which code base to attack.  I guess
that in the database case, it's mostly the same, with crash bugs on
the one side (where diversification helps), and creeping data
corruption bugs on the other (where it might increase risk).  If you
use multiple systems with a comparator, things are different, of
course.

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-22 Thread Florian Weimer
* Magnus Hagander:

 Oh, that's interesting. That's actually a side effect of us increasing
 the stack size for the postgres.exe executable in order to work on other
 things. By default, it burns 1MB/thread, but ours will do 4MB. Never
 really thought of the problem that it'll run out of address space.
 Unfortunately, that size can't be changed in the CreateThread() call -
 only the initially committed size can be changed there.

Windows XP supports the STACK_SIZE_PARAM_IS_A_RESERVATION flag, which
apparently allows reducing the reserved size.  It might be better to do
this the other way round, though (leave the reservation at its 1 MB
default, and increase it only when necessary).

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [GENERAL] Undetected corruption of table files

2007-08-29 Thread Florian Weimer
* Alban Hertroys:

 If you have a proper production database server, your memory has
 error checking, and your RAID controller has something of the kind
 as well.

To my knowledge, no readily available controller performs validation
on reads (not even for RAID-1 or RAID-10, where it would be pretty
straightforward).

Something like an Adler32 checksum (not a full CRC) on each page might
be helpful.  However, what I'd really like to see is something that
catches missed writes, but this is very difficult to implement AFAICT.

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Memory settings, vm.overcommit, how to get it really safe?

2007-05-18 Thread Florian Weimer
* Scott Marlowe:

 What distro / kernel version of linux are you running?  We have a
 similar issue with late model hardware and RHEL4 recently here at
 work, where our workstations are running out of memory.  They aren't
 running postgresql, they're java dev workstations and it appears to be
 a RHEL4 on 64 bit problem, so that's why I ask.

When Java sees that your machine has got plenty of RAM and more than
one CPU, it assumes that it's a server and you want to run just a
single VM, and configures itself to use a fair chunk of available RAM.

This is more or less a Sun-specific issue.  Other Java implementations
make different choices.

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[GENERAL] Foreign keys, table inheritance, and TRUNCATE

2007-01-30 Thread Florian Weimer
Here's something I've just noticed:

CREATE TABLE foo (f INTEGER PRIMARY KEY);
INSERT INTO foo VALUES (1);
CREATE TABLE bar (b INTEGER REFERENCES foo);
CREATE TABLE bar1 () INHERITS (bar);
INSERT INTO bar1 VALUES (1);

This is quite correct:

TRUNCATE foo;
ERROR:  cannot truncate a table referenced in a foreign key constraint
DETAIL:  Table bar references foo.
HINT:  Truncate table bar at the same time, or use TRUNCATE ... CASCADE.

But:

TRUNCATE foo, bar;
SELECT * FROM bar;
 b
---
 1
(1 row)

SELECT * FROM foo;
 f
---
(0 rows)

Whoops.  The referential constraint has been violated.  Perhaps it would
be a good idea to make TRUNCATE on a parent table extend to all of its
children?
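
For completeness, listing the child table explicitly keeps the
constraint intact; that is just the obvious workaround, not a fix for
the inheritance behaviour:

TRUNCATE foo, bar, bar1;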

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Rollback using WAL files?

2007-01-26 Thread Florian Weimer
* M. A. Oude Kotte:

 I'm running a production/development database using PostgreSQL 8.1 on a
 Debian server. Due to some bad code in one of our applications which uses
 this database, some of the data was modified incorrectly the last few
 days. The idea is that I would like to restore the entire database as
 much as possible, meaning I would like to undo all transactions that
 were performed on it.

In theory, this should be possible (especially if you haven't switched
off full page writes).  But I don't know a ready-made solution for
this kind of task.

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Very long or where clause

2007-01-16 Thread Florian Weimer
* Scara Maccai:

 Which would be the best method to access data? Should I use a
 procedure on the server side?

I tend to use a join to a temporary table for similar purposes.  It
seems like the cleanest approach.
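
A rough sketch of what I mean (table name and values are invented):

BEGIN;
CREATE TEMPORARY TABLE wanted (id integer PRIMARY KEY) ON COMMIT DROP;
-- load the id list; COPY works just as well and saves round-trips
INSERT INTO wanted VALUES (17);
INSERT INTO wanted VALUES (42);
INSERT INTO wanted VALUES (4711);
SELECT t.* FROM big_table t JOIN wanted w ON w.id = t.id;
COMMIT;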

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] (Perl) script to set up an instance for regression tests

2006-12-13 Thread Florian Weimer
* Tom Lane:

 Florian Weimer [EMAIL PROTECTED] writes:
 For regression tests, I'd like to automatically set up a fresh
 PostgreSQL instance.  Has anybody automated the task (initdb, setting
 a password, choosing a port at random, starting the server, and after
 running the tests, stopping the server and deleting all the
 directories)?

 make check?

Ah, pg_regress.sh is indeed a good start.  Silly me.  Thanks.

 $ fakeroot /usr/lib/postgresql/8.1/bin/postgres -D . postgres
 root execution of the PostgreSQL server is not permitted.
 This is a major problem when autobuilding Debian packages. 8-(

 Surely you don't build random packages as root.

It's just fakeroot, a user-space emulation of UID=0, using pre-loaded
dynamic shared objects.  Although "fakeroot id -u" returns 0, you
cannot actually execute any operations that require privileges you
haven't got.

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] (Perl) script to set up an instance for regression tests

2006-12-12 Thread Florian Weimer
For regression tests, I'd like to automatically set up a fresh
PostgreSQL instance.  Has anybody automated the task (initdb, setting
a password, choosing a port at random, starting the server, and after
running the tests, stopping the server and deleting all the
directories)?  I know, it's a straightforward Perl script, but perhaps
someone else has already written it. 8-)

And:

$ fakeroot /usr/lib/postgresql/8.1/bin/postgres -D . postgres
root execution of the PostgreSQL server is not permitted.
[...]

This is a major problem when autobuilding Debian packages. 8-(

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] DISTINCT is not quite distinct

2006-11-06 Thread Florian Weimer
I run this innocent query

CREATE TABLE foo AS SELECT DISTINCT bar FROM baz ORDER BY bar;

and the resulting table contains duplicate rows. 8-(

According to EXPLAIN, an index scan on the bar column is used (using
the underlying B-tree index).  This is with PostgreSQL 8.1.4 (Debian
package 8.1.4-6).  Is this a known problem?

If I drop the DISTINCT, the output is not correctly ordered, either.
Perhaps this is an index corruption issue?  The hardware itself seems
fine.

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] DISTINCT is not quite distinct

2006-11-06 Thread Florian Weimer
* Tom Lane:

 According to EXPLAIN, an index scan on the bar column is used (using
 the underlying B-tree index).

 Do you mean an indexscan followed immediately by a Unique node?  If
 so, yeah, that would depend entirely on correct ordering of the
 indexscan output to produce distinct results.

Yes.

 If I drop the DISTINCT, the output is not correctly ordered, either.
 Perhaps this is an index corruption issue?  The hardware itself seems
 fine.

 Perhaps.  Do you want to save off a physical copy of the index and then
 try REINDEXing?

The duplicate row is gone.

 If that fixes it, I'd be interested to compare the two versions of
 the index.

The index files are about 155 MB and 98 MB, compressed.  How shall we
transfer them?  (Their contents are not super-secret, but I don't want
to distribute them widely, either.)

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] DISTINCT is not quite distinct

2006-11-06 Thread Florian Weimer
* Richard Huxton:

 I take it SELECT DISTINCT bar... shows the same problem?

SELECT bar FROM baz does *not* show the duplicate row.

 If so, can you do:
  SELECT OID,xmin,cmin,xmax,cmax,bar FROM baz
   WHERE bar = something with duplicates

Even if I force a complete index scan, I get xmin = 1007617 for both
rows, the others are zero.  The table hasn't got OIDs.

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Deadlock when one process opens two separate connections?

2006-09-26 Thread Florian Weimer
* Kaloyan Iliev:

 Probably because both transactions started from one process?

Yes, the deadlock detector isn't psychic.  It can't know about lock
ordering constraints which are external to PostgreSQL.

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Durlacher Allee 47tel: +49-721-96201-1
D-76131 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[GENERAL] Inherited tables vs UNION ALL views

2006-09-14 Thread Florian Weimer
I'm going to create a (manually) partitioned table and wonder whether I
should use inherited tables or an explicitly created view using UNION
ALL.  Constraint exclusion is not important for this application
(major updates will directly target the individual tables).

After these considerations, is there still a difference between the
two approaches?

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Durlacher Allee 47tel: +49-721-96201-1
D-76131 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] ECPG and COPY

2006-09-12 Thread Florian Weimer
* Bruce Momjian:

  Could you please explain what this has to do with my original question?
 
 I assumed that ECPG did something special with TO STDOUT, like other
 interfaces do.  This is not the case (that is, STDOUT is really
  standard output), so the functionality is not very useful.

 I am confused.  STDOUT is already implemented.

I wasn't aware of the fact that ECPG's implementation of STDOUT is
verbatim stdout (and not something similar to what DBD::Pg does).
This means both STDOUT and STDIN are not very useful (and STDIN even
less).

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Durlacher Allee 47tel: +49-721-96201-1
D-76131 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] ECPG and COPY

2006-08-02 Thread Florian Weimer
* Michael Meskes:

 COPY TO STDOUT has been implemented, but I'm unsure whether COPY FROM
 STDIN really makes sense. Does anyone know a real life example where
 this would be needed and the work couldn't be done easier using psql?

COPY FROM STDIN saves lots of network round-trips.
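
As a point of reference, this is roughly what the statement looks like
when the rows are supplied inline, e.g. from a psql script (table and
data are invented); an embedded-SQL program feeding the same stream
avoids one round-trip per row:

CREATE TABLE log_entry (logged_at timestamp, host text, message text);
COPY log_entry (logged_at, host, message) FROM STDIN WITH CSV;
2006-08-02 12:00:00,db1,connection accepted
2006-08-02 12:00:05,db1,disconnect
\.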

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Durlacher Allee 47tel: +49-721-96201-1
D-76131 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] ECPG and COPY

2006-08-02 Thread Florian Weimer
* Michael Meskes:

 Could you please explain what this has to do with my original question?

I assumed that ECPG did something special with TO STDOUT, like other
interfaces do.  This is not the case (that is, STDOUT is really
standard output), so the functionality is not very useful.

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Durlacher Allee 47tel: +49-721-96201-1
D-76131 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] INSERT 0 1 problems

2006-07-17 Thread Florian Weimer
* Stefan Schwarzer:

 INSERT 0 1

INSERT 48593 1

The former is printed if the table hasn't got an OID column (new
default), the latter is used when OIDs are available (old default,
nowadays it's CREATE TABLE ... WITH OIDS).
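
A quick way to see both variants side by side (table names are invented):

CREATE TABLE t_plain (x integer);            -- no OID column (new default)
INSERT INTO t_plain VALUES (1);              -- command tag: INSERT 0 1

CREATE TABLE t_oids (x integer) WITH OIDS;   -- explicit OID column
INSERT INTO t_oids VALUES (1);               -- command tag: INSERT <some OID> 1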

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Durlacher Allee 47tel: +49-721-96201-1
D-76131 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Timestamp vs timestamptz

2006-07-14 Thread Florian Weimer
* Agent M.:

 timestamp with time zone does not record the timezone you inserted it
 with- it simply stores the GMT version and converts to whatever
 timezone you like on demand.

Are you sure?  This behavior is not documented, and I can't reproduce
it with PostgreSQL 8.1.4.

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Durlacher Allee 47tel: +49-721-96201-1
D-76131 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Timestamp vs timestamptz

2006-07-14 Thread Florian Weimer
* Tom Lane:

 Florian Weimer [EMAIL PROTECTED] writes:
 * Agent M.:
 timestamp with time zone does not record the timezone you inserted it
 with- it simply stores the GMT version and converts to whatever
 timezone you like on demand.

 Are you sure?  This behavior is not documented, and I can't reproduce
 it with PostgresQL 8.1.4.

 Huh?  Section 8.5.1.3. Time Stamps says

Oops, I misread what Agent M wrote--timestamp with time zone vs
timestamp without time zone.  Sorry about that.

-- 
Florian Weimer        [EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Durlacher Allee 47tel: +49-721-96201-1
D-76131 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] Need help with quote escaping in exim for postgresql

2006-07-09 Thread Florian Weimer
* Martijn van Oosterhout:

 * If application always sends untrusted strings as out-of-line
 parameters, instead of embedding them into SQL commands, it is not
 vulnerable.

This paragraph should explicitly mention PQexecParams (which everybody
should use anyway).

It seems that Exim's architecture prevents the use of PQexecParams,
though.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Disk corruption detection

2006-06-12 Thread Florian Weimer
* Lincoln Yeoh:

 At 07:42 PM 6/11/2006 +0200, Florian Weimer wrote:

We recently had a partially failed disk in a RAID-1 configuration
which did not perform a write operation as requested.  Consequently,

 What RAID1 config/hardware/software was this?

I would expect that any RAID-1 controller works in this mode by
default.  It's analogous to RAID-5: there, you clearly can't verify the
parity bits on read for performance reasons, so why do it for RAID-1?

(If there is a controller which offers compare-on-read for RAID-1, I
would like to know its name. 8-)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Fabian Pascal and RDBMS deficiencies in fully

2006-06-12 Thread Florian Weimer
* Roy Souther:

 In what way could a database like PostgreSQL not be faithful to
 relational theory? Does he give any explanation as to what that means?

My guess: In SQL (and in PostgreSQL as a result), relations aren't
sets, aren't first-class, and the underlying logic is not Boolean.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] Disk corruption detection

2006-06-12 Thread Florian Weimer
* Jim C. Nasby:

 Anyway, what would the chances be for PostgreSQL to detect such
 corruption on a heap or index data file?  It's typically hard to
 detect this at the application level, so I don't expect wonders.  I'm
 just curious if using PostgreSQL would have helped to catch this
 sooner.

 I know that WAL pages are (or at least were) CRC'd, because there was
 extensive discussion around 32 bit vs 64 bit CRCs.

CRCs wouldn't help because the out-of-date copy has got a correct CRC.
That's why it's so hard to detect this problem at the application
level.  Putting redundancy into rows doesn't help, for instance.

 There is no such check for data pages, although PostgreSQL has other
 ways to detect errors. But in a nutshell, if you care about your
 data, buy hardware you can trust.

All hardware can fail. 8-/

AFAIK, compare-on-read is the recommended measure to compensate for this
kind of failure.  (The traditional recommendation also includes three
disks, so that you've got a tie-breaker.)  It seems to me that
PostgreSQL's MVCC-related "don't directly overwrite data rows" policy
might help to expose this sooner than with direct B-tree updates.

In this particular case, we would have avoided the failure if we
properly monitored the disk subsystem (the failure was gradual).
Fortunately, it was just a test system, but it got me worried a bit.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[GENERAL] Disk corruption detection

2006-06-11 Thread Florian Weimer
We recently had a partially failed disk in a RAID-1 configuration
which did not perform a write operation as requested.  Consequently,
the mirrored disks had different contents, and the file which
contained the block switched randomly between two copies, depending on
which disk had been read.  (In theory, it is possible to always read
from both disks, but this is not what RAID-1 configurations normally
do.)

Anyway, what would the chances be for PostgreSQL to detect such
corruption on a heap or index data file?  It's typically hard to
detect this at the application level, so I don't expect wonders.  I'm
just curious if using PostgreSQL would have helped to catch this
sooner.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Announce: GPL Framework centered on Postgres

2006-05-18 Thread Florian Weimer
* Kenneth Downs:

 If you seek to provide a closed source app that is built upon
 Andromeda, you are required to provide the source code to Andromeda
 itself.  However, your app is not a derivative work in the strict
 sense because your code is not mixed in with mine in any sense.  You
 never modify a file, and your files and mine are actually in separate
 directories.

Many proprietary software vendors think that if you program to an
interface which has a sole implementation, your code becomes a derived
work of that implementation.  If you sell different licenses for
run-time and development environments, such an attitude towards
copyright law seems inevitable.

It's a bit unfortunate that the FSF promotes this interpretation,
although it's necessary for creating an effective copyleft license for
libraries and other reusable components.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] Announce: GPL Framework centered on Postgres

2006-05-18 Thread Florian Weimer
* Joshua D. Drake:

 Sounds great! But why GPL? Are you looking to sell licenses?
 GPL is to spread it as far and wide as possible as fast as possible.

 LGPL?

 My concern would be, I can't use this toolkit for a closed source
 application if it is GPL.

Closed source?  It's a PHP framework. 8-)

Anyway, for a web application, the GPL is usually *less* restrictive
than various BSD license variants because you do not need to mention
the software in the end user documentation.  The viral aspect of the
GPL does not come into play because you do not actually distribute the
software.  You just run it on your servers.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Announce: GPL Framework centered on Postgres

2006-05-18 Thread Florian Weimer
* Kenneth Downs:

Many proprietary software vendors think that if you program to an
interface which has a sole implementation, your code becomes a derived
work of that implementation.  If you sell different licenses for
run-time and development environments, such an attitude towards
copyright law seems inevitable.

 I am not understanding you.  By sole implementation do you mean sole
 license, or single codebase, or cant-run-without-the-library?

Sole implementation, IOW, you cannot replace the implementation with
something else from a different vendor.

 The last sentence I don't understand at all, can you elaborate?

Suppose that I've implemented a COM (or CORBA) object.  I sell an SDK
(with documentation, IDL files and things like that) for $3,000.  For
each application which redistributes the object, I charge you $150
(because you aren't eligible for volume discounts).  Now the IDL files
can be reverse-engineered from the object in a straightforward manner.
So you go out, buy some software that includes the object (maybe even
one of my demo versions), and use that for development.  Instead of
paying me royalties, you instruct your customers to obtain the other
software to get the object.  This isn't too far-fetched, I've seen
things like that many moons ago.

 Which has me thinking of the idea of requiring a copyright notice in
 the HTML files sent to the browser, or some type of powered by
 notice.  I will add that to the list of ponderables along with LGPL.

This can be quite obnoxious if the application is ever used with a
non-browser front end.  It's also quite easy to remove the copyright
statement in a reverse proxy, without changing the application itself.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Announce: GPL Framework centered on Postgres

2006-05-18 Thread Florian Weimer
* Matteo Beccati:

 Hi,

 Florian Weimer wrote:
 Closed source?  It's a PHP framework. 8-)
 Anyway, for a web application, the GPL is usually *less* restrictive
 than various BSD license variants because you do not need to mention
 the software in the end user documentation.  The viral aspect of the
 GPL does not come into play because you do not actually distribute the
 software.  You just run it on your servers.

 So you're supposing that no one would ever build a distributable (free
 or commercial) application on your own framework, because if they do
 they are forced to release the whole project under GPL.

If the project is implemented in some kind of scripting language
(which does not offer persistent compilations, or some kind of
compilation which is easily reversed), the GPL vs BSD distinction is
not very important.  If you are technically forced to ship the program
as source code, a license that allows you to distribute binaries
without source code does not offer much more freedom than one which
forces you to distribute the source code if you distribute
(non-existent) binaries.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] [PERFORM] Arguments Pro/Contra Software Raid

2006-05-10 Thread Florian Weimer
* Hannes Dorbath:

 + Hardware Raids might be a bit easier to manage, if you never spend a
 few hours to learn Software Raid Tools.

I disagree.  RAID management is complicated, and once there is a disk
failure, all kinds of oddities can occur which can make it quite a
challenge to get back a non-degraded array.

With some RAID controllers, monitoring is difficult because they do not
use the system's logging mechanism for reporting.  In some cases, it
is not possible to monitor the health status of individual disks.

 + Using SATA drives is always a bit of risk, as some drives are lying
 about whether they are caching or not.

You can usually switch off caching.

 + Using hardware controllers, the array becomes locked to a particular
 vendor. You can't switch controller vendors as the array meta
 information is stored proprietary. In case the Raid is broken to a
 level the controller can't recover automatically this might complicate
 manual recovery by specialists.

It's even more difficult these days.  3ware controllers enable drive
passwords, so you can't access the drive from other controllers at all
(even if you could interpret the on-disk data).

 + Even battery backed controllers can't guarantee that data written to
 the drives is consistent after a power outage, neither that the drive
 does not corrupt something during the involuntary shutdown / power
 irregularities. (This is theoretical as any server will be UPS backed)

UPS failures are not unheard of. 8-/ Apart from that, you can address
a large class of shutdown failures if you replay a log stored in the
BBU on the next reboot (partial sector writes come to mind).

It is very difficult to check if the controller does this correctly,
though.

A few other things to note: You can't achieve significant port density
with non-RAID controllers, at least with SATA.  You need to buy a RAID
controller anyway.  You can't quite achieve what a BBU does (even if
you've got a small, fast persistent storage device) because there's
no host software support for such a configuration.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [GENERAL] Debian Packages For PostgreSQL

2006-02-09 Thread Florian Weimer
* Redefined Horizons:

 It looks like the packages.debian.org site is down. Is there another
 place where I can download a .deb for the latest stable version of
 PostgreSQL. (I don't have a direct link to the internet on my Linux
 box, so I can't use APT.)

http://ftp.debian.org/debian/pool/main/p/postgresql/; the packages
have "sarge" in their names.

Speaking of Debian, is there some list to discuss Debian-specific
packaging issues, e.g. how to create a Debian package which installs
some stored procedures written in C?

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] Postgresql FIFO Tables, How-To ?

2003-07-18 Thread Florian Weimer
Kirill Ponazdyr [EMAIL PROTECTED] writes:

 It is for an advanced syslog server product we are currently developing.

 The very basic idea is to feed all syslog messages into a DB and allow
 easy monitoring and even correlation, we use Postgres as our DB Backend,
 in big environments the machine would be hit with dozens of syslog
 messages in a second and the messaging tables could grow out of control
 pretty soon (We are talking of up to 10mil messages daily).

We have something similar (with about 50 log entries written per
second).  I guess you too have got a time-based column which is
indexed.  This means that you'll run into the creeping index
syndrome (as far as I understand it, pages in the index are never
reused because your column values are monotonically increasing).
Expiring old rows is a problem, too.  We now experiment with per-day
tables, which makes expiry rather cheap and avoids growing indices.
And backup is much easier, too.
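
A minimal sketch of the per-day layout (names are invented; this variant
uses inheritance, a UNION ALL view over plain tables works as well):

CREATE TABLE log (logged_at timestamp NOT NULL, host text, message text);

-- one child table per day:
CREATE TABLE log_2003_07_17 () INHERITS (log);
CREATE TABLE log_2003_07_18 () INHERITS (log);
CREATE INDEX log_2003_07_18_logged_at ON log_2003_07_18 (logged_at);

-- expiring a whole day is then just:
DROP TABLE log_2003_07_17;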

If you have some time for toying with different ideas, you might want
to look at TelegraphCQ.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [GENERAL] IPv4 addresses, unsigned integers, space

2003-07-17 Thread Florian Weimer
Jim Crate [EMAIL PROTECTED] writes:

 on 7/15/03, Florian Weimer [EMAIL PROTECTED] wrote:

If I switched from signed integers to unsigned integers (and from INET
to real IPv4 addresses, consisting of the relevant 32 bits only) I
think I could save about 25% of my table size.

 Why do you need unsigned ints to hold IP addresses?

This is a misunderstanding.  I could use both space-conservative IP
addresses and unsigned integers.

 What difference does it make if IP addresses with a class A higher
 than 127 appear as negative numbers?

The mapping does not preserve ordering if not done carefully.
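
To illustrate, an order-preserving mapping has to shift by 2^31 before
truncating to a signed integer (192.168.1.1 below is just an example
value):

-- store: unsigned 32-bit value -> signed key, order preserved
SELECT (3232235777 - 2147483648)::integer;    -- 192.168.1.1 becomes 1084752129

-- read back: signed key -> unsigned 32-bit value
SELECT 1084752129::bigint + 2147483648;       -- 3232235777 again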

---(end of broadcast)---
TIP 8: explain analyze is your friend


[GENERAL] IPv4 addresses, unsigned integers, space

2003-07-15 Thread Florian Weimer
If I switched from signed integers to unsigned integers (and from INET
to real IPv4 addresses, consisting of the relevant 32 bits only) I
think I could save about 25% of my table size.

Does PostgreSQL already implement these data types?  I don't think so.
If I succeed in implementing them, would you accept a patch?

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [GENERAL] IPv4 addresses, unsigned integers, space

2003-07-15 Thread Florian Weimer
Bruno Wolff III [EMAIL PROTECTED] writes:

 Does PostgreSQL already implement these data types?  I don't think so.
 If I succeed in implementing them, would you accept a patch?

 You can have unsigned integers using a domain with a check constraint.

They take twice as much storage as necessary.
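
For reference, a sketch of that approach (names are invented); the extra
storage comes from the fact that the domain has to sit on top of an
8-byte bigint:

CREATE DOMAIN uint4 AS bigint
    CHECK (VALUE >= 0 AND VALUE <= 4294967295);

CREATE TABLE hits (addr uint4 NOT NULL);
INSERT INTO hits VALUES (3232235777);   -- 192.168.1.1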

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org