[HACKERS] Request improve pg_stat_statements module

2014-02-27 Thread pgsql-kr
I patched the pg_stat_statements module to add one column
and sent the patch to its author, but
I received a rejection because of an unknown user address :(

so I am posting it to this mailing list.

I need the last execution time of each query, because I want to order my analysis by most recent execution.

The patch is below;
please review it.
I would like it to be applied in the next version.

--- diff begin
--- ../pg_stat_statements.orig/pg_stat_statements.c	2014-02-18 04:29:55.0 +0900
+++ pg_stat_statements.c	2014-02-28 15:34:38.0 +0900
@@ -59,6 +59,7 @@
 #include "storage/spin.h"
 #include "tcop/utility.h"
 #include "utils/builtins.h"
+#include "utils/timestamp.h"


 PG_MODULE_MAGIC;
@@ -116,6 +117,7 @@
 	double		blk_read_time;	/* time spent reading, in msec */
 	double		blk_write_time;	/* time spent writing, in msec */
 	double		usage;			/* usage factor */
+	TimestampTz	last_executed_timestamp;	/* last executed timestamp of query */
 } Counters;

 /*
@@ -1043,6 +1045,8 @@
 		e->counters.blk_read_time += INSTR_TIME_GET_MILLISEC(bufusage->blk_read_time);
 		e->counters.blk_write_time += INSTR_TIME_GET_MILLISEC(bufusage->blk_write_time);
 		e->counters.usage += USAGE_EXEC(total_time);
+		/* last executed timestamp */
+		e->counters.last_executed_timestamp = GetCurrentTimestamp();

 		SpinLockRelease(&e->mutex);
 	}
@@ -1069,7 +1073,8 @@
 }

 #define PG_STAT_STATEMENTS_COLS_V1_0	14
-#define PG_STAT_STATEMENTS_COLS			18
+#define PG_STAT_STATEMENTS_COLS_V1_1	18
+#define PG_STAT_STATEMENTS_COLS			19

 /*
  * Retrieve statement statistics.
@@ -1087,6 +1092,7 @@
 	HASH_SEQ_STATUS hash_seq;
 	pgssEntry  *entry;
 	bool		sql_supports_v1_1_counters = true;
+	bool		sql_supports_v1_2_counters = true;

 	if (!pgss || !pgss_hash)
 		ereport(ERROR,
@@ -1107,8 +1113,12 @@
 	/* Build a tuple descriptor for our result type */
 	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 		elog(ERROR, "return type must be a row type");
-	if (tupdesc->natts == PG_STAT_STATEMENTS_COLS_V1_0)
+	if (tupdesc->natts == PG_STAT_STATEMENTS_COLS_V1_0){
 		sql_supports_v1_1_counters = false;
+		sql_supports_v1_2_counters = false;
+	}
+	if (tupdesc->natts == PG_STAT_STATEMENTS_COLS_V1_1)
+		sql_supports_v1_2_counters = false;

 	per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
 	oldcontext = MemoryContextSwitchTo(per_query_ctx);
@@ -1185,8 +1195,15 @@
 			values[i++] = Float8GetDatumFast(tmp.blk_read_time);
 			values[i++] = Float8GetDatumFast(tmp.blk_write_time);
 		}
+		/* last_executed_timestamp */
+		if (sql_supports_v1_2_counters)
+			values[i++] = TimestampTzGetDatum(tmp.last_executed_timestamp);
+

-		Assert(i == (sql_supports_v1_1_counters ?
+		if (sql_supports_v1_2_counters)
+			Assert(i == PG_STAT_STATEMENTS_COLS);
+		else
+			Assert(i == (sql_supports_v1_1_counters ?
					 PG_STAT_STATEMENTS_COLS : PG_STAT_STATEMENTS_COLS_V1_0));

 		tuplestore_putvalues(tupstore, tupdesc, values, nulls);

-- end of diff
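
For reference, the intended usage is a query like the one below (just a sketch; it assumes the extension's SQL script is also updated so that the pg_stat_statements view exposes the new column):

SELECT query, calls, total_time, last_executed_timestamp
  FROM pg_stat_statements
 ORDER BY last_executed_timestamp DESC
 LIMIT 20;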




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xpath not a good replacement for xpath_string

2009-07-29 Thread pgsql
 On Tuesday 28 July 2009 23:30:23 pg...@mohawksoft.com wrote:
 The thing that perplexed me was that it was not obvious from the docs
 how, exactly, to get the functionality that was simple and
 straightforward in XML2.

 I continue to be in favor of adding

 xpath_string
 xpath_number
 xpath_boolean

 functions, which would be both easier to use and provide a more
 casting-free approach to pass the data around.  In the past there were
 some doubts and objections about that, but I think it could be done.


I totally agree, but I tend to be more of a pragmatist than a purist. It
seems to me that purists tend to like a lot of topical consistency in an
API, like the new implementation of xpath over the former, whereas a
pragmatist will violate some of the rules to make something easier to use.

The issue I have with the xpath implementation is that it seems more
geared to an XML implementation on top of SQL instead of an XML
implementation embedded within SQL.

For instance, I use an XML column in a database for metadata about
customers and other objects. So, we have a base table of objects, and the
specifics of each object are contained within XML.


So, the former API was perfect for this use:

select datum from objects where key = 'GUID' and
xpath_string(datum, E'foo/bar') = 'frobozz';

The logic of the function is that it is intended to use extracted
XML within a query. The new xpath functionality seems not to be designed
to facilitate this, requiring a pretty arcane query structure to do the
same thing:

select datum from objects where key='GUID' and (xpath(E'foo/bar',
XMLPARSE(CONTENT datum))::text())[1] = 'frobozz';

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xpath not a good replacement for xpath_string

2009-07-29 Thread pgsql


 select datum from objects where key = 'GUID' and
 xpath_string(datum, E'foo/bar') = 'frobozz';

 The logic of the function is that it is intended to use extracted
 XML within a query. The new xpath functionality seems not to be designed
 to facilitate this, requiring a pretty arcane query structure to do the
 same thing:

 select datum from objects where key='GUID' and (xpath(E'foo/bar',
 XMLPARSE(CONTENT datum))::text())[1] = 'frobozz';



 It's not that arcane. Mike Rylander and I came up with the same answer
 independently within a very short time of you posting your query. I
 guess it depends how used you are to using XPath.

That is sort of the point I was making. It just seems arcane, by the very
definition of arcane, within a SQL context. It is workable and it can be
used, but I don't think the change was designed to make writing queries
easier. It was designed to be more about XPath than SQL.


 It's also probably not terribly hard to produce a wrapper to do what
 you'd like.

No, it isn't, but that doesn't make the usage of XPath any easier in the
more general case.
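
For what it's worth, a minimal sketch of such a wrapper is below (illustration only, not part of any patch; it returns the text of the first node matched by the path, which is close to, but not exactly, the xml2 xpath_string() semantics):

CREATE OR REPLACE FUNCTION xpath_string(doc text, path text)
RETURNS text AS $$
    SELECT (xpath($2 || '/text()', XMLPARSE(CONTENT $1)))[1]::text;
$$ LANGUAGE sql IMMUTABLE;

With that in place, the earlier style of query works again:

SELECT datum FROM objects
 WHERE key = 'GUID' AND xpath_string(datum, 'foo/bar') = 'frobozz';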


 I have no problem with adding some convenience functions. I do have a
 problem with functions where we try to make things easy and instead muck
 them up. We just ripped out a convenience from our xpath processing
 that was totally braindead, so this isn't an idle concern.

 I would argue that xpath_string is a fairly horrible name for what the
 xml2 module provides. Specifically, my objection is that an xpath query
 returns a nodeset, and what this function returns is not the string
 value of the nodeset, but the string value of the *contents* of those
 nodes, which is not the same thing at all. To that extent the xml2
 module documentation is at best imprecise and at worst plain wrong.

Well, the API is there, and it is, I guess, where PostgreSQL is going. But I
think, philosophically, the API needs to see the XML contained within SQL
columns as being able to represent variable and optional columns in
object-oriented environments easily. The harder it is to use a feature, the
less usable the feature is.

Do you disagree?



 cheers

 andrew

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] xpath not a good replacement for xpath_string

2009-07-28 Thread pgsql
Sorry to bring this up, I know you've been fighting about XML for a while.

Currently, I am using XML2 functionality and have tried to get the newer
XPath function to work similarly, but can't quite seem to do it.

I think the current xpath function is too limited. (The docs said to post
problems to hackers if I have an issue.)

For instance, we have a web application that uses java with an XML class
serializer/deserializer Xstream. It creates XML that looks like this:

<com.company.local.myclass>
<uuid>b5212259-a91f-4dca-a547-4fe89cf2f32c</uuid>
<email>j...@somedomain.com</email>
</com.company.local.myclass>

My current strategy is to use xml2 as:

select xpath_string(datum, E'/com\.company\.local\.myclass/uuid') as uuid
from table;

Which produces a usable:
b5212259-a91f-4dca-a547-4fe89cf2f32c

I have been trying to use xpath
select xpath(E'/com\.company\.local\.myclass/uuid', XMLPARSE(CONTENT
datum)) as uuid from table;

Which produces an unusable:
{<uuid>b5212259-a91f-4dca-a547-4fe89cf2f32c</uuid>}




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xpath not a good replacement for xpath_string

2009-07-28 Thread pgsql
 Andrew Dunstan and...@dunslane.net wrote:

 in fact the desired functionality is present [...] You just need to
 use the text() function to get the contents of the node, and an
 array subscript to pull it out of the result array.

 I just took a quick look, and that didn't jump out at me from the
 documentation.  Perhaps there should be an example or two of how to
 get the equivalent functionality through the newer standard API, for
 those looking to migrate?

 Would it make sense to supply convenience SQL functions which map
 some of the old API to the new?

The thing that perplexed me was that it was not obvious from the docs how,
exactly, to get the functionality that was simple and straightforward in
XML2.

Another thing that is troubling is that more exotic types do not seem to
be supported at all. For instance, in my example I used uuid, and if one
substitutes uuid() for text() that doesn't work.

The API is less intuitive than the previous incarnation and is, indeed,
more difficult to use.
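
For the archives, here is roughly what the text()-and-subscript form looks like for my earlier example (a sketch; "mytable" stands in for the real table, and the uuid comes from a SQL cast, since there is no uuid() XPath function):

SELECT ((xpath('/com.company.local.myclass/uuid/text()',
               XMLPARSE(CONTENT datum)))[1]::text)::uuid AS uuid
  FROM mytable;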




 -Kevin

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] WAL archive, warm backup and read-only slave

2009-05-24 Thread pgsql
How difficult would it be, and does anyone think it is possible, to have a
continuous restore_command (a la pg_standby) running AND have the database
operational in read-only mode?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-01 Thread pgsql
 On Tue, 2008-09-30 at 17:13 -0400, [EMAIL PROTECTED] wrote:
 
  I believe the idea was to make this as non-invasive as possible. And
  it would be really nice if this could be enabled without a dump/
  reload (maybe the upgrade stuff would make this possible?)
  --

 It's all about the probability of a duplicate check being generated. If
 you use a 32 bit checksum, then you have a theoretical probability of
 1 in 4 billion that a corrupt block will be missed (probably much lower
 depending on your algorithm). If you use a short, then a 1 in 65 thousand
 probability. If you use an 8 bit number, then 1 in 256.

 Why am I going on? Well, if there are any spare bits in a block header,
 they could be used for the check value.

 Even a 64-bit integer is just 0.1% of an 8k page size, and it is even
 less than 0.1% likely that a page will be 100% full and thus that 64 bits
 waste any real space at all.

 So I don't think that this is a space issue.


Oh, I don't think it is a space issue either; the question was whether there
could be a way to do it without a dump and restore. The only way that occurs
to me is if there are some unused bits in the block header. I haven't looked
at that code in years. The numerics are just a description of the
probability of a duplicate sum or CRC, meaning a false OK.

Also, regardless of whether or not the block is full, the block is read
and written as a block, and the underlying data is unimportant to the check.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-01 Thread pgsql
 Hannu Krosing [EMAIL PROTECTED] writes:
 So I don't think that this is a space issue.

 No, it's all about time penalties and loss of concurrency.

I don't think that the amount of time it would take to calculate and test
the sum is even important. It may have been on older CPUs, but these days
CPUs are very fast on data already in RAM, and a block is very small. On
x86 systems, depending on page alignment, we are talking about two or
three pages that will be in memory (they were used to read the block from
disk or were previously accessed).



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-01 Thread pgsql
 [EMAIL PROTECTED] writes:
 No, it's all about time penalties and loss of concurrency.

 I don't think that the amount of time it would take to calculate and test
 the sum is even important. It may have been on older CPUs, but these days
 CPUs are very fast on data already in RAM, and a block is very small. On
 x86 systems, depending on page alignment, we are talking about two or
 three pages that will be in memory (they were used to read the block from
 disk or were previously accessed).

 Your optimism is showing ;-).  XLogInsert routinely shows up as a major
 CPU hog in any update-intensive test, and AFAICT that's mostly from the
 CRC calculation for WAL records.

 We could possibly use something cheaper than a real CRC, though.  A
 word-wide XOR (ie, effectively a parity calculation) would be sufficient
 to detect most problems.

That was something that I mentioned in my first response. If the *only*
purpose of the check is to generate a pass or fail status, and not
something to be used to find where in the block the corruption is or to
attempt to regenerate the data, then we could certainly optimize the check
algorithm. A simple checksum may be good enough.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-10-01 Thread pgsql
 Aidan Van Dyk [EMAIL PROTECTED] writes:
 One possibility would be to double-buffer the write... i.e. as you
 calculate your CRC, you're doing it on a local copy of the block, which
 you hand to the OS to write...  If you're touching the whole block of
 memory to CRC it, it isn't *ridiculously* more expensive to copy the
 memory somewhere else as you do it...

 That actually seems like a really good idea.  We don't have to increase
 the buffer locking requirements, or make much of any change at all in
 the existing logic.  +1, especially if this is intended to be an
 optional feature (which I agree with).

I don't think it makes sense at all!

If you are going to double buffer, one presumes that for some non-zero
period of time the block must be locked while it is copied. You
wouldn't want it changing mid-copy, would you? How is this any less of a
hit than just calculating the checksum?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-09-30 Thread pgsql
 A customer of ours has been having trouble with corrupted data for some
 time.  Of course, we've almost always blamed hardware (and we've seen
 RAID controllers have their firmware upgraded, among other actions), but
 the useful thing to know is when corruption has happened, and where.

That is an important statement: the goal is to know when corruption happens,
not necessarily to be able to recover the block or to know where in the
block it is corrupt. Is that correct?


 So we've been tasked with adding CRCs to data files.

CRC or checksum? If the objective is merely general detection there
should be some latitude in choosing the methodology for performance.


 The idea is that these CRCs are going to be checked just after reading
 files from disk, and calculated just before writing it.  They are
 just a protection against the storage layer going mad; they are not
 intended to protect against faulty RAM, CPU or kernel.

It will actually find faults in all of them. If the CPU can't add and/or a
RAM location lost a bit, this will blow up just as easily as a bad block
will. It may misattribute the source of the error, but it will keep a bad
system from hiding.


 This code would be run-time or compile-time configurable.  I'm not
 absolutely sure which yet; the problem with run-time is what to do if
 the user restarts the server with the setting flipped.  It would have
 almost no impact on users who don't enable it.

CPU capacity on modern hardware within a small area of RAM is practically
infinite when compared to any sort of I/O.

 The implementation I'm envisioning requires the use of a new relation
 fork to store the per-block CRCs.  Initially I'm aiming at a CRC32 sum
 for each block.  FlushBuffer would calculate the checksum and store it
 in the CRC fork; ReadBuffer_common would read the page, calculate the
 checksum, and compare it to the one stored in the CRC fork.

Hell, all that is needed is a long or short checksum value in the block
itself. I mean, if you just want a sanity test, it doesn't take much. Using a
second relation creates confusion: if there is a CRC discrepancy between
two different blocks, which one is wrong? You need a third control to know.
If the block carries its own CRC or checksum and that is in error, the block
is bad.


 A buffer's io_in_progress lock protects the buffer's CRC.  We read and
 pin the CRC page before acquiring the lock, to avoid having two buffer
 IO operations in flight.

 I'd like to submit this for 8.4, but I want to ensure that -hackers at
 large approve of this feature before starting serious coding.

 Opinions?

If it's fast enough, it's a good idea. It could be very helpful in
protecting users' data.


 --
 Alvaro Herrera
 http://www.CommandPrompt.com/
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Block-level CRC checks

2008-09-30 Thread pgsql


 I believe the idea was to make this as non-invasive as possible. And
 it would be really nice if this could be enabled without a dump/
 reload (maybe the upgrade stuff would make this possible?)
 --

It's all about the probability of a duplicate check being generated. If
you use a 32 bit checksum, then you have a theoretical probability of 1 in
4 billion that a corrupt block will be missed (probably much lower
depending on your algorithm). If you use a short, then a 1 in 65 thousand
probability. If you use an 8 bit number, then 1 in 256.

Why am I going on? Well, if there are any spare bits in a block header,
they could be used for the check value.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Ad-hoc table type?

2008-09-29 Thread pgsql
 What you're talking about is a document-based database like
 StrokeDB or CouchDB. With hstore you don't need to parse the content of
 the 'aggregate' column; it provides the necessary methods. Also, we tried
 to speed up selects using indexes. Probably we need to refresh our
 interest in hstore. Do you have any actual proposals?

Proposals, not at this point. I'm trying to decide (a) whether I have the
time and (b) whether to do it with Postgres or SQLite. The hstore module, as
I said, looks really cool; I've contemplated something like it. I have a
module that provides a set of accessors for an XML text column that works
similarly, but it parses the XML on each access, and the application has to
create the XML. (I have XML creation modules for Java, PHP, C++, and
standard C bindings.)

It is more a conflict of data ideology, IMHO. There is a class of data
that is logically on the same level as other data, but is forced into a
secondary storage methodology. It isn't a pressing need, as there are
workarounds, but don't you think a cleaner interface makes sense? Also, what
is the overhead of the secondary storage mechanism? I think it would make
the life of application developers easier.



 Oleg

 On Sun, 28 Sep 2008, [EMAIL PROTECTED] wrote:

 [EMAIL PROTECTED] writes:
 Something like this:

 create adhoc table foo ();

 insert into foo (name, rank, serial) values ('joe', 'sargent', '42');

 In an ad-hoc table type, when an insert is made, and a column is not
 found, then a new varchar column is added.

 I know the idea has a lot of holes, and is probably a bad idea, but it
 answers an important problem of easily mapping programmatic types to a
 database.

 Seems like a table with one contrib/hstore column might be more
 relevant
 to this guy's idea of how to do database design.


 That's actually a very cool module, I hadn't seen it before. I've
 considered writing something like it, but more XML centric, but I'm not
 sure it answers the concept.

 I'm not sure if you have dealt with web site sessions and object
 persistence crap, but its a pain to get up and running and improving
 performance is a drag. Web guys tend to know very little about databases
 and tend, sadly, not to be very inquisitive about such things.

 Web session and user attribute objects are typically stored in a
 database
 as XML, JSON, or some other aggregated format in a single column
 (hstore).
 That works great for when you just need to access the data by the key,
 but
 if you want to use the data outside the web application for something
 like OLAP, you have to decide which attributes reside in the aggregate
 column or get promoted to a full fledged column. That's why you'll see
 tables with username, passwdhash, email, etc. in addition to an
 aggregated
 column of things like screen template, age, etc.

 So, how do you have a table of a generally arbitrary number of columns
 without creating some sort of aggregate column?  With an aggregate
 column,
 the data isn't on the same level as real column data, so you need to
 parse
 the aggregate to extract a value, and you have to do that for each
 value.
 On top of that, you then have to explain your aggregate strategy to the
 web guys.

 Being able to insert arbitrary named values, and extracting them
 similarly, IMHO works better and more naturally than some external
 aggregate system built on a column. I know it is a little outside the
 box thinking, what do you think?



   Regards,
   Oleg
 _
 Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
 Sternberg Astronomical Institute, Moscow University, Russia
 Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
 phone: +007(495)939-16-83, +007(495)939-23-83

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Ad-hoc table type?

2008-09-28 Thread pgsql
I was in a discussion with someone about the difference between ad-hoc
storage systems and SQL. Yes, I know, I was rolling my eyes as well. One
thing did strike me though was the idea that a table could contain a
variable number of columns.

Something like this:

create adhoc table foo ();

insert into foo (name, rank, serial) values ('joe', 'sargent', '42');

In an ad-hoc table type, when an insert is made, and a column is not
found, then a new varchar column is added.

I know the idea has a lot of holes, and is probably a bad idea, but it
answers an important problem of easily mapping programmatic types to a
database.

Anyone think it's interesting?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Ad-hoc table type?

2008-09-28 Thread pgsql
 [EMAIL PROTECTED] writes:
 Something like this:

 create adhoc table foo ();

 insert into foo (name, rank, serial) values ('joe', 'sargent', '42');

 In an ad-hoc table type, when an insert is made, and a column is not
 found, then a new varchar column is added.

 I know the idea has a lot of holes, and is probably a bad idea, but it
 answers an important problem of easily mapping programmatic types to a
 database.

 Seems like a table with one contrib/hstore column might be more relevant
 to this guy's idea of how to do database design.


That's actually a very cool module; I hadn't seen it before. I've
considered writing something like it, though more XML-centric, but I'm not
sure it answers the concept.

I'm not sure if you have dealt with web site sessions and object
persistence crap, but it's a pain to get up and running, and improving
performance is a drag. Web guys tend to know very little about databases
and tend, sadly, not to be very inquisitive about such things.

Web session and user attribute objects are typically stored in a database
as XML, JSON, or some other aggregated format in a single column (hstore).
That works great when you just need to access the data by the key, but
if you want to use the data outside the web application for something
like OLAP, you have to decide which attributes reside in the aggregate
column and which get promoted to a full-fledged column. That's why you'll
see tables with username, passwdhash, email, etc. in addition to an
aggregated column of things like screen template, age, etc.
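
A sketch of that split (made-up names, assuming contrib/hstore is installed):

CREATE TABLE user_profile (
    username   text PRIMARY KEY,
    passwdhash text NOT NULL,
    email      text,
    attrs      hstore   -- screen template, age, session TTL, and so on
);

SELECT username, attrs -> 'template' AS template
  FROM user_profile
 WHERE attrs -> 'age' = '42';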

So, how do you have a table of a generally arbitrary number of columns
without creating some sort of aggregate column?  With an aggregate column,
the data isn't on the same level as real column data, so you need to parse
the aggregate to extract a value, and you have to do that for each value.
On top of that, you then have to explain your aggregate strategy to the
web guys.

Being able to insert arbitrary named values, and to extract them
similarly, IMHO works better and more naturally than some external
aggregate system built on a column. I know it is a little outside-the-box
thinking; what do you think?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Ad-hoc table type?

2008-09-28 Thread pgsql
 [EMAIL PROTECTED] writes:
 Being able to insert arbitrary named values, and extracting them
 similarly, IMHO works better and more naturally than some external
 aggregate system built on a column. I know it is a little outside the
 box thinking, what do you think?

 I'm failing to see the point.  Allowing columns to spring into existence
 without any forethought seems to me to be all minuses and no pluses
 worth mentioning.

 * What if the column name is just a typo?

In an automated system like PHP, Java, etc. that's not too likely.


 * What datatype should it have?  (Always varchar is just lame.)

varchar or text is not just lame; SQLite used to do that exclusively.
One could argue that XML is nothing more than text.


 * Should it have an index?  If so, should it be unique?

The answer to that is, well, no, not unless the dba generates one or it is
declared. Just like any other column. All the rules that apply to create
table and alter table add column just apply naturally as would be
expected.

create adhoc table userdata(username varchar, email varchar, primary
key(email));


 * If you keep doing this, you'll soon find yourself reading out
 unbelievably wide tables (lots of columns), which won't be especially
 easy or efficient to process on either the backend or the client side.
 Plus you might run into the max-columns-per-tuple limit.

Well, I fully understand that it is not a general purpose, unlimited-width
sort of thing. In a programming environment, the target environment
for this type of feature, it is unlikely to be a runaway problem.


 If you've expended enough thought to be sure that the column is not just
 a typo, ISTM that you can afford to enter an ALTER TABLE ADD COLUMN
 command to tell the database the results of your genius.

Like I said, if you've never dealt with a live web site, maintained by a
team of web dudes working furiously to keep their jobs and get paid,
your only hope of keeping up with "Oh! I needed to add the 'time to live' of
the session into the session data" is to use an aggregate storage system.


 I do see the point that switching from member of an hstore column to
 real database column is pretty painful, but I don't see that allow
 columns to spring into existence solves that in any meaningful way.
 Is there some other way we could address such conversions?

Every other solution creates a second tier of data storage. You either
deal with data elements at the table level, or you create a roll-your-own
aggregate mechanism, or you make a HUGE (user, name, value) table and force
a join and an index scan for every select.  (A million users with 5-10
attributes each is an expensive join.)

 BTW, I think it is (or should be) possible to create an index on
 hstore->'mycol', so at least one of the reasons why you should *need*
 to switch to a real database column seems bogus.

Oh, yeah, functional indexes work great. I think you did that, right?
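
For example, something along these lines (a sketch with made-up names, assuming contrib/hstore):

CREATE INDEX sessions_template_idx ON sessions ((attrs -> 'template'));

SELECT * FROM sessions WHERE attrs -> 'template' = 'mobile';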

For what it's worth, I don't expect you to jump all over this. It really is
a divergence from classic SQL design. I'm not even sure I like it. In
fact, I don't like it, but the argument that you are being forced to
create a second-class data storage mechanism or a relational join for data
that is logically in a single relation does cause one to ponder the
problem.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Function call order dependency

2008-09-03 Thread pgsql
Is there a knowable order in which functions are called within a query in
PostgreSQL?

For example I'll use the Oracle contains function, though this is not
exactly what I'm doing, it just illustrates the issue clearly.

select *, score(1) from mytable where contains(mytable.title, 'Winding
Road', 1) order by score(1);

The contains function does a match against mytable.title for the term
'Winding Road' and both returns and saves an integer score which may be
retrieved later using the score(...) function. The integer used as a
parameter in score(...) and contains(...) is an index to reference which
score you need, as more than one contains(...) call may be used in a single
query.

This sets up an interesting issue: how can one ensure that contains() is
called prior to any score() function on each row? Is this possible? Is
there a specific order you can count on?

Would it be something like: where clause first, left to right, followed
by select terms, left to right, and lastly the order by clause?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function call order dependency

2008-09-03 Thread pgsql
 [EMAIL PROTECTED] writes:
 For example I'll use the Oracle contains function, though this is not
 exactly what I'm doing, it just illustrates the issue clearly.

 select *, score(1) from mytable where contains(mytable.title, 'Winding
 Road', 1) order by score(1);

 The contains function does a match against mytable.title for the term
 'Winding Road' and both returns and saves an integer score which may be
 retrieved later using the score(...) function.

 This is just a bad, bad idea.  Side-effects in a WHERE-clause function
 are guaranteed to cause headaches.  When (not if) it breaks, you get
 to keep both pieces.

I was kind of afraid of that. So, how could one implement such a function
set?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function call order dependency

2008-09-03 Thread pgsql
 [EMAIL PROTECTED] writes:
 For example I'll use the Oracle contains function, though this is not
 exactly what I'm doing, it just illustrates the issue clearly.

 select *, score(1) from mytable where contains(mytable.title, 'Winding
 Road', 1) order by score(1);

 The contains function does a match against mytable.title for the term
 'Winding Road' and both returns and saves an integer score which may be
 retrieved later using the score(...) function.

 This is just a bad, bad idea.  Side-effects in a WHERE-clause function
 are guaranteed to cause headaches.  When (not if) it breaks, you get
 to keep both pieces.

Well, I guess I need to alter the question a bit.

I need to perform an operation during query time and there are multiple
results based on the outcome. For instance (let's try this):

select myrank(t1.column1, t2.column2, 1) as rank,
myscore(t1.column1,t2.column2, 1) as score from t1, t2 where
myrank(t1.column1,t2.column2) > 10 order by myscore(t1.column1,
t2.column2, 1) desc;

This is a bit messier, and I wanted to resist this approach as it is ugly.
The underlying code will check the values of the first and second
parameters and only perform the operation if a previous call did not
already act on the current parameters.

Now, can I assume that, in the above select statement, each
permutation of t1.column1 and t2.column2 will only be evaluated once, and
that myscore(...) and myrank(...) will all be called before the next
permutation is evaluated?

So, basically, I don't want to recalculate the values for each and every
function call as that would make the system VERY slow.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function call order dependency

2008-09-03 Thread pgsql
 I was kind of afraid of that. So, how could one implement such a
 function
 set?

 Write a function (say, score_contains) that returns NULL whenever
 contains would return false, and the score otherwise.

 SELECT * FROM (
 SELECT *, score_contains(mytable.title, 'Winding Road', 1) AS
 score FROM mytable
 ) x WHERE x.score IS NOT NULL
 ORDER BY x.score

That could work, and while it fits my example, my actual need is a bit
more complex. For instance, say I have two variables (I actually have a
few that I need)

select myvar1(1), myvar2(1), myvar3(1) from mytable where
myfunction(mytable.column, 'some text to search for', 1) > 2;

How could I ensure that (1) myfunction is called prior to myvar1(),
myvar2(), and myvar3()? I think the answer is that I can't. So, the
obvious solution is to pass all the variables to all the functions and
have it first come first served.

The next issue is something like this:

select *, myvar1(t1.col1,t2.col2,1), myvar2(t1.col1,t2.col2,1) from t1,t2
where myfunction(t1.col1,t2.col2,1) > 10 order by
myvar3(t1.col1,t2.col2,1) desc;

Using a first come, first served strategy, is there any discontinuity
between the function calls for t1.col1 and t2.col2? Will they all be
called for a particular combination of t1.col1 and t2.col2, in some
unpredictable order, before the next row combination is evaluated, or
will I have to execute the underlying algorithm for each and every call?



 ...Robert



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Function call order dependency

2008-09-03 Thread pgsql
 [EMAIL PROTECTED] writes:
 I need to perform an operation during query time and there are multiple
 results based on the outcome. For instance: (Lets try this)

 select myrank(t1.column1, t2.column2, 1) as rank,
 myscore(t1.column1,t2.column2, 1) as score from t1, t2 where
 myrank(t1.column1,t2.column2) > 10 order by myscore(t1.column1,
 t2.column2, 1) desc;

 Why not have one function that produces multiple output columns?

I was sort of trying to make this a fairly generic SQL extension whose
methodology could be moved to other databases if needed. I guess multiple
columns could work. I've got some code in another extension that does
that.
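
To sketch that (placeholder names and a trivial body, just to show the shape):

CREATE OR REPLACE FUNCTION myrank_and_score(a text, b text, idx int,
                                            OUT myrank int, OUT myscore float8)
AS $$
    SELECT length($1) + length($2),
           (length($1) + length($2))::float8 / 100
$$ LANGUAGE sql IMMUTABLE;

-- One call per row computes both values; the subquery (kept from being
-- flattened by OFFSET 0) lets the outer query filter and sort on them:
SELECT (rs).myrank, (rs).myscore
  FROM (SELECT myrank_and_score(t1.column1, t2.column2, 1) AS rs
          FROM t1, t2 OFFSET 0) AS sub
 WHERE (rs).myrank > 10
 ORDER BY (rs).myscore DESC;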


 Now, can I assume that in the above select statement, that each
 permutation of t1.column1 and t2.column2 will only be evaluated once and
 that myscore(...) and myrank(...) will all be called before the next
 permutation is evaluated?

 You can assume that functions in the SELECT target list are evaluated
 exactly once per output row (at least as long as no
 SRFs-in-the-targetlist are involved).  I don't think it'd be wise to
 assume anything about order of evaluation, though it's probably true
 that it's left-to-right at the moment.

But are all the targeted items evaluated in close proximity to each other
BEFORE moving on to the next row? What about the where clause? Would that be
evaluated out of order with the select target list? I'm doing a fairly large
amount of processing, and doing it only once is important.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] SSL configure patch

2008-06-13 Thread pgsql
Here is the SSL patch we discussed previously for 8.3.1.

sslconfig.patch.8.3.1
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL and USER_CERT_FILE

2008-05-15 Thread pgsql
 Mark Woodward wrote:
 I am using PostgreSQL's SSL support and the conventions for the key and
 certifications don't make sense from the client perspective. Especially
 under Windows.

 I am proposing a few simple changes:

 Adding two API
 void PQsetSSLUserCertFileName(char *filename)
 {
 user_crt_filename = strdup(filename);
 }
 PQsetSSLUserKeyFileName(char *filename)
 {
 user_key_filename = strdup(filename);
 }



 [snip]
 Any comments?




 I think it would probably be much better to allow for some environment
 variables to specify the locations of the client certificate and key
 (and the CA cert and CRL) - c.f. PGPASSFILE.

 That way not only could these be set by C programs but by any libpq user
 (I'm sure driver writers who use libpq don't want to have to bother with
 this stuff.) And we wouldn't need to change the API at all.


The problem I have with environment variables is that they tend not to be
application-specific and almost always lead to configuration issues. As a
methodology for default configuration, it adds flexibility. Also, the
current configuration does not easily take into consideration the idea
that different databases with different keys can be used from the same
system by the same user.

Maybe we need to go even further and add sslkey=filename and
sslcrt=filename to the PQconnect API in addition to sslmode?



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL and USER_CERT_FILE

2008-05-15 Thread pgsql
 [EMAIL PROTECTED] writes:
 Maybe we need to go even further and add it to the PQconnect API
 sslkey=filename and sslcrt=filename in addition to sslmode?

 If there's a case to be made for this at all, it should be handled the
 same way as all other libpq connection parameters.

   regards, tom lane


Here's the use case:

I have an application that must connect to multiple PostgreSQL databases
and must use secure communications, and the SSL keys are under the control
of the business units that administer the databases, not me. In addition, my
application also communicates with other SSL-enabled versions of itself.

I think you would agree that a hard-coded, immutable location for the client
interface is problematic.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL and USER_CERT_FILE

2008-05-15 Thread pgsql

 On May 15, 2008, at 6:31 AM, [EMAIL PROTECTED] wrote:

 Mark Woodward wrote:
 I am using PostgreSQL's SSL support and the conventions for the
 key and
 certifications don't make sense from the client perspective.
 Especially
 under Windows.

 I am proposing a few simple changes:

 Adding two API
 void PQsetSSLUserCertFileName(char *filename)
 {
user_crt_filename = strdup(filename);
 }
 PQsetSSLUserKeyFileName(char *filename)
 {
user_key_filename = strdup(filename);
 }



 [snip]
 Any comments?




 I think it would probably be much better to allow for some
 environment
 variables to specify the locations of the client certificate and key
 (and the CA cert and CRL) - c.f. PGPASSFILE.

 That way not only could these be set by C programs but by any libpq
 user
 (I'm sure driver writers who use libpq don't want to have to bother
 with
 this stuff.) And we wouldn't need to change the API at all.


 The problem I have with environment variables is that they tend not
 to be
 application specific and almost always lead to configuration issues.
 As a
 methodology for default configuration, it adds flexibility. Also, the
 current configuration does not easily take in to consideration the
 idea
 that different databases with different keys can be used from the same
 system the same user.

 Environment variables don't have to be set in your shell.

 This would seem to give the same functionality you suggest above,
 given support for environment variables:

 void PQsetSSLUserCertFileName(char *filename)
 {
     setenv("PGCERTFILE", filename, 1);
 }

 void PQsetSSLUserKeyFileName(char *filename)
 {
     setenv("PGKEYFILE", filename, 1);
 }

 Or, in perl, $ENV{PGKEYFILE} = $file and so on. It seems
 less intrusive than adding new API calls.

 Cheers,
Steve

Doesn't it make sense that the connection be configured in one place? I
agree with Tom, if it should be done, it should be done in PQconnectdb.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] SSL and USER_CERT_FILE round 2

2008-05-15 Thread pgsql
Adding sslkey and sslcert to the PQconnectdb connection string.

After some discussion, I think it is more appropriate to add the key/cert
file for SSL into the connect string. For example:

PQconnectdb("host=foo dbname=bar sslmode=require
sslkey=/opt/myapp/share/keys/client.key
sslcert=/opt/myapp/share/keys/client.crt");


Any comments?




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL and USER_CERT_FILE round 2

2008-05-15 Thread pgsql


 [EMAIL PROTECTED] wrote:
 Adding sslkey and sslcert to the PQconnectdb connection string.

 After some discussion, I think it is more appropriate to add the
 key/cert
 file for SSL into the connect string. For example:

 PQconnectdb("host=foo dbname=bar sslmode=require
 sslkey=/opt/myapp/share/keys/client.key
 sslcert=/opt/myapp/share/keys/client.crt");


 Any comments?



 I think if you're going to provide for these then you should also
 provide for the CA cert and CRL.

 Otherwise, it seems sensible.

I thought about that, but the root cert and CRL are for the server, and it
makes sense that those keys would be in the server directory. The server
needs to protect its data against clients wishing to connect.  The client,
on the other hand, needs to access one or more PostgreSQL servers.

It makes sense that the server keys and credentials be hard coded to its
protected data directory, while the client needs the ability to have
multiple keys.

Think of it this way, a specific lock only takes one key while a person
needs to carry multiple keys on a ring.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL and USER_CERT_FILE

2008-05-15 Thread pgsql
 [EMAIL PROTECTED] wrote:
  [EMAIL PROTECTED] writes:
  Maybe we need to go even further and add it to the PQconnect API
  sslkey=filename and sslcrt=filename in addition to sslmode?
 
  If there's a case to be made for this at all, it should be handled
  the same way as all other libpq connection parameters.
 
 regards, tom lane
 

 Here's the use case:

 I have an application that must connect to multiple PostgreSQL
 databases and must use secure communications and the SSL keys are
 under the control of the business units that administer the databases,
 not me. In addition my application also communicates with other SSL
 enabled versions of itself.

 I think you would agree that a hard coded immutable location for
 client interface is problematic.

 I agree fully with the use-case. Most of the other things we allow both
 as connection parameters and as environment variables, so we should do
 that IMHO. What could be debated is if we should also somehow allow it
 to be specified in .pgpass for example?



I am testing a patch that is currently against the 8.2 series.

It implements in PQconnectdb(...)

sslmode=require sslkey=client.key sslcert=client.crt ssltrustcrt=certs.pem
sslcrl=crl.pem

BTW: the revocation list probably never worked in the client.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] unsubscribe

2005-11-20 Thread pgsql
unsubscribe

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] signed short fd

2005-03-14 Thread pgsql


 Maybe we make the assumption that all OS will
 implement fd as an array index

 The POSIX spec requires open() to assign fd's consecutively from zero.
 http://www.opengroup.org/onlinepubs/007908799/xsh/open.html

With all due respect, PostgreSQL now runs natively on Win32. Having a
POSIX-only mentality, especially with something so trivial, is a mistake.
I would say int is the best way to handle it. You just *never* know.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] signed short fd

2005-03-14 Thread pgsql
 [EMAIL PROTECTED] writes:
 The POSIX spec requires open() to assign fd's consecutively from zero.
 http://www.opengroup.org/onlinepubs/007908799/xsh/open.html

 With all due respect, PostgreSQL now runs natively on Win32.

 ... using the POSIX APIs that Microsoft so kindly provides.
 fd.c will certainly not work at all on a platform that doesn't
 provide a POSIX-like file access API, and in the absence of any
 evidence to the contrary, I don't see why we shouldn't assume
 that the platform adheres to that part of the spec too.


I'm a better safe than sorry sort of guy. I would rather code
defensively against a poorly implemented API. However:

"Upon successful completion, the function will open the file and return a
non-negative integer representing the lowest numbered unused file
descriptor. Otherwise, -1 is returned and errno is set to indicate the
error. No files will be created or modified if the function returns -1."

That is hardly anything that I would feel comfortable with. Let's break
this down into all the areas that are ambiguous:

"Unused file descriptor": define "unused". Is it never used, or just not
currently in use? Could an API developer simply increment a counter on each
open? What about just allocating a structure on each open, and returning its
pointer cast to an int?

Also notice that no mention of process separation exists; it could very
well be that a file descriptor may be usable system-wide, with the
exceptions of stdin, stdout, and stderr.

Nowhere does it say how the file descriptors are numbered. 1,2,3,4 sure,
that's what you expect, but it isn't an explicitly documented behavior.

What is documented, however, is that it is a machine int, that the
number will be non-negative, and that it will be the lowest unused
descriptor (depending on the definition of "unused").

This is the sort of thing that makes software brittle and likely to crash.
Sure, you may be right in saying a short int is enough. Some developer
creating a POSIX clib may think he is right doing something his way. What
happens is that there is a potentially serious bug that will only show up
at seemingly random times.

The fact is that it is PostgreSQL that would be wrong; the API is
documented as returning an int, and PostgreSQL casts it to a short. Whatever
you read into the implementation of the API is wrong. The API is an
abstraction, and you should assume you don't know anything about it.



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] signed short fd

2005-03-14 Thread pgsql
 [EMAIL PROTECTED] writes:
 That is hardly anything that I would feel comfortable with. Lets break
 this down into all the areas that are ambiguous:

 There isn't anything ambiguous about this, nor is it credible that there
 are implementations that don't follow the intent of the spec.

How do you know the intent of the spec? I have seen no meta discussion
about the behavior of the file descriptor integer returned from open. The
Stevens book makes no such assumptions, and the Stevens book (Advanced
Programming in the UNIX Environment) is what people reference.

 Consider
 the standard paradigm for replacing stdout: you close(1) and then open()
 the target file.  If the open() doesn't pick 1 as the fd, you're screwed.
 Every shell in the world would break atop such an implementation.

I said that stdin, stdout, and stderr would be treated differently as they
are on all platforms.


 It may well be the case that saving 4 bytes per VFD is useless
 micro-optimization.   But the code isn't broken as it stands.

It most likely is not broken as it is, but it would be interesting to put
an assert(fd < 32768) in the code and see if it ever breaks. Nevertheless,
the spec DOES call for file fds to be a machine int. All
acceptable coding practices would demand that since the API spec calls for
an int, the application should use an int.

This is the sort of thing that is caught and fixed in any standard code
review. Why is this an argument? What am I missing that you are defending?




---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] signed short fd

2005-03-14 Thread pgsql

 My copy of APUE says on page 49: The file descriptor returned by open
 is the lowest numbered unused descriptor. This is used by some
 applications to open a new file on standard input, standard output, or
 standard error.

Yes, I'll restate my questions:

What is meant by "unused"? Is it to be read to mean that a higher numbered
fd is *never* returned if there is a lower number that has been used and is
now available? Is that something we can 100% absolutely depend on, on all
current and future platforms?

It is a stupid idea to truncate the upper bytes of an integer without good
reason. I can see LOTS of reasons why this will break something in the
future. The upper bits may be used to identify storage media or
characteristics.

My point is that the spec calls for an int, so PostgreSQL should use an int.


 Unless someone can show there's an actual problem this discussion seems
 quite pointless.


The point is that this *is* silly, but I am at a loss to understand why it
isn't a no-brainer to change. Why is there a fight over a trivial change
which will ensure that PostgreSQL aligns with the documented behavior of
open()?


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] signed short fd

2005-03-14 Thread pgsql
 [EMAIL PROTECTED] wrote:
 The point is that this *is* silly, but I am at a loss to understand why it
 isn't a no-brainer to change. Why is there a fight over a trivial change
 which will ensure that PostgreSQL aligns with the documented behavior of
 open()?

 (Why characterise this as a fight, rather than a discussion? Perhaps
 it is because of the same combative, adversarial attitude you seem to
 bring to every discussion you're involved in on -hackers...)

I really don't intend to do that, and it does seem to happen a lot. I am
the first to admit I lack tact, but oftentimes I view the decisions made
as rather arbitrary and lacking a larger perspective, but that is a rant I
don't want to get into right now.


 Anyway, I agree, there's no point keeping it a short; I highly doubt
 this would actually be a problem, but we may as well change it to an int.

And this is my point. There are things that are no-brainers, and a few
times I have been completely dumbfounded as to the source of resistance.
Silently truncating the upper 2 bytes of a data type declared as an int is
a bug. I can't believe anyone would defend it, but here it happens.

Maybe it is me. I know I'm stubborn and confrontational, personally I've
wished I could be different, but I'm 42 so I guess I'm not going to change
any time soon.

Regardless of the source, if you want code to be portable, you have to
take APIs at face value. Any assumptions you think you can make are by
definition wrong. Allow the API authors the space to handle what they need
to handle.

Assuming a specific behavior is dangerous. Is it currently a problem? Most
likely not, but since there is no downside, why leave it lurking to bite
us?



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression

2005-03-11 Thread pgsql
 Tom Lane wrote:
 [EMAIL PROTECTED] writes:
  Please see my posting about using a macro for snprintf.

 Wasn't the issue about odd behavior of the Win32 linker choosing the
 wrong vsnprintf?

 You're right, the point about the macro was to avoid linker weirdness on
 Windows.  We need to do that part in any case.  I think Bruce confused
 that issue with the one about whether our version supported %n$
 adequately ... which it doesn't just yet ...

 Perhaps I am reading old email in this reply but I thought I should
 clarify:

 Once we do:

  #define vsnprintf(...) pg_vsnprintf(__VA_ARGS__)
  #define snprintf(...)  pg_snprintf(__VA_ARGS__)
  #define printf(...)    pg_printf(__VA_ARGS__)


I'm not sure that macros can take a variable number of arguments on all
supported platforms. I've been burnt by this before.
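
As a hedged sketch of the alternative hinted at in this thread: object-like
(argument-less) macros rename the symbols without relying on C99 variadic
macros at all, so portability of __VA_ARGS__ never comes into play. The header
name and the pg_* prototypes below are placeholders taken from the discussion,
not the actual file.

  /* hypothetical pg_printf.h -- illustrative names only */
  #include <stdarg.h>
  #include <stddef.h>

  extern int pg_snprintf(char *buf, size_t len, const char *fmt, ...);
  extern int pg_vsnprintf(char *buf, size_t len, const char *fmt, va_list args);
  extern int pg_printf(const char *fmt, ...);

  /* object-like macros: no argument list, so no __VA_ARGS__ is needed */
  #define snprintf  pg_snprintf
  #define vsnprintf pg_vsnprintf
  #define printf    pg_printf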


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression

2005-03-10 Thread pgsql
 Tom Lane wrote:
 Bruce Momjian pgman@candle.pha.pa.us writes:
  Please see my posting about using a macro for snprintf.  If the
 current
  implementation of snprintf is enough for our existing translation
 users
  we probably don't need to add anything more to it because snprintf
 will
  not be exported to client applications.

 The CVS-tip implementation is fundamentally broken and won't work even
 for our internal uses.  I've not wasted time complaining about it
 because I thought we were going to replace it.  If we can't find a
 usable replacement then we're going to have to put a lot of effort
 into fixing what's there.  On the whole I think the effort would be
 better spent importing someone else's solution.

 Oh, so our existing implementation doesn't even meet our needs. OK.

Wasn't the issue about odd behavior of the Win32 linker choosing the wrong
vsnprintf?

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression

2005-03-09 Thread pgsql
From what I recall of the conversation, I would say rename the vsnprintf
and snprintf functions in postgres to pq_vsnprintf and pq_snprintf.
Define a couple of macros (in some common header, pqprintf.h?):

#define snprintf pq_snprintf
#define vsnprintf pq_vsnprintf

Then just maintain the postgres forms of printf, which have seemed to be OK
except that on Win32 vsnprintf, although in the same object file, was not
being used.



 Dear all,
 After struggling for one week to integrate FreeBSD's vfprintf.c into
 PostgreSQL I finally gave up. It is too dependent on underlying
 FreeBSD system functions. To incorporate it into PostgreSQL we need
 to move vfprintf.c file itself, two dozen files form gdtoa and a half
 a dozen __XXtoa.c files scattered in apparently random fashion all
 around FreeBSD source tree.

 Instead I researched some other implementations of snprintf on
 the web released under a license compatible with PostgreSQL's.
 The most suitable one I have come upon is Trio
 [http://daniel.haxx.se/projects/trio/].
 It is distributed under a MIT-like license which, I think will be
 compatible with us.

 What do you think about it? Shall I abandon FreeBSD and go ahead
 incorporating Trio?

 And by the way, what is the conclusion of the snprintf() vs. pg_snprintf()
 and UNIX libraries discussion a week ago? Which one shall
 I implement?

 Regards,
 Nicolai Tufar

 ---(end of broadcast)---
 TIP 6: Have you searched our list archives?

http://archives.postgresql.org



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


[HACKERS] Manual vs automatic functionality

2005-03-05 Thread pgsql
Tom recently said, when talking about allowing the user (in this case me)
to pass a hash table size to CREATE INDEX:

but that doesn't mean I want to make the user deal with it.


I started thinking about this and, maybe I'm old fashioned, but I would
like the ability to deal with it. So much software these days does things
in an automatic fashion, and too often you are left saying "stop, no, do it
that way, damn!" while tossing your hands up in frustration. Come on, be
honest, we all fight this.

I think the holy grail of completely automatic tuning/functionality is a
lofty goal and a good one to seek, but the harsh reality is that there are
many times when the statistics aren't sufficient, and the very broad
audience to which PostgreSQL is targeted clouds various specific use
cases.

I have been on the receiving end of these problems numerous times in almost 10
years of using PostgreSQL. While I still believe that PostgreSQL is, by
far, one of the best and most usable databases out there, there are times
I just get frustrated.

Being able to assign hints to queries may allow DBAs to tune
queries on tables whose characteristics are misrepresented by the
statistics gathered by ANALYZE.

Being able to pass a hash table size to a hash CREATE INDEX statement
may make hash indexes faster.

Would a hinting syntax help this out? I don't know, but I do know that on
just about every non-trivial project for which I have used PostgreSQL, I
have run into issues where I've had to manually alter the statistics source
code or enable/disable scan types to work around situations where PG just
didn't understand the nature of the problem.

Also, isn't SET enable_seqscan=FALSE just another, clumsier, way of
issuing a hint to the planner?

CREATE INDEX mytablehash ON mytable USING hash /* +HASH_SIZE(1000) */

SELECT * from table1, table2 where table1.realm = table2.realm and
table1.name = 'foo' /* +USE_INDEX(table1.realm) +USE_HASH(table1.realm,
table2.realm) */



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] bitmap AM design

2005-03-04 Thread pgsql
 Pailloncy Jean-Gerard wrote:
 You should have a look at this thread
 http://archives.postgresql.org/pgsql-hackers/2005-02/msg00263.php

 Take a look at this paper about lock-free parallel hash table
 http://www.cs.rug.nl/~wim/mechver/hashtable/

 Is this relevant? Hash indexes are on-disk data structures, so ISTM
 lock-free algorithms aren't really applicable.

 (BTW, is poor concurrency really the biggest issue with hash indexes? If
 so, there is some low-hanging fruit that I noticed a few years ago, but
 never got around to fixing: _hash_doinsert() write-locks the hash
 metapage on every insertion merely to increment a tuple counter. This
 could be improved by just acquiring the lock with probability 1/k, and
 incrementing the counter k times -- or some similar statistical
 approximation. IMHO there are bigger issues with hash indexes, like the
 lack of WAL safety, the very slow index build times, and their
 relatively poor performance -- i.e. no better than btree for any
 workload I've seen. If someone wants to step up to the plate and fix
 some of that, I'll improve hash index concurrency -- any takers? :-) )
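
To make the quoted "probability 1/k" idea concrete, here is a rough,
self-contained sketch in plain C; the counter, the lock functions and k are
hypothetical stand-ins, not PostgreSQL code. Instead of taking the metapage
lock on every insert, take it roughly one time in k and credit k insertions at
once, so the counter stays statistically correct while the lock is touched far
less often.

  #include <stdlib.h>

  /* hypothetical stand-ins for the metapage tuple counter and its lock */
  static long ntuples;
  static void lock_metapage(void)   { /* acquire the metapage write lock */ }
  static void unlock_metapage(void) { /* release the metapage write lock */ }

  /* called once per insertion; k trades accuracy for lock traffic */
  static void count_insert_statistically(int k)
  {
      if (rand() % k == 0)          /* take the lock about 1/k of the time ... */
      {
          lock_metapage();
          ntuples += k;             /* ... and credit k insertions in one go */
          unlock_metapage();
      }
  }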


As always, I'd love to have the time to do this stuff, but as always, I'm
not in a position to spend any time on it. It's frustrating.

Anyway, IMHO, hash indexes would be dramatically improved if you could
specify your own hashing function and declare an initial table size. If you
could do that, and work on the assumption that the hash index was for
fairly static data, it could handle many needs. As it stands, hash indexes
are virtually useless.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] bitmap AM design

2005-03-04 Thread pgsql
 [EMAIL PROTECTED] writes:
 Anyway, IMHO, hash indexes would be dramatically improved if you could
 specify your own hashing function

 That's called a custom operator class.

Would I also be able to query the bucket size and all that?


 and declare initial table size.

 It would be interesting to see if setting up the hashtable with about
 the right number of buckets initially would make CREATE INDEX enough
 faster to be a win ... but that doesn't mean I want to make the user
 deal with it.  We could probably hack hashbuild() to estimate the
 size of the parent table using the same code that the planner is now
 using (ie, actual size in pages times a possibly-dead-reckoning rows
 per page estimate).


I know a linear hash is different from a classic simple hash table, but a
classic simple hash table has some great advantages at the expense of disk
space. IMHO, being able to use the hash index in a way that is closer to the
classic theoretical hash table, falling back to the linear behavior if the
table grows beyond the initial estimate, would be a big win. It could
actually get to 1:1 operation-to-retrieval on properly estimated tables.

Estimations are a great idea; something like the first prime after 2*NROWS
(with a GUC limit, I guess) would probably make hash indexes the fastest,
most disk-hogging index around.
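
Purely to illustrate the "first prime after 2*NROWS" sizing idea (this is a
sketch of the arithmetic, not code from any patch): trial division is perfectly
adequate here because the computation would run once, at index build time.

  #include <stdbool.h>

  static bool is_prime(unsigned long n)
  {
      if (n < 2)
          return false;
      for (unsigned long d = 2; d * d <= n; d++)
          if (n % d == 0)
              return false;
      return true;
  }

  /* first prime >= 2 * estimated_rows, e.g. as a starting bucket count */
  static unsigned long initial_buckets(unsigned long estimated_rows)
  {
      unsigned long n = 2 * estimated_rows;
      while (!is_prime(n))
          n++;
      return n;
  }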


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] db cluster ?

2005-03-04 Thread pgsql
 Hello hackers,

 i'm wondering if it is possible to somehow spread a pretty big db (approx. 50G)
 over a few boxes to get more speed?
 if anyone did that i'd be glad to have some directions in the right way,


I have done different elements of clustering with PostgreSQL on a per-task
basis, but not a fully comprehensive generic distributed cluster. There
are a couple of tools you can use if you are an engineering sort of fellow.

Slony is a replication cluster; all the data is on all the machines.

There is a project that shows promise as a distributed data system:
contrib/dblink. One could segment their database as a number of logical
data managers and use dblink to incorporate the data in one database into
the queries on another. It won't be transparent, but could be fairly
manageable if you use views to implement the links.

I guess the real question is: what performance do you need to improve? If
it is just read performance, then Slony is probably your best bet. Put a
number of redundant machines behind a load balancer and you are all set.

If you need to increase write performance, well, that can be problematic.

What is it that your want to accomplish?

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] cluster table by two-column index ?

2005-03-03 Thread pgsql
 I'm wondering,
 is there any sense to cluster table using two-column index ?


We had this discussion a few weeks ago. Look in the archives for my
post "One Big Trend ...".

The problem is that while the statistics can reasonably deal with the
primary column, they completely miss the trends produced in the secondary
column. This situation can be seen quite clearly using the US Census TIGER
database.

I imagine the primary and secondary columns both have a discrete index and
the combined index is for the cluster or more complex queries.

If you execute a query based on the secondary column's index that should
return about 100 rows, the smaller trends produced in that column by the
cluster are not detected. So, rather than seeing that it's probably a few
index seeks and a few table seeks because the data is fairly well grouped,
the planner opts instead to do a table scan because it doesn't see any
correlation.

Increasing the number of samples in ANALYZE helps a bit, but the solution
is better statistics or maybe hints that can be embedded into the query.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] bitmap AM design

2005-03-03 Thread pgsql
 On a fine day (Tuesday, 1 March 2005, 14:54 -0500), [EMAIL PROTECTED]
 wrote:
 Now, it occurs to me that if my document reference table can refer to
 something other than an indexed primary key, I can save a lot of index
 processing time in PostgreSQL if I can have a safe analogy to CTID.

 I guess you could work on making hash indexes better (for concurrent
 access).

 'a safe analogy to CTID' looks remarkably like hash index


Yes, I agree, but I don't particularly like linear hash models without the
ability to adjust the initial table size estimates. Also, hash tables
without access to the hash function typically have a lot of collisions;
specifically, I am dubious of generic hash functions having optimally
dispersed behavior.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] snprintf causes regression tests to fail

2005-03-02 Thread pgsql

 The big question is why our own vsnprintf() is not being called from
 snprintf() in our port file.


I have seen this problem before, well, it isn't really a problem I guess.

I'm not sure of the gcc compiler options, but

On the Microsoft compiler, if you specify the option /Gy it separates the
functions within the object file, so that you don't pull in all the
functions from the object if they are not needed.

If, however, you create a function with the same name as another function,
and one is declared in an object compiled with the /Gy option while the
other's object file is not, then if you also use a different function or
reference a variable in the object file compiled without the /Gy option,
the conflicting function will probably be used. Make sense?

I would suggest using macros to redefine snprintf and vsnprintf to avoid
the issue:

#define snprintf pg_snprintf
#define vsnprintf pg_vsnprintf



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression

2005-03-02 Thread pgsql

 Yes, strangely the Windows linker is fine because libpqdll.def defines
 what symbols are exported.  I don't think Unix has that capability.

A non-static public function in a Windows DLL is not available for
dynamic linking unless explicitly declared as a DLL export. This behavior is
completely different from UNIX shared libraries.

Windows static libraries operate completely differently from Windows DLLs;
they work like their UNIX equivalents.

So, if you create an snprintf function in code that will be in both a
static and a dynamic library, the static library may have conflicts whereas
the dynamic one will not.

Don't you love Windows?
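
A minimal sketch of the difference being described, using the standard
__declspec spelling; the export macro name is hypothetical, and as noted above
a .def file such as libpqdll.def is another way to reach the same end.

  /* In a Windows DLL a public function is invisible to other modules unless
   * it is explicitly exported; in a Unix shared library every non-static
   * symbol is visible by default. */
  #ifdef _WIN32
  #define MYLIB_EXPORT __declspec(dllexport)   /* hypothetical macro name */
  #else
  #define MYLIB_EXPORT                         /* no-op on Unix */
  #endif

  /* exported from the DLL: usable by code that links against it */
  MYLIB_EXPORT int mylib_snprintf(char *buf, unsigned long len, const char *fmt, ...);

  /* not exported: callable inside the DLL, invisible outside it on Windows */
  int mylib_internal_helper(void);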

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression

2005-03-02 Thread pgsql
 Bruce Momjian pgman@candle.pha.pa.us writes:
 Tom Lane wrote:
 First line of thought: we surely must not insert a snprintf into
 libpq.so unless it is 100% up to spec *and* has no performance issues
 ... neither of which can be claimed of the CVS-tip version.

 Agreed, and we have to support all the 64-bit specifications a port
 might support like %qd and %I64d as well as %lld.  I have added that to
 our current CVS version.

 I really dislike that idea and request that you revert it.

 Is there any way we can have just gettext() call our snprintf under a
 special name?

 The issue only comes up in libpq --- in the backend there is no reason
 that snprintf can't be our snprintf, and likewise in self-contained
 programs like psql.  It might be worth the pain-in-the-neck quality to
 have libpq refer to the functions as pq_snprintf etc.  Perhaps we could
 do this via macros

 #define snprintf  pq_snprintf

Didn't I suggest that earlier? :) Also, since it is vsnprintf that seems
to be a bigger issue:

#define vsnprintf pq_vsnprintf

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] bitmap AM design

2005-03-01 Thread pgsql

 I don't think we really need any more fundamentally nonconcurrent index
 types :-(


Tom, I posted a message about a week ago (I forget the name) about a
persistent reference index, sort of like CTID, but basically a table
lookup. The idea is to simulate a structure so that ISAM-style techniques
can work in PostgreSQL.

Victor had emailed me and basically said he needed a similar sort of thing
for this bitmap index.

Setting the bitmap index issue aside for a moment, how hard would it be to
create a reference-table-like index? I'm sure that, given that construct, the
bitmap index becomes easier to build.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] bitmap AM design

2005-03-01 Thread pgsql
 [EMAIL PROTECTED] writes:
 Tom, I posted a message about a week ago (I forget the name) about a
 persistent reference index, sort of like CTID, but basically a table
 lookup. The idea is to simulate a structure that ISAM sort of techniques
 can work in PostgreSQL.

 Eliminating the bitmap index issue for a moment, how hard would it be to
 create a reference table like index?

 I didn't see the point.  You cannot actually use CTID that way (at least
 not without fundamentally redesigning our MVCC machinery) and anything
 else you might come up with is effectively just a random unique ID that
 has to be mapped through an index to find the row.  I don't see the
 advantage compared to any ordinary application-defined primary key.

I have a search engine product which does use a primary-key-based number.
The search engine also uses an external bitmap index. Here is the process:

Parse incoming text into discrete words.
Look up each word and retrieve its bitmap.
Combine all the bitmaps using the appropriate logical functions (AND, OR,
etc.).
List out all the 1s from the combined bitmap as entries into a table which
points to the primary key numbers.
Find all the records in the database with all the primary keys, sometimes
hundreds or thousands of entries in a WHERE IN (...) clause.
Now, after I've done all this logical work getting document numbers, I
have to do an index lookup for each one (or, god forbid, a full table
scan!).
This can be a long process, longer than actually doing the text search
with the bitmaps in the first place.
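
For readers unfamiliar with the flow above, a tiny, self-contained sketch of
the core bitmap step (word bitmaps ANDed together, then the set bits listed out
as document ordinals); it illustrates the idea only and is not FTSS code.

  #include <stdint.h>
  #include <stdio.h>

  #define NWORDS 4                     /* bitmap length in 64-bit words (256 documents) */

  /* AND two word bitmaps into dst: documents containing both words */
  static void bitmap_and(uint64_t *dst, const uint64_t *a, const uint64_t *b)
  {
      for (int i = 0; i < NWORDS; i++)
          dst[i] = a[i] & b[i];
  }

  /* list every set bit as a document ordinal (an offset into the document table) */
  static void list_matches(const uint64_t *bm)
  {
      for (int i = 0; i < NWORDS; i++)
          for (int bit = 0; bit < 64; bit++)
              if (bm[i] & ((uint64_t) 1 << bit))
                  printf("document #%d\n", i * 64 + bit);
  }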

A persistent reference index would be like almost any other index except
that as new items are added to a table a new entry is added to the end of
the index. When a row is updated, its CTID is updated in the table. When
you run vacuum, you just update the CTID in the table as if it were any
other index. When you delete an item, you clear the CTID value of the
table entry. You could do "VACUUM COMPRESS INDEX myindex", which would
re-order and compact the persistent reference.

I know that from a purely SQL standpoint it sounds wacky, but from a VAR or
consultant's point of view, it can really increase the performance of some
products. That last WHERE IN clause of my search engine can take several
tens of seconds to run, while the actual search engine only took 0.03
seconds to find the documents. A persistent reference system would
eliminate tens to thousands of index lookups per search query.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] bitmap AM design

2005-03-01 Thread pgsql
OK, let's step back a bit and see if there is a solution that fits both what
we think we need and PostgreSQL.

Let's talk about FTSS; it's something I can discuss easily. It is a two-stage
system with an indexer and a server. Only the data to be indexed is in the
database; all the FTSS data structures are in external files.

The indexer creates a number of data structures:
A table of document references, one entry per document.
A table of words parsed, one word per entry.
A table of compressed bitmaps, one (or more) bitmap(s) per word.

The relation of bits in the word bitmaps is one bit per document as
ordered by the document table, i.e. if bit 5 is set high, then the fifth
document is selected.

(Let's not discuss phrase analysis at this point.)

When the indexer runs, it executes a query that produces a set of results.
Each result has a document reference which is stored in the FTSS document
table.
The results are parsed out as discrete words; new words are added to the
word table, and previously seen words' reference counts are incremented.
A bitmap is created for each new word.
The bit for the current document is set to 1.
This procedure runs for each record in the query.

The server runs as follows:
It accepts an HTTP request for a search.
It parses out the discrete words.
Each word is found in the word table.
The word's bitmap is retrieved from the bitmap table.
A series of logical functions is performed on the retrieved bitmaps.
The resulting bitmap contains all the relevant documents in the form of
bits correlating to offsets into the document reference table.
The list of document references is returned to the database and found
using a WHERE IN clause.

Now, it occurs to me that if my document reference table can refer to
something other than an indexed primary key, I can save a lot of index
processing time in PostgreSQL if I can have a safe analogy to CTID.

I should be able to know more about a particular row (document) being
referenced, because I have already been through the table once.

I need to be able to know which rows are newer than my FTSS index so I
can search those rows more dynamically. I currently do this by saving the
highest value during indexing.



 [EMAIL PROTECTED] writes:
 A persistent reference index would be like almost any other index
 except
 that as new items are added to a table a new entry is added to the end
 of
 the index. When a row is updated, its CTID is updated in the table.

 This *does not work* under MVCC: it can't cope with referencing
 multiple simultaneously existing versions of a row.  In general, any
 index design that includes the words update in place can be rejected
 out of hand.

 In any case I still fail to see the advantage compared to an ordinary
 serial primary key.  You could have made your bitmaps reference the
 serial numbers directly, instead of an intermediate table.  (Either way
 still fails to handle MVCC updates, unless the indexable attributes
 cannot be changed by updates; but the intermediate table isn't helping
 or hurting that.)

 A bitmap that indexes CTIDs directly could work, because it doesn't need
 to update any entries in-place when a table row is updated.  I didn't
 care for the details of Victor's design because (a) the intermediate
 list of CTIDs looks bulky and (b) it seemed to require a lot of data
 shuffling to cope with growth of the table.  But in principle it would
 work.

   regards, tom lane



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression

2005-03-01 Thread pgsql
 Nicolai Tufar wrote:
 On Tue, 1 Mar 2005 00:55:20 -0500 (EST), Bruce Momjian
  My next guess
  is that Win32 isn't handling va_arg(..., long long int) properly.
 

 I am trying various combination of number and types
 of parameters in my test program and everything prints fine.
 When it comes to pg, it fails :(

 template1=# select * from test where x  1000::int8;
  x
 
  -869367531
 (1 row)

 I am not too fluent in source code, could someone
 point me to there actual call to snprintf() is being done
 when a query like this is executed. I could not find it myslef

 Sure, in src/backend/utils/adt/int8.c, there is a call in int8out():

  if ((len = snprintf(buf, MAXINT8LEN, INT64_FORMAT, val)) < 0)

 and that calls port/snprintf.c.

 I have added a puts() in snprintf.c to make sure it is getting the
 long/long specifier.

Just a question, and don't mind me if I am being rude, but isn't this the
WRONG PLACE for a printf function? Wouldn't an itoa-style function be more
efficient and less problematic?
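
As an aside, a minimal sketch of the itoa-style alternative being asked about
(hand-rolled 64-bit integer formatting with no format-string machinery); it is
illustrative only, not a routine from the PostgreSQL sources.

  #include <stdint.h>

  /* write val into buf (at least 21 bytes) and return the string length */
  static int int64_to_str(int64_t val, char *buf)
  {
      char     tmp[20];
      int      len = 0, n = 0;
      uint64_t u = (val < 0) ? (uint64_t) -(val + 1) + 1 : (uint64_t) val;

      if (val < 0)
          buf[len++] = '-';
      do {
          tmp[n++] = (char) ('0' + u % 10);   /* digits come out in reverse order */
          u /= 10;
      } while (u > 0);
      while (n > 0)
          buf[len++] = tmp[--n];              /* reverse them into the output */
      buf[len] = '\0';
      return len;
  }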


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression

2005-03-01 Thread pgsql
 I spent all day debugging it. Still have absolutely
 no idea what could possibly go wrong. Does
 anyone have a slightest clue what can it be and
 why it manifests itself only on win32?

It may be that the C library has badly broken support for 64-bit integers
on 32-bit platforms. Does anyone know of any Cygwin/MinGW issues?

Is this only with the new snprintf code on Win32?

Is this a problem with snprintf as implemented in src/port?

Is there a reason why we don't use the snprintf that comes with the
various C compilers?




 On Tue, 1 Mar 2005 09:29:07 -0500 (EST), Bruce Momjian
 pgman@candle.pha.pa.us wrote:
 Nicolai Tufar wrote:
  On Tue, 1 Mar 2005 00:55:20 -0500 (EST), Bruce Momjian
   My next guess
   is that Win32 isn't handling va_arg(..., long long int) properly.
  
 
  I am trying various combination of number and types
  of parameters in my test program and everything prints fine.
  When it comes to pg, it fails :(
 
  template1=# select * from test where x  1000::int8;
   x
  
   -869367531
  (1 row)
 
  I am not too fluent in source code, could someone
  point me to there actual call to snprintf() is being done
  when a query like this is executed. I could not find it myslef

 Sure, in src/backend/utils/adt/int8.c, there is a call in int8out():

  if ((len = snprintf(buf, MAXINT8LEN, INT64_FORMAT, val)) < 0)

 and that calls port/snprintf.c.

 I have added a puts() in snprintf.c to make sure it is getting the
 long/long specifier.

 --
   Bruce Momjian|  http://candle.pha.pa.us
   pgman@candle.pha.pa.us   |  (610) 359-1001
   +  If your life is a hard drive, |  13 Roberts Road
   +  Christ can be your backup.|  Newtown Square, Pennsylvania
 19073


 ---(end of broadcast)---
 TIP 8: explain analyze is your friend



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [pgsql-hackers-win32] [HACKERS] snprintf causes regression

2005-03-01 Thread pgsql
 On Tue, 1 Mar 2005 15:38:58 -0500 (EST), [EMAIL PROTECTED]
 [EMAIL PROTECTED] wrote:
 Is there a reason why we don't use the snprintf that comes with the
 various C compilers?

 snprintf() is usually buried in OS libraries. We implement
 our own snprintf to make things like this:
  snprintf(buf, "%2$s %1$s", "world", "Hello");
 which is not supported on some platforms work.

 We do it for national language translation of
 messages. In some languages you may need
 to change order of parameters to make a meaningful
 sentence.

 Another question is why we are using it for printing
 values from the database. I am not too good at function
 overriding in standard C, but maybe there is a way
 to use the standard snprintf() in a particular place.


Well, here is a stupid question: do we even know which snprintf we are
using on Win32? Might it be possible that we are using the MinGW version,
which may be broken?


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] snprintf causes regression tests to fail

2005-02-28 Thread pgsql
 Linux and Solaris 10 x86 pass regression tests fine when I force the use
 of the new snprintf(). The problem should be win32-specific. I will
 investigate it thoroughly tonight. Can someone experienced in win32 tell me
 what could possibly be the problem?

Do we have any idea about what format string causes the regression failure?
It may be that certain integer types are not promoted uniformly when
pushed on the stack.



 Nick

 On Sun, 27 Feb 2005 19:07:16 +0100, Magnus Hagander [EMAIL PROTECTED]
 wrote:
 Hi!

  The new snprintf code appears not to deal with 64-bit ints. I'm getting
  failures on win32 for int8 as well as all the time related tests (win32
  uses int8 for timestamps). Removing the snprintf code and falling back
 to the OS code makes everything pass again.

 I would guess this affects int8 etc on other platforms as well (assuming
 they use our snprintf and not the libc one), but I haven't checked it.

 //Magnus



 ---(end of broadcast)---
 TIP 9: the planner will ignore your desire to choose an index scan if your
   joining column's datatypes do not match



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync

2005-02-24 Thread pgsql
 Magnus Hagander [EMAIL PROTECTED] writes:
 My results are:
 Fisrt, baseline:
 * Linux, with fsync (default), write-cache disabled: no data corruption
 * Linux, with fsync (default), write-cache enabled: usually no data
 corruption, but two runs which had

 That makes sense.

 * Win32, with fsync, write-cache disabled: no data corruption
 * Win32, with fsync, write-cache enabled: no data corruption
 * Win32, with osync, write cache disabled: no data corruption
 * Win32, with osync, write cache enabled: no data corruption. Once I
 got:
 2005-02-24 12:19:54 LOG:  could not open file C:/Program
 Files/PostgreSQL/8.0/data/pg_xlog/00010010 (log file 0,
 segment 16): No such file or directory
   but the data in the database was consistent.

 It disturbs me that you couldn't produce data corruption in the cases
 where it theoretically should occur.  Seems like this is an indication
 that your test was insufficiently severe, or that there is something
 going on we don't understand.

I was thinking about that. A few years back, Microsoft had some serious
issues with write caching drives. They were taken to task for losing data
if Windows shut down too fast, especially on drives with a large cache.

MS is big enough and bad enough to get all the info they need from the
various drive makers to know how to handle write cache flushing. Even the
stuff that isn't documented.

If anyone has a very good debugger and/or emulator or even a logic
analyzer, it would be interesting to see if MS sends commands to the
drives after a disk write or a set of disk writes.

Also, I would like to see this test performed on NTFS and FAT32, and see
if you are more likely to lose data on FAT32.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Data loss, vacuum, transaction wrap-around

2005-02-20 Thread pgsql
 On Sat, Feb 19, 2005 at 18:04:42 -0500,

 Now, lets imagine PostgreSQL is being developed by a large company. QA
 announces it has found a bug that will cause all the users data to
 disappear if they don't run a maintenence program correctly. Vacuuming
 one
 or two tables is not enough, you have to vacuum all tables in all
 databases.

 Except that Postgres isn't a large company and doing the work of
 back patching and testing old versions will be done instead of
 more important work.

PostgreSQL is an open source project that plays with the big guys. Look at
the Linux kernel. Imagine their file system guys thinking this way. Linux
would still be Linus' hobby.


 This bug would get marked as a critical error and a full scale effort
 would be made to contact previous users to upgrade or check their
 procedures.

 I don't think all commercial companies would do that. I doubt that even
 most of them would.

Database companies? You bet they would.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Data loss, vacuum, transaction wrap-around

2005-02-20 Thread pgsql
 On Sun, 20 Feb 2005 [EMAIL PROTECTED] wrote:

  On Sat, Feb 19, 2005 at 18:04:42 -0500,
 
  Now, lets imagine PostgreSQL is being developed by a large company.
 QA
  announces it has found a bug that will cause all the users data to
  disappear if they don't run a maintenence program correctly.
 Vacuuming
  one
  or two tables is not enough, you have to vacuum all tables in all
  databases.
 
  Except that Postgres isn't a large company and doing the work of
  back patching and testing old versions will be done instead of
  more important work.

 PostgreSQL is an open source project that plays with the big guys. Look
 at
 the Linux kernel. Imagine their file system guys thinking this way.
 Linux
 would still be Linus' hobby.

 So, you are certain that every Linux file system bug has been patched all
 the way back to say kernel version 1.0 then?  Do you have any evidence of
 this claim?

No one is suggesting going back to version 1.0, but critical data-loss bugs
that are present and relevant in prior versions still in use do get fixed.


  This bug would get marked as a critical error and a full scale effort
  would be made to contact previous users to upgrade or check their
  procedures.
 
  I don't think all commercial companies would do that. I doubt that
 even
  most of them would.

 Database companies? You bet they would.

 Do you have any evidence or are you merely spouting an opinion as fact?

With Oracle and DB2, yes I have some personal experience.

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-20 Thread pgsql
 Jim C. Nasby wrote:
 On Mon, Feb 14, 2005 at 09:55:38AM -0800, Ron Mayer wrote:

  I still suspect that the correct way to do it would not be
  to use the single correlation, but 2 stats - one for estimating
  how sequential/random accesses would be; and one for estimating
  the number of pages that would be hit.  I think the existing
  correlation does well for the first estimate; but for many data
  sets, poorly for the second type.

 Should this be made a TODO? Is there some way we can estimate how much
 this would help without actually building it?

 I guess I am confused how we would actually do that or if it is
 possible.

Bruce, we didn't get much time to talk at LinuxWorld (it was nice to meet
you).

I'm not sure how you would go about it, but in my post "One big trend ..."
(I don't even remember the exact title anymore), I talk about tables that are
physically sorted on multiple keys, the addresses:

streetname, typename, state, zip.

maple, st, ma, 02186
maple, st, ma, 02186
maple, rd, ma, 02186
maple, ave, ma, 02186
maple, st, me, ??

Assuming the table is physically sorted by state, town (zip), streetname,
streettype, zip.

If one were to type:

select * from locations where streetname = 'maple';

the analysis of that query improperly minimizes the correlation of the
street address columns for the table as a whole.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Data loss, vacuum, transaction wrap-around

2005-02-19 Thread pgsql
 [EMAIL PROTECTED] writes:
 I think there should be a 100% no data loss fail safe.

OK, maybe I was overly broad in my statement, but I assumed a context that
I guess you missed. Don't you think that in normal operations, i.e. with
no hardware or OS failure, we should see any data loss as unacceptable?

If a bug causes data loss, it is a big deal, right?

 Possibly we need to recalibrate our expectations here.  The current
 situation is that PostgreSQL will not lose data if:

   1. Your disk drive doesn't screw up (eg, lie about write complete,
  or just plain die on you).
   2. Your kernel and filesystem don't screw up.
   3. You follow the instructions about routine vacuuming.
   4. You don't hit any bugs that we don't know about.


See, here is where I strongly disagree. Items (1) and (2) are completely
out of our control and no one would blame PostgreSQL.

Item (4) is an issue with all software; every now and then people hit bugs,
and the bug is reported and assumed to get fixed.

Item (3) is just nasty: "RTFM or else, sucka!" I think it is a very
user-hostile stance.


 I agree that it's a nice idea to be able to eliminate assumption #3 from
 our list of gotchas, but the big picture is that it's hard to believe
 that doing this will make for a quantum jump in the overall level of
 reliability.  I think I listed the risks in roughly the right order of
 severity ...

Sometimes the edge conditions of a problem are not so obscure. I think (3)
is a huge issue. Imagine I'm in this meeting:

DBA: We can't use PostgreSQL; if we forget to do normal maintenance we'll
lose all our data.

ME: Well, there is an amount of truth in that, but we just won't forget.

DBA: Sorry, I don't trust it.

CTO: Mark, I think Joe has some serious issues that need to be resolved
before we can move on this.

Boom!! Lost.


 I'm willing to fix this for 8.1 (and am already in process of drafting a
 patch), especially since it ties into some other known problems such as
 the pg_pwd/pg_group files not being properly reconstructed after PITR
 recovery.  But I think that a Chinese fire drill is not called for,
 and backpatching a significant but poorly tested change falls into that
 category IMHO.

   regards, tom lane

 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Data loss, vacuum, transaction wrap-around

2005-02-19 Thread pgsql
 On Fri, 18 Feb 2005 22:35:31 -0500, Tom Lane [EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] writes:
  I think there should be a 100% no data loss fail safe.

 Possibly we need to recalibrate our expectations here.  The current
 situation is that PostgreSQL will not lose data if:

1. Your disk drive doesn't screw up (eg, lie about write
 complete,
   or just plain die on you).
2. Your kernel and filesystem don't screw up.
3. You follow the instructions about routine vacuuming.
4. You don't hit any bugs that we don't know about.

 I'm not an expert but a happy user. My opinion is:
 1)  there is nothing to do with #1 and #2.
 2)  #4 is not a big problem because of the velocity developers fix
 those when a bug is found.

 3) All databases has some type of maintenance routine, in informix for
 example we have (update statistics, and there are others for oracle)
 of course they are for performance reasons, but vacuum is too for that
 and additionally give us the XID wraparound.
 So, to have a maintenance routine in PostgreSQL is not bad. *Bad* is
 to have a DBA(1) with no clue about the tool he is using. Tools that do
 too much are an incentive to hire *no clue* people.

 (1) DBA: DataBase Administrator or DataBase Aniquilator???

PostgreSQL is such an awesome project. The only thing it seems to suffer
from is a disregard for its users.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Data loss, vacuum, transaction wrap-around

2005-02-19 Thread pgsql
 On Sat, Feb 19, 2005 at 13:35:25 -0500,
   [EMAIL PROTECTED] wrote:

 The catastrophic failure of the database because a maintenance function
 is
 not performed is a problem with the software, not with the people using
 it.

 There doesn't seem to be disagreement that something should be done going
 forward.

 The disagreement sems to be what effort should be made in back porting
 fixes to previous versions.


Now, let's imagine PostgreSQL is being developed by a large company. QA
announces it has found a bug that will cause all the users' data to
disappear if they don't run a maintenance program correctly. Vacuuming one
or two tables is not enough; you have to vacuum all tables in all
databases.

This bug would get marked as a critical error, and a full-scale effort
would be made to contact previous users to upgrade or check their
procedures.



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Data loss, vacuum, transaction wrap-around

2005-02-19 Thread pgsql
 [ Shrugs ] and looks at other database systems ...

 CA has put Ingres into Open Source last year.

 Very reliable system with a replicator worth looking at.

 Just a thought.

The discussion on hackers is about how to make PostgreSQL better. There are
many different perspectives; differences are argued and consensus reached, and
a better PostgreSQL emerges.

Going to another database would be counterproductive to the process.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Data loss, vacuum, transaction wrap-around

2005-02-18 Thread pgsql
I want to see if there is a consensus of opinion out there.

We've all known that data loss could happen if vacuum is not run and you
perform more than 2B transactions. These days, with faster and bigger
computers and disks, it is more likely that this problem can be hit in
months, not years.

To me, the WORST thing a program can do is lose data. (Certainly this is
bad for a database.) I don't think there is any real excuse for this.
While the 2B transaction problem was always there, it seemed so remote
that I never obsessed about it. Now that it seems like a real problem that
more than one user has hit, I am worried.

In fact, I think it is so bad that I think we need to back-port a fix to
previous versions and issue a notice of some kind.

Here as my suggestions:

(1) As Tom has already said, at some point start issuing warning in the
log that vacuum needs to be run.

(2) At some point, stop accepting transactions on anything but template1,
issuing an error saying the vacuum needs to be run.

(3) Either with psql on template1 or postgres or some "vacuumall"
program, open the database in single-user mode or on template1 and vacuum
the database.

(4) This should remain even after autovacuum is in place. If for some
reason autovacuum is installed but not running, we still need to protect
the data from a stupid admin. (Last time I looked, autovacuum used
various stats, and that may be something an admin disables.)

(5) Vacuum could check for a wrap-around condition in the database cluster
and take it upon itself to run more broadly, even if it was directed only
at a single table.

We've been saying that MySQL is OK if you don't care about your data; I
would hate it if people started using this issue against PostgreSQL.



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Data loss, vacuum, transaction wrap-around

2005-02-18 Thread pgsql
More suggestions:

(1) At startup, the postmaster checks the XID; if it is close to a problem,
force a vacuum.

(2) At SIGTERM shutdown, can the postmaster start a vacuum?

(3) When the XID count goes past the trip wire, can it spontaneously
issue a vacuum?


NOTE:
Suggestions 1 and 2 are for 8.0 and prior. 3 is for later than 8.0.1


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Data loss, vacuum, transaction wrap-around

2005-02-18 Thread pgsql
 On Sat, 19 Feb 2005 04:10 am, Tom Lane wrote:
 [EMAIL PROTECTED] writes:
  In fact, I think it is so bad, that I think we need to back-port a fix
 to
  previous versions and issue a notice of some kind.

 They already do issue notices --- see VACUUM.

 A real fix (eg the forcible stop we were talking about earlier) will not
 be reasonable to back-port.

 Not to be rude, but if backporting is not an option, why do we not just
 focus on the job of getting autovacuum into 8.1, and not have to think
 about how a patch that will warn users will work?

Unless I'm mistaken, even autovacuum may not be enough. AFAIK,
autovacuum depends on the statistics daemon, and some admins may turn that
off for performance. Even so, how unlikely is it that autovacuum doesn't
run?

I think there should be a 100% no-data-loss fail-safe. Anything less is a
cop-out. I can't see one successful argument that starts with data loss
and ends with maintenance.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Help me recovering data

2005-02-17 Thread pgsql
 Gaetano Mendola [EMAIL PROTECTED] writes:

 We do ~4000 txn/minute, so in 6 months you are screwed up...

 Sure, but if you ran without vacuuming for 6 months, wouldn't you notice
 the
 huge slowdowns from all those dead tuples before that?


I would think that only applies to databases where UPDATE and DELETE are
done often. What about databases that are 99.999% inserts? A DBA lightly
going over the docs may not even know that vacuum needs to be run.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Help me recovering data

2005-02-16 Thread pgsql
 The checkpointer is entirely incapable of either detecting the problem
 (it doesn't have enough infrastructure to examine pg_database in a
 reasonable way) or preventing backends from doing anything if it did
 know there was a problem.

 Well, I guess I meant 'some regularly running process'...

I think people'd rather their db just stopped accepting new transactions
rather than just losing data...

 Not being able to issue new transactions *is* data loss --- how are you
 going to get the system out of that state?

 Not allowing any transactions except a vacuum...

 autovacuum is the correct long-term solution to this, not some kind of
 automatic hara-kiri.

 Yeah, seems like it should really happen soon...

 Chris

Maybe I'm missing something, but shouldn't the prospect of data loss (even
in the presence of admin ignorance) be unacceptable? Certainly within the
realm of normal PostgreSQL operation.



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Help me recovering data

2005-02-16 Thread pgsql
 [EMAIL PROTECTED] writes:
 Maybe I'm missing something, but shouldn't the prospect of data loss
 (even
 in the presense of admin ignorance) be something that should be
 unacceptable? Certainly within the realm normal PostgreSQL operation.

 [ shrug... ]  The DBA will always be able to find a way to shoot himself
 in the foot.  We've seen several instances of people blowing away
 pg_xlog and pg_clog, for example, because they don't need log files.
 Or how about failing to keep adequate backups?  That's a sure way for an
 ignorant admin to lose data too.

There is a difference between actively doing something stupid and failing
to realize a maintenance task is required.

PostgreSQL should stop working. When the admin tries to understand why,
they can read a troubleshooting FAQ and say "oops, I gotta run this vacuum
thingy." That is a whole lot better than falling off a cliff you didn't
even know was there.


 Once autovacuum gets to the point where it's used by default, this
 particular failure mode should be a thing of the past, but in the
 meantime I'm not going to panic about it.

I don't know how to say this without sounding like a jerk (I guess that's
my role sometimes), but would you go back and re-read this sentence?

To paraphrase: I know this causes catastrophic data loss, and we have
plans to fix it in the future, but for now, I'm not going to panic about it.

What would you do if the FreeBSD group or Linux kernel group said this
about a file system? If you failed to run fsck after 100 mounts, you lose
your data?

I thought PostgreSQL was about protecting your data. How many times have
we smugly said, "yeah, you can use MySQL if you don't care about your
data"? Any data loss caused by PostgreSQL should be seen as unacceptable.
It's funny: I've known about this for a while, and it has always seemed a
sort of distant edge condition that is easily avoided. However, with
today's faster machines and disks, it is easier to reach this limitation
than ever before. All PostgreSQL needs is one or two VERY UPSET mainstream
users who lose data to completely reverse the momentum that it is gaining.

No amount of engineering discussion about it not being the fault of
PostgreSQL will win them back, and rightfully so, IMHO.

Sorry.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Help me recovering data

2005-02-16 Thread pgsql
 On Wed, 16 Feb 2005 [EMAIL PROTECTED] wrote:

 
  Once autovacuum gets to the point where it's used by default, this
  particular failure mode should be a thing of the past, but in the
  meantime I'm not going to panic about it.

 I don't know how to say this without sounding like a jerk, (I guess
 that's
 my role sometimes) but would you go back and re-read this sentence?

 To paraphrase: I know this causes a catestrophic data loss, and we have
 plans to fix it in the future, but for now, I'm not going panic about
 it.

 Do you have a useful suggestion about how to fix it?  Stop working is
 handwaving and merely basically saying, one of you people should do
 something about this is not a solution to the problem, it's not even an
 approach towards a solution to the problem.

Actually, it is not a solution to the problem of losing data. It is a
drop-dead, last-ditch fail-safe that EVERY PRODUCT should have before losing
data.





---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Help me recovering data

2005-02-16 Thread pgsql

 On Wed, 16 Feb 2005, Joshua D. Drake wrote:


 Do you have a useful suggestion about how to fix it?  Stop working is
 handwaving and merely basically saying, one of you people should do
 something about this is not a solution to the problem, it's not even
 an
 approach towards a solution to the problem.
 
 
 I believe that the ability for PostgreSQL to stop accepting
 queries and to log to the file or STDERR why it stopped working
 and how to resolve it is appropriate.

 Right, but since the how to resolve it currently involves executing a
 query, simply stopping dead won't allow you to resolve it. Also, if we
 stop at the exact wraparound point, can we run into problems actually
 trying to do the vacuum if that's still the resolution technique?  If so,
 how far in advance of wraparound must we stop to guarantee it will
 succeed? It's not rocket science, but figuring such things out is part of

I would say, have a GUC parameter set at 1000 transactions. When fewer
than this number are available, the postmaster will not run and will issue
a message:

"Transaction wrap-around error! You must run vacuum in single-user
postgres mode to correct it; to avoid this message, run the vacuum command
more frequently."

Hell, why not block all the PostgreSQL processes and run vacuum? But, for
now, versions of PostgreSQL should stop before losing data.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Help me recovering data

2005-02-16 Thread pgsql
 Stephan Szabo [EMAIL PROTECTED] writes:
 Right, but since the how to resolve it currently involves executing a
 query, simply stopping dead won't allow you to resolve it. Also, if we
 stop at the exact wraparound point, can we run into problems actually
 trying to do the vacuum if that's still the resolution technique?

 We'd have to do something with a fair amount of slop.  The idea I was
 toying with just now involved a forcible shutdown once we get within
 say 100,000 transactions of a wrap failure; but apply this check only
 when in interactive operation.  This would allow the DBA to perform
 the needed VACUUMing manually in a standalone backend.

 The real question here is exactly how large a cluestick do you want to
 hit the DBA with.  I don't think we can guarantee no data loss with
 anything less than forced shutdown, but that's not so much a cluestick
 as a clue howitzer.

I think a DBA or accidental DBA would prefer stating in a meeting:

"Yeah, the database shut down because I didn't perform normal maintenance;
it's fixed now and we have a script in place so it won't happen again"

over

"Yeah, the database lost all its data and we have to restore from our last
backup because I didn't perform normal maintenance."

One gets a "boy, are you lucky"; the other gets a "you're fired".


 Maybe

 (a) within 200,000 transactions of wrap, every transaction start
 delivers a WARNING message;

 (b) within 100,000 transactions, forced shutdown as above.

I agree.
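
A rough sketch of the two-threshold guard outlined above; the names, the
margins and the way "transactions remaining" is computed are all hypothetical
stand-ins for whatever the real check would use.

  #include <stdint.h>
  #include <stdio.h>
  #include <stdlib.h>

  #define WARN_MARGIN      200000u    /* start complaining this far from wraparound */
  #define SHUTDOWN_MARGIN  100000u    /* refuse interactive work this far from it  */

  /* xids_left: transaction IDs remaining before wraparound would overtake
   * the oldest unfrozen XID (computed elsewhere). */
  static void check_wraparound_guard(uint32_t xids_left)
  {
      if (xids_left <= SHUTDOWN_MARGIN)
      {
          fprintf(stderr, "FATAL: transaction ID wraparound imminent; "
                          "run VACUUM in a standalone backend\n");
          exit(1);                    /* the forced shutdown, interactive sessions only */
      }
      if (xids_left <= WARN_MARGIN)
          fprintf(stderr, "WARNING: only %u transactions left before wraparound; "
                          "VACUUM the whole cluster soon\n", xids_left);
  }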

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Help me recovering data

2005-02-16 Thread pgsql

 On Wed, 16 Feb 2005 [EMAIL PROTECTED] wrote:

  On Wed, 16 Feb 2005 [EMAIL PROTECTED] wrote:
 
  
   Once autovacuum gets to the point where it's used by default, this
   particular failure mode should be a thing of the past, but in the
   meantime I'm not going to panic about it.
 
  I don't know how to say this without sounding like a jerk, (I guess
  that's
  my role sometimes) but would you go back and re-read this sentence?
 
  To paraphrase: I know this causes a catestrophic data loss, and we
 have
  plans to fix it in the future, but for now, I'm not going panic about
  it.
 
  Do you have a useful suggestion about how to fix it?  Stop working
 is
  handwaving and merely basically saying, one of you people should do
  something about this is not a solution to the problem, it's not even
 an
  approach towards a solution to the problem.

 Actually, it is not a solution to the problem of losing data. It is a
 drop
 dead last ditch failsafe that EVERY PRODUCT should have before losing
 data.

 Let's try again. Saying, one of you people should do something about
 this is not a solution to the problem or an approach thereto.  Stop
 working is handwaving since I see no approach therein that allows the
 user to actually recover the data.



Well, it is sort of like the hockey strike: now that it seems like stopping
normal operation is better than losing billions of rows of data, we can
decide how to do it and how to correct it.



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] I will be on Boston

2005-02-15 Thread pgsql
I will be at the BLU booth Tuesday.

Any and all, drop by.


 I will be on Boston for Linuxworld from Tuesday through Thursday.  I
 will read email only occasionally.

 --
   Bruce Momjian|  http://candle.pha.pa.us
   pgman@candle.pha.pa.us   |  (610) 359-1001
   +  If your life is a hard drive, |  13 Roberts Road
   +  Christ can be your backup.|  Newtown Square, Pennsylvania
 19073

 ---(end of broadcast)---
 TIP 4: Don't 'kill -9' the postmaster



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[HACKERS] PostgreSQL at Linux World

2005-02-15 Thread pgsql
I was at LinuxWorld Tuesday; it was pretty good. I was in the org
pavilion, where the real Linux resides. The corporate people were on the
other side of the room. (There was a divider where the rest rooms and
elevators were.)

I say that this was where the real Linux resides because all the real
brains behind Linux were there: X.org, Debian, FSF, KDE, GNOME, Gentoo,
and so on. Bruce was sort of in the middle of the room with his company.
Our booth was off in the corner. (A cold, drafty corner.)

Anyway, I noticed Pervasive Software selling PostgreSQL support in the
corporate end of the room. Bless them, they were trying to sell me on a
GUI front end for PostgreSQL and asked, "What tool do you use to administer
PostgreSQL?" My answer? "psql." The crestfallen salesman knew that I was
not interested in GUI front ends.

It was kind of funny, sad in a way, but funny nevertheless.

Anyway, it was good to see PostgreSQL out in the corporate end of the
building; I even got a couple of very cute rubber elephants.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-14 Thread pgsql
 Probably off-topic, but I think it's worth to see what astronomers are
 doing with their very big spatial databases. For example, we are working
 with more than 500,000,000 rows catalog and we use some special
 transformation
 of coordinates to integer numbers with preserving objects closeness.
 I hope we could show postgresql is good enough to be used in astronomy
 for very big catalogs. Currently, MS SQL is in use.
 See http://www.sdss.jhu.edu/htm/ for details. We use another technique.

You know, I don't think a lot of people get the issues I was describing,
or maybe they don't believe it, I don't know. But I think that it would
be a useful contrib project to create an analyze_special('table',
'column', 'method') function that does a better job of calculating the
stats for tables that contain multiple trend waveforms. A separate function
will probably work well, as the trends within the data probably only apply
to specific rows.

It's interesting, because I don't think it needs to calculate a perfect
representation of the data so much as give a better clue to its nature to
the optimizer.
When I get the time (or can get someone to pay me to do it) I'm going to
try it.
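
Just to make the idea concrete, this is roughly how I picture it being
called; analyze_special is hypothetical, nothing like it exists today:

select analyze_special('rt1', 'zipl', 'sliding_window');  -- hypothetical
-- then re-check what the planner picks for a selective zip query
explain select * from rt1 where zipl = 36101;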



   Oleg
 On Wed, 9 Feb 2005 [EMAIL PROTECTED] wrote:

 I wrote a message caled One Big trend vs multiple smaller trends in
 table
 statistics that, I think, explains what we've been seeing.


 [EMAIL PROTECTED] wrote:

 In this case, the behavior observed could be changed by altering the
 sample size for a table. I submit that an arbitrary fixed sample size
 is
 not a good base for the analyzer, but that the sample size should be
 based
 on the size of the table or some calculation of its deviation.


Mark,

 Do you have any evidence that the Sample Size had anything to do
 with the performance problem you're seeing?

 Sample size is only a bandaid for the issue, however, more samples
 always
 provide more information.



 I also do a lot with the complete Census/TIGER database.

 Every problem I have with the optimizer comes down to the
 fact that the data is loaded (and ordered on disk) by
 State/County FIPS codes, and then queried by zip-code
 or by city name.  Like this:

  Alabama36101 [hundreds of pages with zip's in 36***]
  Alaska 99686 [hundreds of pages with zip's in 9]
  Arizona85701 [hundreds of pages with zip's in 855**]

 Note that the zip codes are *NOT* sequential.

 Again, read One Big Trend... and let me know what you think. I think
 it
 describes exactly the problem that we see.

 For now, the solution that works for me is to seriously up the value of
 targrows in analyze.c. It makes it take longer, and while the stats
 are
 not correct because they are not designed to detect these sorts of
 patterns, a larger sample allows them to be less wrong enough to give
 a
 better hint to the planner.



 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings


   Regards,
   Oleg
 _
 Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
 Sternberg Astronomical Institute, Moscow University (Russia)
 Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
 phone: +007(095)939-16-83, +007(095)939-23-83

 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly



---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Help me recovering data

2005-02-14 Thread pgsql
It must be possible to create a tool based on the PostgreSQL sources that
can read all the tuples in a database and dump them to a file stream. All
the data remains in the file until overwritten with data after a vacuum.
It *should* be doable.

If the data in the table is worth anything, then it would be worth
extracting.

It would, of course, be a tool of last resort.



 Kouber Saparev [EMAIL PROTECTED] writes:
 After asking the guys in the [EMAIL PROTECTED] channel they
 told
 me that the reason is the Transaction ID wraparound, because I have
 never
 ran VACUUM on the whole database.

 So they proposed to ask here for help. I have stopped the server, but
 what
 could I do in order to save the data if it's possible at all?

 I think you're pretty well screwed as far as getting it *all* back goes,
 but you could use pg_resetxlog to back up the NextXID counter enough to
 make your tables and databases reappear (and thereby lose the effects of
 however many recent transactions you back up over).

 Once you've found a NextXID setting you like, I'd suggest an immediate
 pg_dumpall/initdb/reload to make sure you have a consistent set of data.
 Don't VACUUM, or indeed modify the DB at all, until you have gotten a
 satisfactory dump.

 Then put in a cron job to do periodic vacuuming ;-)

   regards, tom lane

 ---(end of broadcast)---
 TIP 6: Have you searched our list archives?

http://archives.postgresql.org



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Help me recovering data

2005-02-14 Thread pgsql
 I think you're pretty well screwed as far as getting it *all* back goes,
 but you could use pg_resetxlog to back up the NextXID counter enough to
 make your tables and databases reappear (and thereby lose the effects of
 however many recent transactions you back up over).

 Once you've found a NextXID setting you like, I'd suggest an immediate
 pg_dumpall/initdb/reload to make sure you have a consistent set of data.
 Don't VACUUM, or indeed modify the DB at all, until you have gotten a
 satisfactory dump.

 Then put in a cron job to do periodic vacuuming ;-)

 This might seem like a stupid question, but since this is a massive data
 loss potential in PostgreSQL, what's so hard about having the
 checkpointer or something check the transaction counter when it runs and
   either issue a db-wide vacuum if it's about to wrap, or simply
 disallow any new transactions?

 I think people'd rather their db just stopped accepting new transactions
 rather than just losing data...


I would certainly prefer that the system issue an error and stop working
rather than suffer complete data loss.
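
For what it's worth, you can keep an eye on how close each database is
getting by hand today; this just reads the wraparound bookkeeping that
database-wide vacuums maintain in pg_database:

select datname, age(datfrozenxid) as xid_age
  from pg_database
 order by xid_age desc;   -- biggest number = closest to trouble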


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] enforcing a plan (in brief)

2005-02-14 Thread pgsql
 On Thu, 2005-02-10 at 14:37 -0500, Bruce Momjian wrote:
 No, we feel that is of limited value.  If the optimizer isn't doing
 things properly, we will fix it.

 I agree that improving the optimizer is the right answer for normal
 usage, so I can't get excited about query-level plan hints, but I can
 see the capability to instruct the planner being useful in an academic
 context.


I think that is sort of arrogant. Look at Oracle: you can give the planner
hints in the form of comments.

Constructing a planner that will always do the best job is like creating
a program that can predict the weather. There are too many subtle
variations in datasets that are impossible to really evaluate. I posted a
message last week called "One Big trend vs multiple smaller trends";
read it and you'll see what I mean.

Yea, on a simple data organization you could make a great planner, but
someone who has studied the nature of their data will almost always toss
their hands up in frustration because the planner isn't working right.

I have had multiple issues with the inability to guide the planner on its
decisions. I'll give a couple examples:

Take a music database where the artist name can be "Various Artists":
given any normal database of music recordings, "Various Artists" will be
*THE* most popular artist, usually close to half the data. Most of the
time I've had to turn off sequential scans for these queries (I filter out
various artists). Being able to say:

select * from cdtitles where artist = 'foo' /* index scan
ctitles_artist_ndx  */ ;

would be helpful, as disabling sequential scan isn't always the right
thing either.
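
For the record, the closest thing available today is flipping the planner
GUCs around the one query, which works but is clumsy:

begin;
set local enable_seqscan = off;   -- only affects this transaction
select * from cdtitles where artist = 'foo';
commit;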

The whole "Query optimizer 8.0.1 (and 8.0)" series of posts shows a
different problem.

It all comes down to the fact that the planner *cannot* be perfect, and
thus will always be lacking in some respect. This is because you cannot
anticipate every physical data storage pattern; therefore the analyzer
will not correctly characterize them, and the planner will not create an
optimal plan.

Allowing the user to suggest alternate query strategies is a good idea.





---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-14 Thread pgsql
Might it be possible to contact IBM directly and ask if they will allow
PostgreSQL to use the patent? They've released 500 patents for open
source; maybe they'll grant a waiver for this as well.

There is an advantage beyond just not having to rewrite the code: it
would also be sort of an IBM blessing, great PR.

I will be at Linux World and see if there is an IBM booth; maybe I can
get some contact info.


 FYI, core has discussed the pending IBM ARC patent and the usage of
 those ideas in 8.0.

 Tom has found a 2Q cache algorithm that predates the ARC patent and is
 very similar to ARC.  The major difference is that it doesn't auto-size
 the ARC sub-buffers.

 Core believes it is best to backpatch this 2Q algorithm into 8.0.X to
 avoid any possible patent problems if the patent is granted and
 enforced.

 We are testing the use of the 2Q code to see if it has any performance
 impact.  The 8.0.X release that uses 2Q will have more extensive testing
 than a normal minor release.  8.1 will have a new cache algorithm that
 hopefully will remove the buffer contention problems experienced by SMP
 machines.

 For development, this means we will _not_ have a shortened, non-initdb
 8.1 release but a regular release cycle with the typical big batch of
 features.

 --
   Bruce Momjian|  http://candle.pha.pa.us
   pgman@candle.pha.pa.us   |  (610) 359-1001
   +  If your life is a hard drive, |  13 Roberts Road
   +  Christ can be your backup.|  Newtown Square, Pennsylvania
 19073

 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] 8.0.X and the ARC patent

2005-02-14 Thread pgsql
 [EMAIL PROTECTED] wrote:
 Might it be possible to contact IBM directly and ask if they will allow
 usage of the patent for PostgreSQL. They've let 500 patents for open
 source, maybe they'll give a write off for this as well.

 There is an advantage beyond just not having to re-write the code, but
 it
 would also be sort of an IBM blessing, great PR.

 I will be at Linux World and see if there is an IBM booth, maybe I can
 get  some contact info.

 I doubt they will give us something that extends to companies that sell
 PostgreSQL so I don't see the point.

Actually, I think that's wrong. IBM has been really gung-ho about Linux.
Of course this is an obvious move against Microsoft's domination of the
server market, but they have made some very strong open source statements
and have released about 500 patents to open source projects.

The current open source patents extend to companies that sell other
products, so why not PostgreSQL as well?

There is a *LOT* of crap going on with patents; there are so many issues
and motives that it is hard to keep track of who is doing what and why. My
bet is that it is 50/50. It all depends on whether IBM wants to hurt
Oracle more than it wants to defend DB2.




---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] enforcing a plan (in brief)

2005-02-14 Thread pgsql
 [EMAIL PROTECTED] writes:
 I think that is sort of arrogant. Look at Oracle, you can give the
 planner
 hints in the form of comments.

 Arrogant or not, that's the general view of the people who work on the
 planner.

 The real issue is not so much whether the planner will always get things
 right --- it won't, and no one imagines that it will ever be perfect.
 The real issue is that we have limited manpower, and designing and
 implementing a useful hint facility is a nontrivial project.  (Not to
 mention that maintaining such a thing in the face of frequent,
 fundamental changes to the underlying planner and executor capabilities
 would be an outright nightmare.)

 The people who are actually doing the work think their time is more
 usefully spent on improving the planner's intelligence than on devising
 ways to override it.


I know I come on strong, and I know I'm probably irritating in many ways,
but I have been a PostgreSQL user since just after it was renamed
PostgreSQL from Postgres95. I've seen a lot of changes, and almost all of
them have been quite good.

I have over 10 years of using it on various projects. While I have not
been lucky enough to get a gig in which I could contribute more, I do try
to contribute, and sometimes it is quite difficult.

The one thing that I think you guys miss is actually using PostgreSQL on
projects where the company and the deliverables don't give a rat's
ass about whether you use PostgreSQL or Oracle or something else. Over the
years I have beaten my head against the wall suggesting features, most of
which eventually have come to PostgreSQL, but every one was a fight.

I have some well-founded opinions about PostgreSQL, hard earned from real
world situations. Agree or not, I have experience with this database and I
have hit many of its shortcomings.

One consistent problem is the planner not being able to handle this or
that scenario. At this stage, the *best* way to improve the planner is to
add the ability to place hints in the plan. It *is* good enough for 90% of
the types of queries you would ever want to do. I am dubious that you can
get it demonstrably better in the last 10% or so without making it worse.

Simple hints would go a HUGE way toward improving that last 10%. Many of
the "Why doesn't PostgreSQL use my index?" questions would go away. Most
of the time Tom spends looking at people's pg_stats info would drop. It
would actually save time.
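
(This is the sort of pg_stats lookup I mean; the columns are from the real
view, the table name is just my example from earlier in the thread:)

select attname, n_distinct, correlation
  from pg_stats
 where tablename = 'cdtitles';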

As a PostgreSQL user, I can tell you with 100% confidence, if I had this
tool, I could do my job easier. I can also tell you that while I have
genuine appreciation for the current quality of the planner, I still would
like to be able to tailor queries specifically to test various approaches
for performance reasons.

---(end of broadcast)---
TIP 8: explain analyze is your friend


[HACKERS] New form of index persistent reference

2005-02-10 Thread pgsql
For about 5 years now, I have been using a text search engine that I wrote
and maintain.

In the beginning, I hacked up function mechanisms to return multiple value
sets and columns. Then PostgreSQL added setof and that was cool. Then it
was able to return a set of rows, which was even better.

Lately, I have been thinking that a cool form of index would be some sort
of "persistent reference" index. Like in the old ISAM days of yore, a
fixed number could point you right to the row that you want. I'm not sure
if the persistent reference is a specific auto-numbering column type or a
separate index structure, or both.

I asked the question "how do you get a record without going through an
index?" The answer was CTID, which unfortunately changes when the row is
updated.

Now, what I want to brainstorm is some sort of "persistent reference"
where the value is not algorithmically stored, maybe just an offset into a
table. The number of operations should be about 1 per lookup.

Imagine a dynamically growing array that has one slot per row. Every row
is considered unique. When rows are updated, their CTID is updated in the
reference (with vacuum?).

Imagine something like this:

create table foobar(id reference, name varchar, value varchar);

select * from foobar where id = 100;

The reference type has an implicit index that is basically a lookup table.
On unique references where the reference value is fairly arbitrary, this
would be a HUGE gain for direct lookups. There is no need for the log(N)
descent of a tree.

On the surface level, this would be a huge win for websites that use
semi-fixed tables of data.
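
For comparison, this is the closest thing available today: a surrogate key
with a btree index, which still costs a log(N) descent per lookup instead
of the single array fetch I am describing.

create table foobar (
    id    serial primary key,   -- btree-backed surrogate key
    name  varchar,
    value varchar
);

select * from foobar where id = 100;   -- index scan, not O(1)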



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] New form of index persistent reference

2005-02-10 Thread pgsql
 If that ID is the only thing you use to access that data, why not just
 store
 it in a flat file with fixed-length records? seek() (or your language's
 equivalent) is usually fast.

As a matter of policy, I would never manage data outside of the database.


 If you need to drive that from within PostgreSQL, you would need an
 untrusted language to read the file, but you could also generate it from a
 table using a trigger.

Very ugly.


 Or maybe use a serial column, an index on that column, and cluster the
 table
 on that index. It's more than one lookup, but not much with a Btree index.
 (Not sure if this is better than just using a serial and an index.
 http://www.postgresql.org/docs/8.0/interactive/sql-cluster.html says it
 isn't, if I read it correctly.)

Clustering is OK, but it doesn't handle updates and additions until you
recluster the data.

If a static reference is all that is needed, then merely using CTID would
suffice. I was thinking a little overhead for a reference table would
allow it to hook into PostgreSQL and keep it up to date.




 Then anytime there is a batch of updates to the table, re-cluster it.

Yea, like I said, there are easier ways of doing that with fairly static
data.



 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
 Sent: Thursday, February 10, 2005 11:22 AM
 To: pgsql-hackers@postgresql.org
 Subject: [HACKERS] New form of index persistent reference


 For about 5 years now, I have been using a text search engine
 that I wrote
 and maintain.

 In the beginning, I hacked up function mechanisms to return
 multiple value
 sets and columns. Then PostgreSQL aded setof and it is was
 cool. Then it
 was able to return a set of rows, which was even better.

 Lately, I have been thinking that a cool form of index would
 be some sort
 of persistent reference index. Like the old ISAM days of
 yore, a fixed
 number could point you right to the row that you want. I'm
 not sure if the
 persistent reference is a specific auto numbering column type or
 separate index structure or both.

 I asked the question how do you get a record without going through an
 index, the answer was CTID, which unfortunately changes when
 the row is
 updated.

 Now, what I want to brainstorm is some sort of persistent reference
 where the value is not algorithmically stored, maybe just an
 offset into a
 table. The number of operations should be about 1 per lookup.

 Imagine a dynamically growing array that has one slot per
 row. Every row
 is considered unique. Rows which are updated, their CTID is
 updated in the
 reference. (with vacuum?)

 Imagine something like this:

 create table foobar(id reference, name varchar, value varchar);

 select * from foobar where id = 100;

 The reference type has an implicit index that is basically a
 lookup table.
 On unique references where the reference value is fairly
 arbitrary, this
 would be a HUGE gain for direct lookups. There is no need for
 the NlogN of
 a tree.

 On the surface level, this would be a huge win for websites that use
 semi-fixed tables of data.



 ---(end of
 broadcast)---
 TIP 1: subscribe and unsubscribe commands go to
 [EMAIL PROTECTED]




---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-09 Thread pgsql
I wrote a message called "One Big trend vs multiple smaller trends in
table statistics" that, I think, explains what we've been seeing.


 [EMAIL PROTECTED] wrote:

 In this case, the behavior observed could be changed by altering the
 sample size for a table. I submit that an arbitrary fixed sample size is
 not a good base for the analyzer, but that the sample size should be
 based
 on the size of the table or some calculation of its deviation.


Mark,

 Do you have any evidence that the Sample Size had anything to do
 with the performance problem you're seeing?

Sample size is only a band-aid for the issue; however, more samples always
provide more information.



 I also do a lot with the complete Census/TIGER database.

 Every problem I have with the optimizer comes down to the
 fact that the data is loaded (and ordered on disk) by
 State/County FIPS codes, and then queried by zip-code
 or by city name.  Like this:

  Alabama36101 [hundreds of pages with zip's in 36***]
  Alaska 99686 [hundreds of pages with zip's in 9]
  Arizona85701 [hundreds of pages with zip's in 855**]

 Note that the zip codes are *NOT* sequential.

Again, read "One Big Trend..." and let me know what you think. I think it
describes exactly the problem that we see.

For now, the solution that works for me is to seriously up the value of
targrows in analyze.c. It makes ANALYZE take longer, and while the stats
are still not correct, because they are not designed to detect these sorts
of patterns, a larger sample lets them be wrong by a small enough margin
to give the planner a better hint.
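
If you would rather not patch analyze.c, the per-column statistics target
gets you part of the way there (I believe it is capped at 1000):

alter table rt1 alter column zipl set statistics 1000;
alter table rt1 alter column zipr set statistics 1000;
analyze rt1;   -- now samples more rows and keeps finer-grained stats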



---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-09 Thread pgsql
 Mark,

 Hey, I can give you a copy of RT1 which is fine, but it is 1.1G
 compressed. I'd have to mail you a DVD.

 Sure, cool.

[address info snipped]

I would be willing to send a couple DVDs (on a regular basis) to anyone
who is able to post this on a good mirror that anyone could get at.

I can send the US Census database, fully loaded and in compressed pg_dump
format. I also have the latest CD database from www.freedb.org.

I would love to keep these things current for PG development, but my
company's  server is on a plan that gets 1G free, and is billed after
that. Also, I am on a broadband line at my office, and uploading the data
would take days.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-09 Thread pgsql
 On Wed, Feb 09, 2005 at 07:30:16PM -0500, [EMAIL PROTECTED] wrote:
 I would love to keep these things current for PG development, but my
 company's  server is on a plan that gets 1G free, and is billed after
 that. Also, I am on a broadband line at my office, and uploading the
 data
 would take days.

 Maybe post it as a Torrent?

That still doesn't answer the question of the initial upload.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-08 Thread pgsql
 [EMAIL PROTECTED] wrote:

 In this case, the behavior observed could be changed by altering the
 sample size for a table. I submit that an arbitrary fixed sample size is
 not a good base for the analyzer, but that the sample size should be
 based
 on the size of the table or some calculation of its deviation.


Mark,

 Do you have any evidence that the Sample Size had anything to do
 with the performance problem you're seeing?

I have evidence; if you look through some of the messages in this thread,
you'll see how a sample size of 10000 provides enough data points to
create stats the planner can use.


 I also do a lot with the complete Census/TIGER database.

Cool, have any code for Mapserver?


 Every problem I have with the optimizer comes down to the
 fact that the data is loaded (and ordered on disk) by
 State/County FIPS codes, and then queried by zip-code
 or by city name.  Like this:

  Alabama36101 [hundreds of pages with zip's in 36***]
  Alaska 99686 [hundreds of pages with zip's in 9]
  Arizona85701 [hundreds of pages with zip's in 855**]

 Note that the zip codes are *NOT* sequential.

 The correlation statistic sees that the Zip codes are not
 sequential; so it makes the *HORRIBLE* assumption that they
 are scattered randomly across the disk.

It is my theory that this is because there are too few data points with
which to properly characterize the nature of the data.


 In reality, even though there's no total ordering of the
 zip codes; any given zip code only exists on a couple
 disk pages; so index scans would be the right choice.
I totally agree.


 But the single correlation parameter is not sufficient
 to let the optimizer known this.

 No matter how large a sample size you choose, ANALYZE
 will correctly see that Zip codes and State FIPS codes
 are non-correlated, and the optimizer will overestimate
 the # of pages an index scan will need.


I tried to create an analogy in another post, and TIGER is a perfect
example of the analogy.

Think of the difference between an oscilloscope and a spectrum analyzer.
The current sampling code works more like an oscilloscope: it assumes a
fairly normalized distribution of data. Given that, it works perfectly
fine.

When a scope is presented with an audio signal, it looks like gibberish,
showing almost no correlation. When you view it in the frequency domain,
as with a spectrum analyzer, you can see clear patterns in the signal.

Now, fortunately, we don't need any sort of absolute visualization of the
data in TIGER; we only need to see that the data has many subtle trends
rather than one fairly evenly distributed one. That's why more samples
work.

If we could do anything, I would add more statistics to the database: a
standard deviation and maybe a sliding-window deviation. The standard
deviation might be pretty high, whereas a sliding window would show less
localized deviation. Less localized deviation would favor index scans.

Anyway, like I said, I think the expectation that the data is fairly
normalized or evenly distributed works very well for data acquired over
time. It is data like TIGER, which is in a multiple-field order (i.e.
state, zipr, zipl), that has complex patterns in the secondary sorts that
can't be detected with too small a sample.





 PS: I pointed out workarounds in my earlier posting
 in this thread.  Yes, I'm using the same TIGER data
 you are.


Cool.



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-08 Thread pgsql
 [EMAIL PROTECTED] writes:

 The basic problem with a fixed sample is that is assumes a normal
 distribution.

 That's sort of true, but not in the way you think it is.

[snip]

Greg, I think you have an excellent ability to articulate stats, but I
think the view that this is like election polling is incorrect.

Election polling assumes a very simple outcome: some standard distribution
of a limited number of options. I don't think that applies here.


 When you look at a sine wave on an oscilloscope, you can see it clear as
 day. When you look at music on the scope, you know there are many waves
 there, but it is difficult to make heads or tails of it. (use xmms or
 winamp to see for yourself) The waves change in frequency, amplitude,
 and
 duration over a very large scale. That's why you use a spectrum analyzer
 to go from time domain to frequency domain. In frequency domain, you can
 see the trends better.

 That's not a bad analogy to many problems where you're measuring data that
 has
 non-randomness in it but that are not visible in the domain that the
 statistics that are being analyzed. This seems to happen a lot with
 geographic
 data, for instance.

EXACTLY!!!


 If you find that increasing the stats targets improves things then this
 isn't true. If you find that it doesn't then what's really needed is a
 cleverer set of statistics to look for.

I will be the first one to say that increasing the samples is not perfect,
but it is a methodology that will help without major changes in
PostgreSQL. Simply increasing the samples to a percentage of the estimated
number of rows (with some upper and lower limits, of course) will increase
the accuracy of the n_distinct and correlation settings (at least a little
bit), and that will make a huge impact with very little work.

If we want to discuss improved statistics, then we should include a
standard deviation and a sliding-window deviation, or something like that.
Hell, maybe even FFT.

The basic problem, I think, is that the sampling mechanism is more like an
oscilloscope looking for large trends instead of a spectrum analyzer
looking for the smaller ones.

We have to be able to tell the planner that adjacent values are less
random even though, as a whole, they are seemingly random.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] One Big trend vs multiple smaller trends in table statistics

2005-02-08 Thread pgsql
A couple of us using the US Census TIGER database have noticed something
about the statistics gathering of analyze. If you follow the thread "Query
optimizer 8.0.1" you'll see the progression of the debate.

To summarize what I think we've seen:

The current implementation of analyze is designed around sampling a table
to characterize the basic trend of the data. The problem with this
approach is that it assumes that the data has a single overall trend.

Greg Stark's post "Cross column statistics" touches on the general problem.

The best analogy so far is the difference between an oscilloscope and a
spectrum analyzer. The current statistics gathering is like a sampling
oscilloscope trying to display a single waveform.

Some data trends are more like audio signals, where the data has many
smaller trends in a seemingly random stream. With a spectrum analyzer you
can see the various components. Use Winamp or XMMS for a visualization.

Let's assume the data is in a multiple-column sort order. Let's assume it
is a set of street addresses sorted by:

state, streetname, streettype, address

MA, ABBOT, RD, 100
MA, ABBOT, RD, 200
MA, ABBOT, RD, 300
MA, ABBOT, ST, 100
MA, ABBOT, ST, 200
MA, MAPLE, RD, 100
MA, MAPLE, RD, 200
MA, MAPLE, ST, 100
...
...
WY, ABBOT, RD, 100
etc.


This table has MILLIONS of rows, every single address in the country. The
trend of state is clearly an increasing step ramp over the entire table.
The trend of streetname can be imagined as a waveform of a series of ramps
for each state. The trend of streettype, similarly, is a series of ramps
per street name, and the waveform for address is a ramp for each
streettype.

The statistics PostgreSQL currently employs will work great for "state",
but much less so for "streetname".

A query of "select * from addresses where streetname = 'ABBOT'" will be
seen as more expensive than it really is. Most of the ABBOTs will be
together in about 50 clusters (one for each state, assuming every state
has at least one ABBOT), but the current stats are not designed to detect
this.
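
You can see the overestimate directly on the hypothetical addresses table
above by comparing the planner's guess with what actually happens:

explain analyze
select * from addresses where streetname = 'ABBOT';
-- the row estimate is fine; the cost is inflated because the planner
-- assumes the matching rows are scattered randomly over the whole table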

Yes, eventually, if the sub-trends are small enough, the index scans
become more expensive than table scans, but the current stats can't tell
where that point is. Clearly it is not at the secondary sort (or
streetname) level.

I've found that increasing the sample size in analyze.c can help in
specific cases, but the overall problem remains.

The question is: Is this really a problem? If so, what can we do?

I was thinking of trying to compute a sliding-window standard deviation,
which should be able to detect smaller trends in an overall table; this
would require a lot of work in analyze.c.

If the sliding window deviation is low, then the correlation of the table
should be increased, telling the planner that an index scan is a better
choice. The actual math behind the values has to be worked out, of course,
but what do you think about the idea?

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] correlation in pg_stats

2005-02-08 Thread pgsql

 Short summary:

   * It looks to me like the planner vastly overestimates
 the # of pages read by index scan in quite a few of my
 tables even though stats collected by ANALYZE are correct.

   * The problem happens any time you have multiple columns
 that have a number of repeated values in them, and
 you CLUSTER the table by a sort using both columns
 (like city,state,zip,phone# or firstname,lastname).

   * I think this is the problem that Mark Kirkwood is seeing
 in his threads Query optimizer 8.0.1 and One Big trend
 vs multiple smaller trends in hackers.

Actually, [EMAIL PROTECTED] is Mark Woodward. Pleased to meet you.
:)

(I hate using my name on lists like this because of spammers)


   * A test script demonstrating the issue also follows.

   * I think keeping one more stat per attribute in
 pg_stastic that could describe this behavior.


 Longer:


   If I understand the optimizer correctly,  correlation is used
   to both guess how much random disk access will be required in
   a query; as well as estimate how many pages will be read.

   Unfortunately, many tables in my larger databases have
   columns with values that are tightly packed on a few pages;
   even though there is no total-ordering across the whole table.
   Stephan Szabo described this as a clumping effect:
   http://archives.postgresql.org/pgsql-performance/2003-01/msg00286.php

Yes.

I think we are describing the exact same issue.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] external indices ...

2005-02-08 Thread pgsql
I haven't worked with GiST, although I have been curious from time to
time. I've just never had the time to sit, read, and try out the GiST
system.

On my text search system (FTSS) I use functions that return sets of data.
It may be easier to implement that than a GiST index.

Basically, I create a unique ID index on a table of data and create my
external index. When a search is done, I return a set of unique IDs and
ranks.
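
The shape of it is a set-returning function joined back to the base table
on that unique ID; the function and table names here are only
illustrative, not the real FTSS API:

select d.*, s.rank
  from ftss_search('some query') as s(id integer, rank real),
       documents d
 where d.id = s.id
 order by s.rank desc;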



 I believe that this is what Oleg et al tap into with the tsearch2 stuff,
 no?  I have someone asking me about it, and want to make sure that I'm
 telling him the right answer ... is this what GiST is?  And, if so, what
 is a *good* doc for me to point them at to get up to speed with it?  Is
 what we have in the docs the best place, or is there something someone
 else has written that gets into it even more?

 Thanks ...

 
 Marc G. Fournier   Hub.Org Networking Services
 (http://www.hub.org)
 Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ:
 7615664




---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Fast reference without an index?

2005-02-08 Thread pgsql
A question to the hackers:

Is there a way, and if I'm being stupid please tell me, to use something
like a row ID to reference a row in a PostgreSQL database, allowing the
database to find a specific row without using an index?

I mean, an index has to return something like a row ID for the database to
find the row, right? Granted it has to find the version with the right
transaction ID, but still, you get the idea. In the old days of ISAM
stuff, it would just be the offset into the file.

In my text search system, rather than create an additional indexed column
for a row ID, wouldn't it be more efficient if I could use PostgreSQL's
own data for this? This would greatly improve the performance of FTSS, as
it would eliminate the extra index operation per row returned.



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Fast reference without an index?

2005-02-08 Thread pgsql
 [EMAIL PROTECTED] writes:
 Is there a way, and if I'm being stupid please tell me, to use something
 like a row ID to reference a row in a PostgreSQL database? Allowing the
 database to find a specific row without using an index?

 ctid ... which changes on every update ...

Well, how does an index do it? Say this:

select * from mytable where name = 'foo';

The index must return something. Say I have a row that is constantly being
updated, or has an original item inserted: an item which is valid within
my transaction, and an item which has just been inserted but has a
transaction ID greater than mine.


Doesn't the index have some base number which points to the first valid
occurrence of the row, and then the valid row is found based on the
transaction ID, or has PG changed? Is that the ctid?
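
For what it's worth, you can select by ctid directly; it just moves
whenever the row is updated or the table is rewritten. With a made-up tid
value:

select ctid, name from mytable where name = 'foo';
select * from mytable where ctid = '(0,1)';   -- direct fetch from that page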



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-07 Thread pgsql
 [EMAIL PROTECTED] writes:
 One of the things that is disturbing to me about the analyze settings is
 that it wants to sample the same number of records from a table
 regardless
 of the size of that table.

 The papers that I looked at say that this rule has a good solid
 statistical foundation, at least for the case of estimating histograms.
 See the references cited in analyze.c.


Any and all random sampling assumes a degree of uniform distribution. This
is the basis of the model. It assumes that chunks of the whole will be
representative of the whole (to some degree). This works when normal
variations are more or less distributed uniformly. As variations and
trends become less uniformly distributed, more samples are required to
characterize them.

Douglas Adams had a great device called the Total Perspective Vortex,
which inferred the whole of the universe from a piece of fairy cake. It
was a subtle play on the absurd notion that a very small sample could lead
to an understanding of an infinitely larger whole.

On a very basic level, why bother sampling the whole table at all? Why not
check one block and infer all information from that? Because we know that
isn't enough data. In a table of 4.6 million rows, can you say with any
mathematical certainty that a sample of 100 points can be, in any way,
representative?

Another problem with random sampling is trend analysis. Oftentimes there
are minor trends in data. Ron pointed out the lastname/firstname trend.
Although there seems to be no correlation between first names in the
table, there are clearly groups or clusters of ordered data, an ordering
that is missed by too small a sample.

I understand why you chose the Vitter algorithm, because it provides a
basically sound methodology for sampling without knowledge of the size of
the whole, but I think we can do better. I would suggest using the current
algorithm the first time through, then adjusting the number of samples [n]
based on the previous estimate of the size of the table [N]. Each
successive ANALYZE would become more accurate. The Vitter algorithm is
still useful, as [N] will always be an estimate.
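
The previous estimate of [N] that could seed such an adaptive scheme is
already kept from the last VACUUM/ANALYZE pass:

select relname, relpages, reltuples
  from pg_class
 where relname = 'rt1';   -- reltuples is the prior row-count estimate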

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-07 Thread pgsql
 [EMAIL PROTECTED] writes:
 On a very basic level, why bother sampling the whole table at all? Why
 not
 check one block and infer all information from that? Because we know
 that
 isn't enough data. In a table of 4.6 million rows, can you say with any
 mathmatical certainty that a sample of 100 points can be, in any way,
 representative?

 This is a statistical argument, not a rhetorical one, and I'm not going
 to bother answering handwaving.  Show me some mathematical arguments for
 a specific sampling rule and I'll listen.


Tom, I am floored by this response; I am shaking my head in disbelief.

It is inarguable that increasing the sample size increases the accuracy of
a study, especially when the diversity of the subject is unknown. It is
known that reducing a sample size increases the probability of error in
any poll or study. The required sample size depends on the variance of the
whole. It is mathematically unsound to ASSUME any sample size is valid
without understanding the standard deviation of the set.

http://geographyfieldwork.com/MinimumSampleSize.htm

Again, I understand why you used the Vitter algorithm, but it has been
shown insufficient (as used) with the US Census TIGER database. We
understand this because we have seen that the random sampling, as
implemented, gathers insufficient information to properly characterize the
variance in the data.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-07 Thread pgsql
 On Mon, Feb 07, 2005 at 11:27:59 -0500,
   [EMAIL PROTECTED] wrote:

 It is inarguable that increasing the sample size increases the accuracy
 of
 a study, especially when diversity of the subject is unknown. It is
 known
 that reducing a sample size increases probability of error in any poll
 or
 study. The required sample size depends on the variance of the whole. It
 is mathmatically unsound to ASSUME any sample size is valid without
 understanding the standard deviation of the set.

 For large populations the accuracy of estimates of statistics based on
 random
 samples from that population are not very sensitve to population size and
 depends primarily on the sample size. So that you would not expect to need
 to use larger sample sizes on larger data sets for data sets over some
 minimum size.

That assumes a fairly low standard deviation. If the standard deviation is
low, then a minimal sample size works fine. If there were zero deviation
in the data, then a sample of one would work fine.

If the standard deviation is high, then you need more samples. If you have
a high standard deviation and a large data set, you need more samples than
you would need for a smaller data set.

In the current implementation of analyze.c, the default is 100 samples. On
a table of 10,000 rows, that is probably a good number to characterize the
data well enough for the query optimizer (a 1% sample). For a table with
4.6 million rows, that's less than 0.002%.

Think about an irregularly occurring event, unevenly distributed
throughout the data set. A randomized sample strategy normalized across
the whole data set with too few samples will mischaracterize the event or
even miss it altogether.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-07 Thread pgsql
 On Mon, Feb 07, 2005 at 13:28:04 -0500,

 What you are saying here is that if you want more accurate statistics, you
 need to sample more rows. That is true. However, the size of the sample
 is essentially only dependent on the accuracy you need and not the size
 of the population, for large populations.

That's nonsense.

If your total data size is 100 elements in a set, then a sample size of
100 elements will cover 100% of your data.

If your total data size is 10,000 elements in a set, then a sample size of
100 elements will cover 1% of your data.

In the case of the TIGER database, a base of 100 samples means about .002%
of the data is sampled. Think about that: that is an average of 1 sample
for every 50,000 records or so. You could have substantial but irregular
trends in the data that may never get detected, and this is EXACTLY what
we see. If we increase the sample size (targrows), the statistics suddenly
work better.

For instance, look at the data below.

The first analyze / select from pg_stats is with an analyze of 3000
samples. The zipl and zipr columns get calculated poorly, which can cause
the planner to use a table scan instead of an index scan.

The second analyze / select from pg_stats is with an analyze of 10000
samples. The zipl and zipr n_distinct values are still off by a factor of
10, but close enough for the planner to deal with.

If the premise is that sample size doesn't make a difference, I think
we've proved that this is not true.


tiger=# analyze verbose rt1;
INFO:  analyzing public.rt1
INFO:  rt1: scanned 3000 of 1527360 pages, containing 90978 live rows
and 0 dead rows; 3000 rows in sample, 46318719 estimated total rows
ANALYZE

tiger=# select * from pg_stats where tablename = 'rt1' and attname like
'zip%';
-[ RECORD 1 ]-----+-----------------------------------------------------------
 schemaname        | public
 tablename         | rt1
 attname           | zipl
 null_frac         | 0.672
 avg_width         | 4
 n_distinct        | 960
 most_common_vals  | {76240,52601,55746,71730,74604,92705,93117,95818}
 most_common_freqs | {0.0017,0.0013,0.0013,0.0013,0.0013,0.0013,0.0013,0.0013}
 histogram_bounds  | {1085,16652,28206,33412,43147,49428,58801,68110,77515,91340,99006}
 correlation       | -0.119519
-[ RECORD 2 ]-----+-----------------------------------------------------------
 schemaname        | public
 tablename         | rt1
 attname           | zipr
 null_frac         | 0.677
 avg_width         | 4
 n_distinct        | 960
 most_common_vals  | {76240,52601,55746,71730,74604,78577,92705,93117,95818}
 most_common_freqs | {0.0017,0.0013,0.0013,0.0013,0.0013,0.0013,0.0013,0.0013,0.0013}
 histogram_bounds  | {962,15613,28572,33606,43545,49428,60423,68064,77040,91340,99006}
 correlation       | -0.104158
(2 rows)

Now this:
tiger=# analyze verbose rt1;
INFO:  analyzing public.rt1
INFO:  rt1: scanned 1 of 1527360 pages, containing 303419 live rows
and 0 dead rows; 1 rows in sample, 46343004 estimated total rows
ANALYZE

tiger=# select * from pg_stats where tablename = 'rt1' and attname like
'zip%';
-[ RECORD 1 ]-----+-----------------------------------------------------------
 schemaname        | public
 tablename         | rt1
 attname           | zipl
 null_frac         | 0.6807
 avg_width         | 4
 n_distinct        | 2942
 most_common_vals  | {61832,13090,17404,30907,31204,45342,47714,63050,80918,93726}
 most_common_freqs | {0.0008,0.0006,0.0005,0.0005,0.0005,0.0005,0.0005,0.0005,0.0005,0.0005}
 histogram_bounds  | {654,15018,28208,33870,43006,49008,59741,68803,78640,92105,99687}
 correlation       | -0.137744
-[ RECORD 2 ]-----+-----------------------------------------------------------
 schemaname        | public
 tablename         | rt1
 attname           | zipr
 null_frac         | 0.684
 avg_width         | 4
 n_distinct        | 2921
 most_common_vals  | {13090,61832,30907,31204,45342,47714,63050,70122,80918,93726}
 most_common_freqs | {0.0006,0.0006,0.0005,0.0005,0.0005,0.0005,0.0005,0.0005,0.0005,0.0005}
 histogram_bounds  | {731,14824,27871,33324,42276,48895,58401,68338,78575,92105,99654}
 correlation       | -0.140663
(2 rows)



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Query optimizer 8.0.1 (and 8.0)

2005-02-07 Thread pgsql
 Maybe I am missing something - ISTM that you can increase your
 statistics target for those larger tables to obtain a larger (i.e.
 better) sample.

No one is arguing that you can't do things manually, but I am not the
first to notice this. I saw the query planner doing something completely
stupid and set off to discover why.

Think about the person using PostgreSQL for the first time. He or she does
not know about this stuff. Even if they've read the FAQs and the manual
cover to cover, it will take them some time to figure out how it all works
together. PostgreSQL is a big system, and this is exactly why MySQL gets
better marks from newbies.

In this case, the behavior observed could be changed by altering the
sample size for a table. I submit that an arbitrary fixed sample size is
not a good base for the analyzer, but that the sample size should be based
on the size of the table or some calculation of its deviation.

There is no reason why old stats can't be used to create more accurate
stats. Using successive analyze operations, we could create better
statistics for the planner. We could increase the sample size based on the
table size. We could, I suppose, also calculate some sort of deviation
statistic so that n_distinct can be calculated better with a smaller
sample set.

The basic problem, though, is that PostgreSQL performed incorrectly on a
simple query after indexes were created and analyze was performed. Yes, it
can be corrected (that's what led me to my conclusions), but shouldn't we
try to devise a better system in the future to improve PostgreSQL so it
does not need this sort of tuning?




 regards

 Mark

 [EMAIL PROTECTED] wrote:
[EMAIL PROTECTED] writes:
 Any and all random sampling assumes a degree of uniform distribution.
 This
 is the basis of the model. It assumes that chunks of the whole will be
 representative of the whole (to some degree). This works when normal
 variations are more or less distributed uniformly. As variations and
 trends becomes less uniformly distributed, more samples are required to
 characterize it.

 Douglas Adams had a great device called the Total Perspective Vortex
 which infered the whole of the universe from a piece of fairy cake. It
 was
 a subtle play on the absurd notion that a very small sample could lead
 to
 an understanding of an infinitly larger whole.

 On a very basic level, why bother sampling the whole table at all? Why
 not
 check one block and infer all information from that? Because we know
 that
 isn't enough data. In a table of 4.6 million rows, can you say with any
 mathmatical certainty that a sample of 100 points can be, in any way,
 representative?

 Another problem with random sampling is trend analysis. Often times
 there
 are minor trends in data. Ron pointed out the lastname firstname trend.
 Although there seems to be no correlation between firstnames in the
 table,
 there are clearly groups or clusters of ordered data that is an ordering
 that is missed by too small a sample.

 I understand why you chose the Vitter algorithm, because it provides a
 basically sound methodology for sampling without knowledge of the size
 of
 the whole, but I think we can do better. I would suggest using the
 current
 algorithm the first time through, then adjust the number of samples [n]
 based on the previous estimate of the size of the table [N]. Each
 successive ANALYZE will become more accurate. The Vitter algorithm is
 still useful as [N] will always be an estimate.



 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

