Re: [HACKERS] Multi-Dimensional Histograms

2009-07-05 Thread Gregory Maxwell
On Mon, Jun 29, 2009 at 10:22 PM, Robert Haas robertmh...@gmail.com wrote: I'm finding myself unable to follow all the terminology on this thread.  What's dimension reduction?  What's PCA? [snip] Imagine you have a dataset with two variables, say height in inches and age in years. For the purpose

Re: [HACKERS] Significantly larger toast tables on 8.4?

2009-01-07 Thread Gregory Maxwell
On Fri, Jan 2, 2009 at 5:48 PM, Martijn van Oosterhout klep...@svana.org wrote: So you compromise. You split the data into say 1MB blobs and compress each individually. Then if someone does a substring at offset 3MB you can find it quickly. This barely costs you anything in the compression
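
The compromise described in the quoted message (compress fixed-size blobs independently so a substring read only touches the blobs it overlaps) can be sketched as follows. This is a minimal illustration, not PostgreSQL's TOAST code: the use of zlib and the names compress_chunks and substring are assumptions made for the example; only the 1MB chunk size comes from the message.

```python
import zlib

CHUNK_SIZE = 1024 * 1024  # 1MB, the blob size suggested in the quoted message


def compress_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Compress each fixed-size chunk of data independently (hypothetical helper)."""
    return [
        zlib.compress(data[start:start + chunk_size])
        for start in range(0, len(data), chunk_size)
    ]


def substring(chunks: list[bytes], offset: int, length: int,
              chunk_size: int = CHUNK_SIZE) -> bytes:
    """Return data[offset:offset + length], decompressing only the chunks it overlaps."""
    if length <= 0:
        return b""
    first = offset // chunk_size                 # index of the chunk holding the first byte
    last = (offset + length - 1) // chunk_size   # index of the chunk holding the last byte
    joined = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    skip = offset - first * chunk_size           # bytes to skip inside the first chunk
    return joined[skip:skip + length]
```

A read at offset 3MB of a 1MB-chunked value decompresses only the chunk at index 3 (and the next one if the range crosses a boundary), rather than the whole value.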

Re: [HACKERS] Spinlock backoff algorithm

2007-11-14 Thread Gregory Maxwell
On Nov 14, 2007 10:12 PM, Joshua D. Drake [EMAIL PROTECTED] wrote: http://www.intel.com/performance/server/xeon/intspd.htm http://www.intel.com/performance/server/xeon/fpspeed.htm That says precisely nothing about the matter at hand. Someone should simply change it and benchmark it in pgsql. I

[HACKERS] GIST and GIN indexes on varchar[] aren't working in CVS.

2007-09-01 Thread Gregory Maxwell
There seems to be some behavior change in current CVS with respect to gist and gin indexes on varchar[]. Some side effect of the tsearch2 merge? \d search_pages Table public.search_pages Column |Type | Modifiers ---+-+---

Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-15 Thread Gregory Maxwell
On 6/15/07, Gregory Stark [EMAIL PROTECTED] wrote: While in theory spreading out the writes could have a detrimental effect I think we should wait until we see actual numbers. I have a pretty strong suspicion that the effect would be pretty minimal. We're still doing the same amount of i/o

Re: [HACKERS] Sorted writes in checkpoint

2007-06-14 Thread Gregory Maxwell
On 6/14/07, Simon Riggs [EMAIL PROTECTED] wrote: On Thu, 2007-06-14 at 16:39 +0900, ITAGAKI Takahiro wrote: Greg Smith [EMAIL PROTECTED] wrote: On Mon, 11 Jun 2007, ITAGAKI Takahiro wrote: If the kernel can treat sequential writes better than random writes, is it worth sorting dirty

Re: [HACKERS] [GENERAL] Index greater than 8k

2006-11-01 Thread Gregory Maxwell
On 11/1/06, Teodor Sigaev [EMAIL PROTECTED] wrote: [snip] Brain storm method: Develop a dictionary which returns all substring for lexeme, for example for word foobar it will be 'foobar fooba foob foo fo oobar ooba oob oo obar oba ob bar ba ar'. And make GIN functional index over your column
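
The substring expansion in the quoted "brain storm" can be sketched in a few lines. This is only an illustration of the enumeration, not a tsearch dictionary: the function name lexeme_substrings and the minimum length of 2 are assumptions, the latter inferred from the foobar example, which lists no single-character substrings.

```python
def lexeme_substrings(word: str, min_len: int = 2) -> list[str]:
    """List every substring of word with at least min_len characters.

    Substrings are emitted by start position, longest first, which is the
    order used in the quoted example.
    """
    out = []
    for start in range(len(word)):
        # end runs from the full length down to the shortest allowed substring
        for end in range(len(word), start + min_len - 1, -1):
            out.append(word[start:end])
    return out


print(" ".join(lexeme_substrings("foobar")))
# foobar fooba foob foo fo oobar ooba oob oo obar oba ob bar ba ar
```

A word of n characters yields on the order of n*n/2 entries, so the index grows quickly with lexeme length.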

Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-24 Thread Gregory Maxwell
On 10/24/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: I wasn't aware that a system could protect against this. :-) I write 8 Kbytes - how can I guarantee that the underlying disk writes all 8 Kbytes before it loses power? And why isn't the CRC a valid means of dealing with this? :-) [snip]

Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-21 Thread Gregory Maxwell
On 10/21/06, Tom Lane [EMAIL PROTECTED] wrote: [snip] It hasn't even been tested. One thing I'd want to know about is the performance effect on non-Intel machines. On Opteron 265 his test code shows SB8 (the intel alg) is 2.48x faster for checksum and 1.95x faster for verify for the 800 *

Re: [HACKERS] Replication

2006-08-21 Thread Gregory Maxwell
On 8/21/06, Alvaro Herrera [EMAIL PROTECTED] wrote: But the confirmation that needs to come is that the WAL changes have been applied (fsync'ed), so the performance will be terrible. So bad, that I don't think anyone will want to use such a replication system ... Okay. I give up... Why is

Re: [HACKERS] How does the planner deal with multiple possible indexes?

2006-07-19 Thread Gregory Maxwell
On 7/19/06, Jim C. Nasby [EMAIL PROTECTED] wrote: [snip] \d does list bdata__ident_filed_departure before bdata_ident; I'm wondering if the planner is finding the first index with ident_id in it and stopping there? From my own experience it was grabbing the first that has the requested field

Re: [HACKERS] Compression and on-disk sorting

2006-05-15 Thread Gregory Maxwell
Oh come on, Sorry to troll but this is too easy. On 5/15/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: You guys have to kill your Windows hate - in jest or otherwise. It's zealous, and blinding. [snip] Why would it be assumed, that a file system designed for use from a desktop, would be

Re: [HACKERS] Support Parallel Query Execution in Executor

2006-04-09 Thread Gregory Maxwell
On 4/9/06, Tom Lane [EMAIL PROTECTED] wrote: Gregory Maxwell [EMAIL PROTECTED] writes: For example, one case made in this thread involved bursty performance with seqscans presumably because the I/O was stalling while processing was being performed. Actually, the question that that raised

Re: [HACKERS] Support Parallel Query Execution in Executor

2006-04-09 Thread Gregory Maxwell
On 4/9/06, Tom Lane [EMAIL PROTECTED] wrote: Certainly. If the OS has readahead logic at all, it ought to think that a seqscan of a large table qualifies. Your arguments seem to question whether readahead is useful at all --- but they would apply *just as well* to an app doing its own

Re: [HACKERS] Support Parallel Query Execution in Executor

2006-04-09 Thread Gregory Maxwell
On 4/9/06, Luke Lonergan [EMAIL PROTECTED] wrote: Gregory, On 4/9/06 1:36 PM, Gregory Maxwell [EMAIL PROTECTED] wrote: It might also be interesting for someone with the right testing rig on linux to try the adaptive readahead patch to see if that improves PG's ability to keep the disk

Re: [HACKERS] Support Parallel Query Execution in Executor

2006-04-08 Thread Gregory Maxwell
On 4/8/06, Tom Lane [EMAIL PROTECTED] wrote: This is exactly the bit of optimism I was questioning. We've already been sweating blood trying to reduce multiprocessor contention on data structures in which collisions ought to be avoidable (ie, buffer arrays where you hope not everyone is

[HACKERS] [GENERAL] A real currency type

2006-03-21 Thread Gregory Maxwell
On 3/21/06, Jim C. Nasby [EMAIL PROTECTED] wrote: ISTM that having a currency type is pretty common for most databases; I don't really see any reason not to just include it. Likewise for a type that actually stores timezone info with a timestamp. This really should be generalized to work with

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create

2006-02-17 Thread Gregory Maxwell
On 2/17/06, Ragnar [EMAIL PROTECTED] wrote: Say again ? Let us say you have 1 billion rows, where the column in question contains strings like baaaaaa baaaaab baaaaac ... not necessarily in this order on disc of course The minimum value

Re: [HACKERS] Why don't we allow DNS names in pg_hba.conf?

2006-02-13 Thread Gregory Maxwell
On 2/13/06, Joshua D. Drake [EMAIL PROTECTED] wrote: Well as one of the people that deploys and manages many, many postgresql installations I can say I have never run into the need to have dns names and the thought of dns names honestly seems silly. It will increase overhead and dependencies

[HACKERS] Fixing row comparison semantics

2005-12-26 Thread Gregory Maxwell
On 12/26/05, Pavel Stehule [EMAIL PROTECTED] wrote: (1,1) * (1,2) = true (1,2) * (2,1) is NULL (2,3) * (1,2) = false it's useful for multicriterial optimization This is indeed a sane and useful function which should be adopted by the SQL standard.. in postgresql this would easily enough
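
The three quoted results are consistent with a component-wise (Pareto) comparison, where incomparable rows yield NULL. The sketch below rests on that assumption: the operator symbol was lost in the archive (it appears as "*"), so reading it as "less than or equal in every column" is a guess, and row_dominates is a hypothetical name. None stands in for SQL NULL.

```python
from typing import Optional, Sequence


def row_dominates(a: Sequence[int], b: Sequence[int]) -> Optional[bool]:
    """Three-valued component-wise comparison of two rows.

    True  if every column of a is <= the matching column of b,
    False if every column of a is >= the matching column of b (and not the above),
    None  (SQL NULL) if the rows are incomparable.
    """
    if len(a) != len(b):
        raise ValueError("rows must have the same number of columns")
    if all(x <= y for x, y in zip(a, b)):
        return True
    if all(x >= y for x, y in zip(a, b)):
        return False
    return None


print(row_dominates((1, 1), (1, 2)))  # True
print(row_dominates((1, 2), (2, 1)))  # None
print(row_dominates((2, 3), (1, 2)))  # False
```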

Re: [HACKERS] Upcoming PG re-releases

2005-12-08 Thread Gregory Maxwell
On 12/8/05, Bruce Momjian pgman@candle.pha.pa.us wrote: A script which identifies non-utf-8 characters and provides some context, line numbers, etc, will greatly speed up the process of remedying the situation. I think the best we can do is the iconv -c with the diff
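
A script of the kind asked for above (one that reports non-UTF-8 input with line numbers and some context) might look like the sketch below. It is an illustration only, not a tool shipped with PostgreSQL; the name find_invalid_utf8 and the 10-byte context window are assumptions. It reports only the first invalid byte on each offending line.

```python
def find_invalid_utf8(data: bytes, context: int = 10) -> list[tuple[int, int, bytes]]:
    """Return (line_number, byte_column, snippet) for each line that is not valid UTF-8.

    Splitting on b"\n" is safe because the byte 0x0A never occurs inside a
    multi-byte UTF-8 sequence.
    """
    problems = []
    for line_number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as err:
            lo = max(err.start - context, 0)
            # raw bytes around the first offending byte, for human inspection
            snippet = line[lo:err.start + context]
            problems.append((line_number, err.start, snippet))
    return problems


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as dump:
        for line_number, column, snippet in find_invalid_utf8(dump.read()):
            print(f"line {line_number}, byte {column}: {snippet!r}")
```

Run against a dump file, this prints one line per problem, which can be reviewed before deciding whether dropping the bytes with iconv -c is acceptable.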

Re: [HACKERS] Replication on the backend

2005-12-06 Thread Gregory Maxwell
On 12/6/05, Jan Wieck [EMAIL PROTECTED] wrote: IMO this is not true. You can get affordable 10GBit network adapters, so you can have plenty of bandwidth in a db server pool (if they are located in the same area). Even 1GBit Ethernet greatly helps here, and would make it possible to

Re: [HACKERS] Reduce NUMERIC size by 2 bytes, reduce max length to 508 digits

2005-12-05 Thread Gregory Maxwell
On 12/5/05, Tom Lane [EMAIL PROTECTED] wrote: Not only does 4000! not work, but 400! doesn't even work. I just lost demo wow factor points! It looks like the limit would be about factorial(256). The question remains, though, is this computational range good for anything except demos?

[HACKERS] Upcoming PG re-releases

2005-12-04 Thread Gregory Maxwell
On 12/4/05, Tom Lane [EMAIL PROTECTED] wrote: Paul Lindner [EMAIL PROTECTED] writes: On Sun, Dec 04, 2005 at 11:34:16AM -0500, Tom Lane wrote: Paul Lindner [EMAIL PROTECTED] writes: iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql Is that really a one-size-fits-all solution? Especially

Re: [HACKERS] Reducing relation locking overhead

2005-12-02 Thread Gregory Maxwell
On 02 Dec 2005 15:25:58 -0500, Greg Stark [EMAIL PROTECTED] wrote: I suspect this comes out of a very different storage model from Postgres's. Postgres would have no trouble building an index of the existing data using only shared locks. The problem is that any newly inserted (or updated)

Re: [HACKERS] generalizing the planner knobs

2005-12-02 Thread Gregory Maxwell
On 02 Dec 2005 15:49:02 -0500, Greg Stark [EMAIL PROTECTED] wrote: Rod Taylor [EMAIL PROTECTED] writes: The missing capability in this case is to be able to provide or generate (self learning?) statistics for a function that describe a typical result and the cost of getting that result.

Re: [HACKERS] generalizing the planner knobs

2005-12-01 Thread Gregory Maxwell
On 12/1/05, Pollard, Mike [EMAIL PROTECTED] wrote: Optimizer hints were added because some databases just don't have a very smart optimizer. But you are much better served tracking down cases in which the optimizer makes a bad choice, and teaching the optimizer how to make a better one. That

Re: [HACKERS] Improving count(*)

2005-11-21 Thread Gregory Maxwell
On 11/21/05, Jim C. Nasby [EMAIL PROTECTED] wrote: What about Greg Stark's idea of combining Simon's idea of storing per-heap-block xmin/xmax with using that information in an index scan? ISTM that's the best of everything that's been presented: it allows for faster index scans without adding

Re: [HACKERS] Improving count(*)

2005-11-18 Thread Gregory Maxwell
On 11/18/05, Merlin Moncure [EMAIL PROTECTED] wrote: In Sybase ASE (and I'm pretty sure the same is true in Microsoft SQL Server) the leaf level of the narrowest index on the table is scanned, following a linked list of leaf pages. Leaf pages can be pretty dense under Sybase, because they

Re: Réf. : RE: [HACKERS] Running PostGre on DVD

2005-11-15 Thread Gregory Maxwell
On 11/15/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: I don't understand why a user can't WILLINGLY (by EXPLICITLY setting an OPTION) allow a privileged administrator to run PostGre. It is a MAJOR problem for me, that will force me to use another database because my database will be on a

Re: [HACKERS] MERGE vs REPLACE

2005-11-13 Thread Gregory Maxwell
On 11/13/05, Robert Treat [EMAIL PROTECTED] wrote: On Saturday 12 November 2005 04:06, Matteo Beccati wrote: | 1 |1 | NULL | Wow, that seems ugly maybe there's a reason for it, but I'm not sure we could deviate from my$ql's behavior on this even if we wanted... they are the standard

Re: [HACKERS] SIGSEGV taken on 8.1 during dump/reload

2005-11-09 Thread Gregory Maxwell
On 11/8/05, Tom Lane [EMAIL PROTECTED] wrote: Teodor Sigaev [EMAIL PROTECTED] writes: Layout of GIST_SPLITVEC struct has been changed from 8.0, I'm afraid that old .so is used. spl_(right|left)valid fields were added to GIST_SPLITVEC. Does look a bit suspicious ... Robert, are you *sure*

Re: [HACKERS] Interval aggregate regression failure (expected seems

2005-11-07 Thread Gregory Maxwell
On 07 Nov 2005 14:22:37 -0500, Greg Stark [EMAIL PROTECTED] wrote: IIRC, floating point registers are actually longer than a double so if the entire calculation is done in registers and then the result rounded off to store in memory it may get the right answer. Whereas if it loses the extra

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Gregory Maxwell
On 11/4/05, Martijn van Oosterhout kleptog@svana.org wrote: Yeah, and while one way of removing that dependence is to use ICU, that library wants everything in UTF-16. So we replace copying to add NULL to string with converting UTF-8 to UTF-16 on each call. Ugh! The argument for UTF-16 is that

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Gregory Maxwell
On 11/4/05, Tom Lane [EMAIL PROTECTED] wrote: Martijn van Oosterhout kleptog@svana.org writes: Yeah, and while one way of removing that dependence is to use ICU, that library wants everything in UTF-16. Really? Can't it do UCS4 (UTF-32)? There's a nontrivial population of our users that

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Gregory Maxwell
On 11/4/05, Martijn van Oosterhout kleptog@svana.org wrote: [snip] : ICU does not use UCS-2. UCS-2 is a subset of UTF-16. UCS-2 does not : support surrogates, and UTF-16 does support surrogates. This means : that UCS-2 only supports UTF-16's Base Multilingual Plane (BMP). The : notion of UCS-2

Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-03 Thread Gregory Maxwell
On 11/3/05, Martijn van Oosterhout kleptog@svana.org wrote: That's called UTF-16 and is currently not supported by PostgreSQL at all. That may change, since the locale library ICU requires UTF-16 for everything. UTF-16 doesn't get us out of the variable length character game, for that we need

Re: slru.c race condition (was Re: [HACKERS] TRAP: FailedAssertion(!((itemid)-lp_flags

2005-10-31 Thread Gregory Maxwell
On 10/31/05, Jim C. Nasby [EMAIL PROTECTED] wrote: On Mon, Oct 31, 2005 at 01:34:17PM -0500, Bruce Momjian wrote: There is no way if the system has some incorrect value whether that would later corrupt the data or not. Anything the system does that it shouldn't do is a potential corruption

Re: [HACKERS] Differences in UTF8 between 8.0 and 8.1

2005-10-30 Thread Gregory Maxwell
On 10/26/05, Christopher Kings-Lynne [EMAIL PROTECTED] wrote: iconv -c -f UTF8 -t UTF8 recode UTF-8..UTF-8 dump_in.sql dump_out.sql I've got a file with characters that pg won't accept that recode does not fix but iconv does. Iconv is fine for my application, so I'm just posting to the

Re: [HACKERS] enums

2005-10-28 Thread Gregory Maxwell
On 10/27/05, Andrew Dunstan [EMAIL PROTECTED] wrote: Yes, MySQL is broken in some regards, as usual. However, the API isn't bad (except for the fact that it doesn't care what invalid crap you throw at it), and more importantly there are thousands of apps and developers who think around that

Re: [HACKERS] enums

2005-10-27 Thread Gregory Maxwell
On 10/27/05, Jim Nasby [EMAIL PROTECTED] wrote: Adding -hackers back to the list... You could as equally say that it's ordering it by the order of the enum declaration, which seems quite reasonable to me. I don't really see why that's considered reasonable, especially as a default. I

Re: [HACKERS] enums

2005-10-27 Thread Gregory Maxwell
On 10/27/05, Andrew Dunstan [EMAIL PROTECTED] wrote: That seems counter-intuitive. It's also exposing an implementation detail (that the enum is stored internally as a number). No it is not. Not in the slightest. It is honoring the enumeration order defined for the type. That is the ONLY

[HACKERS] On externals sorts and other IO bottlenecks in postgresql.

2005-10-23 Thread Gregory Maxwell
I don't recall this being mentioned in the prior threads: http://www.cs.duke.edu/TPIE/ GPLed, but perhaps it has some good ideas.

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-10-03 Thread Gregory Maxwell
On 10/3/05, Ron Peacetree [EMAIL PROTECTED] wrote: [snip] Just how bad is this CPU bound condition? How powerful a CPU is needed to attain a DB IO rate of 25MBps? If we replace said CPU with one 2x, 10x, etc faster than that, do we see any performance increase? If a modest CPU can drive a

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-30 Thread Gregory Maxwell
On 9/30/05, Ron Peacetree [EMAIL PROTECTED] wrote: 4= I'm sure we are paying all sorts of nasty overhead for essentially emulating the pg filesystem inside another filesystem. That means ~2x as much overhead to access a particular piece of data. The simplest solution is for us to implement a

Re: [HACKERS] [PERFORM] A Better External Sort?

2005-09-30 Thread Gregory Maxwell
On 9/28/05, Ron Peacetree [EMAIL PROTECTED] wrote: 2= We use my method to sort two different tables. We now have these very efficient representations of a specific ordering on these tables. A join operation can now be done using these Btrees rather than the original data tables that involves

Re: [HACKERS] Spinlocks, yet again: analysis and proposed patches

2005-09-15 Thread Gregory Maxwell
On 9/15/05, Tom Lane [EMAIL PROTECTED] wrote: Yesterday's CVS tip: 1 32s 2 46s 4 88s 8 168s plus no-cmpb and spindelay2: 1 32s 2 48s 4 100s 8 177s plus just-committed code to pad LWLock to 32: 1 33s 2 50s 4 98s 8 179s alter to pad to 64: 1

Re: [HACKERS] pl/Ruby, deprecating plPython and Core

2005-08-16 Thread Gregory Maxwell
On 8/16/05, Joshua D. Drake [EMAIL PROTECTED] wrote: Sure... it hasn't been found. We can play the it might have or might not have game all day long but it won't get us anywhere. Today, and yesterday pl/Ruby can be run trust/untrusted, pl/python can not. Both of these things could be said

Re: [HACKERS] pl/Ruby, deprecating plPython and Core

2005-08-16 Thread Gregory Maxwell
On 8/16/05, David Fetter [EMAIL PROTECTED] wrote: It's not. In PL/parlance, trusted means prevented from ever opening a filehandle or a socket, and PL/PythonU is called PL/Python*U* (U for *un*trusted) because it cannot be so prevented. If somebody has figured out a way to make a PL/Python

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Gregory Maxwell
On 6/23/05, Gavin Sherry [EMAIL PROTECTED] wrote: inertia) but seeking to a lot of new tracks to write randomly-positioned dirty sectors would require significant energy that just ain't there once the power drops. I seem to recall reading that the seek actuators eat the largest share of

Re: [HACKERS] LGPL

2005-06-17 Thread Gregory Maxwell
On 6/18/05, Tom Lane [EMAIL PROTECTED] wrote: What is important is that it is possible, and useful, to build Postgres in a completely non-GPL environment. If that were not so then I think we'd have some license issues. But the fact that building PG in a GPL-ized environment creates a

Re: [HACKERS] User/Group Quotas Revisited

2005-06-11 Thread Gregory Maxwell
- Who has permissions to set the user's quota per tablespace, the superuser and the tablespace owner? It would be nice if this were nestable, that is, if the sysadmin could carve out a tablespace for a user then the user could carve that into separately quotated sub tables.. The idea being, a

[HACKERS] Bloom Filter indexes?

2005-05-28 Thread Gregory Maxwell
Has any thought been given to adding bloom filter indexes to PostgreSQL? A bloom index would be created on a column, and could then be used to accelerate exact matches where it is common that the user may query for a value that doesn't exist. For example, with the query select userid from
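
For readers unfamiliar with the structure being proposed, here is a minimal Bloom filter sketch. It shows the general technique only and says nothing about how such an index would be implemented inside PostgreSQL; the class name, the bit-array size, the number of hashes, and the use of SHA-256 to derive bit positions are all assumptions chosen for brevity.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, a tunable rate of false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value: str):
        # Derive num_hashes bit positions by salting the value with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value: str) -> None:
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(value))
```

An exact-match lookup would consult might_contain first and go to the table or a conventional index only when it returns True, which is where the saving comes from when most queried values do not exist.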
