Re: [HACKERS] Parallel query execution

2013-01-15 Thread Claudio Freire
On Tue, Jan 15, 2013 at 8:19 PM, Bruce Momjian wrote: >> Given our row-based storage architecture, I can't imagine we'd do >> anything other than take a row-based approach to this.. I would think >> we'd do two things: parallelize based on partitioning, and parallelize >> seqscan's across the ind

Re: [HACKERS] [PATCH] COPY .. COMPRESSED

2013-01-15 Thread Claudio Freire
On Tue, Jan 15, 2013 at 7:46 PM, Tom Lane wrote: >> Compressing every small packet seems like it'd be overkill and might >> surprise people by actually reducing performance in the case of lots of >> small requests. > > Yeah, proper selection and integration of a compression method would be > criti

Re: [HACKERS] Parallel query execution

2013-01-15 Thread Claudio Freire
On Wed, Jan 16, 2013 at 12:13 AM, Stephen Frost wrote: > * Claudio Freire (klaussfre...@gmail.com) wrote: >> On Tue, Jan 15, 2013 at 8:19 PM, Bruce Momjian wrote: >> > The 1GB idea is interesting. I found in pg_upgrade that file copy would >> > just overwhelm the

Re: [HACKERS] Parallel query execution

2013-01-15 Thread Claudio Freire
On Wed, Jan 16, 2013 at 12:55 AM, Stephen Frost wrote: >> If memory serves me correctly (and it does, I suffered it a lot), the >> performance hit is quite considerable. Enough to make it "a lot worse" >> rather than "not as good". > > I feel like we must not be communicating very well. > > If the

Re: [HACKERS] Parallel query execution

2013-01-16 Thread Claudio Freire
On Wed, Jan 16, 2013 at 10:33 AM, Stephen Frost wrote: > * Claudio Freire (klaussfre...@gmail.com) wrote: >> Well, there's the fault in your logic. It won't be as linear. > > I really don't see how this has become so difficult to communicate. > > It doesn'

Re: [HACKERS] Parallel query execution

2013-01-16 Thread Claudio Freire
On Wed, Jan 16, 2013 at 10:04 PM, Jeff Janes wrote: >> Hmm... >> >> How about being aware of multiple spindles - so if the requested data >> covers multiple spindles, then data could be extracted in parallel. This >> may, or may not, involve multiple I/O channels? > > > > effective_io_concurrency

Re: [HACKERS] [PATCH] COPY .. COMPRESSED

2013-01-16 Thread Claudio Freire
On Wed, Jan 16, 2013 at 8:19 PM, Robert Haas wrote: > On Tue, Jan 15, 2013 at 4:50 PM, Tom Lane wrote: >> I find the argument that this supports compression-over-the-wire to be >> quite weak, because COPY is only one form of bulk data transfer, and >> one that a lot of applications don't ever use

Re: [HACKERS] Parallel query execution

2013-01-16 Thread Claudio Freire
On Wed, Jan 16, 2013 at 11:44 PM, Bruce Momjian wrote: > On Wed, Jan 16, 2013 at 05:04:05PM -0800, Jeff Janes wrote: >> On Tuesday, January 15, 2013, Stephen Frost wrote: >> >> * Gavin Flower (gavinflo...@archidevsys.co.nz) wrote: >> > How about being aware of multiple spindles - so if the

Re: [HACKERS] [PATCH 1/3] Fix x + y < x overflow checks

2013-01-24 Thread Claudio Freire
On Thu, Jan 24, 2013 at 6:36 AM, Xi Wang wrote: > icc optimizes away the overflow check x + y < x (y > 0), because > signed integer overflow is undefined behavior in C. Instead, use > a safe precondition test x > INT_MAX - y. I should mention gcc 4.7 does the same, and it emits a warning. --
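The precondition test quoted above can be sketched as follows. This is a minimal illustration, not code from the patch: the helper names add_would_overflow and checked_add are hypothetical, and the branch for negative y is an addition of this sketch that mirrors the quoted test. The point is that the check happens before the addition, so the signed overflow (undefined behavior in C) never takes place and the compiler has nothing to optimize away.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if x + y would overflow an int.
 * The addition itself is never performed here, so no undefined
 * behavior is involved.
 */
static bool
add_would_overflow(int x, int y)
{
    if (y > 0)
        return x > INT_MAX - y;     /* the precondition test quoted above */
    if (y < 0)
        return x < INT_MIN - y;     /* mirrored test for negative y (sketch only) */
    return false;
}

/*
 * Hypothetical helper: stores x + y in *result and returns true,
 * or returns false and leaves *result untouched on overflow.
 */
static bool
checked_add(int x, int y, int *result)
{
    if (add_would_overflow(x, y))
        return false;
    *result = x + y;
    return true;
}
```

By contrast, a check written as x + y < x performs the overflowing addition first and only then tests the result, which is the form the quoted message says icc removes.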

Re: unified vs context diffs (was Re: [HACKERS] Strange Windows problem, lock_timeout test request)

2013-02-24 Thread Claudio Freire
On Sun, Feb 24, 2013 at 11:08 AM, Stephen Frost wrote: > * Heikki Linnakangas (hlinnakan...@vmware.com) wrote: >> So if you want to be kind to readers, look at the patch and choose >> the format depending on which one makes it look better. But there's >> no need to make a point of it when someone

Re: [HACKERS] Spin Lock sleep resolution

2013-04-01 Thread Claudio Freire
On Tue, Apr 2, 2013 at 1:24 AM, Tom Lane wrote: > Jeff Janes writes: >> The problem is that the state is maintained only to an integer number of >> milliseconds starting at 1, so it can take a number of attempts for the >> random increment to jump from 1 to 2, and then from 2 to 3. > > Hm ... fai

Re: [HACKERS] Multi-pass planner

2013-04-19 Thread Claudio Freire
On Fri, Apr 19, 2013 at 6:19 PM, Jeff Janes wrote: > On Wed, Apr 3, 2013 at 6:40 PM, Greg Stark wrote: >> >> >> On Fri, Aug 21, 2009 at 6:54 PM, decibel wrote: >>> >>> Would it? Risk seems like it would just be something along the lines of >>> the high-end of our estimate. I don't think confiden

Re: [HACKERS] Multi-pass planner

2013-04-19 Thread Claudio Freire
On Fri, Apr 19, 2013 at 7:43 PM, Jeff Janes wrote: > On Fri, Apr 19, 2013 at 2:24 PM, Claudio Freire > wrote: >> >> >> Especially if there's some locality of occurrence, since analyze >> samples pages, not rows. > > > But it doesn't take all row

Re: [HACKERS] Allowing parallel pg_restore from pipe

2013-04-24 Thread Claudio Freire
On Wed, Apr 24, 2013 at 6:47 PM, Joachim Wieland wrote: > On Wed, Apr 24, 2013 at 4:05 PM, Stefan Kaltenbrunner > wrote: >> >> > What might make sense is something like pg_dump_restore which would have >> > no intermediate storage at all, just pump the data etc from one source >> > to another in

Re: [HACKERS] Parallel Sort

2013-05-14 Thread Claudio Freire
On Tue, May 14, 2013 at 11:50 AM, Noah Misch wrote: > On Mon, May 13, 2013 at 09:52:43PM +0200, Kohei KaiGai wrote: >> 2013/5/13 Noah Misch >> > The choice of whether to parallelize can probably be made a manner similar >> > to >> > the choice to do an external sort: the planner guesses the outco

Re: [HACKERS] Parallel Sort

2013-05-15 Thread Claudio Freire
On Wed, May 15, 2013 at 3:04 PM, Noah Misch wrote: > On Tue, May 14, 2013 at 12:15:24PM -0300, Claudio Freire wrote: >> You know what would be a low-hanging fruit that I've been thinking >> would benefit many of my own queries? >> >> "Parallel" sequ

Re: [HACKERS] [RFC] CREATE QUEUE (log-only table) for londiste/pgQ ccompatibility

2012-10-18 Thread Claudio Freire
On Thu, Oct 18, 2012 at 2:33 PM, Josh Berkus wrote: >> I should also add that this is an switchable sync/asynchronous >> transactional queue, whereas LISTEN/NOTIFY is a synchronous >> transactional queue. > > Thanks for explaining. New here, I missed half the conversation, but since it's been bro

[HACKERS] Prefetch index pages for B-Tree index scans

2012-10-18 Thread Claudio Freire
I've noticed, doing some reporting queries once, that index scans fail to saturate server resources on compute-intensive queries. Problem is, just after fetching a page, postgres starts computing stuff before fetching the next. This results in I/O - compute - I/O - compute alternation that results

Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans

2012-10-18 Thread Claudio Freire
On Thu, Oct 18, 2012 at 5:30 PM, Claudio Freire wrote: > Backward: > >Q

Re: [HACKERS] [PATCH] Support for Array ELEMENT Foreign Keys

2012-10-19 Thread Claudio Freire
On Fri, Oct 19, 2012 at 5:48 PM, Tom Lane wrote: > It looks like we could support > > CREATE TABLE t1 (c int[] REFERENCES BY ELEMENT t2); > > but (1) this doesn't seem terribly intelligible to me, and > (2) I don't see how we modify that if we want to provide > at-least-one-match semantics

Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans

2012-10-22 Thread Claudio Freire
On Thu, Oct 18, 2012 at 7:42 PM, Claudio Freire wrote: > Fun. That didn't take long. > > With the attached anti-sequential scan patch, and effective_io_concurrency=8: > > >

Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans

2012-10-23 Thread Claudio Freire
On Tue, Oct 23, 2012 at 9:44 AM, John Lumby wrote: >> From: Claudio Freire >> I hope I'm not talking to myself. > > Indeed not. I also looked into prefetching for pure index scans for > b-trees (and extension to use async io). > http://archives.p

Re: [HACKERS] Logical to physical page mapping

2012-10-27 Thread Claudio Freire
On Sat, Oct 27, 2012 at 3:41 PM, Heikki Linnakangas wrote: > >> I think you're just moving the atomic-write problem from the data pages >> to wherever you keep these pointers. > > > If the pointers are stored as simple 4-byte integers, you probably could > assume that they're atomic, and won't be

Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans

2012-10-29 Thread Claudio Freire
On Tue, Oct 23, 2012 at 10:54 AM, Claudio Freire wrote: >> Indeed not. I also looked into prefetching for pure index scans for >> b-trees (and extension to use async io). >> http://archives.postgresql.org/message-id/BLU0-SMTP31709961D846CCF4F5EB4C2A3930%40phx.gbl >

Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans

2012-10-29 Thread Claudio Freire
On Mon, Oct 29, 2012 at 12:53 PM, Claudio Freire wrote: >> Yes, I've seen that, though I thought it was only an improvement on >> PrefetchBuffer. That patch would interact quite nicely with mine. >> >> I'm now trying to prefetch heap tuples, and I got to a r

Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans

2012-10-29 Thread Claudio Freire
On Mon, Oct 29, 2012 at 4:17 PM, Cédric Villemain wrote: >> Ok, this is the best I could come up with, without some real test hardware. >> >> The only improvement I see in single-disk scenarios: >> * Huge speedup of back-sequential index-only scans >> * Marginal speedup on forward index-only s

Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans

2012-10-29 Thread Claudio Freire
On Mon, Oct 29, 2012 at 7:07 PM, Cédric Villemain wrote: >> But it also looks forgotten. Bringing it back to life would mean >> building the latest kernel with that patch included, replicating the >> benchmarks I ran here, sans pg patch, but with patched kernel, and >> reporting the (hopefully equ

Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans

2012-11-01 Thread Claudio Freire
On Thu, Nov 1, 2012 at 1:37 PM, John Lumby wrote: > > Claudio wrote : >> >> Oops - forgot to effectively attach the patch. >> > > I've read through your patch and the earlier posts by you and Cédric. > > This is very interesting. You chose to prefetch index btree (key-ptr) > pages > whereas

Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans

2012-11-01 Thread Claudio Freire
On Thu, Nov 1, 2012 at 2:00 PM, Andres Freund wrote: >> > I agree. I'm a bit hesitant to subscribe to yet another mailing list >> >> FYI you can send messages to linux-kernel without subscribing (there's >> no moderation either). >> >> Subscribing to linux-kernel is like drinking from a firehose :

Re: [HACKERS] [PATCH] Prefetch index pages for B-Tree index scans

2012-11-01 Thread Claudio Freire
On Thu, Nov 1, 2012 at 10:59 PM, Greg Smith wrote: > On 11/1/12 6:13 PM, Claudio Freire wrote: > >> posix_fadvise what's the trouble there, but the fact that the kernel >> stops doing read-ahead when a call to posix_fadvise comes. I noticed >> the performance hit,

Re: [HACKERS] libpq

2012-11-06 Thread Claudio Freire
On Tue, Nov 6, 2012 at 6:59 PM, Tom Lane wrote: >> If, instead, you are keen on getting the source code for libpq in a >> separate tarball, I'd seriously question why that would be expected to be >> valuable. On most systems, these days, it doesn't take terribly much time >> or space (on our syst

Re: [HACKERS] libpq

2012-11-06 Thread Claudio Freire
On Tue, Nov 6, 2012 at 7:25 PM, Tom Lane wrote: > Claudio Freire writes: >> Maybe anl libs / install-libs makefile target? > >> I've already faced the complicated procedure one has to go through to >> build and install only libpq built from source. > >

Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Claudio Freire
On Thu, May 31, 2012 at 11:17 AM, Robert Klemme wrote: > > OK, my fault was to assume you wanted to measure only your part, while > apparently you meant overall savings.  But Tom had asked for separate > measurements if I understood him correctly.  Also, that measurement of > your change would go

Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Claudio Freire
On Thu, May 31, 2012 at 11:50 AM, Tom Lane wrote: > The performance patches we applied to pg_dump over the past couple weeks > were meant to relieve pain in situations where the big server-side > lossage wasn't the dominant factor in runtime (ie, partial dumps). > But this one is targeting exactly

Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas

2012-05-31 Thread Claudio Freire
On Thu, May 31, 2012 at 12:25 PM, Tom Lane wrote: >> No, Tatsuo's patch attacks a phase dominated by latency in some >> setups. > > No, it does not.  The reason it's a win is that it avoids the O(N^2) > behavior in the server.  Whether the bandwidth savings is worth worrying > about cannot be prov

Re: [HACKERS] What's needed for cache-only table scan?

2013-11-12 Thread Claudio Freire
On Tue, Nov 12, 2013 at 11:45 AM, Kohei KaiGai wrote: > Hello, > > It is a brief design proposal of a feature I'd like to implement on top of > custom-scan APIs. Because it (probably) requires a few additional base > features not only custom-scan, I'd like to see feedback from the hackers. > > The

Re: [HACKERS] Fast insertion indexes: why no developments

2013-11-12 Thread Claudio Freire
On Tue, Nov 12, 2013 at 6:41 PM, Nicolas Barbier wrote: > (Note that K B-trees can be merged by simply scanning all of them > concurrently, and merging them just like a merge sort merges runs. > Also, all B-trees except for the first level (of size S) can be > compacted 100% as there is no need to

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-11-14 Thread Claudio Freire
On Thu, Nov 14, 2013 at 9:09 AM, KONDO Mitsumasa wrote: > I create a patch that is improvement of disk-read and OS file caches. It can > optimize kernel readahead parameter using buffer access strategy and > posix_fadvise() in various disk-read situations. > > In general OS, readahead parameter wa

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-11-14 Thread Claudio Freire
On Thu, Nov 14, 2013 at 11:13 PM, KONDO Mitsumasa wrote: > Hi Claudio, > > > (2013/11/14 22:53), Claudio Freire wrote: >> >> On Thu, Nov 14, 2013 at 9:09 AM, KONDO Mitsumasa >> wrote: >>> >>> I create a patch that is improvement of disk-read an

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-11-17 Thread Claudio Freire
On Sun, Nov 17, 2013 at 11:02 PM, KONDO Mitsumasa wrote: >>> However, my patch is on the way and needed to more improvement. I am >>> going >>> to add method of controlling readahead by GUC, for user can freely select >>> readahead parameter in their transactions. >> >> >> Rather, I'd try to avoid

Re: [HACKERS] Why is UPDATE with column-list syntax not implemented

2013-11-21 Thread Claudio Freire
On Thu, Nov 21, 2013 at 3:50 PM, David Johnston wrote: >> Why is this not implemented? Is it considered inconvenient to use, or >> difficult to implement. or not important enough, or some other reason? > > I cannot answer why but I too would like to see this. I actually asked this > a long while

Re: [HACKERS] Can we trust fsync?

2013-11-22 Thread Claudio Freire
On Fri, Nov 22, 2013 at 1:16 PM, Tom Lane wrote: >> The original mail was referencing a problem with syncing *meta* data >> though. The semantics around meta data syncs are much less clearly >> specified, in part because file systems traditionally made nearly all meta >> data operations synchronou

Re: [HACKERS] Why is UPDATE with column-list syntax not implemented

2013-11-22 Thread Claudio Freire
On Fri, Nov 22, 2013 at 6:36 PM, AK wrote: > Claudio, > > Can you elaborate how rules can help? Well... that specific example: > UPDATE accounts SET (contact_last_name, contact_first_name) = > (SELECT last_name, first_name FROM salesmen > WHERE salesmen.id = accounts.sales_id); Can be

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-03 Thread Claudio Freire
On Wed, Dec 4, 2013 at 12:57 AM, Amit Kapila wrote: >> As a quick side, we also repeated the same experiment on an EC2 instance >> with 16 CPU cores, and found that the scale out behavior became worse there. >> (We also tried increasing the shared_buffers to 30 GB. This change >> completely solved

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread Claudio Freire
On Wed, Dec 4, 2013 at 4:28 AM, Tatsuo Ishii wrote: >>> Can we avoid the Linux kernel problem by simply increasing our shared >>> buffer size, say up to 80% of memory? >> It will be swap more easier. > > Is that the case? If the system has not enough memory, the kernel > buffer will be used for ot

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Claudio Freire
On Wed, Dec 4, 2013 at 9:19 AM, Metin Doslu wrote: > > Here are the results of "vmstat 1" while running 8 parallel TPC-H Simple > (#6) queries: Although there is no need for I/O, "wa" fluctuates between 0 > and 1. > > procs ---memory-- ---swap-- -io --system-- > -cpu--

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-04 Thread Claudio Freire
On Wed, Dec 4, 2013 at 1:54 PM, Andres Freund wrote: > On 2013-12-04 18:43:35 +0200, Metin Doslu wrote: >> > I'd strongly suggest doing a "perf record -g -a ; >> > perf report" run to check what's eating up the time. >> >> Here is one example: >> >> + 38.87% swapper [kernel.kallsyms] [k] hyp

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread Claudio Freire
On Thu, Dec 5, 2013 at 11:42 AM, Greg Stark wrote: > (b) is the way more interesting research project though. I don't think > anyone's tried it and the kernel interface to provide the kinds of > information Postgres needs requires a lot of thought. If it's done > right then Postgres wouldn't need

Re: [HACKERS] Parallel Select query performance and shared buffers

2013-12-05 Thread Claudio Freire
On Thu, Dec 5, 2013 at 1:03 PM, Metin Doslu wrote: >> From what I've seen so far the bigger problem than contention in the >> lwlocks itself, is the spinlock protecting the lwlocks... > > Postgres 9.3.1 also reports spindelay, it seems that there is no contention > on spinlocks. Did you check hu

Re: [HACKERS] ANALYZE sampling is too good

2013-12-05 Thread Claudio Freire
On Tue, Dec 3, 2013 at 8:30 PM, Greg Stark wrote: > Worse, my experience with the posix_fadvise benchmarking is that on > spinning media reading one out of every 16 blocks takes about the same > time as reading them all. Presumably this is because the seek time > between tracks dominates and readi

Re: [HACKERS] ANALYZE sampling is too good

2013-12-09 Thread Claudio Freire
On Mon, Dec 9, 2013 at 6:47 PM, Heikki Linnakangas wrote: > On 12/09/2013 11:35 PM, Jim Nasby wrote: >> >> On 12/8/13 1:49 PM, Heikki Linnakangas wrote: >>> >>> On 12/08/2013 08:14 PM, Greg Stark wrote: The whole accounts table is 1.2GB and contains 10 million rows. As expected with

Re: [HACKERS] ANALYZE sampling is too good

2013-12-09 Thread Claudio Freire
On Mon, Dec 9, 2013 at 8:14 PM, Heikki Linnakangas wrote: > On 12/09/2013 11:56 PM, Claudio Freire wrote: >> Without patches to the kernel, it is much better. >> >> posix_fadvise interferes with read-ahead, so posix_fadvise on, say, >> bitmap heap scans (or similarly s

Re: [HACKERS] ANALYZE sampling is too good

2013-12-09 Thread Claudio Freire
On Mon, Dec 9, 2013 at 8:45 PM, Heikki Linnakangas wrote: > Claudio Freire wrote: >>On Mon, Dec 9, 2013 at 8:14 PM, Heikki Linnakangas >> wrote: >>> I took a stab at using posix_fadvise() in ANALYZE. It turned out to >>be very >>> easy, patch attached.

Re: [HACKERS] ANALYZE sampling is too good

2013-12-09 Thread Claudio Freire
On Tue, Dec 10, 2013 at 12:13 AM, Mark Kirkwood wrote: > Just one more... > > The Intel 520 with ext4: > > > Without patch: ANALYZE pgbench_accounts 5s > With patch: ANALYZE pgbench_accounts 1s > > And double checking - > With patch, but effective_io_concurrency = 1: ANALYZE pgbench_accounts 5s

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-12-10 Thread Claudio Freire
On Tue, Dec 10, 2013 at 5:03 AM, KONDO Mitsumasa wrote: > I revise this patch and re-run performance test, it can work correctly in > Linux and no compile warnings. I add GUC about enable_kernel_readahead > option in new version. When this GUC is on (default), it works in > POSIX_FADV_NORMAL which

Re: [HACKERS] ANALYZE sampling is too good

2013-12-10 Thread Claudio Freire
On Tue, Dec 10, 2013 at 11:02 AM, Greg Stark wrote: > > On 10 Dec 2013 08:28, "Albe Laurenz" wrote: >> >> >> Doesn't all that assume a normally distributed random variable? > > I don't think so because of the law of large numbers. If you have a large > population and sample it the sample behaves

Re: [HACKERS] ANALYZE sampling is too good

2013-12-10 Thread Claudio Freire
On Tue, Dec 10, 2013 at 11:32 AM, Claudio Freire wrote: > On Tue, Dec 10, 2013 at 11:02 AM, Greg Stark wrote: >> >> On 10 Dec 2013 08:28, "Albe Laurenz" wrote: >>> >>> >>> Doesn't all that assume a normally distributed random variable?

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-10 Thread Claudio Freire
re than a few programs that want I/O performance, and in its current form is sub-optimal, the fix is rather simple, it just needs a lot of testing. But my report on LKML[0] spurred little actual work. So it's possible this kind of thing will need patches attached. On Tue, Dec 10, 2013 at 9:34

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-10 Thread Claudio Freire
On Tue, Dec 10, 2013 at 11:33 PM, Jeff Janes wrote: > On Tuesday, December 10, 2013, Tom Lane wrote: >> >> Jeff Janes writes: >> > On Tue, Dec 3, 2013 at 11:39 PM, Claudio Freire >> > wrote: >> >> Problem is, Postgres relies on a working kernel cache

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-12-11 Thread Claudio Freire
On Wed, Dec 11, 2013 at 3:14 AM, KONDO Mitsumasa wrote: > >> enable_readahead=os|fadvise >> >> with os = on, fadvise = off > > Hmm. fadvise is method and is not a purpose. So I consider another idea of > this GUC. Yeah, I was thinking of opening the door for readahead=aio, but whatever clearer t

Re: [HACKERS] ANALYZE sampling is too good

2013-12-12 Thread Claudio Freire
On Thu, Dec 12, 2013 at 3:29 PM, Tom Lane wrote: > Jeff Janes writes: >> It would be relatively easy to fix this if we trusted the number of visible >> rows in each block to be fairly constant. But without that assumption, I >> don't see a way to fix the sample selection process without reading

Re: [HACKERS] ANALYZE sampling is too good

2013-12-12 Thread Claudio Freire
On Thu, Dec 12, 2013 at 3:56 PM, Josh Berkus wrote: > > Estimated grouping should, however, affect MCVs. In cases where we > estimate that grouping levels are high, the expected % of observed > values should be "discounted" somehow. That is, with total random > distribution you have a 1:1 ratio

Re: [HACKERS] ANALYZE sampling is too good

2013-12-12 Thread Claudio Freire
On Thu, Dec 12, 2013 at 4:13 PM, Jeff Janes wrote: >> Well, why not take a supersample containing all visible tuples from N >> selected blocks, and do bootstrapping over it, with subsamples of M >> independent rows each? > > > Bootstrapping methods generally do not work well when ties are signific

Re: [HACKERS] RFC: Async query processing

2014-01-02 Thread Claudio Freire
On Wed, Dec 18, 2013 at 1:50 PM, Florian Weimer wrote: > On 11/04/2013 02:51 AM, Claudio Freire wrote: >> >> On Sun, Nov 3, 2013 at 3:58 PM, Florian Weimer wrote: >>> >>> I would like to add truly asynchronous query processing to libpq, >>> enabling

Re: [HACKERS] RFC: Async query processing

2014-01-03 Thread Claudio Freire
On Fri, Jan 3, 2014 at 10:22 AM, Florian Weimer wrote: > On 01/02/2014 07:52 PM, Claudio Freire wrote: > >>> No, because this doesn't scale automatically with the bandwidth-delay >>> product. It also requires that the client buffers queries and their >>> par

Re: [HACKERS] RFC: Async query processing

2014-01-03 Thread Claudio Freire
On Fri, Jan 3, 2014 at 12:20 PM, Tom Lane wrote: > Claudio Freire writes: >> On Fri, Jan 3, 2014 at 10:22 AM, Florian Weimer wrote: >>> Loading data into the database isn't such an uncommon task. Not everything >>> is OLTP. > >> Truly, but a sustained

Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

2014-01-09 Thread Claudio Freire
On Thu, Jan 9, 2014 at 2:22 PM, Robert Haas wrote: > It would be nice to have better operating system support for this. > For example, IIUC, 64-bit Linux has 128TB of address space available > for user processes. When you clone(), it can either share the entire > address space (i.e. it's a thread

Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

2014-01-09 Thread Claudio Freire
On Thu, Jan 9, 2014 at 4:24 PM, knizhnik wrote: > On 01/09/2014 09:46 PM, Claudio Freire wrote: >> >> On Thu, Jan 9, 2014 at 2:22 PM, Robert Haas wrote: >>> >>> It would be nice to have better operating system support for this. >>> For example, IIU

Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

2014-01-09 Thread Claudio Freire
On Thu, Jan 9, 2014 at 4:39 PM, knizhnik wrote: >> At fork time I only wrote about reserving the address space. After >> reserving it, all you have to do is implement an allocator that works >> in shared memory (protected by a lwlock of course). >> >> In essence, a hypothetical pg_dsm_alloc(region

Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

2014-01-09 Thread Claudio Freire
On Thu, Jan 9, 2014 at 4:48 PM, Claudio Freire wrote: > On Thu, Jan 9, 2014 at 4:39 PM, knizhnik wrote: >>> At fork time I only wrote about reserving the address space. After >>> reserving it, all you have to do is implement an allocator that works >>> in shared me

Re: [HACKERS] [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

2014-01-10 Thread Claudio Freire
On Fri, Jan 10, 2014 at 3:23 PM, Robert Haas wrote: > On Thu, Jan 9, 2014 at 12:46 PM, Claudio Freire > wrote: >> On Thu, Jan 9, 2014 at 2:22 PM, Robert Haas wrote: >>> It would be nice to have better operating system support for this. >>> For example, IIUC, 64-

Re: [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-13 Thread Claudio Freire
On Mon, Jan 13, 2014 at 5:15 PM, Robert Haas wrote: > On a related note, there's also the problem of double-buffering. When > we read a page into shared_buffers, we leave a copy behind in the OS > buffers, and similarly on write-out. It's very unclear what to do > about this, since the kernel an

Re: [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-13 Thread Claudio Freire
On Mon, Jan 13, 2014 at 5:23 PM, Jim Nasby wrote: > On 1/13/14, 2:19 PM, Claudio Freire wrote: >> >> On Mon, Jan 13, 2014 at 5:15 PM, Robert Haas >> wrote: >>> >>> On a related note, there's also the problem of double-buffering. When >>>

Re: [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-13 Thread Claudio Freire
On Mon, Jan 13, 2014 at 5:32 PM, Jim Nasby wrote: >> >> That's my point. In terms of kernel-postgres interaction, it's fairly >> simple. >> >> What's not so simple, is figuring out what policy to use. Remember, >> you cannot tell the kernel to put some page in its page cache without >> reading it

Re: [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-13 Thread Claudio Freire
On Mon, Jan 13, 2014 at 7:36 PM, Mel Gorman wrote: > That could be something we look at. There are cases buried deep in the > VM where pages get shuffled to the end of the LRU and get tagged for > reclaim as soon as possible. Maybe you need access to something like > that via posix_fadvise to say

Re: [HACKERS] Block level parallel vacuum WIP

2017-01-10 Thread Claudio Freire
On Tue, Jan 10, 2017 at 6:42 AM, Masahiko Sawada wrote: > Attached result of performance test with scale factor = 500 and the > test script I used. I measured each test at four times and plot > average of last three execution times to sf_500.png file. When table > has index, vacuum execution time

Re: [HACKERS] Block level parallel vacuum WIP

2017-01-10 Thread Claudio Freire
On Tue, Jan 10, 2017 at 6:42 AM, Masahiko Sawada wrote: >> Does this work negate the other work to allow VACUUM to use > >> 1GB memory? > > Partly yes. Because memory space for dead TIDs needs to be allocated > in DSM before vacuum worker launches, parallel lazy vacuum cannot use > such a variable

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2017-01-19 Thread Claudio Freire
On Thu, Jan 19, 2017 at 6:33 AM, Anastasia Lubennikova wrote: > 28.12.2016 23:43, Claudio Freire: > > Attached v4 patches with the requested fixes. > > > Sorry for being late, but the tests took a lot of time. I know. Takes me several days to run my test scripts once.

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2017-01-23 Thread Claudio Freire
On Fri, Jan 20, 2017 at 6:24 AM, Masahiko Sawada wrote: > On Thu, Jan 19, 2017 at 8:31 PM, Claudio Freire > wrote: >> On Thu, Jan 19, 2017 at 6:33 AM, Anastasia Lubennikova >> wrote: >>> 28.12.2016 23:43, Claudio Freire: >>> >>> Attached v4 patche

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2017-01-25 Thread Claudio Freire
always set later I think the first line starting with > "seg = ..." is not necessary. Thought? That's correct. Attached a v6 with those changes (and rebased). Make check still passes. From c89019089a71517befac0920f22ed75577dda6c6 Mon Sep 17 00:00:00 2001 From: Claudio Freire

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2017-01-30 Thread Claudio Freire
ug in lazy_clear_dead_tuples, so clearly it's not without merit. I'll rearrange the comments as you ask though. Updated and rebased v7 attached. [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=776671 From d32610b0ad6b9413aa4b2d808019d3c67d578f3c Mon Sep 17 00:00:00 2001 From:

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2017-02-01 Thread Claudio Freire
On Wed, Feb 1, 2017 at 5:47 PM, Masahiko Sawada wrote: > Thank you for updating the patch. > > Whole patch looks good to me except for the following one comment. > This is the final comment from me. > > /* > * lazy_tid_reaped() -- is a particular tid deletable? > * > * This has the right

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2017-02-01 Thread Claudio Freire
On Wed, Feb 1, 2017 at 6:13 PM, Masahiko Sawada wrote: > On Wed, Feb 1, 2017 at 10:02 PM, Claudio Freire > wrote: >> On Wed, Feb 1, 2017 at 5:47 PM, Masahiko Sawada >> wrote: >>> Thank you for updating the patch. >>> >>> Whole patch looks go

Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

2017-02-03 Thread Claudio Freire
On Mon, Oct 31, 2016 at 11:33 AM, Kouhei Kaigai wrote: > Hello, > > The attached patch implements the suggestion by Amit before. > > What I'm motivated is to collect extra run-time statistics specific > to a particular ForeignScan/CustomScan, not only the standard > Instrumentation; like DMA trans

Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

2017-02-05 Thread Claudio Freire
On Sun, Feb 5, 2017 at 9:19 PM, Kouhei Kaigai wrote: >> If the use case for this is to gather instrumentation, I'd suggest calling >> the hook in RetrieveInstrumentation, and calling it appropriately. It would >> make the intended use far clearer than it is now. >> >> And if it saves some work, al

Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

2017-02-05 Thread Claudio Freire
On Mon, Feb 6, 2017 at 1:00 AM, Claudio Freire wrote: > On Sun, Feb 5, 2017 at 9:19 PM, Kouhei Kaigai wrote: >>> If the use case for this is to gather instrumentation, I'd suggest calling >>> the hook in RetrieveInstrumentation, and calling it appropriately. It would

Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

2017-02-05 Thread Claudio Freire
On Mon, Feb 6, 2017 at 1:42 AM, Kouhei Kaigai wrote: > I also had thought an idea to have extra space to Instrumentation structure, > however, it needs to make Instrumentation flexible-length structure according > to the custom format by CSP/FDW. Likely, it is not a good design. > As long as exten

Re: [HACKERS] Improve OR conditions on joined columns (common star schema problem)

2017-02-09 Thread Claudio Freire
On Thu, Feb 9, 2017 at 9:50 PM, Jim Nasby wrote: > WHERE t1 IN ('a','b') OR t2 IN ('c','d') > > into > > WHERE f1 IN (1,2) OR f2 IN (3,4) > > (assuming a,b,c,d maps to 1,2,3,4) > > BTW, there's an important caveat here: users generally do NOT want duplicate > rows from the fact table if the dimens

Re: [HACKERS] Sum aggregate calculation for single precsion real

2017-02-15 Thread Claudio Freire
On Wed, Feb 15, 2017 at 9:52 AM, Robert Haas wrote: > On Tue, Feb 14, 2017 at 11:45 PM, Tom Lane wrote: >> You could perhaps make an argument that sum(float4) would have less risk >> of overflow if it accumulated in and returned float8, but frankly that >> seems a bit thin. > > I think that's mor
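The quoted point is about overflow, but the choice of accumulator width also affects precision, which is the topic in the thread subject. The sketch below shows the general floating-point effect only; it is not PostgreSQL's aggregate code, and the function names sum_repeated_float and sum_repeated_double are made up for the example. The float accumulator is declared volatile so that each partial sum is really rounded to single precision, even on platforms that evaluate in wider registers.

```c
#include <stddef.h>

/* Add x to a single-precision accumulator n times. */
float sum_repeated_float(float x, size_t n)
{
    volatile float acc = 0.0f;  /* volatile forces rounding to float */

    for (size_t i = 0; i < n; i++)
        acc += x;
    return acc;
}

/* Add x to a double-precision accumulator n times. */
double sum_repeated_double(float x, size_t n)
{
    double acc = 0.0;

    for (size_t i = 0; i < n; i++)
        acc += (double) x;
    return acc;
}
```

With x = 1.0f and n = 20000000, the single-precision sum stops growing at 16777216 (2^24), because 16777217 is not representable as a float and the addition rounds back to 16777216. The double-precision sum reaches 20000000 exactly.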

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-12-22 Thread Claudio Freire
On Thu, Dec 22, 2016 at 12:15 PM, Anastasia Lubennikova wrote: > The following review has been posted through the commitfest application: > make installcheck-world: tested, failed > Implements feature: not tested > Spec compliant: not tested > Documentation: not tested

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-12-22 Thread Claudio Freire
On Thu, Dec 22, 2016 at 12:22 PM, Claudio Freire wrote: > On Thu, Dec 22, 2016 at 12:15 PM, Anastasia Lubennikova > wrote: >> The following review has been posted through the commitfest application: >> make installcheck-world: tested, failed >> Implements feature:

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-12-23 Thread Claudio Freire
On Fri, Dec 23, 2016 at 1:39 PM, Anastasia Lubennikova wrote: >> On Thu, Dec 22, 2016 at 12:22 PM, Claudio Freire >> wrote: >>> >>> On Thu, Dec 22, 2016 at 12:15 PM, Anastasia Lubennikova >>> wrote: >>>> >>>> The following revie

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-12-27 Thread Claudio Freire
ather than global index). On Tue, Dec 27, 2016 at 10:41 AM, Anastasia Lubennikova wrote: > 23.12.2016 22:54, Claudio Freire: > > On Fri, Dec 23, 2016 at 1:39 PM, Anastasia Lubennikova > wrote: > > I found the reason. I configure postgres with CFLAGS="-O0" and it causes

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-12-27 Thread Claudio Freire
On Tue, Dec 27, 2016 at 10:41 AM, Anastasia Lubennikova wrote: > Program terminated with signal SIGSEGV, Segmentation fault. > #0 0x006941e7 in lazy_vacuum_heap (onerel=0x1ec2360, > vacrelstats=0x1ef6e00) at vacuumlazy.c:1417 > 1417 tblk = > ItemPointerGetBlockNumber(&seg->

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-12-28 Thread Claudio Freire
On Wed, Dec 28, 2016 at 10:26 AM, Anastasia Lubennikova wrote: > 27.12.2016 20:14, Claudio Freire: > > On Tue, Dec 27, 2016 at 10:41 AM, Anastasia Lubennikova > wrote: > > Program terminated with signal SIGSEGV, Segmentation fault. > #0 0x006941e7 in lazy_vacuum_h

Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2016-12-28 Thread Claudio Freire
On Wed, Dec 28, 2016 at 3:41 PM, Claudio Freire wrote: >> Anyway, I found the problem that had caused segfault. >> >> for (segindex = 0; segindex <= vacrelstats->dead_tuples.last_seg; tupindex = >> 0, segindex++) >> { >> DeadTuplesSegment *seg =

Re: [HACKERS] Block level parallel vacuum WIP

2017-01-06 Thread Claudio Freire
On Fri, Jan 6, 2017 at 2:38 PM, Masahiko Sawada wrote:
>  table_size | indexes | parallel_degree |   time
> ------------+---------+-----------------+----------
>  6.5GB      |       0 |               1 | 00:00:14
>  6.5GB      |       0 |               2 | 00:00:02
>  6.5GB      |       0 |

Re: [HACKERS] reducing our reliance on MD5

2015-02-10 Thread Claudio Freire
On Tue, Feb 10, 2015 at 10:19 PM, Peter Geoghegan wrote: > On Tue, Feb 10, 2015 at 5:14 PM, Arthur Silva wrote: >> I don't think the "password storing best practices" apply to db connection >> authentication. > > Why not? Usually because handshakes use a random salt on both sides. Not sure abou

Re: [HACKERS] reducing our reliance on MD5

2015-02-11 Thread Claudio Freire
On Wed, Feb 11, 2015 at 10:31 AM, Magnus Hagander wrote: > On Wed, Feb 11, 2015 at 4:57 AM, Tom Lane wrote: >> >> Robert Haas writes: >> > On Tue, Feb 10, 2015 at 9:30 PM, Tom Lane wrote: >> >> Another thing we need to keep in mind besides client compatibility >> >> is dump/reload compatibility

Re: [HACKERS] reducing our reliance on MD5

2015-02-11 Thread Claudio Freire
On Wed, Feb 11, 2015 at 11:48 AM, José Luis Tallón wrote: > On 02/11/2015 03:39 PM, Claudio Freire wrote: >> >> [snip] >> Seems the risk of someone either lifting pg_authid from disk or by hacking >> the system and being postgres, thereby accessing passwords stor
