Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-14 Thread AP
On Fri, Aug 11, 2017 at 07:33:51AM +0530, Amit Kapila wrote: > On Thu, Aug 10, 2017 at 4:11 PM, AP wrote: > > mdstash=# select * from pgstathashindex('link_datum_id_idx'); > > version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | > > live_items | dead_items | free_percent > >

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-13 Thread Amit Kapila
On Mon, Aug 14, 2017 at 6:10 AM, AP wrote: > On Fri, Aug 11, 2017 at 07:33:51AM +0530, Amit Kapila wrote: >> On Thu, Aug 10, 2017 at 4:11 PM, AP wrote: >> > mdstash=# select * from pgstathashindex('link_datum_id_idx'); >> > version | bucket_pages | overflow_pages | bitmap_pages | unused_pages |

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-10 Thread Amit Kapila
On Fri, Aug 11, 2017 at 5:01 AM, AP wrote: > On Thu, Aug 10, 2017 at 01:12:25PM -0400, Robert Haas wrote: >> On Thu, Aug 10, 2017 at 6:41 AM, AP wrote: >> > The index is 135GB rather than 900GB (from memory/give or take). >> >> Whoa. Big improvement. > > > As an aside, btree for the above is aro

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-10 Thread Amit Kapila
On Thu, Aug 10, 2017 at 4:11 PM, AP wrote: > On Sun, Aug 06, 2017 at 04:32:29PM +1000, AP wrote: >> On Sat, Aug 05, 2017 at 04:41:24PM +0530, Amit Kapila wrote: >> > > (On another note, I committed these patches.) >> > >> > Thanks. >> >> Seconded. :) >> >> Now uploading data with fillfactor of 90.

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-10 Thread AP
On Thu, Aug 10, 2017 at 01:12:25PM -0400, Robert Haas wrote: > On Thu, Aug 10, 2017 at 6:41 AM, AP wrote: > > The index is 135GB rather than 900GB (from memory/give or take). > > Whoa. Big improvement. Not a good direct comparison in general but it fits my workload. The 900GB was fillfactor 10

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-10 Thread Robert Haas
On Thu, Aug 10, 2017 at 6:41 AM, AP wrote: > The index is 135GB rather than 900GB (from memory/give or take). Whoa. Big improvement. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.or

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-10 Thread AP
On Sun, Aug 06, 2017 at 04:32:29PM +1000, AP wrote: > On Sat, Aug 05, 2017 at 04:41:24PM +0530, Amit Kapila wrote: > > > (On another note, I committed these patches.) > > > > Thanks. > > Seconded. :) > > Now uploading data with fillfactor of 90. I'll know in 2-3 days > if the new patches are suc

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-07 Thread AP
On Sat, Aug 05, 2017 at 04:41:24PM +0530, Amit Kapila wrote: > > (On another note, I committed these patches.) > > Thanks. Seconded. :) Now uploading data with fillfactor of 90. I'll know in 2-3 days if the new patches are successful (earlier if they did not help). I compiled (as apt.postgresql

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-05 Thread Amit Kapila
On Sat, Aug 5, 2017 at 7:50 AM, Robert Haas wrote: > On Fri, Aug 4, 2017 at 2:45 PM, Amit Kapila wrote: >> I have not done anything for this comment as it doesn't sound wrong to >> me. I think it is not making much sense in the current code and we >> can remove it or change it as part of the sep

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-04 Thread Robert Haas
On Fri, Aug 4, 2017 at 2:45 PM, Amit Kapila wrote: > I have not done anything for this comment as it doesn't sound wrong to > me. I think it is not making much sense in the current code and we > can remove it or change it as part of the separate patch if you or > others think so. I don't get it.

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-04 Thread Robert Haas
On Fri, Aug 4, 2017 at 2:49 PM, Amit Kapila wrote: > On Fri, Aug 4, 2017 at 10:59 PM, Robert Haas wrote: >> On Fri, Aug 4, 2017 at 6:22 AM, Amit Kapila wrote: >>> I have increased the number of hash bitmap pages as a separate patch. >>> I am not completely sure if it is a good idea to directly i

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-04 Thread Amit Kapila
On Fri, Aug 4, 2017 at 10:59 PM, Robert Haas wrote: > On Fri, Aug 4, 2017 at 6:22 AM, Amit Kapila wrote: >> I have increased the number of hash bitmap pages as a separate patch. >> I am not completely sure if it is a good idea to directly increase it >> to 1024 as that will increase the size of h

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-04 Thread Amit Kapila
On Fri, Aug 4, 2017 at 11:45 PM, Amit Kapila wrote: > On Fri, Aug 4, 2017 at 11:15 PM, Robert Haas wrote: >> On Fri, Aug 4, 2017 at 6:22 AM, Amit Kapila wrote: >>> I have implemented the patch with this approach as other approach >>> require quite extensive changes which I am not sure is the rig

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-04 Thread Amit Kapila
On Fri, Aug 4, 2017 at 11:15 PM, Robert Haas wrote: > On Fri, Aug 4, 2017 at 6:22 AM, Amit Kapila wrote: >> I have implemented the patch with this approach as other approach >> require quite extensive changes which I am not sure is the right thing >> to do at this stage. > > I think this approach

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-04 Thread Alvaro Herrera
Robert Haas wrote: > I think this approach is actually better anyway. There's no guarantee > that VACUUM can be responsive enough to get the job done in time, work > items or no work items, Yeah, autovacuum work items don't have a guaranteed response time. They're okay for things that "ought to

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-04 Thread Robert Haas
On Fri, Aug 4, 2017 at 6:22 AM, Amit Kapila wrote: > I have implemented the patch with this approach as other approach > require quite extensive changes which I am not sure is the right thing > to do at this stage. I think this approach is actually better anyway. There's no guarantee that VACUUM

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-04 Thread Robert Haas
On Fri, Aug 4, 2017 at 6:22 AM, Amit Kapila wrote: > I have increased the number of hash bitmap pages as a separate patch. > I am not completely sure if it is a good idea to directly increase it > to 1024 as that will increase the size of hashmetapagedata from 960 > bytes to 4544 bytes. Shall we

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-04 Thread Amit Kapila
On Fri, Aug 4, 2017 at 9:19 AM, AP wrote: > On Fri, Aug 04, 2017 at 08:21:01AM +0530, Amit Kapila wrote: >> Note - AP has off list shared the data dump and we (Ashutosh Sharma >> and me) are able to reproduce the problem and we could see that if we >> force vacuum via the debugger, then it is able

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-04 Thread AP
On Fri, Aug 04, 2017 at 08:21:01AM +0530, Amit Kapila wrote: > Note - AP has off list shared the data dump and we (Ashutosh Sharma > and me) are able to reproduce the problem and we could see that if we > force vacuum via the debugger, then it is able to free overflow pages. > The exact numbers are

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-04 Thread Amit Kapila
On Fri, Aug 4, 2017 at 8:21 AM, Amit Kapila wrote: > On Wed, Aug 2, 2017 at 9:04 PM, Robert Haas wrote: >> On Wed, Jul 12, 2017 at 1:10 AM, Amit Kapila wrote: > Yes, I also think the same idea can be used, in fact, I have mentioned > it [1] as soon as you have committed that patch. Do w

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-03 Thread Amit Kapila
On Wed, Aug 2, 2017 at 9:04 PM, Robert Haas wrote: > On Wed, Jul 12, 2017 at 1:10 AM, Amit Kapila wrote: Yes, I also think the same idea can be used, in fact, I have mentioned it [1] as soon as you have committed that patch. Do we want to do anything at this stage for PG-10? I do

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-03 Thread AP
On Wed, Aug 02, 2017 at 11:34:13AM -0400, Robert Haas wrote: > On Wed, Jul 12, 2017 at 1:10 AM, Amit Kapila wrote: > > It seems so. Basically, in the case of a large number of duplicates, > > we hit the maximum number of overflow pages. There is a theoretical > > possibility of hitting it but it

Re: [HACKERS] pgsql 10: hash indexes testing

2017-08-02 Thread Robert Haas
On Wed, Jul 12, 2017 at 1:10 AM, Amit Kapila wrote: >>> Yes, I also think the same idea can be used, in fact, I have mentioned >>> it [1] as soon as you have committed that patch. Do we want to do >>> anything at this stage for PG-10? I don't think we should attempt >>> something this late unles

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-11 Thread Amit Kapila
On Tue, Jul 11, 2017 at 11:08 PM, Alvaro Herrera wrote: > Amit Kapila wrote: > >> Yes, I also think the same idea can be used, in fact, I have mentioned >> it [1] as soon as you have committed that patch. Do we want to do >> anything at this stage for PG-10? I don't think we should attempt >> so

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-11 Thread Alvaro Herrera
Amit Kapila wrote: > Yes, I also think the same idea can be used, in fact, I have mentioned > it [1] as soon as you have committed that patch. Do we want to do > anything at this stage for PG-10? I don't think we should attempt > something this late unless people feel this is a show-stopper issu

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-11 Thread Amit Kapila
On Tue, Jul 11, 2017 at 8:10 AM, Alvaro Herrera wrote: > Amit Kapila wrote: >> On Tue, Jul 11, 2017 at 6:51 AM, AP wrote: >> > On Fri, Jul 07, 2017 at 05:58:25PM +0530, Amit Kapila wrote: > >> >> I can understand your concerns. To address first concern we need to >> >> work on one or more of fol

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-11 Thread AP
On Fri, Jul 07, 2017 at 05:58:25PM +0530, Amit Kapila wrote: > On Fri, Jul 7, 2017 at 8:22 AM, AP wrote: > > On Thu, Jul 06, 2017 at 05:19:59PM +0530, Amit Kapila wrote: > >> I think if you are under development, it is always advisable to create > >> indexes after initial bulk load. That way it w

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-10 Thread Alvaro Herrera
Amit Kapila wrote: > On Tue, Jul 11, 2017 at 6:51 AM, AP wrote: > > On Fri, Jul 07, 2017 at 05:58:25PM +0530, Amit Kapila wrote: > >> I can understand your concerns. To address first concern we need to > >> work on one or more of following work items: (a) work on vacuums that > >> can be trigger

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-10 Thread Amit Kapila
On Tue, Jul 11, 2017 at 6:51 AM, AP wrote: > On Fri, Jul 07, 2017 at 05:58:25PM +0530, Amit Kapila wrote: >> On Fri, Jul 7, 2017 at 8:22 AM, AP wrote: >> > On Thu, Jul 06, 2017 at 05:19:59PM +0530, Amit Kapila wrote: >> >> I think if you are under development, it is always advisable to create >>

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-07 Thread Amit Kapila
On Fri, Jul 7, 2017 at 8:22 AM, AP wrote: > On Thu, Jul 06, 2017 at 05:19:59PM +0530, Amit Kapila wrote: >> I think if you are under development, it is always advisable to create >> indexes after initial bulk load. That way it will be faster and will >> take lesser space atleast in case of hash i

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-06 Thread AP
On Thu, Jul 06, 2017 at 05:19:59PM +0530, Amit Kapila wrote: > I think if you are under development, it is always advisable to create > indexes after initial bulk load. That way it will be faster and will > take lesser space atleast in case of hash index. This is a bit of a pickle, actually: * if

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-06 Thread AP
On Wed, Jul 05, 2017 at 07:31:39PM +1000, AP wrote: > On Tue, Jul 04, 2017 at 08:23:20PM -0700, Jeff Janes wrote: > > On Tue, Jul 4, 2017 at 3:57 AM, AP wrote: > > > The data being indexed is BYTEA, (quasi)random and 64 bytes in size. > > > The table has over 2 billion entries. The data is not uni

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-06 Thread AP
On Thu, Jul 06, 2017 at 12:38:38PM +0530, Amit Kapila wrote: > On Thu, Jul 6, 2017 at 9:32 AM, AP wrote: > > On Thu, Jul 06, 2017 at 08:52:03AM +0530, Amit Kapila wrote: > >> On Thu, Jul 6, 2017 at 2:40 AM, AP wrote: > >> > On Wed, Jul 05, 2017 at 05:52:32PM +0530, Amit Kapila wrote: > >> >> >> >

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-06 Thread AP
On Thu, Jul 06, 2017 at 08:52:03AM +0530, Amit Kapila wrote: > On Thu, Jul 6, 2017 at 2:40 AM, AP wrote: > > On Wed, Jul 05, 2017 at 05:52:32PM +0530, Amit Kapila wrote: > >> >> > version | bucket_pages | overflow_pages | bitmap_pages | > >> >> > unused_pages | live_items | dead_items | free_p

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-06 Thread Amit Kapila
On Thu, Jul 6, 2017 at 5:04 PM, AP wrote: > On Thu, Jul 06, 2017 at 12:38:38PM +0530, Amit Kapila wrote: >> On Thu, Jul 6, 2017 at 9:32 AM, AP wrote: >> > On Thu, Jul 06, 2017 at 08:52:03AM +0530, Amit Kapila wrote: >> >> On Thu, Jul 6, 2017 at 2:40 AM, AP wrote: >> >> > On Wed, Jul 05, 2017 at

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-06 Thread Amit Kapila
On Thu, Jul 6, 2017 at 9:32 AM, AP wrote: > On Thu, Jul 06, 2017 at 08:52:03AM +0530, Amit Kapila wrote: >> On Thu, Jul 6, 2017 at 2:40 AM, AP wrote: >> > On Wed, Jul 05, 2017 at 05:52:32PM +0530, Amit Kapila wrote: >> >> >> > version | bucket_pages | overflow_pages | bitmap_pages | >> >> >> >

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-05 Thread Amit Kapila
On Thu, Jul 6, 2017 at 2:40 AM, AP wrote: > On Wed, Jul 05, 2017 at 05:52:32PM +0530, Amit Kapila wrote: >> >> > version | bucket_pages | overflow_pages | bitmap_pages | unused_pages >> >> > | live_items | dead_items | free_percent >> >> > -+--++

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-05 Thread AP
On Wed, Jul 05, 2017 at 05:52:32PM +0530, Amit Kapila wrote: > >> > version | bucket_pages | overflow_pages | bitmap_pages | unused_pages | > >> > live_items | dead_items | free_percent > >> > -+--++--+--+++

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-05 Thread AP
On Tue, Jul 04, 2017 at 08:23:20PM -0700, Jeff Janes wrote: > On Tue, Jul 4, 2017 at 3:57 AM, AP wrote: > > The data being indexed is BYTEA, (quasi)random and 64 bytes in size. > > The table has over 2 billion entries. The data is not unique. There's > > an average of 10 duplicates for every uniqu

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-05 Thread AP
On Wed, Jul 05, 2017 at 03:33:45PM +1000, AP wrote: > > Do you have any deletes? How have you verified whether autovacuum has > > No DELETEs. Just the initial COPY, then SELECTs, then a DB rename to get it > out of the way of other testing, then the REINDEX. > > > been triggered or not? > > I j

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-05 Thread AP
On Wed, Jul 05, 2017 at 10:29:09AM +0530, Amit Kapila wrote: > >> bitmappages. Can you try to use pgstattuple extension and get us the > >> results of Select * from pgstathashindex('index_name');? If the > >> number of bitmappages is 128 and total overflow pages are 128 * 4096, > >> then that wou

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-05 Thread Amit Kapila
On Wed, Jul 5, 2017 at 11:03 AM, AP wrote: > On Wed, Jul 05, 2017 at 10:29:09AM +0530, Amit Kapila wrote: >> >> bitmappages. Can you try to use pgstattuple extension and get us the >> >> results of Select * from pgstathashindex('index_name');? If the >> >> number of bitmappages is 128 and total

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-04 Thread Amit Kapila
On Wed, Jul 5, 2017 at 9:53 AM, AP wrote: > On Wed, Jul 05, 2017 at 08:10:10AM +0530, Amit Kapila wrote: >> On Tue, Jul 4, 2017 at 4:27 PM, AP wrote: >> > There is one index that caused an issue. Towards the end of an import >> > I got the following error: >> > >> > out of overflow pages in hash

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-04 Thread AP
On Wed, Jul 05, 2017 at 08:10:10AM +0530, Amit Kapila wrote: > On Tue, Jul 4, 2017 at 4:27 PM, AP wrote: > > There is one index that caused an issue. Towards the end of an import > > I got the following error: > > > > out of overflow pages in hash index > > > > The data being indexed is BYTEA, (qu

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-04 Thread Jeff Janes
On Tue, Jul 4, 2017 at 3:57 AM, AP wrote: > > The data being indexed is BYTEA, (quasi)random and 64 bytes in size. > The table has over 2 billion entries. The data is not unique. There's > an average of 10 duplicates for every unique value. > What is the number of duplicates for the most common

Re: [HACKERS] pgsql 10: hash indexes testing

2017-07-04 Thread Amit Kapila
On Tue, Jul 4, 2017 at 4:27 PM, AP wrote: > Hi, > > As I am actively working on a big project I figured I'd give PGSQL 10 a > go, primarily because of hash indexes. > > PostgreSQL 10 version in use is: > 10~beta2~20170620.2224-1~491.gitd412f79.pgdg+1 > > Things are mostly well with hash indexes (