Re: [HACKERS] GIN pending list clean up exposure to SQL

2016-02-07 Thread Jeff Janes
On Wed, Jan 27, 2016 at 12:54 PM, Julien Rouhaud
 wrote:
> On 27/01/2016 10:27, Fujii Masao wrote:
>>
>> Thanks for updating the patch! It looks good to me.
>>
>> Based on your patch, I just improved the doc. For example, I added
>> the following note into the doc.
>>
>> +These functions cannot be executed during recovery. +Use
>> of these functions is restricted to superusers and the owner +
>> of the given index.
>>
>> If there is no problem, I'm thinking to commit this version.
>>
>
> Just a detail:
>
> +Note that the cleanup does not happen and the return value is 0
> +if the argument is the GIN index built with fastupdate
> +option disabled because it doesn't have a pending list.
>
> It should be "if the argument is *a* GIN index"
>
> I find this sentence a little confusing, maybe rephrase like this would
> be better:
>
> -Note that the cleanup does not happen and the return value is 0
> -if the argument is the GIN index built with fastupdate
> -option disabled because it doesn't have a pending list.
> +Note that if the argument is a GIN index built with
> fastupdate
> +option disabled, the cleanup does not happen and the return value
> is 0
> +because the index doesn't have a pending list.
>
> Otherwise, I don't see any problem on this version.

This is a corner case that probably does not need to be in the docs,
but I wanted to clarify it here in case you disagree: If the index
ever had fastupdate turned on, when it is turned off the index will
keep whatever pending_list it had until something cleans it up one
last time.

Thanks for the review and changes and commit.

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2016-02-07 Thread Fujii Masao
On Mon, Feb 8, 2016 at 9:53 AM, Jeff Janes  wrote:
> On Wed, Jan 27, 2016 at 12:54 PM, Julien Rouhaud
>  wrote:
>> On 27/01/2016 10:27, Fujii Masao wrote:
>>>
>>> Thanks for updating the patch! It looks good to me.
>>>
>>> Based on your patch, I just improved the doc. For example, I added
>>> the following note into the doc.
>>>
>>> +These functions cannot be executed during recovery. +Use
>>> of these functions is restricted to superusers and the owner +
>>> of the given index.
>>>
>>> If there is no problem, I'm thinking to commit this version.
>>>
>>
>> Just a detail:
>>
>> +Note that the cleanup does not happen and the return value is 0
>> +if the argument is the GIN index built with fastupdate
>> +option disabled because it doesn't have a pending list.
>>
>> It should be "if the argument is *a* GIN index"
>>
>> I find this sentence a little confusing, maybe rephrase like this would
>> be better:
>>
>> -Note that the cleanup does not happen and the return value is 0
>> -if the argument is the GIN index built with fastupdate
>> -option disabled because it doesn't have a pending list.
>> +Note that if the argument is a GIN index built with
>> fastupdate
>> +option disabled, the cleanup does not happen and the return value
>> is 0
>> +because the index doesn't have a pending list.
>>
>> Otherwise, I don't see any problem on this version.
>
> This is a corner case that probably does not need to be in the docs,
> but I wanted to clarify it here in case you disagree: If the index
> ever had fastupdate turned on, when it is turned off the index will
> keep whatever pending_list it had until something cleans it up one
> last time.

I agree to add that note to the doc. Or we should remove the above
description that I added?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2016-01-27 Thread Fujii Masao
On Thu, Jan 28, 2016 at 5:54 AM, Julien Rouhaud
 wrote:
> On 27/01/2016 10:27, Fujii Masao wrote:
>> On Mon, Jan 25, 2016 at 3:54 PM, Jeff Janes 
>> wrote:
>>> On Wed, Jan 20, 2016 at 6:17 AM, Fujii Masao
>>>  wrote:
 On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud
  wrote:
> On 15/01/2016 22:59, Jeff Janes wrote:
>> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
>>  wrote:
>
> All looks fine to me, I flag it as ready for committer.

 When I compiled the PostgreSQL with the patch, I got the
 following error. ISTM that the inclusion of pg_am.h header file
 is missing in ginfast.c.
>>>
>>> Thanks.  Fixed.
>>>
 gin_clean_pending_list() must check whether the server is in
 recovery or not. If it's in recovery, the function must exit
 with an error. That is, IMO, something like the following check
 must be added.

 if (RecoveryInProgress()) ereport(ERROR,

 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 errmsg("recovery is in progress"), errhint("GIN pending list
 cannot be cleaned up during recovery.")));

 It's better to make gin_clean_pending_list() check whether the
 target index is temporary index of other sessions or not, like
 pgstatginindex() does.
>>>
>>> I've added both of these checks.  Sorry I overlooked your early
>>> email in this thread about those.
>>>

 +RelationindexRel = index_open(indexoid,
 AccessShareLock);

 ISTM that AccessShareLock is not safe when updating the pending
 list and GIN index main structure. Probaby we should use
 RowExclusiveLock?
>>>
>>> Other callers of the ginInsertCleanup function also do so while
>>> holding only the AccessShareLock on the index.  It turns out
>>> that there is a bug around this, as discussed in "Potential GIN
>>> vacuum bug"
>>> (http://www.postgresql.org/message-id/flat/CAMkU=1xalflhuuohfp5v33rzedlvb5aknnujceum9knbkrb...@mail.gmail.com)
>>>
>>>
>>>
> But, that bug has to be solved at a deeper level than this patch.
>>>
>>> I've also cleaned up some other conflicts, and chose a more
>>> suitable OID for the new catalog function.
>>>
>>> The number of new header includes needed to implement this makes
>>> me wonder if I put this code in the correct file, but I don't see
>>> a better location for it.
>>>
>>> New version attached.
>>
>> Thanks for updating the patch! It looks good to me.
>>
>> Based on your patch, I just improved the doc. For example, I added
>> the following note into the doc.
>>
>> +These functions cannot be executed during recovery. +Use
>> of these functions is restricted to superusers and the owner +
>> of the given index.
>>
>> If there is no problem, I'm thinking to commit this version.
>>
>
> Just a detail:
>
> +Note that the cleanup does not happen and the return value is 0
> +if the argument is the GIN index built with fastupdate
> +option disabled because it doesn't have a pending list.
>
> It should be "if the argument is *a* GIN index"
>
> I find this sentence a little confusing, maybe rephrase like this would
> be better:
>
> -Note that the cleanup does not happen and the return value is 0
> -if the argument is the GIN index built with fastupdate
> -option disabled because it doesn't have a pending list.
> +Note that if the argument is a GIN index built with
> fastupdate
> +option disabled, the cleanup does not happen and the return value
> is 0
> +because the index doesn't have a pending list.
>
> Otherwise, I don't see any problem on this version.

Thanks for the review! I adopted the description you suggested.
Just pushed the patch.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2016-01-27 Thread Julien Rouhaud
On 27/01/2016 10:27, Fujii Masao wrote:
> On Mon, Jan 25, 2016 at 3:54 PM, Jeff Janes 
> wrote:
>> On Wed, Jan 20, 2016 at 6:17 AM, Fujii Masao
>>  wrote:
>>> On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud 
>>>  wrote:
 On 15/01/2016 22:59, Jeff Janes wrote:
> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud 
>  wrote:
 
 All looks fine to me, I flag it as ready for committer.
>>> 
>>> When I compiled the PostgreSQL with the patch, I got the
>>> following error. ISTM that the inclusion of pg_am.h header file
>>> is missing in ginfast.c.
>> 
>> Thanks.  Fixed.
>> 
>>> gin_clean_pending_list() must check whether the server is in
>>> recovery or not. If it's in recovery, the function must exit
>>> with an error. That is, IMO, something like the following check
>>> must be added.
>>> 
>>> if (RecoveryInProgress()) ereport(ERROR,
>>> 
>>> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), 
>>> errmsg("recovery is in progress"), errhint("GIN pending list
>>> cannot be cleaned up during recovery.")));
>>> 
>>> It's better to make gin_clean_pending_list() check whether the
>>> target index is temporary index of other sessions or not, like
>>> pgstatginindex() does.
>> 
>> I've added both of these checks.  Sorry I overlooked your early
>> email in this thread about those.
>> 
>>> 
>>> +RelationindexRel = index_open(indexoid,
>>> AccessShareLock);
>>> 
>>> ISTM that AccessShareLock is not safe when updating the pending
>>> list and GIN index main structure. Probaby we should use
>>> RowExclusiveLock?
>> 
>> Other callers of the ginInsertCleanup function also do so while 
>> holding only the AccessShareLock on the index.  It turns out
>> that there is a bug around this, as discussed in "Potential GIN
>> vacuum bug" 
>> (http://www.postgresql.org/message-id/flat/CAMkU=1xalflhuuohfp5v33rzedlvb5aknnujceum9knbkrb...@mail.gmail.com)
>>
>>
>> 
But, that bug has to be solved at a deeper level than this patch.
>> 
>> I've also cleaned up some other conflicts, and chose a more
>> suitable OID for the new catalog function.
>> 
>> The number of new header includes needed to implement this makes
>> me wonder if I put this code in the correct file, but I don't see
>> a better location for it.
>> 
>> New version attached.
> 
> Thanks for updating the patch! It looks good to me.
> 
> Based on your patch, I just improved the doc. For example, I added
> the following note into the doc.
> 
> +These functions cannot be executed during recovery. +Use
> of these functions is restricted to superusers and the owner +
> of the given index.
> 
> If there is no problem, I'm thinking to commit this version.
> 

Just a detail:

+Note that the cleanup does not happen and the return value is 0
+if the argument is the GIN index built with fastupdate
+option disabled because it doesn't have a pending list.

It should be "if the argument is *a* GIN index"

I find this sentence a little confusing, maybe rephrase like this would
be better:

-Note that the cleanup does not happen and the return value is 0
-if the argument is the GIN index built with fastupdate
-option disabled because it doesn't have a pending list.
+Note that if the argument is a GIN index built with
fastupdate
+option disabled, the cleanup does not happen and the return value
is 0
+because the index doesn't have a pending list.

Otherwise, I don't see any problem on this version.

Regards.

> Regards,
> 
> 
> 
> 


-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2016-01-27 Thread Fujii Masao
On Mon, Jan 25, 2016 at 3:54 PM, Jeff Janes  wrote:
> On Wed, Jan 20, 2016 at 6:17 AM, Fujii Masao  wrote:
>> On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud
>>  wrote:
>>> On 15/01/2016 22:59, Jeff Janes wrote:
 On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
  wrote:
>>>
>>> All looks fine to me, I flag it as ready for committer.
>>
>> When I compiled the PostgreSQL with the patch, I got the following error.
>> ISTM that the inclusion of pg_am.h header file is missing in ginfast.c.
>
> Thanks.  Fixed.
>
>> gin_clean_pending_list() must check whether the server is in recovery or not.
>> If it's in recovery, the function must exit with an error. That is, IMO,
>> something like the following check must be added.
>>
>>  if (RecoveryInProgress())
>>  ereport(ERROR,
>>
>> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>   errmsg("recovery is in progress"),
>>   errhint("GIN pending list cannot be
>> cleaned up during recovery.")));
>>
>> It's better to make gin_clean_pending_list() check whether the target index
>> is temporary index of other sessions or not, like pgstatginindex() does.
>
> I've added both of these checks.  Sorry I overlooked your early email
> in this thread about those.
>
>>
>> +RelationindexRel = index_open(indexoid, AccessShareLock);
>>
>> ISTM that AccessShareLock is not safe when updating the pending list and
>> GIN index main structure. Probaby we should use RowExclusiveLock?
>
> Other callers of the ginInsertCleanup function also do so while
> holding only the AccessShareLock on the index.  It turns out that
> there is a bug around this, as discussed in "Potential GIN vacuum bug"
> (http://www.postgresql.org/message-id/flat/CAMkU=1xalflhuuohfp5v33rzedlvb5aknnujceum9knbkrb...@mail.gmail.com)
>
> But, that bug has to be solved at a deeper level than this patch.
>
> I've also cleaned up some other conflicts, and chose a more suitable
> OID for the new catalog function.
>
> The number of new header includes needed to implement this makes me
> wonder if I put this code in the correct file, but I don't see a
> better location for it.
>
> New version attached.

Thanks for updating the patch! It looks good to me.

Based on your patch, I just improved the doc. For example,
I added the following note into the doc.

+These functions cannot be executed during recovery.
+Use of these functions is restricted to superusers and the owner
+of the given index.

If there is no problem, I'm thinking to commit this version.

Regards,

-- 
Fujii Masao
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 18036,18044  postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 18036,18051 
  brin_summarize_new_values
 
  
+
+ gin_clean_pending_list
+
+ 
 
   shows the functions
  available for index maintenance tasks.
+ These functions cannot be executed during recovery.
+ Use of these functions is restricted to superusers and the owner
+ of the given index.
 
  
 
***
*** 18056,18061  postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 18063,18075 
 integer
 summarize page ranges not already summarized

+   
+
+ gin_clean_pending_list(index regclass)
+
+bigint
+move GIN pending list entries into main index structure
+   
   
  
 
***
*** 18069,18074  postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 18083,18100 
  into the index.
 
  
+
+ gin_clean_pending_list accepts the OID or name of
+ a GIN index and cleans up the pending list of the specified GIN index
+ by moving entries in it to the main GIN data structure in bulk.
+ It returns the number of pages cleaned up from the pending list.
+ Note that the cleanup does not happen and the return value is 0
+ if the argument is the GIN index built with fastupdate
+ option disabled because it doesn't have a pending list.
+ Please see  and 
+ for details of the pending list and fastupdate option.
+
+ 

  

*** a/doc/src/sgml/gin.sgml
--- b/doc/src/sgml/gin.sgml
***
*** 734,740 
 from the indexed item). As of PostgreSQL 8.4,
 GIN is capable of postponing much of this work by inserting
 new tuples into a temporary, unsorted list of pending entries.
!When the table is vacuumed, or if the pending list becomes larger than
 , the entries are moved to the
 main GIN data structure using the same bulk insert
 techniques used during initial index creation.  This greatly improves
--- 734,742 
 from the indexed item). As of PostgreSQL 8.4,
 GIN is capable of postponing much of this work 

Re: [HACKERS] GIN pending list clean up exposure to SQL

2016-01-24 Thread Jeff Janes
On Wed, Jan 20, 2016 at 6:17 AM, Fujii Masao  wrote:
> On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud
>  wrote:
>> On 15/01/2016 22:59, Jeff Janes wrote:
>>> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
>>>  wrote:
>>
>> All looks fine to me, I flag it as ready for committer.
>
> When I compiled the PostgreSQL with the patch, I got the following error.
> ISTM that the inclusion of pg_am.h header file is missing in ginfast.c.

Thanks.  Fixed.

> gin_clean_pending_list() must check whether the server is in recovery or not.
> If it's in recovery, the function must exit with an error. That is, IMO,
> something like the following check must be added.
>
>  if (RecoveryInProgress())
>  ereport(ERROR,
>
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>   errmsg("recovery is in progress"),
>   errhint("GIN pending list cannot be
> cleaned up during recovery.")));
>
> It's better to make gin_clean_pending_list() check whether the target index
> is temporary index of other sessions or not, like pgstatginindex() does.

I've added both of these checks.  Sorry I overlooked your early email
in this thread about those.

>
> +RelationindexRel = index_open(indexoid, AccessShareLock);
>
> ISTM that AccessShareLock is not safe when updating the pending list and
> GIN index main structure. Probaby we should use RowExclusiveLock?

Other callers of the ginInsertCleanup function also do so while
holding only the AccessShareLock on the index.  It turns out that
there is a bug around this, as discussed in "Potential GIN vacuum bug"
(http://www.postgresql.org/message-id/flat/CAMkU=1xalflhuuohfp5v33rzedlvb5aknnujceum9knbkrb...@mail.gmail.com)

But, that bug has to be solved at a deeper level than this patch.

I've also cleaned up some other conflicts, and chose a more suitable
OID for the new catalog function.

The number of new header includes needed to implement this makes me
wonder if I put this code in the correct file, but I don't see a
better location for it.

New version attached.

Thanks,

Jeff
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 9c143b2..ada228f
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 18036,18041 
--- 18036,18045 
  brin_summarize_new_values
 
  
+
+ gin_clean_pending_list
+
+ 
 
   shows the functions
  available for index maintenance tasks.
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 18056,18061 
--- 18060,18072 
 integer
 summarize page ranges not already summarized

+   
+
+ gin_clean_pending_list(index_oid regclass)
+
+bigint
+move fast-update pending list entries into main index structure
+   
   
  
 
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 18069,18074 
--- 18080,18092 
  into the index.
 
  
+
+gin_clean_pending_list accepts a GIN index OID argument
+ and moves the fast-update entries from the pending list into the
+ main index structure for that index.  It returns the number of pages
+ removed from the pending list.
+
+ 

  

diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml
new file mode 100644
index 9eb0b5a..17a5087
*** a/doc/src/sgml/gin.sgml
--- b/doc/src/sgml/gin.sgml
***
*** 734,740 
 from the indexed item). As of PostgreSQL 8.4,
 GIN is capable of postponing much of this work by inserting
 new tuples into a temporary, unsorted list of pending entries.
!When the table is vacuumed, or if the pending list becomes larger than
 , the entries are moved to the
 main GIN data structure using the same bulk insert
 techniques used during initial index creation.  This greatly improves
--- 734,742 
 from the indexed item). As of PostgreSQL 8.4,
 GIN is capable of postponing much of this work by inserting
 new tuples into a temporary, unsorted list of pending entries.
!When the table is vacuumed or autoanalyzed, or when 
!gin_clean_pending_list(regclass) is called, or if the
!pending list becomes larger than
 , the entries are moved to the
 main GIN data structure using the same bulk insert
 techniques used during initial index creation.  This greatly improves
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
new file mode 100644
index 681ce09..dfdc45f
*** a/src/backend/access/gin/ginfast.c
--- b/src/backend/access/gin/ginfast.c
***
*** 20,29 
--- 20,32 
  
  #include "access/gin_private.h"
  #include "access/xloginsert.h"
+ #include "access/xlog.h"
  #include "commands/vacuum.h"
+ #include "catalog/pg_am.h"
  #include "miscadmin.h"
  #include 

Re: [HACKERS] GIN pending list clean up exposure to SQL

2016-01-22 Thread Julien Rouhaud
On 20/01/2016 15:17, Fujii Masao wrote:
> 
> When I compiled the PostgreSQL with the patch, I got the following error.
> ISTM that the inclusion of pg_am.h header file is missing in ginfast.c.
> 
> ginfast.c: In function 'gin_clean_pending_list':
> ginfast.c:980: error: 'GIN_AM_OID' undeclared (first use in this function)
> ginfast.c:980: error: (Each undeclared identifier is reported only once
> ginfast.c:980: error: for each function it appears in.)
> 
> gin_clean_pending_list() must check whether the server is in recovery or not.
> If it's in recovery, the function must exit with an error. That is, IMO,
> something like the following check must be added.
> 
>  if (RecoveryInProgress())
>  ereport(ERROR,
> 
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>   errmsg("recovery is in progress"),
>   errhint("GIN pending list cannot be
> cleaned up during recovery.")));
> 
> It's better to make gin_clean_pending_list() check whether the target index
> is temporary index of other sessions or not, like pgstatginindex() does.
> 
> +RelationindexRel = index_open(indexoid, AccessShareLock);
> 
> ISTM that AccessShareLock is not safe when updating the pending list and
> GIN index main structure. Probaby we should use RowExclusiveLock?
> 

This lock should be safe, as pointed by Alvaro and Jaime earlier in this
thread
(http://www.postgresql.org/message-id/20151119161846.GK614468@alvherre.pgsql),
as ginInsertCleanup() can be called concurrently.

Also, since 38710a374ea the ginInsertCleanup() call must be fixed:

-   ginInsertCleanup(, false, true, );
+   ginInsertCleanup(, true, );


Regards.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2016-01-20 Thread Fujii Masao
On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud
 wrote:
> On 15/01/2016 22:59, Jeff Janes wrote:
>> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
>>  wrote:
>>> On 29/12/2015 00:30, Jeff Janes wrote:
 On Wed, Nov 25, 2015 at 9:29 AM, Jeff Janes  wrote:
>
> I'll prepare a patch for core for the January commitfest, and see if
> it flies.  If not, there is always the extension to fall back to.

 Here is an updated patch.  I've added type and permission checks,
 docs, and some regression tests.

>>>
>>> I just reviewed it. Patch applies cleanly, everything works as intended,
>>> including regression tests.
>>>
>>> I think the function should be declared as strict.
>>
>> OK.  I see that brin_summarize_new_values, which I modeled this on,
>> was recently changed to be strict.  So I've changed this the same way.
>>
>>>
>>> Also, there are some trailing whitespaces in the documentation diff.
>>
>> Fixed.
>
> Thanks!
>
>>  I also added the DESC to the pg_proc entry, which I somehow
>> missed before.
>>
>
> Good catch, I also missed it.
>
> All looks fine to me, I flag it as ready for committer.

When I compiled the PostgreSQL with the patch, I got the following error.
ISTM that the inclusion of pg_am.h header file is missing in ginfast.c.

ginfast.c: In function 'gin_clean_pending_list':
ginfast.c:980: error: 'GIN_AM_OID' undeclared (first use in this function)
ginfast.c:980: error: (Each undeclared identifier is reported only once
ginfast.c:980: error: for each function it appears in.)

gin_clean_pending_list() must check whether the server is in recovery or not.
If it's in recovery, the function must exit with an error. That is, IMO,
something like the following check must be added.

 if (RecoveryInProgress())
 ereport(ERROR,

(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
  errmsg("recovery is in progress"),
  errhint("GIN pending list cannot be
cleaned up during recovery.")));

It's better to make gin_clean_pending_list() check whether the target index
is temporary index of other sessions or not, like pgstatginindex() does.

+RelationindexRel = index_open(indexoid, AccessShareLock);

ISTM that AccessShareLock is not safe when updating the pending list and
GIN index main structure. Probaby we should use RowExclusiveLock?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2016-01-15 Thread Jeff Janes
On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
 wrote:
> On 29/12/2015 00:30, Jeff Janes wrote:
>> On Wed, Nov 25, 2015 at 9:29 AM, Jeff Janes  wrote:
>>>
>>> I'll prepare a patch for core for the January commitfest, and see if
>>> it flies.  If not, there is always the extension to fall back to.
>>
>> Here is an updated patch.  I've added type and permission checks,
>> docs, and some regression tests.
>>
>
> I just reviewed it. Patch applies cleanly, everything works as intended,
> including regression tests.
>
> I think the function should be declared as strict.

OK.  I see that brin_summarize_new_values, which I modeled this on,
was recently changed to be strict.  So I've changed this the same way.

>
> Also, there are some trailing whitespaces in the documentation diff.

Fixed.  I also added the DESC to the pg_proc entry, which I somehow
missed before.

Thanks,

Jeff


gin_clean_pending_user_v003.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2016-01-15 Thread Julien Rouhaud
On 15/01/2016 22:59, Jeff Janes wrote:
> On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
>  wrote:
>> On 29/12/2015 00:30, Jeff Janes wrote:
>>> On Wed, Nov 25, 2015 at 9:29 AM, Jeff Janes  wrote:

 I'll prepare a patch for core for the January commitfest, and see if
 it flies.  If not, there is always the extension to fall back to.
>>>
>>> Here is an updated patch.  I've added type and permission checks,
>>> docs, and some regression tests.
>>>
>>
>> I just reviewed it. Patch applies cleanly, everything works as intended,
>> including regression tests.
>>
>> I think the function should be declared as strict.
> 
> OK.  I see that brin_summarize_new_values, which I modeled this on,
> was recently changed to be strict.  So I've changed this the same way.
> 
>>
>> Also, there are some trailing whitespaces in the documentation diff.
> 
> Fixed.

Thanks!

>  I also added the DESC to the pg_proc entry, which I somehow
> missed before.
> 

Good catch, I also missed it.

All looks fine to me, I flag it as ready for committer.


-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2016-01-10 Thread Julien Rouhaud
On 29/12/2015 00:30, Jeff Janes wrote:
> On Wed, Nov 25, 2015 at 9:29 AM, Jeff Janes  wrote:
>>
>> I'll prepare a patch for core for the January commitfest, and see if
>> it flies.  If not, there is always the extension to fall back to.
> 
> Here is an updated patch.  I've added type and permission checks,
> docs, and some regression tests.
> 

I just reviewed it. Patch applies cleanly, everything works as intended,
including regression tests.

I think the function should be declared as strict.

Also, there are some trailing whitespaces in the documentation diff.

Regards.

> Cheers,
> 
> Jeff
> 
> 
> 
> 


-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2015-12-28 Thread Jeff Janes
On Wed, Nov 25, 2015 at 9:29 AM, Jeff Janes  wrote:
>
> I'll prepare a patch for core for the January commitfest, and see if
> it flies.  If not, there is always the extension to fall back to.

Here is an updated patch.  I've added type and permission checks,
docs, and some regression tests.

Cheers,

Jeff
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 8ef9fce..6cbe11b
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17919,17924 
--- 17919,17928 
  brin_summarize_new_values
 
  
+
+ gin_clean_pending_list
+
+ 
 
   shows the functions
  available for index maintenance tasks.
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17939,17944 
--- 17943,17955 
 integer
 summarize page ranges not already summarized

+   
+
+ gin_clean_pending_list(index_oid regclass)
+
+bigint
+move fast-update pending list entries into main index structure
+   
   
  
 
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17952,17957 
--- 17963,17975 
  into the index.
 
  
+
+gin_clean_pending_list accepts a GIN index OID argument 
+ and moves the fast-update entries from the pending list into the 
+ main index structure for that index.  It returns the number of pages
+ removed from the pending list.
+
+ 

  

diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml
new file mode 100644
index 262f1e4..c6f19de
*** a/doc/src/sgml/gin.sgml
--- b/doc/src/sgml/gin.sgml
***
*** 728,734 
 from the indexed item). As of PostgreSQL 8.4,
 GIN is capable of postponing much of this work by inserting
 new tuples into a temporary, unsorted list of pending entries.
!When the table is vacuumed, or if the pending list becomes larger than
 , the entries are moved to the
 main GIN data structure using the same bulk insert
 techniques used during initial index creation.  This greatly improves
--- 728,736 
 from the indexed item). As of PostgreSQL 8.4,
 GIN is capable of postponing much of this work by inserting
 new tuples into a temporary, unsorted list of pending entries.
!When the table is vacuumed or autoanalyzed, or when 
!gin_clean_pending_list(regclass) is called, or if the 
!pending list becomes larger than
 , the entries are moved to the
 main GIN data structure using the same bulk insert
 techniques used during initial index creation.  This greatly improves
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
new file mode 100644
index 4bb22fe..373b52d
*** a/src/backend/access/gin/ginfast.c
--- b/src/backend/access/gin/ginfast.c
***
*** 24,29 
--- 24,30 
  #include "miscadmin.h"
  #include "utils/memutils.h"
  #include "utils/rel.h"
+ #include "utils/acl.h"
  #include "storage/indexfsm.h"
  
  /* GUC parameter */
*** ginInsertCleanup(GinState *ginstate,
*** 962,964 
--- 963,999 
  	MemoryContextSwitchTo(oldCtx);
  	MemoryContextDelete(opCtx);
  }
+ 
+ /*
+  * SQL-callable function to clean the insert pending list
+  */
+ Datum
+ gin_clean_pending_list(PG_FUNCTION_ARGS)
+ {
+ 	Oid			indexoid = PG_GETARG_OID(0);
+ 	Relation	indexRel = index_open(indexoid, AccessShareLock);
+ 	IndexBulkDeleteResult stats;
+ 	GinState	ginstate;
+ 
+ 	/* Must be a GIN index */
+ 	if (indexRel->rd_rel->relkind != RELKIND_INDEX ||
+ 		indexRel->rd_rel->relam != GIN_AM_OID)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+  errmsg("\"%s\" is not a GIN index",
+ 		RelationGetRelationName(indexRel;
+ 
+ 	/* User must own the index (comparable to privileges needed for VACUUM) */
+ 	if (!pg_class_ownercheck(indexoid, GetUserId()))
+ 		aclcheck_error(ACLCHECK_NOT_OWNER, ACL_KIND_CLASS,
+ 	   RelationGetRelationName(indexRel));
+ 
+ 
+ 	memset(, 0, sizeof(stats));
+ 	initGinState(, indexRel);
+ 	ginInsertCleanup(, false, true, );
+ 
+ 	index_close(indexRel, AccessShareLock);
+ 
+ 	PG_RETURN_INT64((int64) stats.pages_deleted);
+ }
diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h
new file mode 100644
index 5021887..9373ef5
*** a/src/include/access/gin_private.h
--- b/src/include/access/gin_private.h
*** extern void ginFreeScanKeys(GinScanOpaqu
*** 878,883 
--- 878,886 
  /* ginget.c */
  extern Datum gingetbitmap(PG_FUNCTION_ARGS);
  
+ /* ginfast.c */
+ extern Datum gin_clean_pending_list(PG_FUNCTION_ARGS);
+ 
  /* ginlogic.c */
  extern void ginInitConsistentFunction(GinState *ginstate, GinScanKey key);
  
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index d8640db..4f76b84
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*** 

Re: [HACKERS] GIN pending list clean up exposure to SQL

2015-11-25 Thread Jeff Janes
On Thu, Nov 19, 2015 at 8:37 AM, Jaime Casanova
 wrote:
> On 12 August 2015 at 20:19, Jeff Janes  wrote:
>>
>> But where does this belong?  Core?  Its own separate extension?
>>
>
> I will say core. Gin indexes are in core and i don't see why this
> function shouldn't.
> FWIW, brin indexes has a similar function brin_summarize_new_values() in core


I was holding off on doing more on this until after the bug
http://www.postgresql.org/message-id/CAMkU=1xalflhuuohfp5v33rzedlvb5aknnujceum9knbkrb...@mail.gmail.com
was resolved, as I think it might change the way things work enough to
make a difference, at least to the documentation if not the
functioning.

I will look at brin_summarize_new_values for guidance.  But now we
have one vote for core and one for 'gevel' extension, so I'm not sure
where to go.  The nice thing about core is that is always there (in
the future) and maintained by a group of people, and as you point out
there is precedence with the BRIN index.  But the nice thing with an
extension is that it easier to adapt for existing installations, so
you don't have to wait for 9.6 to get access to it.

I'll prepare a patch for core for the January commitfest, and see if
it flies.  If not, there is always the extension to fall back to.

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2015-11-24 Thread Fujii Masao
On Sun, Nov 22, 2015 at 1:15 PM, Jaime Casanova
 wrote:
> On 21 November 2015 at 03:54, Jim Nasby  wrote:
>> On 11/19/15 10:47 AM, Jaime Casanova wrote:
>>>
>>> - only superusers?
>>
>>
>> I would think the owner of the table (index?) should also be able to run
>> this.
>
> agreed, that makes sense

Also the function should ensure that the server is running in
normal mode (not recovery mode) and the specified relation is
NOT other session's temporary table.

I added the similar function into pg_bigm extension
(pg_gin_pending_cleanup function in
https://osdn.jp/projects/pgbigm/scm/git/pg_bigm/blobs/master/bigm_gin.c).
It might help us improve the patch.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2015-11-21 Thread Jaime Casanova
On 21 November 2015 at 03:54, Jim Nasby  wrote:
> On 11/19/15 10:47 AM, Jaime Casanova wrote:
>>
>> - only superusers?
>
>
> I would think the owner of the table (index?) should also be able to run
> this.

agreed, that makes sense

-- 
Jaime Casanova  www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2015-11-20 Thread Jaime Casanova
On 19 November 2015 at 14:57, Jaime Casanova
 wrote:
> On 19 November 2015 at 14:47, Jaime Casanova
>  wrote:
>> On 19 November 2015 at 14:18, Alvaro Herrera  
>> wrote:
>>> Alvaro Herrera wrote:
 Jeff Janes wrote:
 > I've written a function which allows users to clean up the pending list.
 > It takes the index name and returns the number of pending list pages
 > deleted.

 I just noticed that your patch uses AccessShareLock on the index.  Is
 that okay?  I would have assumed that you'd need ShareUpdateExclusive
 (same as vacuum uses), but I don't really know.  Was that a carefully
 thought-out choice?
>>>
>>> After reading gitPendingCleanup it becomes clear that there's no need
>>> for a stronger lock than what you've chosen.  Jaime Casanova just
>>> pointed this out to me.
>>>
>>
>> But it should do some checks, no?
>> - only superusers?
>> - what i received as parameter is a GIN index?
>>
>
> I just notice this:
>
> +   ginInsertCleanup(, true, );
>
> ginInsertCleanup() now has four parameters, so you should update the call
>

Btw, this is not in the commitfest and seems like a useful thing to have

-- 
Jaime Casanova  www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2015-11-20 Thread Jim Nasby

On 11/19/15 10:47 AM, Jaime Casanova wrote:

- only superusers?


I would think the owner of the table (index?) should also be able to run 
this.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2015-11-19 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Jeff Janes wrote:
> > I've written a function which allows users to clean up the pending list.
> > It takes the index name and returns the number of pending list pages
> > deleted.
> 
> I just noticed that your patch uses AccessShareLock on the index.  Is
> that okay?  I would have assumed that you'd need ShareUpdateExclusive
> (same as vacuum uses), but I don't really know.  Was that a carefully
> thought-out choice?

After reading gitPendingCleanup it becomes clear that there's no need
for a stronger lock than what you've chosen.  Jaime Casanova just
pointed this out to me.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2015-11-19 Thread Jaime Casanova
On 12 August 2015 at 20:19, Jeff Janes  wrote:
>
> But where does this belong?  Core?  Its own separate extension?
>

I will say core. Gin indexes are in core and i don't see why this
function shouldn't.
FWIW, brin indexes has a similar function brin_summarize_new_values() in core

-- 
Jaime Casanova  www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2015-11-19 Thread Alvaro Herrera
Jeff Janes wrote:
> I've written a function which allows users to clean up the pending list.
> It takes the index name and returns the number of pending list pages
> deleted.

I just noticed that your patch uses AccessShareLock on the index.  Is
that okay?  I would have assumed that you'd need ShareUpdateExclusive
(same as vacuum uses), but I don't really know.  Was that a carefully
thought-out choice?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2015-11-19 Thread Jaime Casanova
On 19 November 2015 at 14:18, Alvaro Herrera  wrote:
> Alvaro Herrera wrote:
>> Jeff Janes wrote:
>> > I've written a function which allows users to clean up the pending list.
>> > It takes the index name and returns the number of pending list pages
>> > deleted.
>>
>> I just noticed that your patch uses AccessShareLock on the index.  Is
>> that okay?  I would have assumed that you'd need ShareUpdateExclusive
>> (same as vacuum uses), but I don't really know.  Was that a carefully
>> thought-out choice?
>
> After reading gitPendingCleanup it becomes clear that there's no need
> for a stronger lock than what you've chosen.  Jaime Casanova just
> pointed this out to me.
>

But it should do some checks, no?
- only superusers?
- what i received as parameter is a GIN index?

-- 
Jaime Casanova  www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2015-11-19 Thread Jaime Casanova
On 19 November 2015 at 14:47, Jaime Casanova
 wrote:
> On 19 November 2015 at 14:18, Alvaro Herrera  wrote:
>> Alvaro Herrera wrote:
>>> Jeff Janes wrote:
>>> > I've written a function which allows users to clean up the pending list.
>>> > It takes the index name and returns the number of pending list pages
>>> > deleted.
>>>
>>> I just noticed that your patch uses AccessShareLock on the index.  Is
>>> that okay?  I would have assumed that you'd need ShareUpdateExclusive
>>> (same as vacuum uses), but I don't really know.  Was that a carefully
>>> thought-out choice?
>>
>> After reading gitPendingCleanup it becomes clear that there's no need
>> for a stronger lock than what you've chosen.  Jaime Casanova just
>> pointed this out to me.
>>
>
> But it should do some checks, no?
> - only superusers?
> - what i received as parameter is a GIN index?
>

I just notice this:

+   ginInsertCleanup(, true, );

ginInsertCleanup() now has four parameters, so you should update the call

-- 
Jaime Casanova  www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] GIN pending list clean up exposure to SQL

2015-08-12 Thread Jeff Janes
I've written a function which allows users to clean up the pending list.
It takes the index name and returns the number of pending list pages
deleted.

# select * from gin_clean_pending_list('foo_text_array_idx');
 gin_clean_pending_list

278
(1 row)

Time: 31994.880 ms


This is needed because there needs to be a way to offload this duty from
the user backends, and the only other way to intentionaly clean up the list
is by vacuum (and the rest of a vacuum can take days to run on a large
table).  Autoanalyze will also do it, but it hard to arrange for those to
occur at need, and unless you drop default_statistics_target very low they
can also take a long time.  And if you do lower the target, it screws up
your statistics, of course.

I've currently crammed it into pageinspect, simply because that is where I
found the existing code which I used as an exemplar for writing this code.

But where does this belong?  Core?  Its own separate extension?

Cheers,

Jeff


gin_clean_pending_user_v001.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN pending list clean up exposure to SQL

2015-08-12 Thread Oleg Bartunov
On Thu, Aug 13, 2015 at 2:19 AM, Jeff Janes jeff.ja...@gmail.com wrote:

 I've written a function which allows users to clean up the pending list.
 It takes the index name and returns the number of pending list pages
 deleted.

 # select * from gin_clean_pending_list('foo_text_array_idx');
  gin_clean_pending_list
 
 278
 (1 row)

 Time: 31994.880 ms


 This is needed because there needs to be a way to offload this duty from
 the user backends, and the only other way to intentionaly clean up the list
 is by vacuum (and the rest of a vacuum can take days to run on a large
 table).  Autoanalyze will also do it, but it hard to arrange for those to
 occur at need, and unless you drop default_statistics_target very low they
 can also take a long time.  And if you do lower the target, it screws up
 your statistics, of course.

 I've currently crammed it into pageinspect, simply because that is where I
 found the existing code which I used as an exemplar for writing this code.

 But where does this belong?  Core?  Its own separate extension?


For a long time we have gevel extension (
http://www.sigaev.ru/git/gitweb.cgi?p=gevel.git;a=summary) , which was
originally developed to help us debugging indexes, but it's also contains
user's functions.  Your function could be there, but I have no idea about
gevel itself. Probably, it should be restructurized and come to contrib. Do
you have time to look into ?




 Cheers,

 Jeff


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers