Re: Control your disk usage in PG: Introduction to Disk Quota Extension

2020-03-27 Thread Haozhou Wang
Thanks David!

We rebased this patch with the newest master.

Thank you very much!

Regards,
Haozhou


On Wed, Mar 25, 2020 at 12:00 AM David Steele  wrote:

> On 12/2/19 1:39 AM, Haozhou Wang wrote:
> >
> > Thank you very much for your email. I rebased the code with the newest
> > master and attached it in the attachment.
>
> This patch applies but no longer builds on Linux or Windows:
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__travis-2Dci.org_github_postgresql-2Dcfbot_postgresql_builds_666113036=DwICaQ=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw=nGUCIcT5CVp6-pQcplYyagWnpAqoYm8YxWruds9UGI0=UfgeYnaIkXdE7XmY-eCIHV_ZCTEZqCAjXkPausd5qZI=zHJ5pgH_P4mMhZAuXtfTCGUB7lyo7dJ4VeoapRhIN4g=
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__ci.appveyor.com_project_postgresql-2Dcfbot_postgresql_build_1.0.85284=DwICaQ=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw=nGUCIcT5CVp6-pQcplYyagWnpAqoYm8YxWruds9UGI0=UfgeYnaIkXdE7XmY-eCIHV_ZCTEZqCAjXkPausd5qZI=WmyVNF-rItz10LoeLtY-Xu6l-iV3uwI_U-R9lR7cKjc=
>
> The CF entry has been updated to Waiting on Author.
>
> Regards,
> --
> -David
> da...@pgmasters.net
>


-- 
Regards,
Haozhou


disk_quota_hooks_v7.patch
Description: Binary data


Re: Control your disk usage in PG: Introduction to Disk Quota Extension

2019-12-01 Thread Haozhou Wang
Hi Michael,

Thank you very much for your email. I rebased the code with the newest
master and attached it in the attachment.

Thank you very much!

Regards,
Haozhou

On Sun, Dec 1, 2019 at 11:20 AM Michael Paquier  wrote:

> On Fri, Sep 27, 2019 at 11:30:08AM +0800, Haozhou Wang wrote:
> > I rebased this patch with the newest master branch. Attached the new
> > patch disk_quota_hooks_v5.patch in the attachment.
>
> This again needs a rebase, so I have switched it as waiting on
> author.
> --
> Michael
>


-- 
Regards,
Haozhou


disk_quota_hooks_v6.patch
Description: Binary data


Re: Control your disk usage in PG: Introduction to Disk Quota Extension

2019-09-26 Thread Haozhou Wang
Thanks Alvaro!
I rebased this patch with the newest master branch. Attached the new
patch disk_quota_hooks_v5.patch in the attachment.

Regards,
Haozhou

On Thu, Sep 26, 2019 at 3:54 AM Alvaro Herrera 
wrote:

> This patch no longer applies. Can you please rebase?
>
> --
> Álvaro Herrera
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.2ndQuadrant.com_=DwIDAw=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw=nGUCIcT5CVp6-pQcplYyagWnpAqoYm8YxWruds9UGI0=aNzGoEI15bAE-vAivY34BtG2WdgrVojH-B-kjvuXsYA=sT6zyiq4s8meelNuFw-lGD_mdvmUzv9zpVYWbFWusI0=
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
>

-- 
Regards,
Haozhou


disk_quota_hooks_v5.patch
Description: Binary data


Re: Control your disk usage in PG: Introduction to Disk Quota Extension

2019-01-07 Thread Haozhou Wang
Thanks very much for your comments.

To the best of my knowledge, smgr is a layer that abstract the storage
operations. Therefore, it is a good place to control or collect information
the storage operations without touching the physical storage layer.
Moreover, smgr is coming with actual disk IO operation (not consider the OS
cache) for postgres. So we do not need to worry about the buffer management
in postgres.
It will make the purpose of hook is pure: a hook for actual disk IO.

Regards,
Haozhou

On Wed, Dec 26, 2018 at 1:56 PM Michael Paquier  wrote:

> On Wed, Nov 21, 2018 at 09:47:44AM -0500, Robert Haas wrote:
> > +1 for adding some hooks to support this kind of thing, but I think
> > the names you've chosen are not very good.  The hook name should
> > describe the place from which it is called, not the purpose for which
> > one imagines that it will be used, because somebody else might imagine
> > another use.  Both BufferExtendCheckPerms_hook_type and
> > SmgrStat_hook_type are imagining that they know what the hook does -
> > CheckPerms in the first case and Stat in the second case.
>
> I personally don't mind making Postgres more pluggable, but I don't
> think that we actually need the extra ones proposed here at the layer
> of smgr, as smgr is already a layer designed to call an underlying set
> of APIs able to extend, unlink, etc. depending on the storage type.
>
> > For this particular purpose, I don't immediately see why you need a
> > hook in both places.  If ReadBuffer is called with P_NEW, aren't we
> > guaranteed to end up in smgrextend()?
>
> Yes, that's a bit awkward.
> --
> Michael


Re: Control your disk usage in PG: Introduction to Disk Quota Extension

2018-11-22 Thread Haozhou Wang
Thank you very much for your review.
We refactored our patch with new names and comments.

For ReadBufferExtended hook, yes, Readbuffer with P_NEW will then call
smgrextend.

But in smgrextend, we cannot get the oid of a relation, and it will take
some time to get the oid via smgrrelation.
We would like to add a hook just before the smgrextend to get the oid and
avoid use RelidByRelfilenode().

New patch is attached in the attachment.
Thank a lot!

Regards,
Haozhou


On Wed, Nov 21, 2018 at 10:48 PM Robert Haas  wrote:

> On Tue, Nov 20, 2018 at 2:20 AM Haozhou Wang  wrote:
> > We prepared a patch that includes the hook points. And such hook points
> are needed for disk quota extension.
> > There are two hooks.
> > One is SmgrStat_hook. It's used to perform ad-hoc logic in storage when
> doing smgr create/extend/truncate in general. As for disk quota extension,
> this hook is used to detect active tables(new created tables, tables
> extending new blocks, or tables being truncated)
> > The other is BufferExtendCheckPerms_hook. It's used to perform ad-hoc
> logic when buffer extend a new block. Since ReadBufferExtended is a hot
> function, we call this hook only when blockNum == P_NEW. As  for disk quota
> extension, this hook is used to do query enforcement during the query is
> loading data.
> >
> > Any comments are appreciated.
>
> +1 for adding some hooks to support this kind of thing, but I think
> the names you've chosen are not very good.  The hook name should
> describe the place from which it is called, not the purpose for which
> one imagines that it will be used, because somebody else might imagine
> another use.  Both BufferExtendCheckPerms_hook_type and
> SmgrStat_hook_type are imagining that they know what the hook does -
> CheckPerms in the first case and Stat in the second case.
>
> For this particular purpose, I don't immediately see why you need a
> hook in both places.  If ReadBuffer is called with P_NEW, aren't we
> guaranteed to end up in smgrextend()?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


disk_quota_hooks_v2.patch
Description: Binary data


Re: Control your disk usage in PG: Introduction to Disk Quota Extension

2018-11-19 Thread Haozhou Wang
Hi all,

We prepared a patch that includes the hook points. And such hook points are
needed for disk quota extension.
There are two hooks.
One is SmgrStat_hook. It's used to perform ad-hoc logic in storage when
doing smgr create/extend/truncate in general. As for disk quota extension,
this hook is used to detect active tables(new created tables, tables
extending new blocks, or tables being truncated)
The other is BufferExtendCheckPerms_hook. It's used to perform ad-hoc logic
when buffer extend a new block. Since ReadBufferExtended is a hot function,
we call this hook only when blockNum == P_NEW. As  for disk quota
extension, this hook is used to do query enforcement during the query is
loading data.

Any comments are appreciated.

Regards,
Haozhou

On Wed, Nov 14, 2018 at 6:07 PM Hubert Zhang  wrote:

> Thanks, Tomas,
>
> Yes, we want to add the hooks into postgres repo.
> But before that, we plan to firstly get some feedbacks from community
> about the diskquota extension implementation and usage?
> Later, we'll modify our license and submit the hooks into CF.
>
> Thanks
> Hubert
>
> On Wed, Nov 14, 2018 at 3:54 AM Tomas Vondra 
> wrote:
>
>> On Tue, 2018-11-13 at 16:47 +0800, Hubert Zhang wrote:
>> > Hi all,
>> >
>> > We implement disk quota feature on Postgresql as an extension(link:
>> > https://github.com/greenplum-db/diskquota),
>> > If you are interested, try and use it to limit the amount of disk
>> > space that
>> > a schema or a role can use in Postgresql.
>> > Any feedback or question are appreciated.
>> >
>>
>> It's not clear to me what's the goal of this thread? I understand what
>> quotas are about, but are you sharing it because (a) it's a useful
>> extension, (b) you propose adding a couple of new hooks (and keep the
>> extension separate) or (c) you propose adding both the hooks and the
>> extension (into contrib)?
>>
>> I assume it's either (b) and (c), in which case you should add it to
>> 2019-01 CF: https://commitfest.postgresql.org/21/
>>
>> In either case, it might be useful to add a LICENSE to the github
>> repository, it's not clear what's the situation in this respect. That
>> probably matters especially for (b), because for (c) it'd end up with
>> PostgreSQL license automatically.
>>
>> regards
>>
>> --
>> Tomas Vondra  http://www.2ndQuadrant.com
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>
>>
>
> --
> Thanks
>
> Hubert Zhang
>


disk_quota_hooks_v1.patch
Description: Binary data


Re: Vacuum Full does not release the disk size space after delete from table

2018-11-04 Thread Haozhou Wang
Thank Tom!
We will check it.

On Fri, Nov 2, 2018 at 10:35 PM Tom Lane  wrote:

> Haozhou Wang  writes:
> > We meet a corner case that related to the behavior of Vacuum Full.
> > ...
> > If we run both sql scripts on same database in parallel, the "VACUUM FULL
> > a;" will not release the disk space.
>
> I think what's happening is that the delete in script 1 happens after the
> "pg_sleep" in script 2 starts.  Then the pg_sleep has an open snapshot
> that could potentially see the deleted rows, so they can't be removed yet.
>
> You could check this theory by changing the vacuum to use VERBOSE, and
> seeing what it says about rows that can't be removed yet.
>
> regards, tom lane
>


-- 
Regards,
Haozhou


Vacuum Full does not release the disk size space after delete from table

2018-11-02 Thread Haozhou Wang
Hi hackers,

We meet a corner case that related to the behavior of Vacuum Full.

There are two SQL scripts

SQL1:
-
-- Test vacuum full

create schema s2;

set search_path to s2;

create table a (i int);

create table b (i int);

insert into a select generate_series(1,5);

delete from a where i > 10;

-- Disk space should release after vacuum full

vacuum full a;

drop table a, b;

reset search_path;

drop schema s2;
-

SQL2:
--
create schema s1;

set search_path to s1;

create table c(i int);

insert into c select generate_series(1,5);

select pg_sleep(5);

reset search_path;

drop schema s1;
---

If we run both sql scripts on same database in parallel, the "VACUUM FULL
a;" will not release the disk space.

But if we run then on same database in sequence, "VACUUM FULL a" can
release the disk space.

We use the PG master branch and the autovacuum has set to off.

Could someone please help us to figure out the root cause of this issue?

Thanks a lot!

-- 
Regards,
Haozhou


Re: [PATCH] Add missing type conversion functions for PL/Python

2018-07-16 Thread Haozhou Wang
+1, I also think that we may not change the previous behavior of plpython.
@Nikita Glukhov  maybe we just check the
whether pyobject is int or long only in related conversion functions, and
fallback otherwise?

On Fri, Jul 13, 2018 at 12:09 AM Heikki Linnakangas  wrote:

> On 12/07/18 18:06, Nikita Glukhov wrote:
> > I have added some cross-type test cases and now almost all new code is
> covered
> > (excluding several error cases which can be triggered only by custom
> numeric
> > type implementations).
>
> Thanks! Some of those new tests actually fail, if you run them against
> unpatched master. For example:
>
>   SELECT * FROM test_type_conversion_float8_int2(100::float8);
>   INFO:  (100.0, )
> - test_type_conversion_float8_int2
> ---
> -  100
> -(1 row)
> -
> +ERROR:  invalid input syntax for integer: "100.0"
> +CONTEXT:  while creating return value
> +PL/Python function "test_type_conversion_float8_int2"
>
> So this patch is making some subtle changes to behavior. I don't think
> we want that.
>
> - Heikki
>


-- 
Regards,
Haozhou


Re: [PATCH] Add missing type conversion functions for PL/Python

2018-03-26 Thread Haozhou Wang
Thanks Nikita!

On Tue, Mar 27, 2018 at 12:07 AM, Nikita Glukhov <n.glu...@postgrespro.ru>
wrote:

> On 26.03.2018 17:19, David Steele wrote:
>
> On 2/20/18 10:14 AM, Haozhou Wang wrote:
>>
>>> Thank you very much for your review!
>>>
>>> I attached a new patch with typo fixed.
>>>
>> I think it's a bit premature to mark this Ready for Committer after a
>> review consisting of a few typos.  Anthony only said that he started
>> looking at it so I've marked it Needs Review.
>>
>
> Hi.
>
> I also have looked at this patch and found some problems.
> Attached fixed 3th version of the patch:
>  * initialization of arg->u.scalar was moved into PLy_output_setup_func()
>  * added range checks for int16 and int32 types
>  * added subroutine PLyInt_AsLong() for correct handling OverflowError
> that can
>be thrown from PyInt_AsLong()
>  * casting from Python float to PostgreSQL numeric using
> PyFloat_AsDouble() was
>removed because it can return incorrect result for Python long and
>float8_numeric() uses float8 and numeric I/O functions
>  * fixed whitespace
>
>
> --
> Nikita Glukhov
>
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>


-- 
Regards,
Haozhou


Re: [PATCH] Add missing type conversion functions for PL/Python

2018-02-20 Thread Haozhou Wang
Thank you very much for your review!

I attached a new patch with typo fixed.

Regards,
Haozhou

On Mon, Feb 19, 2018 at 2:37 PM, Anthony Bykov <a.by...@postgrespro.ru>
wrote:

> On Wed, 31 Jan 2018 11:57:12 +0800
> Haozhou Wang <haw...@pivotal.io> wrote:
>
> > Hi All,
> >
> > PL/Python already has different type conversion functions to
> > convert PostgreSQL datum to Python object. However, the conversion
> > functions from Python object to PostgreSQL datum only has Boolean,
> > Bytea and String functions.
> >
> > In this patch, we add rest of Integer and Float related datatype
> > conversion functions
> > and can increase the performance of data conversion greatly especially
> > when returning a large array.
> >
> > We did a quick test about the performance of returning array in
> > PL/Python:
> >
> > The following UDF is used for test:
> >
> > ```
> > CREATE OR REPLACE FUNCTION pyoutfloat8(num int) RETURNS float8[] AS $$
> > return [x/3.0 for x in range(num)]
> > $$ LANGUAGE plpythonu;
> > ```
> >
> > The test command is
> >
> > ```
> > select count(pyoutfloat8(n));
> > ```
> >
> > The count is used for avoid large output, where n is the number of
> > element in returned array, and the size is from 1.5k to15m.
> >
> > Size of Array  1.5k   |15k |
> >  150k|   1.5m| 15m   |
> >
> > Origin 2.324ms | 19.834ms  | 194.991ms
> > | 1927.28ms|   19982.1ms  |
> >
> > With this patch   1.168ms  |  3.052ms   |
> > 21.888ms | 213.39ms  |2138.5ms   |
> >
> > All test for both PG and PL/Python are passed.
> >
> > Thanks very much.
> >
> >
>
> Hello,
> sounds like a really nice patch. I've started looking
> through the code and noticed a sort of a typos (or I just couldn't
> understand what did you mean) in comments.
>
> file "src/pl/plpython/plpy_typeio.c"
> the comment is
> * If can not convert if directly, fallback to PLyObject_ToDatum
> * to convert it
>
> Maybe it should be something like ("it" instead of second "if")
> * If can not convert it directly, fallback to PLyObject_ToDatum
> * to convert it
>
> And the same typo is repeated several times in comments.
>
> --
> Anthony Bykov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>


0001-Add-missing-type-conversion-functions-for-PL-Python-v2.patch
Description: Binary data


[PATCH] Add missing type conversion functions for PL/Python

2018-01-30 Thread Haozhou Wang
Hi All,

PL/Python already has different type conversion functions to
convert PostgreSQL datum to Python object. However, the conversion
functions from Python object to PostgreSQL datum only has Boolean,
Bytea and String functions.

In this patch, we add rest of Integer and Float related datatype conversion
functions
and can increase the performance of data conversion greatly especially
when returning a large array.

We did a quick test about the performance of returning array in PL/Python:

The following UDF is used for test:

```
CREATE OR REPLACE FUNCTION pyoutfloat8(num int) RETURNS float8[] AS $$
return [x/3.0 for x in range(num)]
$$ LANGUAGE plpythonu;
```

The test command is

```
select count(pyoutfloat8(n));
```

The count is used for avoid large output, where n is the number of element
in returned array, and the size is from 1.5k to15m.

Size of Array  1.5k   |15k |
 150k|   1.5m| 15m   |

Origin 2.324ms | 19.834ms  | 194.991ms
| 1927.28ms|   19982.1ms  |

With this patch   1.168ms  |  3.052ms   |  21.888ms |
   213.39ms  |2138.5ms   |

All test for both PG and PL/Python are passed.

Thanks very much.


-- 
Regards,
Haozhou


0001-Add-missing-type-conversion-functions-for-PL-Python.patch
Description: Binary data