Re: [HACKERS] issue log message to suggest VACUUM FULL if a table is nearly empty

2014-03-23 Thread Wang, Jing
On Thursday, 20 March 2014 2:45 PM, Amit Kapila amit.kapil...@gmail.com wrote:
On Wed, Mar 19, 2014 at 6:25 AM, Wang, Jing ji...@fast.au.fujitsu.com wrote:
 On Friday, 14 March 2014 2:42 PM, Amit Kapila amit.kapil...@gmail.com 
 wrote:
 I think it might be okay to even change this API to return the 
 FreeSpace, as the other place it is used is for Index Vacuum, so even if we 
 don't have any intention to print such a message for index in this patch, 
 but similar information could be useful there as well to suggest a user 
 that index has lot of free space.

 Enclosed please find the new patch which get the FreeSpace for one relation 
 from the return of FreeSpaceMapVacuum() function. This function and the 
 fsm_vacuum_page() function have been slightly modified to get the FreeSpace 
 and no I/O burden increasing. The little side-effect is it will calculate 
 FreeSpace for every table even the table is very small.

I think that can also be avoided, because by the time you call 
FreeSpaceMapVacuum(), you already have the required information based on which 
you can decide not to ask for freespace if required.

That will make the function FreeSpaceMapVacuum() look strange and be difficult 
to understand, so I think keeping the existing patch is better.  Cause the 
number of pages of FSM file is small , calculating FreeSpace for small table 
will not bring the burden in performance.

Can't we avoid the new calculation you have added in fsm_vacuum_page(), as 
this function already updates the size, so might be we can get it from current 
calculation done in function.

Sorry, I can't find that information from the current calculation. Could you 
give me some more detail information?

Kind regards
Jing Wang
Fujitsu Australia



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] issue log message to suggest VACUUM FULL if a table is nearly empty

2014-03-18 Thread Wang, Jing
On Friday, 14 March 2014 2:42 PM, Amit Kapila amit.kapil...@gmail.com wrote:
On Wed, Mar 12, 2014 at 12:22 PM, Haribabu Kommi kommi.harib...@gmail.com 
wrote:
 On Tue, Mar 11, 2014 at 2:59 PM, Amit Kapila amit.kapil...@gmail.com wrote:

 By the way have you checked if FreeSpaceMapVacuum() can serve your 
 purpose, because this call already traverses FSM in depth-first 
 order to update the freespace. So may be by using this call or 
 wrapper on this such that it returns total freespace as well apart 
 from updating freespace can serve the need.

 Thanks for information. we can get the table free space by writing 
 some wrapper or modify a little bit of FreeSpaceMapVacuum() function.

 I think it might be okay to even change this API to return the FreeSpace, as 
 the other place it is used is for Index Vacuum, so even if we don't have any 
 intention to print such a message for index in this patch, 
 but similar information could be useful there as well to suggest a user that 
 index has lot of free space.

Enclosed please find the new patch which get the FreeSpace for one relation 
from the return of FreeSpaceMapVacuum() function. This function and the 
fsm_vacuum_page() function have been slightly modified to get the FreeSpace and 
no I/O burden increasing. The little side-effect is it will calculate FreeSpace 
for every table even the table is very small.


Kind regards
Jing Wang
Fujitsu Australia


vacuum_v2.patch
Description: vacuum_v2.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] issue log message to suggest VACUUM FULL if a table is nearly empty

2014-03-09 Thread Wang, Jing
Hi 

 

Enclosed is the patch to implement the requirement that issue log
message to suggest VACUUM FULL if a table is nearly empty.

The requirement comes from the Postgresql TODO list.

 

[Benefit]

To find which table is nearly empty and suggest using 'VACUUM FULL' to
release the unused disk space this table occupied. 

 

[Analysis]

A table is nearly empty include two scenario:

1. The table occupy small disk size and contains few unused rows. 

2. The table occupy large disk size and contains large numbers of unused
rows. 

 

Obviously the requirement is used to release the disk in the scenario2.

 

[Solution details]

A check function is added in the function 'lazy_vacuum_rel' to check if
the table is large enough and contains large numbers of unused rows. If
it is then issue a log message that suggesting using 'VACUUM FULL' on
the table.

 

The judgement policy is as following:

If the relpage of the table  RELPAGES_VALUES_THRESHOLD(default 1000)
then the table is considered to be large enough.

If the free_space/total_space  FREESPACE_PERCENTAGE_THRESHOLD(default
0.5) then the table is considered to have large numbers of unused rows.

The free_space is calculated by reading the details from the FSM pages.
This may increase the IO, but expecting very less FSM pages thus it
shouldn't cause 

Any problems. Please let me know your suggestions.

 

[When the log message prints]

When executing SQL command 'VACUUM' or 'VACUUM on a table', this
function will be invoked and may issue the log message if the table
reach the condition.

When auto vacuum work and execute 'VACUUM on a table', this function
will be invoked and may issue the log message if the table reach the
condition.

 

[Example]

SELECT count(*) from t5;

 

count 

---

  3000

(1 row)

 

DELETE FROM t5 where f12900;

DELETE 2899

 

SELECT count(*) from t5;

count 

---

   101

(1 row)

 

LOG:  automatic vacuum of table wjdb.public.t5: index scans: 0

pages: 0 removed, 20
remain

tuples: 2899 removed,
101 remain, 0 are dead but not yet removable

buffer usage: 64 hits, 1
misses, 25 dirtied

avg read rate: 0.130
MB/s, avg write rate: 3.261 MB/s

system usage: CPU
0.00s/0.00u sec elapsed 0.05 sec

LOG:  Table t5 contains large numbers of unused row, suggest using
VACUUM FULL on it!

 

VACUUM t5;

LOG:  Table t5 contains large numbers of unused row, suggest using
VACUUM FULL on it!

 

 

Kind regards

Jing Wang

Fujitsu Australia

 



vacuum_v1.patch
Description: vacuum_v1.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump reporing version of server pg_dump as comments in the output

2014-03-03 Thread Wang, Jing
On 4 March 2014 2:41 Euler Taveira wrote:
On 27-02-2014 21:10, Wang, Jing wrote:
 Using pg_dump can dump the data into the file with format set to be 
 'c','t' or plain text. In the existing version the version of server
 
 pg_dump is already there when the format of file is 'c' or 't'. And 
 even for the plain text format file the version of server  pg_dump
is 
 already there if using '--verbose' in pg_dump. Using '--verbose'
leads 
 to some many other prints which are not required always.
 
I don't buy your argument. Why isn't verbose option sufficient? Did you
read the old thread about this [1]?
[1] http://www.postgresql.org/message-id/3677.1253912...@sss.pgh.pa.us

AFAICS a lot of people compare pg_dump diffs. If we apply this patch,
it would break those applications. Also, it is *already* available if
you add verbose option (which is sufficient to satisfy those that want
the client and/or 
server version) in plain mode (the other modes already include the
desired info by default). In the past, timestamps were removed to avoid
noise in diffs.

Sorry, I don't understand which application will break?  Can you give me
more detail information?
Timestamps always vary which do affect the diff.  But I can't imagine
why adding the version will affect the pg_dump diffs.
I don't think the version number vary  frequently.

Kind regards
Jing Wang
Fujitsu Australia



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_dump reporing version of server pg_dump as comments in the output

2014-02-27 Thread Wang, Jing
Enclosed is the patch to implement the requirement that pg_dump should
report version of server  pg_dump as comments in the output. 

 

[Benefit]

By running head on pg_dump output, you can readily discover what
version of PostgreSQL was used to generate that dump. Very useful
especially for mouldy old database dumps.

The benefit of this requirement is to let user clearly understand from
which version the dump output file will be insert into which version
database server and easy handle the problems of Incompatibility
versions.

 

[Analysis]

Using pg_dump can dump the data into the file with format set to be
'c','t' or plain text. In the existing version the version of server 
pg_dump is already there when the format of file is 'c' or 't'. And even
for the plain text format file the version of server  pg_dump is
already there if using '--verbose' in pg_dump. Using '--verbose' leads
to some many other prints which are not required always. 

 

So the requirement is dump the version of server  pg_dump as comment
into the plain text format output file even without using '--verbose'
option.

 

[Solution details]

The main change is in the pg_backup_archiver.c file, in the function
'RestoreArchive' the version of server  pg_dump is only print when
finding the '--verbose' option to be used in current version.  Now we
just let the printing works even without finding the '--verbose' option.

 

[what is changed when applying the patch]

1. The output file which is created by pg_dump with format set to be
'plain text' and without using '--verbose' option will include the
version of server  pg_dump. One example is  as following:

--

-- PostgreSQL database dump

--

 

-- Dumped from database version 9.2.4

-- Dumped by pg_dump version 9.4devel

 

SET statement_timeout = 0;

SET lock_timeout = 0;

SET client_encoding = 'UTF8';

...

 

2. The output file which is created by pg_dumpall with format set to be
'plain text' and without using '--verbose' option will include the
version of server  pg_dump. The example is as following:

 

--

-- PostgreSQL database cluster dump

--

 

SET default_transaction_read_only = off;

 

...

 

\connect connectdb

 

SET default_transaction_read_only = off;

 

--

-- PostgreSQL database dump

--

 

-- Dumped from database version 9.2.4

-- Dumped by pg_dump version 9.4devel

 

SET statement_timeout = 0;

SET lock_timeout = 0;

SET client_encoding = 'UTF8';

SET standard_conforming_strings = on;

 

...

 

\connect postgres

 

SET default_transaction_read_only = off;

 

--

-- PostgreSQL database dump

--

 

-- Dumped from database version 9.2.4

-- Dumped by pg_dump version 9.4devel

 

SET statement_timeout = 0;

SET lock_timeout = 0;

SET client_encoding = 'UTF8';

SET standard_conforming_strings = on;

SET check_function_bodies = false;

 

 

3. The version of server and pg_dump will be dumped into the output
file. The output file is created by the following command:

pg_restore inputFile -f output.sql 

 

One example is as following:

--

-- PostgreSQL database dump

--

 

-- Dumped from database version 9.2.4

-- Dumped by pg_dump version 9.4devel

 

SET statement_timeout = 0;

SET lock_timeout = 0;

SET client_encoding = 'UTF8';

SET standard_conforming_strings = on;

...

 

 

Kind regards

Jing



Re: [HACKERS] pg_dump reporing version of server pg_dump as comments in the output

2014-02-27 Thread Wang, Jing
Sorry for missing the patch file in the original email.  Enclosed please
find it.

 

Jing Wang

Fujitsu Australia

 

 

From: Arulappan, Arul Shaji 
Sent: Friday, 28 February 2014 11:21 AM
To: Wang, Jing
Subject: RE: [HACKERS] pg_dump reporing version of server  pg_dump as
comments in the output
Importance: High

 

Jing, You missed the patch attachement.

 

Rgds,

Arul Shaji

 

 

From: pgsql-hackers-ow...@postgresql.org [
mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Wang, Jing
Sent: Friday, 28 February 2014 11:11 AM
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] pg_dump reporing version of server  pg_dump as
comments in the output

 

Enclosed is the patch to implement the requirement that pg_dump should
report version of server  pg_dump as comments in the output. 

 

[Benefit]

By running head on pg_dump output, you can readily discover what
version of PostgreSQL was used to generate that dump. Very useful
especially for mouldy old database dumps.

The benefit of this requirement is to let user clearly understand from
which version the dump output file will be insert into which version
database server and easy handle the problems of Incompatibility
versions.

 

[Analysis]

Using pg_dump can dump the data into the file with format set to be
'c','t' or plain text. In the existing version the version of server 
pg_dump is already there when the format of file is 'c' or 't'. And even
for the plain text format file the version of server  pg_dump is
already there if using '--verbose' in pg_dump. Using '--verbose' leads
to some many other prints which are not required always. 

 

So the requirement is dump the version of server  pg_dump as comment
into the plain text format output file even without using '--verbose'
option.

 

[Solution details]

The main change is in the pg_backup_archiver.c file, in the function
'RestoreArchive' the version of server  pg_dump is only print when
finding the '--verbose' option to be used in current version.  Now we
just let the printing works even without finding the '--verbose' option.

 

[what is changed when applying the patch]

1. The output file which is created by pg_dump with format set to be
'plain text' and without using '--verbose' option will include the
version of server  pg_dump. One example is  as following:

--

-- PostgreSQL database dump

--

 

-- Dumped from database version 9.2.4

-- Dumped by pg_dump version 9.4devel

 

SET statement_timeout = 0;

SET lock_timeout = 0;

SET client_encoding = 'UTF8';

...

 

2. The output file which is created by pg_dumpall with format set to be
'plain text' and without using '--verbose' option will include the
version of server  pg_dump. The example is as following:

 

--

-- PostgreSQL database cluster dump

--

 

SET default_transaction_read_only = off;

 

...

 

\connect connectdb

 

SET default_transaction_read_only = off;

 

--

-- PostgreSQL database dump

--

 

-- Dumped from database version 9.2.4

-- Dumped by pg_dump version 9.4devel

 

SET statement_timeout = 0;

SET lock_timeout = 0;

SET client_encoding = 'UTF8';

SET standard_conforming_strings = on;

 

...

 

\connect postgres

 

SET default_transaction_read_only = off;

 

--

-- PostgreSQL database dump

--

 

-- Dumped from database version 9.2.4

-- Dumped by pg_dump version 9.4devel

 

SET statement_timeout = 0;

SET lock_timeout = 0;

SET client_encoding = 'UTF8';

SET standard_conforming_strings = on;

SET check_function_bodies = false;

 

 

3. The version of server and pg_dump will be dumped into the output
file. The output file is created by the following command:

pg_restore inputFile -f output.sql 

 

One example is as following:

--

-- PostgreSQL database dump

--

 

-- Dumped from database version 9.2.4

-- Dumped by pg_dump version 9.4devel

 

SET statement_timeout = 0;

SET lock_timeout = 0;

SET client_encoding = 'UTF8';

SET standard_conforming_strings = on;

...

 

 

Kind regards

Jing



pg_dump.patch
Description: pg_dump.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers