Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-10-29 Thread Syed, Rahila
Hello,

Please find attached an updated patch.

>Flag isn't reset on error.
Corrected in the attached.

> + pgstat_reset_activityflag;
>Does this actually compile?
It does compile, but as a statement with no effect. It has been corrected.

>snprintf()?  I don't think you need to keep track of schemaname_len at all.
memcpy() has been replaced by snprintf() to avoid calculating schemaname_len.
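As a minimal sketch of that change (the buffer size and function name here are illustrative, not the patch's actual identifiers), snprintf() both copies and NUL-terminates in one call, so no separate length bookkeeping is needed:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAMEDATALEN 64

/* Copy "schema.table" into a fixed-size buffer.  snprintf() truncates and
 * NUL-terminates for us, so no schemaname_len tracking is required. */
static void report_table_name(char *dst, size_t dstlen,
                              const char *schemaname, const char *relname)
{
    snprintf(dst, dstlen, "%s.%s", schemaname, relname);
}
```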

>In fact, I wonder if you need to send total_pages at all -- surely the client 
>can add both total_heap_pages and total_index_pages by itself ...
This has  been corrected in the attached patch.

>It seems a bit strange that the remaining progress_param entries are not 
>initialized to anything.  Also, why aren't the number of params of each type 
>saved too?  
The number of params for each command remains constant, hence it has been hardcoded.

>In the receiving code you check whether each value equals 0, and if it does 
>then report NULL, but imagine vacuuming a table with no indexes where the 
>number of index pages is going to be zero.
>Shouldn't we display zero there rather than null?
Agreed. IIUC, NULL should rather be used when a value is invalid, but for valid values like 'zero index pages' it is clearer to display 0. This has been corrected in the attached.

>This patch lacks a comment somewhere explaining how this whole thing works.
I have added a few lines in pgstat.h inside the PgBackendStatus struct.

>I believe you don't need this include.
Corrected.

>This not only adds an unnecessary empty line at the end of the struct 
>declaration, but also fails to preserve the "st_" prefix used in all the other 
>fields
Corrected.

Thank you,
Rahila Syed


__
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.

Vacuum_progress_checker_v6.patch
Description: Vacuum_progress_checker_v6.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-10-20 Thread Syed, Rahila
Hello,

>I think that you should add the flag or something which indicates whether this 
>backend is running VACUUM or not, into PgBackendStatus.
>pg_stat_vacuum_progress should display the entries of only backends with that 
>flag set true. This design means that you need to set the flag to true when 
>starting VACUUM and reset at the end of VACUUM progressing.
Please find attached an updated patch which adds a flag in PgBackendStatus indicating whether this backend is running VACUUM.
Also, the pgstat_report_progress function has been changed to make it generic for all commands reporting progress.

Thank you,
Rahila Syed





Vacuum_progress_checker_v5.patch
Description: Vacuum_progress_checker_v5.patch



Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-10-06 Thread Syed, Rahila
Hello,

Please check the attached patch, as the earlier one had a typo in the regression test output.

>+#define PG_STAT_GET_PROGRESS_COLS30
>Why did you use 30?
That comes from N_PROGRESS_PARAM * 3, where N_PROGRESS_PARAM = 10 is the number of progress parameters of each type stored in shared memory.
There are three such types (int, float, string), hence the total number of progress parameters can be 30.
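For reference, the arithmetic behind the 30 columns can be written out with the constants named in the thread (the N_PROGRESS_TYPES macro is mine, added only to make the multiplication explicit):

```c
#include <assert.h>

#define N_PROGRESS_PARAM 10   /* per-type slots in shared memory */
#define N_PROGRESS_TYPES 3    /* int, float, string */

/* Total result columns exposed by the progress-reporting function. */
#define PG_STAT_GET_PROGRESS_COLS (N_PROGRESS_PARAM * N_PROGRESS_TYPES)
```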

Thank you,
Rahila Syed




Vacuum_progress_checker_v4.patch
Description: Vacuum_progress_checker_v4.patch



Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-10-06 Thread Syed, Rahila
Hello Fujii-san,

>Here are another review comments
Thank you for review. Please find attached an updated patch. 

> You removed some empty lines, for example, in vacuum.h.
>Which seems useless to me.
Has been corrected in the attached.

>Why did you use an array to store the progress information of VACUUM?
>I think that it's better to use separate specific variables for them for 
>better code readability, for example, variables scanned_pages, 
>heap_total_pages, etc.
It is designed this way to keep it generic for all commands, which can store different progress parameters in shared memory.

>Currently only progress_param_float[0] is used. So there is no need to use an 
>array here.
Same as before. This is for later use by other commands.

>progress_param_float[0] saves the percentage of VACUUM progress.
>But why do we need to save that information into shared memory?
>We can just calculate the percentage whenever pg_stat_get_vacuum_progress() is 
>executed, instead. There seems to be no need to save that information.
This has been corrected in the attached.

>char progress_message[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM];
>As Sawada pointed out, there is no user of this variable.
I have used it to store the table name in the updated patch. It can also be used to display index names and the current phase of VACUUM.
This has not been included in the patch yet, to avoid cluttering the display with too much information.

>For example, st_activity of autovacuum worker displays "autovacuum ...".
>So as Sawada reported, he could not find any entry for autovacuum in 
>pg_stat_vacuum_progress.
In the attached patch, I have performed the check for autovacuum as well.

>I think that you should add the flag or something which indicates whether this 
>backend is running VACUUM or not, into PgBackendStatus.
>pg_stat_vacuum_progress should display the entries of only backends with that 
>flag set true. This design means that you need to set the flag to true when 
>starting VACUUM and reset at the end of VACUUM progressing.
This design seems better in the sense that we don't rely on the st_activity entry to display progress values.
A variable which stores flags for running commands can be created in PgBackendStatus. These flags can be used at the time of displaying the progress of a particular command.
 
>That is, non-superuser should not be allowed to see the 
>pg_stat_vacuum_progress entry of superuser running VACUUM?
This has been included in the updated patch.

>This code may cause total_pages and total_heap_pages to decrease while VACUUM 
>is running.
Yes. This is because the initial count of total pages to be vacuumed and the pages which are actually vacuumed can vary depending on the visibility of tuples.
Pages which are all-visible are skipped and hence are subtracted from the total page count.
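The effect on the reported totals can be illustrated with a toy version of the counters (the struct and function names here are illustrative, not the patch's actual identifiers):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative progress counters; the patch keeps these in shared memory. */
typedef struct VacuumProgress
{
    int64_t total_heap_pages;   /* may shrink as all-visible pages are skipped */
    int64_t scanned_heap_pages;
} VacuumProgress;

/* When VACUUM decides to skip a run of all-visible pages, they are removed
 * from the total rather than counted as scanned, so total_heap_pages can
 * decrease while VACUUM is running. */
static void skip_all_visible(VacuumProgress *p, int64_t nskipped)
{
    p->total_heap_pages -= nskipped;
}
```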


Thank you,
Rahila Syed 



Vacuum_progress_checker_v4.patch
Description: Vacuum_progress_checker_v4.patch



Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-09-22 Thread Syed, Rahila
Hello,

Please find attached patch with bugs reported by Thom and Sawada-san solved.

>* The progress of vacuum by autovacuum seems not to be displayed.
The progress is stored in shared variables during autovacuum. I guess the reason the entries are not visible is that they are deleted as soon as the process exits.
But the progress can be viewed while the autovacuum worker is running.

Thank you,
Rahila Syed



Vacuum_progress_checker_v3.patch
Description: Vacuum_progress_checker_v3.patch



Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-09-15 Thread Syed, Rahila

Hello Thom,

>Okay, I've just tested this with a newly-loaded table (1,252,973 of jsonb 
>data),
Thanks a lot!

>but after it's finished, I end up with this:
>json=# select * from pg_stat_vacuum_progress;
>-[ RECORD 1 ]---+---
>pid | 5569
>total_pages | 217941
>scanned_pages   | 175243
>total_heap_pages    | 175243
>scanned_heap_pages  | 175243
>total_index_pages   | 42698
>scanned_index_pages | 
>percent_complete    | 80
>This was running with a VACUUM ANALYZE.  This output seems to suggest that it 
>didn't complete.

OK. The patch fails here because 'total pages to be scanned' takes index pages into account, and no index pages were actually scanned.
So the scanned pages count does not reach the total pages count. I will fix this.
It seems that no index pages were scanned because there were no dead tuples to be cleaned, as the table was newly loaded.

>After, I ran VACUUM FULL.  pg_stat_vacuum_progress didn't change from before, 
>so that doesn't appear to show up in the view.
The scope of this patch is to report the progress of basic VACUUM; it does not take VACUUM FULL into account yet. I think this can be included once basic VACUUM progress is done.

>I then deleted 40,000 rows from my table, and ran VACUUM ANALYZE again.  This 
>time it progressed and percent_complete reached 100
OK.

Thank you,
Rahila Syed.





Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-09-11 Thread Syed, Rahila

Hello,

Please find attached an updated VACUUM progress checker patch.
The following has been accomplished in the patch:

1. Accounts for the index pages count while calculating the total progress of VACUUM.
2. A common location for storing progress parameters for any command. The idea is that every command which needs to report progress can populate and interpret the shared variables in its own way. Each can display progress by implementing a separate view.
3. A separate VACUUM progress view to display various progress parameters has been implemented. The progress of phases like the heap scan and index scan, the total pages scanned, and the completion percentage are reported.
4. This view can display progress for all active backends running VACUUM.

Basic testing has been performed. Thorough testing is yet to be done. Marking it as Needs Review in the September commitfest.

ToDo:
Display the count of heap pages actually vacuumed (line pointers marked unused).
Display the percentage of work_mem being used to store dead tuples.

Thank you,
Rahila Syed



Vacuum_progress_checker_v2.patch
Description: Vacuum_progress_checker_v2.patch



Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-08-13 Thread Syed, Rahila
Hello,

>Autovacuum knows what % of a table needs to be cleaned - that is how it is 
>triggered. 
>When a vacuum runs we should calculate how many TIDs we will collect and 
>therefore how many trips to the indexes we need for given memory.
>We can use the VM to find out how many blocks we'll need to scan in the table. 
>So overall, we know how many blocks we need to scan.

Total heap pages to be scanned can be obtained from the VM as mentioned. To figure out the number of index scans, we need an estimate of dead tuples.

IIUC, autovacuum learns that a table has to be cleaned by looking at the pgstat entry for the table, i.e. n_dead_tuples.
Hence, an initial estimate of dead tuple TIDs can be made using n_dead_tuples in pgstat.
Note that n_dead_tuples in the pgstat table entry is the value updated by the last analyze and may not be up to date.
In cases where the pgstat entry for the table is NULL, the number of dead tuple TIDs cannot be estimated.
In such cases, we can start with an initial estimate of one index scan and later add the number of index pages to the total page count (heap + index) each time the count of index scans exceeds the estimate.
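A back-of-the-envelope version of this estimate could look as follows. The function name and the fallback behaviour are my assumptions; the sizing itself follows how VACUUM fills its dead-TID array from maintenance_work_mem, with each TID taking 6 bytes (sizeof(ItemPointerData)):

```c
#include <assert.h>
#include <stdint.h>

#define TID_SIZE 6  /* sizeof(ItemPointerData) */

/* Estimate how many index-scan passes VACUUM will need: one pass each time
 * the dead-TID array (sized from maintenance_work_mem, in kB) fills up.
 * With no usable pgstat estimate, assume a single pass. */
static int64_t
estimate_index_passes(int64_t n_dead_tuples, int64_t maintenance_work_mem_kb)
{
    int64_t max_tids = (maintenance_work_mem_kb * 1024) / TID_SIZE;

    if (n_dead_tuples <= 0)
        return 1;
    return (n_dead_tuples + max_tids - 1) / max_tids;   /* ceiling division */
}
```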

Thank you,
Rahila Syed.

 




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-08-10 Thread Syed, Rahila
Hello,

>When we're in Phase2 or 3, don't we need to report the number of total page 
>scanned or percentage of how many table pages scanned, as well?
The total heap pages scanned need to be reported with phases 2 and 3. A complete progress report needs to have numbers from each phase when applicable.

> Phase 1. Report 2 integer counters: heap pages scanned and total heap 
> pages,
> 1 float counter: percentage_complete and progress message.
> Phase 2. Report 2 integer counters: index pages scanned and total 
> index pages(across all indexes) and progress message.
> Phase 3. 1 integer counter: heap pages vacuumed.

Sorry for being unclear here. What I meant to say is that each phase of a command will correspond to a slot in COMMAND_NUM_SLOTS. Each phase will be a separate array element and will comprise n integers, n floats, and a string. So, in the view reporting progress, the VACUUM command can have three entries, one for each phase.
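The slot layout being described might look like this. All identifiers and sizes below are assumptions made for illustration, not the patch's actual definitions:

```c
#include <assert.h>

#define N_PROGRESS_INT    10
#define N_PROGRESS_FLOAT  10
#define PROGRESS_MSG_LEN  80
#define VACUUM_NUM_PHASES 3   /* heap scan, index scan, heap vacuum */

/* One slot per phase: n integers, n floats, and one message string. */
typedef struct ProgressSlot
{
    int    progress_int[N_PROGRESS_INT];
    double progress_float[N_PROGRESS_FLOAT];
    char   progress_message[PROGRESS_MSG_LEN];
} ProgressSlot;

/* A command reserves one slot per phase in shared memory, so a view can
 * show one row per (backend, phase) pair. */
typedef struct CommandProgress
{
    ProgressSlot phase[VACUUM_NUM_PHASES];
} CommandProgress;
```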

Thank you,
Rahila Syed






Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-16 Thread Syed, Rahila
Hello,

>Naming the GUC pgstat* seems a little inconsistent.
Sorry, there is a typo in the mail. The GUC name is 'track_activity_progress'.

>Also, adding the new GUC to src/backend/utils/misc/postgresql.conf.sample
>might be helpful 
Yes.  I will update.

Thank you,
Rahila Syed




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-03 Thread Syed, Rahila
Hello,

>TBH, I think that designing this as a hook-based solution is adding a whole 
>lot of complexity for no value.  The hard parts of the problem are collecting 
>the raw data and making the results visible to users, and both of those 
>require involvement of the core code.  Where is the benefit from pushing some 
>trivial >intermediate arithmetic into an external module?
>If there's any at all, it's certainly not enough to justify problems such as 
>you mention here.

>So I'd just create a "pgstat_report_percent_done()" type of interface in 
>pgstat.c and then teach VACUUM to call it directly.

Thank you for the suggestion. I agree that adding the code in core will reduce complexity with no additional overhead.
Going by the consensus, I will update the patch with code to collect and store progress information from VACUUM in pgstat.c, and a UI using the pg_stat_activity view.

Thank you,
Rahila Syed





Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-07-02 Thread Syed, Rahila
Hello,

>Unless I am missing something, I guess you can still keep the actual code that 
>updates counters outside the core if you adopt an approach that Simon suggests.
Yes. The code to extract progress information from VACUUM and store it in shared memory can be outside core even with pg_stat_activity as the user interface.

>Whatever the view (existing/new), any related counters would have a valid 
>(non-NULL) value when read off the view iff hooks are set perhaps because you 
>have an extension that sets them. 
>I guess that means any operation that "supports" progress tracking would have 
>an extension with suitable hooks implemented.
Do you mean to say that any operation/application that wants the progress-tracking feature will dynamically load the progress checker module, which will set the hooks for progress reporting?
If yes, unless I am missing something, such dynamic loading cannot happen if we use pg_stat_activity, as it gets its values from shared memory. The module has to be a shared_preload_library to allocate shared memory, which means it can be loaded only at server restart. Am I missing something?

Thank you,
Rahila Syed







Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2015-06-30 Thread Syed, Rahila
Hello,

>There's no need to add those curly braces, or to subsequent if blocks
Yes, those were added by mistake.

>Also, is this patch taking the visibility map into account for its 
>calculations?
Yes, it subtracts skippable/all-visible pages from the total pages to be scanned.
For each page processed by lazy_scan_heap, if the number of all-visible pages ahead exceeds the threshold, that run is subtracted from the 'total pages to be scanned' count.

The all-visible pages are accounted for incrementally during the execution of VACUUM, not before starting the process.
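The incremental accounting could be sketched like this. SKIP_PAGES_THRESHOLD is the real constant from vacuumlazy.c; the function and counter names are illustrative:

```c
#include <assert.h>

#define SKIP_PAGES_THRESHOLD 32  /* as in src/backend/commands/vacuumlazy.c */

/* Called as the scan advances: if the run of all-visible pages ahead is
 * long enough that lazy_scan_heap will skip it, remove that run from the
 * remaining total instead of counting it as work to do. */
static long
adjust_total_pages(long total_pages, long all_visible_run)
{
    if (all_visible_run > SKIP_PAGES_THRESHOLD)
        total_pages -= all_visible_run;
    return total_pages;
}
```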

Thank you,
Rahila Syed

From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Thom Brown
Sent: Tuesday, June 30, 2015 2:20 PM
To: Rahila Syed
Cc: PostgreSQL-development
Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

On 30 June 2015 at 08:37, Rahila Syed <rahilasye...@gmail.com> wrote:
Hello Hackers,

Following is a proposal for feature to calculate VACUUM progress.

Use Case : Measuring progress of long running VACUUMs to help DBAs make 
informed decision
whether to continue running VACUUM or abort it.

Design:

A shared preload library to store progress information from different backends running VACUUM, calculate the remaining time for each, and display progress in the form of a view.

VACUUM needs to be instrumented with a hook to collect progress information (pages vacuumed/scanned) periodically.
The attached patch implements a new hook to store the vacuumed_pages and scanned_pages counts at the end of each page scanned by VACUUM.
This information is stored in a shared memory structure.
In addition to measuring progress, the hook function also calculates the remaining time for VACUUM.


The frequency of collecting progress information can be reduced by adding delays between hook function calls.
Also, a GUC parameter log_vacuum_min_duration can be used. This causes VACUUM progress to be calculated only if VACUUM runs for more than the specified number of milliseconds.
A value of zero calculates VACUUM progress for each page processed; -1 disables logging.

Progress calculation :

percent_complete = scanned_pages * 100 / total_pages_to_be_scanned;

remaining_time = elapsed_time * (total_pages_to_be_scanned - scanned_pages) / 
scanned_pages;
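The two formulas above, as a directly runnable sketch (the function names are mine; the second formula assumes a constant scan rate and at least one scanned page):

```c
#include <assert.h>

/* percent_complete = scanned_pages * 100 / total_pages_to_be_scanned */
static int
percent_complete(long scanned_pages, long total_pages)
{
    return (int) (scanned_pages * 100 / total_pages);
}

/* remaining_time = elapsed_time * (total - scanned) / scanned,
 * i.e. extrapolate the observed rate over the pages still to scan. */
static double
remaining_time(double elapsed_time, long scanned_pages, long total_pages)
{
    return elapsed_time * (double) (total_pages - scanned_pages) / scanned_pages;
}
```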

Shared memory struct:

typedef struct PgStat_VacuumStats
{
    Oid     databaseoid;
    Oid     tableoid;
    int32   vacuumed_pages;
    int32   total_pages;
    int32   scanned_pages;
    double  elapsed_time;
    double  remaining_time;
} PgStat_VacuumStats[max_connections];

Reporting:
A view named 'pg_maintenance_progress' can be created using the values in the struct above.
It can be queried from any other backend and will display the progress of each running VACUUM.

Other uses of the hook in VACUUM:

The cost of VACUUM in terms of pages hit, missed, and dirtied, the same figures autovacuum reports, can be collected using this hook.
Autovacuum does it at the end of VACUUM for each table; with the hook it can be done while VACUUM on a table is in progress.
This can be helpful to track manual VACUUMs as well, not just autovacuum.

Read/write (I/O) rates can be computed along the lines of autovacuum.
Read-rate patterns can be used to help tune future vacuums of the table (like shared buffers tuning).
Other resource usage can also be collected using the progress checker hook.

Attached patch is POC patch of progress calculation for a single backend.
Also attached is a brief snapshot of the output log.

@@ -559,7 +567,9 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
  * following blocks.
  */
 if (next_not_all_visible_block - blkno > SKIP_PAGES_THRESHOLD)
+{
 skipping_all_visible_blocks = true;
+}
There's no need to add those curly braces, or to subsequent if blocks.
Also, is this patch taking the visibility map into account for its calculations?

--
Thom



Re: [HACKERS] [Proposal] More Vacuum Statistics

2015-06-16 Thread Syed, Rahila
Hello,

>Maybe, For DBAs,
>It might be better to show vacuum progress in pg_stat_activity.
>(if we'd do, add a free-style column like "progress" ?) This column might also 
>be able to use for other long time commands like ANALYZE, CREATE/RE INDEX and 
>COPY. To realize this feature, we certainly need to properly change 
>pgstat_report_activity, use it more and add a new track-activity parameter
A very similar idea was proposed in the following:
http://www.postgresql.org/message-id/1284756643.25048.42.ca...@vanquo.pezone.net

IIUC, the problem with showing progress in pg_stat_activity is that it introduces compulsory progress calculation overhead in core for every command.
As the work units of each command vary, a common infrastructure might not be able to represent every command's progress effectively.
An architecture which displays progress for each command separately, and only on the user's demand, will be more efficient.
So, the suggestion was rather to have a detailed progress report, including the remaining time, for a command on the user's demand.

FWIW, I am working on designing an approach to report VACUUM progress stats, for which I will be posting a detailed proposal.
The use case is reporting progress for long-running VACUUMs. The approach involves using hooks to extract VACUUM progress statistics.
The progress can be displayed using a psql view (e.g. pg_stat_maintenance).

Thank you,
Rahila Syed


-Original Message-
From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Naoya Anzai
Sent: Tuesday, June 16, 2015 8:41 AM
To: Tomas Vondra
Cc: pgsql-hackers@postgresql.org; Akio Iwaasa; bench.cof...@gmail.com; Tom 
Lane; Jeff Janes; Jim Nasby; Andres Freund; Alvaro Herrera
Subject: Re: [HACKERS] [Proposal] More Vacuum Statistics

Hi,

Thank you for the comments, and sorry for my late response.

>> 
>> pg_stat_vacuum view
>> 
>>
>> I understand it is not good to simply add more counters in 
>> pg_stat_*_tables. For now, I'd like to suggest an extension which can 
>> confirm vacuum statistics like pg_stat_statements.
>>
>> Similar feature has been already provided by pg_statsinfo package.
>> But it is a full-stack package for PG-stats and it needs to redesign 
>> pg_log and design a repository database for introduce.
>> And it is not a core-extension for PostgreSQL.
>> (I don't intend to hate pg_statsinfo,
>>   I think this package is a very convinient tool)
>>
>> Everyone will be able to do more easily tuning of VACUUM.
>> That's all I want.
>
>I'm still wondering whether these stats will really make the tuning any 
>easier. What I do right now is looking at pg_stat_all_tables.n_dead_tup 
>and if it exceeds some threshold, it's a sign that vacuum may need a 
>bit of tuning. Sometimes it really requires tuning vacuum itself, but 
>more often than not it's due to something else (a large bulk delete, 
>autovacuum getting stuck on another table, ...). I don't see how the 
>new stats would make this any easier.
>
>Can you give some examples on how the new stats might be used (and 
>where the current stats are insufficient)? What use cases do you 
>imagine for those stats?

pg_stat_vacuum can keep histories of vacuum statistics for each table/index in shared memory. (They are not only the last vacuum; that is already possible to confirm using pg_stat_all_tables.) It makes analysis of vacuum histories easier because this view can sort, aggregate, or filter.

My use cases for those stats are the following.

- examine the TRANSITION of vacuum execution time on any table (you can predict the future vacuum execution time)
- examine the EXECUTION INTERVAL of vacuum for each table (if too frequent, the vacuum threshold should be tuned up)
- examine the REST of dead tuples just after vacuum (if dead tuples remain, it may be due to idle-in-transaction sessions)

>
>It might help differentiate the autovacuum activity from the rest of 
>the system (e.g. there's a lot of I/O going on - how much of that is 
>coming from autovacuum workers?). This would however require a more 
>fine-grained reporting, because often the vacuums run for a very long 
>time, especially on very large tables (which is exactly the case when 
>this might be handy) - I just had a VACUUM that ran for 12 hours. These 
>jobs should report the stats incrementally, not just once at the very 
>end, because that makes it rather useless IMNSHO.

+1

Certainly, VACUUM often has a long execution time; I just had one too.
At present, we cannot predict when such a vacuum will finish, what it is doing now, or whether it has a problem.

Maybe, For DBAs,
It might be better to show vacuum progress in pg_stat_activity.
(if we'd do, add a free-style column like "progress" ?) This column might also 
be able to use for other long time commands like ANALYZE, CREATE/RE INDEX and 
COPY. To realize this feature, we certainly need to properly change 
pgstat_report_activity, use it more and a

Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-03-05 Thread Syed, Rahila

Hello,

Please find attached a patch. As discussed, the flag to denote compression and the presence of a hole in the block image has been added to XLogRecordImageHeader rather than the block header.

Following are WAL numbers based on the attached test script posted by Michael earlier in the thread.

                      WAL generated
FPW compression on    122.032 MB
FPW compression off   155.223 MB
HEAD                  155.236 MB

Compression: 21%
Number of block images generated in WAL: 63637


Thank you,
Rahila Syed




Support-compression-of-full-page-writes-in-WAL_v23.patch
Description: Support-compression-of-full-page-writes-in-WAL_v23.patch


compress_run.bash
Description: compress_run.bash



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-03-04 Thread Syed, Rahila
Hello,

>Are there any other flag bits that we should or are planning to add into WAL 
>header newly, except the above two? If yes and they are required by even a 
>block which doesn't have an image, I will change my mind and agree to add 
>something like chunk ID to a block header. 
>But I guess the answer of the question is No. Since the flag bits now we are 
>thinking to add are required only by a block having an image, adding them into 
>a block header (instead of block image header) seems a waste of bytes in WAL. 
>So I concur with Michael.
I agree.
As per my understanding, this change of xlog format was to provide for future 
enhancement which would need flags relevant to entire block.
But as mentioned, currently the flags being added are related to block image 
only. Hence for this patch it makes sense to add a field to 
XLogRecordImageHeader rather than block header. 
This will also save bytes per WAL record. 

Thank you,
Rahila Syed

 





Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-03-03 Thread Syed, Rahila
Hello,

>It would be good to get those problems fixed first. Could you send an updated 
>patch?

Please find attached updated patch with WAL replay error fixed. The patch 
follows chunk ID approach of xlog format.

Following are brief measurement numbers.

                      WAL
FPW compression on    122.032 MB
FPW compression off   155.239 MB
HEAD                  155.236 MB


Thank you,
Rahila Syed



Support-compression-of-full-page-writes_v22.patch
Description: Support-compression-of-full-page-writes_v22.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-24 Thread Syed, Rahila
Hello,

>I've not read this logic yet, but ISTM there is a bug in that new WAL format 
>because I got the following error and the startup process could not replay any 
>WAL records when I set up replication and enabled wal_compression.

>LOG:  record with invalid length at 0/3B0
>LOG:  record with invalid length at 0/3000518
>LOG:  Invalid block length in record 0/30005A0
>LOG:  Invalid block length in record 0/3000D60 ...

Please find attached a patch which fixes the WAL replay errors.

Thank you,
Rahila Syed

-Original Message-
From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Fujii Masao
Sent: Monday, February 23, 2015 5:52 PM
To: Rahila Syed
Cc: PostgreSQL-development; Andres Freund; Michael Paquier
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Mon, Feb 23, 2015 at 5:28 PM, Rahila Syed  wrote:
> Hello,
>
> Attached is a patch which has following changes,
>
> As suggested above block ID in xlog structs has been replaced by chunk ID.
> Chunk ID is used to distinguish between different types of xlog record 
> fragments.
> Like,
> XLR_CHUNK_ID_DATA_SHORT
> XLR_CHUNK_ID_DATA_LONG
> XLR_CHUNK_BKP_COMPRESSED
> XLR_CHUNK_BKP_WITH_HOLE
>
> In block references, block ID follows the chunk ID. Here block ID 
> retains its functionality.
> This approach increases data by 1 byte for each block reference in an 
> xlog record. This approach separates ID referring different fragments 
> of xlog record from the actual block ID which is used to refer  block 
> references in xlog record.

I've not read this logic yet, but ISTM there is a bug in that new WAL format 
because I got the following error and the startup process could not replay any 
WAL records when I set up replication and enabled wal_compression.

LOG:  record with invalid length at 0/3B0
LOG:  record with invalid length at 0/3000518
LOG:  Invalid block length in record 0/30005A0
LOG:  Invalid block length in record 0/3000D60 ...

Regards,

--
Fujii Masao





Support-compression-for-full-page-writes-in-WAL_v20.patch
Description: Support-compression-for-full-page-writes-in-WAL_v20.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-18 Thread Syed, Rahila
Hello,

>I think we should change the xlog format so that the block_id (which currently 
>is XLR_BLOCK_ID_DATA_SHORT/LONG or an actual block id) isn't the block id but 
>something like XLR_CHUNK_ID. Which is used as is for 
>XLR_CHUNK_ID_DATA_SHORT/LONG, but for backup blocks can be set to to 
>>XLR_CHUNK_BKP_WITH_HOLE, XLR_CHUNK_BKP_COMPRESSED, XLR_CHUNK_BKP_REFERENCE... 
>The BKP blocks will then follow, storing the block id following the chunk id.

>Yes, that'll increase the amount of data for a backup block by 1 byte, but I 
>think that's worth it. I'm pretty sure we will be happy about the added 
>extensibility pretty soon.

To clarify my understanding of the above change,

Instead of a block id to reference different fragments of an xlog record, a 
single-byte field "chunk_id" should be used. chunk_id will be the same as 
XLR_BLOCK_ID_DATA_SHORT/LONG for main data fragments. 
But for block references, it will store the following values in order to 
describe the backup blocks.
#define XLR_CHUNK_BKP_COMPRESSED  0x01  
#define XLR_CHUNK_BKP_WITH_HOLE 0x02
...

The new xlog format should look as follows:

Fixed-size header (XLogRecord struct)
Chunk_id (add a field before the id field in XLogRecordBlockHeader struct)
XLogRecordBlockHeader
Chunk_id
XLogRecordBlockHeader
...
Chunk_id (rename the id field of the XLogRecordDataHeader struct)
XLogRecordDataHeader[Short|Long]
block data
block data
...
main data
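As a standalone illustration, dispatching on a one-byte chunk ID could look like the sketch below. The numeric values and the classify function are hypothetical assumptions of this sketch; only the XLR_CHUNK_* names come from the proposal above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical chunk IDs; the names follow the proposal above, but the
 * numeric values here are placeholders, not the ones from the patch. */
#define XLR_CHUNK_ID_DATA_SHORT   0xFF
#define XLR_CHUNK_ID_DATA_LONG    0xFE
#define XLR_CHUNK_BKP_COMPRESSED  0x01
#define XLR_CHUNK_BKP_WITH_HOLE   0x02

enum ChunkKind { CHUNK_MAIN_DATA, CHUNK_BLOCK_REF, CHUNK_UNKNOWN };

/* A block-reference chunk is followed by the actual block ID, which is
 * the one extra byte per block reference discussed in the thread. */
static enum ChunkKind
classify_chunk(uint8_t chunk_id)
{
    switch (chunk_id)
    {
        case XLR_CHUNK_ID_DATA_SHORT:
        case XLR_CHUNK_ID_DATA_LONG:
            return CHUNK_MAIN_DATA;
        case XLR_CHUNK_BKP_COMPRESSED:
        case XLR_CHUNK_BKP_WITH_HOLE:
            return CHUNK_BLOCK_REF;
        default:
            return CHUNK_UNKNOWN;
    }
}
```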

I will post a patch based on this.

Thank you,
Rahila Syed

-Original Message-
From: Andres Freund [mailto:and...@2ndquadrant.com] 
Sent: Monday, February 16, 2015 5:26 PM
To: Syed, Rahila
Cc: Michael Paquier; Fujii Masao; PostgreSQL mailing lists
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On 2015-02-16 11:30:20 +, Syed, Rahila wrote:
> - * As a trivial form of data compression, the XLOG code is aware that
> - * PG data pages usually contain an unused "hole" in the middle, 
> which
> - * contains only zero bytes.  If hole_length > 0 then we have removed
> - * such a "hole" from the stored data (and it's not counted in the
> - * XLOG record's CRC, either).  Hence, the amount of block data 
> actually
> - * present is BLCKSZ - hole_length bytes.
> + * Block images are able to do several types of compression:
> + * - When wal_compression is off, as a trivial form of compression, 
> + the
> + * XLOG code is aware that PG data pages usually contain an unused "hole"
> + * in the middle, which contains only zero bytes.  If length < BLCKSZ
> + * then we have removed such a "hole" from the stored data (and it is
> + * not counted in the XLOG record's CRC, either).  Hence, the amount
> + * of block data actually present is "length" bytes.  The hole "offset"
> + * on page is defined using "hole_offset".
> + * - When wal_compression is on, block images are compressed using a
> + * compression algorithm without their hole to improve compression
> + * process of the page. "length" corresponds in this case to the 
> + length
> + * of the compressed block. "hole_offset" is the hole offset of the 
> + page,
> + * and the length of the uncompressed block is defined by 
> + "raw_length",
> + * whose data is included in the record only when compression is 
> + enabled
> + * and "with_hole" is set to true, see below.
> + *
> + * "is_compressed" is used to identify if a given block image is 
> + compressed
> + * or not. Maximum page size allowed on the system being 32k, the 
> + hole
> + * offset cannot be more than 15-bit long so the last free bit is 
> + used to
> + * store the compression state of block image. If the maximum page 
> + size
> + * allowed is increased to a value higher than that, we should 
> + consider
> + * increasing this structure size as well, but this would increase 
> + the
> + * length of block header in WAL records with alignment.
> + *
> + * "with_hole" is used to identify the presence of a hole in a block image.
> + * As the length of a block cannot be more than 15-bit long, the 
> + extra bit in
> + * the length field is used for this identification purpose. If the 
> + block image
> + * has no hole, it is ensured that the raw size of a compressed block 
> + image is
> + * equal to BLCKSZ, hence the contents of 
> + XLogRecordBlockImageCompressionInfo
> + * are not necessary.
>   */
>  typedef struct XLogRecordBlockImageHeader  {
> - uint16  hole_offset;/* number of bytes before "hole" */
> - uint16  hole_length;/* number of bytes in "hole" */

Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-16 Thread Syed, Rahila
Hello,

Thank you for reviewing and testing the patch.

>+   /* leave if data cannot be compressed */
>+   if (compressed_len == 0)
>+   return false;
>This should be < 0, pglz_compress returns -1 when compression fails.
>
>+   if (pglz_decompress(block_image, bkpb->bkp_len, 
>record->uncompressBuf,
>+   
>bkpb->bkp_uncompress_len) == 0)
>Similarly, this should be < 0.

These have been corrected in the attached.

>Regarding the sanity checks that have been added recently. I think that they 
>are useful but I am suspecting as well that only a check on the record CRC is 
>done because that's reliable enough and not doing those checks accelerates a 
>bit replay. So I am thinking that we should simply replace >them by assertions.
Removing the checks makes sense as the CRC ensures correctness. Moreover, as an 
error message for an invalid record length is already present in the code, 
messages for an invalid block length would be redundant.
The checks have been replaced by assertions in the attached patch.

The if condition in XLogCompressBackupBlock has been modified as follows.

Previous:
+   /*
+    * We recheck the actual size even if pglz_compress() reports success and
+    * see if at least 2 bytes of length have been saved, as this corresponds
+    * to the additional amount of data stored in WAL record for a compressed
+    * block via raw_length when block contains hole.
+    */
+   *len = (uint16) compressed_len;
+   if (*len >= orig_len - SizeOfXLogRecordBlockImageCompressionInfo)
+       return false;
+   return true;

Current:
+   if ((hole_length != 0) &&
+       (*len >= orig_len - SizeOfXLogRecordBlockImageCompressionInfo))
+       return false;
+   return true;

This is because the extra information raw_length is included only if compressed 
block has hole in it.

>Once we get those small issues fixes, I think that it is with having a 
>committer look at this patch, presumably Fujii-san
Agree. I will mark this patch as ready for committer

Thank you,
Rahila Syed


Support-compression-for-full-page-writes-in-WAL_v19.patch
Description: Support-compression-for-full-page-writes-in-WAL_v19.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-12 Thread Syed, Rahila

Thank you for comments. Please find attached the updated patch.

>This patch fails to compile:
>xlogreader.c:1049:46: error: extraneous ')' after condition, expected a 
>statement
>blk->with_hole && blk->hole_offset <= 
> 0))
This has been rectified.

>Note as well that at least clang does not like much how the sanity check with 
>with_hole are done. You should place parentheses around the '&&' expressions. 
>Also, I would rather define with_hole == 0 or with_hole == 1 explicitly in 
>those checks
The expressions are modified accordingly.

>There is a typo:
>s/true,see/true, see/
>[nitpicky]Be as well aware of the 80-character limit per line that is 
>normally followed by comment blocks.[/]

Have corrected the typos and changed the comments as mentioned. Also realigned 
certain lines to meet the 80-character limit.

Thank you,
Rahila Syed




Support-compression-for-full-page-writes-in-WAL_v18.patch
Description: Support-compression-for-full-page-writes-in-WAL_v18.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-11 Thread Syed, Rahila

>IMO, we should add details about how this new field is used in the comments on 
>top of XLogRecordBlockImageHeader, meaning that when a page hole is present we 
>use the compression info structure and when there is no hole, we are sure that 
>the FPW raw length is BLCKSZ meaning that the two bytes of the CompressionInfo 
>stuff is unnecessary.
This comment is included in the patch attached.

> For correctness with_hole should be set even for uncompressed pages. I think 
> that we should as well use it for sanity checks in xlogreader.c when decoding 
> records.
This change is made in the attached patch. Following sanity checks have been 
added in xlogreader.c

if ((!blk->with_hole && blk->hole_offset != 0) ||
    (blk->with_hole && blk->hole_offset <= 0))

if (blk->with_hole && blk->bkp_len >= BLCKSZ)

if (!(blk->with_hole) && blk->bkp_len != BLCKSZ)
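Collected into one standalone predicate, these checks can be sketched as follows; BLCKSZ_SKETCH is an assumption of this sketch standing in for BLCKSZ.

```c
#include <assert.h>
#include <stdbool.h>

#define BLCKSZ_SKETCH 8192      /* stand-in for BLCKSZ */

/* True when the block-image header fields are mutually consistent:
 * no hole means offset 0 and a full-size image; a hole means a positive
 * offset and an image shorter than a full block. */
static bool
block_image_header_is_sane(bool with_hole, int hole_offset, int bkp_len)
{
    if (!with_hole && hole_offset != 0)
        return false;
    if (with_hole && hole_offset <= 0)
        return false;
    if (with_hole && bkp_len >= BLCKSZ_SKETCH)
        return false;
    if (!with_hole && bkp_len != BLCKSZ_SKETCH)
        return false;
    return true;
}
```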

Thank you,
Rahila Syed




Support-compression-for-full-page-writes-in-WAL_v18.patch
Description: Support-compression-for-full-page-writes-in-WAL_v18.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-09 Thread Syed, Rahila
Hello,

A bug had been introduced in the latest versions of the patch. The order of 
parameters passed to pglz_decompress was wrong.
Please find attached a patch with the following correction.

Original code:
+   if (pglz_decompress(block_image, record->uncompressBuf,
+                       bkpb->bkp_len, bkpb->bkp_uncompress_len) == 0)

Correction:
+   if (pglz_decompress(block_image, bkpb->bkp_len,
+                       record->uncompressBuf, bkpb->bkp_uncompress_len) == 0)


>For example here you should not have a space between the function definition 
>and the variable declarations:
>+{
>+
>+int orig_len = BLCKSZ - hole_length;
>This is as well incorrect in two places:
>if(hole_length != 0)
>There should be a space between the if and its condition in parenthesis.

Also corrected above code format mistakes.

Thank you,
Rahila Syed

-Original Message-
From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Syed, Rahila
Sent: Monday, February 09, 2015 6:58 PM
To: Michael Paquier; Fujii Masao
Cc: PostgreSQL mailing lists
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

Hello,

>> Do we always need extra two bytes for compressed backup block?
>> ISTM that extra bytes are not necessary when the hole length is zero.
>> In this case the length of the original backup block (i.e.,
>> uncompressed) must be BLCKSZ, so we don't need to save the original 
>> size in the extra bytes.

>Yes, we would need a additional bit to identify that. We could steal it from 
>length in XLogRecordBlockImageHeader.

This is implemented in the attached patch by dividing length field as follows,
uint16  length:15,  
with_hole:1; 

>"2" should be replaced with the macro variable indicating the size of 
>extra header for compressed backup block.
Macro SizeOfXLogRecordBlockImageCompressionInfo is used instead of 2

>You can refactor XLogCompressBackupBlock() and move all the above code 
>to it for more simplicity
This is also implemented in the patch attached.

Thank you,
Rahila Syed


-Original Message-
From: Michael Paquier [mailto:michael.paqu...@gmail.com]
Sent: Friday, February 06, 2015 6:00 PM
To: Fujii Masao
Cc: Syed, Rahila; PostgreSQL mailing lists
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Fri, Feb 6, 2015 at 3:03 PM, Fujii Masao wrote:
> Do we always need extra two bytes for compressed backup block?
> ISTM that extra bytes are not necessary when the hole length is zero.
> In this case the length of the original backup block (i.e.,
> uncompressed) must be BLCKSZ, so we don't need to save the original 
> size in the extra bytes.

Yes, we would need an additional bit to identify that. We could steal it from 
length in XLogRecordBlockImageHeader.

> Furthermore, when fpw compression is disabled and the hole length is 
> zero, we seem to be able to save one byte from the header of backup 
> block. Currently we use 4 bytes for the header, 2 bytes for the length 
> of backup block, 15 bits for the hole offset and 1 bit for the flag 
> indicating whether block is compressed or not. But in that case, the 
> length of backup block doesn't need to be stored because it must be 
> BLCKSZ. Shouldn't we optimize the header in this way? Thought?

If we do it, that's something to tackle even before this patch on HEAD, because 
you could use the 16th bit of the first 2 bytes of XLogRecordBlockImageHeader 
to do necessary sanity checks, to actually not reduce record by 1 byte, but 2 
bytes as hole-related data is not necessary. I imagine that a patch optimizing 
that wouldn't be that hard to write as well.

> +int page_len = BLCKSZ - hole_length;
> +char *scratch_buf;
> +if (hole_length != 0)
> +{
> +scratch_buf = compression_scratch;
> +memcpy(scratch_buf, page, hole_offset);
> +memcpy(scratch_buf + hole_offset,
> +   page + (hole_offset + hole_length),
> +   BLCKSZ - (hole_length + hole_offset));
> +}
> +else
> +scratch_buf = page;
> +
> +/* Perform compression of block */
> +if (XLogCompressBackupBlock(scratch_buf,
> +page_len,
> +regbuf->compressed_page,
> +&compress_len))
> +{
> +/* compression is done, add record */
> +is_compressed = true;

Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-09 Thread Syed, Rahila
Hello,

>> Do we always need extra two bytes for compressed backup block?
>> ISTM that extra bytes are not necessary when the hole length is zero.
>> In this case the length of the original backup block (i.e., 
>> uncompressed) must be BLCKSZ, so we don't need to save the original 
>> size in the extra bytes.

>Yes, we would need a additional bit to identify that. We could steal it from 
>length in XLogRecordBlockImageHeader.

This is implemented in the attached patch by dividing length field as follows,
uint16  length:15,  
with_hole:1; 
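A standalone sketch of the split field follows, with explicit masking shown as a portable alternative. The mask values assume the 15/1 split above and are an assumption of this sketch; C bitfield layout itself is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors "uint16 length:15, with_hole:1": 15 bits of image length plus
 * one flag bit. Because bitfield layout is implementation-defined in C,
 * an on-disk format is safer with explicit masks, as below. */
#define IMAGE_LEN_MASK   0x7FFF   /* low 15 bits: length, max 32767 */
#define IMAGE_HOLE_FLAG  0x8000   /* high bit: block image has a hole */

static uint16_t
pack_length(uint16_t len, int with_hole)
{
    return (uint16_t) ((len & IMAGE_LEN_MASK) |
                       (with_hole ? IMAGE_HOLE_FLAG : 0));
}

static uint16_t unpacked_length(uint16_t v) { return v & IMAGE_LEN_MASK; }
static int      has_hole(uint16_t v)        { return (v & IMAGE_HOLE_FLAG) != 0; }
```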

>"2" should be replaced with the macro variable indicating the size of 
>extra header for compressed backup block.
Macro SizeOfXLogRecordBlockImageCompressionInfo is used instead of 2

>You can refactor XLogCompressBackupBlock() and move all the 
>above code to it for more simplicity
This is also implemented in the patch attached.

Thank you,
Rahila Syed


-Original Message-
From: Michael Paquier [mailto:michael.paqu...@gmail.com] 
Sent: Friday, February 06, 2015 6:00 PM
To: Fujii Masao
Cc: Syed, Rahila; PostgreSQL mailing lists
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Fri, Feb 6, 2015 at 3:03 PM, Fujii Masao wrote:
> Do we always need extra two bytes for compressed backup block?
> ISTM that extra bytes are not necessary when the hole length is zero.
> In this case the length of the original backup block (i.e., 
> uncompressed) must be BLCKSZ, so we don't need to save the original 
> size in the extra bytes.

Yes, we would need an additional bit to identify that. We could steal it from 
length in XLogRecordBlockImageHeader.

> Furthermore, when fpw compression is disabled and the hole length is 
> zero, we seem to be able to save one byte from the header of backup 
> block. Currently we use 4 bytes for the header, 2 bytes for the length 
> of backup block, 15 bits for the hole offset and 1 bit for the flag 
> indicating whether block is compressed or not. But in that case, the 
> length of backup block doesn't need to be stored because it must be 
> BLCKSZ. Shouldn't we optimize the header in this way? Thought?

If we do it, that's something to tackle even before this patch on HEAD, because 
you could use the 16th bit of the first 2 bytes of XLogRecordBlockImageHeader 
to do necessary sanity checks, to actually not reduce record by 1 byte, but 2 
bytes as hole-related data is not necessary. I imagine that a patch optimizing 
that wouldn't be that hard to write as well.

> +int page_len = BLCKSZ - hole_length;
> +char *scratch_buf;
> +if (hole_length != 0)
> +{
> +scratch_buf = compression_scratch;
> +memcpy(scratch_buf, page, hole_offset);
> +memcpy(scratch_buf + hole_offset,
> +   page + (hole_offset + hole_length),
> +   BLCKSZ - (hole_length + hole_offset));
> +}
> +else
> +scratch_buf = page;
> +
> +/* Perform compression of block */
> +if (XLogCompressBackupBlock(scratch_buf,
> +page_len,
> +regbuf->compressed_page,
> +&compress_len))
> +{
> +/* compression is done, add record */
> +is_compressed = true;
> +}
>
> You can refactor XLogCompressBackupBlock() and move all the above code 
> to it for more simplicity.

Sure.
--
Michael
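The hole-skipping copy quoted above (bytes before and after the hole packed into a scratch buffer before compression) can be reproduced standalone as below; the buffer sizes in the sketch are illustrative, not BLCKSZ.

```c
#include <string.h>

/* Pack a page into scratch, skipping hole_length bytes starting at
 * hole_offset, as done before handing the data to the compressor.
 * The packed length is blcksz - hole_length. */
static void
remove_hole(const char *page, char *scratch,
            int blcksz, int hole_offset, int hole_length)
{
    memcpy(scratch, page, hole_offset);
    memcpy(scratch + hole_offset,
           page + hole_offset + hole_length,
           blcksz - (hole_offset + hole_length));
}
```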


Support-compression-for-full-page-writes-in-WAL_v17.patch
Description: Support-compression-for-full-page-writes-in-WAL_v17.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-06 Thread Syed, Rahila

>In any case, those things have been introduced by what I did in previous 
>versions... And attached is a new patch.
Thank you for feedback.

>   /* allocate scratch buffer used for compression of block images */
>+  if (compression_scratch == NULL)
>+      compression_scratch = MemoryContextAllocZero(xloginsert_cxt,
>+                                                   BLCKSZ);
>   }
The compression patch can use the newer interface MemoryContextAllocExtended 
to proceed without compression when sufficient memory is not available for the 
scratch buffer. 
The attached patch introduces an OutOfMem flag which is set when 
MemoryContextAllocExtended returns NULL. 
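A standalone sketch of that fallback follows, with calloc standing in for MemoryContextAllocExtended(cxt, BLCKSZ, MCXT_ALLOC_NO_OOM | MCXT_ALLOC_ZERO), which returns NULL on failure instead of raising an error; BLCKSZ_SKETCH and the function name are assumptions of this sketch.

```c
#include <stdbool.h>
#include <stdlib.h>

#define BLCKSZ_SKETCH 8192        /* stand-in for BLCKSZ */

static char *compression_scratch = NULL;
static bool  out_of_mem = false;  /* mirrors the OutOfMem flag */

/* Returns true if a scratch buffer is available; on allocation failure
 * the flag is set so callers fall back to writing the page uncompressed
 * rather than erroring out inside the WAL-insert path. */
static bool
ensure_scratch_buffer(void)
{
    if (compression_scratch == NULL && !out_of_mem)
    {
        compression_scratch = calloc(1, BLCKSZ_SKETCH);
        out_of_mem = (compression_scratch == NULL);
    }
    return compression_scratch != NULL;
}
```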

Thank you,
Rahila Syed

-Original Message-
From: Michael Paquier [mailto:michael.paqu...@gmail.com] 
Sent: Friday, February 06, 2015 12:46 AM
To: Syed, Rahila
Cc: PostgreSQL mailing lists
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Thu, Feb 5, 2015 at 11:06 PM, Syed, Rahila  wrote:
>>/*
>>+* We recheck the actual size even if pglz_compress() report success,
>>+* because it might be satisfied with having saved as little as one byte
>>+* in the compressed data.
>>+*/
>>+   *len = (uint16) compressed_len;
>>+   if (*len >= orig_len - 1)
>>+   return false;
>>+   return true;
>>+}
>
> As per latest code ,when compression is 'on' we introduce additional 2 bytes 
> in the header of each block image for storing raw_length of the compressed 
> block.
> In order to achieve compression while accounting for these two additional 
> bytes, we must ensure that compressed length is less than original length - 2.
> So , IIUC the above condition should rather be
>
> If (*len >= orig_len -2 )
> return false;
> return true;
> The attached patch contains this. It also has a cosmetic change-  renaming 
> compressBuf to uncompressBuf as it is used to store uncompressed page.

Agreed on both things.

Just looking at your latest patch after some time to let it cool down, I 
noticed a couple of things.

 #define MaxSizeOfXLogRecordBlockHeader \
 (SizeOfXLogRecordBlockHeader + \
- SizeOfXLogRecordBlockImageHeader + \
+ SizeOfXLogRecordBlockImageHeader, \
+ SizeOfXLogRecordBlockImageCompressionInfo + \
There is a comma here instead of a sum sign. We should really sum up all those 
sizes to evaluate the maximum size of a block header.

+ * Permanently allocate readBuf uncompressBuf.  We do it this way,
+ * rather than just making a static array, for two reasons:
This comment is a bit weird, "readBuf AND uncompressBuf" is more appropriate.

+ * We recheck the actual size even if pglz_compress() report success,
+ * because it might be satisfied with having saved as little as one byte
+ * in the compressed data. We add two bytes to store raw_length  with the
+ * compressed image. So for compression to be effective
compressed_len should
+ * be atleast < orig_len - 2
This comment block should be reworked, and misses a dot at its end. I rewrote 
it like that, hopefully that's clearer:
+   /*
+* We recheck the actual size even if pglz_compress() reports
success and see
+* if at least 2 bytes of length have been saved, as this
corresponds to the
+* additional amount of data stored in WAL record for a compressed block
+* via raw_length.
+*/

In any case, those things have been introduced by what I did in previous 
versions... And attached is a new patch.
--
Michael


Support-compression-for-full-page-writes-in-WAL_v17.patch
Description: Support-compression-for-full-page-writes-in-WAL_v17.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-05 Thread Syed, Rahila
Hello,
 
>/*
>+* We recheck the actual size even if pglz_compress() report success,
>+* because it might be satisfied with having saved as little as one byte
>+* in the compressed data.
>+*/
>+   *len = (uint16) compressed_len;
>+   if (*len >= orig_len - 1)
>+   return false;
>+   return true;
>+}

As per the latest code, when compression is 'on' we introduce 2 additional 
bytes in the header of each block image for storing the raw_length of the 
compressed block. 
In order to achieve compression while accounting for these two additional 
bytes, we must ensure that the compressed length is less than the original 
length - 2.
So, IIUC, the above condition should rather be

if (*len >= orig_len - 2)
    return false;
return true;

The attached patch contains this. It also has a cosmetic change: renaming 
compressBuf to uncompressBuf, as it is used to store the uncompressed page.
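The acceptance condition can be exercised standalone as below; COMPRESSION_INFO_SIZE and the function name are assumptions of this sketch, with COMPRESSION_INFO_SIZE standing in for SizeOfXLogRecordBlockImageCompressionInfo.

```c
#include <stdbool.h>
#include <stdint.h>

#define COMPRESSION_INFO_SIZE 2   /* two extra header bytes for raw_length */

/* Compression only pays off when the compressed image plus the extra
 * raw_length bytes is still smaller than the original image. A negative
 * compressed_len models pglz_compress() reporting failure (-1). */
static bool
compression_worthwhile(int32_t compressed_len, int32_t orig_len)
{
    if (compressed_len < 0)
        return false;
    return compressed_len < orig_len - COMPRESSION_INFO_SIZE;
}
```

This makes the boundary explicit: saving only one byte is not enough, since the extra header bytes would cancel it out.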
 
Thank you,
Rahila Syed

-Original Message-
From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier
Sent: Wednesday, January 07, 2015 9:32 AM
To: Rahila Syed
Cc: PostgreSQL mailing lists
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Wed, Jan 7, 2015 at 12:51 AM, Rahila Syed  wrote:
> Following are some comments,
Thanks for the feedback.

>>uint16  hole_offset:15, /* number of bytes in "hole" */
> Typo in description of hole_offset
Fixed. That's "before hole".

>>for (block_id = 0; block_id <= record->max_block_id; block_id++)
>>-   {
>>-   if (XLogRecHasBlockImage(record, block_id))
>>-   fpi_len += BLCKSZ -
> record->blocks[block_id].hole_length;
>>-   }
>>+   fpi_len += record->blocks[block_id].bkp_len;
>
> IIUC, if condition, /if(XLogRecHasBlockImage(record, block_id))/ is 
> incorrectly removed from the above for loop.
Fixed.

>>typedef struct XLogRecordCompressedBlockImageHeader
> I am trying to understand the purpose behind declaration of the above 
> struct. IIUC, it is defined in order to introduce new field uint16 
> raw_length and it has been declared as a separate struct from 
> XLogRecordBlockImageHeader to not affect the size of WAL record when 
> compression is off.
> I wonder if it is ok to simply memcpy the uint16 raw_length in the 
> hdr_scratch when compression is on and not have a separate header 
> struct for it neither declare it in existing header. raw_length can be 
> a locally defined variable is XLogRecordAssemble or it can be a field 
> in registered_buffer struct like compressed_page.
> I think this can simplify the code.
> Am I missing something obvious?
You are missing nothing. I just introduced this structure for a matter of 
readability to show the two-byte difference between non-compressed and 
compressed header information. It is true that doing it my way makes the 
structures duplicated, so let's simply add the compression-related information 
as an extra structure added after XLogRecordBlockImageHeader if the block is 
compressed. I hope this addresses your concerns.

>> /*
>>  * Fill in the remaining fields in the XLogRecordBlockImageHeader
>>  * struct and add new entries in the record chain.
>> */
>
>>   bkpb.fork_flags |= BKPBLOCK_HAS_IMAGE;
>
> This code line seems to be misplaced with respect to the above comment.
> Comment indicates filling of XLogRecordBlockImageHeader fields while 
> fork_flags is a field of XLogRecordBlockHeader.
> Is it better to place the code close to following condition?
>   if (needs_backup)
>   {
Yes, this comment should not be here. I replaced it with the comment in HEAD.


>>+  *the original length of the
>>+ * block without its page hole being deducible from the compressed 
>>+ data
>>+ * itself.
> IIUC, this comment before XLogRecordBlockImageHeader seems to be no 
> longer valid as original length is not deducible from compressed data 
> and rather stored in header.
Aah, true. This comment was originally present in the pglz header, and it was 
removed when pglz was reworked to be available to frontends.

Updated patches are attached.
Regards,
--
Michael

__
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.

Support-compression-for-full-page-writes-in-WAL_v15.patch
Description: Support-compression-for-full-page-writes-in-WAL_v15.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-11-26 Thread Syed, Rahila
Hello,
I would like to contribute a few points.

>XLogInsertRecord(XLogRecData *rdata, XLogRecPtr fpw_lsn)
>   RedoRecPtr = Insert->RedoRecPtr;
>   }
>   doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites);
>   doPageCompression = (Insert->fullPageWrites == 
> FULL_PAGE_WRITES_COMPRESS);

Don't we need to initialize doPageCompression similarly to doPageWrites in 
InitXLOGAccess?

Also, in the earlier patches compression was set to 'on' even when the fpw GUC 
was 'off'. This was to facilitate compression of FPWs that are forcibly 
written even when the fpw GUC is turned off.
doPageCompression in this patch is set to true only if the value of the fpw 
GUC is 'compress'. I think it is better to also compress forcibly written 
full-page writes.


Regards,

Rahila Syed
-Original Message-
From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Michael Paquier
Sent: Wednesday, November 26, 2014 1:55 PM
To: Alvaro Herrera
Cc: Andres Freund; Robert Haas; Fujii Masao; Rahila Syed; Rahila Syed; 
PostgreSQL-development
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

So, here are reworked patches for the whole set, with the following changes:
- Found why replay was failing: xlogreader.c took into account BLCKSZ - hole 
when it should have taken into account the compressed data length when 
fetching a compressed block image.
- Reworked the pglz portion to have it return status errors instead of simple 
booleans. The pglz code is also moved to src/common, as Alvaro suggested.

I am planning to run some tests to check how much compression can reduce WAL 
size with this new set of patches. I have, however, already been able to check 
that these patches pass installcheck-world with a standby replaying the 
changes. Feel free to play with those patches...
Regards,
--
Michael





Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-10-28 Thread Syed, Rahila
Hello Fujii-san,

Thank you for your comments.

>The patch isn't applied to the master cleanly.
>The compilation of the document failed with the following error message.
>openjade:config.sgml:2188:12:E: end tag for element "TERM" which is not open
>make[3]: *** [HTML.index] Error 1
>xlog.c:930: warning: ISO C90 forbids mixed declarations and code
>xlogreader.c:744: warning: ISO C90 forbids mixed declarations and code
>xlogreader.c:744: warning: ISO C90 forbids mixed declarations and code

Please find attached patch with these rectified.

>Only backend calls CompressBackupBlocksPagesAlloc when SIGHUP is sent.
>Why does only backend need to do that? What about other processes which can 
>write FPW, e.g., autovacuum?
I had overlooked this. I will correct it.

>Do we release the buffers for compressed data when fpw is changed from 
>"compress" to "on"?
The current code does not do this.

>The memory is always (i.e., even when fpw=on) allocated to uncompressedPages, 
>but not to compressedPages. Why? I guess that the test of fpw needs to be there
uncompressedPages is also used to store the decompression output at recovery 
time. Hence, memory for uncompressedPages needs to be allocated even if 
fpw=on, which is not the case for compressedPages.

Thank you,
Rahila Syed

-Original Message-
From: Fujii Masao [mailto:masao.fu...@gmail.com] 
Sent: Monday, October 27, 2014 6:50 PM
To: Rahila Syed
Cc: Andres Freund; Syed, Rahila; Alvaro Herrera; Rahila Syed; 
PostgreSQL-development; Tom Lane
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Fri, Oct 17, 2014 at 1:52 PM, Rahila Syed  wrote:
> Hello,
>
> Please find the updated patch attached.

Thanks for updating the patch! Here are the comments.

The patch isn't applied to the master cleanly.

I got the following compiler warnings.

xlog.c:930: warning: ISO C90 forbids mixed declarations and code
xlogreader.c:744: warning: ISO C90 forbids mixed declarations and code
xlogreader.c:744: warning: ISO C90 forbids mixed declarations and code

The compilation of the document failed with the following error message.

openjade:config.sgml:2188:12:E: end tag for element "TERM" which is not open
make[3]: *** [HTML.index] Error 1

Only backend calls CompressBackupBlocksPagesAlloc when SIGHUP is sent.
Why does only backend need to do that? What about other processes which can 
write FPW, e.g., autovacuum?

Do we release the buffers for compressed data when fpw is changed from 
"compress" to "on"?

+if (uncompressedPages == NULL)
+{
+uncompressedPages = (char *)malloc(XLR_TOTAL_BLCKSZ);
+if (uncompressedPages == NULL)
+outOfMem = 1;
+}

The memory is always (i.e., even when fpw=on) allocated to uncompressedPages, 
but not to compressedPages. Why? I guess that the test of fpw needs to be there.

Regards,

--
Fujii Masao



compress_fpw_v2.patch
Description: compress_fpw_v2.patch



Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-09-22 Thread Syed, Rahila
Hello,

>Please find attached the patch to compress FPW.

Sorry I had forgotten to attach. Please find the patch attached.

Thank you,
Rahila Syed





From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Rahila Syed
Sent: Monday, September 22, 2014 3:16 PM
To: Alvaro Herrera
Cc: Rahila Syed; PostgreSQL-development; Tom Lane
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes


Hello All,


>Well, from Rahila's point of view the patch is actually attached, but
>she's posting from the Nabble interface, which mangles it and turns into
>a link instead.

Yes.



>but the end result is the
>same: to properly submit a patch, you need to send an email to the
> mailing list, not join a group/forum from
>some intermediary newsgroup site that mirrors the list.


Thank you. I will take care of it henceforth.

Please find attached the patch to compress FPW.  Patch submitted by Fujii-san 
earlier in the thread is used to merge compression GUC with full_page_writes.



I am reposting the measurement numbers.

Server Specification:
Processors:Intel® Xeon ® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos
RAM: 32GB
Disk : HDWD 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm

Checkpoint segments: 1024
Checkpoint timeout: 5 mins

pgbench -c 64 -j 64 -r -T 900 -M prepared
Scale factor: 1000

                  WAL generated (MB)   Throughput (tps)   Latency (ms)
On                9235.43              979.03             65.36
Compress (pglz)   6518.68              1072.34            59.66
Off               501.04               1135.17            56.34

The results show around a 30 percent decrease in WAL volume due to compression 
of FPWs.

Thank you ,

Rahila Syed

Tom Lane wrote:
> Rahila Syed  writes:
> > Please find attached patch to compress FPW using pglz compression.
>
> Patch not actually attached AFAICS (no, a link is not good enough).

Well, from Rahila's point of view the patch is actually attached, but
she's posting from the Nabble interface, which mangles it and turns into
a link instead.  Not her fault, really -- but the end result is the
same: to properly submit a patch, you need to send an email to the
pgsql-hackers@postgresql.org mailing
list, not join a group/forum from
some intermediary newsgroup site that mirrors the list.

--
Álvaro Herrera   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



compress_fpw_v1.patch
Description: compress_fpw_v1.patch
