Re: Recommendation to run vacuum FULL in parallel

2019-04-11 Thread Ron


Look also at pg_stat_all_tables.n_dead_tup for tables which are candidates 
for vacuuming.
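A quick way to spot those candidates (only a sketch; adjust the threshold and
schema filter to taste):

    SELECT schemaname, relname, n_dead_tup, n_live_tup
      FROM pg_stat_all_tables
     WHERE n_dead_tup > 10000
     ORDER BY n_dead_tup DESC;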


On 4/10/19 11:49 PM, Perumal Raj wrote:

Thanks, Kevin, for the inputs.

In my case there are 500+ tables, the biggest chunk being 30 GB (table only) plus
its indexes.
So I created 6 batches and executed them in parallel. All my scripts
completed in 2 hours and my DB size came down from 500 GB to 300 GB.


Yes, I do see a CPU spike, but I did the whole activity with full application downtime.

Going forward I am going to run vacuum on a daily basis to maintain the DB size.

Also, table/DB age came down drastically.

Thanks
Raj

On Thu, Apr 4, 2019 at 12:53 PM Kevin Brannen wrote:


From: Perumal Raj <peruci...@gmail.com>

So to conclude the requirement here: the only way to parallelize is
multiple scripts, and there is no need to run REINDEX explicitly.

Question: Do we need to consider table dependencies while preparing the
scripts in order to avoid table locks during vacuum full?

We have a small bash script (see below) that gets the list of tables
and their sizes, sorted smallest to largest, and does “vacuum full” on
them one at a time because (as someone else pointed out) this is very
I/O intensive. That order also helps to ensure we finish, because some
of our installs are at the edge of running out of space (an issue we’re
dealing with). I probably wouldn’t have a problem doing 2 at a time,
but we do this in the middle of the night when activity is lowest and
it only takes 1-2 hours, so we’re good with it. It sounds like you
have a lot more data though.

You might also consider putting the data into different tablespaces
which are spread over multiple disks to help I/O. If you can, use SSD
drives; they help with speed quite a bit.

Don’t worry about table dependencies. This is a physical operation,
not a data operation.

HTH,

Kevin

    $PGPATH/psql -t -c "

    WITH s AS (SELECT nspname || '.' || relname AS TABLE_NAME,
pg_total_relation_size(c.oid) AS total_bytes

  FROM pg_class c

  LEFT JOIN pg_namespace n ON n.oid = c.relnamespace

  WHERE relkind = 'r' AND nspname NOT IN (
'pg_catalog', 'information_schema' )

  ORDER BY 2 )

    SELECT table_name FROM s

    " |

    while read t ; do echo "" ; echo $t; $PGPATH/vacuumdb -w -z -f -t
$t ; done




--
Angular momentum makes the world go 'round.


Re: Recommendation to run vacuum FULL in parallel

2019-04-10 Thread Perumal Raj
Thanks, Kevin, for the inputs.

In my case there are 500+ tables, the biggest chunk being 30 GB (table only) plus
its indexes.
So I created 6 batches and executed them in parallel. All my scripts
completed in 2 hours and my DB size came down from 500 GB to 300 GB.

Yes, I do see a CPU spike, but I did the whole activity with full application downtime.

Going forward I am going to run vacuum on a daily basis to maintain the DB size.

Also, table/DB age came down drastically.
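(By "age" I mean transaction ID age; one rough way to check it, shown only
as a sketch, is:)

    SELECT datname, age(datfrozenxid) AS xid_age
      FROM pg_database ORDER BY 2 DESC;

    SELECT relname, age(relfrozenxid) AS xid_age
      FROM pg_class WHERE relkind = 'r' ORDER BY 2 DESC LIMIT 20;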

Thanks
Raj

On Thu, Apr 4, 2019 at 12:53 PM Kevin Brannen  wrote:

> From: Perumal Raj
>
> So to conclude the requirement here: the only way to parallelize is multiple
> scripts, and there is no need to run REINDEX explicitly.
>
> Question: Do we need to consider table dependencies while preparing the
> scripts in order to avoid table locks during vacuum full?
>
>
>
> We have a small bash script (see below) that gets the list of tables and
> their sizes, sorted smallest to largest, and does “vacuum full” on them one at
> a time because (as someone else pointed out) this is very I/O intensive. That
> order also helps to ensure we finish, because some of our installs are at
> the edge of running out of space (an issue we’re dealing with). I probably
> wouldn’t have a problem doing 2 at a time, but we do this in the middle of
> the night when activity is lowest and it only takes 1-2 hours, so we’re
> good with it. It sounds like you have a lot more data though.
>
>
>
> You might also consider putting the data into different tablespaces which
> are spread over multiple disks to help I/O. If you can, use SSD drives;
> they help with speed quite a bit.
>
>
>
> Don’t worry about table dependencies. This is a physical operation, not a
> data operation.
>
>
>
> HTH,
>
> Kevin
>
>
>
> $PGPATH/psql -t -c "
>
> WITH s AS (SELECT nspname || '.' || relname AS TABLE_NAME,
> pg_total_relation_size(c.oid) AS total_bytes
>
>   FROM pg_class c
>
>   LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
>
>   WHERE relkind = 'r' AND nspname NOT IN ( 'pg_catalog',
> 'information_schema' )
>
>   ORDER BY 2 )
>
> SELECT table_name FROM s
>
> " |
>
> while read t ; do echo "" ; echo $t; $PGPATH/vacuumdb -w -z -f -t $t ;
> done
>


Re: Recommendation to run vacuum FULL in parallel

2019-04-05 Thread Peter J. Holzer
On 2019-04-03 19:42:03 +0400, rihad wrote:
> > And future updates can reuse it, too (an update is very similar to an
> > insert+delete).
> 
> Hm, then it's strange our DB takes 6 times as much space compared to a
> freshly restored one (only the public schema is considered).

This is indeed strange if you accumulated that much bloat gradually (as
you wrote). It is much less strange if you did some massive
reorganisations in the past. (In one case I witnessed, changes had to be
made to almost every value in 4 or 5 columns of a large table. So the
person doing the updates first issued an update on the first column,
checked that the result looked plausible, then issued an update on the
second column, and so on. The result was, of course, massive bloat.)

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson 




RE: Recommendation to run vacuum FULL in parallel

2019-04-04 Thread Kevin Brannen
From: Perumal Raj 

So to conclude the requirement here: the only way to parallelize is multiple
scripts, and there is no need to run REINDEX explicitly.
Question: Do we need to consider table dependencies while preparing the scripts in
order to avoid table locks during vacuum full?

We have a small bash script (see below) that gets the list of tables and their
sizes, sorted smallest to largest, and does “vacuum full” on them one at a time
because (as someone else pointed out) this is very I/O intensive. That order also
helps to ensure we finish, because some of our installs are at the edge of running
out of space (an issue we’re dealing with). I probably wouldn’t have a problem
doing 2 at a time, but we do this in the middle of the night when activity is
lowest and it only takes 1-2 hours, so we’re good with it. It sounds like you
have a lot more data though.

You might also consider putting the data into different tablespaces which are
spread over multiple disks to help I/O. If you can, use SSD drives; they help
with speed quite a bit.

Don’t worry about table dependencies. This is a physical operation, not a data 
operation.

HTH,
Kevin

# List the user tables, smallest total size first, and run "vacuum full"
# plus analyze (-z) on each one in turn.
$PGPATH/psql -t -c "
WITH s AS (SELECT nspname || '.' || relname AS TABLE_NAME, 
pg_total_relation_size(c.oid) AS total_bytes
  FROM pg_class c
  LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE relkind = 'r' AND nspname NOT IN ( 'pg_catalog', 
'information_schema' )
  ORDER BY 2 )
SELECT table_name FROM s
" |
while read t ; do echo "" ; echo $t; $PGPATH/vacuumdb -w -z -f -t $t ; done


Re: Recommendation to run vacuum FULL in parallel

2019-04-03 Thread Ron

On 4/3/19 3:45 PM, Perumal Raj wrote:

Hi Stephen

Thanks for the response.

Version: 9.2
We never ran VACUUM FULL in the past; all we have been doing is manual vacuum
(weekly).
Based on the observation (test run), we were able to reclaim 150 GB out
of 500 GB.


We are heading to a planned downtime soon, so we are thinking of running FULL
during that time.


Reasons to run FULL: 1. Reclaim unused space which postgres is never
using.


Did you purge a *lot* of records?

2. Considering FULL may increase performance.


Maybe. But choose your tables wisely.

3. Daily backup size and time will be reduced after reclaiming 150 GB.


How are you currently performing backups?  (The size won't change if you're 
using pg_dump, and it won't change much if you're using pgbackrest with the 
compression option -- though it will probably run faster.)


Bottom line:

1. Choose your tables wisely.
2. Make sure you have enough disk space.
3. Either autovacuum more aggressively or explicitly vacuum certain tables 
from a cron job.




Thanks,
Raj


On Wed, Apr 3, 2019 at 1:16 PM Stephen Eilert wrote:


> Ideally VACUUM FULL should not require a giant lock on the table.

It is a massively expensive operation, regardless. Not sure if it is
something you want to run in production outside a maintenance window.

I would argue that frequent vacuum full is an antipattern. This will
become a matter of superstition in your company.

If db size growth is a problem, make autovacuum more aggressive. Or run
your manual vacuum job (not full) more often than weekly. Daily, if
you have to. This will not reclaim disk space as reported by the OS,
but it should make the space available for new row versions, so the db
should mostly stop growing from the OS point of view (mostly, because
you may be adding new data, right?). If it is still a problem, then
there may be something else going on.

Which PG version is that?


— Stephen
On Apr 3, 2019, 10:02 AM -0700, Perumal Raj <peruci...@gmail.com> wrote:

Hi All

Thanks for all your valuable inputs.

Here is some more data:

Though we have 150 GB of free space spread across 500 tables, every
other day the DB grows at a rate of 1 GB.
Also, we have a manual vacuum job scheduled to run on a weekly basis, so
it seems the space is not being reused all the time?

So to conclude the requirement here: the only way to parallelize is
multiple scripts, and there is no need to run REINDEX explicitly.
Question: Do we need to consider table dependencies while preparing the
scripts in order to avoid table locks during vacuum full?

At present maintenance_work_mem is set to 20 GB.
Question: Do we need to tweak any other parameters?

Note:
We are planning this activity with application downtime only.

Let me know if I missed anything.

Regards,
Raj





On Wed, Apr 3, 2019 at 8:42 AM rihad <ri...@mail.ru> wrote:

> And future updates can reuse it, too (an update is very similar to an
> insert+delete).


Hm, then it's strange our DB takes 6 times as much space compared to a
freshly restored one (only the public schema is considered).

> Not if autovacuum has a chance to run between updates.

Ours is run regularly, although we had to tweak it down not to interfere
with normal database activity, so it takes several hours each run on the
table. We did that by setting autovacuum_vacuum_scale_factor = 0.05 from
the default 0.2.




--
Angular momentum makes the world go 'round.


Re: Recommendation to run vacuum FULL in parallel

2019-04-03 Thread Perumal Raj
Hi Stephen

Thanks for the response.

Version: 9.2
We never ran VACUUM FULL in the past; all we have been doing is manual vacuum
(weekly).
Based on the observation (test run), we were able to reclaim 150 GB out
of 500 GB.

We are heading to a planned downtime soon, so we are thinking of running FULL
during that time.

Reasons to run FULL: 1. Reclaim unused space which postgres is never using.
                     2. Considering FULL may increase performance.
                     3. Daily backup size and time will be reduced after
                        reclaiming 150 GB.

Thanks,
Raj


On Wed, Apr 3, 2019 at 1:16 PM Stephen Eilert 
wrote:

> > Ideally VACUUM FULL should not require a giant lock on the table.
>
> It is a massively expensive operation, regardless. Not sure if it is
> something you want to run in production outside a maintenance window.
>
> I would argue that frequent vacuum full is an antipattern. This will
> become a matter of superstition in your company.
>
> If db size growth is a problem, make autovacuum more aggressive. Or run
> your manual vacuum job (not full) more often than weekly. Daily, if you
> have to. This will not reclaim disk space as reported by the OS, but it
> should make the space available for new row versions, so the db should mostly
> stop growing from the OS point of view (mostly, because you may be adding
> new data, right?). If it is still a problem, then there may be something
> else going on.
>
> Which PG version is that?
>
>
> — Stephen
> On Apr 3, 2019, 10:02 AM -0700, Perumal Raj , wrote:
>
> Hi All
>
> Thanks for all your valuable inputs.
>
> Here is some more data:
>
> Though we have 150 GB of free space spread across 500 tables, every
> other day the DB grows at a rate of 1 GB.
> Also, we have a manual vacuum job scheduled to run on a weekly basis, so it
> seems the space is not being reused all the time?
>
> So to conclude the requirement here: the only way to parallelize is multiple
> scripts, and there is no need to run REINDEX explicitly.
> Question: Do we need to consider table dependencies while preparing the
> scripts in order to avoid table locks during vacuum full?
>
> At present maintenance_work_mem is set to 20 GB.
> Question: Do we need to tweak any other parameters?
>
> Note:
> We are planning this activity with application downtime only.
>
> Let me know if I missed anything.
>
> Regards,
> Raj
>
>
>
>
>
> On Wed, Apr 3, 2019 at 8:42 AM rihad  wrote:
>
>> > And future updates can reuse it, too (an update is very similar to an
>> > insert+delete).
>>
>>
>> Hm, then it's strange our DB takes 6 times as much space compared to a
>> freshly restored one (only the public schema is considered).
>>
>> > Not if autovacuum has a chance to run between updates.
>>
>> Ours is run regularly, although we had to tweak it down not to interfere
>> with normal database activity, so it takes several hours each run on the
>> table. We did that by setting autovacuum_vacuum_scale_factor = 0.05 from
>> default 0.2.
>>
>>
>>


Re: Recommendation to run vacuum FULL in parallel

2019-04-03 Thread Stephen Eilert
> Ideally VACUUM FULL should not require a giant lock on the table.

It is a massively expensive operation, regardless. Not sure if it is something 
you want to run in production outside a maintenance window.

I would argue that frequent vacuum full is an antipattern. This will become a 
matter of superstition in your company.

If db size growth is a problem, make autovacuum more aggressive. Or run your 
manual vacuum job (not full) more often than weekly. Daily, if you have to. 
This will not reclaim disk space as reported by the OS, but it should make the 
space available for new row versions, so the db should mostly stop growing from 
the OS point of view (mostly, because you may be adding new data, right?). If it 
is still a problem, then there may be something else going on.
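A sketch of what "more aggressive" could look like, either cluster-wide or per
table (the table name and the numbers below are only placeholders, not a
recommendation):

    # postgresql.conf (cluster-wide); reload afterwards
    autovacuum_vacuum_scale_factor = 0.05
    autovacuum_vacuum_cost_delay = 10ms

    -- or per table, just for the worst offenders:
    ALTER TABLE some_update_heavy_table
        SET (autovacuum_vacuum_scale_factor = 0.02,
             autovacuum_vacuum_threshold = 1000);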

Which PG version is that?


— Stephen
On Apr 3, 2019, 10:02 AM -0700, Perumal Raj , wrote:
> Hi All
>
> Thanks for all your valuable inputs.
>
> Here is some more data:
>
> Though we have 150 GB of free space spread across 500 tables, every other 
> day the DB grows at a rate of 1 GB.
> Also, we have a manual vacuum job scheduled to run on a weekly basis, so it seems 
> the space is not being reused all the time?
>
> So to conclude the requirement here: the only way to parallelize is multiple 
> scripts, and there is no need to run REINDEX explicitly.
> Question: Do we need to consider table dependencies while preparing the scripts 
> in order to avoid table locks during vacuum full?
>
> At present maintenance_work_mem is set to 20 GB.
> Question: Do we need to tweak any other parameters?
>
> Note:
> We are planning this activity with application downtime only.
>
> Let me know if I missed anything.
>
> Regards,
> Raj
>
>
>
>
>
> > On Wed, Apr 3, 2019 at 8:42 AM rihad  wrote:
> > > > And future updates can reuse it, too (an update is very similar to an
> > > > insert+delete).
> > >
> > >
> > > Hm, then it's strange our DB takes 6 times as much space compared to a
> > > freshly restored one (only the public schema is considered).
> > >
> > > > Not if autovacuum has a chance to run between updates.
> > >
> > > Ours is run regularly, although we had to tweak it down not to interfere
> > > with normal database activity, so it takes several hours each run on the
> > > table. We did that by setting autovacuum_vacuum_scale_factor = 0.05 from
> > > default 0.2.
> > >
> > >


Re: Recommendation to run vacuum FULL in parallel

2019-04-03 Thread Perumal Raj
Hi All

Thanks for all your valuable inputs.

Here is some more data:

Though we have 150 GB of free space spread across 500 tables, every
other day the DB grows at a rate of 1 GB.
Also, we have a manual vacuum job scheduled to run on a weekly basis, so it seems
the space is not being reused all the time?

So to conclude the requirement here: the only way to parallelize is multiple
scripts, and there is no need to run REINDEX explicitly.
Question: Do we need to consider table dependencies while preparing the
scripts in order to avoid table locks during vacuum full?

At present maintenance_work_mem is set to 20 GB.
Question: Do we need to tweak any other parameters?

Note:
We are planning this activity with application downtime only.

Let me know if I missed anything.

Regards,
Raj





On Wed, Apr 3, 2019 at 8:42 AM rihad  wrote:

> > And future updates can reuse it, too (an update is very similar to an
> > insert+delete).
>
>
> Hm, then it's strange our DB takes 6 times as much space compared to a
> freshly restored one (only the public schema is considered).
>
> > Not if autovacuum has a chance to run between updates.
>
> Ours is run regularly, although we had to tweak it down not to interfere
> with normal database activity, so it takes several hours each run on the
> table. We did that by setting autovacuum_vacuum_scale_factor = 0.05 from
> default 0.2.
>
>
>


Re: Recommendation to run vacuum FULL in parallel

2019-04-03 Thread rihad

And future updates can reuse it, too (an update is very similar to an
insert+delete).



Hm, then it's strange our DB takes 6 times as much space compared to a 
freshly restored one (only the public schema is considered).



Not if autovacuum has a chance to run between updates.


Ours is run regularly, although we had to tweak it down not to interfere 
with normal database activity, so it takes several hours each run on the 
table. We did that by setting autovacuum_vacuum_scale_factor = 0.05 from 
default 0.2.





Re: Recommendation to run vacuum FULL in parallel

2019-04-03 Thread Peter J. Holzer
On 2019-04-03 18:49:02 +0400, rihad wrote:
> On 04/03/2019 06:40 PM, Michael Lewis wrote:
> > "Sometimes a table's usage pattern involves much more updates than
> > inserts, which gradually uses more and more unused space that is never
> > used again by postgres, and plain autovacuuming doesn't return it to the
> > OS."
> > 
> > Can you expound on that? I thought that was exactly what autovacuum did
> > for old versions of rows whether dead because of delete or update, so I
> > am surprised by this statement. I thought vacuum full was only ever
> > needed if storage space is an issue and the table is not expected to
> > quickly re-expand to current size on disk from new churn of tuples.
> 
> 
> From what I understand from the docs, updates keep older versions of rows
> intact because other transactions might still use them (this is the essence
> of MVCC), and autovacuuming (plain VACUUM) marks that space as available
> when it is run, so future inserts can reuse it.

And future updates can reuse it, too (an update is very similar to an
insert+delete).

> In case the number of updates is much greater than the number of
> inserts, the unused zombie space gradually creeps up.

Not if autovacuum has a chance to run between updates.

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson 




Re: Recommendation to run vacuum FULL in parallel

2019-04-03 Thread rihad

On 04/03/2019 06:40 PM, Michael Lewis wrote:

"Sometimes a table's usage pattern involves much more updates than
inserts, which gradually uses more and more unused space that is never
used again by postgres, and plain autovacuuming doesn't return it to the
OS."

Can you expound on that? I thought that was exactly what autovacuum 
did for old versions of rows whether dead because of delete or update, 
so I am surprised by this statement. I thought vacuum full was only 
ever needed if storage space is an issue and the table is not expected 
to quickly re-expand to its current size on disk from new churn of tuples.



From what I understand from the docs, updates keep older versions of 
rows intact because other transactions might still use them (this is the 
essence of MVCC), and autovacuuming (plain VACUUM) marks that space as 
available when it is run, so future inserts can reuse it. In case the 
number of updates is much greater than the number of inserts, the unused 
zombie space gradually creeps up.






Re: Recommendation to run vacuum FULL in parallel

2019-04-03 Thread Michael Lewis
"Sometimes a table's usage pattern involves much more updates than
inserts, which gradually uses more and more unused space that is never
used again by postgres, and plain autovacuuming doesn't return it to the
OS."

Can you expound on that? I thought that was exactly what autovacuum did for
old versions of rows whether dead because of delete or update, so I am
surprised by this statement. I thought vacuum full was only ever needed if
storage space is an issue and the table is not expected to quickly re-expand
to its current size on disk from new churn of tuples.


Re: Recommendation to run vacuum FULL in parallel

2019-04-03 Thread Peter J. Holzer
On 2019-04-03 13:12:56 +0400, rihad wrote:
> Ideally VACUUM FULL should not require a giant lock on the table.
[...]
> Since rewriting a table is a completely internal operation from
> clients' POV, hopefully one day we will see a concurrent version of
> vacuum full.

There are (at least) pg_repack and pg_squeeze. It would be nice to have
that in the core, though.
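For example, a pg_repack run (assuming the extension is installed in the
database; the table and database names below are just placeholders) rewrites
the table online with only brief locks:

    -- once per database:
    CREATE EXTENSION pg_repack;

    # then from the shell, an online equivalent of VACUUM FULL on one table:
    pg_repack --no-order --table=public.big_table mydb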

hp

-- 
   _  | Peter J. Holzer| we build much bigger, better disasters now
|_|_) || because we have much more sophisticated
| |   | h...@hjp.at | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson 




Re: Recommendation to run vacuum FULL in parallel

2019-04-03 Thread rihad
Does *every* table have *so much* free space that it's impractical to just 
let the files get refilled by normal usage?


Ideally VACUUM FULL should not require a giant lock on the table.

Sometimes a table's usage pattern involves many more updates than 
inserts, which gradually uses more and more unused space that is never 
used again by postgres, and plain autovacuuming doesn't return it to the 
OS. So DB size (as witnessed by psql's \l+) uses 5-6 times the space it 
actually needs. And using vacuum full is prohibitive because of the 
exclusive lock it takes on the table, preventing both writes and reads. 
Since rewriting a table is a completely internal operation from clients' 
POV, hopefully one day we will see a concurrent version of vacuum full.






Re: Recommendation to run vacuum FULL in parallel

2019-04-03 Thread Laurenz Albe
Perumal Raj wrote:
> We are planning to reclaim unused space from a 9.2 version postgres cluster.
> 
> Method: VACUUM FULL
> DB size: 500 GB
> Expected space to reclaim: 150 GB
> work_mem: 250 MB
> maintenance_work_mem: 20 GB
> 
> Questions:
> 
> 1. The vacuumdb -j option (parallel) is not available for version 9.2.
>    How to run vacuum full in parallel? At present it takes 8 hours if I run
>    it sequentially (vacuum full verbose;).

Run several scripts in parallel, where each of them vacuums some bloated tables.
Be warned that VACUUM (FULL) is quite I/O intense, so too much parallelism
might overload your I/O system and harm performance.

> 2. If we run vacuum full, do we need to run REINDEX/ANALYZE explicitly?

You don't need to run REINDEX, because that happens automatically.
You can use VACUUM (FULL, ANALYZE) to also gather statistics.

> 3. What is the best way to run VACUUM FULL within a smaller window?

Identify which tables really need it rather than VACUUMing everything.
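A sketch of how to pick them (the LIMIT is arbitrary and the table name is
a placeholder):

    -- largest user tables first; cross-check with pg_stat_all_tables.n_dead_tup
    SELECT n.nspname, c.relname,
           pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
     WHERE c.relkind = 'r'
       AND n.nspname NOT IN ('pg_catalog', 'information_schema')
     ORDER BY pg_total_relation_size(c.oid) DESC
     LIMIT 20;

    -- then only on tables that are actually bloated:
    VACUUM (FULL, ANALYZE) some_bloated_table;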

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com





Re: Recommendation to run vacuum FULL in parallel

2019-04-03 Thread Ron

On 4/3/19 12:50 AM, Perumal Raj wrote:

Hi ALL

We are planning to reclaim unused space from a 9.2 version postgres cluster.

Method : VACUUM FULL


Does *every* table have *so much* free space that it's impractical to just 
let the files get refilled by normal usage?



DB Size : 500 GB
Expected space to reclaim 150 GB
work_mem : 250 MB
maintenance_work_mem : 20 GB

*Questions:*

1. The vacuumdb -j option (parallel) is not available for version 9.2.
   How to run vacuum full in parallel? At present it takes 8 hours if I run 
   it sequentially (vacuum full verbose;).


2. If we run vacuum full, do we need to run REINDEX/ANALYZE explicitly?

3. What is the best way to run VACUUM FULL within a smaller window?


A good way to run *any* task like this in parallel is to generate X lists of 
objects, and then process each list in parallel.
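A possible sketch of that, reusing the psql + vacuumdb loop posted elsewhere in
this thread (the batch count, $PGPATH, and the temp file are assumptions):

    # dump the user tables, smallest first, then split them into 4
    # round-robin batches and vacuum-full each batch in parallel
    $PGPATH/psql -t -A -c "
        SELECT n.nspname || '.' || c.relname
          FROM pg_class c
          JOIN pg_namespace n ON n.oid = c.relnamespace
         WHERE c.relkind = 'r'
           AND n.nspname NOT IN ('pg_catalog', 'information_schema')
         ORDER BY pg_total_relation_size(c.oid)
    " > /tmp/tables.txt

    NBATCH=4
    for i in $(seq 0 $((NBATCH - 1))); do
        awk -v n=$NBATCH -v i=$i 'NR % n == i' /tmp/tables.txt |
        while read t; do
            $PGPATH/vacuumdb -w -z -f -t "$t"
        done &
    done
    wait    # block until every batch has finished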


--
Angular momentum makes the world go 'round.


Recommendation to run vacuum FULL in parallel

2019-04-02 Thread Perumal Raj
Hi ALL

We are planning to reclaim unused space from a 9.2 version postgres cluster.

Method: VACUUM FULL
DB size: 500 GB
Expected space to reclaim: 150 GB
work_mem: 250 MB
maintenance_work_mem: 20 GB

*Questions:*

1. The vacuumdb -j option (parallel) is not available for version 9.2.
   How to run vacuum full in parallel? At present it takes 8 hours if I run
   it sequentially (vacuum full verbose;).

2. If we run vacuum full, do we need to run REINDEX/ANALYZE explicitly?

3. What is the best way to run VACUUM FULL within a smaller window?

Thanks,
Raj