>
> ​
> Why are you dropping and re-loading? You mentioned it before and it
> sounded like it had something to do with adding columns, but you

don't have to drop and reload to add a column.​


​Adding a NULL column is fast. Dropping one too. I need to set some row as
TRUE. I can do it with an update, but in postgresql update is done by
delete then insert with copy of the row. This is really slow. A drop
cascade, then bulk load is better.

This is not the only reason. Drop & load simplify all the ETL process. No
question of delta changes and no "fuck brain" when a problem occurs or a
modification of the table. I've tested, it loads 20milion rows in 5 min
(without time for reindexing and time to retrieve datas)

2015-03-10 9:31 GMT+01:00 Jim Nasby <jim.na...@bluetreble.com>:

> On 3/9/15 8:17 AM, Nicolas Paris wrote:
>
>> (sorry for top-posting, gmail does not help.)
>>
>
> *shakes fist at gmail*
>
>  Thanks to your advice Jim, I have done an other test :
>> No partial indexes, just a partial index on boolean columns does the
>> job.  (I get same perfs as MV)
>> CREATE INDEX ..ON (BoolColumnX) WHERE BoolColumnX IS TRUE
>>
>> Then VIEW =
>> SELECT colA....colZ
>> FROM huge_table
>> WHERE BoolColumnX IS TRUE
>>
>> Then this only index is used 800times (for each bool col)  and saves
>> place as it does'nt indexes NULL values, and does no replicate. subsets.
>> Moreover the huge indexes are allways loaded in cache memory.
>>
>
> Cool. :)
>
>      According to this link
>>     http://postgresql.nabble.com/NULL-saves-disk-space-td4344106.html
>>     NULL values do not take place if only one other column are null for
>>     that row.
>>     Boolean takes 1 byte wheras smallint 2bytes.
>>     Then the space problem is not anymore a problem with boolean columns
>>     95% empty
>>
>>     One thing that is really great with postgresql is transaction for
>>     drop table cascade, that allow te restore all stuf index, views on a
>>     rollback if problem in loading appears.
>>     I hope using one transaction to drop/load many table is not a
>>     performance issue ?
>>
>
> Why are you dropping and re-loading? You mentioned it before and it
> sounded like it had something to do with adding columns, but you don't have
> to drop and reload to add a column.
>
> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com
>

Reply via email to