[SQL] Can't get upsert working.

2012-11-30 Thread Bert
Hello,

Anyone here with upsert experience?

I'm trying to run this query:

WITH UPSERT AS
  (UPDATE oltp.ST_TRANSACTION
   SET (row1,
row2,
row3) = (ET.row1,
 ET.row2,
 ET.row3)
   FROM
 (SELECT *
  FROM loadoltp.ST_TRANSACTION_INSERT
  WHERE LOADTABLETIME = '2012-11-30 14:48:59.252960') AS ET returning *)
INSERT INTO oltp.ST_TRANSACTION
SELECT (ET.row1,
ET.row2,
ET.row3)
FROM
   (SELECT *
   FROM loadoltp.ST_TRANSACTION_INSERT
   WHERE LOADTABLETIME = '2012-11-30 14:48:59.252960') AS ET
WHERE ET.tick_server_id NOT IN
(SELECT ET.tick_server_id
 FROM upsert b)
  AND tra_id NOT IN
(SELECT ET.tra_id
 FROM upsert b)

But I always get this error message:
ERROR:  column "row1" is of type integer but expression is of type record
LINE 67: SELECT (ET.ROW1,

Does anyone has an idea?

wkr,
Bert

-- 
Bert Desmet
0477/305361


[SQL] vacuum / analyze parent tables on partitioned tables.

2013-01-23 Thread Bert
Hello,

I wrote a script to make sure all tables are vacuumed and analyzed every
evening. This works very well, but I have a question:
I save in a table the start and end time of a vacuum/analyze. This way I
can measure what tables take a long time to vaccum/analyze, and what tables
are slow. (and much more).

But I have noticed that the parent table of a partitioned table also takes
a long time. Here is a snap shot of the following table

table_name ; avg runt time; max run time; min run time
"f_transaction_1";"00:03:07.8";"00:03:10";"00:03:03"
"f_transaction";"00:02:19.8";"00:02:25";"00:02:16"

f_tranaction_1 is 15GB data + 12GB of indexes. (I know, a lot of indexes).
f_tranaction is totally empy, but also contains all indexes. Which means 0B
table zise, and 140kB index size.

Does anyone has an idea why in this case the vacuum/analyze takes almost as
long on the parent table as on the biggest child table? (the other child
tables are smaller, and their vacuum/analyze time is much shorter).

wkr,
Bert

-- 
Bert Desmet
0477/305361


[SQL] Partition tables to improve select speed?

2013-01-31 Thread Bert
Hello,

We are using postgres as a datawarehouse.
We typically use very complex queries, and they seem to work very well.

We have several BIG tables. we chose to partition those tables on a
specific key.

We also adapted our selects, and added a 'where' clause which helps
postgres's query planner, so only one of the partition needs to be used in
stead of 32.

How ever, we get some strange results. For some queries the runtime really
is a lot faster. But for the runtime is a even a bit longer.

Can anyone give me insight on why that can happen?

wkr,

Bert


Re: [SQL] query doesn't always follow 'correct' path..

2013-02-18 Thread Bert
Hello,

yes, the tables are vacuumed every day with the following command: vacuum
analyze schema.table.
The last statistics were collected yesterday evening. I collected
statistics about the statistics, and I found the following:
table_name; starttime; runtime
"st_itemseat";"2013-02-17 23:48:42";"00:01:02"
"st_itemseat_45";"2013-02-17 23:35:15";"00:00:08"
"st_itemzone";"2013-02-17 23:35:33";"00:00:01"

st_itemseat_45 is a child-partition of st_itemseat.

They seem to be pretty much up to date I guess?
I also don't get any difference in the query plans when they are run in the
morning, or in the evening.

I have also run the query with set seq_scan to off, and then I get the
following output:
Total query runtime: 12025 ms.
20599 rows retrieved.
and the following plan: http://explain.depesz.com/s/yaJK

These are 3 different plans. And the last one is blazingly fast. That's the
one I would always want to use :-)

it's also weird that this is default plan for the biggest partition. But
the smaller the partition gets, the smaller the partition gets.
So I don't think it has anything to do with the memory settings. Since it
already chooses this plan for the bigger partitions...

wkr,
Bert


On Mon, Feb 18, 2013 at 11:51 AM, Frank Lanitz  wrote:

> Am 18.02.2013 10:43, schrieb Bert:
> > Does anyone has an idea what triggers this bad plan, and how I can fix
> it?
>
> Looks a bit like wrong statistics. Are the statistiks for your tables
> correct?
>
> Cheers,
> Frank
>
>
> --
> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-sql
>



-- 
Bert Desmet
0477/305361


Re: [SQL] query doesn't always follow 'correct' path..

2013-02-18 Thread Bert
Hello,

Thanks the nice people on irc my problem is fixed.
I changed the following settings in the postgres.conf file:
default_statistics_target = 5000 -> and I analyzed the tables after the
change of course -> now I only got 2 plans anymore, in stead of 3
cpu_tuple_cost = 0.1 -> by setting this value the seq scans were stopped,
and the better index_only scan / bitmap index scan were used for this
query.

Thank you Robe and Mabe_ for helping me with this issue!

wkr,
Bert


On Mon, Feb 18, 2013 at 2:42 PM, Bert  wrote:

> Hello,
>
> yes, the tables are vacuumed every day with the following command: vacuum
> analyze schema.table.
> The last statistics were collected yesterday evening. I collected
> statistics about the statistics, and I found the following:
> table_name; starttime; runtime
> "st_itemseat";"2013-02-17 23:48:42";"00:01:02"
> "st_itemseat_45";"2013-02-17 23:35:15";"00:00:08"
> "st_itemzone";"2013-02-17 23:35:33";"00:00:01"
>
> st_itemseat_45 is a child-partition of st_itemseat.
>
> They seem to be pretty much up to date I guess?
> I also don't get any difference in the query plans when they are run in
> the morning, or in the evening.
>
> I have also run the query with set seq_scan to off, and then I get the
> following output:
> Total query runtime: 12025 ms.
> 20599 rows retrieved.
> and the following plan: http://explain.depesz.com/s/yaJK
>
> These are 3 different plans. And the last one is blazingly fast. That's
> the one I would always want to use :-)
>
> it's also weird that this is default plan for the biggest partition. But
> the smaller the partition gets, the smaller the partition gets.
> So I don't think it has anything to do with the memory settings. Since it
> already chooses this plan for the bigger partitions...
>
> wkr,
> Bert
>
>
> On Mon, Feb 18, 2013 at 11:51 AM, Frank Lanitz wrote:
>
>> Am 18.02.2013 10:43, schrieb Bert:
>> > Does anyone has an idea what triggers this bad plan, and how I can fix
>> it?
>>
>> Looks a bit like wrong statistics. Are the statistiks for your tables
>> correct?
>>
>> Cheers,
>> Frank
>>
>>
>> --
>> Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-sql
>>
>
>
>
> --
> Bert Desmet
> 0477/305361
>



-- 
Bert Desmet
0477/305361


Re: [SQL] query doesn't always follow 'correct' path..

2013-02-18 Thread Bert
Hello,

there were 3 hours in between the 2 queries. so I guess new data was loaded
already. new data is being loaded with that etl_run_id.

wkr,
Bert


On Mon, Feb 18, 2013 at 4:20 PM, Виктор Егоров  wrote:

> 2013/2/18 Bert 
>
>> When I don't touch the indexscan setting I get the following output:
>> Total query runtime: 611484 ms.
>> 20359 rows retrieved.
>> and the following plan: http://explain.depesz.com/s/sDy
>>
>> However, when I put set enable_indexscan=off; in fron of the same query I
>> get the following output:
>> Total query runtime: 16281 ms.
>> 20599 rows retrieved.
>> and the followign plan: http://explain.depesz.com/s/EpP
>>
>
> Is this a typo or do you really get different number of rows returned with
> and without indexscans?
> Is this expected for the same query to return different sets over time?
>
>
> --
> Victor Y. Yegorov
>



-- 
Bert Desmet
0477/305361