Hello
We have a process in R which reads raw statistical data from a table and
computes time series values from it.
The time series values are stored in an hstore field, with the date as the key
and the observation as the value.
The process writes the computed values into a temporary table and locks the
corresponding row(s) of the target table for update.
Finally, the row(s) are updated if they already exist, or inserted if they do
not.
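For reference, the structures involved look roughly like this (a simplified
sketch; the real timeseries_main lives in a schema and has further columns):

    -- simplified sketch of the assumed target table layout
    CREATE TABLE timeseries_main (
        ts_key       varchar,   -- series identifier, effectively the key
        ts_data      hstore,    -- date (text key) => observation (text value)
        ts_frequency integer
    );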
This process runs nightly and handles many GB of data without generating any
errors. Normally these are low-frequency time series
(monthly or quarterly data).
Now we have a daily time series with about 46'000 key/value pairs. In the near
future this will grow to 86'000 once data from previous years is included.
When R processes the daily time series we get a "stack size exceeded" error,
followed by the hint to increase max_stack_depth. My colleague, who wrote the
R function and performed the test, read the docs and, in line with ulimit -s,
increased max_stack_depth to 7 MB.
Here are the details of the OS and PG:
OS: OS X 10.10.5
PG: 9.3.3
ulimit -s: 8192
The change did take effect, as "show max_stack_depth;" confirmed. After this
change, however, the query fails with the same error as before, just reporting
the new limit of 7 MB.
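For completeness, this is the verification in psql after raising the setting
(the OS limit was checked separately with ulimit -s):

    -- checked in psql after the change
    SHOW max_stack_depth;   -- now reports 7MB (the default would be 2MB)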
The query itself was written to a file in order to verify its size. It turned
out to be 1.7 MB, i.e. below even the conservative default limit of 2 MB, let
alone the 7 MB we set.
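The size check was done along these lines (a sketch; the exact code may differ
slightly, and query_dump.sql is just a throwaway file name):

    # dump the generated SQL to a file and look at its size on disk
    writeLines(sql_query_data, "query_dump.sql")
    file.info("query_dump.sql")$size / 1024^2   # ~1.7 MB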
Apart from the fact that we could consider a different strategy for storing
time series, we would like to understand what is causing the problem.
Here is the query as it appears in the R code:
sql_query_data <- sprintf("BEGIN;
    CREATE TEMPORARY TABLE ts_updates(ts_key varchar, ts_data hstore,
                                      ts_frequency integer) ON COMMIT DROP;
    INSERT INTO ts_updates(ts_key, ts_data) VALUES %s;
    LOCK TABLE %s.timeseries_main IN EXCLUSIVE MODE;

    UPDATE %s.timeseries_main
    SET ts_data = ts_updates.ts_data
    FROM ts_updates
    WHERE ts_updates.ts_key = %s.timeseries_main.ts_key;

    INSERT INTO %s.timeseries_main
    SELECT ts_updates.ts_key, ts_updates.ts_data, ts_updates.ts_frequency
    FROM ts_updates
    LEFT OUTER JOIN %s.timeseries_main
           ON (%s.timeseries_main.ts_key = ts_updates.ts_key)
    WHERE %s.timeseries_main.ts_key IS NULL;
    COMMIT;",
    values, schema, schema, schema, schema, schema, schema, schema)
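The whole string is then sent to the server in one call. Just for
illustration, with RPostgreSQL (connection details made up) that looks like:

    library(RPostgreSQL)
    con <- dbConnect(PostgreSQL(), dbname = "ourdb", host = "localhost")
    # the complete multi-statement string is shipped as a single query
    dbSendQuery(con, sql_query_data)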
And here is how it looks in the end:
INSERT INTO ts_updates(ts_key, ts_data, ts_frequency) VALUES
('somekey',hstore('1900-01-01','-0.395131869823009')||
hstore('1900-01-02','-0.595131869823009')||
hstore('1900-01-03','-0.395131869823009')||
[... ~46'000 such hstore(...) pairs in total ...]
hstore('1900-01-04','-0.395131869823009'),NULL);
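The VALUES fragment is assembled in R roughly like this (an illustrative
sketch with made-up data, not our actual function):

    # build one row of the VALUES list: key, concatenated hstore pairs, frequency
    dates  <- format(seq(as.Date("1900-01-01"), by = "day", length.out = 46000))
    vals   <- rnorm(length(dates))
    pairs  <- sprintf("hstore('%s','%s')", dates, vals)
    values <- sprintf("('somekey',%s,NULL)", paste(pairs, collapse = "||"))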
The test was run on my colleague's local machine; there are no other
concurrent users.
We would be grateful for any hints on what the problem may be and/or how to
investigate it further.
Please reply to all, as my colleague is not yet subscribed to the mailing list.
Regards,
Charles and Matthias