Re: [GENERAL] optimizing a query

2016-06-21 Thread Jonathan Vanasco

On Jun 21, 2016, at 6:55 PM, David G. Johnston wrote:

> ​Aside from the name these indexes are identical...​

Sorry. Tired eyes, copy/pasting between windows, and trying to 'average' out 40 
similar queries.

> ​These two items combined reduce the desirability of diagnosing this...it 
> doesn't seem like you've faithfully recreated the scenario for us to evaluate.
> 
> Your post is also not self-contained and you haven't provided the actual 
> EXPLAINs you are getting.

I played around with some more indexes, creating and disabling them on one 
specific query.  Eventually I found some index formats that didn't pull in the 
whole table.  They gave approximately the same results as the other selects, 
with some differences in reporting: the heap scan on the table was negligible, 
and the big hit came from the outer hash join.  The formatting in EXPLAIN had 
made a negligible check look like it was the root issue.


CREATE TABLE t_a (id SERIAL PRIMARY KEY,
  col_1 INT NOT NULL,
  col_2 BOOLEAN DEFAULT NULL
  );
CREATE INDEX test_idx__t_a_col1_col2__v1 on t_a (col_1) WHERE col_2 IS NOT FALSE;
CREATE INDEX test_idx__t_a_col1_col2__v2 on t_a (col_1, id) WHERE col_2 IS NOT FALSE;
CREATE INDEX test_idx__t_a_col1_col2__v3 on t_a (id, col_1) WHERE col_2 IS NOT FALSE;
CREATE INDEX test_idx__t_a_col1_col2__v4 on t_a (id, col_1, col_2) WHERE col_2 IS NOT FALSE;
CREATE INDEX test_idx__t_a_col1_col2__v5 on t_a (col_1, col_2) WHERE col_2 IS NOT FALSE;

CREATE TABLE t_b (id SERIAL PRIMARY KEY,
  col_1 INT NOT NULL,
  col_2 BOOLEAN DEFAULT NULL
  );
CREATE TABLE t_a2b (a_id INT NOT NULL REFERENCES t_a(id),
b_id INT NOT NULL REFERENCES t_b(id),
col_a INT NOT NULL,
PRIMARY KEY (a_id, b_id)
);

EXPLAIN ANALYZE
SELECT t_a2b.b_id AS t_a2b_b_id,
   count(t_a2b.b_id) AS counted
FROM t_a2b
JOIN t_a ON t_a2b.a_id = t_a.id
WHERE t_a.col_1 = 730
  AND t_a2b.col_a = 1
  AND (t_a.col_2 IS NOT False)
GROUP BY t_a2b.b_id
ORDER BY counted DESC,
t_a2b.b_id ASC
LIMIT 25
OFFSET 0
;



 QUERY PLAN

 Limit  (cost=270851.55..270851.62 rows=25 width=4) (actual time=1259.950..1259.953 rows=25 loops=1)
   ->  Sort  (cost=270851.55..270863.43 rows=4750 width=4) (actual time=1259.945..1259.945 rows=25 loops=1)
         Sort Key: (count(t_a2b.b_id)) DESC, t_a2b.b_id
         Sort Method: top-N heapsort  Memory: 26kB
         ->  HashAggregate  (cost=270670.01..270717.51 rows=4750 width=4) (actual time=1259.430..1259.769 rows=1231 loops=1)
               Group Key: t_a2b.b_id
               ->  Hash Join  (cost=171148.45..270516.71 rows=30660 width=4) (actual time=107.662..1230.481 rows=124871 loops=1)
                     Hash Cond: (t_a2b.a_id = t_a.id)
                     ->  Seq Scan on t_a2b  (cost=0.00..89741.18 rows=2485464 width=8) (actual time=0.011..661.978 rows=2492783 loops=1)
                           Filter: (col_a = 1)
                           Rows Removed by Filter: 2260712
                     ->  Hash  (cost=170446.87..170446.87 rows=56126 width=4) (actual time=107.409..107.409 rows=48909 loops=1)
                           Buckets: 65536  Batches: 1  Memory Usage: 2232kB
                           ->  Bitmap Heap Scan on t_a  (cost=1055.41..170446.87 rows=56126 width=4) (actual time=18.243..94.470 rows=48909 loops=1)
                                 Recheck Cond: ((col_1 = 730) AND (col_2 IS NOT FALSE))
                                 Heap Blocks: exact=43972
                                 ->  Bitmap Index Scan on test_idx__t_a_col1_col2__v2  (cost=0.00..1041.38 rows=56126 width=0) (actual time=8.661..8.661 rows=48909 loops=1)
                                       Index Cond: (col_1 = 730)
 Planning time: 0.796 ms
 Execution time: 1260.092 ms


QUERY PLAN
--
 Limit  (cost=208239.59..208239.65 rows=25 width=4) (actual time=1337.739..1337.743 rows=25 loops=1)
   ->  Sort  (cost=208239.59..208251.47 rows=4750 width=4) (actual time=1337.737..1337.739 rows=25 loops=1)
         Sort Key: (count(t_a2b.b_id)) DESC, t_a2b.b_id
         Sort Method: top-N heapsort  Memory: 26kB
         ->  HashAggregate  (cost=208058.05..208105.55 rows=4750 width=4) 

Re: [GENERAL] Help needed structuring Postgresql correlation query

2016-06-21 Thread Tim Smith
Thanks for that, looks like something to sink my teeth into!

On 21 June 2016 at 13:29, Alban Hertroys  wrote:
>
>> On 19 Jun 2016, at 10:58, Tim Smith  wrote:
>>
>> Hi,
>>
>> My postgresql-fu is not good enough to write a query to achieve this
>> (some may well say R is a better-suited tool to achieve this!).
>>
>> I need to calculate what I would call a correlation window on a time
>> series of data, my table looks like this:
>>
>> create table data(data_date date,data_measurement numeric);
>> insert into data values('2016-01-01',16.23);
>> 
>> insert into data values('2016-06-19',30.54);
>>
>> My "target sample" would be the N most recent samples in the table
>> (e.g. 20, the most recent 20 days)
>>
>> My "potential sample" would be a moving window of size N (the same
>> size N as above), starting at T0 (i.e. 2016-01-01 in this example) and
>> incrementing by one (i.e. 2016-01-01 + 20, then 2016-01-02+20 etc),
>> but the "target sample" would obviously be excluded.
>>
>> The output needs to display window date range (or at least the start
>> date of the "potential sample" window) and the result
>> corr(target,potential).
>>
>> Hope that makes sense
>
> Something like this could do the trick (untested):
>
> with recursive sample (nr, start_date) as (
> select 1 as nr, data_date as start_date, 
> SUM(data_measurement) as total
> from generate_series(0, 19) range(step)
> left join data on (data_date = start_date + range.step)
>
> union all
>
> select nr + 1, sample.start_date +1, SUM(data_measurement) as 
> total
> from sample
> join generate_series(0, 19) range(step)
> left join data on (data_date = start_date +1 + range.step)
> where start_date +1 +19 <= (select MAX(data_date) from data)
> group by 1, 2
> )
> select * from sample where start_date >= '2016-01-01';
>
> Not sure how best to go about parameterising sample size N, a stored function 
> seems like a good option.
>
>
> Another approach would be to move a (cumulative) window-function with 20 
> items over your data set and for each row subtract the first value of the 
> previous window from the total of the current window (that is, assuming 
> you're calculating a SUM of data_measurement for each window of 20 records).
>
> Visually that looks something like this for sample size 4:
> sample 1: (A + B + C + D)
> sample 2: (A + B + C + D) + E - A = (B + C + D + E)
> sample 3: (B + C + D + E) + F - B = (C + D + E + F)
> etc.
>
> To accomplish this, you calculate two cumulative totals (often misnamed as 
> running totals, but AFAIK that's something different), one from the start, 
> and one lagging N rows behind (you can use the lag() window function for 
> that) and subtract the two.
>
> Good luck!
>
> Alban Hertroys
> --
> If you can't see the forest for the trees,
> cut the trees and you'll find there is no forest.
>




Re: [GENERAL] optimizing a query

2016-06-21 Thread David G. Johnston
On Tue, Jun 21, 2016 at 6:44 PM, Adrian Klaver 
wrote:

> On 06/21/2016 03:33 PM, Jonathan Vanasco wrote:
>
>>
>>
>> In an effort to simplify the work, I've created indexes on t_a that have
>> all the related columns.
>>
>> CREATE INDEX test_idx ON t_a(col_1, id) WHERE col_2 IS NOT FALSE;
>> CREATE INDEX test_idx__a ON t_a(col_1, id) WHERE col_2 IS NOT
>> FALSE;
>>
>
​Aside from the name these indexes are identical...​


>> postgres will query test_idx__a first (yay!) but then does a bitmap heap
>> scan on t_a, and uses the raw t_a for the hash join.
>>
>> I don't actually need any information from t_a - it's just there for the
>> filtering, and ideally postgres would just use the index.
>>
>
​This is the description of a semi-join.

WHERE EXISTS (SELECT 1 FROM t_a WHERE t_a.id = t_a2b.a_id AND t_a.col_1 = 730 AND t_a.col_2 IS NOT FALSE)
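
For illustration, the example query from this thread rewritten around that EXISTS clause might look roughly like this (an untested sketch, not a confirmed rewrite):

SELECT t_a2b.b_id AS b_id,
       count(t_a2b.b_id) AS counted
FROM t_a2b
WHERE t_a2b.col_a = 1
  AND EXISTS (SELECT 1
              FROM t_a
              WHERE t_a.id = t_a2b.a_id
                AND t_a.col_1 = 730
                AND t_a.col_2 IS NOT FALSE)
GROUP BY t_a2b.b_id
ORDER BY counted DESC,
         t_a2b.b_id ASC
LIMIT 25;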


>> I thought this might have been from using a partial index, but the same
>> results happen with a full index.  I just can't seem to avoid this hash
>> join against the full table.
>>
>> anyone have a suggestion?
>>
>>
> The below works without including t_a in the FROM?
>
>
>> example query
>>
>> SELECT t_a2b.b_id AS b_id,
>>count(t_a2b.b_id) AS counted
>> FROM t_a2b
>> WHERE
>>   t_a2b.col_a = 1
>>   AND
>>   t_a.col_1 = 730
>>   AND
>>   t_a.col_2 IS NOT False
>> GROUP BY t_a2b.b_id
>> ORDER BY counted DESC,
>> t_a2b.b_id ASC
>>
>>
​These two items combined reduce the desirability of diagnosing this...it
doesn't seem like you've faithfully recreated the scenario for us to
evaluate.

Your post is also not self-contained and you haven't provided the actual
EXPLAINs you are getting.

David J.


Re: [GENERAL] optimizing a query

2016-06-21 Thread Adrian Klaver

On 06/21/2016 03:33 PM, Jonathan Vanasco wrote:

I have a handful of queries in the following general form that I can't seem to 
optimize any further (same results on 9.3, 9.4, 9.5)

I'm wondering if anyone might have a suggestion, or if they're done.

The relevant table structure:

t_a2b
a_id INT references t_a(id)
b_id INT references t_b(id)
col_a

t_a
id INT
col_1 INT
col_2 BOOL

The selects query the association table (t_a2b) and join in a related table 
(t_a) for some filtering.

In an effort to simplify the work, I've created indexes on t_a that have all 
the related columns.

CREATE INDEX test_idx ON t_a(col_1, id) WHERE col_2 IS NOT FALSE;
CREATE INDEX test_idx__a ON t_a(col_1, id) WHERE col_2 IS NOT FALSE;

postgres will query test_idx__a first (yay!) but then does a bitmap heap scan 
on t_a, and uses the raw t_a for the hash join.

I don't actually need any information from t_a - it's just there for the 
filtering, and ideally postgres would just use the index.

I thought this might have been from using a partial index, but the same results 
happen with a full index.  I just can't seem to avoid this hash join against 
the full table.

anyone have a suggestion?



The below works without including t_a in the FROM?



example query

SELECT t_a2b.b_id AS b_id,
   count(t_a2b.b_id) AS counted
FROM t_a2b
WHERE
  t_a2b.col_a = 1
  AND
  t_a.col_1 = 730
  AND
  t_a.col_2 IS NOT False
GROUP BY t_a2b.b_id
ORDER BY counted DESC,
t_a2b.b_id ASC






--
Adrian Klaver
adrian.kla...@aklaver.com




[GENERAL] optimizing a query

2016-06-21 Thread Jonathan Vanasco
I have a handful of queries in the following general form that I can't seem to 
optimize any further (same results on 9.3, 9.4, 9.5)

I'm wondering if anyone might have a suggestion, or if they're done.

The relevant table structure:

t_a2b
a_id INT references t_a(id)
b_id INT references t_b(id)
col_a

t_a
id INT
col_1 INT
col_2 BOOL

The selects query the association table (t_a2b) and join in a related table 
(t_a) for some filtering.

In an effort to simplify the work, I've created indexes on t_a that have all 
the related columns.

CREATE INDEX test_idx ON t_a(col_1, id) WHERE col_2 IS NOT FALSE;
CREATE INDEX test_idx__a ON t_a(col_1, id) WHERE col_2 IS NOT FALSE;

postgres will query test_idx__a first (yay!) but then does a bitmap heap scan 
on t_a, and uses the raw t_a for the hash join.  

I don't actually need any information from t_a - it's just there for the 
filtering, and ideally postgres would just use the index.

I thought this might have been from using a partial index, but the same results 
happen with a full index.  I just can't seem to avoid this hash join against 
the full table.

anyone have a suggestion?


example query

SELECT t_a2b.b_id AS b_id,
   count(t_a2b.b_id) AS counted
FROM t_a2b 
WHERE 
  t_a2b.col_a = 1
  AND
  t_a.col_1 = 730
  AND
  t_a.col_2 IS NOT False
GROUP BY t_a2b.b_id
ORDER BY counted DESC,
t_a2b.b_id ASC  


   





Re: [GENERAL] does timestamp precision affect storage size?

2016-06-21 Thread Jonathan Vanasco

On Jun 21, 2016, at 4:50 PM, Tom Lane wrote:

> Storage-wise, no.  If you have a resolution spec on your columns now,
> I think dropping the resolution spec would save you a few nanoseconds per
> row insertion due to not having to apply the roundoff function.  Adding
> one would certainly not improve speed.


On Jun 21, 2016, at 4:47 PM, Vik Fearing wrote:
> No, there are no space savings here.
> 
> =# select pg_column_size('now'::timestamptz(0)),
> pg_column_size('now'::timestamptz);


Thanks.  I thought that was happening, but wanted to make sure.  The allure of 
shaving a byte or two off some rows couldn't be ignored ;)



Re: [GENERAL] does timestamp precision affect storage size?

2016-06-21 Thread Tom Lane
Jonathan Vanasco  writes:
> would there be any savings in storage or performance improvements from losing 
> the resolution on fractional seconds,

Storage-wise, no.  If you have a resolution spec on your columns now,
I think dropping the resolution spec would save you a few nanoseconds per
row insertion due to not having to apply the roundoff function.  Adding
one would certainly not improve speed.
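
As a small illustration (the table and column names here are made up), the two column forms being compared are:

CREATE TABLE ts_demo (
    ts_rounded  timestamptz(0),  -- fractional seconds rounded off on input (the extra roundoff step)
    ts_full     timestamptz      -- full microsecond resolution, no roundoff applied
);
-- both columns store their values in 8 bytes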

regards, tom lane




Re: [GENERAL] does timestamp precision affect storage size?

2016-06-21 Thread Vik Fearing
On 21/06/16 22:39, Jonathan Vanasco wrote:
> I'm cleaning up some queries for performance, and noticed that we never use 
> precision beyond the second (i.e., `timestamp(0)`) in our business logic.
> 
> Would there be any savings in storage or performance improvements from losing 
> the resolution on fractional seconds, or are `timestamp(precision)` 
> effectively the same for storage as `timestamp`?  (Based on the docs, I assume 
> the latter but wanted to check.)

No, there are no space savings here.

=# select pg_column_size('now'::timestamptz(0)),
pg_column_size('now'::timestamptz);

 pg_column_size | pg_column_size
----------------+----------------
              8 |              8
(1 row)

-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




[GENERAL] does timestamp precision affect storage size?

2016-06-21 Thread Jonathan Vanasco
I'm cleaning up some queries for performance, and noticed that we never use 
precision beyond the second (i.e., `timestamp(0)`) in our business logic.

Would there be any savings in storage or performance improvements from losing 
the resolution on fractional seconds, or are `timestamp(precision)` effectively 
the same for storage as `timestamp`?  (Based on the docs, I assume the latter 
but wanted to check.)



Re: [GENERAL] Help with namespaces in xpath (PostgreSQL 9.5.3)

2016-06-21 Thread Allan Kamau
Thank you David.

-Allan.

On Mon, Jun 20, 2016 at 11:19 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Sun, Jun 19, 2016 at 5:09 PM, Allan Kamau  wrote:
>
>> I have an xml document from which I would like to extract the contents of
>> several elements.
>>
>> I would like to use xpath to extract the contents of "name" from the xml
>> document shown below.
>>
>> WITH x AS
>> (
>> SELECT
>> '<uniprot xmlns="http://uniprot.org/uniprot"
>>  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>  xsi:schemaLocation="http://uniprot.org/uniprot
>>  http://www.uniprot.org/support/docs/uniprot.xsd">
>> <entry version="56">
>> <accession>A0JM59</accession>
>> <name>UBP20_XENTR</name>
>> </entry>
>> </uniprot>'::xml AS d
>> )
>> SELECT (xpath('/uniprot/entry/name/text()',a.d))[1]::text AS uniprot_name
>> FROM
>> x AS a
>> ;
>>
>> The documentation for xpath()
>> (https://www.postgresql.org/docs/9.5/static/functions-xml.html)
>> describes "xpath(xpath, xml [, nsarray])".
>>
>> For the above xml document, what would be the two dimensional array
>> "nsarray" for the xpath() function?
>>
>
> ​Is there a specific part of the description and two examples that doesn't
> make sense to you?
>
> ​Or more specifically, do you understand what namespaces are?​
>
> ARRAY[
> ARRAY['defaultns','http://uniprot.org/uniprot'],
> ARRAY['xsi','http://www.w3.org/2001/XMLSchema-instance']
> ]​
>
> In effect, when the xpath function parses the XML document it tosses away
> all of the document-local namespace aliases and instead associates the full
> namespace URI with each element (in the DOM).  Since, in the xpath
> expression, you'd usually want to refer to nodes in the DOM via their
> namespace alias, you need to tell the xpath function which aliases you
> intend to use in the xpath and which full URI they correspond to.
> Furthermore, there is no concept of a default namespace in the xpath
> expression.  So while you can simply copy-paste the aliases and URIs for
> all of the non-default namespace aliases, you must also choose a unique
> alias for the default namespace in the original document.
>
> In the above I've copied the alias and namespace URI for the named "xsi"
> alias and gave the name "defaultns" to the original document's default
> namespace URI.
>
> David J.
>
>
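
Putting the pieces together, the original query with that namespace array applied might look roughly like this (an untested sketch; the document is shortened to just the elements the xpath needs, and the xpath expression must use the chosen "defaultns" alias because xpath has no default namespace):

WITH x AS (
    SELECT '<uniprot xmlns="http://uniprot.org/uniprot">
              <entry version="56">
                <accession>A0JM59</accession>
                <name>UBP20_XENTR</name>
              </entry>
            </uniprot>'::xml AS d
)
SELECT (xpath('/defaultns:uniprot/defaultns:entry/defaultns:name/text()',
              a.d,
              ARRAY[ARRAY['defaultns', 'http://uniprot.org/uniprot'],
                    ARRAY['xsi', 'http://www.w3.org/2001/XMLSchema-instance']]
       ))[1]::text AS uniprot_name
FROM x AS a;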


Re: [HACKERS] [GENERAL] PgQ and pg_dump

2016-06-21 Thread Martín Marqués
2016-06-21 13:08 GMT-03:00 Robert Haas :
> On Thu, Jun 16, 2016 at 1:46 PM, Martín Marqués  
> wrote:
>> The comment is accurate on what is going to be dumpable and what's not
>> from the code. In our case, as the pgq schema is not dumpable because
>> it comes from an extension, other objects it contains will not be
>> dumpable as well.
>>
>> That's the reason why the PgQ event tables created by
>> pgq.create_queue() are not dumped.
>
> That sucks.

Yes, and I'm surprised we haven't had any bug report yet on
inconsistent dumps. The patch that changed pg_dump's behavior on
extension objects is more than a year old.

I'll find some time today to add tests and check for other objects
that are not dumped for the same reason.
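
For anyone who wants to check their own database, here is a rough sketch (assuming the extension is named "pgq") of listing the tables PostgreSQL records as members of the extension; the event tables created later by pgq.create_queue() will not appear here, even though pg_dump currently skips them along with the rest of the pgq schema:

SELECT d.objid::regclass AS extension_member_table
FROM pg_depend d
JOIN pg_extension e ON e.oid = d.refobjid
WHERE d.refclassid = 'pg_extension'::regclass
  AND d.classid = 'pg_class'::regclass
  AND d.deptype = 'e'          -- 'e' marks extension-membership dependencies
  AND e.extname = 'pgq';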

Cheers,

-- 
Martín Marqués            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [GENERAL] PgQ and pg_dump

2016-06-21 Thread Robert Haas
On Thu, Jun 16, 2016 at 1:46 PM, Martín Marqués  wrote:
> The comment is accurate on what is going to be dumpable and what's not
> from the code. In our case, as the pgq schema is not dumpable because
> it comes from an extension, other objects it contains will not be
> dumpable as well.
>
> That's the reason why the PgQ event tables created by
> pgq.create_queue() are not dumped.

That sucks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




R: R: [GENERAL] Vacuum full: alternatives?

2016-06-21 Thread Job
Hello,
very interesting comments and contributions, thank you.

>I've just tested pg_bulkload with the default settings, and it
>definitely isn't using the fsm to re-use freed space in the table.  If
>they use WRITER = BUFFERED it would, though.

So with WRITER = BUFFERED it should be slower, but space that has been marked 
as free should be reused again?

Thank you!
Francesco



From: Jeff Janes [jeff.ja...@gmail.com]
Sent: Monday, 20 June 2016 17:51
To: Martín Marqués
Cc: Melvin Davidson; Rakesh Kumar; Job; pgsql-general@postgresql.org
Subject: Re: R: [GENERAL] Vacuum full: alternatives?

On Mon, Jun 20, 2016 at 7:23 AM, Martín Marqués  wrote:
> On 20/06/16 at 09:50, Melvin Davidson wrote:
>>
>>
>>>but it won't let it grow too (or am I missing something).
>>
>> Yes, you are missing something. By partioning and {Vacuum Full only the
>> table with data no longer needed}, the rest of the data remains
>> available to the users
>> AND space is reclaimed by the O/S, so it's the best of both worlds.
>
> That's not entirely true. Think about a SELECT which has to scan all
> child tables.
>

Yes, for the partitioning to be a good option, you would probably have
to arrange it such that you can prove that all tuples in a given
partition are eligible for deletion (or have already been deleted),
and then either truncate or dis-inherit the partition.  That still
requires a stringent lock, but it is only held for a very short time.
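
A hypothetical sketch of that approach with 9.x inheritance partitioning (the table names are made up for illustration):

BEGIN;
-- detach the child whose rows are all eligible for deletion
ALTER TABLE log_2015 NO INHERIT log_parent;
COMMIT;

-- then either reclaim the space immediately...
DROP TABLE log_2015;
-- ...or keep the now-detached table around and empty it for later reuse:
-- TRUNCATE log_2015;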

> Your are also adding another layer of complexity to the system.

I think that using pg_bulkload adds more complexity to the system than
partitioning would.  I wonder if they really need to use that, or if
they just picked it over COPY because it sounded like a free lunch.
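
For comparison, the plain COPY alternative is a one-liner (the table and file names here are hypothetical):

COPY my_table FROM '/var/tmp/batch.csv' WITH (FORMAT csv, HEADER true);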



Cheers,

Jeff



Re: [GENERAL] Help needed structuring Postgresql correlation query

2016-06-21 Thread Alban Hertroys

> On 19 Jun 2016, at 10:58, Tim Smith  wrote:
> 
> Hi,
> 
> My postgresql-fu is not good enough to write a query to achieve this
> (some may well say R is a better-suited tool to achieve this!).
> 
> I need to calculate what I would call a correlation window on a time
> series of data, my table looks like this:
> 
> create table data(data_date date,data_measurement numeric);
> insert into data values('2016-01-01',16.23);
> 
> insert into data values('2016-06-19',30.54);
> 
> My "target sample" would be the N most recent samples in the table
> (e.g. 20, the most recent 20 days)
> 
> My "potential sample" would be a moving window of size N (the same
> size N as above), starting at T0 (i.e. 2016-01-01 in this example) and
> incrementing by one (i.e. 2016-01-01 + 20, then 2016-01-02+20 etc),
> but the "target sample" would obviously be excluded.
> 
> The output needs to display window date range (or at least the start
> date of the "potential sample" window) and the result
> corr(target,potential).
> 
> Hope that makes sense

Something like this could do the trick (untested):

with recursive sample (nr, start_date) as (
select 1 as nr, data_date as start_date, SUM(data_measurement) 
as total
from generate_series(0, 19) range(step)
left join data on (data_date = start_date + range.step)

union all

select nr + 1, sample.start_date +1, SUM(data_measurement) as 
total
from sample
join generate_series(0, 19) range(step)
left join data on (data_date = start_date +1 + range.step)
where start_date +1 +19 <= (select MAX(data_date) from data)
group by 1, 2
)
select * from sample where start_date >= '2016-01-01';

Not sure how best to go about parameterising sample size N, a stored function 
seems like a good option.


Another approach would be to move a (cumulative) window-function with 20 items 
over your data set and for each row subtract the first value of the previous 
window from the total of the current window (that is, assuming you're 
calculating a SUM of data_measurement for each window of 20 records).

Visually that looks something like this for sample size 4:
sample 1: (A + B + C + D)
sample 2: (A + B + C + D) + E - A = (B + C + D + E)
sample 3: (B + C + D + E) + F - B = (C + D + E + F)
etc.

To accomplish this, you calculate two cumulative totals (often misnamed as 
running totals, but AFAIK that's something different), one from the start, and 
one lagging N rows behind (you can use the lag() window function for that) and 
subtract the two.
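
A minimal sketch of that idea, assuming one row per day with no gaps and a window of N = 20 rows (untested):

WITH cumulative AS (
    SELECT data_date,
           SUM(data_measurement) OVER (ORDER BY data_date) AS running_total
    FROM data
)
SELECT data_date AS window_end,
       -- running total minus the running total 20 rows earlier = sum of the last 20 rows
       running_total - COALESCE(lag(running_total, 20) OVER (ORDER BY data_date), 0) AS window_sum
FROM cumulative
ORDER BY data_date;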

Good luck!

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.





Re: [GENERAL] Regression tests (Background Workers)

2016-06-21 Thread Michael Paquier
On Tue, Jun 21, 2016 at 2:02 PM, Dharini  wrote:
> The background process is initialized at server start, and when trying to run
> the tests I get the following error.
>
> $ make installcheck
> (using postmaster on Unix socket, default port)
> == dropping database "contrib_regression" ==
> ERROR:  DROP DATABASE cannot be executed from a function or multi-command
> string
> command failed: "/postgres/install/bin/psql" -X -c "DROP DATABASE IF EXISTS
> \"contrib_regression\"" "postgres"
> make: *** [installcheck] Error 2

Which test are you trying to run? That's not from the in-core source
tree, right? And on which version of Postgres is this being attempted?
With this level of detail it's hard to know what's going on; one can
only guess that PreventTransactionChain is being called on DROP
DATABASE...
-- 
Michael

