Re: [GENERAL] Different results from view and from its defintion query [w/ windowing function]

2012-08-21 Thread Achilleas Mantzios
first_value refers to the first row from the window frame. Unless you force 
some kind of ordering, you cannot expect 
consistent results out of this.

See the PARTITION BY ... ORDER BY syntax in 
http://www.postgresql.org/docs/9.1/static/tutorial-window.html

On Δευ 20 Αυγ 2012 01:55:38 Thalis Kalfigkopoulos wrote:


Sorry for the lack of a more appropriate title. 
The summary of my problem is: i run a query and I get some results; then I 
create a view using this query, and I run the same query on the view, and get 
different results. Details follow.

On the original table the analytical data is as follows:
# SELECT id,experiment,insertedon,score FROM data WHERE id=1160;

  id |experiment| insertedon  | score  
+--+-+
  1160 | alpha  | 2012-08-19 01:01:12 | 220.69
  1160 | alpha  | 2012-08-19 01:01:22 | 220.69
  1160 | beta  | 2012-08-19 01:01:31 |  220.7
  1160 | beta  | 2012-08-19 01:01:42 |  220.7
  1160 | beta  | 2012-08-19 01:01:54 |  220.7

My query of interest using windowing functions is:

# SELECT id, experiment, first_value(insertedon) OVER (PARTITION BY score, id) 
AS first_insertedon, score FROM data WHERE id=1160;
 id   |experiment|  first_insertedon   | score  |
+--+-++--
  1160 | alpha  | 2012-08-19 01:01:12 | 220.69 |
  1160 | alpha  | 2012-08-19 01:01:12 | 220.69 |
  1160 | beta  | 2012-08-19 01:01:31 |  220.7 |
  1160 | beta  | 2012-08-19 01:01:31 |  220.7 |
  1160 | beta  | 2012-08-19 01:01:31 |  220.7 |

So far so good. I then create the view on this last query without the WHERE 
condition:
# CREATE VIEW clustered_view AS SELECT id, experiment, first_value(insertedon) 
OVER (PARTITION BY score, id) AS first_insertedon, score FROM data;

I see the view created correctly and its definition is according to the 
mentioned SQL query. I now select from the view adding the WHERE condition:
#  SELECT * from clustered_view WHERE id=1160;

  id   |experiment |  first_insertedon   | score  |
+--+-++
  1160 | alpha  | 2012-08-19 01:01:12 | 220.69 |
  1160 | alpha  | 2012-08-19 01:01:12 | 220.69 |
  1160 | beta  | 2012-08-19 01:01:54 |  220.7 |
  1160 | beta  | 2012-08-19 01:01:54 |  220.7 |
  1160 | beta  | 2012-08-19 01:01:54 |  220.7 |

As you see, the 'first_insertedon' timestamp for the experiment 'beta' is no 
longer the first of the timestamps i.e. '2012-08-19 01:01:31' as the original 
query's results gave correctly, but it's now the last one i.e. '2012-08-19 
01:01:54'

Any ideas? Missing the obvious?


TIA,
Thalis K.




-
Achilleas Mantzios
IT DEPT

[GENERAL] Different results from view and from its defintion query [w/ windowing function]

2012-08-20 Thread Thalis Kalfigkopoulos
Sorry for the lack of a more appropriate title.
The summary of my problem is: i run a query and I get some results; then I
create a view using this query, and I run the same query on the view, and
get different results. Details follow.

On the original table the analytical data is as follows:
# SELECT id,experiment,insertedon,score FROM data WHERE id=1160;

  id |experiment| insertedon  | score
+--+-+
  1160 | alpha  | 2012-08-19 01:01:12 | 220.69
  1160 | alpha  | 2012-08-19 01:01:22 | 220.69
  1160 | beta  | 2012-08-19 01:01:31 |  220.7
  1160 | beta  | 2012-08-19 01:01:42 |  220.7
  1160 | beta  | 2012-08-19 01:01:54 |  220.7

My query of interest using windowing functions is:

# SELECT id, experiment, first_value(insertedon) OVER (PARTITION BY score,
id) AS first_insertedon, score FROM data WHERE id=1160;
 id   |experiment|  first_insertedon   | score  |
+--+-++--
  1160 | alpha  | 2012-08-19 01:01:12 | 220.69 |
  1160 | alpha  | 2012-08-19 01:01:12 | 220.69 |
  1160 | beta  | 2012-08-19 01:01:31 |  220.7 |
  1160 | beta  | 2012-08-19 01:01:31 |  220.7 |
  1160 | beta  | 2012-08-19 01:01:31 |  220.7 |

So far so good. I then create the view on this last query without the WHERE
condition:
# CREATE VIEW clustered_view AS SELECT id, experiment,
first_value(insertedon) OVER (PARTITION BY score, id) AS first_insertedon,
score FROM data;

I see the view created correctly and its definition is according to the
mentioned SQL query. I now select from the view adding the WHERE condition:
#  SELECT * from clustered_view WHERE id=1160;

  id   |experiment |  first_insertedon   | score  |
+--+-++
  1160 | alpha  | 2012-08-19 01:01:12 | 220.69 |
  1160 | alpha  | 2012-08-19 01:01:12 | 220.69 |
  1160 | beta  | 2012-08-19 01:01:54 |  220.7 |
  1160 | beta  | 2012-08-19 01:01:54 |  220.7 |
  1160 | beta  | 2012-08-19 01:01:54 |  220.7 |

As you see, the 'first_insertedon' timestamp for the experiment 'beta' is
no longer the first of the timestamps i.e. '2012-08-19 01:01:31' as the
original query's results gave correctly, but it's now the last one i.e.
'2012-08-19 01:01:54'

Any ideas? Missing the obvious?


TIA,
Thalis K.


Re: [GENERAL] Different results from view and from its defintion query [w/ windowing function]

2012-08-20 Thread Tom Lane
Thalis Kalfigkopoulos tkalf...@gmail.com writes:
 # SELECT id, experiment, first_value(insertedon) OVER (PARTITION BY score,
 id) AS first_insertedon, score FROM data WHERE id=1160;

 [ versus ]

 # CREATE VIEW clustered_view AS SELECT id, experiment,
 first_value(insertedon) OVER (PARTITION BY score, id) AS first_insertedon,
 score FROM data;

 #  SELECT * from clustered_view WHERE id=1160;

One possible reason these produce different results is in that in the
first case, the WHERE condition eliminates rows from the window
function's consideration.  In the second case, it doesn't --- the WHERE
only filters the result rows from the view.  However, the fact that id
is part of the partition list may insulate you from that; not quite sure
without seeing a more complete example.

Another likely reason for trouble is that the window function seems
underspecified: without any ORDER BY clause, you are going to get a
random one of the insertedon values for the same score and id.  It's
entirely likely that moving the WHERE clause would change the plan
enough to change the ordering of the rows seen by the window function.
Possibly you should be using min() instead of first_value().

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Different results from view and from its defintion query [w/ windowing function]

2012-08-20 Thread Thalis Kalfigkopoulos
Hi Tom,

and thanks for the reply (I had the pleasure of meeting you 11 years ago in
Pittsburgh; still a pleasure seeing your concise and helpful replies.)

In the end I went for a change of window function. Using min(insertedon)
instead of first_value(insertedon) works correctly.

Alternatively your suggestion of adding an ORDER BY insertedon clause
also seems to work. It makes the first_value(insertedon) behave as
min(insertedon).


thanks again,
Thalis K.

On Mon, Aug 20, 2012 at 12:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Thalis Kalfigkopoulos tkalf...@gmail.com writes:
  # SELECT id, experiment, first_value(insertedon) OVER (PARTITION BY
 score,
  id) AS first_insertedon, score FROM data WHERE id=1160;

  [ versus ]

  # CREATE VIEW clustered_view AS SELECT id, experiment,
  first_value(insertedon) OVER (PARTITION BY score, id) AS
 first_insertedon,
  score FROM data;

  #  SELECT * from clustered_view WHERE id=1160;

 One possible reason these produce different results is in that in the
 first case, the WHERE condition eliminates rows from the window
 function's consideration.  In the second case, it doesn't --- the WHERE
 only filters the result rows from the view.  However, the fact that id
 is part of the partition list may insulate you from that; not quite sure
 without seeing a more complete example.

 Another likely reason for trouble is that the window function seems
 underspecified: without any ORDER BY clause, you are going to get a
 random one of the insertedon values for the same score and id.  It's
 entirely likely that moving the WHERE clause would change the plan
 enough to change the ordering of the rows seen by the window function.
 Possibly you should be using min() instead of first_value().

 regards, tom lane