Re: regr_slope returning NULL

2019-03-24 Thread Steve Baldwin
Thanks Dean, that's really helpful. Because my x axis values are actually
derived from 'extract(epoch from tstz_col)', it is simple for me to
subtract an offset.

Cheers,

Steve

On Sun, Mar 24, 2019 at 7:55 PM Dean Rasheed 
wrote:

> On Sun, 24 Mar 2019 at 08:01, Steve Baldwin 
> wrote:
> >
> > Thanks Tom,
> >
> > I've tried this on 11.2 (OS X 10.14.3, installed locally) and 10.6 (AWS
> RDS) instances with identical results.  The values you show are identical
> to those returned by Oracle so that's great but why am I seeing different
> results?
> >
>
> This is caused by the large magnitude of the ts values, which causes a
> cancellation error in the Sxx calculation. This is what commit
> e954a727f0 fixed in HEAD, and the fix will be available in PG12 [1].
>
> You can see that by including regr_sxx in the results. With PG11, this
> gives the following:
>
> select id, regr_slope(elapsed, ts) as trend, regr_sxx(elapsed, ts) as sxx
>   from sb1 group by id;
>
>   id  |        trend         |     sxx
> ------+----------------------+-------------
>  c742 |                      |           0
>  317e |                      |           0
>  5fe6 | 5.78750952760444e-06 | 19905896448
>  3441 |                      |           0
> (4 rows)
>
> Those zeros for Sxx are the result of calculating the sum of the
> squares of ts values and then subtracting off the square of the mean,
> which results in a complete loss of accuracy because the intermediate
> values are so large they don't differ according to double precision
> arithmetic.
>
> A workaround in PG11 is to just offset the ts values by something
> close to their mean (offsetting the ts values by a constant amount
> shouldn't affect the mathematical result, but does eliminate the
> cancellation errors):
>
> select id, regr_slope(elapsed, ts-1552892914) as trend,
>regr_sxx(elapsed, ts-1552892914) as sxx
>   from sb1 group by id;
>
>   id  |        trend         |        sxx
> ------+----------------------+--------------------
>  c742 |     19.6077357654714 | 0.0468182563781738
>  317e |    -1.08385104429772 |   59.2381523980035
>  5fe6 | 5.78750948360697e-06 |   19905896596.7403
>  3441 |    -3.82839508895523 |   20.1098628044128
> (4 rows)
>
>
> For PG12 the algorithm for calculating these quantities has been
> changed by e954a727f0, so the result should be more accurate
> regardless of the offset:
>
> select id, regr_slope(elapsed, ts) as trend, regr_sxx(elapsed, ts) as sxx
>   from sb1 group by id;
>
>   id  |        trend         |        sxx
> ------+----------------------+--------------------
>  c742 |     19.6078587812905 | 0.0468179252929986
>  317e |     -1.0838511987809 |   59.2381423694815
>  5fe6 | 5.78750948358674e-06 |   19905896596.7605
>  3441 |    -3.82839546309736 |   20.1098619909822
> (4 rows)
>
> select id, regr_slope(elapsed, ts-1552892914) as trend,
>regr_sxx(elapsed, ts-1552892914) as sxx
>   from sb1 group by id;
>
>   id  |        trend         |       sxx
> ------+----------------------+------------------
>  c742 |     19.6078431374563 |   0.046817990382
>  317e |    -1.08385109620679 | 59.2381495556381
>  5fe6 | 5.78750948360693e-06 | 19905896596.7403
>  3441 |    -3.82839509931361 |  20.109862749992
> (4 rows)
>
> Regards,
> Dean
>
> [1] https://github.com/postgres/postgres/commit/e954a727f0
>


Re: When to store data that could be derived

2019-03-24 Thread Chris Travers
Meant to send this to the list but hit the wrong button.

On Sun, Mar 24, 2019 at 9:45 AM Ron  wrote:

> On 3/24/19 3:05 AM, Frank wrote:
> >
> >
> > On 2019-03-24 9:25 AM, Ron wrote:
> >> On 3/24/19 1:42 AM, Frank wrote:
> >>> Hi all
> >>>
> >>> As I understand it, a  general rule of thumb is that you should never
> >>> create a physical column if the data could be derived from existing
> >>> columns. A possible reason for breaking this rule is for performance
> >>> reasons.
> >>>
> >>> I have a situation where I am considering breaking the rule, but I am
> >>> not experienced enough in SQL to know if my reason is valid. I would
> >>> appreciate it if someone could glance at my 'before' and 'after'
> >>> scenarios and see if, from a 'gut-feel' point of view, I should
> proceed.
> >>>
> >
> > [snip]
> >
> >>
> >> Sure the second query joins a lot of tables, but is pretty
> straightforward.
> >>
> >> What REALLY worries me is whether or not the query optimiser would look
> >> at the WHERE CASE, run away screaming and then make it use sequential
> >> scans. Thus, even query #1 would be slow.
> >>
> >
> > I had not realised that. I hope someone else chimes in on this.
>
> In every DBMS that I've used, the lside (left side) needs to be static
> (not
> "a" static) instead of variable (like a function).
>
> For example, this always leads to a sequential scan:
> WHERE EXTRACT(DAY FROM DATE_FIELD) = 5
>


PostgreSQL allows expression indexes.

So you can:

create index foo on bar ((id % 1000));

And then use the index on:

select * from bar where id % 1000 = 45;

You could similarly:

create index foo on bar (extract(day from date_field));

The expression on the left side needs to be immutable (and indexed), but
beyond that there is no particular restriction.
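
To confirm the planner actually uses such an index, EXPLAIN is enough. A
quick sketch, reusing the hypothetical table names from above and assuming
date_field is a date or plain timestamp column (extract() on timestamptz is
only stable, not immutable, so it can't be indexed directly):

create index bar_day_idx on bar ((extract(day from date_field)));
explain select * from bar where extract(day from date_field) = 5;

With representative data and up-to-date statistics the plan should show an
index or bitmap scan on bar_day_idx rather than a sequential scan.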

>
> >
> >>
> >> Is this a historical data set that's never updated, or current data
> >> that's constantly added to?
> >>
> >
> > It is the latter - current data constantly added to.
> >
> > Frank
> >
>
> --
> Angular momentum makes the world go 'round.
>
>

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more


Re: When to store data that could be derived

2019-03-24 Thread Peter J. Holzer
On 2019-03-24 10:05:02 +0200, Frank wrote:
> 
> 
> On 2019-03-24 9:25 AM, Ron wrote:
> > On 3/24/19 1:42 AM, Frank wrote:
> > > As I understand it, a  general rule of thumb is that you should
> > > never create a physical column if the data could be derived from
> > > existing columns.

The main reason for this rule (which leads to the 3rd normal form) is to
avoid inconsistencies when data is changed.
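
A tiny self-contained illustration of that failure mode (hypothetical
tables, nothing to do with Frank's actual schema):

create temp table rates (curr text primary key, exch_rate numeric);
create temp table trans (id int, curr text, amount numeric, amount_local numeric);
insert into rates values ('EUR', 1.10);
insert into trans values (1, 'EUR', 100, round(100 / 1.10, 2));  -- stores 90.91
update rates set exch_rate = 1.20 where curr = 'EUR';            -- rate corrected later
select t.amount_local as stored, round(t.amount / r.exch_rate, 2) as derived
  from trans t join rates r using (curr);                        -- 90.91 vs 83.33

Whether the stale stored value is a bug or a desirable historical snapshot
depends entirely on the business meaning of the change.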

> > Is this a historical data set that's never updated, or current data
> > that's constantly added to?
> > 
> 
> It is the latter - current data constantly added to.

So the important part here is not whether data is added, but whether
data is changed. Sure, new transactions are added all the time. But is
it expected that the data used to derive amount_cust and amount_local
(e.g. the exchange rate) is changed retroactively, and if it is should
the computed amount change? (I'm a bit worried about the join with the
customers table here - what happens when a customer moves their
headquarters to a country with a different currency?)

> > Sure the second query joins a lot of tables, but is pretty straightforward.
> > 
> > What REALLY worries me is whether or not the query optimiser would look
> > at the WHERE CASE, run away screaming and then make it use sequential
> > scans. Thus, even query #1 would be slow.
> > 
> 
> I had not realised that. I hope someone else chimes in on this.

Your condition is:

CASE
WHEN a.tran_type = 'ar_rec' THEN y.posted
WHEN a.tran_type = 'cb_rec' THEN w.posted
END = '1'

Think about how you would check that. The straightforward way is to
compute the value of the CASE ... END clause and compare that to 1. But to
compute that value you first need the value of a.tran_type. There are
two possible values here, so maybe an index scan on a.tran_type might be
possible, but I'm not sure whether the optimizer is even smart enough
to figure that out and, if it is, whether those two values are selective
enough (maybe all or most records are either ar_rec or cb_rec). After
that you can retrieve the posted value from the correct table.

As a human I see that the condition can only ever be true for records
from y and w with posted = 1. So it might be better to use index scans
on those columns. But this is not that easy to see, and I don't know
whether the optimizer can do it.

Rewriting the condition as

(a.tran_type = 'ar_rec' and y.posted = 1) or
(a.tran_type = 'cb_rec' and w.posted = 1)

might make it easier for the optimizer to find a good plan.

(Please note that I haven't tested any of this. Use EXPLAIN to check
what the optimizer really does.)
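
For example, something along these lines (a sketch only, with a made-up
customer id; the interesting part is comparing the plans for the CASE form
and the rewritten OR form of the view's WHERE clause):

explain (analyze, buffers)
select * from view_name where cust_row_id = 42;

If the rewrite helps, you should see index scans on the posted columns
(assuming such indexes exist) or on tran_type instead of sequential scans
over the big tables.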

hp

-- 
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | h...@hjp.at        | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson




Re: When to store data that could be derived

2019-03-24 Thread Ron

On 3/24/19 3:45 AM, Ron wrote:
[snip]
In every DBMS that I've used, the lside (left side) needs to be static 
(not "a" static) instead of variable (like a function).


Thanks to Chris Travers for reminding me that the word is "immutable", not 
"static".


--
Angular momentum makes the world go 'round.



Re: When to store data that could be derived

2019-03-24 Thread Frank




On 2019-03-24 11:11 AM, Tony Shelver wrote:

Not the answer you are looking for, but...

I'd suggest trying to create a non-trivial set of dummy data to test your
assumptions before deciding on a route.
It's saved my (professional) life a few times over the years when dealing
with untested designs and new (to us) technology.

Some years ago we were implementing an identity management system for a
large US bank, with SQL Server as the store, with a planned integration to
the id / access / permissions of some 300+ systems, targeting 30k plus
users.

Requirements kept changing as we added new systems to the initial mix,
which the Id management package couldn't handle out of the box, so we had to
implement a custom design.  We had to choose between 2 database designs,
one being fully 'normalized' (considering that everything was an 'object')
and one where we made some assumptions and fixed some table structures in
the interest of performance.

Eventually we spent a few days adding non-trivial amounts of test data to
the proposed designs and it quickly became very apparent that option
1 was unworkable once we got beyond 10 systems or so.



Good advice - much appreciated.

Frank



Re: When to store data that could be derived

2019-03-24 Thread Tony Shelver
Not the answer you are looking for, but...

I'd suggest trying to create a non-trivial set of dummy data to test your
assumptions before deciding on a route.
It's saved my (professional) life a few times over the years when dealing
with untested designs and new (to us) technology.

Some years ago we were implementing an identity management system for a
large US bank, with SQL Server as the store, with a planned integration to
the id / access / permissions of some 300+ systems, targeting 30k plus
users.

Requirements kept changing as we added new systems to the initial mix,
which the Id management package couldn't handle out of the box, so we had to
implement a custom design.  We had to choose between 2 database designs,
one being fully 'normalized' (considering that everything was an 'object')
and one where we made some assumptions and fixed some table structures in
the interest of performance.

Eventually we spent a few days adding non-trivial amounts of test data to
the proposed designs and it quickly became very apparent that option
1 was unworkable once we got beyond 10 systems or so.
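
In PostgreSQL, generate_series() makes it cheap to build that kind of
non-trivial test set. A minimal sketch with made-up names and distributions
(adjust the row count and columns to match the real schema):

-- one million synthetic transactions spread over ~3 years and 1000 customers
create table test_tran as
select g                                    as row_id,
       (g % 1000) + 1                       as cust_row_id,
       date '2016-01-01' + (g % 1100)       as tran_date,
       round((random() * 1000)::numeric, 2) as amount
  from generate_series(1, 1000000) as g;
analyze test_tran;

With a table like that in place, EXPLAIN ANALYZE on the candidate queries
gives far more realistic numbers than an empty or ten-row table ever will.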


Re: regr_slope returning NULL

2019-03-24 Thread Dean Rasheed
On Sun, 24 Mar 2019 at 08:01, Steve Baldwin  wrote:
>
> Thanks Tom,
>
> I've tried this on 11.2 (OS X 10.14.3, installed locally) and 10.6 (AWS RDS) 
> instances with identical results.  The values you show are identical to those 
> returned by Oracle so that's great but why am I seeing different results?
>

This is caused by the large magnitude of the ts values, which causes a
cancellation error in the Sxx calculation. This is what commit
e954a727f0 fixed in HEAD, and the fix will be available in PG12 [1].

You can see that by including regr_sxx in the results. With PG11, this
gives the following:

select id, regr_slope(elapsed, ts) as trend, regr_sxx(elapsed, ts) as sxx
  from sb1 group by id;

  id  |        trend         |     sxx
------+----------------------+-------------
 c742 |                      |           0
 317e |                      |           0
 5fe6 | 5.78750952760444e-06 | 19905896448
 3441 |                      |           0
(4 rows)

Those zeros for Sxx are the result of calculating the sum of the
squares of ts values and then subtracting off the square of the mean,
which results in a complete loss of accuracy because the intermediate
values are so large they don't differ according to double precision
arithmetic.
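
The effect is easy to reproduce in isolation. A minimal self-contained
sketch (made-up values, not the original data set) that computes the same
kind of sum-of-squares-minus-square-of-mean expression by hand:

select sum(x * x) - count(*) * avg(x)^2         as naive_sxx,
       sum((x - 1552892914)^2)
         - count(*) * avg(x - 1552892914)^2     as shifted_sxx
  from (values (1552892914.0::float8),
               (1552892914.1::float8),
               (1552892914.2::float8)) as t(x);

The true Sxx here is 0.02. The shifted form recovers it, while the naive
form is dominated by rounding error at epoch-timestamp magnitudes and comes
out nowhere near the true value (typically 0 or an error of a few thousand).
PG11's internal Sxx collapses in exactly the same way, which is what
produces the NULL slopes above.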

A workaround in PG11 is to just offset the ts values by something
close to their mean (offsetting the ts values by a constant amount
shouldn't affect the mathematical result, but does eliminate the
cancellation errors):

select id, regr_slope(elapsed, ts-1552892914) as trend,
   regr_sxx(elapsed, ts-1552892914) as sxx
  from sb1 group by id;

  id  |        trend         |        sxx
------+----------------------+--------------------
 c742 |     19.6077357654714 | 0.0468182563781738
 317e |    -1.08385104429772 |   59.2381523980035
 5fe6 | 5.78750948360697e-06 |   19905896596.7403
 3441 |    -3.82839508895523 |   20.1098628044128
(4 rows)


For PG12 the algorithm for calculating these quantities has been
changed by e954a727f0, so the result should be more accurate
regardless of the offset:

select id, regr_slope(elapsed, ts) as trend, regr_sxx(elapsed, ts) as sxx
  from sb1 group by id;

  id  |        trend         |        sxx
------+----------------------+--------------------
 c742 |     19.6078587812905 | 0.0468179252929986
 317e |     -1.0838511987809 |   59.2381423694815
 5fe6 | 5.78750948358674e-06 |   19905896596.7605
 3441 |    -3.82839546309736 |   20.1098619909822
(4 rows)

select id, regr_slope(elapsed, ts-1552892914) as trend,
   regr_sxx(elapsed, ts-1552892914) as sxx
  from sb1 group by id;

  id  |        trend         |       sxx
------+----------------------+------------------
 c742 |     19.6078431374563 |   0.046817990382
 317e |    -1.08385109620679 | 59.2381495556381
 5fe6 | 5.78750948360693e-06 | 19905896596.7403
 3441 |    -3.82839509931361 |  20.109862749992
(4 rows)

Regards,
Dean

[1] https://github.com/postgres/postgres/commit/e954a727f0



Re: When to store data that could be derived

2019-03-24 Thread Ron

On 3/24/19 3:05 AM, Frank wrote:



On 2019-03-24 9:25 AM, Ron wrote:

On 3/24/19 1:42 AM, Frank wrote:

Hi all

As I understand it, a  general rule of thumb is that you should never 
create a physical column if the data could be derived from existing 
columns. A possible reason for breaking this rule is for performance 
reasons.


I have a situation where I am considering breaking the rule, but I am 
not experienced enough in SQL to know if my reason is valid. I would 
appreciate it if someone could glance at my 'before' and 'after' 
scenarios and see if, from a 'gut-feel' point of view, I should proceed.




[snip]



Sure the second query joins a lot of tables, but is pretty straightforward.

What REALLY worries me is whether or not the query optimiser would look 
at the WHERE CASE, run away screaming and then make it use sequential 
scans. Thus, even query #1 would be slow.




I had not realised that. I hope someone else chimes in on this.


In every DBMS that I've used, the lside (left side) needs to be static (not 
"a" static) instead of variable (like a function).


For example, this always leads to a sequential scan:
   WHERE EXTRACT(DAY FROM DATE_FIELD) = 5





Is this a historical data set that's never updated, or current data 
that's constantly added to?




It is the latter - current data constantly added to.

Frank



--
Angular momentum makes the world go 'round.



Re: When to store data that could be derived

2019-03-24 Thread Frank




On 2019-03-24 9:25 AM, Ron wrote:

On 3/24/19 1:42 AM, Frank wrote:

Hi all

As I understand it, a  general rule of thumb is that you should never 
create a physical column if the data could be derived from existing 
columns. A possible reason for breaking this rule is for performance 
reasons.


I have a situation where I am considering breaking the rule, but I am 
not experienced enough in SQL to know if my reason is valid. I would 
appreciate it if someone could glance at my 'before' and 'after' 
scenarios and see if, from a 'gut-feel' point of view, I should proceed.




[snip]



Sure the second query joins a lot of tables, but is pretty straightforward.

What REALLY worries me is whether or not the query optimiser would look 
at the WHERE CASE, run away screaming and then make it use sequential 
scans. Thus, even query #1 would be slow.




I had not realised that. I hope someone else chimes in on this.



Is this a historical data set that's never updated, or current data 
that's constantly added to?




It is the latter - current data constantly added to.

Frank



Re: regr_slope returning NULL

2019-03-24 Thread Steve Baldwin
Thanks Tom,

I've tried this on 11.2 (OS X 10.14.3, installed locally) and 10.6 (AWS
RDS) instances with identical results.  The values you show are identical
to those returned by Oracle so that's great but why am I seeing different
results?

This is from my local install:

log=# select version();
 version
----------------------------------------------------------------------
 PostgreSQL 11.2 on x86_64-apple-darwin18.2.0, compiled by Apple LLVM
 version 10.0.0 (clang-1000.11.45.5), 64-bit

Cheers,

Steve

On Sun, Mar 24, 2019 at 4:34 PM Tom Lane  wrote:

> Steve Baldwin  writes:
> > Consider the following:
> > ...
> > log=# select id, regr_slope(elapsed, ts) as trend from sb1 group by id;
> >   id  |        trend
> > ------+----------------------
> >  c742 |
> >  317e |
> >  5fe6 | 5.78750952760444e-06
> >  3441 |
> > (4 rows)
>
> Hm, I get
>
> regression=# select id, regr_slope(elapsed, ts) as trend from sb1 group by
> id;
>   id  |         trend
> ------+-----------------------
>  c742 |    19.607858781290517
>  317e |   -1.0838511987808963
>  5fe6 | 5.787509483586743e-06
>  3441 |    -3.828395463097356
> (4 rows)
>
> What platform are you doing this on, and what exactly is the PG version?
>
> > If pg is correctly returning NULL, I'd be interested to understand the
> > circumstances under which this can occur.
>
> The source code shows two cases in which NULL would be returned:
>
> /* if N is 0 we should return NULL */
> if (N < 1.0)
> PG_RETURN_NULL();
>
> /* per spec, return NULL for a vertical line */
> if (Sxx == 0)
> PG_RETURN_NULL();
>
> Maybe the cases you're looking at are sufficiently numerically
> ill-conditioned that you could get Sxx == 0 depending on platform-
> specific roundoff error, but it seems fishy.
>
> regards, tom lane
>
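
For reference, both NULL cases in that excerpt are easy to reproduce with
trivial, made-up inputs (a quick sketch, unrelated to the sb1 data):

-- N = 0: aggregate over no rows
select regr_slope(y, x) as no_rows
  from (values (1.0::float8, 1.0::float8)) as t(y, x)
 where false;

-- Sxx = 0: all x values identical (a vertical line)
select regr_slope(y, x) as vertical_line
  from (values (1.0::float8, 3.0::float8),
               (2.0::float8, 3.0::float8)) as t(y, x);

Both return NULL, so the question in this thread is whether the real data
hits the Sxx = 0 case purely through floating-point roundoff.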


Re: When to store data that could be derived

2019-03-24 Thread Ron

On 3/24/19 1:42 AM, Frank wrote:

Hi all

As I understand it, a  general rule of thumb is that you should never 
create a physical column if the data could be derived from existing 
columns. A possible reason for breaking this rule is for performance reasons.


I have a situation where I am considering breaking the rule, but I am not 
experienced enough in SQL to know if my reason is valid. I would 
appreciate it if someone could glance at my 'before' and 'after' scenarios 
and see if, from a 'gut-feel' point of view, I should proceed.


I have a VIEW constructed as follows -

CREATE VIEW view_name AS
[select statement 1]
UNION ALL
[select statement 2]
etc.

This is one of the select statements. I will give the 'after' scenario 
first -


SELECT
    'arec' AS tran_type, a.row_id AS tran_row_id,
    a.tran_number AS tran_number, a.cust_row_id AS cust_row_id,
    CASE
    WHEN a.tran_type = 'ar_rec' THEN y.tran_date
    WHEN a.tran_type = 'cb_rec' THEN w.tran_date
    END AS tran_date,
    CASE
    WHEN a.tran_type = 'ar_rec' THEN y.text
    WHEN a.tran_type = 'cb_rec' THEN w.text
    END AS text,
    0 - a.arec_cust AS amount_cust,
    0 - a.arec_local AS amount_local
FROM ar_rec_subtran a
LEFT JOIN ar_tran_rec_det z ON z.row_id = a.tran_det_row_id
LEFT JOIN ar_tran_rec y ON y.row_id = z.tran_row_id
LEFT JOIN cb_tran_rec_det x ON x.row_id = a.tran_det_row_id
LEFT JOIN cb_tran_rec w ON w.row_id = x.tran_row_id
WHERE
    CASE
    WHEN a.tran_type = 'ar_rec' THEN y.posted
    WHEN a.tran_type = 'cb_rec' THEN w.posted
    END = '1'

The two columns a.arec_cust and a.arec_local *can* be derived from other 
columns, and in fact that is how it is working at the moment, so here is 
the 'before' scenario -


SELECT
    'arec' AS tran_type, a.row_id AS tran_row_id,
    a.tran_number AS tran_number, a.cust_row_id AS cust_row_id,
    CASE
    WHEN a.tran_type = 'ar_rec' THEN y.tran_date
    WHEN a.tran_type = 'cb_rec' THEN w.tran_date
    END AS tran_date,
    CASE
    WHEN a.tran_type = 'ar_rec' THEN y.text
    WHEN a.tran_type = 'cb_rec' THEN w.text
    END AS text,
    ROUND(0 - (ROUND(a.arec_amount / CASE
    WHEN a.tran_type = 'ar_rec' THEN y.tran_exch_rate
    WHEN a.tran_type = 'cb_rec' THEN w.tran_exch_rate
    END * a.cust_exch_rate, u.scale)), 2) AS amount_cust,
    ROUND(0 - (ROUND(a.arec_amount / CASE
    WHEN a.tran_type = 'ar_rec' THEN y.tran_exch_rate
    WHEN a.tran_type = 'cb_rec' THEN w.tran_exch_rate
    END, s.scale)), 2) AS amount_local
FROM ar_rec_subtran a
LEFT JOIN ar_tran_rec_det z ON z.row_id = a.tran_det_row_id
LEFT JOIN ar_tran_rec y ON y.row_id = z.tran_row_id
LEFT JOIN cb_tran_rec_det x ON x.row_id = a.tran_det_row_id
LEFT JOIN cb_tran_rec w ON w.row_id = x.tran_row_id
LEFT JOIN ar_customers v ON v.row_id = a.cust_row_id
LEFT JOIN adm_currencies u ON u.row_id = v.currency_id
LEFT JOIN adm_params t ON t.row_id = 1
LEFT JOIN adm_currencies s ON s.row_id = t.local_curr_id
WHERE
    CASE
    WHEN a.tran_type = 'ar_rec' THEN y.posted
    WHEN a.tran_type = 'cb_rec' THEN w.posted
    END = '1'

As you can see, complexity has increased and there are four additional JOINs.

I am expecting the VIEW to be used extensively for query purposes, and my 
gut-feel says that the second one is likely to lead to performance 
problems in a system with a lot of data and a lot of users.


Generate an artificial load and test it?

I am not looking for an answer - I know that I should create dummy data 
and run some timing tests. I was just wondering if someone more 
experienced would wince when they look at the second SELECT, or if they 
would shrug and think that it looks fine.


Sure the second query joins a lot of tables, but is pretty straightforward.

What REALLY worries me is whether or not the query optimiser would look at 
the WHERE CASE, run away screaming and then make it use sequential scans.  
Thus, even query #1 would be slow.




Any input will be appreciated.



Is this a historical data set that's never updated, or current data that's 
constantly added to?




Frank Millman





--
Angular momentum makes the world go 'round.




When to store data that could be derived

2019-03-24 Thread Frank

Hi all

As I understand it, a  general rule of thumb is that you should never 
create a physical column if the data could be derived from existing 
columns. A possible reason for breaking this rule is for performance 
reasons.


I have a situation where I am considering breaking the rule, but I am 
not experienced enough in SQL to know if my reason is valid. I would 
appreciate it if someone could glance at my 'before' and 'after' 
scenarios and see if, from a 'gut-feel' point of view, I should proceed.


I have a VIEW constructed as follows -

CREATE VIEW view_name AS
[select statement 1]
UNION ALL
[select statement 2]
etc.

This is one of the select statements. I will give the 'after' scenario 
first -


SELECT
    'arec' AS tran_type, a.row_id AS tran_row_id,
    a.tran_number AS tran_number, a.cust_row_id AS cust_row_id,
    CASE
    WHEN a.tran_type = 'ar_rec' THEN y.tran_date
    WHEN a.tran_type = 'cb_rec' THEN w.tran_date
    END AS tran_date,
    CASE
    WHEN a.tran_type = 'ar_rec' THEN y.text
    WHEN a.tran_type = 'cb_rec' THEN w.text
    END AS text,
    0 - a.arec_cust AS amount_cust,
    0 - a.arec_local AS amount_local
FROM ar_rec_subtran a
LEFT JOIN ar_tran_rec_det z ON z.row_id = a.tran_det_row_id
LEFT JOIN ar_tran_rec y ON y.row_id = z.tran_row_id
LEFT JOIN cb_tran_rec_det x ON x.row_id = a.tran_det_row_id
LEFT JOIN cb_tran_rec w ON w.row_id = x.tran_row_id
WHERE
    CASE
    WHEN a.tran_type = 'ar_rec' THEN y.posted
    WHEN a.tran_type = 'cb_rec' THEN w.posted
    END = '1'

The two columns a.arec_cust and a.arec_local *can* be derived from other 
columns, and in fact that is how it is working at the moment, so here is 
the 'before' scenario -


SELECT
    'arec' AS tran_type, a.row_id AS tran_row_id,
    a.tran_number AS tran_number, a.cust_row_id AS cust_row_id,
    CASE
    WHEN a.tran_type = 'ar_rec' THEN y.tran_date
    WHEN a.tran_type = 'cb_rec' THEN w.tran_date
    END AS tran_date,
    CASE
    WHEN a.tran_type = 'ar_rec' THEN y.text
    WHEN a.tran_type = 'cb_rec' THEN w.text
    END AS text,
    ROUND(0 - (ROUND(a.arec_amount / CASE
    WHEN a.tran_type = 'ar_rec' THEN y.tran_exch_rate
    WHEN a.tran_type = 'cb_rec' THEN w.tran_exch_rate
    END * a.cust_exch_rate, u.scale)), 2) AS amount_cust,
    ROUND(0 - (ROUND(a.arec_amount / CASE
    WHEN a.tran_type = 'ar_rec' THEN y.tran_exch_rate
    WHEN a.tran_type = 'cb_rec' THEN w.tran_exch_rate
    END, s.scale)), 2) AS amount_local
FROM ar_rec_subtran a
LEFT JOIN ar_tran_rec_det z ON z.row_id = a.tran_det_row_id
LEFT JOIN ar_tran_rec y ON y.row_id = z.tran_row_id
LEFT JOIN cb_tran_rec_det x ON x.row_id = a.tran_det_row_id
LEFT JOIN cb_tran_rec w ON w.row_id = x.tran_row_id
LEFT JOIN ar_customers v ON v.row_id = a.cust_row_id
LEFT JOIN adm_currencies u ON u.row_id = v.currency_id
LEFT JOIN adm_params t ON t.row_id = 1
LEFT JOIN adm_currencies s ON s.row_id = t.local_curr_id
WHERE
    CASE
    WHEN a.tran_type = 'ar_rec' THEN y.posted
    WHEN a.tran_type = 'cb_rec' THEN w.posted
    END = '1'

As you can see, complexity has increased and there are four additional 
JOINs.


I am expecting the VIEW to be used extensively for query purposes, and 
my gut-feel says that the second one is likely to lead to performance 
problems in a system with a lot of data and a lot of users.


I am not looking for an answer - I know that I should create dummy data 
and run some timing tests. I was just wondering if someone more 
experienced would wince when they look at the second SELECT, or if they 
would shrug and think that it looks fine.


Any input will be appreciated.

Frank Millman