Re: [HACKERS] Proposal for resolving casting issues

2002-09-19 Thread Zeugswetter Andreas SB SD


 Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
  Note that if you write, say,
  set numericcol = numericcol * 3.14159;
  my proposal would do the right thing since the constant would be typed
  as numeric to start with and would stay that way.  To do what you want
  with a float variable, it'd be necessary to write
  set numericcol = numericcol * float4col::numeric;
 
  Yes, that is the case where the new behavior would imho not be good (but you 
  say spec compliant). I loose precision even though there is room to hold it.
 
 Lose what precision?  It seems silly to imagine that the product of

Have you seen my example ? If calculated in float4 the result of
1.01*1000.0-1000.0 would be 0.0, no ? 

 a numeric and a float4 is good to more digits than there are in the
 float4.  This is exactly the spec's point: combining an exact and an
 approximate input will give you an approximate result.

Does it actually say how approximate the result needs to be, or is it simply 
approximate by nature that one part was only approximate ?
Do they really mean, that an approximate calculation with one float4 must be 
calculated in float4 arithmetic ? If you e.g. calculate in float8 it would still 
be an approximate result and thus imho conform.

 (Unless of course the value in the float4 happens to be exact, eg,
 an integer of not very many digits.  But if you are relying on that
 to be true, why aren't you using an exact format for storing it?)

Probably because the approximate is more efficient in storage size,
or the designer knew he only wants to store 6 significant digits ?

  Informix does the calculations in numeric, and then converts the result
  if no casts are supplied (would do set float4col = float4(float4col::numeric * 
numericcol)).
 
 I am not sure what the argument is for following Informix's lead rather
 than the standard's lead; especially when Informix evidently doesn't
 understand numerical analysis ;-)

It was only an example of how someone else does it and was why I asked what 
other db's do. I would e.g. suspect Oracle does it similarily.
Please, someone check another db !

Andreas

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Proposal for resolving casting issues

2002-09-19 Thread Tom Lane

Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
 Yes, that is the case where the new behavior would imho not be good (but you 
 say spec compliant). I loose precision even though there is room to hold it.
 
 Lose what precision?  It seems silly to imagine that the product of

 Have you seen my example ? If calculated in float4 the result of
 1.01*1000.0-1000.0 would be 0.0, no ? 

So?  If you are storing one input as float4, then you cannot rationally
say that you know the result to better than 6 digits, because you don't
know the input to better than 6 digits.  Claiming that 1000.001
is a more accurate answer for the product than 1000.0 is simply wishful
thinking on your part: nothing to the right of the sixth digit actually
means a darn thing, because you don't know whether the input was really
exactly 1000, or should have been perhaps 1000.001.

 Do they really mean, that an approximate calculation with one float4 must be 
 calculated in float4 arithmetic ? If you e.g. calculate in float8 it would still 
 be an approximate result and thus imho conform.

And still the output would be illusory: if you think you'd get 16 digits
of precision that way, then you are failing to grasp the problem.

 (Unless of course the value in the float4 happens to be exact, eg,
 an integer of not very many digits.  But if you are relying on that
 to be true, why aren't you using an exact format for storing it?)

 Probably because the approximate is more efficient in storage size,
 or the designer knew he only wants to store 6 significant digits ?

Seems an exceedingly uncompelling scenario.  The only values that could
be expected to be stored exactly in a float4 (without very careful
analysis) are integers of up to 6 digits; you might as well store the
column as int4 if that's what you plan to keep in it.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Proposal for resolving casting issues

2002-09-19 Thread Zeugswetter Andreas SB SD


  Yes, that is the case where the new behavior would imho not be good (but you 
  say spec compliant). I loose precision even though there is room to hold it.
  
  Lose what precision?  It seems silly to imagine that the product of
 
  Have you seen my example ? If calculated in float4 the result of
  1.01*1000.0-1000.0 would be 0.0, no ? 
 
 So?  If you are storing one input as float4, then you cannot rationally
 say that you know the result to better than 6 digits, because you don't
 know the input to better than 6 digits.  Claiming that 1000.001
 is a more accurate answer for the product than 1000.0 is simply wishful
 thinking on your part: nothing to the right of the sixth digit actually
 means a darn thing, because you don't know whether the input was really
 exactly 1000, or should have been perhaps 1000.001.

I still see 1E-10 as a better answer to above calculation than your 0,
and my snapshot 9/11 does return that 1E-10.

For better understanding the test in pg:
create table atab (a decimal(30,20), b float4, c decimal(30,20), d float4);
insert into atab values (1.001,10.0,0, 0);
update atab set c=a*b-b, d=a*b-b where 1=1;
create view av as select a*b-b, 1, b, c,d from atab;
\d av
View definition: SELECT ((atab.a * numeric(atab.b)) - numeric(atab.b)), atab.a, 
atab.b
, atab.c, atab.d FROM atab;

If I understood your proposal that would now change to:
View definition: SELECT ((float4(atab.a) * atab.b) - atab.b), atab.a, atab.b
, atab.c, atab.d FROM atab;

 
  Do they really mean, that an approximate calculation with one float4 must be 
  calculated in float4 arithmetic ? If you e.g. calculate in float8 it would still 
  be an approximate result and thus imho conform.
 
 And still the output would be illusory: if you think you'd get 16 digits
 of precision that way, then you are failing to grasp the problem.

I have not said 16 digits exact precision. I was saying, that an approximate 
result calculated in numeric makes more sense, than your float4 calculated result,
and does the correct thing more often than not in the db centric cases I can think 
of.

I do think I grasp the problem :-)

  (Unless of course the value in the float4 happens to be exact, eg,
  an integer of not very many digits.  But if you are relying on that
  to be true, why aren't you using an exact format for storing it?)
 
  Probably because the approximate is more efficient in storage size,
  or the designer knew he only wants to store 6 significant digits ?
 
 Seems an exceedingly uncompelling scenario.  The only values that could
 be expected to be stored exactly in a float4 (without very careful
 analysis) are integers of up to 6 digits; you might as well store the
 column as int4 if that's what you plan to keep in it.

You can store 6 significant digits and an exponent (iirc 10E+-38) ! 
e.g. 1.23456E-20 an int can't do that.

I give up now. I voiced my concern, and that is as far as my interest goes on this
actually. I still think fielding what other db's do in this area would be a good 
thing before proceeding further.

Andreas

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Proposal for resolving casting issues

2002-09-19 Thread Ross J. Reedstrom

On Thu, Sep 19, 2002 at 04:57:30PM +0200, Zeugswetter Andreas SB SD wrote:
 
  
   Have you seen my example ? If calculated in float4 the result of
   1.01*1000.0-1000.0 would be 0.0, no ? 
  
  So?  If you are storing one input as float4, then you cannot rationally
  say that you know the result to better than 6 digits, because you don't
  know the input to better than 6 digits.  Claiming that 1000.001
  is a more accurate answer for the product than 1000.0 is simply wishful
  thinking on your part: nothing to the right of the sixth digit actually
  means a darn thing, because you don't know whether the input was really
  exactly 1000, or should have been perhaps 1000.001.
 
 I still see 1E-10 as a better answer to above calculation than your 0,
 and my snapshot 9/11 does return that 1E-10.

Well, then you'd be wrong. Numerical analysis says you _can't_ get more
information out than went in to the _least_ precise part of a calculation.
What your suggesting is the equivalent of wanting to put up a shelf, so
you estimate the length of the wall by eyeballing it, then measure the
wood for the shelf with a micrometer, to be sure it fits exactly right.

We teach this in intro science classes all the time: if you calculate with
3.14 as an approximation to pi, you better not report the circumference
of a circle as 2.45678932 cm, I'll take off points!

 
 I do think I grasp the problem :-)

Hmm, I'm not so sure. ;-)

 
 I give up now. I voiced my concern, and that is as far as my interest goes on this
 actually. I still think fielding what other db's do in this area would be a good 
 thing before proceeding further.

Ah, sorry to drag this on, then. But this is one of those clear cases
were we must fo the right thing, not follow the crowd. PostgreSQL gets
used by a lot of scientific projects (Have you noticed all the big
bioinformatics databases being mentioned on the lists?). Partly because
we're always underfunded, partly because we're academics who like to
have the code. If we start getting basic maths wrong, that'll be a huge
balck eye for the project.

Ross
-- 
Ross Reedstrom, Ph.D. [EMAIL PROTECTED]
Executive Director  phone: 713-348-6166
Gulf Coast Consortium for Bioinformatics  fax: 713-348-6182
Rice University MS-39
Houston, TX 77005

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Proposal for resolving casting issues

2002-09-19 Thread Ross J. Reedstrom

On Thu, Sep 19, 2002 at 10:30:51AM -0500, Ross J. Reedstrom wrote:
 
 Ah, sorry to drag this on, then. But this is one of those clear cases
 were we must fo the right thing, not follow the crowd. PostgreSQL gets
   do
 used by a lot of scientific projects (Have you noticed all the big
 bioinformatics databases being mentioned on the lists?). Partly because
 we're always underfunded, partly because we're academics who like to
  ^^(scientific projects)  ^^
 have the code. If we start getting basic maths wrong, that'll be a huge
^^(PostgreSQL)
 balck eye for the project.
  black

Clearly, it's time for an early lunch for me. I need sugar for my brain.

Ross


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Proposal for resolving casting issues

2002-09-19 Thread Zeugswetter Andreas SB SD


Have you seen my example ? If calculated in float4 the result of
1.01*1000.0-1000.0 would be 0.0, no ? 
   
   So?  If you are storing one input as float4, then you cannot rationally
   say that you know the result to better than 6 digits, because you don't
   know the input to better than 6 digits.  Claiming that 1000.001
   is a more accurate answer for the product than 1000.0 is simply wishful
   thinking on your part: nothing to the right of the sixth digit actually
   means a darn thing, because you don't know whether the input was really
   exactly 1000, or should have been perhaps 1000.001.
  
  I still see 1E-10 as a better answer to above calculation than your 0,
  and my snapshot 9/11 does return that 1E-10.
 
 Well, then you'd be wrong. Numerical analysis says you _can't_ get more
 information out than went in to the _least_ precise part of a calculation.
 What your suggesting is the equivalent of wanting to put up a shelf, so
 you estimate the length of the wall by eyeballing it, then measure the
 wood for the shelf with a micrometer, to be sure it fits 
 exactly right.
 
 We teach this in intro science classes all the time: if you calculate with
 3.14 as an approximation to pi, you better not report the circumference
 of a circle as 2.45678932 cm, I'll take off points!

What if he must display 9 digits and says the result is approximately 2.45678932
would that be worse than 2.4600 ? 
That is what I am trying to say. Probably the standard is meant as a hint for db 
users, that such results will be approximate, not where the first digit sits that 
is not exact any more.

For above calculation pg will in the future return 0. as an
answer to 1.01*1000.0-1000.0 when used in my example context, while
it currently returns 0.0010 ... 
You both are saying, that 0. is a better answer. 

Andreas

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Proposal for resolving casting issues

2002-09-19 Thread Ross J. Reedstrom

On Thu, Sep 19, 2002 at 06:00:37PM +0200, Zeugswetter Andreas SB SD wrote:
 
 What if he must display 9 digits and says the result is approximately 2.45678932
 would that be worse than 2.4600 ? 

Yup. Trailing zeros are not significant. That's why scientific notation is nice:
you don't fill in all those insignificant placeholders.

 
 For above calculation pg will in the future return 0. as an
 answer to 1.01*1000.0-1000.0 when used in my example context, while
 it currently returns 0.0010 ... 
 You both are saying, that 0. is a better answer. 

That's right. And correct, as well.

Ross

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Proposal for resolving casting issues

2002-09-19 Thread Tom Lane

Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
 For above calculation pg will in the future return 0. as an
 answer to 1.01*1000.0-1000.0 when used in my example context, while
 it currently returns 0.0010 ... 
 You both are saying, that 0. is a better answer. 

Not exactly: we are saying it is not a worse answer.  There's no reason
to prefer one over the other, because they are both within the range
of uncertainty given the inherent uncertainty in the float4 input.

If you want exact results, you should be using exact datatypes.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Proposal for resolving casting issues

2002-09-19 Thread Christopher Kings-Lynne

Will the new casting stuff address this kind of annoyance?

usa=# select average(octet_length(val)) from users_sessions;
ERROR:  Function 'average(int4)' does not exist
Unable to identify a function that satisfies the given argument
types
You may need to add explicit typecasts

Chris


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Proposal for resolving casting issues

2002-09-19 Thread Christopher Kings-Lynne

Doh - I'm stupid.  Ignore my question :)

Helps if you spell 'average' as 'avg' :)

Chris

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]]On Behalf Of Christopher
 Kings-Lynne
 Sent: Friday, 20 September 2002 10:03 AM
 To: Tom Lane; Zeugswetter Andreas SB SD
 Cc: Bruce Momjian; [EMAIL PROTECTED]
 Subject: Re: [HACKERS] Proposal for resolving casting issues 
 
 
 Will the new casting stuff address this kind of annoyance?
 
 usa=# select average(octet_length(val)) from users_sessions;
 ERROR:  Function 'average(int4)' does not exist
 Unable to identify a function that satisfies the given argument
 types
 You may need to add explicit typecasts
 
 Chris
 
 
 ---(end of broadcast)---
 TIP 4: Don't 'kill -9' the postmaster
 


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster



Re: [HACKERS] Proposal for resolving casting issues

2002-09-19 Thread Tom Lane

Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 Will the new casting stuff address this kind of annoyance?
 usa=# select average(octet_length(val)) from users_sessions;
 ERROR:  Function 'average(int4)' does not exist

regression=# select * from pg_proc where proname = 'average';
 proname | pronamespace | proowner | prolang | proisagg | prosecdef | proisstrict | 
proretset | provolatile | pronargs | prorettype | proargtypes | prosrc | probin | 
proacl
-+--+--+-+--+---+-+---+-+--++-+++
(0 rows)


No, I think you'll get the same error ...

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Proposal for resolving casting issues

2002-09-18 Thread Zeugswetter Andreas SB SD


  Note that if you write, say,
  set numericcol = numericcol * 3.14159;
  my proposal would do the right thing since the constant would be typed
  as numeric to start with and would stay that way.  To do what you want
  with a float variable, it'd be necessary to write
  set numericcol = numericcol * float4col::numeric;

Yes, that is the case where the new behavior would imho not be good (but you 
say spec compliant). I loose precision even though there is room to hold it.

  which is sort of ugly; but no uglier than
  set float4col = float4col * numericcol::float4;

Informix does the calculations in numeric, and then converts the result
if no casts are supplied (would do set float4col = float4(float4col::numeric * 
numericcol)).

Would be interesting what others do ?

Test script:
create table atab (a decimal(30), b smallfloat, c decimal(30), d smallfloat);
insert into atab values (1.001,10.0,0, 0);
update atab set c=a*b-b, d=a*b-b where 1=1;
select a*b-b, b, c,d from atab;

(expression)  bc  d

   1e-10 10.0001e-10  1e-10

I hope this test is ok ?
It still seems to me, that numeric should be the preferred type, and not float8.

Andreas

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Proposal for resolving casting issues

2002-09-18 Thread Tom Lane

Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
 Note that if you write, say,
 set numericcol = numericcol * 3.14159;
 my proposal would do the right thing since the constant would be typed
 as numeric to start with and would stay that way.  To do what you want
 with a float variable, it'd be necessary to write
 set numericcol = numericcol * float4col::numeric;

 Yes, that is the case where the new behavior would imho not be good (but you 
 say spec compliant). I loose precision even though there is room to hold it.

Lose what precision?  It seems silly to imagine that the product of
a numeric and a float4 is good to more digits than there are in the
float4.  This is exactly the spec's point: combining an exact and an
approximate input will give you an approximate result.

(Unless of course the value in the float4 happens to be exact, eg,
an integer of not very many digits.  But if you are relying on that
to be true, why aren't you using an exact format for storing it?)

 Informix does the calculations in numeric, and then converts the result
 if no casts are supplied (would do set float4col = float4(float4col::numeric * 
numericcol)).

I am not sure what the argument is for following Informix's lead rather
than the standard's lead; especially when Informix evidently doesn't
understand numerical analysis ;-)

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Proposal for resolving casting issues

2002-09-18 Thread Peter Eisentraut

Bruce Momjian writes:

  Or possibly it should be AS IMPLICIT?

 I think AS IMPLICIT would be better because we have other AS [var]
 clauses.

But IMPLICIT is not a variable.

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Proposal for resolving casting issues

2002-09-18 Thread Bruce Momjian

Peter Eisentraut wrote:
 Bruce Momjian writes:
 
   Or possibly it should be AS IMPLICIT?
 
  I think AS IMPLICIT would be better because we have other AS [var]
  clauses.
 
 But IMPLICIT is not a variable.

I meant we have cases where we do AS [ keyword1 | keyword2 ].

CREATE OPERATOR CLASS any_name opt_default FOR TYPE_P Typename
USING access_method AS opclass_item_list

What I am saying is that is better to do AS [ keyword | keyword ] rather
than [ AS keyword | keyword ].

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Proposal for resolving casting issues

2002-09-18 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 What I am saying is that is better to do AS [ keyword | keyword ] rather
 than [ AS keyword | keyword ].

Yeah, I thought the same after looking at it a little.  Committed that
way (of course it's still open to adjustment...)

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Proposal for resolving casting issues

2002-09-17 Thread Bruce Momjian

Tom Lane wrote:
 We've been discussing this stuff in fits and starts for months now, but
 nothing satisfactory has been arrived at.  I've concluded that part of
 the problem is that we are trying to force the system's behavior into
 a model that is too limiting: we need more than an implicit/explicit cast
 distinction.  Accordingly, I suggest we bite the bullet and make it happen.
 (Note that I've resigned myself to having to do an initdb for 7.3beta2.)

I was reading my backlog of email and thinking, Oh, things are shaping
up well, then I hit this message.  Let me try to collect open items
tomorrow and get a plan together.  I have caught up on my email.  I am
heading to bed.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Proposal for resolving casting issues

2002-09-17 Thread Bruce Momjian

Tom Lane wrote:
 Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
  I think the following three states may enable a closer match to an actually 
  desired (Peter said mandated by SQL99) behavior.
 
  1. okay as implicit cast in expression or assignment
  2. okay as implicit cast in expression or assignment but needs runtime check
  (precision loss possible)
  3. okay only as explicit cast (precision loss possible)
 
 The runtime checks are there already, eg
 
 regression=# select 123456789::int4::int2;
 ERROR:  i4toi2: '123456789' causes int2 overflow
 
 However this does not help us much; the critical point is that if we
 want function overloading to work in a sane fashion, we have to prefer
 up-conversions to down-conversions *at parse time*, at least for the
 operands of functions and operators (which is what I meant by in
 expressions).  Runtime checks are irrelevant to this problem.

I think there is some confusion here.  The runtime checks Andreas was
talking about was allowing a double of 64.0 to cast to an int4 while
disallowing 64.1 from being cast to an int4 because it is not a hole
number.  

I am not sure doubles have enough precision to make such comparisons
functional (NUMERIC certainly does) but that was his proposal, and he
stated he thought the standard required it.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Proposal for resolving casting issues

2002-09-17 Thread Zeugswetter Andreas SB SD


 What I will do instead is adjust parse_coerce.c so that a
 length-coercion function can have either of the signatures
   foo(foo,int4) returns foo
 or
   foo(foo,int4,bool) returns foo
 and then modify the above-mentioned length coercion functions to provide
 the desired behavior.  This has no direct impact on pg_cast because we
 do not use pg_cast for length-coercion functions.

Sounds good to me. 

When those are really truncated ESQL/C needs to set a warning in sqlca.sqlwarn
though, thus I think the second signature should also have an output flag to tell 
whether truncation actually occurred.
Maybe this should be kept for a protocol change though, since I would not think
a NOTICE would be suitable here. 

Andreas

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Proposal for resolving casting issues

2002-09-17 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 I think there is some confusion here.  The runtime checks Andreas was
 talking about was allowing a double of 64.0 to cast to an int4 while
 disallowing 64.1 from being cast to an int4 because it is not a hole
 number.  

 I am not sure doubles have enough precision to make such comparisons
 functional (NUMERIC certainly does) but that was his proposal, and he
 stated he thought the standard required it.

It seems clear to me that the standard requires us NOT to reject that.

In the explicit-cast case, SQL92 6.10 cast specification saith:

 3) If TD is exact numeric, then

Case:

a) If SD is exact numeric or approximate numeric, then

  Case:

  i) If there is a representation of SV in the data type TD
 that does not lose any leading significant digits after
 rounding or truncating if necessary, then TV is that rep-
 resentation. The choice of whether to round or truncate is
 implementation-defined.

 ii) Otherwise, an exception condition is raised: data exception-
 numeric value out of range.

So we are *only* allowed to throw an error for overflow; having to round
is not an error condition.

In the implicit-cast case, section 9.2 Store assignment has

k) If the data type of T is numeric and there is an approxi-
  mation obtained by rounding or truncation of the numerical
  value of V for the data type of T, then the value of T is set
  to such an approximation.

  If there is no such approximation, then an exception condi-
  tion is raised: data exception-numeric value out of range.

  If the data type of T is exact numeric, then it is implementation-
  defined whether the approximation is obtained by rounding or
  by truncation.

which is different wording but seems to boil down to the same thing: the
only error condition is out-of-range.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Proposal for resolving casting issues

2002-09-17 Thread Zeugswetter Andreas SB SD


  For numbers there is probably only the solution to invent an
  anynumber generic type.
 
 Actually, I had been toying with the notion of doing the following:
 
 1. A numeric literal is initially typed as the smallest type that will
 hold it in the series int2, int4, int8, numeric (notice NOT float8).

Yes, that sounds like a good plan for all scenarios that can follow !

 2. Allow implicit up-coercion int2-int4-int8-numeric-float4-float8,
 but down-coercions aren't implicit except for assignment.

How about int2-int4-int8-numeric-float4-float8-numeric ?
That would also allow an upward path from float8.

 3. Eliminate most or all of the cross-numeric-type operators 
 (eg, there is no reason to support int2+int4 as a separate operator).

Yes.

 With this approach, an expression like int4var = 42 would be initially
 typed as int4 and int2, but then the constant would be coerced to int4
 because int4=int4 is the closest-match operator.  (int2=int2 would not
 be considered because down-coercion isn't implicitly invokable.)  

It would fix the constants issue, yes. How about where int2col=int4col 
and it's indexability of int2col though ?

 Also
 we get more nearly SQL-standard behavior in expressions that combine
 numeric with float4/float8: the preferred type will be float, which
 accords with the spec's notions of exact numeric vs. 
 approximate numeric.

I do not understand the standard here.
Especially the following would seem awkward if that would switch to approximate:
set numericcol = numericcol * float4col; 

Andreas

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Proposal for resolving casting issues

2002-09-17 Thread Zeugswetter Andreas SB SD


 I think there is some confusion here.  The runtime checks Andreas was
 talking about was allowing a double of 64.0 to cast to an int4 while
 disallowing 64.1 from being cast to an int4 because it is not a hole
 number.  

Yes, and Tom's proposal for numbers is sufficient for constants, since the 64.0
will initially be an int2 and thus do the correct thing together with an int4,
and the 64.1 constant will be a numeric, and thus also do the correct thing with
all other types.

It is not sufficient for the optimizer for joins though, since it cannot use the 
int4 index when confronted with where tab1.int4col = tab2.numericcol.
Here only a runtime (non aborting) check would help.
Maybe this could be overcome if the index access (or something inbetween) would allow
a numeric constant for an int4 index (If the numeric value does not cleanly convert
to int4, return no rows).

Andreas

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Proposal for resolving casting issues

2002-09-17 Thread Tom Lane

Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
 When those are really truncated ESQL/C needs to set a warning in sqlca.sqlwarn
 though, thus I think the second signature should also have an output flag to tell 
 whether truncation actually occurred.
 Maybe this should be kept for a protocol change though, since I would not think
 a NOTICE would be suitable here. 

Again, I don't want to invent output arguments for functions today ;-).

I agree that a NOTICE would be overkill, and that we need a protocol
change to implement completion conditions (sqlca.sqlwarn) properly.
When that happens, I think the explicit-cast paths in the coercion
routines can easily call the set a completion condition routine for
themselves; I see no reason to pass back the condition one level
before doing so.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Proposal for resolving casting issues

2002-09-17 Thread Tom Lane

Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
 It is not sufficient for the optimizer for joins though, since it
 cannot use the int4 index when confronted with where tab1.int4col =
 tab2.numericcol.

For cross-datatype joins, the proposal as I sketched it would result in
the parser producing, eg,
where tab1.int4col::numeric = tab2.numericcol
that is, we'd have a single-datatype operator and a runtime cast in the
expression.

The optimizer is today capable of producing a nested loop with inner
indexscan join from this --- so long as the inner indexscan is on the
uncasted column (numericcol in this case).  It won't consider an int4
index on int4col for this.  This seems okay to me, actually.  It's
better than what you get now with a cross-datatype comparison operator
(neither side can be indexscanned since the operator matches neither
index opclass).

The major failing that needs to be rectified is that merge and hash
joins won't even be considered, because that code only works with
quals that are unadorned Var = Var.  I don't believe there is any
fundamental reason for this restriction.  As long as the top operator
is merge/hashjoinable, any expression should work on either side.
It's just a matter of cleaning up a few unwarranted shortcuts in the
planner.

But that work does need to be done before we can rip out all the
cross-datatype operators ... so this is definitely not happening
for 7.3 ...

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Proposal for resolving casting issues

2002-09-17 Thread Tom Lane

Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
 2. Allow implicit up-coercion int2-int4-int8-numeric-float4-float8,
 but down-coercions aren't implicit except for assignment.

 How about int2-int4-int8-numeric-float4-float8-numeric ?
 That would also allow an upward path from float8.

Uh, what?  That seems logically impossible to me ... or at least it
would reintroduce exactly the problem we need to get away from: casts
between float4, float8, numeric would be considered equally good in
either direction, creating ambiguity about which operator to use.
How are you envisioning it would work exactly?

Perhaps I should clarify what I had in mind: because the parser only
considers one level of type coercion when choosing a function or
assigning to a result column, it's actually necessary to have all thirty
cast combinations between the six numeric types available in pg_cast.
My notation int2-int4-int8-numeric-float4-float8 is intended to
imply that of the thirty, these would be marked as implicitly coercible:

int2-int4
int2-int8
int2-numeric
int2-float4
int2-float8
int4-int8
int4-numeric
int4-float4
int4-float8
int8-numeric
int8-float4
int8-float8
numeric-float4
numeric-float8
float4-float8

while the fifteen reverse coercions would be assignment-only.

If we allow any circularity then we will have pairs of types with both
cast pathways marked as implicit, which will leave the parser unable to
choose which operator to use.  This is exactly why numeric = float8
has failed in past versions: there are two alternatives that are equally
easy to reach.


 It would fix the constants issue, yes. How about where int2col=int4col 
 and it's indexability of int2col though ?

See my other response.  The current scheme of using a cross-datatype
operator isn't helpful for indexing such cases anyway...

 Also
 we get more nearly SQL-standard behavior in expressions that combine
 numeric with float4/float8: the preferred type will be float, which
 accords with the spec's notions of exact numeric vs. 
 approximate numeric.

 I do not understand the standard here.
 Especially the following would seem awkward if that would switch to
 approximate:
   set numericcol = numericcol * float4col; 

Well, the spec's notion is that combining an exact number and an
approximate number must yield an approximate result.  This logic
is hard to argue with, even though in our implementation it would
seem to make more sense for numeric to be the top of the hierarchy
on range and precision grounds.

Note that if you write, say,
set numericcol = numericcol * 3.14159;
my proposal would do the right thing since the constant would be typed
as numeric to start with and would stay that way.  To do what you want
with a float variable, it'd be necessary to write
set numericcol = numericcol * float4col::numeric;
which is sort of ugly; but no uglier than
set float4col = float4col * numericcol::float4;
which is what you'd have to write if the system preferred numeric and
you wanted the other behavior.

I too have been thinking for a long time that I didn't like following
the spec's lead on this point; but I am now starting to think that it's
not all that bad.  This approach to handling constants is *much* cleaner
than what we've done in the past, or even any of the unimplemented
proposals that I can recall.  The behavior you'd get with combinations
of float and numeric variables is, well, debatable; from an
implementor's point of view preferring a numeric result makes sense,
but it's much less clear that users would automatically think the same.
Given the spec's position, I am starting to think that preferring float
is the right thing to do.

BTW, I am thinking that we don't need the notion of preferred type at
all in the numeric category if we use this approach.  I have not worked
through the details for the other type categories, but perhaps if we
adopt similar systems of one-way implicit promotions in each category,
we could retire preferred types altogether --- which would let us get
rid of hardwired type categories, too.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Proposal for resolving casting issues

2002-09-17 Thread Tom Lane

I wrote:
 I think we must extend pg_cast's castimplicit column to a three-way value:
   * okay as implicit cast in expression (or in assignment)
   * okay as implicit cast in assignment only
   * okay only as explicit cast

 Question: what shall we call these alternatives in CREATE CAST?  The SQL99
 phrase AS ASSIGNMENT looks like it should mean the second, but I think
 the spec semantics require it to mean the first.  Ugh.  Perhaps AS
 ASSIGNMENT ONLY for the second case?

On looking more closely, SQL99 appears to define user-defined casts as
invocable *only* in explicit cast and assignment contexts.  Part 2 sez:

 4.13  Data conversions

 Explicit data conversions can be specified by a CAST operator.
 A CAST operator defines how values of a source data type are
 converted into a value of a target data type according to
 the Syntax Rules and General Rules of Subclause 6.22, cast
 specification. Data conversions between predefined data types
 and between constructed types are defined by the rules of this part
 of ISO/IEC 9075. Data conversions between one or more user-defined
 types are defined by a user-defined cast.

 A user-defined cast identifies an SQL-invoked function, called the
 cast function, that has one SQL parameter whose declared type is
 the same as the source data type and a result data type that is the
 target data type. A cast function may optionally be specified to
 be implicitly invoked whenever values are assigned to targets of
 its result data type. Such a cast function is called an implicitly
 invocable cast function.

This seems to mean that we can get away with defining AS ASSIGNMENT to
mean my second category (implicit in assignment only), and then picking
some more natural term for my first category (implicit anywhere).

I favor using IMPLICIT, which would make the syntax of CREATE CAST be

CREATE CAST (sourcetype AS targettype)
WITH FUNCTION funcname (argtype)
[ AS ASSIGNMENT | IMPLICIT ]

CREATE CAST (sourcetype AS targettype)
WITHOUT FUNCTION
[ AS ASSIGNMENT | IMPLICIT ]

Or possibly it should be AS IMPLICIT?

Comments?

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Proposal for resolving casting issues

2002-09-17 Thread Bruce Momjian

Tom Lane wrote:
 I favor using IMPLICIT, which would make the syntax of CREATE CAST be
 
 CREATE CAST (sourcetype AS targettype)
 WITH FUNCTION funcname (argtype)
 [ AS ASSIGNMENT | IMPLICIT ]
 
 CREATE CAST (sourcetype AS targettype)
 WITHOUT FUNCTION
 [ AS ASSIGNMENT | IMPLICIT ]
 
 Or possibly it should be AS IMPLICIT?

I think AS IMPLICIT would be better because we have other AS [var]
clauses.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Proposal for resolving casting issues

2002-09-17 Thread Bruce Momjian

Tom Lane wrote:
 Note that if you write, say,
   set numericcol = numericcol * 3.14159;
 my proposal would do the right thing since the constant would be typed
 as numeric to start with and would stay that way.  To do what you want
 with a float variable, it'd be necessary to write
   set numericcol = numericcol * float4col::numeric;
 which is sort of ugly; but no uglier than
   set float4col = float4col * numericcol::float4;
 which is what you'd have to write if the system preferred numeric and
 you wanted the other behavior.

I need a clarification.  In the non-assignment case, does:

WHERE numericcol = numericcol * 3.14159

evaluate numericcol * 3.14159 as a numeric?

And does:

WHERE 5.55 = numericcol * 3.14159

evaluate numericcol * 3.14159 as a numeric too?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Proposal for resolving casting issues

2002-09-17 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 I need a clarification.  In the non-assignment case, does:
   WHERE numericcol = numericcol * 3.14159
 evaluate numericcol * 3.14159 as a numeric?

Yup (given my proposed changes that is).

 And does:
   WHERE 5.55 = numericcol * 3.14159
 evaluate numericcol * 3.14159 as a numeric too?

Yup.  The context does not matter: when we have foo * bar, we are going
to decide which kind of * operator is meant without regard to
surrounding context.  It's very much a bottom-up process, and has to be.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org