RE: [sqlite] SUM and NULL values
On Fri, 2005-09-09 at 16:54 -0400, Ned Batchelder wrote: > I can't follow this thread (NULLs make my head hurt), but it looks like Dr. > Hipp has already taken action. Yesterday he made two changes to the source: > The changes yesterday brought SQLite's SUM function into compliance with the (broken) SQL standard. My plan is to leave things that way. I'll likely introduce some non-standard functions (perhaps named "_sum" or "sqlite_sum") that work in a useful way - which is to say the way that Martin Engelschalk and several others have suggested: sum() == 0 sum() == NULL sum() == the correct sum -- D. Richard Hipp <[EMAIL PROTECTED]>
RE: [sqlite] SUM and NULL values
I can't follow this thread (NULLs make my head hurt), but it looks like Dr. Hipp has already taken action. Yesterday he made two changes to the source: SUM returns NULL when it has no inputs: http://www.sqlite.org/cvstrac/chngview?cn=2678 A SUM() of all NULLs returns NULL. A SUM() of nothing return 0. A SUM() of a mixture of NULLs and numbers returns the sum of the numbers: http://www.sqlite.org/cvstrac/chngview?cn=2677 --Ned. http://nedbatchelder.com -Original Message- From: Jay Sprenkle [mailto:[EMAIL PROTECTED] Sent: Friday, 09 September, 2005 4:22 PM To: sqlite-users@sqlite.org Subject: Re: [sqlite] SUM and NULL values Dr. Hipp, If my opinion is worth anything I'd prefer to stay with the standard, even though it might be a pain, etc. I've had too much bad experience with people writing code that depends on the quirks in other people's code. The decision is ultimately yours, and thanks for putting so much effort into it!
RE: [sqlite] SUM and NULL values
I'll second that opinion, FWIW. >>-Original Message- >>From: Jay Sprenkle [mailto:[EMAIL PROTECTED] >>Sent: Friday, September 09, 2005 4:22 PM >>To: sqlite-users@sqlite.org >>Subject: Re: [sqlite] SUM and NULL values >>Importance: Low >> >>Dr. Hipp, >> >>If my opinion is worth anything I'd prefer to stay with the >>standard, even though it might be a pain, etc. I've had too >>much bad experience with people writing code that depends on >>the quirks in other people's code. >> >>The decision is ultimately yours, and thanks for putting so >>much effort into it! >>
Re: [sqlite] SUM and NULL values
Dr. Hipp, If my opinion is worth anything I'd prefer to stay with the standard, even though it might be a pain, etc. I've had too much bad experience with people writing code that depends on the quirks in other people's code. The decision is ultimately yours, and thanks for putting so much effort into it!
Re: [sqlite] SUM and NULL values
On Sep 9, 2005, at 2:59 PM, Cam Crews wrote: I heard someone say that SUM is a binary operation. It is not, because you can feed it any number of values. It is not the same as "+". While I don't believe anyone has claimed SUM represents a binary operation, the function's behavior has been contrasted to the binary '+' operator. Certainly, I hope we can all agree, when SUM is fed 2 values it should arrive at the same result as the '+' operator. The similarity between the two is worth exploring further: As Jay explained earlier so well, NULL + 0 = NULL expressly because NULL is valueless, leaving the binary operation with only one value, which results in the expression not being able to evaluate properly. When you can't evaluate something, this results in an absence of value (i.e. NULL). there is a potential problem here... any set operation can be reduced eventually to a binary operation. SUM(4, 3, 2, 1) = SUM(4, 3, 3) = SUM(4, 6) = 4 + 6 = 10 still, most would agree that SUM(4, NULL, 2, 1) = 7 even though SUM(NULL, NULL, NULL, NULL) = NULL the above would be the common expectation for most folks for all aggregate operations except for COUNT whereby COUNT(NULL, NULL, NULL, NULL) = 4 without any controversy. Similarly, the SUM of a valueless entity (NULL) should evaluate to a valueless entity (NULL). If we instead determine that the SUM of an empty set or series of NULLs evaluates to 0, we are inferring value where none exists. For example, let's say we want to calculate a value of "light intensity" for a pixel, which we'll determine by taking the SUM of a pixel's RGB hex values. Now, let's say we try calculating the intensity of a non-existent pixel. If SUM( NULL, NULL, NULL) evaluates to 0, this indicates a black pixel when in reality there is no pixel at all. Which prompts the question, how can "nothing" possess a color? Would we really want to assume the absence of color is equivalent to pitch black? actually, when you are dealing with RGB, absence of "color" is actually absence of light, which is in effect, pitch black. Now, if you were talking CMYK... still, you make a valid point. Can somebody come up with a realistic scenario where they would actually want SUM() to return NULL instead of 0? Yes; lets say I manage a system for a company that issues credit cards to customers. As they use their credit line, they accrue debt to my company, and this is reflected in the database. Now, lets say I want my staff to be able to access the amount a customer owes, searching by a customer ID, and returning the SUM of the customer's debts for each card she has been issued. If she's paid all her balances off (like a smart spender) we'd expect the SUM should be 0. But let's say that she owes us $20k and the staff member enters in an incorrect ID # when looking up the balance. The system takes the SUM of NULL(s) and returns 0, leading to the mistaken interpretation that she has paid her credit line off, when in actuality the query was erroneous and she is deep in the red. Would you not agree that NULL value should have been returned for this case? How else can we discriminate between erroneous cases and actual zero-sums? Yes, you can argue that the error here is the result of the staff member mistyping an ID, but it's important to recognize we are obfuscating the matter by returning a valid-appearing result when the system should be returning an error. It's these cases to watch out for where the zero-sum shortcut will be potentially dangerous. well, the system should be returning an error only if someone programs it to recognize that condition as an error. The question is whether or not the following should be programmed in SUM|AVE|MIN|MAX(of all NULL values) = NULL SUM|AVE|MIN|MAX(of some NULL and some NON NULL values) = SUM(of all NON NULL values) COUNT(of all NULL OR some NULL and some NON NULL values) = count of all elements in the set, null or not I say, yes, as it makes sense to me. -- Puneet Kishor
Re: [sqlite] SUM and NULL values
> I heard someone say that SUM is a binary operation. It is not, because you > can feed it any number of values. It is not the same as "+". While I don't believe anyone has claimed SUM represents a binary operation, the function's behavior has been contrasted to the binary '+' operator. Certainly, I hope we can all agree, when SUM is fed 2 values it should arrive at the same result as the '+' operator. The similarity between the two is worth exploring further: As Jay explained earlier so well, NULL + 0 = NULL expressly because NULL is valueless, leaving the binary operation with only one value, which results in the expression not being able to evaluate properly. When you can't evaluate something, this results in an absence of value (i.e. NULL). Similarly, the SUM of a valueless entity (NULL) should evaluate to a valueless entity (NULL). If we instead determine that the SUM of an empty set or series of NULLs evaluates to 0, we are inferring value where none exists. For example, let's say we want to calculate a value of "light intensity" for a pixel, which we'll determine by taking the SUM of a pixel's RGB hex values. Now, let's say we try calculating the intensity of a non-existent pixel. If SUM( NULL, NULL, NULL) evaluates to 0, this indicates a black pixel when in reality there is no pixel at all. Which prompts the question, how can "nothing" possess a color? Would we really want to assume the absence of color is equivalent to pitch black? > Can somebody come up with a realistic scenario where they would > actually want SUM() to return NULL instead of 0? Yes; lets say I manage a system for a company that issues credit cards to customers. As they use their credit line, they accrue debt to my company, and this is reflected in the database. Now, lets say I want my staff to be able to access the amount a customer owes, searching by a customer ID, and returning the SUM of the customer's debts for each card she has been issued. If she's paid all her balances off (like a smart spender) we'd expect the SUM should be 0. But let's say that she owes us $20k and the staff member enters in an incorrect ID # when looking up the balance. The system takes the SUM of NULL(s) and returns 0, leading to the mistaken interpretation that she has paid her credit line off, when in actuality the query was erroneous and she is deep in the red. Would you not agree that NULL value should have been returned for this case? How else can we discriminate between erroneous cases and actual zero-sums? Yes, you can argue that the error here is the result of the staff member mistyping an ID, but it's important to recognize we are obfuscating the matter by returning a valid-appearing result when the system should be returning an error. It's these cases to watch out for where the zero-sum shortcut will be potentially dangerous. Lastly, consider that SUM exhibits the identity property when passed a single value. ex: SUM(4) = 4 SUM(0) = 0 SUM(NULL) = NULL If we pass SUM a NULL value (i.e. no value) we should still expect it to return what we put in (i.e. no value). Otherwise, we're manufacturing zero-values which, while this may often be what *think* we want, represents a confusing twist on standard expectations that may mask deeper problems within a system.
RE: [sqlite] SUM and NULL values
If NULL means "I don't know" then IMHO the only sensible and logic correct thing to do is: SUM() = 0, since that conforms to common-sense algebra. Like SUM(9, 9, 9) = 3x9 = 27 SUM(9, 9) = 2x9 = 18 SUM(9) = 1x9 = 9 SUM() = 0x9 = 0 and certainly not NULL (I don't know). I heard someone say that SUM is a binary operation. It is not, because you can feed it any number of values. It is not the same as "+". Not however that this is not valid for all operations. For example: MIN() = NULL MAX() = NULL AVG() = NULL COUNT() = 0 However: SUM() = NULL since 1 of the values is unknown, we can't know the answer Best regards, Frank.
RE: [sqlite] SUM and NULL values
On Thu, 8 Sep 2005, Marcus Welz wrote: > Yes, the NULL as it is returned by SUM means "No data to answer your > question". > > If that query returned 0 (as SQLite currently does), rather than NULL (as > the standard specifies), it would not allow me to figure out whether I sold > $0 worth of items or if I didn't sell anything at all. Exactly! I was reading through the thread wondering when this point would be made (and getting ready to make it myself.) The SUM() of non-NULL values can be 0! If SQL(ite) would return 0 for all NULL input there would be no way do make the distiction between a SUM() that's 0 and 'no data to be summed'. Which can be an important distinction. Please! Make SQLite return NULL for all NULL input. I totally agree that null handeling seems/is inconsistent as hell at times... It seems hackish here and there. Which, IMHO, is the ALL THE REASON jou need to to just follow the standard! Rgds, Mark. > -Original Message- > From: Puneet Kishor [mailto:[EMAIL PROTECTED] > Sent: Thursday, September 08, 2005 6:50 PM > To: sqlite-users@sqlite.org > Subject: Re: [sqlite] SUM and NULL values > > > On Sep 8, 2005, at 5:45 PM, Marcus Welz wrote: > > > If "SELECT SUM(amt)" means "How much did I sell?" > > > > Then the "NULL" should mean, "You didn't sell anything.", no? > > > > no... NULL means, "I don't have any data to answer that question" which > is very different from "I sold $0 worth of things" or "I didn't sell > anything." > > -- > Puneet Kishor > > Regards, Mark
Re: [sqlite] SUM and NULL values
On Thu, 08 Sep 2005 16:24:14 -0400, "D. Richard Hipp" <[EMAIL PROTECTED]> wrote: > Can somebody come up with a realistic scenario where they would > actually want SUM() to return NULL instead of 0? NULL is such > a profoundly broken return value for sum() in my mind that I'm > thinking of ignoring the standard and just coding SQLite to do > the Right Thing. But I am open to the possibility that there > are some cases outside of my imagination where returning zero > might be considered "wrong". Isn't the meaning of 0 part of the application logic? If you really are going to do it, can there be a pragma to switch the behaviour? > If nobody can suggest a scenario where SUM() returning NULL is > actually helpful, then I will likely return SQLite to its former > mode of operation which is to have SUM() return 0 when it has no > input. What about when the sum is on a field in a left joined table? create table items (item_code char(10), description varchar(30)); insert into items values('WIDGETS','WIDGET 3.5 inch'); insert into items values('BOXES','Box 2 inch'); insert into items values('ETCHA','Etch-a-sketch'); create table transactions (item_code char(10), customer_code char(10), purchase_date date, amt numeric); -- customer buys widget insert into transactions('WIDGETS','CASH','20050905',11.00); -- customer brings widget back insert into transactions('WIDGETS','CASH','20050907,'-11.00); -- customer from last month brings box back insert into transactions('BOXES','CASH','20050907,'-20.00); -- monthly profitiblity report select items.item_code, sum(transactions.amt), count (transactions.item_code) >from items left outer join transactions on transactions.item_code = items.item_code where date >= '20050901' and date <= '20050930' order by 3 desc, 4 desc I want the result to be descending money, then descending counts, with no activity items at the end. klint. +---+-+ : Klint Gore: "Non rhyming: : EMail : [EMAIL PROTECTED] : slang - the: : Snail : A.B.R.I.: possibilities : : Mail University of New England : are useless" : : Armidale NSW 2351 Australia : L.J.J. : : Fax : +61 2 6772 5376 : : +---+-+
Re: [sqlite] SUM and NULL values
Robert Simpson wrote: Perhaps a pragma is the right answer? There shouldn't have to be a pragma to enforce SQL conformance. I should have clarified it should default to conforming, if there were to be such a thing. Like pragma friendly_null, defaulting to off. But your salary example is much more compelling than the October sales, so I change my vote. Standards are good. :) -- Ted Unangst www.coverity.com Coverity, Inc.
RE: [sqlite] SUM and NULL values
Yes, the NULL as it is returned by SUM means "No data to answer your question". I was talking in context with the example of the fictional sales situation, rather than the technical description of what SUM() exactly does, because, as I see it, we're trying to figure out whether the technical description/implementation makes sense. If that query returned 0 (as SQLite currently does), rather than NULL (as the standard specifies), it would not allow me to figure out whether I sold $0 worth of items or if I didn't sell anything at all. -Original Message- From: Puneet Kishor [mailto:[EMAIL PROTECTED] Sent: Thursday, September 08, 2005 6:50 PM To: sqlite-users@sqlite.org Subject: Re: [sqlite] SUM and NULL values On Sep 8, 2005, at 5:45 PM, Marcus Welz wrote: > If "SELECT SUM(amt)" means "How much did I sell?" > > Then the "NULL" should mean, "You didn't sell anything.", no? > no... NULL means, "I don't have any data to answer that question" which is very different from "I sold $0 worth of things" or "I didn't sell anything." -- Puneet Kishor
Re: [sqlite] SUM and NULL values
On Thu, Sep 08, 2005 at 05:16:30PM -0400, D. Richard Hipp wrote: > So SQL sez: > > SUM() == NULL > SUM() == NULL > SUM() == Sum of non-null values > > Martin sez: > > SUM() == 0 > SUM() == NULL > SUM() == NULL > > SQLite 3.2.5 does this: > > SUM() == 0 > SUM() == 0 > SUM() == Sum of non-null values > > I can understand Martin's point of view. The SQL standard > point of view makes no sense to me at all. Note that all three of the above systems define SUM() to be an identity element such that: SUM(SUM(X), SUM()) = SUM(X) However, the following system breaks the identity: SUM() == 0 SUM() == NULL SUM() == Sum of non-null values SUM(SUM(), SUM()) = SUM(NULL, 0) = 0 != SUM() So for a mathematically consistent definition of SUM, it makes sense that SUM() == NULL if SUM() == NULL. - glen
Re: [sqlite] SUM and NULL values
On Sep 8, 2005, at 5:45 PM, Marcus Welz wrote: If "SELECT SUM(amt)" means "How much did I sell?" Then the "NULL" should mean, "You didn't sell anything.", no? no... NULL means, "I don't have any data to answer that question" which is very different from "I sold $0 worth of things" or "I didn't sell anything." -- Puneet Kishor
RE: [sqlite] SUM and NULL values
If "SELECT SUM(amt)" means "How much did I sell?" Then the "NULL" should mean, "You didn't sell anything.", no? To me, there is a difference between 0 ("You sold merchandise worth $0" perhaps because of sweepstakes, giveaway, rebate coupons, etc) and NULL ("You didn't sell anything."). I think that Robert Simpson nailed on the head with his salary example. If you want a quick report on sales per month: SELECT month, COALESCE(SUM(amount), 'No Sales') from table GROUP BY month If you insist that you want the result to be zero rather than NULL: SELECT month, COALESCE(SUM(amount), 0) from table GROUP BY month The behavior according to the SQL standard allows for flexibility. -Original Message- From: Ted Unangst [mailto:[EMAIL PROTECTED] Sent: Thursday, September 08, 2005 6:21 PM To: sqlite-users@sqlite.org Subject: Re: [sqlite] SUM and NULL values *snip* I would think "SELECT SUM(amt)" means "How much did I sell?" or "What was my income?", and a better query here would be "SELECT SUM(amt) - SUM(costs)". Just because there were no sales doesn't mean that the profit (or loss) is indeterminate. If your boss asked you "How much did you make in October?" the answer is not "I can't tell." Exacting correctness aside, I would expect 0 to confuse fewer people. Is it more likely an SQL expert relies on NULL being different than 0, or that a non-expert selects sum() without considering the NULL possibility? Perhaps a pragma is the right answer? -- Ted Unangst www.coverity.com Coverity, Inc.
Re: [sqlite] SUM and NULL values
- Original Message - From: "Ted Unangst" <[EMAIL PROTECTED]> To: Sent: Thursday, September 08, 2005 3:20 PM Subject: Re: [sqlite] SUM and NULL values Robert Simpson wrote: Lets take the October sales further. Lets say if there were any sales in October, that you had to subtract $100 from the total sales for October to account for ... lets say "shipping costs". SELECT SUM(amt) - 100 from sales where month = 'october' If there were no sales, under your query plan, I'd still have been in the hole $100. This is vastly oversimplified in order to show that "0" does not always answer the question and can cascade into an even worse scenario. I would think "SELECT SUM(amt)" means "How much did I sell?" or "What was my income?", and a better query here would be "SELECT SUM(amt) - SUM(costs)". Just because there were no sales doesn't mean that the profit (or loss) is indeterminate. If your boss asked you "How much did you make in October?" the answer is not "I can't tell." You're looking at the query and injecting intent into it when it is merely an example of what can go wrong when you deviate from the SQL standard. The bottom line is that SQLite returns different query results than other SQL databases because of this deviation, which is compounded when you build other SQL-compliant code around those functions. One of SQLite's design goals is standards conformance. Lets not give it up. Exacting correctness aside, I would expect 0 to confuse fewer people. Is it more likely an SQL expert relies on NULL being different than 0, or that a non-expert selects sum() without considering the NULL possibility? Perhaps a pragma is the right answer? There shouldn't have to be a pragma to enforce SQL conformance. Robert
Re: [sqlite] SUM and NULL values
On Sep 8, 2005, at 5:20 PM, Ted Unangst wrote: Robert Simpson wrote: Lets take the October sales further. Lets say if there were any sales in October, that you had to subtract $100 from the total sales for October to account for ... lets say "shipping costs". SELECT SUM(amt) - 100 from sales where month = 'october' If there were no sales, under your query plan, I'd still have been in the hole $100. This is vastly oversimplified in order to show that "0" does not always answer the question and can cascade into an even worse scenario. I would think "SELECT SUM(amt)" means "How much did I sell?" or "What was my income?", and a better query here would be "SELECT SUM(amt) - SUM(costs)". Just because there were no sales doesn't mean that the profit (or loss) is indeterminate. If your boss asked you "How much did you make in October?" the answer is not "I can't tell." oh yes, it is the correct answer. "I can't tell because I don't have the data." Therein lies the problem... if we contrive concrete examples, we can go both ways -- the way of NULL or the way of 0. The fact is, NULL is the absence of data. Hence, I can't tell is a valid answer. -- Puneet Kishor
Re: [sqlite] SUM and NULL values
Robert Simpson wrote: Lets take the October sales further. Lets say if there were any sales in October, that you had to subtract $100 from the total sales for October to account for ... lets say "shipping costs". SELECT SUM(amt) - 100 from sales where month = 'october' If there were no sales, under your query plan, I'd still have been in the hole $100. This is vastly oversimplified in order to show that "0" does not always answer the question and can cascade into an even worse scenario. I would think "SELECT SUM(amt)" means "How much did I sell?" or "What was my income?", and a better query here would be "SELECT SUM(amt) - SUM(costs)". Just because there were no sales doesn't mean that the profit (or loss) is indeterminate. If your boss asked you "How much did you make in October?" the answer is not "I can't tell." Exacting correctness aside, I would expect 0 to confuse fewer people. Is it more likely an SQL expert relies on NULL being different than 0, or that a non-expert selects sum() without considering the NULL possibility? Perhaps a pragma is the right answer? -- Ted Unangst www.coverity.com Coverity, Inc.
Re: [sqlite] SUM and NULL values
NULL is the absence of value. Any aggregation operation on a set where ALL the elements of the set are "absences of values" should return an "absence of value." If ALL the elements are NOT NULL then the aggregation operation should be applied to the elements that have a "presence of value." The only exception should be COUNT as it would return the number of elements in the set without peeking in at the values of the members of the set, hence never even encountering the absence or presence of values of them. -- Puneet Kishor
Re: [sqlite] SUM and NULL values
If SQLite doesn't conform to the SQL standard, it'll break several other dependencies on the standard: CREATE TABLE salaries (empid int, hourlywage money, yearlywage money, emptype int) INSERT INTO salaries VALUES(1, 12.95, null, 1) INSERT INTO salaries VALUES(2, null, 8.00, 2) SELECT COALESCE(SUM(hourlywage * 40 * 52), SUM(yearlywage)) from salaries GROUP BY emptype In SQL Server, this returns: - 26936. 8. In SQLite it returns: - 26936 0
RE: [sqlite] SUM and NULL values
For some reason I have to agree with the SQL standard, which I would interprete as the following (and this may not be all that bullet proof): "If the result set contains numeric values, sum them up, ignoring NULLs. If there are no numeric values present (the result set is either empty or contains only NULLs) return NULL." There's a fine difference between NULL and 0. If I have an unordered result set of: a|b -- NULL |NULL NULL |NULL NULL |NULL 0|NULL NULL |NULL I would want SUM(a) to be zero, but I'd want SUM(b) to be NULL. Just like I'd want MIN(a) to return zero, but MIN(b) to be NULL. If SUM() were to always return 0, wouldn't it break the logic of how it behaved compared to its MIN(), MAX() and AVG() siblings? Always returning at least 0 because it makes sense to me would not be enough for me to justify breaking away from the standard, particularly because it's so easy to ensure that the result will be 0 and not NULL as Martin mentioned using "sum(coalesce(row, 0))". And after all -- it's the standard. And absolutely I love using SQLite simply because it (seems to) follow the standard so closely. The introduced oh-so-subtle difference for developers ("MySQL, PostgreSQL, etc. etc. return NULL, but SQLite actually defies the SQL standard and returns 0") could be rather confusing. I'm by no means an expert or authority when it comes to SQL, but NULL is rather interesting. Since "NULL is always false even to itself" I think that by having SUM(b) return NULL this "quirkiness" or "speciality" behavior of NULL is preserved. Having the SUM() of a bunch of NULLs be 0 seems a bit too magic to me. -Original Message- From: D. Richard Hipp [mailto:[EMAIL PROTECTED] Sent: Thursday, September 08, 2005 5:17 PM To: sqlite-users@sqlite.org Subject: Re: [sqlite] SUM and NULL values *snip* So SQL sez: SUM() == NULL SUM() == NULL SUM() == Sum of non-null values *snip* SQLite 3.2.5 does this: SUM() == 0 SUM() == 0 SUM() == Sum of non-null values
Re: [sqlite] SUM and NULL values
SELECT SUM(temperature) FROM antartica WHERE temperature < -150 LIMIT 1 Ignore the obvious logic error in this statement, since it will always return 1 row ... the meaning was that 0 does not always answer every question. Robert
Re: [sqlite] SUM and NULL values
- Original Message - From: "Martin Engelschalk" <[EMAIL PROTECTED]> To: Sent: Thursday, September 08, 2005 2:23 PM Subject: Re: [sqlite] SUM and NULL values SUM() itself does imply an numerical type return and does not need any rows from which to get a type - even more so in SQLite, where types are not a big issue. To return to the example "sales in october": If there are no records for october, then there were no sales, and the sum of - say - profits is obviously 0. If there is a record for october, but is contains a NULL value for the profits, this means there is no data for an existing october sale, and sum() should return NULL. Lets take the October sales further. Lets say if there were any sales in October, that you had to subtract $100 from the total sales for October to account for ... lets say "shipping costs". SELECT SUM(amt) - 100 from sales where month = 'october' If there were no sales, under your query plan, I'd still have been in the hole $100. This is vastly oversimplified in order to show that "0" does not always answer the question and can cascade into an even worse scenario. Robert
RE: [sqlite] SUM and NULL values
There is one other thing, though. Allthough the sql standard is (insert your favorite 4-letter word), and although I personally hate nulls, I try to write applications where the sql is as portable as it can be. So, in the name of portability, you should follow the sql standard.
Re: [sqlite] SUM and NULL values
D. Richard Hipp schrieb: On Thu, 2005-09-08 at 23:05 +0200, Martin Engelschalk wrote: I have found that, when dealing with NULL values, it helps to think of a Null as "I don't know, I have no data". So, if the where clause returns no records, I do know the result: It is 0. If there where clause returns records with NULL values, that means I did not know the values and so can not know the sum, too. Sum() should therefore return NULL. When I want to regard a NULL value as 0 in this (or any) context, I use a sum(coalesce(row, 0)). Martin, your argument makes good sense. Unfortunately, that is not what the SQL standard says should happen. According to SQL (or at least the authorities on the subject I have access to) any NULL values in the series input to SUM are to be ignored. SUM returns the summation of the non-null values. Or if there are no non-NULL values SUM returns NULL. So SQL sez: SUM() == NULL SUM() == NULL SUM() == Sum of non-null values Martin sez: SUM() == 0 SUM() == NULL SUM() == NULL SQLite 3.2.5 does this: SUM() == 0 SUM() == 0 SUM() == Sum of non-null values I can understand Martin's point of view. The SQL standard point of view makes no sense to me at all. Yes, I see - The case escaped my notice in the heat of the argument :-) In any case, I would never build my queries to depend on the DB Engine (not even the fantastic SQLite) in such a difficult case, but perhaps check for null values first and use coalesce / nvl Martin
Re: [sqlite] SUM and NULL values
On Thu, 2005-09-08 at 14:12 -0700, Robert Simpson wrote: > If SUM returns 0 when no rows were evaluated, then SQLite is making an > assumption about what the answer should be, which is incorrect. > > SELECT SUM(temperature) FROM antartica WHERE temperature < -150 LIMIT 1 > > Does it makes sense for that query to return 0 if no rows matched the > criteria? Temperature is an intensive property, not an extensive property, and is therefore not additive. SUM(temperature) not a sensible thing to do. The correct query should omit the SUM like this: SELECT temperature FROM antartica WHERE temperature < -150 LIMIT 1 Which gives the result you seek. Thank you for the suggestion, though... -- D. Richard Hipp <[EMAIL PROTECTED]>
Re: [sqlite] SUM and NULL values
Robert Simpson wrote: SUM without any rows should return NULL. It had no type affinity with which to obtain an answer for the question, and no meaningful rows were available with which to glean an answer. If SUM returns 0 when no rows were evaluated, then SQLite is making an assumption about what the answer should be, which is incorrect. SELECT SUM(temperature) FROM antartica WHERE temperature < -150 LIMIT 1 Does it makes sense for that query to return 0 if no rows matched the criteria? If SQLite doesn't know the answer definitively, then it shouldn't return one. Robert I am sorry, but I do not agree. SUM() itself does imply an numerical type return and does not need any rows from which to get a type - even more so in SQLite, where types are not a big issue. To return to the example "sales in october": If there are no records for october, then there were no sales, and the sum of - say - profits is obviously 0. If there is a record for october, but is contains a NULL value for the profits, this means there is no data for an existing october sale, and sum() should return NULL. Martin PS: I am aware that the discussion about NULL is an old one and divides the world in warring factions :-)
Re: [sqlite] SUM and NULL values
On Thu, 2005-09-08 at 23:05 +0200, Martin Engelschalk wrote: > I have found that, when dealing with NULL values, it helps to think of a > Null as "I don't know, I have no data". > So, if the where clause returns no records, I do know the result: It is 0. > If there where clause returns records with NULL values, that means I did > not know the values and so can > not know the sum, too. Sum() should therefore return NULL. > When I want to regard a NULL value as 0 in this (or any) context, I use > a sum(coalesce(row, 0)). > Martin, your argument makes good sense. Unfortunately, that is not what the SQL standard says should happen. According to SQL (or at least the authorities on the subject I have access to) any NULL values in the series input to SUM are to be ignored. SUM returns the summation of the non-null values. Or if there are no non-NULL values SUM returns NULL. So SQL sez: SUM() == NULL SUM() == NULL SUM() == Sum of non-null values Martin sez: SUM() == 0 SUM() == NULL SUM() == NULL SQLite 3.2.5 does this: SUM() == 0 SUM() == 0 SUM() == Sum of non-null values I can understand Martin's point of view. The SQL standard point of view makes no sense to me at all. -- D. Richard Hipp <[EMAIL PROTECTED]>
Re: [sqlite] SUM and NULL values
- Original Message - From: "D. Richard Hipp" <[EMAIL PROTECTED]> To: Sent: Thursday, September 08, 2005 1:24 PM Subject: Re: [sqlite] SUM and NULL values Can somebody come up with a realistic scenario where they would actually want SUM() to return NULL instead of 0? NULL is such a profoundly broken return value for sum() in my mind that I'm thinking of ignoring the standard and just coding SQLite to do the Right Thing. But I am open to the possibility that there are some cases outside of my imagination where returning zero might be considered "wrong". If nobody can suggest a scenario where SUM() returning NULL is actually helpful, then I will likely return SQLite to its former mode of operation which is to have SUM() return 0 when it has no input. If 0 does not answer the question, then 0 should not be proffered as the answer. SUM without any rows should return NULL. It had no type affinity with which to obtain an answer for the question, and no meaningful rows were available with which to glean an answer. If SUM returns 0 when no rows were evaluated, then SQLite is making an assumption about what the answer should be, which is incorrect. SELECT SUM(temperature) FROM antartica WHERE temperature < -150 LIMIT 1 Does it makes sense for that query to return 0 if no rows matched the criteria? If SQLite doesn't know the answer definitively, then it shouldn't return one. Robert
RE: [sqlite] SUM and NULL values
On Thu, 2005-09-08 at 16:43 -0400, Thomas Briggs wrote: > > > Can somebody come up with a realistic scenario where they would > > actually want SUM() to return NULL instead of 0? > >I think your example of totaling sales in October makes the argument > itself - you didn't make sales totaling $0 in October, you just didn't > make any sales. A subtle but irritatingly big difference. :) > No. The question I asked was what was the total value of all sales in October. That answer is 0. If I had wanted to know whether or not there were any sales in October I would have used COUNT instead of SUM. By your argument, the official meaning of SUM in SQL is some strange and not particularly useful comingling of the concepts of magnitude (sum) and cardinality (count). It seems much more useful to keep these concepts distinct and in separate functions, which is what I am proposing to do in SQLite in defiance of the SQL standard. -- D. Richard Hipp <[EMAIL PROTECTED]>
Re: [sqlite] SUM and NULL values
D. Richard Hipp wrote: Can somebody come up with a realistic scenario where they would actually want SUM() to return NULL instead of 0? I have found that, when dealing with NULL values, it helps to think of a Null as "I don't know, I have no data". So, if the where clause returns no records, I do know the result: It is 0. If there where clause returns records with NULL values, that means I did not know the values and so can not know the sum, too. Sum() should therefore return NULL. When I want to regard a NULL value as 0 in this (or any) context, I use a sum(coalesce(row, 0)). Martin
RE: [sqlite] SUM and NULL values
I would think that if you are looking to know the total of something (which is what SUM provides), and there is nothing to total, the total should be 0. If you want to know if any sales were made, it seems you should be using COUNT and not SUM. Just my opinion, of course. Bob Envision Information Technologies Associate [EMAIL PROTECTED] v. 608.256.5680 f. 608.256.3780 -Original Message- From: Thomas Briggs [mailto:[EMAIL PROTECTED] Sent: Thursday, September 08, 2005 3:43 PM To: sqlite-users@sqlite.org Subject: RE: [sqlite] SUM and NULL values > Can somebody come up with a realistic scenario where they would > actually want SUM() to return NULL instead of 0? NULL is such I think your example of totaling sales in October makes the argument itself - you didn't make sales totaling $0 in October, you just didn't make any sales. A subtle but irritatingly big difference. :) -Tom
Re: [sqlite] SUM and NULL values
On 9/8/05, Thomas Briggs <[EMAIL PROTECTED]> wrote: > > > > > Can somebody come up with a realistic scenario where they would > > actually want SUM() to return NULL instead of 0? NULL is such > > I think your example of totaling sales in October makes the argument > itself - you didn't make sales totaling $0 in October, you just didn't > make any sales. A subtle but irritatingly big difference. :) Null is a big pain sometimes! :)
RE: [sqlite] SUM and NULL values
> Can somebody come up with a realistic scenario where they would > actually want SUM() to return NULL instead of 0? NULL is such I think your example of totaling sales in October makes the argument itself - you didn't make sales totaling $0 in October, you just didn't make any sales. A subtle but irritatingly big difference. :) -Tom
Re: [sqlite] SUM and NULL values
On 9/8/05, Dennis Jenkins <[EMAIL PROTECTED]> wrote: > The point being that if I sum up the rows using "sum()", I could get one > answer. However, if I iterate the cursor in a stored procedure and > manually sum them up using " accum += value" I would get a different > answer. The math is inconsistant (in postgresql anyway). I actaully > did not test sqlite :). I was just posting this to bring up the point > about consistnecy. I'll let those with more knowledge than me hash it > out. (no pun there). Depends on if you start with accum = 0; If you don't it throws an exception, if you do, then you get zero (for an empty set). That's the crux of the difference in philosophies. -- --- The Castles of Dereth Calendar: a tour of the art and architecture of Asheron's Call http://www.lulu.com/content/77264
Re: [sqlite] SUM and NULL values
On 9/8/05, D. Richard Hipp <[EMAIL PROTECTED]> wrote: > > Can somebody come up with a realistic scenario where they would > actually want SUM() to return NULL instead of 0? NULL is such > a profoundly broken return value for sum() in my mind that I'm > thinking of ignoring the standard and just coding SQLite to do > the Right Thing. But I am open to the possibility that there > are some cases outside of my imagination where returning zero > might be considered "wrong". > > If nobody can suggest a scenario where SUM() returning NULL is > actually helpful, then I will likely return SQLite to its former > mode of operation which is to have SUM() return 0 when it has no > input. Look at NULL as an error message. When you ask it to do a binary operation (sum is addition) on an empty set it can't since it doesn't have two values, so it returns an error indication. It's probably a lot more practically useful to return zero, but it's not standard.
Re: [sqlite] SUM and NULL values
Jay Sprenkle wrote: So, in some cases (explicite addition), NULL + 0 = NULL. (we all know that NULL != NULL). However, the "sum" function skips NULLs. This seems inconsistant. NULL + 0 is not valid since you can't do a binary operation on only one number (zero is a number, NULL is not). So it returns an appropriate result when you ask it to do something it can't do. Exactly. NULL is the appropraite result for the exact reason that you gave. To me this seems similar to mixing up units, like saying that the store is 3.5 km + 10 minutes + 2 radians from my house. The units just don't mix. I understand the rational for "NULL + anything" returning NULL (makes sense mathematically). I understand the utility of having "sum" skip NULLs. However, we can now construct a math in the sql engine where a+b+c != a+b+c. Yucky. Only where one of the variables is different than the other side of the equation. The point being that if I sum up the rows using "sum()", I could get one answer. However, if I iterate the cursor in a stored procedure and manually sum them up using " accum += value" I would get a different answer. The math is inconsistant (in postgresql anyway). I actaully did not test sqlite :). I was just posting this to bring up the point about consistnecy. I'll let those with more knowledge than me hash it out. (no pun there).
Re: [sqlite] SUM and NULL values
Can somebody come up with a realistic scenario where they would actually want SUM() to return NULL instead of 0? NULL is such a profoundly broken return value for sum() in my mind that I'm thinking of ignoring the standard and just coding SQLite to do the Right Thing. But I am open to the possibility that there are some cases outside of my imagination where returning zero might be considered "wrong". If nobody can suggest a scenario where SUM() returning NULL is actually helpful, then I will likely return SQLite to its former mode of operation which is to have SUM() return 0 when it has no input. -- D. Richard Hipp <[EMAIL PROTECTED]>
Re: [sqlite] SUM and NULL values
> So, in some cases (explicite addition), NULL + 0 = NULL. (we all know > that NULL != NULL). However, the "sum" function skips NULLs. This > seems inconsistant. NULL + 0 is not valid since you can't do a binary operation on only one number (zero is a number, NULL is not). So it returns an appropriate result when you ask it to do something it can't do. I understand the rational for "NULL + anything" returning NULL (makes > sense mathematically). I understand the utility of having "sum" skip > NULLs. However, we can now construct a math in the sql engine where > a+b+c != a+b+c. Yucky. > Only where one of the variables is different than the other side of the equation. --- The Castles of Dereth Calendar: a tour of the art and architecture of Asheron's Call http://www.lulu.com/content/77264
Re: [sqlite] SUM and NULL values
On 9/8/05, D. Richard Hipp <[EMAIL PROTECTED]> wrote: > PostgreSQL docs say that NULL is returned if all inputs > to sum are NULL. > > So then, if there are no input rows at all (if no rows > match the WHERE clause) then SUM returns 0. Actually PostgreSQL 8.0.3 and FirebirdSQL 1.5.2 return NULL in this case too. > (This makes > sense because if you say: > > SELECT sum(amt) FROM sales WHERE month='october'; > I agree that this does make sense in some cases, and it can also be quite handy to distinguish between 'no rows matched WHERE clause' and 'all matching rows were NULL'. But it appears it's not very consistent with other DB engines. -- Nemanja Corlija <[EMAIL PROTECTED]>
Re: [sqlite] SUM and NULL values
From postgresql 8.0.1 on FreeBSD: syslog-ng=# select NULL is null; ?column? -- t (1 row) syslog-ng=# select (0+NULL) is null; ?column? -- t (1 row) syslog-ng=# select (0) is null; ?column? -- f (1 row) syslog-ng=# create table tmp1 ( a int4 ); CREATE TABLE syslog-ng=# insert into tmp1 values (2); INSERT 16949751 1 syslog-ng=# insert into tmp1 values (0); INSERT 16949752 1 syslog-ng=# insert into tmp1 values (NULL); INSERT 16949756 1 syslog-ng=# select * from tmp1; a --- 2 0 (3 rows) syslog-ng=# select a, a is null from tmp1; a | ?column? ---+-- 2 | f 0 | f | t (3 rows) syslog-ng=# select sum(a) from tmp1; sum - 2 (1 row) syslog-ng=# drop table tmp1; DROP TABLE syslog-ng=# \q So, in some cases (explicite addition), NULL + 0 = NULL. (we all know that NULL != NULL). However, the "sum" function skips NULLs. This seems inconsistant. I understand the rational for "NULL + anything" returning NULL (makes sense mathematically). I understand the utility of having "sum" skip NULLs. However, we can now construct a math in the sql engine where a+b+c != a+b+c. Yucky.
Re: [sqlite] SUM and NULL values
On 9/8/05, Thomas Briggs <[EMAIL PROTECTED]> wrote: > > You (and I) may want an answer of 0, but I unfortunately believe NULL > is the correct answer. The answer isn't 0 - the answer is "there is no > answer", because there were no inputs. If that translates to 0 in > reality that's up to you to interpret. :) Yeah, I've cursed nulls occasionally because I had to jump through hoops to get the results I wanted, but used them too when people wanted to know how many records had missing data. That's why we get paid the big bucks! ;)
RE: [sqlite] SUM and NULL values
> So then, if there are no input rows at all (if no rows > match the WHERE clause) then SUM returns 0. (This makes > sense because if you say: > > SELECT sum(amt) FROM sales WHERE month='october'; > > and you didn't sell anything in October, you want an > answer of 0, not NULL.) Or if *some* of the entries are You (and I) may want an answer of 0, but I unfortunately believe NULL is the correct answer. The answer isn't 0 - the answer is "there is no answer", because there were no inputs. If that translates to 0 in reality that's up to you to interpret. :) I think the set-theoretical explanation is that an operation on a NULL set is NULL. COUNT seems to be an exception there, though... Hrmph. > The more I learn about NULLs in SQL the less sense they > make... I have never intentionally declared a column that allowed NULLs. :) -Tom
Re: [sqlite] SUM and NULL values
On 9/8/05, D. Richard Hipp <[EMAIL PROTECTED]> wrote: > > > So then, if there are no input rows at all (if no rows > match the WHERE clause) then SUM returns 0. (This makes > sense because if you say: > > SELECT sum(amt) FROM sales WHERE month='october'; > > and you didn't sell anything in October, you want an > answer of 0, not NULL.) Or if *some* of the entries are > NULL, then the answer is the sum of the non-NULL entries. > But if the number of entries is greater than zero and > they are all NULL, then the answer is NULL. > > Logical, right A sum is the addition of a set of numbers. Your logic is correct if you make an assumption that there's always a zero to start with. Then the sum of an empty set is: zero + (nothing) = zero. An empty set can't have a binary operation applied to it. You could also argue "The Since there are no numbers, there is no sum." Isn't this specified in cj date's SQL standards book someplace?
Re: [sqlite] SUM and NULL values
At http://www.oreilly.com/catalog/sqlnut/chapter/ch04.html I read: The AVG function computes the average of values in a column or an expression. SUM computes the sum. Both functions ... ignore NULL values. PostgreSQL docs say that NULL is returned if all inputs to sum are NULL. So then, if there are no input rows at all (if no rows match the WHERE clause) then SUM returns 0. (This makes sense because if you say: SELECT sum(amt) FROM sales WHERE month='october'; and you didn't sell anything in October, you want an answer of 0, not NULL.) Or if *some* of the entries are NULL, then the answer is the sum of the non-NULL entries. But if the number of entries is greater than zero and they are all NULL, then the answer is NULL. Logical, right The more I learn about NULLs in SQL the less sense they make... -- D. Richard Hipp <[EMAIL PROTECTED]>
Re: [sqlite] SUM and NULL values
Just for information: Oracle returns NULL's in both cases: for SUM() and for AVG(). I checked it for ver. 8.1.7 and 9.2.0 --- "D. Richard Hipp" <[EMAIL PROTECTED]> wrote: > On Thu, 2005-09-08 at 10:24 -0700, Will Leshner > wrote: > > Say I have a column called 'b' in a table called > 'test' and every > > field in 'b' has a NULL value. What is the > expected behavior of the > > following query: > > > > SELECT SUM(b) FROM TEST; > > > > I guess I'm thinking I'll get back NULL. Instead I > get 0 (or, in the > > 3.2.5 code, 0.0). > > > > SUM ignores NULLs. So if it ignores everything, it > returns the > sum of nothing, which is 0. (Not the 0.0 issue is > fixed in CVS HEAD.) > > > On the other hand, the following query does return > NULL: > > > > SELECT AVG(b) FROM TEST; > > > > AVG is implemented as SUM/COUNT. But the count is > zero. So > you get a NULL. > -- > D. Richard Hipp <[EMAIL PROTECTED]> > >
RE: [sqlite] SUM and NULL values
Actually, every database I'm aware of returns NULL for any aggregate whenever the inputs are NULL, and also when no input rows are processed (i.e. if no rows match the criteria in the WHERE clause, the result of the aggregation is NULL). -Tom > -Original Message- > From: Will Leshner [mailto:[EMAIL PROTECTED] > Sent: Thursday, September 08, 2005 2:51 PM > To: sqlite-users@sqlite.org > Subject: Re: [sqlite] SUM and NULL values > > > On Sep 8, 2005, at 11:33 AM, D. Richard Hipp wrote: > > > SUM ignores NULLs. So if it ignores everything, it returns the > > sum of nothing, which is 0. (Not the 0.0 issue is fixed in > CVS HEAD.) > > As a data point, MySQL returns NULL for a SUM over fields that are > all NULL. I am not suggesting, of course, that SQLite should imitate > MySQL. >
Re: [sqlite] SUM and NULL values
On 9/8/05, D. Richard Hipp <[EMAIL PROTECTED]> wrote: > > > SUM ignores NULLs. So if it ignores everything, it returns the > sum of nothing, which is 0. (Not the 0.0 issue is fixed in CVS HEAD.) I would expect the sum of nothing is nothing, not zero, but that's interpretation and I can see how others could easily expect something different. --- The Castles of Dereth Calendar: a tour of the art and architecture of Asheron's Call http://www.lulu.com/content/77264
Re: [sqlite] SUM and NULL values
On Sep 8, 2005, at 11:33 AM, D. Richard Hipp wrote: SUM ignores NULLs. So if it ignores everything, it returns the sum of nothing, which is 0. (Not the 0.0 issue is fixed in CVS HEAD.) As a data point, MySQL returns NULL for a SUM over fields that are all NULL. I am not suggesting, of course, that SQLite should imitate MySQL.
Re: [sqlite] SUM and NULL values
On Sep 8, 2005, at 11:33 AM, D. Richard Hipp wrote: SUM ignores NULLs. So if it ignores everything, it returns the sum of nothing, which is 0. (Not the 0.0 issue is fixed in CVS HEAD.) Ah. Perhaps I shouldn't have filed a bug report then. Sorry about that. I wonder what other databases do with this?
Re: [sqlite] SUM and NULL values
On Thu, 2005-09-08 at 10:24 -0700, Will Leshner wrote: > Say I have a column called 'b' in a table called 'test' and every > field in 'b' has a NULL value. What is the expected behavior of the > following query: > > SELECT SUM(b) FROM TEST; > > I guess I'm thinking I'll get back NULL. Instead I get 0 (or, in the > 3.2.5 code, 0.0). > SUM ignores NULLs. So if it ignores everything, it returns the sum of nothing, which is 0. (Not the 0.0 issue is fixed in CVS HEAD.) > On the other hand, the following query does return NULL: > > SELECT AVG(b) FROM TEST; > AVG is implemented as SUM/COUNT. But the count is zero. So you get a NULL. -- D. Richard Hipp <[EMAIL PROTECTED]>
Re: [sqlite] SUM and NULL values
> Say I have a column called 'b' in a table called 'test' and every > field in 'b' has a NULL value. What is the expected behavior of the > following query: > > SELECT SUM(b) FROM TEST; > > I guess I'm thinking I'll get back NULL. Instead I get 0 (or, in the > 3.2.5 code, 0.0). > > On the other hand, the following query does return NULL: > > SELECT AVG(b) FROM TEST; > I should think NULL would be the correct result, but it's not what I get: D:\temp\convention>sqlite3 testnull SQLite version 3.0.8 Enter ".help" for instructions sqlite> create table t( b integer, c integer ); sqlite> insert into t(b) values(1); sqlite> insert into t(b) values(2); sqlite> insert into t(b) values(3); sqlite> select * from t; 1| 2| 3| sqlite> select sum(c) from t; 0 sqlite> select sum(b) from t; 6 sqlite> select count(*) from t where c is null; 3 sqlite> -- --- The Castles of Dereth Calendar: a tour of the art and architecture of Asheron's Call http://www.lulu.com/content/77264