Re: [patch] bit XOR aggregate functions

2021-03-07 Thread David Rowley
On Sun, 7 Mar 2021 at 23:24, Pavel Stehule  wrote:
 The mandatory ORDER BY clause should be necessary for operations when
the result depends on the order. You need an order for calculation of
median. And you don't need to know an order for average. More if the
result is one number and is not possible to do a visual check of
correctness (like median).

I really don't think so.

# create table f (f float not null);
# insert into f values(1e100),(-1e100),(1.5);
# select sum(f order by f) from f;
 sum
-
   0
(1 row)

# select sum(f) from f;
 sum
-
 1.5
(1 row)

Users are going to be pretty annoyed with us if we demanded that they
include an ORDER BY for that query. Especially so since our ORDER BY
aggregate implementation still has no planner support.

David




Re: [patch] bit XOR aggregate functions

2021-03-07 Thread Tom Lane
Pavel Stehule  writes:
> But this is offtopic in this discussion :)

The whole topic is off-topic.  As a general rule, things that depend on
input order shouldn't be declared as aggregates --- they should be window
functions or ordered-set aggregates, for which the syntax forces you to
specify input order.  All of the standard aggregates, and most of our
custom ones (including BIT_XOR) do not depend on input order (... mumble
floating-point roundoff error mumble ...), so forcing users to write an
ordering clause would be useless, not to mention being a SQL spec
violation.

There are a small minority like array_agg that do have such a dependency,
but as far as I recall our docs for each of those warn about the need to
sort the input for reproducible results.  I think that's sufficient.
Who's to say whether a particular query actually requires reproducible
results?  Seeing that we don't provide reproducible row ordering
without an ORDER BY, I'm not sure why we should apply a different
standard to array_agg.

regards, tom lane




Re: [patch] bit XOR aggregate functions

2021-03-07 Thread Pavel Stehule
ne 7. 3. 2021 v 12:39 odesílatel Vik Fearing 
napsal:

> On 3/7/21 11:37 AM, Pavel Stehule wrote:
> > ne 7. 3. 2021 v 11:28 odesílatel Vik Fearing 
> > napsal:
> >
> >> On 3/7/21 11:24 AM, Pavel Stehule wrote:
> 
>  And so you are now mandating an ORDER BY on every query and in every
>  aggregate and/or window function.  Users will not like that at all.  I
>  certainly shan't.
> 
> >>>
> >>> The mandatory ORDER BY clause should be necessary for operations when
> the
> >>> result depends on the order. You need an order for calculation of
> median.
> >>> And you don't need to know an order for average. More if the result is
> >> one
> >>> number and is not possible to do a visual check of correctness (like
> >>> median).
> >>
> >> The syntax for median (percentile_cont(0.5)) already requires an order
> >> by clause.  You are now requiring one on array_agg().
> >>
> >
> > array_agg is discuttable, because PostgreSQL arrays are ordered set type.
> > But very common usage is using arrays instead and unordered sets (because
> > ANSI/SQL sets) are not supported. But anyway - for arrays I can do visual
> > check if it is ordered well or not.
>
> If by "visual check" you mean "with my human eyeballs" then I would
> argue that that is always the case and we don't need nannying for other
> aggregates either.
>

The correct solution is using arrays like arrays and sets like sets. When
you mix two different features to one, then you will have problems.

But if I see {{1,2,3},{3,4,5}} I have some knowledge - it is not 100%, but
it is. If I have 27373 as a result of median, I have nothing other
information.

The design of arrays (in pg) was incremental - it is older than Postgres
supported ordered aggregates, and probably older than ANSI/SQL introduced
sets. So the implementation of strong safeguards is not possible for
compatibility reasons. If I designed array_agg or string_agg today, then I
prefer to design it like ordered aggregates.

Sure - it is about life philosophy, and it is about projects where you are,
and about risks, .. some people prefer risks, some people prefer
safeguards. I see a complexity boom as a very big issue - I remember good
books about programming on 50 pagers, and then now we should start from
green or zero again or we have to implement most safeguards that are
possible to hold systems workable. But anyway - a good system is robust,
and robust systems try to reduce possible errors how it is possible (human
errors are most common).

But this is offtopic in this discussion :)





-- 
> Vik Fearing
>


Re: [patch] bit XOR aggregate functions

2021-03-07 Thread Vik Fearing
On 3/7/21 11:37 AM, Pavel Stehule wrote:
> ne 7. 3. 2021 v 11:28 odesílatel Vik Fearing 
> napsal:
> 
>> On 3/7/21 11:24 AM, Pavel Stehule wrote:

 And so you are now mandating an ORDER BY on every query and in every
 aggregate and/or window function.  Users will not like that at all.  I
 certainly shan't.

>>>
>>> The mandatory ORDER BY clause should be necessary for operations when the
>>> result depends on the order. You need an order for calculation of median.
>>> And you don't need to know an order for average. More if the result is
>> one
>>> number and is not possible to do a visual check of correctness (like
>>> median).
>>
>> The syntax for median (percentile_cont(0.5)) already requires an order
>> by clause.  You are now requiring one on array_agg().
>>
> 
> array_agg is discuttable, because PostgreSQL arrays are ordered set type.
> But very common usage is using arrays instead and unordered sets (because
> ANSI/SQL sets) are not supported. But anyway - for arrays I can do visual
> check if it is ordered well or not.

If by "visual check" you mean "with my human eyeballs" then I would
argue that that is always the case and we don't need nannying for other
aggregates either.
-- 
Vik Fearing




Re: [patch] bit XOR aggregate functions

2021-03-07 Thread Pavel Stehule
ne 7. 3. 2021 v 11:28 odesílatel Vik Fearing 
napsal:

> On 3/7/21 11:24 AM, Pavel Stehule wrote:
> >>
> >> And so you are now mandating an ORDER BY on every query and in every
> >> aggregate and/or window function.  Users will not like that at all.  I
> >> certainly shan't.
> >>
> >
> > The mandatory ORDER BY clause should be necessary for operations when the
> > result depends on the order. You need an order for calculation of median.
> > And you don't need to know an order for average. More if the result is
> one
> > number and is not possible to do a visual check of correctness (like
> > median).
>
> The syntax for median (percentile_cont(0.5)) already requires an order
> by clause.  You are now requiring one on array_agg().
>

array_agg is discuttable, because PostgreSQL arrays are ordered set type.
But very common usage is using arrays instead and unordered sets (because
ANSI/SQL sets) are not supported. But anyway - for arrays I can do visual
check if it is ordered well or not.


-- 
> Vik Fearing
>


Re: [patch] bit XOR aggregate functions

2021-03-07 Thread Vik Fearing
On 3/7/21 11:24 AM, Pavel Stehule wrote:
>>
>> And so you are now mandating an ORDER BY on every query and in every
>> aggregate and/or window function.  Users will not like that at all.  I
>> certainly shan't.
>>
> 
> The mandatory ORDER BY clause should be necessary for operations when the
> result depends on the order. You need an order for calculation of median.
> And you don't need to know an order for average. More if the result is one
> number and is not possible to do a visual check of correctness (like
> median).

The syntax for median (percentile_cont(0.5)) already requires an order
by clause.  You are now requiring one on array_agg().
-- 
Vik Fearing




Re: [patch] bit XOR aggregate functions

2021-03-07 Thread Pavel Stehule
>
> And so you are now mandating an ORDER BY on every query and in every
> aggregate and/or window function.  Users will not like that at all.  I
> certainly shan't.
>

The mandatory ORDER BY clause should be necessary for operations when the
result depends on the order. You need an order for calculation of median.
And you don't need to know an order for average. More if the result is one
number and is not possible to do a visual check of correctness (like
median).

-- 
> Vik Fearing
>


Re: [patch] bit XOR aggregate functions

2021-03-07 Thread Vik Fearing
On 3/7/21 11:05 AM, Pavel Stehule wrote:
> ne 7. 3. 2021 v 11:02 odesílatel Vik Fearing 
> napsal:
> 
>> On 3/7/21 10:53 AM, Pavel Stehule wrote:
>>> ne 7. 3. 2021 v 10:36 odesílatel Vik Fearing 
>>> napsal:
>>>
 On 3/6/21 9:06 PM, David G. Johnston wrote:
> On Saturday, March 6, 2021, David Fetter  wrote:
>
>>
 SELECT BIT_XOR(b ORDER BY a, c).../* works */
 SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
 SELECT BIT_XOR(b) FROM... /* errors out */
>>>
>>>
>>> Why would such an error be necessary, or even desirable?
>>
>> Because there is no way to ensure that the results remain consistent
>> from one execution to the next without such a guarantee.
>>
>
> Numerous existing aggregate functions have this behavior.  Making those
> error isn’t an option.  So is making this a special case something we
 want
> to do (and also maybe make doing so the rule going forward)?

 Aside from the fact that bit_xor() does not need this, I am opposed to
 it in general.  It is not our job to make people write correct queries.

>>>
>>> I cannot agree with the last sentence. It is questions about costs and
>>> benefits, but good tool should to make warnings when users does some
>> stupid
>>> things.
>>>
>>> It is important at this time, because complexity in IT is pretty high,
>> and
>>> a lot of users are not well trained (but well trained people can make
>>> errors too). And a lot of users have zero knowledge about technology, So
>>> when it is possible, and when it makes sense, then Postgres should be
>>> simple and safe. I think it is important for renome too. It is about
>> costs
>>> and  benefits. Good reputation is a good benefit for us too. Ordered
>>> aggregation was designed for some purposes, and should be used, when it
>> has
>>> sense.
>>
>> How many cycles do you recommend we spend on determining whether  ORDER
>> BY a, b  is sufficient but  ORDER BY a  is not?
>>
>> If we had an optimization_effort_level guc (I have often wanted that),
>> then I agree that this could be added to a very high level.  But we
>> don't, so I don't want any of it.
>>
> 
> The safeguard is mandatory ORDER BY clause.


And so you are now mandating an ORDER BY on every query and in every
aggregate and/or window function.  Users will not like that at all.  I
certainly shan't.
-- 
Vik Fearing




Re: [patch] bit XOR aggregate functions

2021-03-07 Thread Pavel Stehule
ne 7. 3. 2021 v 11:02 odesílatel Vik Fearing 
napsal:

> On 3/7/21 10:53 AM, Pavel Stehule wrote:
> > ne 7. 3. 2021 v 10:36 odesílatel Vik Fearing 
> > napsal:
> >
> >> On 3/6/21 9:06 PM, David G. Johnston wrote:
> >>> On Saturday, March 6, 2021, David Fetter  wrote:
> >>>
> 
> >> SELECT BIT_XOR(b ORDER BY a, c).../* works */
> >> SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
> >> SELECT BIT_XOR(b) FROM... /* errors out */
> >
> >
> > Why would such an error be necessary, or even desirable?
> 
>  Because there is no way to ensure that the results remain consistent
>  from one execution to the next without such a guarantee.
> 
> >>>
> >>> Numerous existing aggregate functions have this behavior.  Making those
> >>> error isn’t an option.  So is making this a special case something we
> >> want
> >>> to do (and also maybe make doing so the rule going forward)?
> >>
> >> Aside from the fact that bit_xor() does not need this, I am opposed to
> >> it in general.  It is not our job to make people write correct queries.
> >>
> >
> > I cannot agree with the last sentence. It is questions about costs and
> > benefits, but good tool should to make warnings when users does some
> stupid
> > things.
> >
> > It is important at this time, because complexity in IT is pretty high,
> and
> > a lot of users are not well trained (but well trained people can make
> > errors too). And a lot of users have zero knowledge about technology, So
> > when it is possible, and when it makes sense, then Postgres should be
> > simple and safe. I think it is important for renome too. It is about
> costs
> > and  benefits. Good reputation is a good benefit for us too. Ordered
> > aggregation was designed for some purposes, and should be used, when it
> has
> > sense.
>
> How many cycles do you recommend we spend on determining whether  ORDER
> BY a, b  is sufficient but  ORDER BY a  is not?
>
> If we had an optimization_effort_level guc (I have often wanted that),
> then I agree that this could be added to a very high level.  But we
> don't, so I don't want any of it.
>

The safeguard is mandatory ORDER BY clause.



-- 
> Vik Fearing
>


Re: [patch] bit XOR aggregate functions

2021-03-07 Thread Vik Fearing
On 3/7/21 10:53 AM, Pavel Stehule wrote:
> ne 7. 3. 2021 v 10:36 odesílatel Vik Fearing 
> napsal:
> 
>> On 3/6/21 9:06 PM, David G. Johnston wrote:
>>> On Saturday, March 6, 2021, David Fetter  wrote:
>>>

>> SELECT BIT_XOR(b ORDER BY a, c).../* works */
>> SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
>> SELECT BIT_XOR(b) FROM... /* errors out */
>
>
> Why would such an error be necessary, or even desirable?

 Because there is no way to ensure that the results remain consistent
 from one execution to the next without such a guarantee.

>>>
>>> Numerous existing aggregate functions have this behavior.  Making those
>>> error isn’t an option.  So is making this a special case something we
>> want
>>> to do (and also maybe make doing so the rule going forward)?
>>
>> Aside from the fact that bit_xor() does not need this, I am opposed to
>> it in general.  It is not our job to make people write correct queries.
>>
> 
> I cannot agree with the last sentence. It is questions about costs and
> benefits, but good tool should to make warnings when users does some stupid
> things.
> 
> It is important at this time, because complexity in IT is pretty high, and
> a lot of users are not well trained (but well trained people can make
> errors too). And a lot of users have zero knowledge about technology, So
> when it is possible, and when it makes sense, then Postgres should be
> simple and safe. I think it is important for renome too. It is about costs
> and  benefits. Good reputation is a good benefit for us too. Ordered
> aggregation was designed for some purposes, and should be used, when it has
> sense.

How many cycles do you recommend we spend on determining whether  ORDER
BY a, b  is sufficient but  ORDER BY a  is not?

If we had an optimization_effort_level guc (I have often wanted that),
then I agree that this could be added to a very high level.  But we
don't, so I don't want any of it.
-- 
Vik Fearing




Re: [patch] bit XOR aggregate functions

2021-03-07 Thread Pavel Stehule
ne 7. 3. 2021 v 10:36 odesílatel Vik Fearing 
napsal:

> On 3/6/21 9:06 PM, David G. Johnston wrote:
> > On Saturday, March 6, 2021, David Fetter  wrote:
> >
> >>
>  SELECT BIT_XOR(b ORDER BY a, c).../* works */
>  SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
>  SELECT BIT_XOR(b) FROM... /* errors out */
> >>>
> >>>
> >>> Why would such an error be necessary, or even desirable?
> >>
> >> Because there is no way to ensure that the results remain consistent
> >> from one execution to the next without such a guarantee.
> >>
> >
> > Numerous existing aggregate functions have this behavior.  Making those
> > error isn’t an option.  So is making this a special case something we
> want
> > to do (and also maybe make doing so the rule going forward)?
>
> Aside from the fact that bit_xor() does not need this, I am opposed to
> it in general.  It is not our job to make people write correct queries.
>

I cannot agree with the last sentence. It is questions about costs and
benefits, but good tool should to make warnings when users does some stupid
things.

It is important at this time, because complexity in IT is pretty high, and
a lot of users are not well trained (but well trained people can make
errors too). And a lot of users have zero knowledge about technology, So
when it is possible, and when it makes sense, then Postgres should be
simple and safe. I think it is important for renome too. It is about costs
and  benefits. Good reputation is a good benefit for us too. Ordered
aggregation was designed for some purposes, and should be used, when it has
sense.

Regards

Pavel

-- 
> Vik Fearing
>
>
>


Re: [patch] bit XOR aggregate functions

2021-03-07 Thread Vik Fearing
On 3/6/21 9:06 PM, David G. Johnston wrote:
> On Saturday, March 6, 2021, David Fetter  wrote:
> 
>>
 SELECT BIT_XOR(b ORDER BY a, c).../* works */
 SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
 SELECT BIT_XOR(b) FROM... /* errors out */
>>>
>>>
>>> Why would such an error be necessary, or even desirable?
>>
>> Because there is no way to ensure that the results remain consistent
>> from one execution to the next without such a guarantee.
>>
> 
> Numerous existing aggregate functions have this behavior.  Making those
> error isn’t an option.  So is making this a special case something we want
> to do (and also maybe make doing so the rule going forward)?

Aside from the fact that bit_xor() does not need this, I am opposed to
it in general.  It is not our job to make people write correct queries.
-- 
Vik Fearing




Re: [patch] bit XOR aggregate functions

2021-03-06 Thread David G. Johnston
On Saturday, March 6, 2021, David Fetter  wrote:

>
> > > SELECT BIT_XOR(b ORDER BY a, c).../* works */
> > > SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
> > > SELECT BIT_XOR(b) FROM... /* errors out */
> >
> >
> > Why would such an error be necessary, or even desirable?
>
> Because there is no way to ensure that the results remain consistent
> from one execution to the next without such a guarantee.
>

Numerous existing aggregate functions have this behavior.  Making those
error isn’t an option.  So is making this a special case something we want
to do (and also maybe make doing so the rule going forward)?

David J.


Re: [patch] bit XOR aggregate functions

2021-03-06 Thread David Fetter
On Sat, Mar 06, 2021 at 09:03:25PM +0100, Vik Fearing wrote:
> On 3/6/21 9:00 PM, David Fetter wrote:
> > On Sat, Mar 06, 2021 at 08:57:46PM +0100, Vik Fearing wrote:
> >> On 3/6/21 8:55 PM, David Fetter wrote:
> >>> On Wed, Mar 03, 2021 at 03:30:15PM +0100, Peter Eisentraut wrote:
>  On 10.02.21 06:42, Kyotaro Horiguchi wrote:
> > We already had CREATE AGGREATE at the time, so BIT_XOR can be
> > thought as it falls into the same category with BIT_AND and
> > BIT_OR, that is, we may have BIT_XOR as an intrinsic aggregation?
> 
>  I think the use of BIT_XOR is quite separate from BIT_AND and
>  BIT_OR. The latter give you an "all" or "any" result of the bits
>  set.  BIT_XOR will return 1 or true if an odd number of inputs are 1
>  or true, which isn't useful by itself.  But it can be used as a
>  checksum, so it seems pretty reasonable to me to add it.  Perhaps
>  the use case could be pointed out in the documentation.
> >>>
> >>> If this is the only use case, is there some way to refuse to execute
> >>> it if it doesn't contain an unambiguous ORDER BY, as illustrated
> >>> below?
> >>>
> >>> SELECT BIT_XOR(b ORDER BY a, c).../* works */
> >>> SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
> >>> SELECT BIT_XOR(b) FROM... /* errors out */
> >>
> >>
> >> Why would such an error be necessary, or even desirable?
> > 
> > Because there is no way to ensure that the results remain consistent
> > from one execution to the next without such a guarantee.
> 
> I think one of us is forgetting how XOR works.

Oops. You're right.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [patch] bit XOR aggregate functions

2021-03-06 Thread Vik Fearing
On 3/6/21 9:00 PM, David Fetter wrote:
> On Sat, Mar 06, 2021 at 08:57:46PM +0100, Vik Fearing wrote:
>> On 3/6/21 8:55 PM, David Fetter wrote:
>>> On Wed, Mar 03, 2021 at 03:30:15PM +0100, Peter Eisentraut wrote:
 On 10.02.21 06:42, Kyotaro Horiguchi wrote:
> We already had CREATE AGGREATE at the time, so BIT_XOR can be
> thought as it falls into the same category with BIT_AND and
> BIT_OR, that is, we may have BIT_XOR as an intrinsic aggregation?

 I think the use of BIT_XOR is quite separate from BIT_AND and
 BIT_OR. The latter give you an "all" or "any" result of the bits
 set.  BIT_XOR will return 1 or true if an odd number of inputs are 1
 or true, which isn't useful by itself.  But it can be used as a
 checksum, so it seems pretty reasonable to me to add it.  Perhaps
 the use case could be pointed out in the documentation.
>>>
>>> If this is the only use case, is there some way to refuse to execute
>>> it if it doesn't contain an unambiguous ORDER BY, as illustrated
>>> below?
>>>
>>> SELECT BIT_XOR(b ORDER BY a, c).../* works */
>>> SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
>>> SELECT BIT_XOR(b) FROM... /* errors out */
>>
>>
>> Why would such an error be necessary, or even desirable?
> 
> Because there is no way to ensure that the results remain consistent
> from one execution to the next without such a guarantee.

I think one of us is forgetting how XOR works.
-- 
Vik Fearing




Re: [patch] bit XOR aggregate functions

2021-03-06 Thread David Fetter
On Sat, Mar 06, 2021 at 08:57:46PM +0100, Vik Fearing wrote:
> On 3/6/21 8:55 PM, David Fetter wrote:
> > On Wed, Mar 03, 2021 at 03:30:15PM +0100, Peter Eisentraut wrote:
> >> On 10.02.21 06:42, Kyotaro Horiguchi wrote:
> >>> We already had CREATE AGGREATE at the time, so BIT_XOR can be
> >>> thought as it falls into the same category with BIT_AND and
> >>> BIT_OR, that is, we may have BIT_XOR as an intrinsic aggregation?
> >>
> >> I think the use of BIT_XOR is quite separate from BIT_AND and
> >> BIT_OR. The latter give you an "all" or "any" result of the bits
> >> set.  BIT_XOR will return 1 or true if an odd number of inputs are 1
> >> or true, which isn't useful by itself.  But it can be used as a
> >> checksum, so it seems pretty reasonable to me to add it.  Perhaps
> >> the use case could be pointed out in the documentation.
> > 
> > If this is the only use case, is there some way to refuse to execute
> > it if it doesn't contain an unambiguous ORDER BY, as illustrated
> > below?
> > 
> > SELECT BIT_XOR(b ORDER BY a, c).../* works */
> > SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
> > SELECT BIT_XOR(b) FROM... /* errors out */
> 
> 
> Why would such an error be necessary, or even desirable?

Because there is no way to ensure that the results remain consistent
from one execution to the next without such a guarantee.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [patch] bit XOR aggregate functions

2021-03-06 Thread Vik Fearing
On 3/6/21 8:55 PM, David Fetter wrote:
> On Wed, Mar 03, 2021 at 03:30:15PM +0100, Peter Eisentraut wrote:
>> On 10.02.21 06:42, Kyotaro Horiguchi wrote:
>>> We already had CREATE AGGREATE at the time, so BIT_XOR can be
>>> thought as it falls into the same category with BIT_AND and
>>> BIT_OR, that is, we may have BIT_XOR as an intrinsic aggregation?
>>
>> I think the use of BIT_XOR is quite separate from BIT_AND and
>> BIT_OR. The latter give you an "all" or "any" result of the bits
>> set.  BIT_XOR will return 1 or true if an odd number of inputs are 1
>> or true, which isn't useful by itself.  But it can be used as a
>> checksum, so it seems pretty reasonable to me to add it.  Perhaps
>> the use case could be pointed out in the documentation.
> 
> If this is the only use case, is there some way to refuse to execute
> it if it doesn't contain an unambiguous ORDER BY, as illustrated
> below?
> 
> SELECT BIT_XOR(b ORDER BY a, c).../* works */
> SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
> SELECT BIT_XOR(b) FROM... /* errors out */


Why would such an error be necessary, or even desirable?
-- 
Vik Fearing




Re: [patch] bit XOR aggregate functions

2021-03-06 Thread David Fetter
On Wed, Mar 03, 2021 at 03:30:15PM +0100, Peter Eisentraut wrote:
> On 10.02.21 06:42, Kyotaro Horiguchi wrote:
> > We already had CREATE AGGREATE at the time, so BIT_XOR can be
> > thought as it falls into the same category with BIT_AND and
> > BIT_OR, that is, we may have BIT_XOR as an intrinsic aggregation?
> 
> I think the use of BIT_XOR is quite separate from BIT_AND and
> BIT_OR. The latter give you an "all" or "any" result of the bits
> set.  BIT_XOR will return 1 or true if an odd number of inputs are 1
> or true, which isn't useful by itself.  But it can be used as a
> checksum, so it seems pretty reasonable to me to add it.  Perhaps
> the use case could be pointed out in the documentation.

If this is the only use case, is there some way to refuse to execute
it if it doesn't contain an unambiguous ORDER BY, as illustrated
below?

SELECT BIT_XOR(b ORDER BY a, c).../* works */
SELECT BIT_XOR(b) OVER (ORDER BY a, c)... /* works */
SELECT BIT_XOR(b) FROM... /* errors out */

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [patch] bit XOR aggregate functions

2021-03-06 Thread Peter Eisentraut

On 05.03.21 13:42, Alexey Bashtanov wrote:

Thanks for your reviews.
I've updated my patch to the current master and added a documentation 
line suggesting using the new function as a checksum.


committed




Re: [patch] bit XOR aggregate functions

2021-03-05 Thread Alexey Bashtanov

Hi all,

Thanks for your reviews.
I've updated my patch to the current master and added a documentation 
line suggesting using the new function as a checksum.


Best regards, Alex

On 04/03/2021 17:14, Ibrar Ahmed wrote:



On Wed, Mar 3, 2021 at 7:30 PM Peter Eisentraut 
> wrote:


On 10.02.21 06:42, Kyotaro Horiguchi wrote:
> We already had CREATE AGGREATE at the time, so BIT_XOR can be
thought
> as it falls into the same category with BIT_AND and BIT_OR, that is,
> we may have BIT_XOR as an intrinsic aggregation?

I think the use of BIT_XOR is quite separate from BIT_AND and BIT_OR.
The latter give you an "all" or "any" result of the bits set. 
BIT_XOR
will return 1 or true if an odd number of inputs are 1 or true, which
isn't useful by itself.  But it can be used as a checksum, so it
seems
pretty reasonable to me to add it.  Perhaps the use case could be
pointed out in the documentation.




Hi Alex,

The patch is failing just because of a comment, which is already 
changed by another patch


-/* Define to build with OpenSSL support. (--with-ssl=openssl) */

+/* Define to 1 if you have OpenSSL support. */


Do you mind sending an updated patch?

http://cfbot.cputube.org/patch_32_2980.log.

I am changing the status to "Waiting for Author"


In my opinion that change no more requires so I removed that and 
attached the patch.


--
Ibrar Ahmed


diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index fee0561961..dff158e99a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -19198,6 +19198,33 @@ SELECT NULLIF(value, '(none)') ...
Yes
   
 
+  
+   
+
+ bit_xor
+
+bit_xor ( smallint )
+smallint
+   
+   
+bit_xor ( integer )
+integer
+   
+   
+bit_xor ( bigint )
+bigint
+   
+   
+bit_xor ( bit )
+bit
+   
+   
+Computes the bitwise exclusive OR of all non-null input values.
+May be useful as a checksum for an unordered set of values.
+   
+   Yes
+  
+
   

 
diff --git a/src/include/catalog/pg_aggregate.dat b/src/include/catalog/pg_aggregate.dat
index 5c1f962251..0d8c5a922a 100644
--- a/src/include/catalog/pg_aggregate.dat
+++ b/src/include/catalog/pg_aggregate.dat
@@ -505,18 +505,26 @@
   aggcombinefn => 'int2and', aggtranstype => 'int2' },
 { aggfnoid => 'bit_or(int2)', aggtransfn => 'int2or', aggcombinefn => 'int2or',
   aggtranstype => 'int2' },
+{ aggfnoid => 'bit_xor(int2)', aggtransfn => 'int2xor', aggcombinefn => 'int2xor',
+  aggtranstype => 'int2' },
 { aggfnoid => 'bit_and(int4)', aggtransfn => 'int4and',
   aggcombinefn => 'int4and', aggtranstype => 'int4' },
 { aggfnoid => 'bit_or(int4)', aggtransfn => 'int4or', aggcombinefn => 'int4or',
   aggtranstype => 'int4' },
+{ aggfnoid => 'bit_xor(int4)', aggtransfn => 'int4xor', aggcombinefn => 'int4xor',
+  aggtranstype => 'int4' },
 { aggfnoid => 'bit_and(int8)', aggtransfn => 'int8and',
   aggcombinefn => 'int8and', aggtranstype => 'int8' },
 { aggfnoid => 'bit_or(int8)', aggtransfn => 'int8or', aggcombinefn => 'int8or',
   aggtranstype => 'int8' },
+{ aggfnoid => 'bit_xor(int8)', aggtransfn => 'int8xor', aggcombinefn => 'int8xor',
+  aggtranstype => 'int8' },
 { aggfnoid => 'bit_and(bit)', aggtransfn => 'bitand', aggcombinefn => 'bitand',
   aggtranstype => 'bit' },
 { aggfnoid => 'bit_or(bit)', aggtransfn => 'bitor', aggcombinefn => 'bitor',
   aggtranstype => 'bit' },
+{ aggfnoid => 'bit_xor(bit)', aggtransfn => 'bitxor', aggcombinefn => 'bitxor',
+  aggtranstype => 'bit' },
 
 # xml
 { aggfnoid => 'xmlagg', aggtransfn => 'xmlconcat2', aggtranstype => 'xml' },
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 59d2b71ca9..506689d8ac 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7995,24 +7995,36 @@
 { oid => '2237', descr => 'bitwise-or smallint aggregate',
   proname => 'bit_or', prokind => 'a', proisstrict => 'f', prorettype => 'int2',
   proargtypes => 'int2', prosrc => 'aggregate_dummy' },
+{ oid => '8452', descr => 'bitwise-xor smallint aggregate',
+  proname => 'bit_xor', prokind => 'a', proisstrict => 'f', prorettype => 'int2',
+  proargtypes => 'int2', prosrc => 'aggregate_dummy' },
 { oid => '2238', descr => 'bitwise-and integer aggregate',
   proname => 'bit_and', prokind => 'a', proisstrict => 'f',
   prorettype => 'int4', proargtypes => 'int4', prosrc => 'aggregate_dummy' },
 { oid => '2239', descr => 'bitwise-or integer aggregate',
   proname => 'bit_or', prokind => 'a', proisstrict => 'f', prorettype => 'int4',
   proargtypes => 'int4', prosrc => 'aggregate_dummy' },
+{ oid => '8453', descr => 'bitwise-xor integer aggregate',
+  proname => 'bit_xor', prokind => 'a', proisstrict => 'f', prorettype => 'int4',
+  proargtypes => 'int4', prosrc => 'aggrega

Re: [patch] bit XOR aggregate functions

2021-03-04 Thread Ibrar Ahmed
On Wed, Mar 3, 2021 at 7:30 PM Peter Eisentraut <
peter.eisentr...@enterprisedb.com> wrote:

> On 10.02.21 06:42, Kyotaro Horiguchi wrote:
> > We already had CREATE AGGREATE at the time, so BIT_XOR can be thought
> > as it falls into the same category with BIT_AND and BIT_OR, that is,
> > we may have BIT_XOR as an intrinsic aggregation?
>
> I think the use of BIT_XOR is quite separate from BIT_AND and BIT_OR.
> The latter give you an "all" or "any" result of the bits set.  BIT_XOR
> will return 1 or true if an odd number of inputs are 1 or true, which
> isn't useful by itself.  But it can be used as a checksum, so it seems
> pretty reasonable to me to add it.  Perhaps the use case could be
> pointed out in the documentation.
>
>
>
>
Hi Alex,


The patch is failing just because of a comment, which is already changed by
another patch

-/* Define to build with OpenSSL support. (--with-ssl=openssl) */

+/* Define to 1 if you have OpenSSL support. */

Do you mind sending an updated patch?

http://cfbot.cputube.org/patch_32_2980.log.

I am changing the status to "Waiting for Author"


In my opinion that change no more requires so I removed that and attached
the patch.

-- 
Ibrar Ahmed


bit-xor-agg-v002.diff
Description: Binary data


Re: [patch] bit XOR aggregate functions

2021-03-03 Thread Peter Eisentraut

On 10.02.21 06:42, Kyotaro Horiguchi wrote:

We already had CREATE AGGREATE at the time, so BIT_XOR can be thought
as it falls into the same category with BIT_AND and BIT_OR, that is,
we may have BIT_XOR as an intrinsic aggregation?


I think the use of BIT_XOR is quite separate from BIT_AND and BIT_OR. 
The latter give you an "all" or "any" result of the bits set.  BIT_XOR 
will return 1 or true if an odd number of inputs are 1 or true, which 
isn't useful by itself.  But it can be used as a checksum, so it seems 
pretty reasonable to me to add it.  Perhaps the use case could be 
pointed out in the documentation.






Re: [patch] bit XOR aggregate functions

2021-02-09 Thread Kyotaro Horiguchi
At Tue, 9 Feb 2021 15:25:19 +, Alexey Bashtanov  wrote 
in 
> I personally use it as a checksum for a large unordered set, where
> performance and simplicity is prioritized over collision resilience.
> Maybe there are other ways to use them.

FWIW the BIT_XOR can be created using CREATE AGGREGATE.

CREATE OR REPLACE AGGREGATE BIT_XOR(IN v smallint) (SFUNC = int2xor, STYPE = 
smallint);
CREATE OR REPLACE AGGREGATE BIT_XOR(IN v int4) (SFUNC = int4xor, STYPE = int4);
CREATE OR REPLACE AGGREGATE BIT_XOR(IN v bigint) (SFUNC = int8xor, STYPE = 
bigint);
CREATE OR REPLACE AGGREGATE BIT_XOR(IN v bit) (SFUNC = bitxor, STYPE = bit);

The bit_and/bit_or aggregates are back to 2004, that commit says that:

> commit 8096fe45cee42ce02e602cbea08e969139a77455
> Author: Bruce Momjian 
> Date:   Wed May 26 15:26:28 2004 +
...
>(2) bitwise integer aggregates named bit_and and bit_or for
>int2, int4, int8 and bit types. They are not standard, but I find
>them useful. I needed them once.

We already had CREATE AGGREATE at the time, so BIT_XOR can be thought
as it falls into the same category with BIT_AND and BIT_OR, that is,
we may have BIT_XOR as an intrinsic aggregation?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




[patch] bit XOR aggregate functions

2021-02-09 Thread Alexey Bashtanov

Hi,

I personally use it as a checksum for a large unordered set, where 
performance and simplicity is prioritized over collision resilience.

Maybe there are other ways to use them.

Best, Alex
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 1ab31a9056..f33358f8db 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -19192,6 +19192,32 @@ SELECT NULLIF(value, '(none)') ...
Yes
   
 
+  
+   
+
+ bit_xor
+
+bit_xor ( smallint )
+smallint
+   
+   
+bit_xor ( integer )
+integer
+   
+   
+bit_xor ( bigint )
+bigint
+   
+   
+bit_xor ( bit )
+bit
+   
+   
+Computes the bitwise exclusive OR of all non-null input values.
+   
+   Yes
+  
+
   

 
diff --git a/src/include/catalog/pg_aggregate.dat b/src/include/catalog/pg_aggregate.dat
index 5c1f962251..0d8c5a922a 100644
--- a/src/include/catalog/pg_aggregate.dat
+++ b/src/include/catalog/pg_aggregate.dat
@@ -505,18 +505,26 @@
   aggcombinefn => 'int2and', aggtranstype => 'int2' },
 { aggfnoid => 'bit_or(int2)', aggtransfn => 'int2or', aggcombinefn => 'int2or',
   aggtranstype => 'int2' },
+{ aggfnoid => 'bit_xor(int2)', aggtransfn => 'int2xor', aggcombinefn => 'int2xor',
+  aggtranstype => 'int2' },
 { aggfnoid => 'bit_and(int4)', aggtransfn => 'int4and',
   aggcombinefn => 'int4and', aggtranstype => 'int4' },
 { aggfnoid => 'bit_or(int4)', aggtransfn => 'int4or', aggcombinefn => 'int4or',
   aggtranstype => 'int4' },
+{ aggfnoid => 'bit_xor(int4)', aggtransfn => 'int4xor', aggcombinefn => 'int4xor',
+  aggtranstype => 'int4' },
 { aggfnoid => 'bit_and(int8)', aggtransfn => 'int8and',
   aggcombinefn => 'int8and', aggtranstype => 'int8' },
 { aggfnoid => 'bit_or(int8)', aggtransfn => 'int8or', aggcombinefn => 'int8or',
   aggtranstype => 'int8' },
+{ aggfnoid => 'bit_xor(int8)', aggtransfn => 'int8xor', aggcombinefn => 'int8xor',
+  aggtranstype => 'int8' },
 { aggfnoid => 'bit_and(bit)', aggtransfn => 'bitand', aggcombinefn => 'bitand',
   aggtranstype => 'bit' },
 { aggfnoid => 'bit_or(bit)', aggtransfn => 'bitor', aggcombinefn => 'bitor',
   aggtranstype => 'bit' },
+{ aggfnoid => 'bit_xor(bit)', aggtransfn => 'bitxor', aggcombinefn => 'bitxor',
+  aggtranstype => 'bit' },
 
 # xml
 { aggfnoid => 'xmlagg', aggtransfn => 'xmlconcat2', aggtranstype => 'xml' },
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 4e0c9be58c..95577094bb 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -7992,24 +7992,36 @@
 { oid => '2237', descr => 'bitwise-or smallint aggregate',
   proname => 'bit_or', prokind => 'a', proisstrict => 'f', prorettype => 'int2',
   proargtypes => 'int2', prosrc => 'aggregate_dummy' },
+{ oid => '8452', descr => 'bitwise-xor smallint aggregate',
+  proname => 'bit_xor', prokind => 'a', proisstrict => 'f', prorettype => 'int2',
+  proargtypes => 'int2', prosrc => 'aggregate_dummy' },
 { oid => '2238', descr => 'bitwise-and integer aggregate',
   proname => 'bit_and', prokind => 'a', proisstrict => 'f',
   prorettype => 'int4', proargtypes => 'int4', prosrc => 'aggregate_dummy' },
 { oid => '2239', descr => 'bitwise-or integer aggregate',
   proname => 'bit_or', prokind => 'a', proisstrict => 'f', prorettype => 'int4',
   proargtypes => 'int4', prosrc => 'aggregate_dummy' },
+{ oid => '8453', descr => 'bitwise-xor integer aggregate',
+  proname => 'bit_xor', prokind => 'a', proisstrict => 'f', prorettype => 'int4',
+  proargtypes => 'int4', prosrc => 'aggregate_dummy' },
 { oid => '2240', descr => 'bitwise-and bigint aggregate',
   proname => 'bit_and', prokind => 'a', proisstrict => 'f',
   prorettype => 'int8', proargtypes => 'int8', prosrc => 'aggregate_dummy' },
 { oid => '2241', descr => 'bitwise-or bigint aggregate',
   proname => 'bit_or', prokind => 'a', proisstrict => 'f', prorettype => 'int8',
   proargtypes => 'int8', prosrc => 'aggregate_dummy' },
+{ oid => '8454', descr => 'bitwise-xor bigint aggregate',
+  proname => 'bit_xor', prokind => 'a', proisstrict => 'f', prorettype => 'int8',
+  proargtypes => 'int8', prosrc => 'aggregate_dummy' },
 { oid => '2242', descr => 'bitwise-and bit aggregate',
   proname => 'bit_and', prokind => 'a', proisstrict => 'f', prorettype => 'bit',
   proargtypes => 'bit', prosrc => 'aggregate_dummy' },
 { oid => '2243', descr => 'bitwise-or bit aggregate',
   proname => 'bit_or', prokind => 'a', proisstrict => 'f', prorettype => 'bit',
   proargtypes => 'bit', prosrc => 'aggregate_dummy' },
+{ oid => '8455', descr => 'bitwise-xor bit aggregate',
+  proname => 'bit_xor', prokind => 'a', proisstrict => 'f', prorettype => 'bit',
+  proargtypes => 'bit', prosrc => 'aggregate_dummy' },
 
 # formerly-missing interval + datetime operators
 { oid => '2546',
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 55ca