Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Dimitry Sibiryakov

01.10.2018 16:27, Adriano dos Santos Fernandes wrote:

Usage of FETCH seems good here, then I believe the production of rows
would need a YIELD .


  Why to invent a new syntax instead of assignment + SUSPEND which everybody 
got used to?

--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Adriano dos Santos Fernandes
On 30/09/2018 06:25, Dimitry Sibiryakov wrote:
> 30.09.2018 4:52, Adriano dos Santos Fernandes wrote:
>> - Instead of agg_finished, adjust SUSPEND (probably with another
>> keyword) to somehing like: SUSPEND WHEN FETCHED DO  WHEN
>> FINISHED DO 
>
>   Isn't it how FETCH is used to be used:
>
>   WHILE (1 = 1) DO
>   BEGIN
>     FETCH C INTO :RNAME;
>  IF (ROW_COUNT = 0) THEN
>    LEAVE;
>   END
>
Usage of FETCH seems good here, then I believe the production of rows
would need a YIELD .

And since FETCH and YIELD will be loose coupled, I think subsequent
fetches without YIELD should return the last yielded value.

This is weird, but maybe acceptable for one not following a standard
approach.


Adriano



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Adriano dos Santos Fernandes
On 01/10/2018 10:49, Dimitry Sibiryakov wrote:
> 01.10.2018 15:35, Adriano dos Santos Fernandes wrote:
>> A single (aggregate) function call should not produce (describe) more
>> than one column in a query.
>
>   Where is such limitation described except current codebase?
>
>
avg_deviation(x) + 10 -- wtf


Adriano



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Dimitry Sibiryakov

01.10.2018 15:35, Adriano dos Santos Fernandes wrote:

A single (aggregate) function call should not produce (describe) more
than one column in a query.


  Where is such limitation described except current codebase?


--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Adriano dos Santos Fernandes
On 01/10/2018 09:46, Alex Peshkoff via Firebird-devel wrote:
> On 10/01/18 12:44, Dimitry Sibiryakov wrote:
>> 01.10.2018 11:24, Pavel Cisar wrote:
>>> Aggregates could produce various single-value outputs
>>
>>   Why to limit them to single value? Some aggregates could profit if
>> they are able to return more values at once. For example, avg +
>> deviation.
>>
>
> That really can improve performance. Specially interesting when
> dealing with emulated decfloat datatype.
> And I see no tech difference with
>   create aggregate function ...
> and
>   create aggregate procedure ...
> Procedure's body may follow exactly same rules.
>

A single (aggregate) function call should not produce (describe) more
than one column in a query.

Two-functions-in-one could be supported with ROW type. When ROW type is
implemented, it should be implemented to work in all cases supporting
others types, like standard function and aggregate functions.


Adriano



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Alex Peshkoff via Firebird-devel

On 10/01/18 12:44, Dimitry Sibiryakov wrote:

01.10.2018 11:24, Pavel Cisar wrote:

Aggregates could produce various single-value outputs


  Why to limit them to single value? Some aggregates could profit if 
they are able to return more values at once. For example, avg + 
deviation.




That really can improve performance. Specially interesting when dealing 
with emulated decfloat datatype.

And I see no tech difference with
  create aggregate function ...
and
  create aggregate procedure ...
Procedure's body may follow exactly same rules.




Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Dimitry Sibiryakov

01.10.2018 12:41, Adriano dos Santos Fernandes wrote:

Saying that aggregate functions should return more than one data with a
feature different than ROW type, is like saying our function (and any
language's function) is broken.


  No thing is broken. ROW type is just a syntax sugar. But if you insist, you can put 
implementation of user-defined aggregates off until someone implement this ROW type.



--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Simonov Denis via Firebird-devel
Dimitry Sibiryakov  wrote Mon, 01 Oct 2018 13:28:44  
+0300:



01.10.2018 12:22, Simonov Denis via Firebird-devel wrote:
Current SPs do not know how to aggregate data by an external cursor. To  
do this, you need to write queries inside the SP.


   How is this related to (in)ability to return several values at once?

create aggregate function custom_avg (i double precision) returns (o
double precision, dev double precision) as .

select custom_avg(x) from t; <-- this query can return two fields w/o  
any problem no matter how the function access input data.





Dmitry, do not go beyond the context of the discussion. Here, user-defined  
aggregate functions are discussed. None of the built-in aggregate  
functions can do this either. And yes, you did not give an example of how  
this should look for universal stored aggregation procedures.


--
Simonov Denis



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Adriano dos Santos Fernandes
On 01/10/2018 07:28, Dimitry Sibiryakov wrote:
> 01.10.2018 12:22, Simonov Denis via Firebird-devel wrote:
>> Current SPs do not know how to aggregate data by an external cursor.
>> To do this, you need to write queries inside the SP.
>
>   How is this related to (in)ability to return several values at once?
>
> create aggregate function custom_avg (i double precision) returns (o
> double precision, dev double precision) as .
>
> select custom_avg(x) from t; <-- this query can return two fields w/o
> any problem no matter how the function access input data.
>
>
Saying that aggregate functions should return more than one data with a
feature different than ROW type, is like saying our function (and any
language's function) is broken.

This is highly non-sense.


Adriano



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Dimitry Sibiryakov

01.10.2018 12:22, Simonov Denis via Firebird-devel wrote:
Current SPs do not know how to aggregate data by an external cursor. To do this, you need 
to write queries inside the SP.


  How is this related to (in)ability to return several values at once?

create aggregate function custom_avg (i double precision) returns (o
double precision, dev double precision) as .

select custom_avg(x) from t; <-- this query can return two fields w/o any problem no 
matter how the function access input data.



--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Simonov Denis via Firebird-devel
Dimitry Sibiryakov  wrote Mon, 01 Oct 2018 13:12:36  
+0300:



01.10.2018 12:07, Simonov Denis via Firebird-devel wrote:
To return more than one value you need support for ROW types.  
Otherwise, the function still does not extract more than one value.


   Current SP can return several values without any new type.




Current SPs do not know how to aggregate data by an external cursor. To do  
this, you need to write queries inside the SP.



--
Simonov Denis



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Simonov Denis via Firebird-devel
Adriano dos Santos Fernandes  wrote Sat, 29 Sep 2018  
22:55:52 +0300:



So SUSPEND in this context is very like JS YIELD with caller doing a  
reduce.


I have even thought in use YIELD keyword in this context.


Adriano



The keyword YEILD is more preferable for aggregate functions. At the same  
time, the return type can be written as usual for functions, not  
procedures.


create aggregate function custom_count (i integer) returns integer
as
declare accumulated integer cnt;
begin
cnt = 0;

while (not agg_finished)
do
begin
if (i is not null) then
cnt = cnt + 1;
yeild cnt;
end
end

create aggregate function custom_avg (i double precision) returns double  
precision

as
declare accumulated cnt integer = 0;
declare accumulated double precision = 0;
declare result double precision = 0;
begin
while (not agg_finished)
do
begin
if (i is not null) then
begin
count = count + 1;
accumulated = accumulated + i;
result = accumulated / count;
end
yeild result;
end
end

--
Simonov Denis



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Dimitry Sibiryakov

01.10.2018 12:07, Simonov Denis via Firebird-devel wrote:
To return more than one value you need support for ROW types. Otherwise, the function 
still does not extract more than one value.


  Current SP can return several values without any new type.


--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Simonov Denis via Firebird-devel
Dimitry Sibiryakov  wrote Mon, 01 Oct 2018 12:44:58  
+0300:



01.10.2018 11:24, Pavel Cisar wrote:

Aggregates could produce various single-value outputs


   Why to limit them to single value? Some aggregates could profit if  
they are able to return more values at once. For example, avg +  
deviation.





To return more than one value you need support for ROW types. Otherwise,  
the function still does not extract more than one value.



--
Simonov Denis



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Dimitry Sibiryakov

01.10.2018 11:24, Pavel Cisar wrote:

Aggregates could produce various single-value outputs


  Why to limit them to single value? Some aggregates could profit if they are able to 
return more values at once. For example, avg + deviation.



--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-10-01 Thread Pavel Cisar

Hi,

Aggregate UDF's are nice, but I think it's a mistake to think about them 
only in classic COUNT(), AVG(), SUM() etc. context. Why we don't look at 
them in more general way?


Aggregates loop over input set and produce single value. Right now we do 
UDF-like aggregates inside PSQL SP using FOR SELECT loops or CURSORs. 
Moving the cursor definition up from SP/Func body to the caller would 
not only extend the flexibility of this aggregating procedures, but 
would allows seamless integration into GROUP BY and other contexts 
suitable for aggregation tasks.


If we would introduce new parameter type -> the cursor, it would be 
perfect fit for enhanced FOR loop that could accept the cursor variable 
instead literal SQL statement. No need for SUSPEND, the UDF execution 
would be self-contained and with single invocation within given context. 
The input parameter could be provided by engine (as part of group by or 
other processing) or other UDF (we can already create cursors in PSQL 
using DECLARE CURSOR, and if we would extend it to allow creation of 
unassigned cursor variable and the possibility to store the EXECUTE 
STATEMENT handle into such CURSOR variable, it could be even more 
powerful). That way it could be easily comprehended general feature with 
simple rules and methods of invocation that could be used in various 
contexts and for various purposes.


Aggregates could produce various single-value outputs, for example 
frequency distributions using an array, percent represented by string 
(number of # in fixed count of chars, as literal string etc.), they 
could produce hashes, JSON packets, URI path names etc.


best regards
Pavel Cisar


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-29 Thread Adriano dos Santos Fernandes
I think four adjustments makes the initial syntax better:
- Allow usage of more than one input parameter. Maybe even zero would be
allowed.
- Instead of have special semantics for empty data source, declare the
constant to be returned for it, say: create aggregate function my_count
returning 0 for empty data source  If this clause is omitted, null is
returned for empty data source.
- A clause to ignore null input values. If used, engine will filter out
null input and not pass to the routine. This need to declare
what.parameters shluld be considered
- Instead of agg_finished, adjust SUSPEND (probably with another keyword)
to somehing like: SUSPEND WHEN FETCHED DO  WHEN FINISHED DO



Adriano

(Sorry to not be more detailed, I'm writing from smartphone)
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-29 Thread Adriano dos Santos Fernandes
On Sat, Sep 29, 2018, 12:14 Dimitry Sibiryakov  wrote:

> 29.09.2018 16:30, Dmitry Yemanov wrote:
> > AFAIU, Adriano's point is more about technology (use stall mode for the
> function request),
> > syntax may be adjusted. We may choose a different keyword, if necessary
> (as a syntax sugar
> > for the same BLR verb).
>
>Ok. How simple is this technology from engine POV if there are several
> user-defined
> aggregates in a query?
>One function fetched a row from result set, then another do the same.
> Wouldn't be
> non-cached result set in troubles?
>What engine will do if one of functions return without calling SUSPEND?
>


It's the engine who controls the moving of rows in the dataset. It moves a
row and continue each aggregate function and so on.

It already does this for internal aggregate functions.


Adriano
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-29 Thread Adriano dos Santos Fernandes
On Sat, Sep 29, 2018, 10:02 Mark Rotteveel  wrote:

>
> > It's also the way modern languages implemented the same concept
> > recently, say JavaScript (ecmascript) and C# with yield.
>
> Do you know of an equivalent example in those languages, because I can't
> easily think of an equivalent aggregate function in a similar style in
> either JavaScript or C#.
>

JavaScript "yield" could also receive data sent by the caller. It does not
aggregate (reduce number of rows) directly, but if you think well, at least
the concept in Firebird should not aggregate directly, it should produce
accumulated aggregation (necessary for window function running
aggregation), so in fact is the engine that reduce numbers of rows.

So SUSPEND in this context is very like JS YIELD with caller doing a reduce.

I have even thought in use YIELD keyword in this context.


Adriano
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-29 Thread Dimitry Sibiryakov

29.09.2018 16:30, Dmitry Yemanov wrote:
AFAIU, Adriano's point is more about technology (use stall mode for the function request), 
syntax may be adjusted. We may choose a different keyword, if necessary (as a syntax sugar 
for the same BLR verb).


  Ok. How simple is this technology from engine POV if there are several user-defined 
aggregates in a query?
  One function fetched a row from result set, then another do the same. Wouldn't be 
non-cached result set in troubles?

  What engine will do if one of functions return without calling SUSPEND?


--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-29 Thread Dmitry Yemanov

29.09.2018 16:57, Dimitry Sibiryakov wrote:

I don't think that to bring completely new meaning ("pull a row from 
set") to existing well-known keyword SUSPEND (used to be "push a row 
into result set") is a right way to "integrate".


AFAIU, Adriano's point is more about technology (use stall mode for the 
function request), syntax may be adjusted. We may choose a different 
keyword, if necessary (as a syntax sugar for the same BLR verb).



Dmitry


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-29 Thread Dimitry Sibiryakov

29.09.2018 14:32, Adriano dos Santos Fernandes wrote:

The focus is to have a feature well integrated with Firebird others features.


  I don't think that to bring completely new meaning ("pull a row from set") to existing 
well-known keyword SUSPEND (used to be "push a row into result set") is a right way to 
"integrate".



--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-29 Thread Mark Rotteveel

On 29-9-2018 14:02, Adriano dos Santos Fernandes wrote:
Mark, saying that an object-based interface is better to store state or 
the way to support external routines is disregard how good Firebird 
selectable procedures are.


Selectable procedures stores intermediary states in a very elegant way.


I am worried that it disallows (or at least complicates) more complex 
implementations of aggregate functions, and may not work so great when 
used with sliding windows. My first aggregate proposal wouldn't work 
well for that either, the second proposal could possibly be extended in 
a similar manner as PostgreSQL does.


However, I have to admit, not having written a lot of aggregate 
functions, I'm having a hard time coming up with good examples to be 
able to consider both designs.


It's also the way modern languages implemented the same concept 
recently, say JavaScript (ecmascript) and C# with yield.


Do you know of an equivalent example in those languages, because I can't 
easily think of an equivalent aggregate function in a similar style in 
either JavaScript or C#.


Aggregate functions are a combinations of an accumulator and a 
'finisher' (and maybe a combiner if you want to be able to parallelize), 
and trying to combine those two is more complex (and I think it makes it 
brittle to write, so more susceptible to bugs).


My second proposal will make it purely functional, separating the 
concerns of accumulating and finishing.


My propose is to use the same elegant concept for aggregation. Your 
example seems much more confusing. It makes sense thinking on 
interfaces, but very different from Firebird way of doing things.


It may need to have a different syntax, although I tried to follow the 
package body syntax in my example for consistency.


Also, it does not use pass-by-reference concept. It uses underlying 
Firebird mechanism of messages, the same used for initial parameters 
values and return values. PSQL syntax may be changed to.make it more 
clear that it's not a parameter by reference.


From the perspective of the one writing the PSQL function, it does seem 
to be using a pass by reference: the value of the parameter can be 
changed from outside while within the loop, which is something entirely 
new for PSQL.


Mark
--
Mark Rotteveel


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-29 Thread Adriano dos Santos Fernandes
It's incredible small the number of custom aggregate functions overall in
projects, nor I think the focus of this feature is to make others DBMS
users migrate to Firebird due to it, specially considering that their
underlying mechanism are very different than what Firebird has.

The focus is to have a feature well integrated with Firebird others
features. That feature is going to be used by advanced Firebird users, we
known.

It's a very rarely used (develop the routine) but with high benefits
(usage).


Adriano

On Sat, Sep 29, 2018, 09:23 Dimitry Sibiryakov  wrote:

> 29.09.2018 14:02, Adriano dos Santos Fernandes wrote:
> > My propose is to use the same elegant concept for aggregation. Your
> example seems much
> > more confusing. It makes sense thinking on interfaces, but very
> different from Firebird
> > way of doing things.
>
>Yes, but being similar to others (PG or Oracle) is a good thing for
> migration.
>
>
> --
>WBR, SD.
>
>
> Firebird-Devel mailing list, web interface at
> https://lists.sourceforge.net/lists/listinfo/firebird-devel
>
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-29 Thread Dimitry Sibiryakov

29.09.2018 14:02, Adriano dos Santos Fernandes wrote:
My propose is to use the same elegant concept for aggregation. Your example seems much 
more confusing. It makes sense thinking on interfaces, but very different from Firebird 
way of doing things.


  Yes, but being similar to others (PG or Oracle) is a good thing for migration.


--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-29 Thread Adriano dos Santos Fernandes
Mark, saying that an object-based interface is better to store state or the
way to support external routines is disregard how good Firebird selectable
procedures are.

Selectable procedures stores intermediary states in a very elegant way.

It's also the way modern languages implemented the same concept recently,
say JavaScript (ecmascript) and C# with yield.

My propose is to use the same elegant concept for aggregation. Your example
seems much more confusing. It makes sense thinking on interfaces, but very
different from Firebird way of doing things.

Also, it does not use pass-by-reference concept. It uses underlying
Firebird mechanism of messages, the same used for initial parameters values
and return values. PSQL syntax may be changed to.make it more clear that
it's not a parameter by reference.

External procedures are already supported with intermediary states. I'm
sure the interface for external aggregate function would be different than
standard functions and procedures, but not very different. Just that, each
functionality with their required interfaces.

For grouping set support, it's something I must investigate better. I
sometimes use it in Oracle, but did not thought about it in this spec or
even for ours internal aggregate functions.


Adriano


On Sat, Sep 29, 2018, 04:54 Mark Rotteveel  wrote:

> On 28-9-2018 18:34, Adriano dos Santos Fernandes wrote:
> > Hi!
> >
> > I propose creation of user-defined aggregate functions.
> >
> > Lets initially define the requirements for it:
> > - It should work with empty data set too, in case of use without GROUP
> > BY clause: i.e., COUNT results in 0, while most others functions result
> > in NULL
> > - It should also produce "running (or accumulated) aggregation", like
> > SUM(...) OVER (ORDER BY ...)
> >
> > What is not proposed here:
> > - Make a single function work with different source/target data types.
> > Custom aggregate function is statically typed as normal functions.
> >
> > Design is open if good alternatives are presented. Initially I think:
> >
> > - There should be one input parameter and one output parameter.
> > - SUSPEND is used to make engine feed the input parameter with a new
> > value, setting AGG_FINISHED to TRUE/FALSE, and returning running
> > (accumulated) aggregation value.
> > - Empty data set makes the function already starts with AGG_FINISHED =
> > TRUE, and non empty data set makes the function starts with AGG_FINISHED
> > = FALSE and the first row already fed in the input parameter
> > - When AGG_FINISHED is TRUE, it's expected that user code does not call
> > SUSPEND anymore, so only for the empty data set case, the value present
> > in the output parameter when the function ends is the result of the
> > aggregate function in the query.
>
> I think the syntax below is rather verbose, and I wonder if it can
> easily be used to support all types of aggregates.
>
> It introduces some additional behavior and syntax that hasn't surfaced
> before in PSQL, like:
>
> - a pass by reference call mechanism (the input parameter) to pump new
> values in, this may make it harder to use UDRs for this (not all
> languages (easily) support such call mechanisms)
> - a named output parameter for a function, which breaks syntax
> consistency with normal stored functions
> - an externally controlled context value (agg_finished), again something
> that I don't can be easily supported in UDR
>
> Although I haven't used it, I think something like PostgreSQL's
> aggregate (see https://www.postgresql.org/docs/10/static/xaggr.html and
> https://www.postgresql.org/docs/10/static/sql-createaggregate.html) may
> be simpler, but that may require the introduction of some sort of record
> type (unless arrays are 'good enough') to be able to maintain and pass
> more complex state. There are also some other 'bears' their as it relies
> on PostgreSQLs object support.
>
> We could also consider to have some aggregate object that has state and
> contains the functions that the engine calls for calculating the
> aggregate. In a way, this aggregate would implement an interface
> required by the engine.
>
> That interface consists of a procedure state_transition to accept the
> value(s) of the aggregate, and modifies the internal state, and a
> function final_state that returns the aggregate value.
>
> The engine creates an instance of the aggregate, and calls the
> state_transition to add values, and calls the final_state whenever it
> needs the final result.
>
> The sum example
>
> create aggregate custom_sum(i bigint) returns bigint
> as
> declare summed bigint = 0;
> begin
> procedure state_transition(i bigint)
> as
> begin
>   if (is is null) then leave;
>   summed = summed + i;
> end
>
> function final_state returns bigint
> as
>   return summed;
> end
> end;
>
> The average example. Be aware that this implementation (like yours) will
> have a precision problem for a large number of small values, and may be

Re: [Firebird-devel] User-defined aggregate functions

2018-09-29 Thread Mark Rotteveel

On 28-9-2018 20:14, Adriano dos Santos Fernandes wrote:

On 28/09/2018 15:04, Leyne, Sean wrote:



-Original Message-
From: Adriano dos Santos Fernandes 
Sent: Friday, September 28, 2018 1:55 PM
To: firebird-devel@lists.sourceforge.net
Subject: Re: [Firebird-devel] User-defined aggregate functions

On 28/09/2018 14:35, Leyne, Sean wrote:

How would this work within a GROUP BY context?

I see how it works within a flat list, but when there are intermediate levels, I

don't see this working.



Engine does all the work for groups.

Each group makes the function start - suspend (n times) - finish.

So if there are 3 GROUP levels, each function would be called 3 times, one for 
each level?



If by "group level" you mean the "group key", yes.

Each time the group key changes, the function will be executed again.

Within the group rows (or window partition), SUSPEND is used to feed
input and produce result.


Maybe Sean is talking about SQL standard grouping sets which Firebird 
doesn't support yet.


Mark
--
Mark Rotteveel


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-29 Thread Mark Rotteveel

On 28-9-2018 18:34, Adriano dos Santos Fernandes wrote:

Hi!

I propose creation of user-defined aggregate functions.

Lets initially define the requirements for it:
- It should work with empty data set too, in case of use without GROUP
BY clause: i.e., COUNT results in 0, while most others functions result
in NULL
- It should also produce "running (or accumulated) aggregation", like
SUM(...) OVER (ORDER BY ...)

What is not proposed here:
- Make a single function work with different source/target data types.
Custom aggregate function is statically typed as normal functions.

Design is open if good alternatives are presented. Initially I think:

- There should be one input parameter and one output parameter.
- SUSPEND is used to make engine feed the input parameter with a new
value, setting AGG_FINISHED to TRUE/FALSE, and returning running
(accumulated) aggregation value.
- Empty data set makes the function already starts with AGG_FINISHED =
TRUE, and non empty data set makes the function starts with AGG_FINISHED
= FALSE and the first row already fed in the input parameter
- When AGG_FINISHED is TRUE, it's expected that user code does not call
SUSPEND anymore, so only for the empty data set case, the value present
in the output parameter when the function ends is the result of the
aggregate function in the query.


I think the syntax below is rather verbose, and I wonder if it can 
easily be used to support all types of aggregates.


It introduces some additional behavior and syntax that hasn't surfaced 
before in PSQL, like:


- a pass by reference call mechanism (the input parameter) to pump new 
values in, this may make it harder to use UDRs for this (not all 
languages (easily) support such call mechanisms)
- a named output parameter for a function, which breaks syntax 
consistency with normal stored functions
- an externally controlled context value (agg_finished), again something 
that I don't can be easily supported in UDR


Although I haven't used it, I think something like PostgreSQL's 
aggregate (see https://www.postgresql.org/docs/10/static/xaggr.html and 
https://www.postgresql.org/docs/10/static/sql-createaggregate.html) may 
be simpler, but that may require the introduction of some sort of record 
type (unless arrays are 'good enough') to be able to maintain and pass 
more complex state. There are also some other 'bears' their as it relies 
on PostgreSQLs object support.


We could also consider to have some aggregate object that has state and 
contains the functions that the engine calls for calculating the 
aggregate. In a way, this aggregate would implement an interface 
required by the engine.


That interface consists of a procedure state_transition to accept the 
value(s) of the aggregate, and modifies the internal state, and a 
function final_state that returns the aggregate value.


The engine creates an instance of the aggregate, and calls the 
state_transition to add values, and calls the final_state whenever it 
needs the final result.


The sum example

create aggregate custom_sum(i bigint) returns bigint
as
   declare summed bigint = 0;
begin
   procedure state_transition(i bigint)
   as
   begin
 if (is is null) then leave;
 summed = summed + i;
   end

   function final_state returns bigint
   as
 return summed;
   end
end;

The average example. Be aware that this implementation (like yours) will 
have a precision problem for a large number of small values, and may be 
problematic for large values.


create aggregate custom_avg(i double precision) returns double precision
as
  declare current_count integer = 0;
  declare accumulated double precision = 0;
begin
  procedure state_transition(i double precision)
  as
  begin
if (is is null) then leave;
current_count = current_count + 1;
accumulated = accumulated + i;
  end

  function final_state returns double precision
return accumulated / current_count;
  end
end

Your final example (with a change to not count null):

create aggregate custom_count_plus_1000 (i integer) returns integer
as
  declare current_count integer;
begin
  procedure state_transition(i integer)
  as
  begin
if (current_count is null) then
  current_count = 0;
if (is is null) then leave;
current_count = current_count + 1;
  end

  function final_state returns integer
  as
  begin
if (current_count is null) then
   return 1000;
else
   return current_count;
  end
end

This may be extended to also allow support for moving aggregates with 
window functions (see "Moving-Aggregate Mode" in the PostgreSQL docs) by 
having additional procedures and functions.


However it looks like the way PostgreSQL passes state between functions 
may actually be better, as it would probably be easier for supporting 
grouping sets and other aggregations happening on multiple levels.


Taking the avg example again, such a form could look like:

create aggregate custom_avg(i double precision) returns double precision
as

Re: [Firebird-devel] User-defined aggregate functions

2018-09-28 Thread Adriano dos Santos Fernandes
On 28/09/2018 15:04, Leyne, Sean wrote:
>
>> -Original Message-
>> From: Adriano dos Santos Fernandes 
>> Sent: Friday, September 28, 2018 1:55 PM
>> To: firebird-devel@lists.sourceforge.net
>> Subject: Re: [Firebird-devel] User-defined aggregate functions
>>
>> On 28/09/2018 14:35, Leyne, Sean wrote:
>>> How would this work within a GROUP BY context?
>>>
>>> I see how it works within a flat list, but when there are intermediate 
>>> levels, I
>> don't see this working.
>>>
>> Engine does all the work for groups.
>>
>> Each group makes the function start - suspend (n times) - finish.
> So if there are 3 GROUP levels, each function would be called 3 times, one 
> for each level?
>
>
If by "group level" you mean the "group key", yes.

Each time the group key changes, the function will be executed again.

Within the group rows (or window partition), SUSPEND is used to feed
input and produce result.


Adriano



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-28 Thread Leyne, Sean



> -Original Message-
> From: Adriano dos Santos Fernandes 
> Sent: Friday, September 28, 2018 1:55 PM
> To: firebird-devel@lists.sourceforge.net
> Subject: Re: [Firebird-devel] User-defined aggregate functions
> 
> On 28/09/2018 14:35, Leyne, Sean wrote:
> >
> > How would this work within a GROUP BY context?
> >
> > I see how it works within a flat list, but when there are intermediate 
> > levels, I
> don't see this working.
> >
> >
> Engine does all the work for groups.
> 
> Each group makes the function start - suspend (n times) - finish.

So if there are 3 GROUP levels, each function would be called 3 times, one for 
each level?

OK.


Sean



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-28 Thread Adriano dos Santos Fernandes
On 28/09/2018 14:35, Leyne, Sean wrote:
>
> How would this work within a GROUP BY context?
>
> I see how it works within a flat list, but when there are intermediate 
> levels, I don't see this working.
>
>
Engine does all the work for groups.

Each group makes the function start - suspend (n times) - finish.


Adriano



Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-28 Thread Leyne, Sean


> -- Works as standard AVG.
> create aggregate function custom_avg (i double precision) returns (o double
> precision) as
>     declare count integer = 0;
>     declare accumulated double precision = 0; begin
>     while (not agg_finished)
>     do
>     begin
>         if (i is not null) then
>         begin
>             count = count + 1;
>             accumulated = accumulated + i;
>             o = accumulated / count;
>         end

How would this work within a GROUP BY context?

I see how it works within a flat list, but when there are intermediate levels, 
I don't see this working.


Sean


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


Re: [Firebird-devel] User-defined aggregate functions

2018-09-28 Thread Dimitry Sibiryakov

28.09.2018 18:34, Adriano dos Santos Fernandes wrote:

Design is open if good alternatives are presented. Initially I think:


  Why not simply borrow Oracle concepts as you did with packages?


--
  WBR, SD.


Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel