Re: [WIP] UNNEST(REFCURSOR): allowing SELECT to consume data from a REFCURSOR

2019-09-20 Thread Roman Pekar
Hi John,

Thanks for pushing this, for me it looks like promising start! I need a bit
more time to go through the code (and I'm not an expert in Postgres
internals in any way) but I really appreciate you doing this.

Roman


Re: (select query)/relation as first class citizen

2019-08-19 Thread Roman Pekar
Hi, John,

I think you've outlined the problem and possible solutions quite well. It's
great to see that the goal might be not that far from implementing.


Re: (select query)/relation as first class citizen

2019-07-08 Thread Roman Pekar
Hi,

what do you think about this idea in general? If you don't have to think
about implementation for now? From my point of view writing Sql queries is
very close to how functional language work if you treat "select" queries as
functions without side-effects, and having query being first-class-citizen
could move this even further.

Regards,
Roman

On Sun, 7 Jul 2019 at 16:22, Roman Pekar  wrote:

> Hi,
>
> Yes, I'm thinking about 'query like a view', 'query like a cursor' is
> probably possible even now in ms sql server (not sure about postgresql),
> but it requires this paradygm shift from set-based thinking to row-by-row
> thinking which I'd not want to do.
>
> I completely agree with your points of plan caching and static checks.
> With static checks, though it might be possible to do if the query would be
> defined as typed, so all the types of the columns is known in advance.
> In certain cases having possibility of much better decomposition is might
> be more important than having cached plan. Not sure how often these cases
> appear in general, but personally for me it'd be awesome to have this
> possibility.
>
> Regards,
> Roman Pekar
>
> On Sun, 7 Jul 2019 at 15:39, Pavel Stehule 
> wrote:
>
>> Hi
>>
>> ne 7. 7. 2019 v 14:54 odesílatel Roman Pekar 
>> napsal:
>>
>>> Hello,
>>>
>>> Just a bit of background - I currently work as a full-time db developer,
>>> mostly with Ms Sql server but I like Postgres a lot, especially because I
>>> really program in sql all the time and type system / plpgsql language of
>>> Postgres seems to me more suitable for actual programming then t-sql.
>>>
>>> Here's the problem - current structure of the language doesn't allow to
>>> decompose the code well and split calculations and data into different
>>> modules.
>>>
>>> For example. Suppose I have a table employee and I have a function like
>>> this (I'll skip definition of return types for the sake of simplicity):
>>>
>>> create function departments_salary ()
>>> returns  table (...)
>>> as
>>> return $$
>>> select department, sum(salary) as salary from employee group by
>>> department;
>>> $$;
>>>
>>> so that's fine, but what if I want to run this function on filtered
>>> employee? I can adjust the function of course, but it implies I can predict
>>> all possible filters I'm going to need in the future.
>>> And logically, function itself doesn't have to be run on employee table,
>>> anything with department and salary columns will fit.
>>> So it'd be nice to be able to define the function like this:
>>>
>>> create function departments_salary(_employee query)
>>> returns table (...)
>>> as
>>> return $$
>>> select department, sum(salary) as salary from _employee group by
>>> department;
>>> $$;
>>>
>>> and then call it like this:
>>>
>>> declare _employee query;
>>> ...
>>> _poor_employee = (select salary, department from employee where salary <
>>> 1000);
>>> select * from  departments_salary( _poor_employee);
>>>
>>> And just to be clear, the query is not really invoked until the last
>>> line, so re-assigning _employee variable is more like building query
>>> expression.
>>>
>>> As far as I understand the closest way to do this is to put the data
>>> into temporary table and use this temporary table inside of the function.
>>> It's not exactly the same of course, cause in case of temporary tables data
>>> should be transferred to temporary table, while it will might be filtered
>>> later. So it's something like array vs generator in python, or List vs
>>> IQueryable in C#.
>>>
>>> Adding this functionality will allow much better decomposition of the
>>> program's logic.
>>> What do you think about the idea itself? If you think the idea is
>>> worthy, is it even possible to implement it?
>>>
>>
>> If we talk about plpgsql, then I afraid so this idea can disallow plan
>> caching - or significantly increase the cost of plan cache.
>>
>> There are two possibilities of implementation - a) query like cursor -
>> unfortunately it effectively disables any optimization and it carry ORM
>> performance to procedures. This usage is known performance antipattern, b)
>> query like view - it should not to have a performance problems with late
>> optimization, but I am not sure about possibility to reuse execution plans.
>>
>> Currently PLpgSQL is compromise between performance and dynamic (PLpgSQL
>> is really static language). Your proposal increase much more dynamic
>> behave, but performance can be much more worse.
>>
>> More - with this behave, there is not possible to do static check - so
>> you have to find bugs only at runtime. I afraid about performance of this
>> solution.
>>
>> Regards
>>
>> Pavel
>>
>>
>>
>>> Regards,
>>> Roman Pekar
>>>
>>>
>>>


Re: (select query)/relation as first class citizen

2019-07-07 Thread Roman Pekar
 Hi,

Yes, I'm thinking about 'query like a view', 'query like a cursor' is
probably possible even now in ms sql server (not sure about postgresql),
but it requires this paradygm shift from set-based thinking to row-by-row
thinking which I'd not want to do.

I completely agree with your points of plan caching and static checks. With
static checks, though it might be possible to do if the query would be
defined as typed, so all the types of the columns is known in advance.
In certain cases having possibility of much better decomposition is might
be more important than having cached plan. Not sure how often these cases
appear in general, but personally for me it'd be awesome to have this
possibility.

Regards,
Roman Pekar

On Sun, 7 Jul 2019 at 15:39, Pavel Stehule  wrote:

> Hi
>
> ne 7. 7. 2019 v 14:54 odesílatel Roman Pekar 
> napsal:
>
>> Hello,
>>
>> Just a bit of background - I currently work as a full-time db developer,
>> mostly with Ms Sql server but I like Postgres a lot, especially because I
>> really program in sql all the time and type system / plpgsql language of
>> Postgres seems to me more suitable for actual programming then t-sql.
>>
>> Here's the problem - current structure of the language doesn't allow to
>> decompose the code well and split calculations and data into different
>> modules.
>>
>> For example. Suppose I have a table employee and I have a function like
>> this (I'll skip definition of return types for the sake of simplicity):
>>
>> create function departments_salary ()
>> returns  table (...)
>> as
>> return $$
>> select department, sum(salary) as salary from employee group by
>> department;
>> $$;
>>
>> so that's fine, but what if I want to run this function on filtered
>> employee? I can adjust the function of course, but it implies I can predict
>> all possible filters I'm going to need in the future.
>> And logically, function itself doesn't have to be run on employee table,
>> anything with department and salary columns will fit.
>> So it'd be nice to be able to define the function like this:
>>
>> create function departments_salary(_employee query)
>> returns table (...)
>> as
>> return $$
>> select department, sum(salary) as salary from _employee group by
>> department;
>> $$;
>>
>> and then call it like this:
>>
>> declare _employee query;
>> ...
>> _poor_employee = (select salary, department from employee where salary <
>> 1000);
>> select * from  departments_salary( _poor_employee);
>>
>> And just to be clear, the query is not really invoked until the last
>> line, so re-assigning _employee variable is more like building query
>> expression.
>>
>> As far as I understand the closest way to do this is to put the data into
>> temporary table and use this temporary table inside of the function. It's
>> not exactly the same of course, cause in case of temporary tables data
>> should be transferred to temporary table, while it will might be filtered
>> later. So it's something like array vs generator in python, or List vs
>> IQueryable in C#.
>>
>> Adding this functionality will allow much better decomposition of the
>> program's logic.
>> What do you think about the idea itself? If you think the idea is worthy,
>> is it even possible to implement it?
>>
>
> If we talk about plpgsql, then I afraid so this idea can disallow plan
> caching - or significantly increase the cost of plan cache.
>
> There are two possibilities of implementation - a) query like cursor -
> unfortunately it effectively disables any optimization and it carry ORM
> performance to procedures. This usage is known performance antipattern, b)
> query like view - it should not to have a performance problems with late
> optimization, but I am not sure about possibility to reuse execution plans.
>
> Currently PLpgSQL is compromise between performance and dynamic (PLpgSQL
> is really static language). Your proposal increase much more dynamic
> behave, but performance can be much more worse.
>
> More - with this behave, there is not possible to do static check - so you
> have to find bugs only at runtime. I afraid about performance of this
> solution.
>
> Regards
>
> Pavel
>
>
>
>> Regards,
>> Roman Pekar
>>
>>
>>


(select query)/relation as first class citizen

2019-07-07 Thread Roman Pekar
Hello,

Just a bit of background - I currently work as a full-time db developer,
mostly with Ms Sql server but I like Postgres a lot, especially because I
really program in sql all the time and type system / plpgsql language of
Postgres seems to me more suitable for actual programming then t-sql.

Here's the problem - current structure of the language doesn't allow to
decompose the code well and split calculations and data into different
modules.

For example. Suppose I have a table employee and I have a function like
this (I'll skip definition of return types for the sake of simplicity):

create function departments_salary ()
returns  table (...)
as
return $$
select department, sum(salary) as salary from employee group by
department;
$$;

so that's fine, but what if I want to run this function on filtered
employee? I can adjust the function of course, but it implies I can predict
all possible filters I'm going to need in the future.
And logically, function itself doesn't have to be run on employee table,
anything with department and salary columns will fit.
So it'd be nice to be able to define the function like this:

create function departments_salary(_employee query)
returns table (...)
as
return $$
select department, sum(salary) as salary from _employee group by
department;
$$;

and then call it like this:

declare _employee query;
...
_poor_employee = (select salary, department from employee where salary <
1000);
select * from  departments_salary( _poor_employee);

And just to be clear, the query is not really invoked until the last line,
so re-assigning _employee variable is more like building query expression.

As far as I understand the closest way to do this is to put the data into
temporary table and use this temporary table inside of the function. It's
not exactly the same of course, cause in case of temporary tables data
should be transferred to temporary table, while it will might be filtered
later. So it's something like array vs generator in python, or List vs
IQueryable in C#.

Adding this functionality will allow much better decomposition of the
program's logic.
What do you think about the idea itself? If you think the idea is worthy,
is it even possible to implement it?

Regards,
Roman Pekar