Re: a difficult SQL

2022-11-06 Thread Rafal Pietrak
Great, with a little tweaking (to get the remaining rows ordered 
correctly), this did the job.


Thank you Erik.

BR

-R

On 6.11.2022 at 15:23, Erik Wienhold wrote:

On 06/11/2022 13:48 CET Rafal Pietrak  wrote:

On 5.11.2022 at 19:05, Thiemo Kellner wrote:


You first could select the three users with the most recent entries with
a windowing function
(https://www.postgresql.org/docs/15/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS)


surely I'm missing something crucial here:
select row_number() over w,* from eventlog where row_number() over w < 5
   window w as (partition by user);
ERROR:  window functions are not allowed in WHERE

So I'm unable to pick a limited number of rows within the user
"group-window" ranges.

Without that, I cannot proceed.

Any suggestions?


Window functions are only permitted in SELECT and ORDER BY because they are
executed after WHERE, GROUP BY, and HAVING [1].

You need a derived table to filter on row_number:

 with ranked as (
     select *, row_number() over w
     from eventlog
     window w as (partition by user)
 )
 select *
 from ranked
 where row_number < 5;

[1] https://www.postgresql.org/docs/15/tutorial-window.html
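Rafal's follow-up mentions tweaking the query to get the rows ordered
correctly; presumably the window also needs an ORDER BY so that row_number
follows recency. A sketch of such a tweak (untested; assumes the tm column
from the original post, and quotes "user" since it is a reserved word):

```sql
with ranked as (
    -- number each user's events from most recent to oldest
    select *, row_number() over w as rn
    from eventlog
    window w as (partition by "user" order by tm desc)
)
select *
from ranked
where rn < 5      -- keep only the most recent events per user
order by tm desc;
```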

--
Erik







Re: a difficult SQL

2022-11-06 Thread Rafal Pietrak

Hi Thiemo,

Thank you for suggestions.

On 5.11.2022 at 19:05, Thiemo Kellner wrote:

Hi Rafal

You first could select the three users with the most recent entries with 
a windowing function 
(https://www.postgresql.org/docs/15/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS) 


surely I'm missing something crucial here:
select row_number() over w,* from eventlog where row_number() over w < 5 
 window w as (partition by user);

ERROR:  window functions are not allowed in WHERE

So I'm unable to pick a limited number of rows within the user 
"group-window" ranges.


Without that, I cannot proceed.

Any suggestions?

-R
PS: regarding "my sets background", yes I do "think in sets" ... as 
opposed to thinking "in functions" (like iterating procedures). I do 
prefer solutions based on set definitions.


Putting it into a WITH query
(https://www.postgresql.org/docs/15/sql-select.html): in the following WITH
queries I would select 2.1 to 2.3, each with a constant column holding a
different value that you later sort by. In the next WITH query you can
select all the rest (EXCEPT ALL,
https://www.postgresql.org/docs/15/sql-select.html#SQL-EXCEPT) minus the
results of 2.1 to 2.3, for 2.4, also with the notorious sort column. In a
last WITH query you can put the partial results for 2.1 to 2.4 together
with a UNION ALL
(https://www.postgresql.org/docs/15/sql-select.html#SQL-UNION) and sort by
the sort column and the timestamp in the final select.
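Spelled out as SQL, that outline might look roughly like the sketch below
(untested, names invented; only the 2.1 group is shown — 2.2 and 2.3 would
repeat the same pattern with grp = 2, 3 and the next-most-recent users):

```sql
with u1 as (
    -- 2.1: the 10 most recent events of the most recently active user
    select tm, "user", description, 1 as grp
    from eventlog
    where "user" = (select "user" from eventlog order by tm desc limit 1)
    order by tm desc
    limit 10
),
rest as (
    -- 2.4: everything not already shown, with the sort column
    select tm, "user", description, 99 as grp
    from (select tm, "user", description from eventlog
          except all
          select tm, "user", description from u1) r
)
select * from u1
union all
select * from rest
order by grp, tm desc;
```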


I do not know your background; however, SQL is about data sets, and it is
not always easy to get one's head around thinking in sets. I hope you can
follow my suggestions. It might not be the most efficient way, but it
should work.


Kind regards

Thiemo


On 05.11.22 at 16:10, Rafal Pietrak wrote:

Hi Everybody,

I was wondering if anybody here could help me cook up a query:

1. against a list of events (like an activity log in the database). 
The list is a single table: create table events (tm timestamp, user 
int, description text).


2. of which the output would be sorted in such a way, that:
2.1 most recent event would "select" most recent events of that same 
user, and displayed in a group (of say 10) of them (in "tm" order).


2.2 going through the events back in time, first event of ANOTHER user 
selects next group, where (say 10) most recent events of that OTHER 
user is presented.


2.3 next most recent event of yet another user selects yet another 
group to display and this selection process goes on, up to a maximum 
of (say 20) users/groups-of-their-events.


2.4 after that, all other events are selected in tm order.

This is to present most recent telephone activities grouped by most 
recent subscribers so that the dashboard doesn't get cluttered with 
information but allows for an overview of other activity of most 
recent users.


I tend to think, that it's a problem for a window function ... but 
I've stumbled on the problem how to limit the window "frame" to just a 
few (say 10) events within the "window" and have all the rest returned 
as "tail" of the query.


BTW: the eventlog table is big. (and partitioned).

Any help appreciated.

-R







a difficult SQL

2022-11-05 Thread Rafal Pietrak

Hi Everybody,

I was wondering if anybody here could help me cook up a query:

1. against a list of events (like an activity log in the database). The 
list is a single table: create table events (tm timestamp, user int, 
description text).


2. of which the output would be sorted in such a way, that:
2.1 most recent event would "select" most recent events of that same 
user, and displayed in a group (of say 10) of them (in "tm" order).


2.2 going through the events back in time, first event of ANOTHER user 
selects next group, where (say 10) most recent events of that OTHER user 
is presented.


2.3 next most recent event of yet another user selects yet another group 
to display and this selection process goes on, up to a maximum of (say 
20) users/groups-of-their-events.


2.4 after that, all other events are selected in tm order.

This is to present most recent telephone activities grouped by most 
recent subscribers so that the dashboard doesn't get cluttered with 
information but allows for an overview of other activity of most recent 
users.


I tend to think, that it's a problem for a window function ... but I've 
stumbled on the problem how to limit the window "frame" to just a few 
(say 10) events within the "window" and have all the rest returned as 
"tail" of the query.


BTW: the eventlog table is big. (and partitioned).

Any help appreciated.

-R




Re: Fwd: row level security (RLS)

2021-03-15 Thread Rafal Pietrak



On 15.03.2021 at 18:01, Laurenz Albe wrote:
> On Mon, 2021-03-15 at 16:28 +0100, Rafal Pietrak wrote:
>> 5. My experiments with RLS was like following:
>> - I've implemented a STABLE function, that returns INTEGER 1/0
>> - I've linked that function as POLICY to my tables
>> - I've GRANTED PUBLIC access to those tables
>> ---> and all works as predicted only slow (10x slower!).
>>
>> [lots of questions about how to solve this is some other way]
>>
>> Those questions come from my bad experience with POLICY performance.
> 
> You should figure out why RLS was so slow.

Yes I should... although I didn't. Partly because I thought it was
obvious (an additional function call on every row). Still, as I
mentioned in my initial post, I'm going to revisit the case in the
coming couple of days and gather more evidence.

Having said that, I'm really interested in any comments on the way I
"imagined" addressing RLS years ago (and described in the post), when I
was looking for a solution and settled on the rule system: the question
about the partition/check/role approach, irrespective of where it comes from.

Pls address the following reasoning:
1. POLICY calls a function on every row to check its visibility to the
client (for 1 mln rows, 1 mln checks).
2. The "alternative" does just one check for all the rows contained in a
particular partition (for 100 tenants, 100 checks).

No matter how hard one optimises the POLICY function, it will always lose.
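The two alternatives being compared could be sketched like this (invented
table, column, function, and role names — none of this is from the thread;
the partition variant assumes the table is list-partitioned by tenant):

```sql
-- (1) POLICY: tenant_check() is evaluated for every candidate row
--     (1 mln rows => 1 mln function calls).
create policy per_row on events
    using (tenant_check(tenant_id) = 1);

-- (2) Partition per tenant: one GRANT per partition; access is decided
--     per table at permission-check time, not per row
--     (100 tenants => 100 checks).
create table events_t42 partition of events for values in (42);
grant select on events_t42 to tenant42;
```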

Then again, I'll be back with some "ANALYSE" in a couple of days.

-R




Fwd: row level security (RLS)

2021-03-15 Thread Rafal Pietrak
Hello,

I was told, that pgsql-hackers is not the right list for the following
questions. So I'm reposting to general.

Does anybody have an opinion regarding the following questions?


---
Hello,

Forgive me for probably naive questions, being so talkative like the
following. But the less one knows the more one must explain. And I don't
know much regarding RLS.

1. Some time ago I implemented in my schema a poor man's RLS using
the rule system.

2. Like half a year ago I discovered PostgreSQL's native implementation
with policies, so I decided to give it a try.

3. To my ultimate surprise, this turned out to be like 10 times slower.
So I abandoned the project.

4. But it bugs me, one question in particular, which requires the
lengthy explanations:

5. My experiments with RLS was like following:
- I've implemented a STABLE function, that returns INTEGER 1/0
- I've linked that function as POLICY to my tables
- I've GRANTED PUBLIC access to those tables
---> and all works as predicted, only slow (10x slower!).
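A rough reconstruction of that setup, with invented names (the original
function, table, and GUC are not shown in the post, so everything below is
an assumption):

```sql
-- a STABLE function returning integer 1/0, as described
create function may_see(row_tenant int) returns integer
    language sql stable as $$
        select case when row_tenant = current_setting('app.tenant')::int
                    then 1 else 0 end
    $$;

alter table events enable row level security;

-- the function linked as a POLICY to the table
create policy p_events on events using (may_see(tenant_id) = 1);

-- PUBLIC access granted to the table
grant select on events to public;
```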

As I understand it, RLS took time to get implemented in PostgreSQL for
many reasons, one of which was the requirement to prevent "not belonging"
rows from leaking into the query results or library buffers. Eventually,
this was somehow achieved.

FMHE (for my eyes) the most striking change the policy (as of step 5)
introduces is a change from the "access denied" error, which GRANT would
raise when it declines access, to the "silent omission" which POLICY does
... IN THE SAME SITUATION.

This leads me to the following conclusions:
1. In the past (like when I was implementing poor man's RLS with rules), I
found it very useful for some GRANTs to silently omit access to an object
instead of raising an error. But this is impossible, isn't it?

2. In particular, I thought I could partition a table (using
inheritance) and do RLS via GRANT/REVOKE on individual partitions. It
would certainly hard-limit any rows leaking into library buffers,
particularly if partitions are on separate tablespaces. But
unfortunately GRANT/REVOKE does raise an error (and doesn't simply
silently ignore those not granted).

3. So, what if one could change the way GRANT/REVOKE behaves when denying
access?

4. One feature necessary for such a scenario to work is the ability to
select one particular (single) ROLE, from all the ROLEs a particular
session_user has, that would "solely" be used for the RLS checking that such
"silent GRANT/REVOKE" validates (a multitenant database). I mean here
something along the lines of: "SET ROLE  [FOR RLS]".

5. The above should come in a pair with "CHECK (RLS = )" at the partition
level. This way, when the PostgreSQL session does NOT HAVE the "role for
RLS" set, all GRANT/REVOKE would work as usual, i.e. ignore that CHECK
and normally raise "access denied".

IMHO, such an implementation would not suffer the performance hit that the
current implementation of POLICIES does.

So, I have two questions here:
1. Does the above scenario look safe enough regarding unauthorised
rows leaking (and as a substitute for POLICIES)?
2. Would it be feasible to add such a variant of RLS, should one attempt
to implement it? (I.e., would the community accept it?)

Those questions come from my bad experience with POLICY performance.
Unfortunately I did that test like half a year ago, so I don't have the
results at hand to quote, but should anybody be interested, I may
try to do it again in a couple of days.

with best regards,

-R






Re: FK vs. unique indexes

2018-07-05 Thread Rafal Pietrak



On 05.07.2018 at 23:04, David G. Johnston wrote:
> On Thu, Jul 5, 2018 at 1:45 PM, Rafal Pietrak <ra...@ztk-rp.eu> wrote:
> 
> I was thinking, that when "add constraint" cannot choose appropriate
> index, may be some explicit help (like ... using ;) would be
> due.
> 
> 
> ​Basically all the FK trigger does is:
> 
> SELECT EXISTS(SELECT 1 FROM pk_table WHERE pk_col1 = val1 AND pk_col2 =
> val2)
> ​
> And fails if query returns false.  The planner is still free to use the
> index or not to execute the query just as if you had written it by
> hand.  For a small table fully in memory it likely would prefer a
> sequential scan and it would be perfectly within its rights to do so.
> 

I must assume that the above "WHERE pk_col1 = val1..." is not the same
place as the one you mentioned earlier, where "FKs don't have WHERE
clauses".

Thus, the bulk of code that maintains FK consistency and "does not use
WHERE clauses" would break on partial indexes. But the above SELECT,
with partial indexes, will never fail. So it is not the blocker... On
the other hand, I would rather rewrite it along the lines of:
SELECT 1 = (SELECT count(1) FROM pk_table WHERE pk_col1 = val1 AND
pk_col2 = val2)

Naturally, if the planner chooses to do a seq scan, everything would
break down (inappropriate rows would get hit); but that's a different story:
1. One may think that doing a "SELECT * FROM pk_table WHERE... INTO current"
and making all constraint trigger functions use that "current", instead
of making more lookups, could solve the problem.
2. Or, with some syntax help during FK creation (aka: ... add constraint
... using ), one could force the planner to always use the
indicated index.

Still, whatever way to go, it is well beyond my level.
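For what it's worth, the "custom trigger" workaround mentioned in this
thread could look roughly like the sketch below, for the test2/info_text2
example from the original post. This is hypothetical and untested, and it
covers only the referencing side (deletes/updates on test2 would need their
own triggers):

```sql
create function info_text2_fk_check() returns trigger
    language plpgsql as $$
begin
    -- emulate the FK lookup against the partial unique index
    -- on test2 (load, a) where c is true
    if not exists (select 1 from test2
                   where load = new.load and a = new.a and c) then
        raise exception 'no test2 row (c = true) for (%, %)',
            new.load, new.a;
    end if;
    return new;
end;
$$;

create constraint trigger info_text2_fk
    after insert or update on info_text2
    for each row execute function info_text2_fk_check();
```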

-R



Re: FK vs. unique indexes

2018-07-05 Thread Rafal Pietrak



On 05.07.2018 at 10:11, Rob Sargent wrote:
[---]
>>
>> Pls consider in real life: load (a person) can have either a (a kind of
>> breast cancer) or b (a kind of prostate cancer) - this is only a cooked
>> example attempting to illustrate that one may need to put additional
>> constraints on the entire dataset.
>>
>  
> It’s difficult enough to define a unique person (without mother and father),
> and certainly this week’s definition of burden is not likely to help matters.
> If your main worry is data consistency, you might be better off normalizing
> your structure: either with separate tables per cancer type (person id,
> cancer specifics; unique on person) or in a single table, one per line (person
> id, cancer type, cancer description; unique on person). You can reconstitute
> person,breast,prostate from either of those. We won’t quibble on one person
> having both (though remotely possible, men do get breast cancer).
> 

No, no.

This was just a cooked example; my reality: messages between parties
whose identities are stored in columns A and B, and column C keeping a flag
indicating who was the sender. I just used the gender example to limit
the discussion of whether A can be linked to something, or whether B cannot
be linked to something else. It cannot; the constraints I described are
as real as prostate cancer never happening to a woman. I tried to
eliminate from the discussion all but the mathematical relations between
the values of those columns. Which are:

(load,a,b,c) is unique
(load,a,true) is unique
(load,b,false) is unique

Nothing else matters.

And I need FK to such dataset. Partial.

There is one table with data only referencing (load,a,true)... and I'd
prefer that table NOT to need the spurious column (c) containing the
value "true" for every row in that table.

Same goes for (load,b,false).

Now it occured to me, I can do:
create table (load text, a int, b int, c int);
instead of...
create table (load text, a int, b int, c bool);

With the new layout I'll just do:
insert (load,a,b,c) values (<load>, <a>, <b>, <a>); for the cases of
"true", and...
insert (load,a,b,c) values (<load>, <a>, <b>, <b>); for other cases

(load,c) will be unique over the entire dataset.

Now I can FK to (a,c) from wherever I want. A computed column (d bool) =
(a = c) could be helpful, but I don't think it'll be indispensable.

Sorry to mislead you.

-R



Re: FK vs. unique indexes

2018-07-05 Thread Rafal Pietrak



On 05.07.2018 at 15:18, David G. Johnston wrote:
> On Thursday, July 5, 2018, Rafal Pietrak <ra...@ztk-rp.eu> wrote:
> 
> On 04.07.2018 at 00:55, David G. Johnston wrote:
> > On Tuesday, July 3, 2018, Rafal Pietrak <ra...@ztk-rp.eu> wrote:
> >
> >
> >     ERROR:  there is no unique constraint matching given keys for
> referenced
> >     table "test2"
> >     
> >
> >     I cannot see any reasons why this functionality is blocked.
> >
> >     In particular, contrary to what the ERROR says, the target
> table *does
> >     have* a "unique constraint matching given keys", admittedly only
> >     partial.
> >
> >
> > You are making the common error of confusing the distinct concepts of
> > constraints and indexs.  Table constraints cannot be partial by
> > definition, and are a logical concept constraining the data model.
> 
> Hmmm..
> 
> This does not match "my reality". Naturally I may be wrong, but the
> example I've posted reflects my actual data I'm putting into the RDBMS.
> That is:
> 1. the data has unique constraint on (load,a,b,c)
> 2. and the data have additional unique constraints on (load,a), provided
> c is true, and (load,b) whenever c is false.
> 
> 
> I probably generalized too much, especially since exclusion constrains
> can be partial, so I'll be specific here.  Foreign Keys don't have where
> clauses and thus cannot target partial constraints.  If you want to
> overcome that limitation you can write a custom trigger.

Actually, it looks like I'm a really, really slow learner :(

Only now have I realized how I should code the scenario in question. For
those interested, I'm currently implementing it like this: instead of
the bool column "C", I'm putting there a column which will keep a *copy* of
the value from column A or B, depending on the "traditional value" (true/false)
of the earlier column C. Now I can have a plain ordinary unique index over
(load,C), and have it as the FK target for other tables. Win-win.

I hadn't realized it for years (that is how long the design has stayed with
me to this day).
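The redesign described above might look like this sketch (untested; the
table and column names are invented in the spirit of the earlier test2
example):

```sql
-- c holds a copy of a (old flag true) or of b (old flag false),
-- so (load, c) can be a plain unique constraint and thus an FK target.
create table test3 (
    load bigint,
    a    int,
    b    int,
    c    int,
    check (c = a or c = b),
    unique (load, c)
);

-- other tables can now reference the "true" half directly
create table info3 (
    load bigint,
    a    int,
    info text,
    foreign key (load, a) references test3 (load, c)
);
```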

> 
> I'm sure at least some of this is simply due to desirability as opposed
> to some fundamental limitation, but in the end that is how the system
> works today.  Integer-table FK relationships are defined over the entire
> PK table, not a subset.

In my simplistic view of PostgreSQL internals, I was thinking that the
engine (e.g. the set of internal triggers maintaining all the FK consistency
laid down by the application schema), having a row of data (freshly
inserted or updated), just looks up the index it has associated with that
particular FK, and uses it to see if the other end "is compliant", or
"has to change, too", or whatever else.

So I was thinking that when an FK has an index to use, it shouldn't matter
whether it's complete or partial.

I was thinking that when "add constraint" cannot choose an appropriate
index, maybe some explicit help (like ... using ;) would be
due.

But from the above explanation I fear that there is significantly more
to the full picture than I thought. I only kind of hoped those
"fundamental limitations" would be something I could grasp.

Anyway, although indirectly, this thread brought me a solution. This is
good.

thenx,

-R




Re: FK vs. unique indexes

2018-07-05 Thread Rafal Pietrak



On 03.07.2018 at 11:23, David Rowley wrote:
> On 3 July 2018 at 19:30, Rafal Pietrak wrote:
[---]
>>
>> Why is this forbidden?
> 
> I don't think there were any actual roadblocks, it was more of just
> not enough time in the cycle to make it work due to a few technical
> details that required extra effort to make work.
> 
> Alvaro managed to simplify the problem and allow foreign keys to be
> defined on partitioned tables and get that into PG11.
> 
> So it was a case of 50% is better than 0%, which I very agree with.
> 

I see.

It's great and useful, yes. And I'm looking forward to seeing the other 50% :)

Thenx!!

-R



FK vs. unique indexes

2018-07-03 Thread Rafal Pietrak
Hi,

For some time now I've been withholding new features in my DB application,
as I wasn't able to have unique constraints on partitioned tables. PG v11
now has them, and I've given it a try, but to my surprise it does not give
them fully to the application: those indexes don't support FKs! At this
point I've also checked partial indexes to see if they could support a
sort of "FK duality" I have in my datasets, but they don't either (see
EXPLAINING below).

I'd like to understand why.

I'd appreciate it if somebody could shed some light on the technical
reasons/background behind those restrictions.

EXPLAINING:
--
psql (11beta2 (Debian 11~beta2-1))
Type "help" for help.

tst=# create table test1(load bigint, a int, b int, c bool) partition by
list (c);
CREATE TABLE
tst=# create table test1_true  partition of test1 for values in (true);
CREATE TABLE
tst=# create table test1_false  partition of test1 for values in (false);
CREATE TABLE
tst=# create unique index load ON test1 (load,a,b,c);
CREATE INDEX
tst=# create table info_text1 (load text, a int, b int, c bool, info
text,  foreign key (load,a,b,c) references test1(load,a,b,c)) ;
ERROR:  cannot reference partitioned table "test1"
--

Why is this forbidden?

For my application I could live without partitioning. Just using partial
indexes would be enough. Still, this does not work either:
---
tst=# create table test2(load bigint, a int, b int, c bool) ;
CREATE TABLE
tst=# create unique index test2_true ON test2 (load,a) where c is true ;
CREATE INDEX
tst=# create unique index test2_false ON test2 (load,b) where c is false;
CREATE INDEX
tst=# create table info_text2 (load text, a int, info text, more_info
text, foreign key (load,a) references test2(load,a)) ;
ERROR:  there is no unique constraint matching given keys for referenced
table "test2"


I cannot see any reason why this functionality is blocked.

In particular, contrary to what the ERROR says, the target table *does
have* a "unique constraint matching given keys", admittedly only a
partial one. Yet, why should that matter at all? A unique index, partial or
not, always yields a single row, and that's all that matters for an FK. Right?

I would very much like to understand the reasoning behind the above
restrictions (on the use of indexes as FK targets), as this would
probably help me avoid poor decisions in my database schema design. So I
would appreciate it very, very much if somebody could point me to the
reasoning behind such an implementation.

Regards,

-R



Re: unsubscribe

2017-11-21 Thread Rafal Pietrak
I'm very, very sorry for my rude words. I do understand and truly appreciate the continued availability of the list. I know (from my own experience) that it takes significant effort to maintain.
Nonetheless, I do stand by the opinion that it would annoy less if the infant state of the new list server had been babysat for a while.
Regards
R
On 21.11.2017 8:19 AM, Craig Ringer <cr...@2ndquadrant.com> wrote:
> On 21 November 2017 at 15:17, Rafal Pietrak <rafal@ztk-rp.eu> wrote:
>> Hi all,
>>
>> Have anyone noticed, that last couple of days on the list, can become a
>> nice example of HOWTO turn the best list on the planet into a nightmare?
>>
>> Pls forgive rudeness, but IMHO, whoever made the "upgrade" should
>> manually take all the "unsubscribe" messages from archive and do that
>> "unsubscription" by hand instead of giving not always helpful guidance.
>
> The list used to have a filter that blocked messages with "unsubscribe"
> in the subject or forced them into the moderation queue. I thought that'd
> be preserved with the PgLister migration. But really, a nightmare? Yeah,
> it's a pain, but I think that's laying it on a bit strong. Personally I
> appreciate the hard and usually thankless work the infrastructure and
> admin team do.
>
> --
> Craig Ringer  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services



Re: unsubscribe

2017-11-20 Thread Rafal Pietrak
Hi all,

Has anyone noticed that the last couple of days on the list could become a
nice example of HOWTO turn the best list on the planet into a nightmare?

Pls forgive the rudeness, but IMHO, whoever made the "upgrade" should
manually take all the "unsubscribe" messages from the archive and do the
"unsubscription" by hand, instead of giving not always helpful
guidance.

I do system maintenance for money. I do that when I screw up.

Regards,

-R

On 21.11.2017 at 04:16, Ryan wrote:
> Hi,
> 
> All right, so not for lack of trying, I cannot figure out how to
> unsubscribe. I've tried three different things, but they've either been
> ineffective or result in me getting an automatic email that the attempt
> failed. Kindly help? I enjoyed my daily digest of PostgreSQL messages,
> but this has been a bit difficult to manage.
> 
> Best regards.
> 
> On Mon, Nov 20, 2017 at 9:37 PM, Danyelle Davis wrote:
> 
> 
> On Mon, Nov 20, 2017 at 9:29 PM, Amila Jayasooriya wrote:
> 
> Please unsubscribe me from the list.
> 
> Thanks and Regards
>  Amila Jayasooriya
> 
> On Tue, Nov 21, 2017 at 3:48 AM, Ibram Remzi wrote:
> 
> Please unsubscribe me from the list.
> 
> Please look at the listserv for options on how to unsub. Y'all are
> clogging the list with these.
> 