Re: The hidden cost of limit-offset

2020-12-06 Thread 孙冰
I think the subquery approach should be something like:

---
select id, pg_sleep(0.1) from (select id from thing offset 90 order by tag)
last_things
---

Is that right?

Then what if the interface is exposed as a view? e.g.,

---
create view thing_interface as select id, tag, pg_sleep(0.1) from thing;
---

I can't think of a subquery which can avoid the unnecessary pg_sleep calls
when queries are executed against thing_interface.

It's perfectly valid that the problem could be solved by a subquery for
some *ad-hoc* and  *oneshot* queries. But there are more often the cases
that limit-offset are used in general queries (hand-crafted or
programe-generated) and it is not very realistic to rewrite all of them
into an offset-inside-subquery form.

Bing

David G. Johnston  于2020年12月7日周一 上午12:05写道:

> On Sunday, December 6, 2020, 孙冰  wrote:
>
>> The skipped rows by an OFFSET clause have to be computed nevertheless. I
>> am wondering if there could be any chance to improve, since the computation
>> is on the *entire* rows rather than on the *criterial* columns.
>>
>> [...]
>>
>> I don't understand the postgresql internal, but I suspect such a change
>> may introduce significant work on the planner and executor. From my point
>> view, skipping everything (or expensive ones) except the criteria in the
>> target list would greatly improve the usability of OFFSET, and it is
>> definitely worth the effort.
>>
>
> Given that one can write this with a subquery without much difficulty i’m
> doubtful that effort spent in this area is going to be particularly
> valuable.
>
> David J.
>
>


Partitioning with FDW and table size limits

2020-12-06 Thread Godfrin, Philippe E
Greetings,
In the case where you have a 'local' server, from which you are working with 
foreign tables. And the foreign tables are partitioned. As each of the 
partitioned tables is a table in its own right, is it correct to assume the 
table (relation) size limit of 32 TB applies? For example, provided I had the 
disk space, with 10 partitioned tables, I could have 320TB of data on 1 or more 
remote servers.
Am I correct?
Thanks,
phil

Phil Godfrin | Database Administrator
NOV
NOV US | Engineering Data
9720 Beechnut St | Houston, Texas 77036
M  281.825.2311
E   philippe.godf...@nov.com



JDBC driver - is "getGeneratedKeys()" guaranteed to return the ids in the same order a batch insert was made?

2020-12-06 Thread electrotype

Hi,

Using JDBC, I batch insert multiple rows (/"executeBatch()/"). I then use 
/'//getGeneratedKeys("id")/' to get the generated ids ("id" is a "/SERIAL PRIMARY KEY/" column).


My question: does the PostgreSQL JDBC driver /guarantees /that the order of the returned generated 
ids will be the same as the rows to insert have been specified, using "/addBatch()/"?



The best "answer" to that question I have found is https://stackoverflow.com/a/16119489/843699 , but 
it is not 100% clear.


Would it be possible to have an official answer on this?

Thanks in advance!




Re: SELECT but only if not present in another table

2020-12-06 Thread Alexander Farber
Thank you, Steve -

On Sun, Dec 6, 2020 at 6:50 PM Steve Baldwin 
wrote:

> Can't you just use table aliases? So, the outer word_moves would become
> 'word_moves as wm', word_puzzles would become 'word_puzzles as wp', and the
> where clause 'WHERE wp.mid = wm.mid' ?
>

table aliases have worked for me!


Re: SELECT but only if not present in another table

2020-12-06 Thread Steve Baldwin
Can't you just use table aliases? So, the outer word_moves would become
'word_moves as wm', word_puzzles would become 'word_puzzles as wp', and the
where clause 'WHERE wp.mid = wm.mid' ?

hth,

Steve

On Mon, Dec 7, 2020 at 4:08 AM Alexander Farber 
wrote:

> Good evening,
>
> in PostgreSQL 13.1 I save player moves in the table:
>
> # \d words_moves
>   Table "public.words_moves"
>  Column  |   Type   | Collation | Nullable |
>Default
>
> -+--+---+--+--
>  mid | bigint   |   | not null |
> nextval('words_moves_mid_seq'::regclass)
>  action  | text |   | not null |
>  gid | integer  |   | not null |
>  uid | integer  |   | not null |
>  played  | timestamp with time zone |   | not null |
>  tiles   | jsonb|   |  |
>  score   | integer  |   |  |
>  str | text |   |  |
>  hand| text |   |  |
>  letters | character(1)[]   |   |  |
>  values  | integer[]|   |  |
> Indexes:
> "words_moves_pkey" PRIMARY KEY, btree (mid)
> "words_moves_gid_played_idx" btree (gid, played DESC)
> "words_moves_puzzle_idx" btree (puzzle)
> "words_moves_uid_action_played_idx" btree (uid, action, played)
> "words_moves_uid_idx" btree (uid)
> Check constraints:
> "words_moves_score_check" CHECK (score >= 0)
> Foreign-key constraints:
> "words_moves_gid_fkey" FOREIGN KEY (gid) REFERENCES words_games(gid)
> ON DELETE CASCADE
> "words_moves_uid_fkey" FOREIGN KEY (uid) REFERENCES words_users(uid)
> ON DELETE CASCADE
> Referenced by:
> TABLE "words_puzzles" CONSTRAINT "words_puzzles_mid_fkey" FOREIGN KEY
> (mid) REFERENCES words_moves(mid) ON DELETE CASCADE
> TABLE "words_scores" CONSTRAINT "words_scores_mid_fkey" FOREIGN KEY
> (mid) REFERENCES words_moves(mid) ON DELETE CASCADE
>
> Some of the moves can be "interesting" in the sense that the player have
> used all 7 letter tiles or achieved a high score over 90 points,
>
> I want to display those moves as "puzzles" and have prepared a table to
> store, per-user, who has solved them:
>
> # \d words_puzzles
> Table "public.words_puzzles"
>  Column |   Type   | Collation | Nullable | Default
> +--+---+--+-
>  mid| bigint   |   | not null |
>  uid| integer  |   | not null |
>  solved | timestamp with time zone |   | not null |
> Foreign-key constraints:
> "words_puzzles_mid_fkey" FOREIGN KEY (mid) REFERENCES words_moves(mid)
> ON DELETE CASCADE
> "words_puzzles_uid_fkey" FOREIGN KEY (uid) REFERENCES words_users(uid)
> ON DELETE CASCADE
>
> Now I am trying to create a custom stored function which would return just
> one mid (move id) which is not too new (1 year old) and the user has not
> tackled it yet:
>
> CREATE OR REPLACE FUNCTION words_daily_puzzle(
> in_uid   int
> ) RETURNS table (
> out_mid  bigint,
> out_secret   text
> ) AS
> $func$
> SELECT
> mid,
> MD5(mid ||'my little secret')
> FROM words_moves
> WHERE action = 'play'
> AND (LENGTH(str) = 7 OR score > 90)
> AND played BETWEEN CURRENT_TIMESTAMP - INTERVAL '51 week' AND
> CURRENT_TIMESTAMP - INTERVAL '50 week'
> -- the user has not solved this puzzle yet
> AND NOT EXISTS (SELECT 1 FROM words_puzzles WHERE mid =
> the_outer_mid AND uid = in_uid)
> ORDER BY PLAYED ASC
> LIMIT 1;
> $func$ LANGUAGE sql;
>
> As you can see I am missing 1 piece - how do I address the outer SELECT
> mid from the EXISTS-SELECT?
>
> I have written "the_outer_mid" there.
>
> Should I use LEFT JOIN LATERAL here? I have difficulties wrapping my head
> around this.
>
> TLDR: how to return 1 mid from 1 year ago, which is not solved by the user
> in_uid yet?
>
> Thank you
> Alex
>
>
>


SELECT but only if not present in another table

2020-12-06 Thread Alexander Farber
Good evening,

in PostgreSQL 13.1 I save player moves in the table:

# \d words_moves
  Table "public.words_moves"
 Column  |   Type   | Collation | Nullable |
 Default
-+--+---+--+--
 mid | bigint   |   | not null |
nextval('words_moves_mid_seq'::regclass)
 action  | text |   | not null |
 gid | integer  |   | not null |
 uid | integer  |   | not null |
 played  | timestamp with time zone |   | not null |
 tiles   | jsonb|   |  |
 score   | integer  |   |  |
 str | text |   |  |
 hand| text |   |  |
 letters | character(1)[]   |   |  |
 values  | integer[]|   |  |
Indexes:
"words_moves_pkey" PRIMARY KEY, btree (mid)
"words_moves_gid_played_idx" btree (gid, played DESC)
"words_moves_puzzle_idx" btree (puzzle)
"words_moves_uid_action_played_idx" btree (uid, action, played)
"words_moves_uid_idx" btree (uid)
Check constraints:
"words_moves_score_check" CHECK (score >= 0)
Foreign-key constraints:
"words_moves_gid_fkey" FOREIGN KEY (gid) REFERENCES words_games(gid) ON
DELETE CASCADE
"words_moves_uid_fkey" FOREIGN KEY (uid) REFERENCES words_users(uid) ON
DELETE CASCADE
Referenced by:
TABLE "words_puzzles" CONSTRAINT "words_puzzles_mid_fkey" FOREIGN KEY
(mid) REFERENCES words_moves(mid) ON DELETE CASCADE
TABLE "words_scores" CONSTRAINT "words_scores_mid_fkey" FOREIGN KEY
(mid) REFERENCES words_moves(mid) ON DELETE CASCADE

Some of the moves can be "interesting" in the sense that the player have
used all 7 letter tiles or achieved a high score over 90 points,

I want to display those moves as "puzzles" and have prepared a table to
store, per-user, who has solved them:

# \d words_puzzles
Table "public.words_puzzles"
 Column |   Type   | Collation | Nullable | Default
+--+---+--+-
 mid| bigint   |   | not null |
 uid| integer  |   | not null |
 solved | timestamp with time zone |   | not null |
Foreign-key constraints:
"words_puzzles_mid_fkey" FOREIGN KEY (mid) REFERENCES words_moves(mid)
ON DELETE CASCADE
"words_puzzles_uid_fkey" FOREIGN KEY (uid) REFERENCES words_users(uid)
ON DELETE CASCADE

Now I am trying to create a custom stored function which would return just
one mid (move id) which is not too new (1 year old) and the user has not
tackled it yet:

CREATE OR REPLACE FUNCTION words_daily_puzzle(
in_uid   int
) RETURNS table (
out_mid  bigint,
out_secret   text
) AS
$func$
SELECT
mid,
MD5(mid ||'my little secret')
FROM words_moves
WHERE action = 'play'
AND (LENGTH(str) = 7 OR score > 90)
AND played BETWEEN CURRENT_TIMESTAMP - INTERVAL '51 week' AND
CURRENT_TIMESTAMP - INTERVAL '50 week'
-- the user has not solved this puzzle yet
AND NOT EXISTS (SELECT 1 FROM words_puzzles WHERE mid =
the_outer_mid AND uid = in_uid)
ORDER BY PLAYED ASC
LIMIT 1;
$func$ LANGUAGE sql;

As you can see I am missing 1 piece - how do I address the outer SELECT mid
from the EXISTS-SELECT?

I have written "the_outer_mid" there.

Should I use LEFT JOIN LATERAL here? I have difficulties wrapping my head
around this.

TLDR: how to return 1 mid from 1 year ago, which is not solved by the user
in_uid yet?

Thank you
Alex


Re: Using a boolean column with IF / THEN

2020-12-06 Thread Alexander Farber
On Sat, Dec 5, 2020 at 9:00 PM David G. Johnston 
wrote:

> Maybe not “simpler” but for all those checks you could write a single
> query that pulls out all the data at once into a record variable and test
> against the columns pf that instead of executing multiple queries.
>

Thank you!


Re: The hidden cost of limit-offset

2020-12-06 Thread David G. Johnston
On Sunday, December 6, 2020, 孙冰  wrote:

> The skipped rows by an OFFSET clause have to be computed nevertheless. I
> am wondering if there could be any chance to improve, since the computation
> is on the *entire* rows rather than on the *criterial* columns.
>
> [...]
>
> I don't understand the postgresql internal, but I suspect such a change
> may introduce significant work on the planner and executor. From my point
> view, skipping everything (or expensive ones) except the criteria in the
> target list would greatly improve the usability of OFFSET, and it is
> definitely worth the effort.
>

Given that one can write this with a subquery without much difficulty i’m
doubtful that effort spent in this area is going to be particularly
valuable.

David J.