Re: [GENERAL] Merging records in a table with 2-columns primary key

2017-04-02 Thread Francisco Olarte
Alexander:

On Sun, Apr 2, 2017 at 5:27 PM, Alexander Farber
 wrote:
> 2) Is there a way to use an UPDATE reviews instead of the inefficient
> (because copying) INSERT ... SELECT ... ON CONFLICT DO NOTHING?

mmm, I've just sent a sugestion to use delete+reinsert and would like
to point that in pg update~=delete+insert. I use those because many
times they are more efficient ( simple conditions on delete, insert is
fast in postgres, and you can vacuum in the middle if a large portion
is going to get reinserted to reuse the space )

Francisco Olarte.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Merging records in a table with 2-columns primary key

2017-04-02 Thread Francisco Olarte
Alexander:

On Sun, Apr 2, 2017 at 4:26 PM, Alexander Farber
 wrote:
> The purpose of the function is to merge several user records to one (with
> the lowest uid).

It looks complicated ( more on this later )

> While merging the reviews records I delete all self-reviews and try to copy
> over as many remaining reviews as possible.
...
> test=> SELECT out_uid FROM merge_users(ARRAY[1,2,3,4]);
> ERROR:  new row for relation "reviews" violates check constraint
> "reviews_check"
> DETAIL:  Failing row contains (1, 1, User 4 says: 3 is ugly).

mmm, Maybe this is related to constrint immediateness, but anyway I
would suggest another approach for calculating the set of new reviews.

If you just delete every review for the set and capture the
not-self-referential:

with deleted as ( delete from reviews where uid in in_uids returning *)
, candidates as ( select * from deleted where author not in in_uids )

You can then generate a new review-set from it with some criteria:

, cleaned as (select author, min(review) as review from candidates group by 1)

And then insert them back

insert into reviews select $out_id, author, review from cleaned;

If I were more fluent with the window functions I would recommend
ordering the cleaned query by uid=$out_id DESC and getting the first
row with one of them ( the DESC order puts true first, IIRC, so it
favors keeping the original review for $out_id), or using string_agg
to try to keep all the texts ).

I've found that on many of this "merging" problems it's easier to
extract all, clean them, reinsert. Normally my data sets are big so I
just delete to a temporary ( not in sql way, just one which I create
and then drop ) table, clean on it and reimport them. It also makes
debugging the code much easier ( as the temp table can be cloned to
test easily ). For encapsulation "with" helps a lot, or, in a
function, you can use an real temporary table.

Francisco Olarte


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Merging records in a table with 2-columns primary key

2017-04-02 Thread Alexander Farber
Hello Andy and others -

On Sun, Apr 2, 2017 at 5:13 PM, Andy Colson  wrote:

> On 04/02/2017 09:26 AM, Alexander Farber wrote:
>>
>> http://stackoverflow.com/questions/43168406/merging-records-
>> in-a-table-with-2-columns-primary-key
>
>

after some thinking, when I call

SELECT out_uid FROM merge_users(ARRAY[1,2,3,4]);

then "reviews" table should be empty (get rid of all self-reviews)
and "users" should contain just 1 record (1, 'User 1').

And if instead I call

SELECT out_uid FROM merge_users(ARRAY[1,2]);

then "reviews" should be (records with User 2 removed because overlapped
with User 1):

 uid | author | review
-++
   1 |  3 | User 3 says: 1 is nice
   1 |  4 | User 4 says: 1 is nice
   3 |  1 | User 1 says: 3 is nice
   3 |  4 | User 4 says: 3 is ugly
   4 |  1 | User 1 says: 4 is ugly
   4 |  3 | User 3 says: 4 is ugly

and "users":

 uid |  name
-+
   1 | User 1
   3 | User 3
   4 | User 4

So my 2 questions are -

1) Why the error when calling merge_users(ARRAY[1,2]) and then
merge_users(ARRAY[1,2,3,4])?

2) Is there a way to use an UPDATE reviews instead of the inefficient
(because copying) INSERT ... SELECT ... ON CONFLICT DO NOTHING?

Thank you
Alex


Re: [GENERAL] Merging records in a table with 2-columns primary key

2017-04-02 Thread Andy Colson

On 04/02/2017 09:26 AM, Alexander Farber wrote:

Good afternoon,

I have prepared a simple test case for my question -

CREATE TABLE users (
uid SERIAL PRIMARY KEY,
name varchar(255) NOT NULL
);

CREATE TABLE reviews (
uid integer NOT NULL CHECK (uid <> author) REFERENCES users ON DELETE 
CASCADE,
author integer NOT NULL REFERENCES users(uid) ON DELETE CASCADE,
review varchar(255),
PRIMARY KEY(uid, author)
);

Here I fill the above tables with sample data -

INSERT INTO users (uid, name) VALUES (1, 'User 1');
INSERT INTO users (uid, name) VALUES (2, 'User 2');
INSERT INTO users (uid, name) VALUES (3, 'User 3');
INSERT INTO users (uid, name) VALUES (4, 'User 4');

INSERT INTO reviews (uid, author, review) VALUES (1, 2, 'User 2 says: 1 is 
nice');
INSERT INTO reviews (uid, author, review) VALUES (1, 3, 'User 3 says: 1 is 
nice');
INSERT INTO reviews (uid, author, review) VALUES (1, 4, 'User 4 says: 1 is 
nice');

INSERT INTO reviews (uid, author, review) VALUES (2, 1, 'User 1 says: 2 is 
nice');
INSERT INTO reviews (uid, author, review) VALUES (2, 3, 'User 3 says: 2 is 
nice');
INSERT INTO reviews (uid, author, review) VALUES (2, 4, 'User 4 says: 2 is 
ugly');

INSERT INTO reviews (uid, author, review) VALUES (3, 1, 'User 1 says: 3 is 
nice');
INSERT INTO reviews (uid, author, review) VALUES (3, 2, 'User 2 says: 3 is 
ugly');
INSERT INTO reviews (uid, author, review) VALUES (3, 4, 'User 4 says: 3 is 
ugly');

INSERT INTO reviews (uid, author, review) VALUES (4, 1, 'User 1 says: 4 is 
ugly');
INSERT INTO reviews (uid, author, review) VALUES (4, 2, 'User 2 says: 4 is 
ugly');
INSERT INTO reviews (uid, author, review) VALUES (4, 3, 'User 3 says: 4 is 
ugly');

And finally here is my problematic custom stored function:

CREATE OR REPLACE FUNCTION merge_users(
in_uids integer[],
OUT out_uid integer
) RETURNS integer AS
$func$
BEGIN
SELECT
MIN(uid)
INTO STRICT
out_uid
FROM users
WHERE uid = ANY(in_uids);

-- delete self-reviews
DELETE FROM reviews
WHERE uid = out_uid
AND author = ANY(in_uids);

DELETE FROM reviews
WHERE author = out_uid
AND uid = ANY(in_uids);

-- try to copy as many reviews OF this user as possible
INSERT INTO reviews (
uid,
author,
review
) SELECT
out_uid,-- change to out_uid
author,
review
FROM reviews
WHERE uid <> out_uid
AND uid = ANY(in_uids)
ON CONFLICT DO NOTHING;

DELETE FROM reviews
WHERE uid <> out_uid
AND uid = ANY(in_uids);

-- try to copy as many reviews BY this user as possible
INSERT INTO reviews (
uid,
author,
review
) SELECT
uid,
out_uid,-- change to out_uid
review
FROM reviews
WHERE author <> out_uid
AND author = ANY(in_uids)
ON CONFLICT DO NOTHING;

DELETE FROM reviews
WHERE author <> out_uid
AND author = ANY(in_uids);

DELETE FROM users
WHERE uid <> out_uid
AND uid = ANY(in_uids);
END
$func$ LANGUAGE plpgsql;

The purpose of the function is to merge several user records to one (with the 
lowest uid).

While merging the reviews records I delete all self-reviews and try to copy 
over as many remaining reviews as possible.

However with PostgreSQL 9.5 the following 2 calls fail:

test=> SELECT out_uid FROM merge_users(ARRAY[1,2]);
 out_uid
-
   1
(1 row)

test=> SELECT out_uid FROM merge_users(ARRAY[1,2,3,4]);
ERROR:  new row for relation "reviews" violates check constraint "reviews_check"
DETAIL:  Failing row contains (1, 1, User 4 says: 3 is ugly).
CONTEXT:  SQL statement "INSERT INTO reviews (
uid,
author,
review
) SELECT
uid,
out_uid,-- change to out_uid
review
FROM reviews
WHERE author <> out_uid
AND author = ANY(in_uids)
ON CONFLICT DO NOTHING"
PL/pgSQL function merge_users(integer[]) line 38 at SQL statement

I have provided more context at
http://stackoverflow.com/questions/43168406/merging-records-in-a-table-with-2-columns-primary-key

Also I have tried to create an SQL Fiddle at
http://sqlfiddle.com/#!15/5f37e/2
for your convenience

Regards
Alex


I'm not sure what you are trying to do.  You posted a sample starting point, 
which is great.  Perhaps you could post how you want the tables would look in 
the end?

-Andy


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Merging records in a table with 2-columns primary key

2017-04-02 Thread Alexander Farber
Good afternoon,

I have prepared a simple test case for my question -

CREATE TABLE users (
uid SERIAL PRIMARY KEY,
name varchar(255) NOT NULL
);

CREATE TABLE reviews (
uid integer NOT NULL CHECK (uid <> author) REFERENCES users ON
DELETE CASCADE,
author integer NOT NULL REFERENCES users(uid) ON DELETE CASCADE,
review varchar(255),
PRIMARY KEY(uid, author)
);

Here I fill the above tables with sample data -

INSERT INTO users (uid, name) VALUES (1, 'User 1');
INSERT INTO users (uid, name) VALUES (2, 'User 2');
INSERT INTO users (uid, name) VALUES (3, 'User 3');
INSERT INTO users (uid, name) VALUES (4, 'User 4');

INSERT INTO reviews (uid, author, review) VALUES (1, 2, 'User 2 says: 1 is
nice');
INSERT INTO reviews (uid, author, review) VALUES (1, 3, 'User 3 says: 1 is
nice');
INSERT INTO reviews (uid, author, review) VALUES (1, 4, 'User 4 says: 1 is
nice');

INSERT INTO reviews (uid, author, review) VALUES (2, 1, 'User 1 says: 2 is
nice');
INSERT INTO reviews (uid, author, review) VALUES (2, 3, 'User 3 says: 2 is
nice');
INSERT INTO reviews (uid, author, review) VALUES (2, 4, 'User 4 says: 2 is
ugly');

INSERT INTO reviews (uid, author, review) VALUES (3, 1, 'User 1 says: 3 is
nice');
INSERT INTO reviews (uid, author, review) VALUES (3, 2, 'User 2 says: 3 is
ugly');
INSERT INTO reviews (uid, author, review) VALUES (3, 4, 'User 4 says: 3 is
ugly');

INSERT INTO reviews (uid, author, review) VALUES (4, 1, 'User 1 says: 4 is
ugly');
INSERT INTO reviews (uid, author, review) VALUES (4, 2, 'User 2 says: 4 is
ugly');
INSERT INTO reviews (uid, author, review) VALUES (4, 3, 'User 3 says: 4 is
ugly');

And finally here is my problematic custom stored function:

CREATE OR REPLACE FUNCTION merge_users(
in_uids integer[],
OUT out_uid integer
) RETURNS integer AS
$func$
BEGIN
SELECT
MIN(uid)
INTO STRICT
out_uid
FROM users
WHERE uid = ANY(in_uids);

-- delete self-reviews
DELETE FROM reviews
WHERE uid = out_uid
AND author = ANY(in_uids);

DELETE FROM reviews
WHERE author = out_uid
AND uid = ANY(in_uids);

-- try to copy as many reviews OF this user as possible
INSERT INTO reviews (
uid,
author,
review
) SELECT
out_uid,-- change to out_uid
author,
review
FROM reviews
WHERE uid <> out_uid
AND uid = ANY(in_uids)
ON CONFLICT DO NOTHING;

DELETE FROM reviews
WHERE uid <> out_uid
AND uid = ANY(in_uids);

-- try to copy as many reviews BY this user as possible
INSERT INTO reviews (
uid,
author,
review
) SELECT
uid,
out_uid,-- change to out_uid
review
FROM reviews
WHERE author <> out_uid
AND author = ANY(in_uids)
ON CONFLICT DO NOTHING;

DELETE FROM reviews
WHERE author <> out_uid
AND author = ANY(in_uids);

DELETE FROM users
WHERE uid <> out_uid
AND uid = ANY(in_uids);
END
$func$ LANGUAGE plpgsql;

The purpose of the function is to merge several user records to one (with
the lowest uid).

While merging the reviews records I delete all self-reviews and try to copy
over as many remaining reviews as possible.

However with PostgreSQL 9.5 the following 2 calls fail:

test=> SELECT out_uid FROM merge_users(ARRAY[1,2]);
 out_uid
-
   1
(1 row)

test=> SELECT out_uid FROM merge_users(ARRAY[1,2,3,4]);
ERROR:  new row for relation "reviews" violates check constraint
"reviews_check"
DETAIL:  Failing row contains (1, 1, User 4 says: 3 is ugly).
CONTEXT:  SQL statement "INSERT INTO reviews (
uid,
author,
review
) SELECT
uid,
out_uid,-- change to out_uid
review
FROM reviews
WHERE author <> out_uid
AND author = ANY(in_uids)
ON CONFLICT DO NOTHING"
PL/pgSQL function merge_users(integer[]) line 38 at SQL statement

I have provided more context at
http://stackoverflow.com/questions/43168406/merging-records-in-a-table-with-2-columns-primary-key

Also I have tried to create an SQL Fiddle at
http://sqlfiddle.com/#!15/5f37e/2
for your convenience

Regards
Alex