Re: [GENERAL] loading file with en dash character into postgres 9.6.1 database

2017-07-12 Thread Hu, Patricia
Thanks Laurenz, that nailed it. It was what Tom was saying; I just hadn't 
figured out how to do it. 

Thanks,
Patricia 


-Original Message-
From: Albe Laurenz [mailto:laurenz.a...@wien.gv.at] 
Sent: Wednesday, July 12, 2017 5:31 AM
To: 'Tom Lane *EXTERN*'; Hu, Patricia
Cc: pgsql general (pgsql-general@postgresql.org)
Subject: [EXTERNAL] RE: [GENERAL] loading file with en dash character into 
postgres 9.6.1 database

Tom Lane wrote:
> "Hu, Patricia" <patricia...@finra.org> writes:
>> The server and client encoding are both set to UTF8, and according to 
>> this http://www.fileformat.info/info/unicode/char/2013/index.htm the en 
>> dash is a valid UTF-8 character, but when running a script containing an 
>> INSERT statement with an en dash in it, I got the error below.
>> psql:activity_type.lst:379: ERROR:  invalid byte sequence for 
>> encoding "UTF8": 0x96
> 
> Well, that certainly isn't valid UTF8, so your script file isn't in UTF8.
> 
>> If I set client_encoding to WIN1252, the same file runs OK, but 
>> afterwards the en dash character shows up as "û" instead of the en 
>> dash character "–"
> 
> This indicates that your terminal program does *not* think its 
> encoding is WIN1252.  Having loaded that script file, you need to 
> revert client_encoding to whatever your terminal program is using, or 
> non-ASCII characters are going to be displayed wrong.
> 
> A bit of poking around suggests that your terminal may be operating 
> with code page 437 or similar, as 0x96 is "û" in that encoding --- 
> according to Wikipedia, at least:
> https://en.wikipedia.org/wiki/Code_page_437
> I don't think Postgres supports that as a client_encoding setting, so 
> one way or another you're going to need to switch the terminal 
> program's character set setting.

Running "chcp 1252" in your Windows console before starting psql should do the 
trick.

Yours,
Laurenz Albe
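
Putting the two suggestions together, the session would look roughly like this 
(a sketch, untested, assuming the script really is WIN1252-encoded):

    C:\> chcp 1252
    C:\> psql -d mydb
    mydb=> SET client_encoding TO 'WIN1252';
    mydb=> \i activity_type.lst

With the console code page, client_encoding, and the file all agreeing on 
WIN1252, the en dash is stored correctly in the UTF8 database and also 
displays correctly afterwards.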




[GENERAL] loading file with en dash character into postgres 9.6.1 database

2017-07-11 Thread Hu, Patricia
The server and client encoding are both set to UTF8, and according to 
http://www.fileformat.info/info/unicode/char/2013/index.htm the en dash is a 
valid UTF-8 character, but when running a script containing an INSERT statement 
with an en dash in it, I got the error below.

mydb=> select name, setting from pg_settings where name like '%encoding%';
      name       | setting
-----------------+---------
 client_encoding | UTF8
 server_encoding | UTF8

mydb=> \i activity_type.lst
psql:activity_type.lst:379: ERROR:  invalid byte sequence for encoding "UTF8": 
0x96

If I set client_encoding to WIN1252, the same file runs OK and the records are 
inserted, but afterwards the en dash character shows up as "û" instead of the 
en dash "–".
mydb=> show client_encoding;
 client_encoding
-----------------
 WIN1252

I then created a database with WIN1252 encoding, so that both server and client 
encoding were WIN1252, and loaded the same file; the en dash still showed up as 
"û" (that was on a Windows box; on a Linux box the character didn't show up at 
all). So the client-side setup still makes a difference, even though 
client_encoding showed the same value.

Is there any way to load the en dash character into the PostgreSQL database as 
is? I have worked around it by editing the input file to replace the en dash 
with a plain dash, but that is quite a bit of manual work each time a new dump 
is generated from an Oracle database with the WE8MSWIN1252 character set.
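
Converting the dump to UTF-8 up front might be another option, e.g. with iconv 
(a sketch, untested):

    iconv -f WINDOWS-1252 -t UTF-8 activity_type.lst > activity_type_utf8.lst
    psql -d mydb -f activity_type_utf8.lst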


Thanks,
Patricia



Re: [GENERAL] relation create time

2017-05-10 Thread Hu, Patricia
Thanks for your prompt reply Peter. 

Sure, I could write a trigger to capture and store it, but for such common 
functionality it seems best to have it in the engine rather than each 
application writing its own trigger and reinventing the wheel. 

Are there any concerns about adding it as a feature, or is it just backlog? Has 
a feature request been filed for this?

Thanks,
Patricia 


-Original Message-
From: Peter Eisentraut [mailto:peter.eisentr...@2ndquadrant.com] 
Sent: Wednesday, May 10, 2017 12:32 PM
To: Hu, Patricia; pgsql-general@postgresql.org
Subject: [EXTERNAL] Re: [GENERAL] relation create time

On 5/10/17 12:05, Hu, Patricia wrote:
> I am trying to find out when a table was created in postgresql. 
> Thought it would be easy (coming from Oracle world), but haven't had 
> any luck, especially since we are on RDS and can't peek at the 
> timestamp on the file system. Is this information stored anywhere in the 
> catalog?

It is not.

> Or do I
> need to store it myself? Is there any plan to add such metadata 
> to the catalog as a feature? Thanks a lot!

You could write an event trigger to record it.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
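
A minimal sketch of the event-trigger approach (the log table, function, and 
trigger names are only illustrative; whether RDS allows creating event triggers 
depends on the RDS version and role):

    CREATE TABLE ddl_creation_log (
        object_identity text,
        command_tag     text,
        created_at      timestamptz NOT NULL DEFAULT now()
    );

    CREATE OR REPLACE FUNCTION log_relation_create() RETURNS event_trigger
    LANGUAGE plpgsql AS $$
    DECLARE
        obj record;
    BEGIN
        -- pg_event_trigger_ddl_commands() lists the objects affected by the DDL
        FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
            INSERT INTO ddl_creation_log (object_identity, command_tag)
            VALUES (obj.object_identity, obj.command_tag);
        END LOOP;
    END;
    $$;

    CREATE EVENT TRIGGER log_create_table
        ON ddl_command_end
        WHEN TAG IN ('CREATE TABLE')
        EXECUTE PROCEDURE log_relation_create();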





[GENERAL] relation create time

2017-05-10 Thread Hu, Patricia
I am trying to find out when a table was created in PostgreSQL. I thought it 
would be easy (coming from the Oracle world), but I haven't had any luck, 
especially since we are on RDS and can't peek at the timestamps on the file 
system. Is this information stored anywhere in the catalog, or do I need to 
store it myself? Is there any plan to add such metadata to the catalog as a 
feature? Thanks a lot!

Thanks,
Patricia



[GENERAL] application generated an eternal block in the database

2017-02-16 Thread Hu, Patricia
I recently came across an interesting locking/blocking situation in a Postgres 
database (9.5.4, RDS, but that shouldn't matter). The application is 
java/hibernate/springboot with connection pooling. The developers pushed in 
some code that seemed to be doing this: 

Start a transaction, update row1 in table1, then spawn another process to 
update the same row in the same table (but within the context of this 1st 
transaction?). The result was that the 2nd process was blocked waiting for the 
1st transaction to release its lock, but the 1st transaction couldn't complete 
either because the 2nd update was blocked. It wasn't a deadlock situation - 
neither was rolled back; more and more locks just lined up for that table until 
manual intervention killed the blocking or blocked pid. 
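
If the 2nd update actually went out on a different pooled connection (i.e. in a 
separate transaction), the interleaving would look roughly like this (a sketch 
with made-up column and values):

    -- connection 1                            -- connection 2 (spawned process)
    BEGIN;
    UPDATE table1 SET col = 1 WHERE id = 1;
                                               BEGIN;
                                               UPDATE table1 SET col = 2 WHERE id = 1;
                                               -- blocks on connection 1's row lock
    -- the application now waits for the spawned process to finish
    -- before issuing COMMIT, so neither side ever proceeds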

What I saw in the database while this blocking was happening seems pretty 
standard: the 1st update held a RowExclusiveLock on the table, there was an 
ExclusiveLock on the tuple of the table and another ExclusiveLock on the 
transaction ID of the 1st update, and the 2nd update was unable to acquire a 
ShareLock on that transaction ID (granted = f). 

I am trying to understand how the application could have caused this indefinite 
blocking. I wasn't able to reproduce it from the database end: if I have 2 
sessions updating the same row in the same table, the blocking resolves as soon 
as session 1 commits or rolls back. In psql, if 2 updates hit the same row 
within the same transaction, the value of the 2nd update is kept on commit. The 
developers couldn't explain thoroughly how the code triggered this either.

I'd like to see if anyone has insight into, or an explanation of, how this 
could happen beyond the database boundary, in the application layer. Any input 
is appreciated!

Thanks,
Patricia






[GENERAL] Re: PSQL 9.5 select for update locks too many rows when using numeric instead of int

2017-02-03 Thread Hu, Patricia
It looks to me like the first plan used a seq scan rather than the index 
because the indexed column had to be cast to numeric to match the parameter. 
In that case the index cannot be used, as expected.

   Filter: ((true_data_id)::numeric = '209390104'::numeric)
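
If the client has to keep passing a numeric parameter, casting the parameter to 
the column type (rather than letting the column be cast) should let the planner 
use the index again, e.g. (a sketch based on the statements quoted below; 
fooplan6 is just an illustrative name):

    PREPARE fooplan6 (NUMERIC) AS SELECT
        DI.object_status, DR.replica_status, DR.replication_id
    FROM MCAT_DATA_INFO DI, MCAT_DATA_REPLICA DR
    WHERE DI.true_data_id = DR.data_id
      AND DI.true_data_id = $1::bigint
    FOR UPDATE;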


Thanks,
Patricia

From: Sfiligoi, Igor [mailto:igor.sfili...@ga.com]
Sent: Thursday, February 02, 2017 4:29 PM
To: pgsql-general@postgresql.org
Subject: Re: PSQL 9.5 select for update locks too many rows when using numeric 
instead of int

Uhm... maybe I misinterpreted the results.

Looking more closely, the root cause seems to be that the query planner is not 
using the index and resorts to a seq scan instead.

OK... that makes more sense.

Sorry for the bogus email.

Igor

From: 
pgsql-general-ow...@postgresql.org 
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Sfiligoi, Igor
Sent: Thursday, February 02, 2017 1:22 PM
To: pgsql-general@postgresql.org
Subject: -EXT-[GENERAL] PSQL 9.5 select for update locks too many rows when 
using numeric instead of int

Dear PSQL team.

I just found a weird problem.

When I pass a numeric type to a select for update statement, it locks loads of 
rows, instead of a single one!
See explains below.

Is this a known bug (in 9.5)?
Any chance it was fixed in a more recent release?

Thanks,
  Igor

Note: My table has about 200M rows.
true_data_id is of type bigint.

mcatdb=> PREPARE fooplan3 (NUMERIC) AS SELECT 
DI.object_status,DR.replica_status,DR.replication_id FROM MCAT_DATA_INFO DI, 
MCAT_DATA_REPLICA DR WHERE DI.true_data_id = DR.data_id AND DI.true_data_id = 
$1 FOR UPDATE;

mcatdb=> explain analyze EXECUTE fooplan3(209390104);
 QUERY PLAN
----------------------------------------------------------------------------------------------------
LockRows  (cost=0.57..16852579.49 rows=1036721 width=32) (actual 
time=233942.206..254040.547 rows=1 loops=1)
   ->  Nested Loop  (cost=0.57..16842212.28 rows=1036721 width=32) (actual 
time=233942.171..254040.505 rows=1 loops=1)
 ->  Seq Scan on mcat_data_info di  (cost=0.00..9867006.22 rows=1037098 
width=22) (actual time=233942.109..254040.419 rows=1 loops=1)
   Filter: ((true_data_id)::numeric = '209390104'::numeric)
   Rows Removed by Filter: 207368796
 ->  Index Scan using pkey_data_replica on mcat_data_replica dr  
(cost=0.57..6.72 rows=1 width=26) (actual time=0.047..0.052 rows=1 loops=1)
   Index Cond: (data_id = di.true_data_id)
Execution time: 254040.632 ms

mcatdb=> PREPARE fooplan4 (INT) AS SELECT 
DI.object_status,DR.replica_status,DR.replication_id FROM MCAT_DATA_INFO DI, 
MCAT_DATA_REPLICA DR WHERE DI.true_data_id = DR.data_id AND DI.true_data_id = 
$1 FOR UPDATE;
mcatdb=> explain analyze EXECUTE fooplan4(209390104);
 QUERY PLAN
----------------------------------------------------------------------------------------------------
LockRows  (cost=1.14..17.20 rows=1 width=32) (actual time=0.307..0.318 rows=1 
loops=1)
   ->  Nested Loop  (cost=1.14..17.19 rows=1 width=32) (actual 
time=0.232..0.243 rows=1 loops=1)
 ->  Index Scan using idx_0_data_info on mcat_data_info di  
(cost=0.57..8.59 rows=1 width=22) (actual time=0.193..0.197 rows=1 loops=1)
   Index Cond: (true_data_id = 209390104)
 ->  Index Scan using pkey_data_replica on mcat_data_replica dr  
(cost=0.57..8.59 rows=1 width=26) (actual time=0.032..0.039 rows=1 loops=1)
   Index Cond: (data_id = 209390104)
Execution time: 0.420 ms

mcatdb=> PREPARE fooplan5 (BIGINT) AS SELECT 
DI.object_status,DR.replica_status,DR.replication_id FROM MCAT_DATA_INFO DI, 
MCAT_DATA_REPLICA DR WHERE DI.true_data_id = DR.data_id AND DI.true_data_id = 
$1 FOR UPDATE;
mcatdb=> explain analyze EXECUTE fooplan5(209390104);
 QUERY PLAN
----------------------------------------------------------------------------------------------------
LockRows  (cost=1.14..17.20 rows=1 width=32) (actual time=0.316..0.347 rows=1 
loops=1)
   ->  Nested Loop  (cost=1.14..17.19 rows=1 width=32) (actual 
time=0.252..0.283 rows=1 loops=1)
 ->  Index Scan using idx_0_data_info on mcat_data_info di  
(cost=0.57..8.59 rows=1 width=22) (actual time=0.042..0.059 rows=1 loops=1)
   Index Cond: (true_data_id = '209390104'::bigint)
 ->  Index Scan using pkey_data_replica on mcat_data_replica dr  
(cost=0.57..8.59 rows=1 width=26) (actual time=0.199..0.212 rows=1 loops=1)
   Index Cond: (data_id = '209390104'::bigint)
Execution time: 0.443 ms
(7 rows)



[GENERAL] view dependent on system view caused an upgrade to fail

2017-01-20 Thread Hu, Patricia
I have the following function and view in my db: 

create or replace function ${catalogSchema}.fn_show_pg_stat_activity()
  returns setof pg_catalog.pg_stat_activity as $$
    select * from pg_catalog.pg_stat_activity;
  $$ language sql volatile security definer;

create or replace view ${catalogSchema}.pg_stat_activity as
  select * from ${catalogSchema}.fn_show_pg_stat_activity();

During an upgrade from 9.5.4 to 9.6.1, the upgrade failed with the error 
message below; obviously the structure of pg_catalog.pg_stat_activity changed 
between these two versions. 

pg_restore: creating VIEW "rcmmaster.pg_stat_activity"
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 205; 1259 1995821 VIEW pg_stat_activity rcm_master_user
pg_restore: [archiver (db)] could not execute query: ERROR:  column reference "query" is ambiguous

My question: it seems that pg_dump and pg_restore, during pg_upgrade, used a 
hard-coded record/column set from the old version of pg_stat_activity, instead 
of simply recompiling the function and then the view after the system catalog 
upgrade so that they would dynamically pick up the new column list. Could this 
be considered a bug/defect? Is there any way to work around it without a 
PostgreSQL fix?  
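
One possible workaround without a PostgreSQL fix is to drop the dependent view 
(and the function, if needed) before pg_upgrade and recreate them afterwards, 
so they are rebuilt against the new catalog definition (a sketch):

    -- before pg_upgrade
    DROP VIEW ${catalogSchema}.pg_stat_activity;
    DROP FUNCTION ${catalogSchema}.fn_show_pg_stat_activity();
    -- after pg_upgrade, re-run the two CREATE OR REPLACE statements above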

Thanks,
Patricia






[GENERAL] What is the best thing to do with PUBLIC schema in Postgresql database

2016-11-04 Thread Hu, Patricia
It could potentially be a security loophole. So far, the actions taken to 
address it fall into these two categories:

1. Drop the PUBLIC schema altogether. One concern is that some system objects 
previously exposed through the PUBLIC schema, e.g. pg_stat_statements, will now 
need explicit grants to remain accessible to users.
2. Keep the PUBLIC schema but revoke all privileges on it from the public role, 
then grant them back as the need comes up (see the sketch below).
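
The second approach roughly amounts to the following (a sketch; app_role is an 
illustrative role name):

    REVOKE ALL ON SCHEMA public FROM PUBLIC;
    GRANT USAGE ON SCHEMA public TO app_role;
    -- then grant on individual objects as the need comes up, e.g.:
    GRANT SELECT ON public.pg_stat_statements TO app_role;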

Any feedback and lessons from those who have implemented this? 





Re: [GENERAL] Predicting query runtime

2016-09-13 Thread Hu, Patricia
I’ve been looking for this on Postgres too.  Does Postgres have something 
similar to Oracle’s v$session_longops? It gives info on the total units of 
work, units done so far, last update time, time remaining, etc., and I found it 
valuable for estimating how long a certain query would keep running and 
deciding whether or not to kill it. Should this be relatively easy to implement 
in Postgres too, if it is not available yet?
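
The closest thing available out of the box seems to be elapsed time rather than 
remaining time, e.g. (a sketch):

    SELECT pid, now() - query_start AS runtime, state, query
      FROM pg_stat_activity
     WHERE state = 'active'
     ORDER BY runtime DESC;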


Thanks,
Patricia

From: Oleksandr Shulgin [mailto:oleksandr.shul...@zalando.de]
Sent: Monday, September 12, 2016 11:08 AM
To: Vinicius Segalin
Cc: pgsql general
Subject: Re: Predicting query runtime

On Mon, Sep 12, 2016 at 4:03 PM, Vinicius Segalin wrote:
Hi everyone,

I'm trying to find a way to predict query runtime (I don't need to be extremely 
precise). I've been reading some papers about it, and people are using machine 
learning to do so. For the feature vector, they use what the DBMS's query 
planner provide, such as operators and their cost. The thing is that I haven't 
found any work using PostgreSQL, so I'm struggling to adapt it.
My question is whether anyone is aware of work that uses machine learning and 
PostgreSQL to predict query runtime, or maybe some other method of doing this.

Hi,

I'm not aware of machine-learning techniques to achieve that (and I don't 
actually believe it's feasible), but you might find this extension 
particularly useful: 
https://www.postgresql.org/docs/9.5/static/pgstatstatements.html
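
For example, it records per-statement statistics from which an average 
historical runtime can be read (a sketch; the extension must be installed and 
preloaded via shared_preload_libraries):

    SELECT query, calls, total_time / calls AS avg_ms
      FROM pg_stat_statements
     ORDER BY avg_ms DESC
     LIMIT 10;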

Can you share some links to the papers you are referring to (assuming these are 
publicly available)?

Regards,
--
Alex

