Re: Idle sessions keep deleted files open

2020-03-29 Thread Andres Freund
Hi,

On 2020-03-30 00:49:13 +0200, Peter J. Holzer wrote:
> I noticed this with PostgreSQL 11.7 on Linux:
> 
> I did a «vacuum full» on a table which had become quite bloated (size
> before the vacuum: 392 GB, after: 291 GB). I expected the disk usage to
> increase as the new table was written and then drop sharply after the
> vacuum was finished.
> 
> However, that didn't happen. While usage increased by about 300 GB
> during the vacuum, it stayed the same afterwards (as reported by «df»).
> 
> However, «du» reported about 400 GB less than «df». So apparently, files
> had been unlinked but were still in use by some process.
> 
> «lsof» confirmed this: There were some idle sessions which were keeping
> the files open. As soon as I got those sessions to execute some request,
> they closed the files, causing the disk space to be freed.
> 
> I think I noticed that before when dropping tables, but didn't draw the
> right conclusion at the time.
> 
> So, if you are reorganizing large tables, keep an eye on idle sessions -
> they may keep deleted files around for quite some time.

I think this is partially related to
https://postgr.es/m/20181003223725.elcu3t44fpd4lm56%40alap3.anarazel.de

Greetings,

Andres Freund




Re: Ident authentication failed

2020-03-29 Thread Peter J. Holzer
On 2020-03-26 21:33:51 +0100, Christoph Moench-Tegeder wrote:
> ## Ted To (t...@theo.to):
> > Thank you -- I added two lines to the hba file to allow for ident
> > authentication, restarted postgres and still the same errors.
> 
> You probably don't want "ident" authentication - that's the thing
> with "identd" (see RfC 1413),

Depends on the type of socket. For unix sockets it's the same as peer
authentication.

> which even 20 years ago was only used in connection with IRC, but not
> really for "authentication".

I don't think there are any linux distributions which install identd by
default any more, so it wouldn't work anyway.


> You'd rather want "scram-sha-256" as the authentication method (hoping
> that whatever-go-uses can do that - come on, we got that in version 10),
> or (if that's not possible) "md5".

Actually, for local connections I prefer "peer". I'm already
authenticated by the OS, no need for a (second) password.

I should add that you shouldn't use "trust" unless 
* no connection from other hosts is allowed, and
* all users on this host should have full access anyway.
This is rarely the case.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Idle sessions keep deleted files open

2020-03-29 Thread Peter J. Holzer
I noticed this with PostgreSQL 11.7 on Linux:

I did a «vacuum full» on a table which had become quite bloated (size
before the vacuum: 392 GB, after: 291 GB). I expected the disk usage to
increase as the new table was written and then drop sharply after the
vacuum was finished.

However, that didn't happen. While usage increased by about 300 GB
during the vacuum, it stayed the same afterwards (as reported by «df»).

However, «du» reported about 400 GB less than «df». So apparently, files
had been unlinked but were still in use by some process.

«lsof» confirmed this: There were some idle sessions which were keeping
the files open. As soon as I got those sessions to execute some request,
they closed the files, causing the disk space to be freed.

I think I noticed that before when dropping tables, but didn't draw the
right conclusion at the time.

So, if you are reorganizing large tables, keep an eye on idle sessions -
they may keep deleted files around for quite some time.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore

2020-03-29 Thread Andrus

Hi!

Same warning appears two times. This command execute by pg_restore 
probably causes this (harmless?)  warning:

What warning?


pg_restore: WARNING:  could not determine encoding for locale "et_EE.UTF-8":
codeset is "CPUTF-8"

I cranked up a Windows 7 instance and tried to migrate a Postgres 11 
database from Ubuntu and it failed on the CREATE DATABASE step because of 
this line in the dump file:
CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8' 
LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8';


I ran this statemnt it in Windows 10 with Postgres 12 successfully. Result 
was:


WARNING:  could not determine encoding for locale "en_US.UTF-8": codeset is 
"CPUTF-8"
WARNING:  could not determine encoding for locale "en_US.UTF-8": codeset is 
"CPUTF-8"

CREATE DATABASE

Query returned successfully in 1 secs 75 msec.

redmine database was created. I dont understand why it failed in your test.


When I manually changed it in the plain text version of the dump file to:
CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8' 
LC_COLLATE = 'English_United States.1252' LC_CTYPE = 'English_United 
States.1252';


I verifed that data was restored using pg_restore without manually changing 
anything.


Andrus.





Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore

2020-03-29 Thread Adrian Klaver

On 3/29/20 11:12 AM, Andrus wrote:

Hi!


There seems to a difference of opinion of what Baltic Code Page to use:
https://en.wikipedia.org/wiki/Code_page_775
https://en.wikipedia.org/wiki/Windows-1257
The post below shows a users method of dealing with this for another CP:
https://www.postgresql.org/message-id/549275CC.4010607%40gmail.com


Console code page 775 message appears since psql is console application 
running from command line.


It does not have any relation to pg_dump/pg_restore issue since console 
code page is not used in this case.


There is Estonian locale everywhere.
Maybe this warning is harmless since Linux code page is ignored and 
default collation is used.
All table definitions in restored database contain references to default 
collation:


CREATE TABLE firma1.acquirpo
(
    kassanr numeric(3,0) NOT NULL,
    policyid character(2) COLLATE pg_catalog."default" NOT NULL,
    trantype character(6) COLLATE pg_catalog."default",
    tacdefault character(10) COLLATE pg_catalog."default",
    tacdenial character(10) COLLATE pg_catalog."default",
    taconline character(10) COLLATE pg_catalog."default",
    floorlimit numeric(12,0),
    randselthr numeric(12,0),
    minrandper numeric(2,0),
    maxrandper numeric(2,0),
    CONSTRAINT acquirpo_pkey PRIMARY KEY (kassanr, policyid)
)

TABLESPACE pg_default;

Same warning appears two times. This command execute by pg_restore 
probably causes this (harmless?)  warning:


What warning?

I cranked up a Windows 7 instance and tried to migrate a Postgres 11 
database from Ubuntu and it failed on the CREATE DATABASE step because 
of this line in the dump file:


CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8' 
LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8';


When I manually changed it in the plain text version of the dump file to:

CREATE DATABASE redmine WITH TEMPLATE = template0 ENCODING = 'UTF8' 
LC_COLLATE = 'English_United States.1252' LC_CTYPE = 'English_United 
States.1252';


borrowing from Tom's suggestion here:

https://www.postgresql.org/message-id/22672.1585442578%40sss.pgh.pa.us

it worked.

So basically what you see below.

If you followed Tom's suggestion fully you could restore into a manually 
created database on the Windows side by dropping the --create and 
pointing -d at the previously created database.




CREATE DATABASE mydb
    WITH
    OWNER = mydb_owner
    ENCODING = 'UTF8'
    LC_COLLATE = 'et_EE.UTF-8'
    LC_CTYPE = 'et_EE.UTF-8'
    TABLESPACE = pg_default
    CONNECTION LIMIT = -1;

If new database is created manually in windows cluster the following 
command is generated:



CREATE DATABASE mydbmanually
    WITH
    OWNER = mydbmanually_owner
    ENCODING = 'UTF8'
    LC_COLLATE = 'Estonian_Estonia.1257'
    LC_CTYPE = 'Estonian_Estonia.1257'
    TABLESPACE = pg_default
    CONNECTION LIMIT = -1;

Andrus.




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Hot standby from Debian to Windows

2020-03-29 Thread Adrian Klaver

On 3/29/20 12:52 PM, Andrus wrote:

Hi!

Since you are moving between different OSes you will need to use some 
form of logical replication as binary replication will not work.


I can use Hyper-V or something other to run Debian with Windows.

This hopefully will also allow to bypass Windows 20 connection limit so 
that more than 20 users can connect.


Given  that you are Postgres 12 you could use the builtin logical 
replication:

https://www.postgresql.org/docs/12/logical-replication.html


I see possibilities:

1. Use Hyper-V to run Debian container and Postgres binary replication
2. Use Hyper-V to run Debian container and Postgres logical replication
3. Use Windows and Postgres logical replication.
4. Use Hyper-V to run Debian container and backup utility  for whole 
disk block level backup



How that needs to managed is going to need more information. As a start:
1) Downtime allowed?


Yes. If main server stops, I can tell users to enter backup server 
address instead of main server.



2) All databases to be replicated or just some?


I can create separate cluster so that all databases and users will be 
replicated.
There are 30 databases with total size 70 GB. Size will increase slowly 
every day when new data is entered.
There are some test and demo databases whose replcation is not really 
required but those can also replicated if this



3) Permissible lag between servers?


Currently backups are created every night and restored in new server.
Backup of 67GB data takes 1 hour, transfer 1 hour, restore and analyze 
to new server 4 hours. Total 6 hours. So current lag in 6 .. 24 hours.


Goal is to decrease this lag.

4) How are you going to deal with the down server and how do you plan 
on bringing it up again?


VPS hosting company will bring it up again. I will then manually 
synchronize two clusters when users continue to enter data, this is not 
time critical.


Would it not be easier to just set up another Debian server, run binary 
replication and put them behind something like pgpool?




Andrus.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Hot standby from Debian to Windows

2020-03-29 Thread Andrus

Hi!

Since you are moving between different OSes you will need to use some form 
of logical replication as binary replication will not work.


I can use Hyper-V or something other to run Debian with Windows.

This hopefully will also allow to bypass Windows 20 connection limit so that 
more than 20 users can connect.


Given  that you are Postgres 12 you could use the builtin logical 
replication:

https://www.postgresql.org/docs/12/logical-replication.html


I see possibilities:

1. Use Hyper-V to run Debian container and Postgres binary replication
2. Use Hyper-V to run Debian container and Postgres logical replication
3. Use Windows and Postgres logical replication.
4. Use Hyper-V to run Debian container and backup utility  for whole disk 
block level backup



How that needs to managed is going to need more information. As a start:
1) Downtime allowed?


Yes. If main server stops, I can tell users to enter backup server address 
instead of main server.



2) All databases to be replicated or just some?


I can create separate cluster so that all databases and users will be 
replicated.
There are 30 databases with total size 70 GB. Size will increase slowly 
every day when new data is entered.
There are some test and demo databases whose replcation is not really 
required but those can also replicated if this



3) Permissible lag between servers?


Currently backups are created every night and restored in new server.
Backup of 67GB data takes 1 hour, transfer 1 hour, restore and analyze to 
new server 4 hours. Total 6 hours. So current lag in 6 .. 24 hours.


Goal is to decrease this lag.

4) How are you going to deal with the down server and how do you plan on 
bringing it up again?


VPS hosting company will bring it up again. I will then manually synchronize 
two clusters when users continue to enter data, this is not time critical.


Andrus. 






Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore

2020-03-29 Thread Andrus

Hi!


Per my previous post, you might try adding something like:
cmd.exe /c chcp 1257
to the top of the batch file. This idea came from here:
https://stackoverflow.com/questions/20794035/postgresql-warning-console-code-page-437-differs-from-windows-code-page-125
If I am following the error correctly then the issue is that the Postgres 
console programs are using CP755 and that is not something for which there 
is an automatic conversion:

https://www.postgresql.org/docs/12/multibyte.html#id-1.6.10.5.7
There is a conversion for 1257 clients so having your console run as 1257 
should solve the problem. Someone with more current experience on Windows 
will need to comment on whether that is the viable or best solution.


Both servers have UTF-8 encoding.

Non-unicode code page 755 referes only to command line applications like 
psql.


Postgres service, pg_dump and pg_restore do not use console codepages any 
way, they operate using only UTF-8 character set since both databases are in 
UTF-8


I think console code page warning message is not related to this issue.

Andrus.





Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore

2020-03-29 Thread Andrus

Hi!


There seems to a difference of opinion of what Baltic Code Page to use:
https://en.wikipedia.org/wiki/Code_page_775
https://en.wikipedia.org/wiki/Windows-1257
The post below shows a users method of dealing with this for another CP:
https://www.postgresql.org/message-id/549275CC.4010607%40gmail.com


Console code page 775 message appears since psql is console application 
running from command line.


It does not have any relation to pg_dump/pg_restore issue since console code 
page is not used in this case.


There is Estonian locale everywhere.
Maybe this warning is harmless since Linux code page is ignored and default 
collation is used.
All table definitions in restored database contain references to default 
collation:


CREATE TABLE firma1.acquirpo
(
   kassanr numeric(3,0) NOT NULL,
   policyid character(2) COLLATE pg_catalog."default" NOT NULL,
   trantype character(6) COLLATE pg_catalog."default",
   tacdefault character(10) COLLATE pg_catalog."default",
   tacdenial character(10) COLLATE pg_catalog."default",
   taconline character(10) COLLATE pg_catalog."default",
   floorlimit numeric(12,0),
   randselthr numeric(12,0),
   minrandper numeric(2,0),
   maxrandper numeric(2,0),
   CONSTRAINT acquirpo_pkey PRIMARY KEY (kassanr, policyid)
)

TABLESPACE pg_default;

Same warning appears two times. This command execute by pg_restore probably 
causes this (harmless?)  warning:


CREATE DATABASE mydb
   WITH
   OWNER = mydb_owner
   ENCODING = 'UTF8'
   LC_COLLATE = 'et_EE.UTF-8'
   LC_CTYPE = 'et_EE.UTF-8'
   TABLESPACE = pg_default
   CONNECTION LIMIT = -1;

If new database is created manually in windows cluster the following command 
is generated:



CREATE DATABASE mydbmanually
   WITH
   OWNER = mydbmanually_owner
   ENCODING = 'UTF8'
   LC_COLLATE = 'Estonian_Estonia.1257'
   LC_CTYPE = 'Estonian_Estonia.1257'
   TABLESPACE = pg_default
   CONNECTION LIMIT = -1;

Andrus.





Re: Hot standby from Debian to Windows

2020-03-29 Thread Adrian Klaver

On 3/29/20 5:35 AM, Andrus wrote:

Hi!

Postgres 12 server is running on Debian 10 and has number of databases.

How to mirror changes of those databases to Windows 10 workstation which 
runs also Postgres 12.


Changes in server databases in Debian should sent to Postgres 12 
database in Windows over internet.


If Debian server goes down, users can change server address to Windows 
computer as temporary workaround and continue working.


Since you are moving between different OSes you will need to use some 
form of logical replication as binary replication will not work. Given 
that you are Postgres 12 you could use the builtin logical replication:


https://www.postgresql.org/docs/12/logical-replication.html

How that needs to managed is going to need more information. As a start:

1) Downtime allowed?

2) All databases to be replicated or just some?

3) Permissible lag between servers?

4) How are you going to deal with the down server and how do you plan on 
bringing it up again?





Andrus.





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore

2020-03-29 Thread Adrian Klaver

On 3/29/20 1:02 AM, Andrus wrote:

Hi!


Generally speaking, locale names from Unix systems won't work at all on
Windows.  You need to create the database manually with whatever seems
to be the closest Windows locale match, and then restore its contents
without using --create.


This is unattended script running in every night from .bat file.


Per my previous post, you might try adding something like:

cmd.exe /c chcp 1257

to the top of the batch file. This idea came from here:

https://stackoverflow.com/questions/20794035/postgresql-warning-console-code-page-437-differs-from-windows-code-page-125



How to replace pg_restore --create  option with   psql and/or createdb 
calls

and specify proper locale for them ?

Currently everthing has "Estonian_Estonia.1257"  locale in windows.
Which locale name should be specified in Windows instead of this?
Or maybe creating new template with proper encoding or changing 
template0 encoding helps?


If I am following the error correctly then the issue is that the 
Postgres console programs are using CP755 and that is not something for 
which there is an automatic conversion:


https://www.postgresql.org/docs/12/multibyte.html#id-1.6.10.5.7

There is a conversion for 1257 clients so having your console run as 
1257 should solve the problem. Someone with more current experience on 
Windows will need to comment on whether that is the viable or best solution.




Andrus.





--
Adrian Klaver
adrian.kla...@aklaver.com




Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore

2020-03-29 Thread Adrian Klaver

On 3/29/20 12:56 AM, Andrus wrote:

Hi!


System Locale: et;Eesti
Input Locale:  et;Eesti
Hmm, I was expecting to see et_EE though I will admit to not truly 
understanding how Windows does locales.
I should have asked earlier, in the Postgres instance on Windows what 
does \l show for template0?



I rarely use Windows anymore so take the below with that in mind.



"D:\Program Files\PostgreSQL\12\bin\psql"  postgres postgres

psql (12.2)
WARNING: Console code page (775) differs from Windows code page (1257)
     8-bit characters might not work correctly. See psql reference
     page "Notes for Windows users" for details.


There seems to a difference of opinion of what Baltic Code Page to use:

https://en.wikipedia.org/wiki/Code_page_775

https://en.wikipedia.org/wiki/Windows-1257

The post below shows a users method of dealing with this for another CP:

https://www.postgresql.org/message-id/549275CC.4010607%40gmail.com





Type "help" for help.

postgres=# \l template0
    List of databases
   Name    |  Owner   | Encoding |    Collate    | Ctype 
|   Access privileges
---+--+--+---+---+--- 

template0 | postgres | UTF8 | Estonian_Estonia.1257 | 
Estonian_Estonia.1257 | =c/postgres  +
   |  |  |   | | 


I'm guessing it is picking up Estonian_Estonia.1257 from the system.

The Windows Postgres instance was installed from the EDB installer?


postgres=CTc/postgres
(1 row)

Andrus.



--
Adrian Klaver
adrian.kla...@aklaver.com




How to get char indices from text parser | tsvector?

2020-03-29 Thread Roman Gurinovich
Hi all,

is there any possibility to get char "coordinates", not only token
"coordinates" from tsvector or, basically, text parser? Like in
Elasticsearch


GET _analyze?text=This fox is brown=english
{
"tokens": [
{
  "token": "fox",
  "start_offset": 5,
  "end_offset": 8,
  "type": "",
  "position": 1
}

while in PostgreSQL

SELECT to_tsvector('This fox is brown')
'brown':4 'fox':2 'is':3 'this':1



-- 

Best Regards,
Roman Gurinovich


Hot standby from Debian to Windows

2020-03-29 Thread Andrus

Hi!

Postgres 12 server is running on Debian 10 and has number of databases.

How to mirror changes of those databases to Windows 10 workstation which 
runs also Postgres 12.


Changes in server databases in Debian should sent to Postgres 12 database in 
Windows over internet.


If Debian server goes down, users can change server address to Windows 
computer as temporary workaround and continue working.


Andrus. 






Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore

2020-03-29 Thread Andrus

Hi!


System Locale: et;Eesti
Input Locale:  et;Eesti
Hmm, I was expecting to see et_EE though I will admit to not truly 
understanding how Windows does locales.
I should have asked earlier, in the Postgres instance on Windows what does 
\l show for template0?


"D:\Program Files\PostgreSQL\12\bin\psql"  postgres postgres

psql (12.2)
WARNING: Console code page (775) differs from Windows code page (1257)
8-bit characters might not work correctly. See psql reference
page "Notes for Windows users" for details.
Type "help" for help.

postgres=# \l template0
   List of databases
  Name|  Owner   | Encoding |Collate| Ctype 
|   Access privileges

---+--+--+---+---+---
template0 | postgres | UTF8 | Estonian_Estonia.1257 | 
Estonian_Estonia.1257 | =c/postgres  +
  |  |  |   | 
| postgres=CTc/postgres

(1 row)

Andrus. 






Re: could not determine encoding for locale "et_EE.UTF-8": codeset is "CPUTF-8" in pg_restore

2020-03-29 Thread Andrus

Hi!


Generally speaking, locale names from Unix systems won't work at all on
Windows.  You need to create the database manually with whatever seems
to be the closest Windows locale match, and then restore its contents
without using --create.


This is unattended script running in every night from .bat file.

How to replace pg_restore --create  option with   psql and/or createdb calls
and specify proper locale for them ?

Currently everthing has "Estonian_Estonia.1257"  locale in windows.
Which locale name should be specified in Windows instead of this?
Or maybe creating new template with proper encoding or changing template0 
encoding helps?


Andrus. 






Re: PG12 autovac issues

2020-03-29 Thread Michael Paquier
On Sat, Mar 28, 2020 at 11:29:41AM -0700, Andres Freund wrote:
> I assume you're still trying to track the actual cause of the problem
> further?

That's the plan, and I'll try to spend some time on it next week.  Any
new information I have will be added to the thread you have begun on
-hackers a couple of days ago.
--
Michael


signature.asc
Description: PGP signature