Re: easy way to acquire height / width from images (PNG, JPEG) stored as bytea?

2020-04-17 Thread Steve Atkins



On 17/04/2020 13:37, Achilleas Mantzios wrote:

Hello Dear List,

we have a table holding email attachments as bytea, and we would like 
to filter out images of small dimensions, which are not of any value 
to our logic.


I took a look at the pg_image extension and tested it, and it proved 
problematic: it killed my FreeBSD box with 200+ days of uptime :( . I dropped 
the extension and uninstalled it as soon as fsck finally finished.


If running an extension crashed your server you should look at how / 
why, especially if it corrupted your filesystem.


That shouldn't happen on a correctly configured system, so the 
underlying issue might cause you other problems. Crashing postgresql, 
sure, but not anything that impacts the rest of the server.
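
For what it's worth, if most of the attachments are PNGs you may not need 
an extension at all - the width and height sit at fixed offsets in the 
IHDR chunk, so plain SQL can pull them out. A rough sketch (table and 
column names are made up, and it assumes well-formed PNG data; JPEGs need 
real parsing):

select id,
       (get_byte(img, 16)::bigint << 24) + (get_byte(img, 17) << 16) +
       (get_byte(img, 18) << 8) + get_byte(img, 19) as width,
       (get_byte(img, 20)::bigint << 24) + (get_byte(img, 21) << 16) +
       (get_byte(img, 22) << 8) + get_byte(img, 23) as height
from attachments
where substring(img from 1 for 8) = '\x89504e470d0a1a0a'::bytea;  -- PNG signature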


Cheers,
  Steve





Re: Using unlogged tables for web sessions

2020-04-17 Thread Steve Atkins



On 16/04/2020 19:39, Stephen Carboni wrote:

Hello.

I was wondering if anyone was using unlogged tables for website
sessions in production. I'm interested if it breaks the prevailing
opinion that you don't put sessions in PG.


I generally put sessions in postgresql, with regular tables, when I'm 
using persistent server-side sessions, rather than just stashing all the 
user data in a signed cookie or local storage.


It ... works fine? It drastically simplifies app deployment to have a 
single point of persistent storage, and having one that you can easily 
interrogate by hand simplifies development and debugging. Reads are a 
single indexed query, writes are typically HOT, so the IO and overhead 
aren't drastically different from any other persistent store.
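
For what it's worth, the table involved is nothing fancy. A minimal sketch 
(names and the fillfactor are just illustrative; the lowered fillfactor 
leaves room on each page so updates that don't touch indexed columns can 
stay HOT):

create table sessions (
    id         text primary key,
    data       jsonb not null default '{}',
    expires_at timestamptz not null
) with (fillfactor = 70);

create index on sessions (expires_at);

-- reads are a single indexed lookup:
select data from sessions where id = 'abc123';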


A lot of webapp development advice is based on "Ah, but how will you 
scale it up to Facebook levels of traffic?" and then skipping over the 
answer "Um, I won't ever need to. And if I did I'd have enough revenue 
to hire someone very, very good to rearchitect my session storage.".


Cheers,
  Steve






Re: Rules versus triggers

2020-03-07 Thread Steve Atkins



On 07/03/2020 11:56, stan wrote:

Could someone give me a brief description of the intended functionality of
rules versus triggers, and how the two features work? It appears to me that
they are simply two different ways to set up triggers, but I am certain that
is just because of my lack of knowledge.


The pages that 
https://wiki.postgresql.org/wiki/Don%27t_Do_This#Don.27t_use_rules links 
to will give you some idea of the differences.


Cheers,
  Steve





Re: Fwd: sensible configuration of max_connections

2020-02-07 Thread Steve Atkins


On 07/02/2020 13:18, Chris Withers wrote:

On 07/02/2020 12:49, Chris Ellis wrote:

What's "too much" for max_connections? What happens when you set it too
high? What factors affect that number?


When sizing max_connections you need to trade off how many 
connections your application will use at peak vs how much RAM and CPU 
you have.


Each connection is capable of allocating work_mem and has a stack etc.

As such you don't want max_connections to be able to run your system 
out of RAM.
Sure, but that's where I'm trying to find out what's sensible. The box 
has 196GB memory, most of that in hugepages, 18 core Intel Skylake 
with HT on giving 36 cores and tonnes of SSD for storage. How would I 
turn that spec into a sensible number for max_connections? As that 
number grows, what contention points in postgres will start creaking 
(shared memory where the IPC happens?)


The max_connections setting is an upper limit after which postgresql 
will reject connections. You don't really want to hit that limit; rather 
you want to keep the number of concurrent connections to a reasonable 
number (and have max_connections somewhere above that).


Each connection is a postgresql process, so active connections are 
competing for resources and even idle connections take up some RAM. 
Creating a new connection is launching a new process (and doing some 
setup) so it's relatively expensive.


Doing some sort of connection pooling is a good idea, especially for 
web-apps that connect, do a few short queries and disconnect. Django is 
probably doing a passable job at pooling already, so you might want to 
see how many connections it's using under normal load. Adding a 
dedicated pooler in between Django and PostgreSQL would give you more 
flexibility and might be a good idea, but if what Django is already 
doing is reasonable you may not need it.
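
A quick way to see what's actually connected, and whether those 
connections are doing anything (a sketch; run it as a role that can see 
other sessions):

select usename, state, count(*)
from pg_stat_activity
group by usename, state
order by count(*) desc;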


What's a good number of active connections to aim for? It probably 
depends on whether they tend to be CPU-bound or IO-bound, but I've seen 
the rule of thumb of "around twice the number of CPU cores" tossed 
around, and it's probably a decent place to start. Then run it under 
normal load and see how it behaves - CPU usage, RAM, IO, request latency 
and throughput.


Cheers,
  Steve



Re: slow insert speeds with bytea

2019-12-02 Thread Steve Atkins



On 01/12/2019 17:59, Alex O'Ree wrote:
Is there anything I can do to increase insert speeds for bytea? Currently 
running postgres 9.6.15


I have a few tables without a bytea and a few with bytea. There is a 
large performance difference with inserts between the two. I'm 
inserting a byte[] that's usually less than 1MB on content. The 
content itself is actually just utf8 string data.


For the non-bytea table, inserts can be as high as 40k rows/sec, 
whereas the bytea table is closer to 4k/sec or less.


4k 1MB rows/sec would be 4GB a second. It would need to be a fairly 
decent IO system to manage that speed, let alone ten times that.


What's the typical row size of the tables with bytea fields vs the other 
tables?
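
If it's not obvious how to check, something like this gives a rough 
average row size to compare (a sketch; substitute your table name):

select count(*) as row_count,
       pg_size_pretty(pg_total_relation_size('mytable')) as total_size,
       pg_total_relation_size('mytable') / greatest(count(*), 1) as avg_bytes_per_row
from mytable;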


What are your IO stats during the two sorts of insert?

Cheers,
  Steve




Re: Remote Connection Help

2019-11-21 Thread Steve Atkins


On 21/11/2019 14:30, Ekaterina Amez wrote:

El 21/11/19 a las 15:21, Jason L. Amerson escribió:


I am at a loss for what to do. I have read article after article 
about how to allow remote connections on my PostgreSQL server and 
none of what the articles say to do worked for me. I have edited the 
“postgresql.conf” file and changed “listen_address = ‘localhost’” to 
“listen_address = ‘*’”.



I think that's "listen_addresses" on recent versions of postgresql.

Did you stop and restart the service after you edited the config file? 
Check the logs for errors too, maybe.



I have even tried it commented out and uncommented and I get the same 
results. I also edited the “pg_hba.conf” file and added the following 
at the end of the file:


host all all 0.0.0.0/0 md5

host all all ::/0 md5

The first line in pg_hba.conf that matches a connection will take 
effect; later lines won't.
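
i.e. ordering matters. A contrived sketch of how an earlier line can 
shadow the one you just added:

# pg_hba.conf is read top to bottom; the first matching line wins
host    all    all    192.168.1.0/24    reject    # this matches LAN clients first...
host    all    all    0.0.0.0/0         md5       # ...so they never reach this line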




For testing connection purposes I usually change md5 to trust; this 
way you won't have trouble with passwords or users.


Don't do this on a machine that's reachable from the open internet, 
ever. It's asking to get your box compromised.


My computer that is running the server is Ubuntu and it has a static 
IP. I am trying to connect remotely with computers running Windows 10 
using the static IP. When I run pgAdmin from my Windows 10 machine, 
or use the command line to connect, I get the following error:


unable to connect to server:

could not connect to server: Connection refused (0x274D/10061) Is 
the server running on host "xx.xx.xx.xx" and accepting 
TCP/IP connections on port 5432?

I would like to be able to connect to my Ubuntu PostgreSQL server 
from all Windows 10 machines as well as from a client on my Android 
phone while away from home. That is my goal now. I am hoping that 
someone can help me to get this working. This is been very frustrating.


Can you connect to your server on its external address at all? i.e. if 
its external IP address is 10.11.12.13, can you run "psql -h 
10.11.12.13" on your ubuntu box and connect / log in?


If you can then postgresql is configured correctly and you can focus on 
where the issue on the network is.


If not, then the problem is the local machine, either postgresql 
configuration or _maybe_ local network configuration silliness.


Cheers,
  Steve




Re: jsonb_set() strictness considered harmful to data

2019-10-21 Thread Steve Atkins


On 21/10/2019 17:39, Steven Pousty wrote:
 Turning a JSON null into a SQL null  and thereby "deleting" the data 
is not the path of least surprises.


In what situation does that happen? (If it's already been mentioned I 
missed it, long thread, sorry).


Cheers,
  Steve




Re: jsonb_set() strictness considered harmful to data

2019-10-20 Thread Steve Atkins



On 19/10/2019 07:52, Ariadne Conill wrote:


I would say that any thing like

update whatever set column=jsonb_set(column, '{foo}', NULL)

should throw an exception.  It should do, literally, *anything* else
but blank that column.


steve=# create table foo (bar jsonb not null);
CREATE TABLE
steve=# insert into foo (bar) values ('{"a":"b"}');
INSERT 0 1
steve=# update foo set bar = jsonb_set(bar, '{foo}', NULL);
ERROR:  null value in column "bar" violates not-null constraint
DETAIL:  Failing row contains (null).
steve=# update foo set bar = jsonb_set(bar, '{foo}', 'null'::jsonb);
UPDATE 1

I don't see any behaviour that's particularly surprising there? Though I 
understand how an app developer who's light on SQL might get it wrong - 
and I've made similar mistakes in schema upgrade scripts without 
involving jsonb.
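
If the value being spliced in might be SQL NULL and you want a JSON null 
rather than a nulled-out column, the usual defensive pattern is something 
like this (a sketch, reusing the table above):

update foo
set bar = jsonb_set(bar, '{foo}',
                    coalesce(to_jsonb(null::text), 'null'::jsonb));
-- to_jsonb of a SQL NULL is SQL NULL, so coalesce supplies a JSON null instead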


Cheers,
  Steve





Re: PostgreSQL License

2019-09-19 Thread Steve Atkins



On 19/09/2019 13:48, Steve Litt wrote:

My understanding is the PostgreSQL license is more like the MIT
license, which actually allows one to modify the code and claim it as
proprietary.

You could do that, yes. :)

https://wiki.postgresql.org/wiki/PostgreSQL_derived_databases

Cheers,
  Steve




Re: Work hours?

2019-08-28 Thread Steve Atkins



> On Aug 27, 2019, at 11:27 PM, stan  wrote:
> 
> I am just starting to explore the power of PostgreSQL's time and date
> functionality. I must say they seem very powerful.
> 
> I need to write a function that, given a month and a year as input, returns
> the "work hours" in that month. In other words something like
> 
> 8 * (the count of all days in the range Monday to Friday) within that
> calendar month.
> 
> Any thoughts as to the best way to approach this?

You might find this useful: 
https://gist.github.com/wttw/b6f5d0d67c31d499c05f22a4f2b6f628

It's not the most efficient approach, but it's relatively simple to customize.
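
(Not necessarily what that gist does, but the core of the calculation is 
just counting weekdays - a quick sketch, hard-coded to August 2019 and 
assuming 8 hours per weekday; isodow 1-5 is Monday-Friday:)

select 8 * count(*) as work_hours
from generate_series(make_date(2019, 8, 1),
                     make_date(2019, 8, 1) + interval '1 month' - interval '1 day',
                     interval '1 day') as d
where extract(isodow from d) < 6;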

Cheers,
  Steve





Re: Importing from CSV, auto creating table?

2019-08-23 Thread Steve Atkins

On 21/08/2019 22:15, stan wrote:

I have a situation where we need to import data, as an interim measure,
from spreadsheets.

I have read up on \copy and COPY, but I do not see that either of these can
use the header from a CSV file to define a new table. Am I missing
something?

Also downloaded something called pgfutter, which I thought would do this,
but have not had any success with it. After I thought I had figured out
the arguments, it just seems to hang forever.

You might find https://github.com/wttw/csvpg useful. It creates tables 
with column names based on the CSV header, and data types intuited from 
the data.


(The only pre-built binary there is for Mac right now; I should at least 
add one for Windows).
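
If you end up doing it by hand instead, COPY won't create the table for 
you, so it's the usual two steps - roughly this, with made-up column 
names and types:

create table import_demo (name text, quantity integer, price numeric);
\copy import_demo from 'data.csv' with (format csv, header true)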


Cheers,
  Steve






Re: DRY up GUI wiki pages

2019-07-10 Thread Steve Atkins



> On Jul 10, 2019, at 7:38 PM, Bruce Momjian  wrote:
> 
> On Wed, Jul 10, 2019 at 09:08:30AM -0600, Roger Pack wrote:
>> I learned today there are "two" wiki pages for GUI clients:
>> 
>> https://wiki.postgresql.org/wiki/PostgreSQL_Clients
>> 
>> https://wiki.postgresql.org/wiki/Community_Guide_to_PostgreSQL_GUI_Tools
>> 
>> I'd like to DRY them up so there aren't two lists which confuses
>> newcomers.  Any objections?  If not I'll probably make one of those
>> pages into GUI's and one into "non GUI's" or something like that.

I created the first one because the second one was full of old, stale, useless 
things. I believe that everything valid on the second one was added to the 
first one at the time.

Also look at https://wiki.postgresql.org/wiki/Design_Tools and 
https://wiki.postgresql.org/wiki/Documentation_Tools if you're looking to 
combine / divide things.

Cheers,
  Steve

> 
> Agreed, a cleanup would be nice.  :-)
> 
> -- 
>  Bruce Momjian  http://momjian.us
>  EnterpriseDB http://enterprisedb.com
> 
> + As you are, so once was I.  As I am, so you will be. +
> +  Ancient Roman grave inscription +
> 
> 





Re: Research on ?? operators

2019-06-04 Thread Steve Atkins



> On Jun 4, 2019, at 1:35 PM, Michael Lewis  wrote:
> 
> ":foo" named placeholders
> 
> If I may, is this supported natively in Postgres prepared statements?

It's not. The only type we support is numbered $1-style placeholders.

> Can I see an example? I do not care much for the numbers of positional 
> placeholders and would love to use names instead if possible.

It'd be nice. They're supported via rewriting at the driver level in some 
drivers, and I've written shims to convert them in an app a few times; it 
makes for much more readable - and bug-resistant - code.

Supporting it at the protocol level would be very nice.

Cheers,
  Steve



Re: Research on ?? operators

2019-06-04 Thread Steve Atkins



> On Jun 4, 2019, at 10:00 AM, Matteo Beccati  wrote:
> 
> Hello generals,
> 
> I'm trying to resurrect a change to the PHP PDO driver to allow using
> the "?" json operator (and other operators containing a question mark),
> without it being interpreted as a placeholder for a query argument. In
> order to do so, like Hibernate, I went for the double "??" escaping:
> 
> https://wiki.php.net/rfc/pdo_escape_placeholders
> 
> One question that I'm supposed to answer now is: is there any known
> usage in the wild of some custom "??" operator that would require funny
> escaping like ""?

I don't know of one today, but that doesn't mean there isn't or won't
be tomorrow.

> I've tried to search pgxn but couldn't find any match, so I thought it
> was worth to try and ask here.

Doubling "?" to "??" isn't an obvious, intuitive way to do that, at least
to me. Maybe it would be to people coming from Java.

Perl's DBD::Pg deals with it in two ways. One is to allow escaping
the ? with a backslash - so "?" is a placeholder, while "\?" is passed as
"?" to the database. That's more consistent with other languages, and
I think you're far less likely to ever see a backslash in a custom operator
than "??".

The other is that it supports the postgresql standard placeholders - $1,
$2 etc. - which avoid the issue and are more flexible. It also has a 
configuration option to completely ignore "?" in queries, so "$1" is a 
placeholder and "?" is passed through to the database.

(Though I like the ":foo" or "@foo" style named placeholders best)

Cheers,
  Steve





Re: schema change tracking

2019-05-16 Thread Steve Atkins



> On May 16, 2019, at 5:41 PM, Benedict Holland  
> wrote:
> 
> Hi All,
> 
> I am fairly sure this question has many answers but here it goes:
> 
> I need a tool that can track schema changes in a postgesql database, write 
> scripts to alter the tables, and store those changes in git. Are there tools 
> that exist that can do this?

If you're looking for something that'll reverse engineer schema change scripts 
from a database someone has made ad-hoc changes to ... try and avoid doing that 
if at all possible. Use the change scripts in git as the source of truth.

If you do that there are many answers, starting at "it's fairly easy to do 
yourself, with a simple schema version and upgrade / downgrade scripts". 
Depending on what language you're writing your app in there may be libraries 
that can help.
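
The roll-your-own version can be as simple as a single-row version table 
plus a directory of numbered scripts checked into git. A sketch (the 
example migration is made up):

create table if not exists schema_version (version integer not null);
insert into schema_version
    select 0 where not exists (select 1 from schema_version);

-- 0001_add_widgets.sql - one migration, one transaction:
begin;
create table widgets (id serial primary key, name text not null);
update schema_version set version = 1;
commit;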

But if you're looking for something standalone, look at https://sqitch.org

Cheers,
  Steve






Re: SQL queries not matching on certain fields

2019-04-03 Thread Steve Atkins



> On Apr 3, 2019, at 2:06 PM, Felix Ableitner  wrote:
> 
> Hello,
> 
> I'm having a very strange problem with the Postgres database for my website. 
> Some SQL queries are not matching on certain fields. I am running these 
> commands via the psql command.
> 
> Here is a query that works as expected:
> 
> # SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername"='emma';
>   id   | preferredUsername
> ---+---
>  48952 | emma
>  58672 | emma
> (2 rows)
> 
> The following query should work as well, because the username exists. But in 
> fact, it consistently returns nothing:
> 
> # SELECT id, "preferredUsername" FROM actor WHERE 
> "preferredUsername"='mailab';
>  id | preferredUsername
> +---
> 
>   (0 rows)
> 
> There are some workarounds which fix the WHERE statement, all of the 
> following work as expected:
> 
> SELECT id, "preferredUsername" FROM actor WHERE 
> trim("preferredUsername")='mailab';
> 
> 
> 
> SELECT id, "preferredUsername" FROM actor WHERE "preferredUsername" ILIKE 
> 'mailab'; 
> 
> SELECT id, "preferredUsername" FROM actor WHERE 
> md5("preferredUsername")=md5('mailab'); 
> 
> 
> Now you might think that there is something wrong with the encoding, or the 
> field contains some extra whitespace. But SHOW SERVER_ENCODING and SHOW 
> SERVER_ENCODING show UTF8. Length and md5 sum are also exactly as expected. 
> And I checked the individual bytes with get_byte(), all of them are in the 
> range 97-122.
> 
> About the setup, this is Postgres 10 running in Docker, on an Ubuntu VPS (see 
> below for all versions etc). I had this problem before on the same setup, so 
> I did an export to text file with pg_dump, and imported into a completely new 
> database with psql. That fixed the problem for a few days, but it came back 
> soon after.
> 
> The problem only seems to affect one or two specific columns, and only a few 
> specific rows in those columns. Most other rows work normally. Affected 
> columns also randomly start working again after a few days, and other columns 
> get affected. I havent noticed any kind of pattern.
> 
> You can find the table definition here: https://gitlab.com/snippets/1840320

You can use "explain" to see what plan is being used for the query, but I'm 
betting that it's using an index on preferredUsername. Your workarounds won't 
use that index, they'll scan the table.

If that index is corrupted it could cause the symptoms you're seeing. You can 
use "reindex" to rebuild the index from scratch and see if it fixes it but 
corrupted indexes aren't normal, and the issue seems to be recurring. On 
physical hardware I'd be wondering about filesystem corruption and taking a 
good look at my system logs. On someone else's VPS you don't have the same 
visibility, but I'd still check the system logs for issues.
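
Concretely, something along these lines (the index name is a guess - check 
\d actor for the real one):

explain
select id, "preferredUsername" from actor where "preferredUsername" = 'mailab';

-- if the plan shows a suspect index, rebuild just that index...
reindex index actor_preferredusername_idx;
-- ...or rebuild every index on the table:
reindex table actor;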

Cheers,
  Steve

> 
> Version info:
> 
> Postgres Docker Image: postgres:10.7-alpine 
> Docker version: 18.09.2
> OS: Ubuntu 18.04.2
> 
> Please tell me if you have any idea how to fix or debug this. I already asked 
> multiple people, and no one has a clue what is going on.
> 
> Best,
> Felix Ableitner
> 





Re: Case Insensitive

2019-03-28 Thread Steve Atkins



> On Mar 28, 2019, at 9:08 AM, Ron  wrote:
> 
> On 3/28/19 3:33 AM, Steve Atkins wrote:
>> 
>>> On Mar 28, 2019, at 8:29 AM, Ron  wrote:
>>> 
>>> On 3/28/19 3:23 AM, Sameer Kumar wrote:
>>> [snip]
>>>> You can write a query with upper function:
>>>> 
>>>> select * from emp where upper(ename)=upper('aaa');
>>> That's a guaranteed table scan.
>> Unless you have an index on upper(ename).
> 
> Are you sure?  I thought the lpart had to be immutable for the query 
> optimizer to decide to use an index (and upper(ename) is mutable).

Yeah. Case insensitive searches like this are pretty much the first example 
given for why you might want to use an expression index.

The expression in an expression index has to be immutable, but upper() is - it 
will always give the same output for a given input. (For values of "always" 
that probably depend on not performing major surgery on collations, but that 
falls into the "lie to the planner, get rotten results" category).

Check "\df+ upper"
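
With the example from upthread, a minimal sketch:

create index emp_upper_ename_idx on emp (upper(ename));

-- the predicate matches the indexed expression, so it can use the index
-- (given enough rows to make an index scan worthwhile):
select * from emp where upper(ename) = upper('aaa');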

Cheers,
  Steve





Re: Case Insensitive

2019-03-28 Thread Steve Atkins



> On Mar 28, 2019, at 8:29 AM, Ron  wrote:
> 
> On 3/28/19 3:23 AM, Sameer Kumar wrote:
> [snip]
>> You can write a query with upper function:
>> 
>> select * from emp where upper(ename)=upper('aaa');
> 
> That's a guaranteed table scan.

Unless you have an index on upper(ename).

Cheers,
  Steve





Re: Forks of pgadmin3?

2019-03-22 Thread Steve Atkins



> On Mar 22, 2019, at 10:56 AM, Christian Henz  
> wrote:
> 
> I know I'm late to the party, but we're only now migrating from
> Postgres 9.x, realizing that pgadmin3 does not support Postgres 11.
> 
> I have checked out pgadmin4, but I don't like it at all. My colleagues
> feel the same way, and some web searching suggests that we are not
> alone.
> 
> So I wonder if there are any active forks of pgadmin3?

There's the BigSQL fork, which had at least some minimal support
for 10. I've no idea whether it's had / needs anything for 11.

> 
> I found some on Github with some significant changes that I assume
> were done by people working for VK, the Russian social network. These
> appear to be personal hacks though (monosyllabic commit messages, build
> scripts added with hard coded local paths etc.).
> 
> There are also the Debian packages that have patches adding Postgres
> 10 support among other things. Not sure if there would be interest
> there in continuing to support newer Postgres versions.
> 
> Are there other, more organized efforts to continue pgadmin3?
> 
> Are there technical reasons why such a continuation would not make
> sense?
> 

It's significant work, and it'd be expended maintaining a fairly mediocre
GUI client.

You might see if you like OmniDB, or one of the other GUI clients, perhaps?

https://wiki.postgresql.org/wiki/PostgreSQL_Clients

Cheers,
  Steve

> Cheers,
> Christian
> 
> -- 
> Christian Henz
> Software Developer, software & vision Sarrazin GmbH & Co. KG
> 
> 




Re: Replication

2019-02-26 Thread Steve Atkins



> On Feb 26, 2019, at 9:41 AM, Sonam Sharma  wrote:
> 
> Hi,
> 
> Can we do master to master replication in Postgres.

Look at https://bucardo.org/Bucardo/ , BDR (e.g. 
https://www.2ndquadrant.com/en/resources/postgres-bdr-2ndquadrant/ ), RubyRep 
or any of the other multimaster solutions at 
https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling#Replication .

You probably don't actually need bidirectional master-master replication, 
though, and might want to look hard at other ways to do what you want.

Cheers,
  Steve


Re: the installation of pgadmin4 makes me weep in frustration

2019-01-14 Thread Steve Atkins



> On Jan 14, 2019, at 7:58 AM, robert  wrote:
> 
> Hi There
> 
> 
> 
> first, thanks a lot for the great and beautiful software we get from 
> PostgreSQL and all the people around it.
> 
> But I wonder how it comes, that installing pgadmin4 is so incredibly hard?
> 
> And no documentation.
> 
> I would like to install pgadmin4 to my ubuntu 18.1 laptop.
> 
> Where do I find documentation on how to install pgadmin4? Preferably I would 
> like to install it using pip.

pgadmin 4 has nothing to do with the PostgreSQL project itself; it's just a 
third-party client.

There are many other third-party clients listed here - 
https://wiki.postgresql.org/wiki/PostgreSQL_Clients -
most of them probably better than pgadmin4.

Cheers,
  Steve


Re: incomplete startup packet messages in logs

2018-11-01 Thread Steve Atkins



> On Nov 1, 2018, at 1:27 PM, Vijaykumar Jain  wrote:
> 
> Hi Team,
> 
> I know this has been answered a lot on the internet wrt ignoring it,
> but I am unable to figure out why I get these messages in the logs:
> "incomplete startup packet"

Check which of your monitoring systems is configured to
check postgresql liveness 15 times an hour.

If you can't find it, and you care about the log entries, start
sniffing packets to see where the traffic is coming from.

> 
> tail -1000 /var/log/postgresql/postgresql-10-main.log| grep
> 'incomplete startup packet'
> 
> 2018-11-01 13:04:18 UTC LOG:  incomplete startup packet
> 
> 2018-11-01 13:08:18 UTC LOG:  incomplete startup packet
> 
> 2018-11-01 13:12:18 UTC LOG:  incomplete startup packet
> 
> 
> Note: exactly at 4 min interval.

Cheers,
  Steve



Re: Pg_logical without subscription. Can we log the operations ?

2018-10-10 Thread Steve Atkins


> On Oct 9, 2018, at 11:45 PM, Dilshan  wrote:
> 
> Hi Team,
>  I am working on a requirement, where I need to log each and every operation 
> on the master db. So I was thinking if I could do with pg_logical by setting 
> my master db as publisher and setting a new db as subscriber and dropping the 
> subscription there after. I am wondering, how to get the operations that a 
> subscription would receive just into logs. Could you guide me on that? 
> Otherwise is there a possibility to receive all the operation without 
> dropping subscription and logging the details and deleting the subscription 
> tables to save space. I am planning to have logs rotated and purging logs 
> every month. Could you please guide me about the possibility of this approach?

You can read the logical replication stream with clients other than postgresql, 
and you can use plugins to format it in different ways.

https://wiki.postgresql.org/wiki/Logical_Decoding_Plugins has some of the 
third-party plugins to format the changes.

There are libraries for most languages to consume the logical decoding stream, 
or the included client "pg_recvlogical" can be used to write it to disk.

pg_recvlogical + wal2json might be the simplest way to do a basic audit trail.
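
Roughly (a sketch, assuming the wal2json output plugin is installed and 
wal_level is set to logical; the slot name and paths are arbitrary):

pg_recvlogical -d mydb --slot audit_slot --create-slot --plugin wal2json
pg_recvlogical -d mydb --slot audit_slot --start -f /var/log/pg_audit/changes.json

The first command creates the replication slot; the second streams every 
change on that database into a file you can rotate and purge on whatever 
schedule you like.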

Cheers,
  Steve


Re: Code of Conduct plan

2018-09-17 Thread Steve Atkins


> On Sep 17, 2018, at 4:57 PM, Steve Litt  wrote:
> 
> On Mon, 17 Sep 2018 08:27:48 -0700
> "Joshua D. Drake"  wrote:
> 
>> 
>> At this point it is important to accept that the CoC is happening. We 
>> aren't going to stop that. The goal now is to insure a CoC that is 
>> equitable for all community members and that has appropriate 
>> accountability. At hand it appears that major concern is the CoC
>> trying to be authoritative outside of community channels. As well as
>> wording that is a bit far reaching. Specifically I think people's
>> main concern is these two sentences:
>> 
>> "To that end, we have established this Code of Conduct for community 
>> interaction and participation in the project’s work and the community
>> at large. This Code is meant to cover all interaction between
>> community members, whether or not it takes place within
>> postgresql.org infrastructure, so long as there is not another Code
>> of Conduct that takes precedence (such as a conference's Code of
>> Conduct)."
>> 
>> If we can constructively provide feedback about those two sentences, 
>> great (or constructive feedback on other areas of the CoC). If we
>> can't then this thread needs to stop. It has become unproductive.
>> 
>> My feedback is that those two sentences provide an overarching
>> authority that .Org does not have the right to enforce and that it is
>> also largely redundant because we allow that the idea that if another
>> CoC exists, then ours doesn't apply. Well every single major
>> collaboration channel we would be concerned with (including something
>> like Blogger) has its own CoC within its Terms of use. That
>> effectively neuters the PostgreSQL CoC within places like Slack,
>> Facebook, Twitter etc...
> 
> The perfect is the enemy of the good. Whatever CoC is decided upon, it
> will be updated later. If it's easier, for now, to pass it with
> enforcement WITHIN the Postgres community, why not do that? If, later
> on, we get instances of people retaliating, in other venues, for
> positions taken in Postgres, that can be handled when it comes up.

I'll note that a fairly common situation with mailing lists I've seen is people
taking an on-list disagreement off-list and being offensive there. I've not
had that happen to me personally on the pgsql-* lists, but I have had it
happen on other technical mailing lists. That harassment would be "outside
of community channels".

A CoC that doesn't cover that situation (or its equivalent on IRC) isn't
going to be particularly easy to apply.

Whether the CoC can be applied or not isn't necessarily the most important
thing about it - it's more a statement of beliefs - but if the situation comes
up where someone is behaving unacceptably via IRC or email and "we"
say that we aren't interested in helping, or our hands are tied, because
"off-list" communication isn't covered by the CoC that's likely to lead to
a loud and public mess.

Cheers,
  Steve




Re: Question on postgresql.conf

2018-07-31 Thread Steve Atkins


> On Jul 31, 2018, at 10:59 AM, Dimitri Maziuk  wrote:
> 
> On 07/31/2018 12:33 PM, George Neuner wrote:
> 
>> Is there some reason that postgresql.conf cannot be a link to your
>> file?
> 
> It's six of one, half a dozen of the other. The big problem is when the
> network share is unavailable at host boot and all of your databases are
> belong to bitbucket. If you want to do it right, you need chef/puppet as
> already suggested.
> 
> One useful thing linux (at least) has now is conf.d directories whose
> contents are auto-included after the main config: this way you can keep
> the vendor-provided /etc/blah.conf and have all your host-specific
> settings in /etc/blah.d/myhost.conf, all living together happily ever
> after. You still want the latter to be a local file that's guaranteed
> available when blah service starts up. (Postgres devs: hint! hint!)

You'd do that with "include_dir 'conf.d'" or similar, I think, which PostgreSQL
has in all current versions.
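
i.e. something like this at the end of the stock postgresql.conf (a 
sketch; the path is relative to the directory containing postgresql.conf 
unless you make it absolute):

include_dir = 'conf.d'

with your host-specific overrides living in files like conf.d/myhost.conf.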

Cheers,
  Steve




Re: Load data from a csv file without using COPY

2018-06-19 Thread Steve Atkins


> On Jun 19, 2018, at 10:14 PM, Ravi Krishna  wrote:
> 
>> 
>> If performance is relevant then your app should probably be using COPY 
>> protocol, not line by line inserts. It's
>> supported by most postgresql access libraries. If your app does that then 
>> using "\copy" from psql would be
>> an appropriate benchmark.
> 
> Actually the reluctance to use COPY is to make the benchmark the same across 
> two different RDBMSes in
> two different environments.

That's something I'd only do if I intended to rig a benchmark between a RDBMS 
with good bulk import
and one without. If that's not your goal, your approach doesn't seem to make 
sense and is unlikely
to provide performance metrics that are useful or related to your app 
performance, unless you intend
to hamstring your app in exactly the same way you're running the benchmark.

Maybe use your app, or write ten minutes worth of code that'll interact with 
the database in much the
same way as your app will?

Cheers,
  Steve




Re: Load data from a csv file without using COPY

2018-06-19 Thread Steve Atkins


> On Jun 19, 2018, at 9:16 PM, Ravi Krishna  wrote:
> 
> In order to test a real life scenario (and use it for benchmarking) I want to 
> load large number of data from csv files.  
> The requirement is that the load should happen like an application writing to 
> the database ( that is, no COPY command). 
> Is there a tool which can do the job.  Basically parse the csv file and 
> insert it to the database row by row.

If performance is relevant then your app should probably be using COPY 
protocol, not line by line inserts. It's
supported by most postgresql access libraries. If your app does that then using 
"\copy" from psql would be
an appropriate benchmark.

Cheers,
  Steve




Re: Performance problem postgresql 9.5

2018-06-08 Thread Steve Atkins


> On Jun 8, 2018, at 1:09 PM, Alvaro Herrera  wrote:
> 
> On 2018-Jun-08, Miguel Angel Sanchez Sandoval wrote:
> 
>> Hi guys, I migrated from 8.4 to 9.5. All is OK except that after 2-3 days the
>> database experiences slowness. I execute the linux top command and it shows
>> me a postgres user process executing a strange command (2yhdgrfrt63788)
>> that consumes a lot of CPU. Looking at the active queries I see select
>> fun('./2yhdgrfrt63788'). This did not happen in version 8.4; any help
>> would be appreciated.
> 
> Hmm, has your database been compromised?  You may have an intruder there --
> beware.

Definitely.

The machine is compromised and doing Bad Things.

Image it if possible; save the compromise payload you know about if not.

Treat it as compromised and unsafe to attach to a network until you completely 
wipe and reinstall it.

It's probably a compromise via postgresql open to the network with insecure 
settings. I've seen several of those reported recently, and this one is saving 
its payload to the postgresql data directory - somewhere no other user or app 
will have access to, but which a compromised postgresql can easily write to.

Check the pg_hba.conf and packet filter / firewall settings and see what the 
issue may be. Do the same checks on all your other postgresql servers, test and 
production. If there's a configuration mistake that let one server be 
compromised it may well be there on others too.

Unless you are positive the server was not attacked, don't trust it unless you 
can be absolutely certain it is clean. Best bet is to backup any critical data 
(and check it for trustworthiness), wipe and rebuild.

Cheers,
  Steve




Re: Query hitting empty tables taking 48 minutes

2018-06-07 Thread Steve Atkins


> On Jun 7, 2018, at 12:11 PM, Rob Sargent  wrote:
> 
> What's the url doing in "blob_id = ds3.blob.id 

Re: Code of Conduct plan

2018-06-05 Thread Steve Atkins


> On Jun 5, 2018, at 9:51 AM, James Keener  wrote:
> 
> > [T]he
> main goal is to ensure that if someone is being harassed by a community
> member, they have an appropriate avenue to safely report it and ensure
> the CoC committee will review
> 
> To be honest, this is a bigger problem. Why would someone not feel 
> comfortable contacting the core team? Why would they feel better contacting 
> the CoC board who is probably mostly core team or otherwise self-selected 
> community members who have a strong belief in the CoC (and I don't mean that 
> kindly)?

The whole point of having a CoC is to advertise that we, as an organization, 
don't tolerate harassment and offensive behaviour. It also advertises that "we" 
will deal with it, if reported, and provides a clear, appropriate point of 
contact to do so. It also states roughly what process will be taken at that 
point.

Also, an alternative perspective, what makes you think every member of the core 
team would be comfortable being contacted? Handling allegations of, for 
example, drunken tech bros sexually harassing people isn't comfortable, is time 
consuming and does require a particular set of soft skills - skills that do not 
correlate with software architecture chops.

Cheers,
  Steve




Re: Code of Conduct plan

2018-06-04 Thread Steve Atkins


> On Jun 4, 2018, at 3:30 PM, Joshua D. Drake  wrote:
> 
> On 06/04/2018 01:46 PM, Tom Lane wrote:
>> "Joshua D. Drake"  writes:
>>> On 06/03/2018 11:29 AM, Tom Lane wrote:
>>>> We are now asking for a final round of community comments.
>> Actually, it's intentional that we are not saying that.  The idea is
>> that any interaction between PG community members is subject to the CoC,
>> whether it takes place in postgresql.org infrastructure or not, so long as
>> there is not another CoC that takes precedence (such as a conference's
>> CoC).  The reason for this is an unfortunate situation that took place in
>> the FreeBSD community awhile back [1], wherein one community member was
>> abusing another via Twitter, and their existing CoC failed to cover that
>> because it had been explicitly written to cover only community-run forums.
>> So we're trying to learn from that mistake, and make sure that if such a
>> situation ever came up here, the CoC committee would have authority to
>> act.
> 
> O.k. I can see that. The problem I am trying to prevent is contributor X 
> being disciplined for behavior that has nothing to do with PostgreSQL.Org. I 
> am not sure what the exact good solution is for that but it is none of our 
> business if contributor X gets into a fight (online or not) with anyone who 
> is not within the PostgreSQL.Org community.

That can be a problem when people who are known by some to be toxic join a 
community, and those who have previous experience with them leave. That can 
leave them as a "missing stair" or, worse, if they continue to be horrible but 
within moderation guidelines they can provoke responses from other participants 
that can cause them to be moderated or be chastized and then leave. In some 
cases that has caused the entire culture to drift, and pretty much destroyed 
the community.

(Community management is hard. The more you formalize some of it the more you 
have to formalize all of it and do so near-perfectly. Developers, who tend to 
prefer hard black/white, true/false rules rather than leaving some decisions to 
personal judgement can be some of the worst people at doing community 
management, and some of the easiest to manipulate.)

Cheers,
  Steve




Re: posgresql.log

2018-05-21 Thread Steve Atkins

> On May 21, 2018, at 3:21 PM, Steve Crawford  
> wrote:
> 
> 
> 
> If this is a test server and you can take it offline for forensics I would do 
> so, especially if it could provide a path to other internal or critical 
> resources. If you can image it for safekeeping and forensics, even better.

+1

It's compromised. Image it if possible; save the compromise payload you know 
about if not.

Treat it as compromised and unsafe to attach to a network until you completely 
wipe and reinstall it.

> 
> That appears to be output from wget but the intrusion, if any, could be 
> through any number of vectors (web, ssh, local attack, etc.) not directly 
> related to PostgreSQL. Check in your other logs starting with a search for 
> anything related to that IP address.

It's probably a compromise via postgresql open to the network with insecure 
settings. I've seen several of those reported recently, and this one is saving 
its payload to the postgresql data directory - somewhere no other user or app 
will have access to, but which a compromised postgresql can easily write to.

Check the pg_hba.conf and packet filter / firewall settings and see what the 
issue may be. Do the same checks on all your other postgresql servers, test and 
production. If there's a configuration mistake that let one server be 
compromised it may well be there on others too.

> 
> Verify the usual. Patches up to date, ports appropriately firewalled off, no 
> default passwords, etc.
> 
> IP comes back to vultr.com which is a cloud company (i.e. could be anyone) 
> but if it is an attack perhaps contact their abuse department.

The C&C server there is already down; it can't hurt to notify them, but I doubt 
Choopa would be particularly interested beyond that point unless a subpoena or 
search warrant were involved.

> Unless you are positive the server was not attacked, don't trust it unless 
> you can be absolutely certain it is clean. Best bet is to backup any critical 
> data (and check it for trustworthiness), wipe and rebuild.

+1.

> 
> Only you (well, OK, maybe them, now) know what data was on this server but 
> depending on its importance, internal policies, legal requirements and 
> agreements with third-parties you may have notification requirements and 
> could need to engage forensics experts.

Cheers,
  Steve




Re: how to securely delete the storage freed when a table is dropped?

2018-04-13 Thread Steve Atkins

> On Apr 13, 2018, at 10:48 AM, Jonathan Morgan  
> wrote:
> 
> For a system with information stored in a PostgreSQL 9.5 database, in which 
> data stored in a table that is deleted must be securely deleted (like shred 
> does to files), and where the system is persistent even though any particular 
> table likely won't be (so can't just shred the disks at "completion"), I'm 
> trying to figure out my options for securely deleting the underlying data 
> files when a table is dropped.
> 
> As background, I'm not a DBA, but I am an experienced implementor in many 
> languages, contexts, and databases. I've looked online and haven't been able 
> to find a way to ask PostgreSQL to do the equivalent of shredding its 
> underlying files before releasing them to the OS when a table is DROPped. Is 
> there a built-in way to ask PostgreSQL to do this? (I might just not have 
> searched for the right thing - my apologies if I missed something)
> 
> A partial answer we're looking at is shredding the underlying data files for 
> a given relation and its indexes manually before dropping the tables, but 
> this isn't so elegant, and I'm not sure it is getting all the information 
> from the tables that we need to delete.
> 
> We also are looking at strategies for shredding free space on our data disk - 
> either running a utility to do that, or periodically replicating the data 
> volume, swapping in the results of the copy, then shredding the entire volume 
> that was the source so its "free" space is securely overwritten in the 
> process.
> 
> Are we missing something? Are there other options we haven't found? If we 
> have to clean up manually, are there other places we need to go to shred data 
> than the relation files for a given table, and all its related indexes, in 
> the database's folder? Any help or advice will be greatly appreciated.

Just "securely" deleting the files won't help much, as you'll leave data in 
spare space on the filesystem, in filesystem journals and so on.

Maybe put the transient tables and indexes in their own tablespace on their own 
filesystem, periodically move them to another tablespace and wipe the first 
one's filesystem (either physically or forgetting the key for an encrypted FS)? 
That'd leave you with just the WAL data to deal with.
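
Roughly (a sketch - paths and names are illustrative, and the directories 
need to exist and be owned by the postgres user):

create tablespace transient_a location '/mnt/transient_a';
create table scratch (id bigint, payload bytea) tablespace transient_a;

-- later: move anything you're keeping, then wipe the old filesystem at
-- the OS / volume level once the tablespace is empty and dropped
alter table scratch set tablespace pg_default;
drop tablespace transient_a;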

Seems like a slightly odd requirement, though. What's your threat model?

Cheers,
  Steve




Re: best way to write large data-streams quickly?

2018-04-09 Thread Steve Atkins

> On Apr 9, 2018, at 8:49 AM, Mark Moellering  
> wrote:
> 
> Everyone,
> 
> We are trying to architect a new system, which will have to take several 
> large datastreams (total of ~200,000 parsed files per second) and place them 
> in a database.  I am trying to figure out the best way to import that sort of 
> data into Postgres.  
> 
> I keep thinking i can't be the first to have this problem and there are 
> common solutions but I can't find any.  Does anyone know of some sort method, 
> third party program, etc, that can accept data from a number of different 
> sources, and push it into Postgres as fast as possible?

Take a look at http://ossc-db.github.io/pg_bulkload/index.html. Check the 
benchmarks for different situations compared to COPY.

Depending on what you're doing, using custom code to parse your data and then 
doing multiple binary COPYs in parallel may be better.

Cheers,
  Steve




Re: Prepared statements

2018-03-21 Thread Steve Atkins

> On Mar 21, 2018, at 2:09 PM, Tim Cross  wrote:
> 
> 
> a simple question I wasn't able to get a clear answer on
> 
> It is general best practice to use prepared statements and parameters
> rather than concatenated strings to build sql statements as mitigation
> against SQL injection. However, in some databases I've used, there is
> also a performance advantage. For example, the planner may be able to
> more easily recognise a statement and reuse an existing plan rather than
> re-planning the query.
> 
> I wasn't sure what the situation is with postgres - is there a
> performance benefit in using prepared statements over a query string
> where the values are just concatenated into the string?

There are two separate things.

Parameterized queries are a query made by your code such that
the values are passed in alongside SQL that has placeholders
such as $1, $2, ... They're what help save you from SQL injection.

A prepared statement is a reference to a query that has previously
been passed to the database, and likely pre-interpreted and planned,
that's ready to accept parameters and run.
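
At the SQL level that looks something like this (a trivial sketch; the
table is made up):

create table users (id integer primary key, name text);

prepare find_user (integer) as
    select * from users where id = $1;

execute find_user (42);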

Using a prepared statement saves the planner from having to decide
on a plan to run the query, which saves you planning time. But it
does that by preparing a generic plan that'll work for any bound
parameter. The planner might be able to come up with a specific
plan based on the particular values passed in that is better than
the generic plan, so a naive implementation of prepared statements
might lead to the execution of the query being slower in some cases,
as it uses a generic plan when a specific one might be better.

Postgresql avoids the worst cases of that by only switching to a
generic plan for a prepared statement after it's re-planned it
a few times with specific values, and the specific plans have
been costed more expensive than the generic one (or something
like that).

The generic plan is also frozen in to the prepared statement, so
if the data statistics vary significantly during the lifetime of the
prepared statement the plan may no longer be a particularly
good one.

Prepared statements are certainly useful, but choosing whether
to use them or not isn't quite as simple as "it'll avoid the planning
overhead".

Parameterized queries are almost always a good idea.

Cheers,
  Steve




Re: Best options for new PG instance

2018-03-05 Thread Steve Atkins

> On Mar 5, 2018, at 8:53 AM, David Gauthier  wrote:
> 
> Hi:  
> 
> I'm going to be requesting a PG DB instance (v9.6.7) from an IT dept in a 
> large corp setting.  I was wondering if anyone could comment on the pros/cons 
> of getting this put on a virtual machine vs hard metal ?  Locally mounted 
> disk vs nfs ?

I've been running postgresql instances on ESXi VMs for years with no issues. 
I've not benchmarked them, but performance has been good enough despite their 
running on fairly wimpy hardware. Performance relative to bare metal is 
probably going to be dominated by disk IO, and depending on how you're hosting 
VMs that can be anywhere between pretty good and terrible - in a large 
corporation I'd expect it to be pretty good. Just don't skimp on RAM - having 
your hot data in the filesystem cache is always good and can make high latency 
storage tolerable.

If performance isn't critical then a VM is great. If it is, you'll want to plan 
and maybe benchmark a bit to decide whether bare metal is going to be 
significantly better for what you're doing.

I wouldn't let NFS anywhere near it. I'd ideally want something that looks to 
the VM like a locally mounted disk, whether that be really local or served from 
a SAN or iSCSI or ...

https://www.slideshare.net/jkshah/best-practices-of-running-postgresql-in-virtual-environments
 has some hints on VM-specific things to consider.

Cheers,
  Steve


Re: Is there a continuous backup for pg ?

2018-03-02 Thread Steve Atkins

> On Mar 2, 2018, at 11:05 AM, Gary M  wrote:
> 
> Hi,
> 
> I have an unusual requirement for schema based, live backup of a 24/7 
> database processing 100K inserts/updates per hour. The data store is around 
> 100TB.
> 
> The requirement is supporting an incremental backup of 10 minute windows.  
> Replication is not considered backup from malicious action.
> 
> Are there any best practices or solutions that can meet these requirements ? 

Sounds almost like you're looking for point-in-time recovery, which will let 
you restore an entire cluster to any time in the past (if you can afford the 
storage), using physical replication.

https://www.postgresql.org/docs/current/static/continuous-archiving.html

There are several sets of third-party tools that'll help with the setup, 
monitoring and other tooling. Look for omnipitr or pitrtools.
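
The server side of it is just a few settings plus a base backup - roughly
(a sketch; the archive destination is illustrative and should really be
somewhere more robust than a local cp):

# postgresql.conf
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /backup/wal/%f && cp %p /backup/wal/%f'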

(I'd guess you could build something similar that would let you do logical 
recovery by recording changesets from a logical replication connection, but I 
don't know if anyone has put that together.)

Cheers,
  Steve


Re: Enforce primary key on every table during dev?

2018-03-01 Thread Steve Atkins
(Melvin's mail doesn't quote properly; I've tried to fix it but may have broken 
something)

> On Mar 1, 2018, at 8:50 AM, Melvin Davidson  wrote:
> 
> 
> On Thu, Mar 1, 2018 at 11:43 AM, Alban Hertroys  wrote:
> 
> 
>> On 1 March 2018 at 17:22, Steven Lembark  wrote:
>> If you have a design with un-identified data it means that you haven't
>>> normalized it properly: something is missing from the table with
>>> un-identifiable rows.
>> 
>> While that holds true for a relational model, in reporting for
>> example, it is common practice to denormalize data without a
>> requirement to be able to identify a single record. The use case for
>> such tables is providing quick aggregates on the data. Often this
>> deals with derived data. It's not that uncommon to not have a primary
>> or even a uniquely identifiable key on such tables.
>> 
>> I do not disagree that having a primary key on a table is a good thing,
>> but I do disagree that a primary key is a requirement for all tables.
>> 
>> More generally: For every rule there are exceptions. Even for this one.
> 
> You may perceive that to be "common practice", but in reality it is not, and 
> in fact a bad one. As was previously stated, PosgreSQL is a _relational_ 
> database,
> and breaking that premise will eventually land you in very big trouble. There 
is no solid reason not to have a primary key for every table.

Sure there is. It's an additional index and significant additional insert / 
update overhead.
If you're never going to retrieve single rows, nor join in such a way that 
uniqueness
on this side is required there's no need for a unique identifier.

It's a rare case that you won't want a primary key, and I'll often add 
a surrogate one for convenience even when it's not actually needed,
but there are cases where it's appropriate not to have one, even in
OLTP work. Log tables, for example.

"Every table should have a primary key, whether natural or surrogate"
is a great guideline, and everyone should follow it until they understand
when they shouldn't.

More generally: For every rule there are exceptions. Even for this one.

Cheers,
  Steve


Re: persistent read cache

2018-02-12 Thread Steve Atkins

> On Feb 11, 2018, at 5:14 PM, Sand Stone  wrote:
> 
> 
> Hi. I wonder if there is such a thing or extension in the PG world.
> 
> Here is my use case. I am using PG (PG10 to be more specific) in a
> cloud VM environment. The tables are stored in RAID0 managed SSD
> backed attached storage. Depending on the VM I am using, I usually
> have 256GB local SSD unused.
> 
> I wonder if PG could leverage this local SSD as a read (page/block)
> cache, to complement/extend  the DRAM by used by shared_buffer today.

It seems something that PostgreSQL could take advantage of, but
it's probably the wrong layer to implement it. If your VM infrastructure
doesn't have any way to use it directly, maybe you could do it at the
drive / filesystem level with something like bcache, lvmcache or
enhanceio?

Adding that sort of complexity to something that needs solid data
integrity makes me nervous, but those solutions have been in the
field for years.

Cheers,
  Steve





Re: Notify client when a table was full

2018-01-22 Thread Steve Atkins

> On Jan 21, 2018, at 6:44 PM, Vincenzo Romano  
> wrote:
> 
> 2018-01-21 19:31 GMT+01:00 Francisco Olarte :
>> On Sun, Jan 21, 2018 at 1:27 PM, Michael Paquier
>>  wrote:
>>> On Fri, Jan 19, 2018 at 03:40:01PM +, Raymond O'Donnell wrote:
>> ...
>>>> How do you define "full"?
> 
> The only possible and meaningful case, IMHO, as stated by David
> earlier, is "file system full".

If your filesystem is full you're pretty much off the air. It's something
that should never happen on a production system. So ... any automation
around "the filesystem is full" is going to be much the same as "the server
is dead". You're unlikely to be able to do anything useful from the
app at that point, let alone from a trigger function.

If the answer involves handling the case where the file system is full we're
not answering a useful question, and the original poster probably needs to
clarify.

> Which is communicated by Postgres with the "Class 53 — Insufficient
> Resources" error codes.
> Please refer to official documentation like:
> 
> https://www.postgresql.org/docs/10/static/errcodes-appendix.html
> 
> For specific programming languages more details need to be checked.
> 

Cheers,
  Steve


Re: Connection type

2018-01-18 Thread Steve Atkins

> On Jan 18, 2018, at 4:06 PM, Enrico Pirozzi  wrote:
> 
> Hi,
> 
> is there a way to know what kind of connection a client is using? 
> (ODBC, JDBC, etc.)
> 
> I saw the pg_stat_activity view, 
> 
> 
> but in the application name field there is no information about 
> what kind of connection a client is using.

No. They're all exactly the same as far as postgresql is concerned, all speaking 
the same postgresql native protocol, or close enough. The ODBC / JDBC / libpq 
difference is purely on the client side.

A client can voluntarily set the application_name, e.g. as part of its 
connection string, to identify itself to the server, if you want to be able 
to identify which sort of client is connected easily.
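
e.g. (a sketch; the application name is made up):

psql "dbname=mydb application_name=billing_app"

and then on the server:

select application_name, count(*)
from pg_stat_activity
group by application_name;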

Cheers,
  Steve




Re: Mailing list archiver

2018-01-03 Thread Steve Atkins

> On Jan 3, 2018, at 6:00 AM, Jordan Deitch  wrote:
> 
> Thanks for the feedback! I will continue to address these issues, and I 
> intend on adding #postgresql IRC logs as well :-)

Please don't make a public archive of #postgresql without discussion. I believe 
the general feeling of those who use the channel is that a public archive of it 
is not wanted.

Cheers,
  Steve




Re: Does PostgreSQL check database integrity at startup?

2017-12-29 Thread Steve Atkins

> On Dec 29, 2017, at 6:14 PM, Melvin Davidson  wrote:
> 
> Edson's original request was for a query that shows  the FILENAMEs for the 
> table.

Which the query you provide does not do.

> As for "good" query, that is entirely an opinion.

But a well informed one. Your query may work well enough for you, but when 
you're
suggesting others rely on it you should expect more informed users to point out
that it has some critical flaws - especially ones that might not be obvious to 
new
users - lest others naively rely on it.

> The query WILL show all files associated
> with ALL tables. 

No, it won't. You're filtering out a bunch of things via the name of the table. 
That
doesn't do the right thing. You're heading in the right direction, but the 
query you
gave gets some things badly wrong.
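
For reference, the straightforward way to get at the file names is
pg_relation_filepath(). A sketch, for one table and its indexes (the table
name is made up, and extra segment files, the FSM/VM forks and any TOAST
tables are separate files on top of these):

select c.relname, pg_relation_filepath(c.oid::regclass)
from pg_class c
where c.oid = 'mytable'::regclass
   or c.oid in (select indexrelid from pg_index
                where indrelid = 'mytable'::regclass);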

Listen to the criticism from others and you'll improve your knowledge and 
skills.

There's absolutely nothing wrong with posting information that's not quite 
right,
nor with getting feedback from others on what's wrong with it. Copping an 
attitude
in response to that feedback is where things go downhill.

Cheers,
  Steve


Re: postgresql-10 for ubuntu-17.10 (artful)?

2017-12-27 Thread Steve Atkins

> On Dec 26, 2017, at 6:21 PM, Stuart McGraw  wrote:
> 
> Is there a repository for Postgresql-10 available at 
>  http://apt.postgresql.org/pub/repos/apt
> for Ubuntu-17.10 (artful)?  When I look at the dist/
> subdirectory there, there seem to be repos for all the
> other Ubuntu releases including an upcoming one (bionic) 
> but not artful.
> 
> Am I looking in the wrong place?  (I am new to Ubuntu 
> and Debian packaging.)
> 
> 

I think that LTS releases (14.04, 16.04, 18.04) are the main releases targeted, 
with intermediate releases only added if they're incompatible with the previous 
LTS release.

See the FAQ at https://wiki.postgresql.org/wiki/Apt/FAQ  - I think you should 
be able to use the 17.04 package on 17.10.

Cheers,
  Steve




Re: To all who wish to unsubscribe

2017-11-21 Thread Steve Atkins

> On Nov 21, 2017, at 10:39 AM, Andrew Sullivan <a...@crankycanuck.ca> wrote:
> 
> On Mon, Nov 20, 2017 at 02:46:08PM -0800, Steve Atkins wrote:
>> That's poor practice, for several reasons - replay attacks with added content
>> and it being an extremely rare practice that's likely to trigger bugs in DKIM
>> validation are two. The latter is the much bigger deal.
>> 
>> It also doesn't help much for most MIME encoded mail (including base64
>> encoded plain text, like the mail I'm replying to).
>> 
>> Pretending those paragraphs aren't there is the right thing to do.
> 
> Yes.  Also the DMARC and forthcoming ARC mechanisms -- super important
> for people behind gmail and yahoo and so on -- make that feature not
> really work, AFAICT.  

It doesn't break DMARC or ARC as they only care if the mail is
authenticated via SPF or DKIM (with an aligned, blah, blah, blah) so
if the DKIM passes, even with an l= flag, it's OK. But ...

> I think that part of DKIM is busted, and the
> authors of it I've talked to seem to agree.

Pretty much, yes. Certainly for mail where you don't have tight control
over how it's generated.

Removing Subject tagging and footers of the messages isn't an accidental
side effect of the migration, it's (I assume) a primary goal of it.
If that weren't done, more and more people at large consumer mailbox
providers would increasingly have problems sending mail successfully
to the lists. 

Peoples mail filters will just have to adjust.

Blame Yahoo, not PGDG.

Cheers,
  Steve




Re: To all who wish to unsubscribe

2017-11-20 Thread Steve Atkins

> On Nov 20, 2017, at 2:18 PM, Piotr Stefaniak  
> wrote:
> 
> On 2017-11-20 21:03, Tom Lane wrote:
>> "Joshua D. Drake"  writes:
>>> On 11/20/2017 11:40 AM, Magnus Hagander wrote:
>>> One thing I would note is that there is no longer a footer that
>>> tells people what to do if they want to unsubscribe. Perhaps one 
>>> thing that could be done is a header (for a temporary time period)
>>> that says:
>>> The mailing list software of Postgresql.org has changed. Please see
>>> this page on instructions on how to manage your subscription and
>>> filters. And then after the temporary time period that becomes a
>>> footer?
>> 
>> Unfortunately, the removal of the footer is a feature not a bug. In
>> order to be DKIM-compatible and thus help avoid becoming classified 
>> as spammers, we can't mangle message content anymore, just like we 
>> can't mangle the Subject: line.
> I don't miss the footers, but see RFC 6376, "5.3.1. Body Length Limits". 
> Two fragments quoted are copied below:

That's poor practice, for several reasons - replay attacks with added content
and it being an extremely rare practice that's likely to trigger bugs in DKIM
validation are two. The latter is the much bigger deal.

It also doesn't help much for most MIME encoded mail (including base64
encoded plain text, like the mail I'm replying to).

Pretending those paragraphs aren't there is the right thing to do.

Cheers,
  Steve



> 
>> A body length count MAY be specified to limit the signature 
>> calculation to an initial prefix of the body text, measured in 
>> octets. If the body length count is not specified, the entire message
>> body is signed.
> 
>> INFORMATIVE RATIONALE: This capability is provided because it is very
>> common for mailing lists to add trailers to messages (e.g., 
>> instructions on how to get off the list). Until those messages are
>> also signed, the body length count is a useful tool for the Verifier
>> since it can, as a matter of policy, accept messages having valid
>> signatures with extraneous data.