maximum distance vs fixed distance for tsquery_phrase

2019-07-18 Thread Where is Where
Hello the tsquery_phrase is searching a fixed distance, is there way to
search maximum distance so the search returns query1 followed by query2 up
to a certain distance? eg  distance=100 return result from query1 <1>
query2 ... query1 <100> query2 ?  Thanks!
tsquery_phrase(query1 tsquery, query2 tsquery, distance integer)


Re: Rearchitecting for storage

2019-07-18 Thread Andy Colson

On 7/18/19 8:44 AM, Matthew Pounsett wrote:


I've recently inherited a database that is dangerously close to outgrowing the 
available storage on its existing hardware.  I'm looking for (pointers to) 
advice on scaling the storage in a financially constrained not-for-profit.

The current size of the DB's data directory is just shy of 23TB.  When I 
received the machine it's on, it was configured with 18x3TB drives in RAID10 
(9x 2-drive mirrors striped together) for about 28TB of available storage.  As 
a short term measure I've reconfigured them into RAID50 (3x 6-drive RAID5 
arrays).  This is obviously a poor choice for performance, but it'll get us 
through until we figure out what to do about upgrading/replacing the hardware.  
The host is constrained to 24x3TB drives, so we can't get much of an upgrade by 
just adding/replacing disks.

One of my anticipated requirements for any replacement we design is that I 
should be able to do upgrades of Postgres for up to five years without needing 
major upgrades to the hardware.  My understanding of the standard upgrade 
process is that this requires that the data directory be smaller than the free 
storage (so that there is room to hold two copies of the data directory 
simultaneously).  I haven't got detailed growth statistics yet, but given that 
the DB has grown to 23TB in 5 years, I should assume that it could double in 
the next five years, requiring 100TB of available storage to be able to do 
updates.

This seems to be right on the cusp of what is possible to fit in a single 
chassis with a RAID10 configuration (at least, with commodify hardware), which 
means we're looking at pretty high cost:performance ratio.  I'd like to see if 
we can find designs that get that ratio down a bit, or a lot, but I'm a general 
sysadmin, and the detailed effects on those choices are outside of my limited 
DBA experience.

Are there good documents out there on sizing hardware for this sort of mid-range storage 
requirement, that is neither big data, nor "small data" able to fit on a single 
host?   I'm hoping for an overview of the tradeoffs between single head, dual-head setups 
with a JBOD array, or whatever else is advisable to consider these days.  Corrections of 
any poor assumptions exposed above are also quite welcome. :)

Thanks in advance for any assistance!



Now might be a good time to consider splitting the database onto multiple computers.  
Might be simpler with a mid-range database, then your plan for the future is "add 
more computers".

-Andy




Re: Rearchitecting for storage

2019-07-18 Thread Rob Sargent


> 
> That would likely keep the extra storage requirements small, but still 
> non-zero.  Presumably the upgrade would be unnecessary if it could be done 
> without rewriting files.  Is there any rule of thumb for making sure one has 
> enough space available for the upgrade?   I suppose that would come down to 
> what exactly needs to get rewritten, in what order, etc., but the pg_upgrade 
> docs don't seem to have that detail.  For example, since we've got an ~18TB 
> table (including its indices), if that needs to be rewritten then we're still 
> looking at requiring significant extra storage.  Recent experience suggests 
> postgres won't necessarily do things in the most storage-efficient way.. we 
> just had a reindex on that database fail (in --single-user) because 17TB was 
> insufficient free storage for the db to grow into.
> 
Can you afford to drop and re-create those 6 indices?



Re: Possible Values of Command Tag in PG Log file

2019-07-18 Thread Tom Lane
"Kumar, Virendra"  writes:
> I loaded log file (csv format) into postgres_log table as per below document 
> and can see entries like:
> https://www.postgresql.org/docs/10/runtime-config-logging.html

> postgres=# select command_tag from postgres_log where command_tag is not null 
> limit 5;
>   command_tag
> 
>  authentication
>  authentication

That's actually the "activity" portion of the process status display,
which there's no formal specification for.  You'd have to root around
in the source code looking for set_ps_display() calls if you really
want to know.

regards, tom lane




Re: Possible Values of Command Tag in PG Log file

2019-07-18 Thread Adrian Klaver

On 7/18/19 2:13 PM, Melvin Davidson wrote:


The values and explanations are listed here:
https://www.postgresql.org/docs/11/protocol-message-formats.html

Or simply INSERT, UPDATE, DELETE, SELECT, MOVE, FETCH and/or COPY


Except the OP shows a command that is not in the above, authentication.

Also this:

https://www.postgresql.org/docs/11/event-trigger-interface.html

The command tag associated with the event for which the event trigger is 
run, for example "CREATE FUNCTION".


I suspect it comes from the commands listed here:

/src/include/commands
/src/backend/commands

How to pull out the complete list is beyond me at the moment.



On Thu, Jul 18, 2019 at 5:04 PM Kumar, Virendra 
mailto:virendra.ku...@guycarp.com>> wrote:


Here is it:
--
https://www.postgresql.org/docs/10/protocol-message-formats.html

CommandComplete (B), string value of this section.

I loaded log file (csv format) into postgres_log table as per below
document and can see entries like:
https://www.postgresql.org/docs/10/runtime-config-logging.html

postgres=# select command_tag from postgres_log where command_tag is
not null limit 5;
   command_tag


  


  authentication

(5 rows)

Regads,
Virendra


-Original Message-
From: Adrian Klaver [mailto:adrian.kla...@aklaver.com
]
Sent: Thursday, July 18, 2019 4:55 PM
To: Kumar, Virendra; pgsql-general@lists.postgresql.org

Subject: Re: Possible Values of Command Tag in PG Log file

On 7/18/19 12:58 PM, Kumar, Virendra wrote:
 > Hello Team,
 >
 > Can somebody redirect me to document which lists all possible
value of
 > command tag in instance log of PostgreSQL instance.

Can you show example of a command tag in the log?

 >
 > Regards,
 >
 > Virendra
 >
 >
 >

 >
 > This message is intended only for the use of the addressee and
may contain
 > information that is PRIVILEGED AND CONFIDENTIAL.
 >
 > If you are not the intended recipient, you are hereby notified
that any
 > dissemination of this communication is strictly prohibited. If
you have
 > received this communication in error, please erase all copies of
the message
 > and its attachments and notify the sender immediately. Thank you.


--
Adrian Klaver
adrian.kla...@aklaver.com 



This message is intended only for the use of the addressee and may
contain
information that is PRIVILEGED AND CONFIDENTIAL.

If you are not the intended recipient, you are hereby notified that any
dissemination of this communication is strictly prohibited. If you have
received this communication in error, please erase all copies of the
message
and its attachments and notify the sender immediately. Thank you.




--
*Melvin Davidson**
Maj. Database & Exploration Specialist**
Universe Exploration Command – UXC***
Employment by invitation only!



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Possible Values of Command Tag in PG Log file

2019-07-18 Thread Melvin Davidson
The values and explanations are listed here:
https://www.postgresql.org/docs/11/protocol-message-formats.html

Or simply INSERT, UPDATE, DELETE, SELECT, MOVE, FETCH and/or COPY

On Thu, Jul 18, 2019 at 5:04 PM Kumar, Virendra 
wrote:

> Here is it:
> --
> https://www.postgresql.org/docs/10/protocol-message-formats.html
>
> CommandComplete (B), string value of this section.
>
> I loaded log file (csv format) into postgres_log table as per below
> document and can see entries like:
> https://www.postgresql.org/docs/10/runtime-config-logging.html
>
> postgres=# select command_tag from postgres_log where command_tag is not
> null limit 5;
>   command_tag
> 
>
>  authentication
>
>  authentication
>
> (5 rows)
>
> Regads,
> Virendra
>
>
> -Original Message-
> From: Adrian Klaver [mailto:adrian.kla...@aklaver.com]
> Sent: Thursday, July 18, 2019 4:55 PM
> To: Kumar, Virendra; pgsql-general@lists.postgresql.org
> Subject: Re: Possible Values of Command Tag in PG Log file
>
> On 7/18/19 12:58 PM, Kumar, Virendra wrote:
> > Hello Team,
> >
> > Can somebody redirect me to document which lists all possible value of
> > command tag in instance log of PostgreSQL instance.
>
> Can you show example of a command tag in the log?
>
> >
> > Regards,
> >
> > Virendra
> >
> >
> > 
> >
> > This message is intended only for the use of the addressee and may
> contain
> > information that is PRIVILEGED AND CONFIDENTIAL.
> >
> > If you are not the intended recipient, you are hereby notified that any
> > dissemination of this communication is strictly prohibited. If you have
> > received this communication in error, please erase all copies of the
> message
> > and its attachments and notify the sender immediately. Thank you.
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
> 
>
> This message is intended only for the use of the addressee and may contain
> information that is PRIVILEGED AND CONFIDENTIAL.
>
> If you are not the intended recipient, you are hereby notified that any
> dissemination of this communication is strictly prohibited. If you have
> received this communication in error, please erase all copies of the
> message
> and its attachments and notify the sender immediately. Thank you.
>
>
>

-- 
*Melvin Davidson*
*Maj. Database & Exploration Specialist*
*Universe Exploration Command – UXC*
Employment by invitation only!


RE: Possible Values of Command Tag in PG Log file

2019-07-18 Thread Kumar, Virendra
Here is it:
--
https://www.postgresql.org/docs/10/protocol-message-formats.html

CommandComplete (B), string value of this section.

I loaded log file (csv format) into postgres_log table as per below document 
and can see entries like:
https://www.postgresql.org/docs/10/runtime-config-logging.html

postgres=# select command_tag from postgres_log where command_tag is not null 
limit 5;
  command_tag


 authentication

 authentication

(5 rows)

Regads,
Virendra


-Original Message-
From: Adrian Klaver [mailto:adrian.kla...@aklaver.com]
Sent: Thursday, July 18, 2019 4:55 PM
To: Kumar, Virendra; pgsql-general@lists.postgresql.org
Subject: Re: Possible Values of Command Tag in PG Log file

On 7/18/19 12:58 PM, Kumar, Virendra wrote:
> Hello Team,
>
> Can somebody redirect me to document which lists all possible value of
> command tag in instance log of PostgreSQL instance.

Can you show example of a command tag in the log?

>
> Regards,
>
> Virendra
>
>
> 
>
> This message is intended only for the use of the addressee and may contain
> information that is PRIVILEGED AND CONFIDENTIAL.
>
> If you are not the intended recipient, you are hereby notified that any
> dissemination of this communication is strictly prohibited. If you have
> received this communication in error, please erase all copies of the message
> and its attachments and notify the sender immediately. Thank you.


--
Adrian Klaver
adrian.kla...@aklaver.com



This message is intended only for the use of the addressee and may contain
information that is PRIVILEGED AND CONFIDENTIAL.

If you are not the intended recipient, you are hereby notified that any
dissemination of this communication is strictly prohibited. If you have
received this communication in error, please erase all copies of the message
and its attachments and notify the sender immediately. Thank you.




Re: Possible Values of Command Tag in PG Log file

2019-07-18 Thread Adrian Klaver

On 7/18/19 12:58 PM, Kumar, Virendra wrote:

Hello Team,

Can somebody redirect me to document which lists all possible value of 
command tag in instance log of PostgreSQL instance.


Can you show example of a command tag in the log?



Regards,

Virendra




This message is intended only for the use of the addressee and may contain
information that is PRIVILEGED AND CONFIDENTIAL.

If you are not the intended recipient, you are hereby notified that any
dissemination of this communication is strictly prohibited. If you have
received this communication in error, please erase all copies of the message
and its attachments and notify the sender immediately. Thank you.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Corrupt index stopping autovacuum system wide

2019-07-18 Thread shauncutts

On 17.07.2019 18:14, Andres Freund wrote:



To me that means that we need prioritization across databases, and
between tables, and probably by multiple criteria. I suspect there 
need
to be multiple criteria how urgent vacuuming is, and autovacuum ought 
to

try to make progress on all of them.



I'm not a contributor, and don't know the code base, so don't normally 
comment. But it occurs to me in the present case that there may be a 
short-term workaround for the current problem: could autovacuum visit 
tables in a random order? Then at least all tables would get vacuumed 
eventually (in the current case). If, later, more complex priorities are 
implemented, they could be probabilistic weights.


Best,

-- Shaun Cutts




Re: Rearchitecting for storage

2019-07-18 Thread Matthew Pounsett
On Thu, 18 Jul 2019 at 13:34, Kenneth Marshall  wrote:

> Hi Matt,
>

Hi!  Thanks for your reply.


> Have you considered using the VDO compression for tables that are less
> update intensive. Using just compression you can get almost 4X size
> reduction. For a database, I would forgo the deduplication function.
> You can then use a non-compressed tablespace for the heavier I/O tables
> and indexes.
>

VDO is a RedHat-only thing, isn't it?  We're not running RHEL... Debian.
Anyway, the bulk of the data (nearly 80%) is in a single table and its
indexes.  ~6TB to the table, and ~12TB to its indices.  Even if we switched
over to RedHat, there's no value in compressing lesser-used tables.


>
> > My understanding of the standard
> > upgrade process is that this requires that the data directory be smaller
> > than the free storage (so that there is room to hold two copies of the
> data
> > directory simultaneously).
>
> The link option with pg_upgrade does not require 2X the space, since it
> uses hard links instead of copying the files to the new cluster.
>

That would likely keep the extra storage requirements small, but still
non-zero.  Presumably the upgrade would be unnecessary if it could be done
without rewriting files.  Is there any rule of thumb for making sure one
has enough space available for the upgrade?   I suppose that would come
down to what exactly needs to get rewritten, in what order, etc., but the
pg_upgrade docs don't seem to have that detail.  For example, since we've
got an ~18TB table (including its indices), if that needs to be rewritten
then we're still looking at requiring significant extra storage.  Recent
experience suggests postgres won't necessarily do things in the most
storage-efficient way.. we just had a reindex on that database fail (in
--single-user) because 17TB was insufficient free storage for the db to
grow into.


Possible Values of Command Tag in PG Log file

2019-07-18 Thread Kumar, Virendra
Hello Team,

Can somebody redirect me to document which lists all possible value of command 
tag in instance log of PostgreSQL instance.

Regards,
Virendra



This message is intended only for the use of the addressee and may contain
information that is PRIVILEGED AND CONFIDENTIAL.

If you are not the intended recipient, you are hereby notified that any
dissemination of this communication is strictly prohibited. If you have
received this communication in error, please erase all copies of the message
and its attachments and notify the sender immediately. Thank you.


Re: Rearchitecting for storage

2019-07-18 Thread Kenneth Marshall
Hi Matt,

On Thu, Jul 18, 2019 at 09:44:04AM -0400, Matthew Pounsett wrote:
> I've recently inherited a database that is dangerously close to outgrowing
> the available storage on its existing hardware.  I'm looking for (pointers
> to) advice on scaling the storage in a financially constrained
> not-for-profit.

Have you considered using the VDO compression for tables that are less
update intensive. Using just compression you can get almost 4X size
reduction. For a database, I would forgo the deduplication function.
You can then use a non-compressed tablespace for the heavier I/O tables
and indexes.

> 
> One of my anticipated requirements for any replacement we design is that I
> should be able to do upgrades of Postgres for up to five years without
> needing major upgrades to the hardware.  My understanding of the standard
> upgrade process is that this requires that the data directory be smaller
> than the free storage (so that there is room to hold two copies of the data
> directory simultaneously).  I haven't got detailed growth statistics yet,
> but given that the DB has grown to 23TB in 5 years, I should assume that it
> could double in the next five years, requiring 100TB of available storage
> to be able to do updates.
> 

The link option with pg_upgrade does not require 2X the space, since it
uses hard links instead of copying the files to the new cluster.

Regards,
Ken




Re: PostgreSQL as a Service

2019-07-18 Thread Adrian Klaver

On 7/18/19 9:06 AM, Dirk Riehle wrote:
Please reply to list also.
Ccing list.



On Thu, Jul 18, 2019, 16:56 Adrian Klaver > wrote:



 > So, back to my main question above. If I wanted to run a DBaaS
shop with
 > only PostgreSQL open source, how far away from being able to compete
 > with AWS or Azure (or YugaByte for that matter) would I be?

The difference in resources available. The pull of DBaaS as I see it is
the being able to spin up db's as needed on a scale needed from one or
more locations. All with a unified management fronted/API. Being
competitive means being able to match that.


Yes that's the point. I'm not aware of an open source DBaaS software 
layer for PostgreSQL. Are there any attempts?


I missed that, I thought you where referring only to the Postgres 
database component only. So you want the management infrastructure to be 
Open Source also. I am not aware of any, but I don't work in the DBaaS 
field so take my observation in that light.





Cheers, Dirk



 >
 > Thanks for any thoughts and opinions! Dirk
 >


-- 
Adrian Klaver

adrian.kla...@aklaver.com 




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Corrupt index stopping autovacuum system wide

2019-07-18 Thread Peter Geoghegan
On Thu, Jul 18, 2019 at 9:06 AM Aaron Pelz  wrote:
> It's a simple btree expression on a geometry(Point,4326) , no expression no 
> partial no composite.

The cause of the corruption may be a bug in a Postgis B-Tree operator
class. I reported a bug in the Geography type that could lead to
corrupt B-Tree indexes (not Geometry):

https://trac.osgeo.org/postgis/ticket/3841

Though I also see what could be a comparable bug in Geometry:

https://trac.osgeo.org/postgis/ticket/3777

These bugs are from about 3 years ago. If I'm right you should be able
to isolate the bug using amcheck.

-- 
Peter Geoghegan




Re: PostgreSQL as a Service

2019-07-18 Thread Dirk Riehle
Thanks for the pointer to BDR!

As to CitusData, you are right. The Microsoft acquisition does not seem to
have led to absorption. Rather, it remains an open core play. This also
means its DBaaS layer is not open source.

Cheers, Dirk

On Thu, Jul 18, 2019, 16:50 Achilleas Mantzios 
wrote:

> On 18/7/19 5:23 μ.μ., Dirk Riehle wrote:
> > Hello everyone!
> >
> > tl;dr: How well is PostgreSQL positioned to serve as the database of
> choice for a DBaaS operator? Specifically, how much open source is (may be)
> missing?
> >
> > 
> >
> > Im un-lurking hoping to learn more about PostgreSQL in DBaaS land.
> >
> > You may have seen this announcement.
> >
> >
> https://blog.yugabyte.com/why-we-changed-yugabyte-db-licensing-to-100-open-source/
> >
> > YugaByte bills itself as a PostgreSQL compatible database (yay to at
> least the intent) but most importantly, it decided to single-license its
> database under a permissive license, including "the
> > enterprise features" that frequently are held back by single-vendor open
> source firms who want to earn a RoI for their VC investment.
> >
> > The interesting part (and why I'm posting it here) is the following
> staging of functionality implied in that post.
> >
> > 1. Core database (permissively licensed)
> > 2. Enterprise features (permissively licensed)
> > 3. DBaaS features (trial license, commercial, no open source)
> > 4. Managed by YugaByte (commercial)
> >
> > Point 3. suggests that they want to make money from self-managed DBaaS,
> but in the post they also write they really only expect significant income
> from 4, i.e. YugaByte (the database) managed by
> > YugaByte (the company).
> >
> > Where is PostgreSQL in relation to this?
> >
> > 1. PostgreSQL itself is certainly 1 above, the core database.
> >
> > 2. PostgreSQL permissive license allows commercial offerings to build
> and not share enterprise features (and I'm sure some companies are holding
> back). However, PostgreSQL is true community open
> > source so whatever enterprise features become relevant, they'll
> eventually be commoditized and out in the open. Is there a lot that is
> missing? And that some companies have but are not contributing?
> >
> > 3. So, PostgreSQL as-a-service. There are several companies (plenty?)
> who service PostgreSQL. I wonder how this is being shared back? I don't
> have a clear picture here, my impression is that the
> > software to run these potentially large farms is proprietary? Or, that
> operators would argue, this is all configuration and shell scripts and not
> really shareable open source?
> >
> > One aspect related to as-a-service is scaling out, i.e. not just having
> many small customers, but also serving large customers in the cloud. I
> looked around for scaling out solutions. There used to
> > be CitusData (not any longer it seems), there is PostgresXL which seems
> to be moving slowly. Is that it?
> There is also the Bidirectional Replication project (BDR). Also Why the
> comment about CitusData? I would the guess the opposite is true.
> >
> > 4. Managed DBaaS is not relevant here but always a commercial offering.
> >
> > So, back to my main question above. If I wanted to run a DBaaS shop with
> only PostgreSQL open source, how far away from being able to compete with
> AWS or Azure (or YugaByte for that matter) would I be?
> >
> > Thanks for any thoughts and opinions! Dirk
> >
>
>
> --
> Achilleas Mantzios
> IT DEV Lead
> IT DEPT
> Dynacom Tankers Mgmt
>
>
>
>


Re: Corrupt index stopping autovacuum system wide

2019-07-18 Thread Aaron Pelz
> Can you tell us more about this index? Can you share its definition
> (i.e. what does \d show in psql)?

> Is it an expression index, or a partial index? A composite? What
> datatypes are indexed?

It's a simple btree expression on a geometry(Point,4326) , no expression no
partial no composite.

On Wed, Jul 17, 2019 at 3:58 PM Peter Geoghegan  wrote:

> On Wed, Jul 17, 2019 at 10:27 AM Peter Geoghegan  wrote:
> > > It's possible that amcheck would have given you an accurate diagnosis
> > > of the problem -- especially if you used bt_index_parent_check():
> > >
> > > https://www.postgresql.org/docs/current/amcheck.html
> >
> > BTW, be sure to use the 'heapallindexed' option with
> > bt_index_parent_check() to detect missing downlinks, which is exactly
> > the problem that VACUUM complained about.
>
> Can you tell us more about this index? Can you share its definition
> (i.e. what does \d show in psql)?
>
> Is it an expression index, or a partial index? A composite? What
> datatypes are indexed?
>
> Thanks
> --
> Peter Geoghegan
>


Re: PostgreSQL as a Service

2019-07-18 Thread Adrian Klaver

On 7/18/19 7:23 AM, Dirk Riehle wrote:

Hello everyone!

tl;dr: How well is PostgreSQL positioned to serve as the database of 
choice for a DBaaS operator? Specifically, how much open source is (may 
be) missing?




Im un-lurking hoping to learn more about PostgreSQL in DBaaS land.

You may have seen this announcement.

https://blog.yugabyte.com/why-we-changed-yugabyte-db-licensing-to-100-open-source/ 



YugaByte bills itself as a PostgreSQL compatible database (yay to at 
least the intent) but most importantly, it decided to single-license its 
database under a permissive license, including "the enterprise features" 
that frequently are held back by single-vendor open source firms who 
want to earn a RoI for their VC investment.


The interesting part (and why I'm posting it here) is the following 
staging of functionality implied in that post.


1. Core database (permissively licensed)
2. Enterprise features (permissively licensed)
3. DBaaS features (trial license, commercial, no open source)
4. Managed by YugaByte (commercial)

Point 3. suggests that they want to make money from self-managed DBaaS, 
but in the post they also write they really only expect significant 
income from 4, i.e. YugaByte (the database) managed by YugaByte (the 
company).


Where is PostgreSQL in relation to this?

1. PostgreSQL itself is certainly 1 above, the core database.

2. PostgreSQL permissive license allows commercial offerings to build 
and not share enterprise features (and I'm sure some companies are 
holding back). However, PostgreSQL is true community open source so 
whatever enterprise features become relevant, they'll eventually be 
commoditized and out in the open. Is there a lot that is missing? And 
that some companies have but are not contributing?


3. So, PostgreSQL as-a-service. There are several companies (plenty?) 
who service PostgreSQL. I wonder how this is being shared back? I don't 
have a clear picture here, my impression is that the software to run 
these potentially large farms is proprietary? Or, that operators would 
argue, this is all configuration and shell scripts and not really 
shareable open source?


One aspect related to as-a-service is scaling out, i.e. not just having 
many small customers, but also serving large customers in the cloud. I 
looked around for scaling out solutions. There used to be CitusData (not 
any longer it seems), there is PostgresXL which seems to be moving 
slowly. Is that it?


4. Managed DBaaS is not relevant here but always a commercial offering.

So, back to my main question above. If I wanted to run a DBaaS shop with 
only PostgreSQL open source, how far away from being able to compete 
with AWS or Azure (or YugaByte for that matter) would I be?


The difference in resources available. The pull of DBaaS as I see it is 
the being able to spin up db's as needed on a scale needed from one or 
more locations. All with a unified management fronted/API. Being 
competitive means being able to match that.




Thanks for any thoughts and opinions! Dirk




--
Adrian Klaver
adrian.kla...@aklaver.com




Re: PostgreSQL as a Service

2019-07-18 Thread Achilleas Mantzios

On 18/7/19 5:23 μ.μ., Dirk Riehle wrote:

Hello everyone!

tl;dr: How well is PostgreSQL positioned to serve as the database of choice for 
a DBaaS operator? Specifically, how much open source is (may be) missing?



Im un-lurking hoping to learn more about PostgreSQL in DBaaS land.

You may have seen this announcement.

https://blog.yugabyte.com/why-we-changed-yugabyte-db-licensing-to-100-open-source/

YugaByte bills itself as a PostgreSQL compatible database (yay to at least the intent) but most importantly, it decided to single-license its database under a permissive license, including "the 
enterprise features" that frequently are held back by single-vendor open source firms who want to earn a RoI for their VC investment.


The interesting part (and why I'm posting it here) is the following staging of 
functionality implied in that post.

1. Core database (permissively licensed)
2. Enterprise features (permissively licensed)
3. DBaaS features (trial license, commercial, no open source)
4. Managed by YugaByte (commercial)

Point 3. suggests that they want to make money from self-managed DBaaS, but in the post they also write they really only expect significant income from 4, i.e. YugaByte (the database) managed by 
YugaByte (the company).


Where is PostgreSQL in relation to this?

1. PostgreSQL itself is certainly 1 above, the core database.

2. PostgreSQL permissive license allows commercial offerings to build and not share enterprise features (and I'm sure some companies are holding back). However, PostgreSQL is true community open 
source so whatever enterprise features become relevant, they'll eventually be commoditized and out in the open. Is there a lot that is missing? And that some companies have but are not contributing?


3. So, PostgreSQL as-a-service. There are several companies (plenty?) who service PostgreSQL. I wonder how this is being shared back? I don't have a clear picture here, my impression is that the 
software to run these potentially large farms is proprietary? Or, that operators would argue, this is all configuration and shell scripts and not really shareable open source?


One aspect related to as-a-service is scaling out, i.e. not just having many small customers, but also serving large customers in the cloud. I looked around for scaling out solutions. There used to 
be CitusData (not any longer it seems), there is PostgresXL which seems to be moving slowly. Is that it?

There is also the Bidirectional Replication project (BDR). Also Why the comment 
about CitusData? I would the guess the opposite is true.


4. Managed DBaaS is not relevant here but always a commercial offering.

So, back to my main question above. If I wanted to run a DBaaS shop with only 
PostgreSQL open source, how far away from being able to compete with AWS or 
Azure (or YugaByte for that matter) would I be?

Thanks for any thoughts and opinions! Dirk




--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt





Re: Postgers 9.3 - ubuntu 16.04 - Are clogs entires automatically deleted?

2019-07-18 Thread Adrian Klaver

On 7/18/19 1:34 AM, Cumer Cristiano wrote:

Hi List,

Today I wanted to dump a database but pg_dump is complaining about missing clog 
files.
I’m quite sure that nobody has deleted the files and that my filesystem is 
consistent.


What is the exact message you get?

What is the full version of Postgres, for example 9.3.x?



If I look in my clog directory I can see the files starting from 0013. The 
first ones are missing. I have checked the backups of a month ago and there I 
can see previous clog files, 0010,0011. I don’t have older backups.

So…. Is there a bug that delete clogs files even when they are still in use?

Cristiano




--
Adrian Klaver
adrian.kla...@aklaver.com




PostgreSQL as a Service

2019-07-18 Thread Dirk Riehle

Hello everyone!

tl;dr: How well is PostgreSQL positioned to serve as the database of choice 
for a DBaaS operator? Specifically, how much open source is (may be) missing?




Im un-lurking hoping to learn more about PostgreSQL in DBaaS land.

You may have seen this announcement.

https://blog.yugabyte.com/why-we-changed-yugabyte-db-licensing-to-100-open-source/

YugaByte bills itself as a PostgreSQL compatible database (yay to at least the 
intent) but most importantly, it decided to single-license its database under 
a permissive license, including "the enterprise features" that frequently are 
held back by single-vendor open source firms who want to earn a RoI for their 
VC investment.


The interesting part (and why I'm posting it here) is the following staging of 
functionality implied in that post.


1. Core database (permissively licensed)
2. Enterprise features (permissively licensed)
3. DBaaS features (trial license, commercial, no open source)
4. Managed by YugaByte (commercial)

Point 3. suggests that they want to make money from self-managed DBaaS, but in 
the post they also write they really only expect significant income from 4, 
i.e. YugaByte (the database) managed by YugaByte (the company).


Where is PostgreSQL in relation to this?

1. PostgreSQL itself is certainly 1 above, the core database.

2. PostgreSQL permissive license allows commercial offerings to build and not 
share enterprise features (and I'm sure some companies are holding back). 
However, PostgreSQL is true community open source so whatever enterprise 
features become relevant, they'll eventually be commoditized and out in the 
open. Is there a lot that is missing? And that some companies have but are not 
contributing?


3. So, PostgreSQL as-a-service. There are several companies (plenty?) who 
service PostgreSQL. I wonder how this is being shared back? I don't have a 
clear picture here, my impression is that the software to run these 
potentially large farms is proprietary? Or, that operators would argue, this 
is all configuration and shell scripts and not really shareable open source?


One aspect related to as-a-service is scaling out, i.e. not just having many 
small customers, but also serving large customers in the cloud. I looked 
around for scaling out solutions. There used to be CitusData (not any longer 
it seems), there is PostgresXL which seems to be moving slowly. Is that it?


4. Managed DBaaS is not relevant here but always a commercial offering.

So, back to my main question above. If I wanted to run a DBaaS shop with only 
PostgreSQL open source, how far away from being able to compete with AWS or 
Azure (or YugaByte for that matter) would I be?


Thanks for any thoughts and opinions! Dirk

--
Website: http://dirkriehle.com - Twitter: @dirkriehle
Ph (DE): +49-157-8153-4150 - Ph (US): +1-650-450-8550





Rearchitecting for storage

2019-07-18 Thread Matthew Pounsett
I've recently inherited a database that is dangerously close to outgrowing
the available storage on its existing hardware.  I'm looking for (pointers
to) advice on scaling the storage in a financially constrained
not-for-profit.

The current size of the DB's data directory is just shy of 23TB.  When I
received the machine it's on, it was configured with 18x3TB drives in
RAID10 (9x 2-drive mirrors striped together) for about 28TB of available
storage.  As a short term measure I've reconfigured them into RAID50 (3x
6-drive RAID5 arrays).  This is obviously a poor choice for performance,
but it'll get us through until we figure out what to do about
upgrading/replacing the hardware.  The host is constrained to 24x3TB
drives, so we can't get much of an upgrade by just adding/replacing disks.

One of my anticipated requirements for any replacement we design is that I
should be able to do upgrades of Postgres for up to five years without
needing major upgrades to the hardware.  My understanding of the standard
upgrade process is that this requires that the data directory be smaller
than the free storage (so that there is room to hold two copies of the data
directory simultaneously).  I haven't got detailed growth statistics yet,
but given that the DB has grown to 23TB in 5 years, I should assume that it
could double in the next five years, requiring 100TB of available storage
to be able to do updates.

This seems to be right on the cusp of what is possible to fit in a single
chassis with a RAID10 configuration (at least, with commodify hardware),
which means we're looking at pretty high cost:performance ratio.  I'd like
to see if we can find designs that get that ratio down a bit, or a lot, but
I'm a general sysadmin, and the detailed effects on those choices are
outside of my limited DBA experience.

Are there good documents out there on sizing hardware for this sort of
mid-range storage requirement, that is neither big data, nor "small data"
able to fit on a single host?   I'm hoping for an overview of the tradeoffs
between single head, dual-head setups with a JBOD array, or whatever else
is advisable to consider these days.  Corrections of any poor assumptions
exposed above are also quite welcome. :)

Thanks in advance for any assistance!


Re: Postgers 9.3 - ubuntu 16.04 - Are clogs entires automatically deleted?

2019-07-18 Thread Luca Ferrari
On Thu, Jul 18, 2019 at 10:34 AM Cumer Cristiano
 wrote:
> Today I wanted to dump a database but pg_dump is complaining about missing 
> clog files.
> I’m quite sure that nobody has deleted the files and that my filesystem is 
> consistent.
>
> If I look in my clog directory I can see the files starting from 0013. The 
> first ones are missing. I have checked the backups of a month ago and there I 
> can see previous clog files, 0010,0011. I don’t have older backups.

Uhm.. 9.3 is in end of life, ubuntu 16.04 refers to three years ago
and is approaching the extends security releases only, that is it
looks a little ancient combination.
I suspect pg_dump will provide you with an hint about which particular
object is causing it to fail, in the case I suggest to backup
everything except such object. At least you will have something backed
up.
Any chance your disk was full and something nasty happened?

Luca




Postgers 9.3 - ubuntu 16.04 - Are clogs entires automatically deleted?

2019-07-18 Thread Cumer Cristiano
Hi List,

Today I wanted to dump a database but pg_dump is complaining about missing clog 
files.
I’m quite sure that nobody has deleted the files and that my filesystem is 
consistent.

If I look in my clog directory I can see the files starting from 0013. The 
first ones are missing. I have checked the backups of a month ago and there I 
can see previous clog files, 0010,0011. I don’t have older backups.

So…. Is there a bug that delete clogs files even when they are still in use?

Cristiano

Re: Change in db size

2019-07-18 Thread Luca Ferrari
On Wed, Jul 17, 2019 at 7:33 PM Sonam Sharma  wrote:
>
> I took the backup using pg_dump with gzip option and restored it with psql.

It does not change the way you backed up, chances are when you
restored it the database cropped table and index bloating as already
mentioned.
The other, remote chance, is that you did not backed up all the
objects, so in this case you should show us your backup command line
(and restore one too), and also the output of commands like \l+, \dt+,
\di+ and differences you have found.

Luca