Re: [GENERAL] Please say it isn't so

2017-07-12 Thread Steve Crawford
On Tue, Jul 11, 2017 at 9:51 PM, Steve Litt 
wrote:

> Hi all,
>
> Please tell me this is a mistake:
>
> https://wiki.postgresql.org/wiki/Systemd
>
> Why a database system should care about how processes get started is
> beyond me. Systemd is an entangled mess that every year subsumes more
> and more of the operating system, in a very non-cooperative way.
>
> There are almost ten init systems. In every one of those init systems,
> one can run a process supervisor, such as runit or s6 or
> daemontools-encore, completely capable of starting the postgres server.
>
> Every year, systemd further hinders interoperability, further erodes
> interchangeability of parts, and continues to address problems with
> WONTFIX. In the long run, you do your users no favor by including
> init-system specific code in Postgres or its makefiles. If systemd
> can't correctly start Postgres, I guarantee you that s6 or runit,
> running on top of systemd, can.
>
> Postgres doesn't care which language makes a query to it. Why
> should Postgres care which init system started it? I hope you can free
> Postgres of init-specific code, and if for some reason you can't do
> that, at least don't recommend init-specific code.
>
>
Take a deep breath...

You are looking at a wiki page about PostgreSQL specifics for installation
on a machine running systemd. In that case it is natural to recommend
compiling with the --with-systemd option for better integration with systemd.

As the docs about that option say, "...This improves integration if the
server binary is started under systemd but has no impact otherwise..." You
are no more required to use systemd than you are to run PostgreSQL on
Windows but the options are available to you.
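
For reference, the integration amounts to building with that option and then
letting the service unit use notify-type startup - roughly along these lines
(paths and unit details are illustrative, not taken from the wiki page):

./configure --with-systemd ...

# in the systemd service unit:
[Service]
Type=notify
ExecStart=/usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data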

Cheers,
Steve


Re: [GENERAL] Python versus Other Languages using PostgreSQL

2017-05-09 Thread Steve Crawford
On Mon, May 8, 2017 at 2:26 PM, Paul Hughes  wrote:

> Hello,
>
> I noticed that most of the largest web platforms that use PostgreSQL as
> their primary database, also use Python as their primary back-end language.
> Yet, according to every benchmark I could find over the last couple of
> years, back-end languages like PHP, HHVM, and Node.JS outperform Python by
> 2x to 8x!
>

This is a questionable and incomplete starting point: a vague assertion
lacking a reasonable set of supporting examples.

You are conflating languages (PHP, Python) with the various environments in
which they run (the Node.js runtime for JavaScript, the HHVM virtual machine
for PHP/Hack), failing to provide version/implementation/acceleration
information for the languages (Zend, PyPy, ...), and failing to state anything
about the environment: is it raw code or are we actually discussing
framework performance? What are the other requirements/constraints on the
overall system?

Picking a single data-point (speed) of a single part of a system (back end
language) is not typically informative. There is speed of development,
availability of developers, feature sets, robustness, licensing,
availability of commercial support, security issues and ability to
integrate with other parts of a corporate environment to name just a few.
The impact of the language speed may be minuscule compared to the impact of
other aspects of the overall system. One could as easily ask, "Why do so
many people use PHP when assembler is so much faster?"

If speed is the only requirement you can get a great boost in PostgreSQL by
turning off fsync. Of course you may lose all your data in a crash but,
hey, it's faster!

To quote our own illustrious Tom Lane:

"...Try to carry 500 people from Los Angeles to Tokyo in an F-15.  No?  Try
to win a dogfight in a 747.  No? But they both fly, so it must be useful to
compare them...  especially on the basis of the most simplistic test case
you can think of.  For extra points, use *only one* test case.  Perhaps
this paper can be described as "comparing an F-15 to a 747 on the basis of
required runway length...".

To really determine the answer you would have to ask each one why they
chose their specific combination of language(s), database, OS, hardware/VM,
caching, pooling, etc.; what they would do differently in retrospect; and
also evaluate the validity of those decisions. For the most part we, here,
can only guess.



> So here are my questions:
>
> 1) Why do the largest web applications that use PostgreSQL also use
> Python, even though Python is significantly slower than it's biggest
> competitors?
>

See comments above.


>
> 2) Can PostgreSQL just as easily be used with PHP or Node.js? If not, why
> not?
>
>
Yes. We have PHP, Python, Ruby, C, Bash (via psql), Perl, LibreOffice, etc.
directly connecting to our server. There is also client support for Java,
.NET, Tcl, C++, ODBC, Erlang, Haskell, Lua, R, Smalltalk, ...


> 3) Can PostgreSQL be made to work seamlessly to take advantage of the
> superior performance of HHVM or Node.js?
>

See above.


>
>
> Thank you in advance!
>
> ~Paul
>
>
>


Re: [GENERAL] why isn't this subquery wrong?

2017-04-20 Thread Steve Crawford
On Thu, Apr 20, 2017 at 3:56 PM, jonathan vanasco  wrote:

> thanks all!
>
> On Apr 20, 2017, at 6:42 PM, David G. Johnston wrote:
>
> ​Subqueries can see all columns of the parent.  When the subquery actually
> uses one of them it is called a "correlated subquery".
>
>
> i thought a correlated subquery had to note that table/alias, not a raw
> column.   I guess i've just been adhering to good form.
>
>
> On Apr 20, 2017, at 6:43 PM, Tom Lane wrote:
>
> Cautious SQL programmers qualify all references inside sub-selects to
> avoid getting caught by this accidentally.
>
>
> is there a syntax to qualify a reference to lock a subquery to the current
> scope (disable looking at the parents)?  that's how I got caught on this by
> accident.
>

Like Tom said, "qualify all references":

...(SELECT example_a__rollup.bar_id FROM example_a__rollup)...

Or shortened with aliases:

...(SELECT x.bar_id FROM example_a__rollup x)...

Cheers,
Steve


Re: [GENERAL] why isn't this subquery wrong?

2017-04-20 Thread Steve Crawford
On Thu, Apr 20, 2017 at 3:17 PM, jonathan vanasco  wrote:

>
> I ran into an issue while changing a database schema around.  Some queries
> still worked, even though I didn't expect them to.
>
> Can anyone explain to me why the following is valid (running 9.6) ?
>
> schema
>
> CREATE TEMPORARY TABLE example_a__data (
> foo_id INT,
> bar_id INT
> );
> CREATE TEMPORARY TABLE example_a__rollup_source (
> id int primary key,
> name varchar(64),
> foo_id INT,
> check_bool BOOLEAN
> );
> CREATE TEMPORARY TABLE example_a__rollup AS
> SELECT id, name, foo_id
> FROM example_a__rollup_source
> WHERE check_bool IS TRUE
> ;
>
>
> query:
>
> SELECT foo_id
> FROM example_a__data
> WHERE foo_id IN (SELECT bar_id FROM example_a__rollup)
> ;
>
>
> a raw select of `SELECT bar_id FROM example_a__rollup;` will cause an
> error because bar_id doesn't exist
>
> postgres doesn't raise an error because example_a__data does have a bar_id
> -- but example_a__rollup doesn't and there's no explicit correlation in the
> query.
>
> can someone explain why this happens?  i'm guessing there is a good reason
> -- but I'm unfamiliar with the type of implicit join/queries this behavior
> is enabling.
>
>
>
There is no requirement in this query that bar_id exist in the
example_a__rollup table, and since the column appears in only one table in
scope it is unambiguous, so the server doesn't complain.

It may be explanatory to add a couple of records to your example_a__data
table:

insert into example_a__data values (3,4),(5,6);

Then run a simple select showing what the where clause would see:

SELECT
   foo_id,
   (SELECT bar_id FROM example_a__rollup)
FROM
example_a__data
;

 foo_id | bar_id
--------+--------
      3 |
      5 |

bar_id is null because there are no rows in example_a__rollup.

Now add a single record to example_a__rollup:

insert into example_a__rollup (id) values (10);

Rerun the query and you will get:

 foo_id | bar_id
--------+--------
      3 |      4
      5 |      6


If you add another record to example_a__rollup and rerun the query you will get:

ERROR:  more than one row returned by a subquery used as an expression


Although the subquery won't work as an expression it would still work in the
WHERE clause, but I doubt it will return what you desire. Unfortunately there
are lots of ways to write syntactically correct but logically flawed
statements.
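
Had the reference been qualified it would have failed immediately. Using the
same temporary tables:

SELECT foo_id
FROM example_a__data
WHERE foo_id IN (SELECT example_a__rollup.bar_id FROM example_a__rollup);
ERROR:  column example_a__rollup.bar_id does not exist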


Cheers,
Steve


Re: [GENERAL] Handling psql lost connections

2017-03-29 Thread Steve Crawford
On Wed, Mar 29, 2017 at 9:05 AM, Adrian Klaver <adrian.kla...@aklaver.com>
wrote:

> On 03/29/2017 08:49 AM, Steve Crawford wrote:
>
>> When firewalls/VPNs stand between my psql client and a remote PostgreSQL
>> server the connection will on occasion time out and drop. This results
>> in the following scenario:
>>
>> -Leave for lunch mid project - leave psql open.
>>
>> -Return from lunch, complete and submit large query.
>>
>> -Notice query is taking too long. cancel it.
>>
>> -Cancel doesn't return - realize that connection has dropped.
>>
>> -Kill psql - history is not written out. Start query from scratch.
>>
>> Is there:
>>
>> 1) A way to set psql to send keepalives?
>>
>
> From server side:
> https://www.postgresql.org/docs/9.6/static/runtime-config-
> connection.html#RUNTIME-CONFIG-CONNECTION-SETTINGS
>
> tcp_keepalives*
>
> I guess you could abuse \watch:
>
> https://www.postgresql.org/docs/9.6/static/app-psql.html
>
> \watch [ seconds ]
>
> Repeatedly execute the current query buffer (as \g does) until
> interrupted or the query fails. Wait the specified number of seconds
> (default 2) between executions. Each query result is displayed with a
> header that includes the \pset title string (if any), the time as of query
> start, and the delay interval.
>
> aklaver@test=> \watch 2
> Watch every 2s  Wed Mar 29 08:59:55 2017
>
>  ?column?
> --
> 1
> (1 row)
>
> Watch every 2s  Wed Mar 29 08:59:57 2017
>
>  ?column?
> --
> 1
> (1 row)
>
> With a larger value of seconds.



If I could remember to do that, I would also remember that I had psql running
in one or more terminals on one of my virtual screens and would just close it.
As it is, I try to remember to close psql and restart it if it has been
sitting for more than a few minutes.




> 2) A way to gracefully kill psql ensuring that the history is saved?
>>
>> Yes, I know I and my coworkers could spend brain cycles trying to
>> unerringly remember to close and restart connections, write all queries
>> in an external editor and then submit them, etc. but I'm looking for
>> more user friendly options.
>>
>
> Use the internal editor(\e)?


That is actually the typical *cause* of the problems. I usually do use \e
to fire up the external $EDITOR for anything more than a trivial query, and
if I step away mid-edit and later finish and write/quit, the query is not
visible on the screen where I could scroll back to it. If the connection has
dropped, I have to kill psql and the history is lost as well.

I think for now that I'll just add some tcp settings to sysctl.conf to deal
with the firewalls.
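
Something along these lines in /etc/sysctl.conf, with illustrative values -
note these apply system-wide; the libpq connection parameters keepalives_idle,
keepalives_interval and keepalives_count are the per-connection alternative:

net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 5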

Cheers,
Steve


[GENERAL] Handling psql lost connections

2017-03-29 Thread Steve Crawford
When firewalls/VPNs stand between my psql client and a remote PostgreSQL
server the connection will on occasion time out and drop. This results in
the following scenario:

-Leave for lunch mid project - leave psql open.

-Return from lunch, complete and submit large query.

-Notice query is taking too long. cancel it.

-Cancel doesn't return - realize that connection has dropped.

-Kill psql - history is not written out. Start query from scratch.

Is there:

1) A way to set psql to send keepalives?

2) A way to gracefully kill psql ensuring that the history is saved?

Yes, I know I and my coworkers could spend brain cycles trying to
unerringly remember to close and restart connections, write all queries in
an external editor and then submit them, etc. but I'm looking for more user
friendly options.

Cheers,
Steve


Re: [GENERAL] CenOS 5/Postgresql 9.6

2017-03-20 Thread Steve Crawford
Looks like I'll be compiling. BTW, I'm not planning on running a server on
it but have a couple pieces of expensive specialized hardware that are only
certified on RHEL 5 i386 systems. They do need to talk to the server and I
don't want to delay the migration of the rest of the infrastructure just
because those aren't replaced, yet.

Cheers,
Steve


On Sat, Mar 18, 2017 at 3:59 AM, Devrim Gündüz <dev...@gunduz.org> wrote:

>
> Hi,
>
> On Fri, 2017-03-17 at 12:15 -0700, Steve Crawford wrote:
> > The question remains - does anyone know where I might find packages so I
> > don't have to compile them myself?
>
> (I'm the maintainer of the repository)
>
> There are no packages for CentOS 5 / PG 9.6, and it was my decision to drop
> support -- as I thought that using CentOS 5 with a new set of PG releases
> would
> not be the best idea.
>
> I still think so.
>
> However, you can still rebuild 9.6 RPMs on your CentOS 5 box, by using this
> SRPM:
>
> https://download.postgresql.org/pub/repos/yum/srpms/9.6/
> redhat/rhel-6-x86_64/postgresql96-9.6.2-2PGDG.rhel6.src.rpm
>
> I did not test it though, but it should work with more or less work...
>
> You can also contact a PostgreSQL support company, and ask them to provide
> the
> RPMs, too.
>
> Regards,
> --
> Devrim Gündüz
> EnterpriseDB: http://www.enterprisedb.com
> PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
> Twitter: @DevrimGunduz , @DevrimGunduzTR
>


Re: [GENERAL] CenOS 5/Postgresql 9.6

2017-03-17 Thread Steve Crawford
On Fri, Mar 17, 2017 at 11:35 AM, John R Pierce <pie...@hogranch.com> wrote:

> On 3/17/2017 11:07 AM, Steve Crawford wrote:
>
>> Where might I find yum repos PostgreSQL 9.6 on CentOS 5 (i386 & x86_64)?
>>
>> RHEL/CentOS 5 is still in production with extended support through 2020
>> but seems to be dropped from the 9.6 PGDG repos.
>>
>
> CentOS 5 will be completely dropped in 2-3 weeks, I believe. The last
> CentOS 5 Update 11 release was in 2014.  Extended support applies only
> to RHEL, and costs significant money per server. You might contact the
> yum.postgresql.org repository manager's employer, EnterpriseDB, and see
> if they'd be willing to extend support for RHEL 6 and PG 9.6 under contract.
>
>
I'm aware of all those dates. Also that 9.6 has been out for six-months,
and that RHEL/CentOS 5 are listed among the currently supported versions at
https://yum.postgresql.org/.

The question remains - does anyone know where I might find packages so I
don't have to compile them myself?

Cheers,
Steve


[GENERAL] CenOS 5/Postgresql 9.6

2017-03-17 Thread Steve Crawford
Where might I find yum repos PostgreSQL 9.6 on CentOS 5 (i386 & x86_64)?

RHEL/CentOS 5 is still in production with extended support through 2020 but
seems to be dropped from the 9.6 PGDG repos.

Cheers,
Steve


Re: [GENERAL] GMT FATAL: remaining connection slots are reserved for non-replication superuser connections, but I'm using pgBouncer for connection pooling

2017-02-28 Thread Steve Crawford
On Sat, Feb 25, 2017 at 4:19 AM, lisandro 
wrote:

> Hi there! Please tell me if this isn't the place to post my question, I'm
> new
> in the list.
>
> I'm using PostgreSQL 9.3, I have around 150 databases, and I use pgBouncer
> for connection pooling.
> My server is a VPS with 8cpus and 24gb of RAM.
>
> My current postgreSQL configuration (resumed) is this:
>
> listen_addresses = '*'
> port = 6543
> max_connections = 250
> shared_buffers = 2GB
> effective_cache_size = 6GB
> work_mem = 10485kB
> maintenance_work_mem = 512MB
> checkpoint_segments = 32
> checkpoint_completion_target = 0.7
> wal_buffers = 16MB
> default_statistics_target = 100
>
>
> In the other hand, my pgBouncer configuration (resumed) is this:
>
> listen_addr = localhost
> listen_port = 5432
> pool_mode = transaction
> server_reset_query = DISCARD ALL
> max_client_conn = 1
> default_pool_size = 10
> min_pool_size = 2
> server_idle_timeout = 30
> ...


Note that pgBouncer pool size is per user/database pair. With these
settings and 150 databases I'm actually surprised that you aren't running
out of connections more often. Perhaps there are per-database settings that
haven't been shown. We are also missing info on reserve_pool_timeout,
max_db_connections, etc. which could all play a role, here.
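
To put rough numbers on it, assuming the settings shown apply to every
database: 150 databases x default_pool_size 10 = up to 1,500 potential server
connections, far above max_connections = 250, and even min_pool_size 2 x 150
= 300 exceeds it.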

Cheers,
Steve


Re: [GENERAL] postgresql how to duplicate rows in result.

2017-02-16 Thread Steve Crawford
For my enlightenment, why use LATERAL here? I get the same result with a
simple CROSS JOIN (though overall I like the clever solution).
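
For reference, the two forms I compared, assuming foo has an integer column
named "number" as in the example below:

SELECT * FROM foo CROSS JOIN LATERAL generate_series(1, number);
SELECT * FROM foo CROSS JOIN generate_series(1, number);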

Cheers,
Steve


On Thu, Feb 16, 2017 at 12:11 AM, Alessandro Baggi <
alessandro.ba...@gmail.com> wrote:

> Il 15/02/2017 19:11, Alessandro Baggi ha scritto:
>
>> Il 14/02/2017 21:51, Merlin Moncure ha scritto:
>>
>>> On Tue, Feb 14, 2017 at 12:36 PM, Merlin Moncure 
>>> wrote:
>>>
 On Tue, Feb 14, 2017 at 12:04 PM, Alessandro Baggi
  wrote:

> Hi list,
> sorry for my english, I will try to example as well. I've a query
> that joins
> multiple tables and return a result like:
>
> id,customers,phone,code,number
> 1 , ,3,123 , 2
> 2 , aassdsds,33322,211 , 1
> 3 , ,21221,221 , 1
>
>
> I need, where "number" field is > 1, to duplicate the row * N(number
> field
> value) with a result like this:
>
> id,customers,phone,code,number
> 1 , ,3,123 , 2
> 1 , ,3,123 , 2
> 2 , aassdsds,33322,211 , 1
> 3 , ,21221,221 , 1
>
> How I can accomplish to this problem?
>

 SELECT * FROM foo CROSS JOIN LATERAL (1,number);

 :-D

>>>
>>> oops -- copy/paste error
>>>
>>> SELECT * FROM foo CROSS JOIN LATERAL generate_series(1,number);
>>>
>>> merlin
>>> .
>>>
>>> Hi Merlin,
>> I've tried your suggested code and with cross join and generate_series I
>> can generate multiple row. There is a way to put as second args a column
>> values? I've tried to put "table.number" column values but I got
>> "generate_series() does not exists". Inserting a simple int like 5 I get
>> 5 results for each row.
>>
>> I've searched on google but can't find a valid example.
>>
>>
>>
>> Thanks in advance.
>>
>>
>> Hi Merlin,
> I've solved my problem (passing column as number) using a cast
> generate_series(1,table.number::int)
>
> thanks to all for answart.
>
> SOLVED
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] FATAL: remaining connection slots are reserved for non-replication superuser connections

2017-02-07 Thread Steve Crawford
On Tue, Feb 7, 2017 at 6:52 PM, Patrick B  wrote:

> Hi guys,
>
> I get these messages at least once a day in my Prod environment:
>
>> FATAL:  remaining connection slots are reserved for non-replication
>> superuser connections
>
> I do not have a DB pooler and my max_connections is 200. However, max
> connections for my PHP Application is 120.
>
> My server has 128GB and SSD 10K iops disks (Amazon EBS).
>
>
> Can you guys please outlines me the steps to troubleshoot this?
>
> Interesting is that I didn't see any IO/CPU limitation on my server.
>
> I'm currently running a Postgres 9.2 - one master and one slave streaming
> replication.
>
>
> Thanks
>
> Patrick
>

Something is using too many connections.

I may be wrong but I'm unaware of a limit on connections from PHP except
when you are using persistent connections. Since each PHP script is its
own process, it can create one or more connections. I'd check to be sure
that every PHP script you have is, indeed, using pg_pconnect and not
pg_connect. That missing "p" could be hard to spot. I'm assuming, of
course, that you are sure that your PHP scripts are the only things that can
connect - no other scripts, backups, etc. are consuming connections.
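
A quick way to see what is actually consuming the slots when it happens
(works on 9.2) is something like:

SELECT datname, usename, client_addr, state, count(*)
FROM pg_stat_activity
GROUP BY 1, 2, 3, 4
ORDER BY count(*) DESC;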

But generally I'd advise using pgBouncer or a similar pooler, which can deal
with a mix of persistent and non-persistent connections from one or multiple
hosts.

Cheers,
Steve


Re: [GENERAL] BST Time Zone Discrepancy

2017-02-06 Thread Steve Crawford
On Mon, Feb 6, 2017 at 12:44 PM, Igal @ Lucee.org  wrote:

> Tom,
>
> Thank you for your reply:
> On 2/6/2017 12:18 PM, Tom Lane wrote:
>
> This is controlled by the timezone_abbreviations file, which if
>
> you haven't changed it lists:
>
> # CONFLICT! BST is not unique
> # Other timezones:
> #  - BST: Bougainville Standard Time (Papua New Guinea)
> BST  3600 D  # British Summer Time
>  # (Europe/London)
>
> I haven't changed any of the config files.  I can not find that file on my
> system (maybe it's in the source code only).
>
> I am using the Red Hat distribution:  PostgreSQL 9.6.1 on
> x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat
> 4.8.5-4), 64-bit
>
> pg_timezone_names shows the **current** abbreviation for the zone in question
>
> I'm not sure what you mean by "current".  If this is not an issue then
> that's fine, you can ignore this message.  It just seemed weird to me that
> pg_timezone_names and pg_timezone_abbrevs showed very different results for
> the same code.
>
> Thanks,
>
> Igal Sapir
> Lucee Core Developer
> Lucee.org 
>
"Current" in this context means the abbreviation in effect at the current
time. In other words, if I were to look at my current time zone
abbreviation it would currently be PST (Pacific Standard time) but in a
couple months it would be PDT so the *view* will change depending on the
time of year.

It's important to note that there is no BST timezone. That is an
abbreviation for a timezone *offset*. Where I live we could be in PST (-08)
or PDT (-07). Currently we are in standard time but I could ask for the
current time in daylight time and PostgreSQL will give me the current point
in time with a PDT or -07 offset. Compounding the problem is the fact that
abbreviations are not globally unique. Both the United States and Australia
have Eastern and Central Standard and Daylight times, for example.

A timezone, on the other hand, encapsulates the offset as it changes both
throughout the year and historically. It is almost always preferable to use
an actual timezone by specifying it by name as in Europe/London,
America/Los_Angeles, etc.
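
For example, compare the two views - the abbreviation reported for the zone
will vary with the time of year you run this:

select * from pg_timezone_abbrevs where abbrev = 'BST';
select name, abbrev, utc_offset, is_dst from pg_timezone_names
where name = 'Europe/London';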

Cheers,
Steve


Re: [GENERAL] psql only works with -h (even localhost)

2017-01-25 Thread Steve Crawford
Adrian asks the correct questions. Lacking the answers to those I'm going
to venture a guess that a Unix-domain socket exists but access via
Unix-domain sockets is somehow blocked, probably by pg_hba.conf.

From the psql man page: "...Not all of these options are required; there
are useful defaults. If you omit the host name, psql will connect via a
Unix-domain socket to a server on the local host, or via TCP/IP to
localhost on machines that don't have Unix-domain sockets"
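
For example, a pg_hba.conf along these lines (purely illustrative - not the
OP's actual file) would produce exactly that behavior, with TCP connections
to localhost succeeding while Unix-socket connections are refused:

# TYPE  DATABASE  USER  ADDRESS       METHOD
local   all       all                 reject
host    all       all   127.0.0.1/32  md5

Simply having no "local" line at all would have the same effect.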

Cheers,
Steve




On Wed, Jan 25, 2017 at 8:07 AM, Adrian Klaver 
wrote:

> On 01/25/2017 08:02 AM, ProPAAS DBA wrote:
>
>> Hi all;
>>
>>
>> we have a client server where 'psql -h localhost' is the only way psql
>> will connect when ssh'd onto the db server. Would like to be able to
>> just run psql but not sure what the issue/fix is. Anyone have any
>> insight hot to fix this?
>>
>
>
> What is the error you get if you use something other then -h localhost?
>
> What is in pg_hba.conf?
>
>
>>
>> Thanks in advance
>>
>>
>>
>>
>>
>>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] COPY to question

2017-01-17 Thread Steve Crawford
On Tue, Jan 17, 2017 at 10:23 AM, Rich Shepard 
wrote:

>   Running -9.6.1. I have a database created and owned by me, but cannot
> copy
> a table to my home directory. Postgres tells me it cannot write to that
> directory. The only way to copy tables to files is by doing so as the
> superuser (postgres).
>
>   Why is this, and can I change something so I, as a user, can copy tables
> directly to ~/?
>
> To add to the other answers, more info is available at
https://www.postgresql.org/docs/current/static/app-psql.html#APP-PSQL-META-COMMANDS-COPY
and https://wiki.postgresql.org/wiki/COPY

Note that you can invoke SQL COPY to STDOUT as in: COPY (some arbitrary
query) TO STDOUT;

You would either pipe/redirect the output of psql as desired or use "\o"
within psql to reroute the output to a file or pipe it to a program. For
example, to output a CSV using a pipe as the delimiter and double-quote as
the quote character, changing all "ma" to "pa" and putting the result into
myoutput.txt:

\o | sed s/ma/pa/g > myoutput.txt
copy (some query) to stdout csv header delimiter '|' quote '"';
\o
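
Or skip interactive psql entirely and redirect at the shell - a sketch, with
the database name and query as placeholders:

psql -d mydb -c "copy (some query) to stdout csv header" > myoutput.csv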

Cheers,
Steve


Re: [GENERAL] COPY: row is too big

2017-01-04 Thread Steve Crawford
...

> Numeric is expensive type - try to use float instead, maybe double.
>>
>
> If I am following the OP correctly the table itself has all the columns
> declared as varchar. The data in the CSV file is a mix of text, date and
> numeric, presumably cast to text on entry into the table.
>

But a CSV *is* purely text - no casting to text is needed. Conversion is
only needed when the strings in the CSV are text representations of
*non*-text data.

I'm guessing that the OP is using all text fields to deal with possibly
flawed input data and then validating and migrating the data in subsequent
steps. In that case, an ETL solution may be a better approach. Many options
exist: open-source, closed-source and hybrid.

Cheers,
Steve


Re: [GENERAL] Generating sample data

2016-12-27 Thread Steve Crawford
On Tue, Dec 27, 2016 at 12:01 PM, Steve Crawford <
scrawf...@pinpointresearch.com> wrote:

> You could start here:
> http://www.softwaretestingmagazine.com/tools/open-source-test-data-
> generators/
>
> I have rolled my own on occasion by just pulling some public lists of most
> common given names and family names and toing a full-join. Same for city,
> streets, etc.
>
> -Steve
>
> On Tue, Dec 27, 2016 at 11:23 AM, Rich Shepard <rshep...@appl-ecosys.com>
> wrote:
>
>>   My previous databases used real client (or my own) data; now I want to
>> generate sample data for the tables in the two applications I'm
>> developing.
>> My web search finds a bunch of pricey (IMO) commercial products.
>>
>>   Are there any open source data generators that can provide sample data
>> based on each table's schema?
>>
>> TIA,
>>
>> Rich
>>
>>
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
>
Sorry, "doing" a full-join. Which also leads to lots of fun cross-cultural
names like "Muhammad Wang" and "Santiago O'Leary".

Cheers,
Steve


Re: [GENERAL] Generating sample data

2016-12-27 Thread Steve Crawford
You could start here:
http://www.softwaretestingmagazine.com/tools/open-source-test-data-generators/

I have rolled my own on occasion by just pulling some public lists of most
common given names and family names and toing a full-join. Same for city,
streets, etc.

-Steve

On Tue, Dec 27, 2016 at 11:23 AM, Rich Shepard 
wrote:

>   My previous databases used real client (or my own) data; now I want to
> generate sample data for the tables in the two applications I'm developing.
> My web search finds a bunch of pricey (IMO) commercial products.
>
>   Are there any open source data generators that can provide sample data
> based on each table's schema?
>
> TIA,
>
> Rich
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] timestamp without timezone to have timezone

2016-11-06 Thread Steve Crawford
On Sun, Nov 6, 2016 at 7:10 AM, Melvin Davidson 
wrote:
>
>
>
> On Sun, Nov 6, 2016 at 9:11 AM, Benjamin Adams 
wrote:
>>
>> I have a server that has a column timestamp without timezone.
>>
>> Is the time still saved?
>> if I select column with timestamp it will show server timestamp with
timezone.
>>
>> But If I move the data from EST to Central will the timestamp with
timezone be correct?
>> Or will it just not make the adjustment?
>>
>> Thanks
>> Ben
>
>
> > But If I move the data from EST to Central will the timestamp with timezone be correct?
>
> The correct way to do that is to use the AT TIME ZONE function.
>
> https://www.postgresql.org/docs/9.4/static/functions-datetime.html#FUNCTIONS-DATETIME-ZONECONVERT
>
> eg:
>
> postgres=> SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'EST' as Eastern,
> postgres->        TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'CST' as Central,
> postgres->        TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'MST' as Mountain,
> postgres->        TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'PST' as Pacific;
>         eastern         |        central         |        mountain        |        pacific
> ------------------------+------------------------+------------------------+------------------------
>  2001-02-16 20:38:40-05 | 2001-02-16 21:38:40-05 | 2001-02-16 22:38:40-05 | 2001-02-16 23:38:40-05
> (1 row)


The question does not completely make sense and Melvin's answer is headed
in the right direction but there are a number of subtleties and
complications.

First, let me rephrase to make sure I understand the question. You have
some table(s) with column(s) of type timestamp without time zone. You
currently view the data from the perspective of US/Eastern (probably not
actually EST - more comments on the difference between offsets and zones
below) and want to know what happens if you view it from the perspective of
US/Central.

The short answer is that nothing will change. I'm in US/Pacific by default:

steve=> create temporary table foo (bar timestamp without time zone);
CREATE TABLE
steve=> insert into foo values (now());
INSERT 0 1

steve=> select bar from foo;
            bar
----------------------------
 2016-11-06 08:57:06.808096

steve=> set timezone to 'US/Eastern';

steve=> select bar from foo;
            bar
----------------------------
 2016-11-06 08:57:06.808096

steve=> set timezone to 'US/Central';

steve=> select bar from foo;
            bar
----------------------------
 2016-11-06 08:57:06.808096


But if you do any sort of operation for which the timestamp alone is not
sufficient thus time zone information is required, PostgreSQL will
convert/calculate based on the current time zone setting.

Starting back at Pacific time, if I look at various other time zones I get:
steve=> select
   bar at time zone 'US/Pacific',
   bar at time zone 'US/Central',
   bar at time zone 'US/Eastern'
from
   foo;
-[ RECORD 1 ]---
timezone | 2016-11-06 08:57:06.808096-08
timezone | 2016-11-06 06:57:06.808096-08
timezone | 2016-11-06 05:57:06.808096-08

But if my client is set to Eastern I get:

steve=> set time zone 'US/Eastern';

steve=> select
   bar at time zone 'US/Pacific',
   bar at time zone 'US/Central',
   bar at time zone 'US/Eastern'
from
   foo;
-[ RECORD 1 ]---
timezone | 2016-11-06 11:57:06.808096-05
timezone | 2016-11-06 09:57:06.808096-05
timezone | 2016-11-06 08:57:06.808096-05

If you alter the table and change the data type to timestamp with time zone
(which to my mind is a bad name that we are stuck with - it should be
thought of as a "point in time" that can be displayed in any local time
zone) the data will be converted as above based on the current time zone
setting.
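
If you want the conversion to interpret the stored values as a specific zone
regardless of the session's TimeZone setting, you can say so explicitly in
the USING clause - a sketch using the example table:

alter table foo alter column bar type timestamp with time zone
using bar at time zone 'US/Eastern';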

Another "gotcha": "EST" is an *offset* from UTC - specifically, it is
5-hours behind UTC. "US/Eastern", or one of the equivalent full names for
that zone (select * from pg_timezone_names;), is a time *zone*. Time zones
incorporate the various spring-forward/fall-back offset rules as they have
changed through history. Today is a convenient day for demonstrating. I
changed the table to have two columns, bar1 and bar2. bar2 has the
timestamp we used before and bar1 is the same timestamp but a day earlier.
This is what you will get if you display the values and the difference
between them in a zone-unaware way:

steve=> select bar1, bar2, bar2-bar1 from foo;
-[ RECORD 1 ]
bar1 | 2016-11-05 08:57:06.808096
bar2 | 2016-11-06 08:57:06.808096
?column? | 1 day

Now, let's change the data types (I'm still set to US/Eastern):

steve=> alter table foo alter column bar1 type timestamp with time zone;

steve=> alter table foo alter column bar2 type timestamp with time zone;

steve=> select bar1, bar2, bar2-bar1 from foo;
-[ RECORD 1 ]---
bar1 | 2016-11-05 08:57:06.808096-04
bar2 | 2016-11-06 08:57:06.808096-05
?column? | 1 day 01:00:00

Note that the difference is now 1 day 01:00:00 rather than 1 day: the
fall-back transition on November 6, 2016 occurred between the two timestamps,
which is also why bar1 displays with a -04 offset and bar2 with -05.

Re: [GENERAL] Hardware recommendations?

2016-11-02 Thread Steve Crawford
After much cogitation I eventually went RAID-less. Why? The only option for
hardware RAID was SAS SSDs, and given that they are not built on
electro-mechanical spinning-rust technology the RAID card seemed like
just another point of solid-state failure. I combined that with the
fact that the RAID card limited me to the relatively slow SAS data-transfer
rates, which are blown away by something like an Intel NVMe SSD plugged into
the PCIe bus. RAIDing those could be done in software, plus $$$ for the extra
NVMe SSDs, but I already have data redundancy through a combination of
regular backups and streaming replication to identically equipped machines
which rarely lag the master by more than a second.

Cheers,
Steve






On Wed, Nov 2, 2016 at 1:20 PM, Scott Marlowe 
wrote:

> On Wed, Nov 2, 2016 at 11:40 AM, Joshua D. Drake 
> wrote:
> > On 11/02/2016 10:03 AM, Steve Atkins wrote:
> >>
> >> I'm looking for generic advice on hardware to use for "mid-sized"
> >> postgresql servers, $5k or a bit more.
> >>
> >> There are several good documents from the 9.0 era, but hardware has
> moved
> >> on since then, particularly with changes in SSD pricing.
> >>
> >> Has anyone seen a more recent discussion of what someone might want for
> >> PostreSQL in 2017?
> >
> >
> > The rules haven't changed much, more cores (even if a bit slower) is
> better
> > than less, as much ram as the budget will allow and:
> >
> > SSD
> >
> > But make sure you get datacenter/enterprise SSDs. Consider that even a
> slow
> > datacenter/enterprise SSD can do 500MB/s random write and read just as
> fast
> > if not faster. That means for most installations, a RAID1 is more than
> > enough.
>
> Just to add that many setups utilizing SSDs are as fast or faster
> using kernel level RAID as they are with a hardware RAID controller,
> esp if the RAID controller has caching enabled. We went from 3k to 5k
> tps to 15 to 18k tps by turnong off caching on modern LSI MegaRAID
> controllers running RAID5.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Way to quickly detect if database tables/columns/etc. were modified?

2016-10-31 Thread Steve Crawford
Not sure if it would work for your use-case but what about just monitoring
the PostgreSQL log for DDL statements? You may have to filter out temp
tables (as you might in the system catalogs as well) but you could probably
also watch for specific tablename patterns in case you only need to
invalidate cache under specific circumstances.
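
A minimal sketch of that approach, assuming 9.4 or later for ALTER SYSTEM
(otherwise set it in postgresql.conf and reload):

ALTER SYSTEM SET log_statement = 'ddl';
SELECT pg_reload_conf();

After that, every CREATE/ALTER/DROP lands in the server log where a small
log-watcher can trigger the cache invalidation.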

Cheers,
Steve


On Mon, Oct 31, 2016 at 7:17 AM, Melvin Davidson 
wrote:

>
>
> On Mon, Oct 31, 2016 at 9:48 AM, Karsten Hilbert 
> wrote:
>
>> On Mon, Oct 31, 2016 at 09:14:07AM -0400, Melvin Davidson wrote:
>>
>> >> Maybe create an event trigger that updates a simple table with the last
>> >> modification time or sends a notification?
>> ...
>> > That would certainly work, but
>> > the problem is, that trigger would have to be created for every table in
>> > the database. When you have more than a couple dozen tables, as in
>> > hundreds, it becomes a huge undertaking.
>>
>> Well, it could be generated.
>>
>> Karsten
>> --
>> GPG key ID E4071346 @ eu.pool.sks-keyservers.net
>> E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
> >Well, it could be generated.
> True, but it still is more code to maintain, whereas I maintain that the
> addition of one column in pg_class and pg_attribute with a default
> of now()
> would make more sense. That being said, there were so many naysayers
> grasping at corner cases the last time I brought this up I have given up
> pursuing it.
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.
>


Re: [GENERAL] Installing 9.6 RC on Ubuntu [Solved]

2016-09-13 Thread Steve Crawford
>
>> In case it is useful for reference, I beat my head on it a bit more and
>> replaced "main" with "9.6":
>> deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg 9.6
>>
>
> Yes, that is an FAQ item:
>
> https://wiki.postgresql.org/wiki/Apt/FAQ
>
>

Yes, but to quote Douglas Adams, "It was on display in the bottom of a
locked filing cabinet stuck in a disused lavatory with a sign on the door
saying ‘Beware of the Leopard'."

If I might be so bold, one or more of the following would make life easier
on users and perhaps get more people to test:

1. Change the wording in the main postgresql.org announcement to be
"Download RC1" and link directly to
https://www.postgresql.org/download/snapshots/

2. Bring non Red Hat releases up to parity.

By way of example, from the snapshots page clicking on the link under Red
Hat (and variants) takes you to the http://yum.postgresql.org/ page where 9.6
beta is right at the top. From there all you have to do is select your
release of choice.

By comparison, follow the https://apt.postgresql.org/ link under Debian,
Ubuntu Linux and you are redirected to the wiki where, if you are fortunate,
you will find "Have a look at the FAQ." nestled about half-way down and must
then dig through that page. I never got that far, but figured it out by
clicking through the file browser after wasting time seeing lots of 9.6
utilities/addons already listed but broken due to the lack of availability
of the actual server and client packages.

I know this is open source. I know that people work on their "itch" or what
their employer sponsors. I'm just sharing the user experience should it
provide value and increase the number of testers.

Cheers,
Steve


Re: [GENERAL] Installing 9.6 RC on Ubuntu [Solved]

2016-09-13 Thread Steve Crawford
On Tue, Sep 13, 2016 at 11:03 AM, Steve Crawford <
scrawf...@pinpointresearch.com> wrote:

> I'm trying to install 9.6 RC1 on Ubuntu 16.04 Xenial on my laptop and it
> seems broken.
>
> Installation of 9.6 RC1 on Centos was straightforward by comparison - just
> add the 9.6 yum package and install. Unfortunately Ubuntu seems
> second-class by comparison.
>
> I already have /etc/apt/sources.list.d/pgdg.list with:
> deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main
>
> This works and I previously installed the PGDG 9.5 package without issue.
> Apt lists numerous 9.6 support packages (unit, repmgr, repack, plv8, ...)
> but with unsatisfied dependencies due to the lack of postgresql-9.6.
>
> I tried adding -testing to the repo but no joy:
> deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg-testing main
>
> Is packaging just not complete or am I missing something? (I had hoped
> that getting configured for testing would be more friction-free.)
>
> Cheers,
> Steve
>


In case it is useful for reference, I beat my head on it a bit more and
replaced "main" with "9.6":
deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg 9.6
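
followed by the usual (assuming the standard PGDG package name):

sudo apt-get update
sudo apt-get install postgresql-9.6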

Life is good, again.

Cheers,
Steve


[GENERAL] Installing 9.6 RC on Ubuntu

2016-09-13 Thread Steve Crawford
I'm trying to install 9.6 RC1 on Ubuntu 16.04 Xenial on my laptop and it
seems broken.

Installation of 9.6 RC1 on Centos was straightforward by comparison - just
add the 9.6 yum package and install. Unfortunately Ubuntu seems
second-class by comparison.

I already have /etc/apt/sources.list.d/pgdg.list with:
deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg main

This works and I previously installed the PGDG 9.5 package without issue.
Apt lists numerous 9.6 support packages (unit, repmgr, repack, plv8, ...)
but with unsatisfied dependencies due to the lack of postgresql-9.6.

I tried adding -testing to the repo but no joy:
deb http://apt.postgresql.org/pub/repos/apt/ xenial-pgdg-testing main

Is packaging just not complete or am I missing something? (I had hoped that
getting configured for testing would be more friction-free.)

Cheers,
Steve


Re: [GENERAL] 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

2016-09-02 Thread Steve Crawford
>
> ...
>
> You can actually reduce the time more by pre-syncing to the new location.
> something like:
>
> rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
> service postgres stop
> rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
>
> The second rsync will only copy the deltas from the first, it still has to
> go in and determine what needs to be copied/what changed but the bulk of it
> can be prepared/migrated before the actual downtime window.
>
>
>
The benefit of an initial and final rsync will depend on how many files
change. Rsync's default when copying between local paths is to use the
--whole-file option, so at least it won't busy itself with delta comparisons
of the source and destination files (which, for local copies, is slower than
simply copying the whole file), but you will only save the time associated
with files whose modification time and size are unchanged between the first
and second rsync. If the initial rsync takes, say, a half hour it is
potentially beneficial to run a second or even several preliminary rsync
passes, as each additional run should be faster due to less time for files
to change during the rsync. You will have to test for your specific case.

If you *really* want to do a fast switch and your disk configuration
supports it, you could possibly play games with single-machine DRBD or LVM
RAID to live-sync the old and new directories. Of course it's equally
possible that the setup involved to do this will involve more initial
downtime than just copying the files.

Another possibility is to set up an additional slave instance of PostgreSQL
on your master machine, then cut over to that instance. I haven't thought
through the issue of bringing up your actual slave servers after the
cutover. I suspect in the worst case you would keep your current master
instance, run the replica instance on the master server using the new SSD,
then migrate the slave servers to cascade off the master server's replica
instance. When all is synced up, promote the master server's replica
instance to a master and kill off the original master instance.

As always in these instances, testing and practice are mandatory.

Cheers,
Steve


Re: [GENERAL] Permissions pg_dump / import

2016-08-17 Thread Steve Crawford
Check out the --no-owner and/or --no-acl flags when performing the dump.
These eliminate the statements that set and/or alter ownership of database
objects.
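
For example, reusing the command from the original post (a sketch - adjust
connection options to taste):

pg_dump --dbname=prod1 --host=servername --port=5432 --username=user1 \
  --no-owner --no-acl -v -f test1_NEW.sql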

For use in a test server where the username of the test-server database is
different than the username on the production server *and* where you don't
have lots of roles with different ownership and permissions across your
database you should be fine.

Or create role(s) on your test database that match those on the production
database. This may require updating pg_hba.conf on the test database.

Cheers,
Steve


On Wed, Aug 17, 2016 at 3:16 PM, Patrick B  wrote:

> Hi guys,
>
> I'm running a pg_dump and then importing the dump into a test server. I'm
> using PostgreSQL 9.5.
>
> *pg_dump:*
>
>> pg_dump --dbname=prod1 --host= servername --port=5432 --username=user1 -v
>> -f test1_NEW.sql
>
> *Steps into the new database (test1):*
>
>> CREATE SCHEMA public;
>> GRANT ALL ON SCHEMA public TO user1;
>> psql -h servername -U master --port=5432 --dbname=test1 -f test1_NEW.sql
>
>
> I get lots of errors like:
>
> psql:test1_NEW.sql:212: ERROR:  must be owner of relation accounts_id_seq
>
>
> prod1=> \d+ accounts_id_seq
>> Sequence "public.accounts_id_seq"
>>     Column     |  Type   |        Value        | Storage
>> ---------------+---------+---------------------+---------
>>  sequence_name | name    | accounts_id_seq     | plain
>>  last_value    | bigint  | 33                  | plain
>>  start_value   | bigint  | 1                   | plain
>>  increment_by  | bigint  | 1                   | plain
>>  max_value     | bigint  | 9223372036854775807 | plain
>>  min_value     | bigint  | 1                   | plain
>>  cache_value   | bigint  | 1                   | plain
>>  log_cnt       | bigint  | 32                  | plain
>>  is_cycled     | boolean | f                   | plain
>>  is_called     | boolean | t                   | plain
>> Owned by: public.accounts.id
>
>
> What do I have to do? Should I revoke the permissions on the prod1
> database before performing the dump?
>
> Cheers;
> Patrick
>


Re: [GENERAL] dumb question

2016-06-02 Thread Steve Crawford
On Thu, Jun 2, 2016 at 10:40 AM, Felipe Santos <felipe...@gmail.com> wrote:

>
>
> 2016-06-02 14:23 GMT-03:00 Steve Crawford <scrawf...@pinpointresearch.com>
> :
>
>> Something like:
>>
>> select max(id) from yourtable where sts=0 and ref_id is null;
>>
>> That assumes that ref_id is null. It would help to see your table
>> structure and the query you tried that doesn't work. If ref_id is actually
>> a character string then you might need ref_id='' or coalesce(ref_id,'')=''
>> if it can be null or empty string.
>>
>> Cheers,
>> Steve
>>
>>
>> On Thu, Jun 2, 2016 at 10:16 AM, Steve Clark <steve.cl...@netwolves.com>
>> wrote:
>>
>>> Hi List,
>>>
>>> I am a noob trying to do something that seems like it should be easy but
>>> I can't figure it out.
>>>
>>> I have a table like so:
>>>
>>> id | ref_id | sts
>>> ---+--------+-----
>>>  1 |        |   0
>>>  2 |      1 |   1
>>>  3 |        |   0
>>>  4 |        |   0
>>>  5 |      4 |   1
>>>  6 |        |   0
>>>  7 |      6 |   1
>>>
>>> I want to find the max(id) whose sts is 0 but whose id is not referenced
>>> by ref_id.
>>>
>>> so the answer would be id=3.
>>>
>>> Thanks for any pointers,
>>> Steve
>>>
>>> --
>>>
>>>
>>>
>>> --
>>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-general
>>>
>>
>>
>
> I think sts=0 means ref_id is null
>
> So, what I think he wants to achieve is:
>
> select max(id) from yourtable where sts=0 and id not in (select ref_id
> from yourtable);
>
> Isn't it?
>

The OP will need to explain further as we are all guessing. As I mentioned
in my earlier (accidental top - curses GMail) post, table structures and
the query or queries that don't work would be useful. So would a
description of the problem that is being solved since there could be better
approaches.

Cheers,
Steve


Re: [GENERAL] dumb question

2016-06-02 Thread Steve Crawford
Something like:

select max(id) from yourtable where sts=0 and ref_id is null;

That assumes that ref_id is null. It would help to see your table structure
and the query you tried that doesn't work. If ref_id is actually a
character string then you might need ref_id='' or coalesce(ref_id,'')='' if
it can be null or empty string.

Cheers,
Steve


On Thu, Jun 2, 2016 at 10:16 AM, Steve Clark 
wrote:

> Hi List,
>
> I am a noob trying to do something that seems like it should be easy but I
> can't figure it out.
>
> I have a table like so:
>
> id | ref_id | sts
> ---+--------+-----
>  1 |        |   0
>  2 |      1 |   1
>  3 |        |   0
>  4 |        |   0
>  5 |      4 |   1
>  6 |        |   0
>  7 |      6 |   1
>
> I want to find the max(id) whose sts is 0 but whose id is not referenced
> by ref_id.
>
> so the answer would be id=3.
>
> Thanks for any pointers,
> Steve
>
> --
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] psql color hostname prompt

2016-05-05 Thread Steve Crawford
BTW, I just noticed that as of 9.5 there is an optional GUC called
cluster_name. Unfortunately I don't see a way to reference it in the prompt
string. I'll suggest that as a feature. My earlier hack will work but in
9.5 use cluster_name instead of making up a fake extension variable.
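
In 9.5 that would look something like this (a sketch, untested):

In postgresql.conf:
cluster_name = 'prod-main'

In .psqlrc:
SELECT current_setting('cluster_name') AS cluster \gset
\set PROMPT1 '%:cluster: %n@%/%R%# '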

Cheers,
Steve

On Thu, Apr 28, 2016 at 2:41 AM, Francisco Olarte <fola...@peoplecall.com>
wrote:

> Hi Steve:
>
> On Wed, Apr 27, 2016 at 7:09 PM, Steve Crawford
> <scrawf...@pinpointresearch.com> wrote:
> > The various hacks appear to not deal with the fact that there may be
> > multiple instances of postgresql running on different TCP ports or Unix
> > connections nor with the fact that the local connection may, in fact, be
> a
> > pooler and not a direct connection to the database.
>
> Because the problems is with the host, the port is solved trivially
> with %> and the local socket name is dependent on the listening port.
> And, regarding pgbouncer, psql just knows it's talking with someone
> who speaks the postgres protocol, it has no way to know what is being
> done with the socket.
>
> > As long as we're into hack-land, I'll offer one.
> > First, you abuse the custom variables feature in postgresql.conf and add
> > something like:
>
> That's a nice trick ( once you peel of the ; before \gset IIRC ) for
> the problem of knowing which of your server databases you are
> connected to. Low impact and if the database does not have the guc you
> can easily know it ( I think knowing the host is not the problem, it
> does not matter how many socket redirections, bouncers or other things
> you go through your solucion solves the problem ).
>
>
> > Next you update .psqlrc with something along the lines of:
>
> Just a problem, you need it somewhere where it can be re-executed on
> reconnects ( am I the only one who routinely uses \c ? ).
>
> > On the plus side, the custom GUC setting is available to any client, not
> > just psql. It also handles multiple PostgreSQL instances and connections
> > that are actually routed through a pooler.
>
> Yes, you do not know who you are connected to, but you know which
> server istance you end up in, which is nice. I think knowing the
> host/path+port is useful for some things, but for the originally
> stated problem this seems better.
>
> > On the down side, it is a hack. The method is not in any way guaranteed
> to
> > be future-proof. It leaves an ugly bit of server output at psql startup.
> It
> > requires ensuring that all servers have the variable set appropriately.
> You
> > need to avoid colliding with a custom GUC used by an extension. But
> perhaps
> > it is useful.
>
> Not this hacky, I'll use it in preference to changing the prompt with
> scripts ( I'll continue using %M and changing terminal titles, but I'm
> too used to it ).
>
> Well seen.
>
>
> Francisco Olarte.
>


Re: [GENERAL] Thoughts on "Love Your Database"

2016-05-04 Thread Steve Crawford
On Wed, May 4, 2016 at 8:04 AM, Steve Crawford <
scrawf...@pinpointresearch.com> wrote:

> First, you hit them over the head with a copy of "SQL Antipatterns:
> Avoiding the Pitfalls of Database Programming". It is a tad out of date and
> tends to use PHP and MySQL for the main examples but does also address
> different solutions available in PostgreSQL, Oracle. MS SQL server, etc.
> while pointing out the risks of various common foot-guns and providing
> alternatives.
>
> Or point them to this recent Linux Journal article by Reuven Lerner (who
> is occasionally seen on these lists):
> http://www.linuxjournal.com/content/use-your-database
>
> Developers often have a pre-Galileo world view that they and whatever app
> they are coding is the center of the universe and databases, networks,
> storage and the rest all revolve around them existing only to support their
> app.
>
> But ultimately the church of the developer gets forced into the modern era
> and finds that the data is at the center and the apps that allow input,
> maintenance, extraction and analysis all revolve around those core crown
> jewels. Then, *gasp*, there are other people and apps touching "your" data.
> Are they all validating the data the way you do? Protecting it? Retrieving
> it efficiently? Only then does the real value of the database come into
> focus.
>
> Cheers,
> Steve
>
>
>
>
>
> On Tue, May 3, 2016 at 9:11 PM, Guyren Howe <guy...@gmail.com> wrote:
>
>> I've long been frustrated with how most web developers I meet have no
>> idea how to use an SQL database properly. I think I'm going to write a book
>> called Love Your Database, aimed at web developers, that explains how to
>> make their apps better by leveraging the power of SQL in general, and
>> Postgres in particular.
>>
>> I'm thinking of a section on features of SQL most folks don't know about
>> (CTEs are *way* to hell at the top of that list, but also EXCEPT/INTERSECT
>> and window functions), but much of the book would be about how to do things
>> server side. Benchmarks showing how much faster this can be, but mostly
>> techniques — stored procedures/triggers/rules, views.
>>
>> I asked a colleague about the advice I often hear stated but seldom
>> justified, that one shouldn't put business rules in the database. He
>> offered that server-side code can be hard to debug.
>>
>> I'm sure many here would love to see such a book published, maybe some
>> talks on the topic given.
>>
>>
>> What might I cover that I haven't mentioned? What are the usual
>> objections to server-side code and how can they be met? When *are* they
>> justified and what should the criteria be to put code in Postgres? Any
>> other thoughts? Any other websites or books on the topic I might consult?
>>
>> TIA
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
>
(Apologies for the top-posts - forgot to override the GMail defaults...)

-Steve


Re: [GENERAL] Thoughts on "Love Your Database"

2016-05-04 Thread Steve Crawford
Perhaps a bit US centric but PHB = Pointy Haired Boss. See the boss in
"Dilbert".

Cheers,
Steve

On Wed, May 4, 2016 at 7:55 AM, Pierre Chevalier Géologue <
pierrechevalierg...@free.fr> wrote:

> Le 04/05/2016 15:25, John McKown a écrit :
>
>> On Wed, May 4, 2016 at 8:13 AM, Geoff Winkless > >wrote:
>>
>> ​
>>
>> The sensible way is to do it as John wrote - to restrict access rights
>> to everyone except admin to calling functions only. That way the
>> functions are written by the people who are paid to understand the
>> business rules and the data behind it, and the application developers
>> can ask those experts to do the heavy lifting for them. Having to
>> persuade management that they should no longer be able to connect the
>> database to MS Access and make changes that way will usually put an
>> end to that pure model, though. :)
>>
>>
>> ​Allowing PHBs direct access t​o company data is a nasty thing.
>>
>
> Sorry, what is a PHB?  Our friend google didn't help me much on this
> matter.
>
>
> They become like some users who "know Excel". They are now just as
>> knowledgeable as someone who's been doing this for years. I've actually
>> heard one say something akin to: "Damn it, I can write Excel formulas. I
>> know very well that an new function on the web site could be written in
>> less than a day, if you'd just get off you a$$ and do it."
>>
>
> Hm.  Sounds familiar...
> I usually call "excelitis" a sort of mental disease related to a use and
> abuse of Excel, up to the point where one cannot imagine data which is
> *not* in a table-like array.  And they think that they do Relational
> Database Management...  In the 1990's, I met many-many deeply sick
> persons.  I had been infected for a while, I must confess.
>
> À+
> Pierre
> --
>
> 
> Pierre Chevalier
> PChGEI: Pierre Chevalier Géologue Et Informaticien
> Partenaire DALIBO
> Mesté Duran
> 32100 Condom
>   Tél+fax  :09 75 27 45 62
> 06 37 80 33 64
>   Émail  :   pierrechevaliergeolCHEZfree.fr
>   icq#   :   10432285
>   jabber: pierre.chevalier1...@jabber.fr
>   http://pierremariechevalier.free.fr/pierre_chevalier_geologue
>
> 
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Thoughts on "Love Your Database"

2016-05-04 Thread Steve Crawford
First, you hit them over the head with a copy of "SQL Antipatterns:
Avoiding the Pitfalls of Database Programming". It is a tad out of date and
tends to use PHP and MySQL for the main examples but does also address
different solutions available in PostgreSQL, Oracle, MS SQL Server, etc.
while pointing out the risks of various common foot-guns and providing
alternatives.

Or point them to this recent Linux Journal article by Reuven Lerner (who is
occasionally seen on these lists):
http://www.linuxjournal.com/content/use-your-database

Developers often have a pre-Galileo world view that they and whatever app
they are coding is the center of the universe and databases, networks,
storage and the rest all revolve around them existing only to support their
app.

But ultimately the church of the developer gets forced into the modern era
and finds that the data is at the center and the apps that allow input,
maintenance, extraction and analysis all revolve around those core crown
jewels. Then, *gasp*, there are other people and apps touching "your" data.
Are they all validating the data the way you do? Protecting it? Retrieving
it efficiently? Only then does the real value of the database come into
focus.

Cheers,
Steve





On Tue, May 3, 2016 at 9:11 PM, Guyren Howe  wrote:

> I've long been frustrated with how most web developers I meet have no idea
> how to use an SQL database properly. I think I'm going to write a book
> called Love Your Database, aimed at web developers, that explains how to
> make their apps better by leveraging the power of SQL in general, and
> Postgres in particular.
>
> I'm thinking of a section on features of SQL most folks don't know about
> (CTEs are *way* to hell at the top of that list, but also EXCEPT/INTERSECT
> and window functions), but much of the book would be about how to do things
> server side. Benchmarks showing how much faster this can be, but mostly
> techniques — stored procedures/triggers/rules, views.
>
> I asked a colleague about the advice I often hear stated but seldom
> justified, that one shouldn't put business rules in the database. He
> offered that server-side code can be hard to debug.
>
> I'm sure many here would love to see such a book published, maybe some
> talks on the topic given.
>
>
> What might I cover that I haven't mentioned? What are the usual objections
> to server-side code and how can they be met? When *are* they justified and
> what should the criteria be to put code in Postgres? Any other thoughts?
> Any other websites or books on the topic I might consult?
>
> TIA
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] psql color hostname prompt

2016-04-26 Thread Steve Crawford
>
> 2)  %M vs shell call
>
> %M on when connected to the local machine displays the string "[local]"
> which I didn't like.  I wanted a real hostname to show no matter which
> client/server pair I was using.  Zero chance for mistaken commands on the
> wrong host.  Many times we ssh to a remote server, then run psql locally.
>
> Perhaps the more elegant route here, is to change psql's behavior with %M
> when connected to the local machine?  (This would also solve point #3)
>
>
There is a basic problem - what is the hostname?

1.2.3.4? db23.example.com? server42? 127.0.0.1? 192.168.54.23? Those could
all be the same PostgreSQL instance, all different or some mix. And we
haven't even considered local Unix connections, servers running on
different ports and the fact that localhost is machine specific and
non-fully-qualified names that depend on resolver search orders.

One possible solution would be to essentially do the reverse of
application_name. I.e. do something along the lines of adding a server-side
parameter to specify the name of the server instance that would be
exchanged in the startup handshake and made available to client processes.

I could see some value in that but can't speak to the work and possible
wire-protocol breakage that might be involved relative to the benefits.

Cheers,
Steve


Re: [GENERAL] Export binary data - PostgreSQL 9.2

2016-02-28 Thread Steve Crawford
What exactly are you trying to do? Dump/backup your data (e.g. pg_dump)?
Read binary data from a table? If so, what field type (bytea, blob, ...)?
Export to where?

Cheers,
Steve


On Sun, Feb 28, 2016 at 9:12 AM, drum.lu...@gmail.com 
wrote:

> Hi all,
>
>
> Which command would be to export the binary data for a table?
>
> I was unable to find it...
>
> Thanks
>


Re: [GENERAL] PostgreSQL flavors

2016-02-23 Thread Steve Crawford
Congratulations on the decision and welcome.

As an overview, there is the PostgreSQL *project* which is run by the
PostgreSQL Global Development Group (PgDG) with contributors around the
world most of whom work for a variety of companies that either use or
support PostgreSQL. PostgreSQL is BSD-licensed open-source software. PgDG
operates these mailing lists.

Within the PostgreSQL ecosystem you will find numerous commercial and
non-commercial entities that provide:

-Custom/commercial/extended products based on PostgreSQL (e.g. EnterpriseDB
and others) - some of which are designed to provide a level of Oracle
compatibility

-Commercial and open-source tools that extend, support or work with
PostgreSQL (you may want to look at Ora2Pg - a tool to migrate Oracle to
PostgreSQL)

-Development services

-Training

-DBA and support services (e.g. PGExperts)

-User-groups (very helpful but I don't see one in Florida:
http://www.postgresql.org/community/user-groups/)

-Conferences (http://www.postgresql.org/about/events/)

-Books and magazines

I'd agree with your plan to start with the core open-source PostgreSQL
software to learn then progress to other products/projects as needs dictate.

Cheers,
Steve


On Tue, Feb 23, 2016 at 7:51 AM, Sherrie Kubis <
sherrie.ku...@swfwmd.state.fl.us> wrote:

> Hello, my first post to the list, thank you for this place to ask
> questions and get help.
>
>
>
> Our management has tasked me with devising a plan to migrate our existing
> databases from Oracle to PostgreSQL.  I’m researching and getting familiar
> with PostgreSQL before getting a Linux box to start learning and staging.
> I have a long way to go, but it will be fun.
>
>
>
> Out of the gate, I can see different PostgreSQL products – PostgreSQL,
> PostgreSQLPlus, EnterpriseDB Advanced Server.
>
> For staging I’ll likely start with the vanilla version.  I’ve been
> searching for a comparison of different product sets; I’ve found some
> things, but in other places different information.  There is a lot to sift
> through.  For example, we are now using non-active Data Guard to a standby
> for disaster recovery.  What is the different on this functionality in the
> 3 products? Or perhaps there are more products than I’ve found.  Cost will
> be a factor for us, but I also consider that costs are not only product and
> maintenance costs, there are DBA and developer migration costs as well.
>
>
>
> Any insights or information is appreciated.
>
>
>
>
>
> *
>
> Sherrie Kubis
>
> Sr. Oracle DBA
>
> Information Technology Bureau
>
> Southwest Florida Water Management District
>
> 2379 Broad Street
>
> Brooksville, FL 34604-6899
>
> 352.796.7211 x4033
>
> sherrie.ku...@swfwmd.state.fl.us 
>
>
>
> Please take a moment to answer a few questions
>  and let us how we’re doing.
>
>
>
> *IMPORTANT NOTICE*
>
> *E-mails made or received in conjunction with the official business of the
> District are public records.  All e-mails sent to and from this address are
> automatically archived.  For more information regarding the State of
> Florida public records laws, please visit www.myflorida.com
> .*
>
>
>


[GENERAL] Transactions, stats and analyze (oh-my)

2016-02-16 Thread Steve Crawford
We have certain processes that import data then process and distribute the
data. Since the processing looks primarily, but not exclusively, at the new
records an ANALYZE prior to processing yields better plans. Although the
table changes will trigger autovacuum to analyze the table this happens too
late to be of use for the subsequent processing so we run an explicit
analyze (which, when it collides with autovacuum, triggers a warning in the
logs).

I would like to roll a number of steps into a transaction. What happens to
the stats data generated by ANALYZE if the transaction is rolled back?

This message says the stats are reverted:
http://postgresql.nabble.com/Analyze-during-a-transaction-td5775069.html

This implies they are not:
http://www.postgresql.org/message-id/e1xjbtw-0002f9...@gemulon.postgresql.org

Cheers,
Steve


Re: [GENERAL] Postgres and timezones

2016-01-20 Thread Steve Crawford
Right, I was just mentioning the existence of that built-in data in case it
was of use to the OP. But I should have also mentioned some caveats in case
it is important to his use-case:

1. Full names are not as standardized as one might like so there are
multiple possible full names for a time zone, i.e.
"America/Los_Angeles", "posix/America/Los_Angeles",
"posix/US/Pacific", "PST8PDT", ...

2. Unlike full names, abbreviations do *not* distinctly identify a single
time zone. CST, for example, is the abbreviation for US Central Standard Time
but is also used by the Cuba, ROC and PRC zones, among others.

3. pg_timezone_names is a *view* and the results for abbreviation and
offset change depending on time of year. Right now it's winter on the US
West Coast so the abbreviation for "posix/US/Pacific" is PST and I will get
results searching for abbreviations matching "PST" but none for "PDT". Come
spring, that will change.
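
For example (untested; what you get back depends on the date you run it and
the installed tzdata):

select name, abbrev, utc_offset, is_dst
from pg_timezone_names
where abbrev = 'PST';
-- matches the US Pacific zones when run in winter; run it in July and
-- those same zones report PDT instead, so this returns nothing for them.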

Cheers,
Steve


On Wed, Jan 20, 2016 at 7:46 AM, Pavel Stehule <pavel.steh...@gmail.com>
wrote:

>
>
> 2016-01-20 16:38 GMT+01:00 Steve Crawford <scrawf...@pinpointresearch.com>
> :
>
>> Is this of any use?
>>
>> select * from pg_timezone_names where name = 'Europe/Lisbon';
>> name  | abbrev | utc_offset | is_dst
>> ---+++
>> Europe/Lisbon | WET| 00:00:00   | f
>>
>>
> This is list of know timezones. So if you are searching "abbrev" then you
> can find it there.
>
>
> Pavel
>
>
>
>> -Steve
>>
>> On Wed, Jan 20, 2016 at 7:24 AM, Steve Rogerson <
>> steve...@yewtc.demon.co.uk> wrote:
>>
>>> On 20/01/16 13:27, Pavel Stehule wrote:
>>> >
>>> >
>>> >
>>> > Postgres doesn't store original TZ. It does recalculation to local TZ.
>>> If you
>>> > need original TZ, you have to store it separetely.
>>> >
>>>
>>> I know and that's what I'm trying to deal with. Given I know the origin
>>> TZ  -
>>> as in Europe/Lisbon I'm trying to determine the short name so I can
>>> store it.
>>>
>>> I guess I'll have to use something other than pg to do it.
>>>
>>> Steve
>>>
>>>
>>>
>>>
>>> --
>>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-general
>>>
>>
>>
>


Re: [GENERAL] Postgres and timezones

2016-01-20 Thread Steve Crawford
Is this of any use?

select * from pg_timezone_names where name = 'Europe/Lisbon';
name  | abbrev | utc_offset | is_dst
---+++
Europe/Lisbon | WET| 00:00:00   | f

-Steve

On Wed, Jan 20, 2016 at 7:24 AM, Steve Rogerson 
wrote:

> On 20/01/16 13:27, Pavel Stehule wrote:
> >
> >
> >
> > Postgres doesn't store original TZ. It does recalculation to local TZ.
> If you
> > need original TZ, you have to store it separetely.
> >
>
> I know and that's what I'm trying to deal with. Given I know the origin
> TZ  -
> as in Europe/Lisbon I'm trying to determine the short name so I can store
> it.
>
> I guess I'll have to use something other than pg to do it.
>
> Steve
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Deletion Challenge

2015-12-09 Thread Steve Crawford
The two general solutions are the "keep the last one" proposed by Adrian and
the "keep the last N" that I sent.

But it might be worth stepping back a bit. You said you are having
performance problems that you feel would be improved by removing only a
million rows which doesn't sound like that much to me. It's less than half
of what I *add* to just one of my tables every week and my database is
dwarfed by those of many of the participants on this list.

This suggests that there may be other issues such as tuning, indexing or
query optimization at play. Depending on your requirements, partitioning
might be useful. It wouldn't be last N but could easily be done to
partition by date-ranges which makes archiving and purging a low-cost
operation.
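
For illustration only - untested, and the date column here is made up since I
don't know your full schema - the inheritance style of partitioning looks
roughly like:

create table cash_journal_2015_12 (
    check (entry_date >= date '2015-12-01' and entry_date < date '2016-01-01')
) inherits (cash_journal);

-- purging or archiving a month later becomes a cheap metadata operation:
drop table cash_journal_2015_11;
-- or: alter table cash_journal_2015_11 no inherit cash_journal;

plus a trigger (or insert logic in the application) to route new rows to the
right child table.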

You might want to expand a bit on the core issue you are trying to solve.

Cheers,
Steve


On Wed, Dec 9, 2015 at 12:43 PM, David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Wed, Dec 9, 2015 at 1:31 PM, Adrian Klaver 
> wrote:
>
>> On 12/09/2015 12:24 AM, Berend Tober wrote:
>>
>>> Adrian Klaver wrote:
>>>
 On 12/05/2015 08:08 AM, Berend Tober wrote:

> /*
>
> Deletion Challenge
>
> I want to delete all but the most recent transaction, per person, from
> a
> table that records a transaction history because at some point the
> transaction history grows large enough to adversely effect performance,
> and also becomes less relevant for retention.
>
> ...
>
>
 test=> delete from cash_journal where ARRAY[click, cash_journal_id]
 NOT in (select max(ARRAY[click,cash_journal_id]) from cash_journal
 group by fairian_id);
 DELETE 7

 test=> SELECT * FROM cash_journal order by fairian_id, click,
 cash_journal_id;
   click | cash_journal_id | fairian_id | debit | credit | balance
 |   description

 ---+-++---++-+--

 412 |   1 |  7 | 5 ||  14 |
 Sold food quantity 7 units.
  37 |   7 |  8 | 8 ||   8 |
 Ratified contract f1abd670358e03
  37 |   9 |  9 | 7 ||   7 |
 Ratified contract 1574bddb75c78a
  36 |  14 | 18 | 0 |  0 |   0 |
 initial cash balance
 413 |   1 | 25 |   |995 |   0 |
 Redeemed bond 7719a1c782a1ba
 (5 rows)


>>> Nice.
>>>
>>> The idea of a NOT IN query had occurred to me briefly, but I failed to
>>> pursue it because at some point in the distant past I had gained the
>>> impression that NOT IN queries were not computationally efficient.
>>> During one round of testing I had like a million rows. I'll have to run
>>> some EXPLAIN query testing with a larger data sample for comparison.
>>> Thanks!
>>>
>>
>> Plan B:
>>
>> WITH d AS
>> (SELECT * FROM
>> cash_journal
>> LEFT JOIN
>> (SELECT
>> MAX(ARRAY[click,cash_journal_id]) AS mx
>> FROM
>> cash_journal
>> GROUP BY
>> fairian_id)
>> AS
>> mxa
>> ON
>> mxa.mx=ARRAY[click, cash_journal_id]
>> WHERE
>> mx IS NULL)
>> DELETE FROM
>> cash_journal
>> USING
>> d
>> WHERE
>> d.click = cash_journal.click
>> AND
>> d.cash_journal_id = cash_journal.cash_journal_id;
>>
>>
> ​Couldn't the LEFT JOIN relation in the CTE be better written using
> "SELECT DISTINCT ON (click, cash_journal_id) click, cash_journal_id [...]
> ORDER BY click DESC, cash_journal_id" or something similar?  It doesn't
> seem like you should need to introduce an array and an aggregate here.
>
> ​It does have the negative property of only providing a single row; which
> excludes using it for the "last 5" part but I suspect it will be
> considerably faster for the single version.
>
> David J.
>


Re: [GENERAL] Deletion Challenge

2015-12-08 Thread Steve Crawford
If I understand correctly, the value of "click" always advances, and within a
"click" the "cash_journal_id" always advances - not necessarily by single
steps. So within a fairian_id, ordering by "click" plus "cash_journal_id"
returns the records in order, from which you want the most recent 5 for
each fairian_id.

Typing without testing and ignoring performance optimizations, something
along the lines of the following should work and covers the "last 5" issue
as well.

with stuff_to_delete as (
    select fairian_id, click, cash_journal_id,
        rank() over (partition by fairian_id
                     order by click desc, cash_journal_id desc) as howold
    from cash_journal)
delete from cash_journal
using stuff_to_delete
where
    cash_journal.fairian_id = stuff_to_delete.fairian_id
    and cash_journal.click = stuff_to_delete.click
    and cash_journal.cash_journal_id = stuff_to_delete.cash_journal_id
    and stuff_to_delete.howold > 5;

Cheers,
Steve


On Sat, Dec 5, 2015 at 8:08 AM, Berend Tober  wrote:

> /*
>
> Deletion Challenge
>
> I want to delete all but the most recent transaction, per person, from a
> table that records a transaction history because at some point the
> transaction history grows large enough to adversely effect performance,
> and also becomes less relevant for retention.
>
> I have devised a way to accomplish this, but it is a 'two-stage'
> approach: that is, it requires two delete statements. I would like to
> know if there is a way to do it in a single statement.
>
> Bonus challenge: Same question, except preserving the most recent N, for
> N > 1, rows for each person so that a short history is retained after
> the deletion.
>
> I have included below an annotated test case and my current solution for
> the N = 1 case.
>
> */
>
> DROP TABLE IF EXISTS cash_journal;
>
>
> CREATE TABLE cash_journal (
> click bigint NOT NULL,
> cash_journal_id bigint NOT NULL,
> fairian_id bigint NOT NULL,
> debit double precision,
> credit double precision,
> balance real DEFAULT 0,
> description text
> );
>
> COMMENT ON COLUMN cash_journal.clickIS 'Time of transaction.';
> COMMENT ON COLUMN cash_journal.cash_journal_id  IS 'Sequence of
> transaction within current click.';
> COMMENT ON COLUMN cash_journal.fairian_id   IS 'Fairian account
> effected.';
> COMMENT ON COLUMN cash_journal.debitIS 'Account balance
> increase amount.';
> COMMENT ON COLUMN cash_journal.credit   IS 'Account balance
> decrease amount.';
> COMMENT ON COLUMN cash_journal.balance  IS 'Account balance, per
> Fairian running total.';
> COMMENT ON COLUMN cash_journal.description  IS 'Transaction
> description.';
>
> /*
>
> Below is some sample data, listed in the click/sequence order that the
> data would actually be entered. That is, the 'click' column represents
> advancing time, and within each click, transactions are sequenced by the
> 'cash_journal_id' column. Note there are some missing cash_journal_id
> sequence numbers. This is an artifact of having presented here only
> an illustrative sample. Generally, within each click, the sequence
> would start at one and increment uniformly by one for each new row
> in the same click, and then reset to one for the next click. The
> missing increments in the sample data should not make any difference
> in the solution.
>
> The 'balance' column is a per-player running total, which is a
> deliberate denormalization. It is calculated in a before insert trigger
> by starting with the per-player previous balance, and then adding
> the new row debit, if any, and subtracting the new row credit, if any.
>
> Note, not all Fairians will have a transaction in every click, but any
> number of Fairians may have multiple transactions in any click.
>
> */
>
> copy cash_journal
> (click,cash_journal_id,fairian_id,debit,credit,balance,description) from
> stdin;
> 36  3   7   0   0   0   Initial cash balance
> 36  4   8   0   0   0   Initial cash balance
> 36  5   9   0   0   0   Initial cash balance
> 36  14  18  0   0   0   initial cash balance
> 37  5   7   9   \N  9   Ratified contract
> fa35e192121eab
> 37  7   8   8   \N  8   Ratified contract
> f1abd670358e03
> 37  9   9   7   \N  7   Ratified contract
> 1574bddb75c78a
> 411 1   25  0   0   0   Initial cash balance
> 411 2   25  1000\N  1000Issued bond 7719a1c782a1ba
> 412 1   7   5   \N  14  Sold food quantity 7 units.
> 412 2   25  \N  5   995 Bought food quantity 7
> units.
> 413 1   25  \N  995 0   Redeemed bond
> 7719a1c782a1ba
> \.
>
>
> SELECT * FROM cash_journal order by fairian_id, click, cash_journal_id;
>
> /*
>
> The sample starting data is shown here in order by Fairian so that it is
> perhaps easier to 

Re: [GENERAL] Packages for Ubuntu Wily (15.10)

2015-12-07 Thread Steve Crawford
You should be able to add the pgdg repository to your system and then
install through apt as normal. Scroll down to the "PostgreSQL APT
repository" section on this page:
http://www.postgresql.org/download/linux/ubuntu/

Cheers,
Steve

On Mon, Dec 7, 2015 at 9:27 AM, Antony Gelberg 
wrote:

> Hi all,
>
> We want to run 9.3 on the above distro, which comes with 9.4 as standard
> (in the distribution). However, we note that there are only 9.5 packages in
> the postgresql 15.10 repository. Can somebody flip the switch to build
> these? We really aren't ready to upgrade to 9.4 at the present time.
>
> Hope somebody can help. :)
>
> Antony
>
> --
> http://www.linkedin.com/in/antgel
> http://about.me/antonygelberg
>


Re: [GENERAL] Pgbouncer

2015-11-30 Thread Steve Crawford
Do you have any clients connected that are idle in transaction?

Cheers,
Steve

On Mon, Nov 30, 2015 at 1:46 PM, Torsten Förtsch 
wrote:

> Hi,
>
> I am not sure if this is the right place to ask this question. If not,
> please point me to it.
>
> I am trying out the new pgbouncer (latest git). "SHOW SERVERS" is
> telling me 2 connections in "active" state. Both show
> connect_time=2015-11-30 18:58:30. Request_time is 19:01:35 and 20:56:36.
> Both use pool_mode=transaction.
>
> Server_lifetime=600. And now is 21:38:55.
>
> Shouldn't those connections have been closed long ago?
>
> I also checked on the backend. The backend processes are there, are idle
> and query_start and state_change times match the request_times above.
>
> Also, since pool_mode=transaction, why are these connections "active"?
>
> Thanks,
> Torsten
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Queuing query

2015-09-22 Thread Steve Crawford
Thanks, Jeff. Like I said, this is the way-stripped-down version of the
core query with things like "the_priority" and "the_work" standing for more
complicated expressions. Lots of other stuff is going on to make sure we
get a response, clear the queue, etc. and we will index appropriately.

I'm really looking for any things like planner ordering nuances that would
make the query operate in unexpected ways.

Cheers,
Steve



On Tue, Sep 22, 2015 at 9:26 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:

> On Mon, Sep 21, 2015 at 3:51 PM, Steve Crawford <
> scrawf...@pinpointresearch.com> wrote:
>
>> While awaiting the awesomeness of the upcoming "skip locked" feature in
>> 9.5 I need to handle a work queue.
>>
>> Does anyone see any glaring issues or subtle nuances with the basic
>> method below which combines CTEs with queue-handling methods posted by
>> depesz, on the PG wiki and elsewhere.
>>
>> Note that it appears that there is the slight potential for a
>> race-condition which would cause one worker to occasionally fail to get a
>> record but the application code handles that issue fine.
>>
>> The work is sent to an externally hosted API which will ultimately reply
>> to a callback API at our end so obviously there's a lot of other stuff in
>> the system to update final results, recover from lost work, add to the
>> queue, etc. I'm just asking about the sanity of the queue processing query
>> itself:
>>
>> with next_up as (
>>select
>> the_id
>> from
>> queuetest
>>where
>> not sent_for_processing
>> and pg_try_advisory_xact_lock(12345, the_id)
>>order by
>> the_priority
>> limit 1 for update)
>> update
>> queuetest
>> set
>> sent_for_processing = true
>> where
>> the_id = (select the_id from next_up)
>> returning
>> the_work_to_do;
>>
>
> This will only be sane if the inner query can use an index to do the
> "order by".  Otherwise it is going to read every row in order to sort them,
> and get the advisory lock on every row, and you will run out of shared
> memory.  Of course, if it were sorting each time it would probably be too
> slow anyway.
>
> And it has to be a partial index:
>
>  (the_priority) where not sent_for_processing
>
> Because if you just have an index on the_priority, the sub select will
> start getting inefficient once all the lowest numbered priority items are
> marked as sent.
>
> Also, you probably want to make sent_for_processing be some kind of token
> or time stamp, to make it easier to detect lost work.  In which case NULL
> would mean not yet sent, so the partial index would be "where
> sent_for_processing is null".
>
> Cheers,
>
> Jeff
>


[GENERAL] Queuing query

2015-09-21 Thread Steve Crawford
While awaiting the awesomeness of the upcoming "skip locked" feature in 9.5
I need to handle a work queue.

Does anyone see any glaring issues or subtle nuances with the basic method
below which combines CTEs with queue-handling methods posted by depesz, on
the PG wiki and elsewhere.

Note that it appears that there is the slight potential for a
race-condition which would cause one worker to occasionally fail to get a
record but the application code handles that issue fine.

The work is sent to an externally hosted API which will ultimately reply to
a callback API at our end so obviously there's a lot of other stuff in the
system to update final results, recover from lost work, add to the queue,
etc. I'm just asking about the sanity of the queue processing query itself:

with next_up as (
   select
the_id
from
queuetest
   where
not sent_for_processing
and pg_try_advisory_xact_lock(12345, the_id)
   order by
the_priority
limit 1 for update)
update
queuetest
set
sent_for_processing = true
where
the_id = (select the_id from next_up)
returning
the_work_to_do;

Cheers,
Steve


Re: [GENERAL] Very puzzling sort behavior

2015-09-10 Thread Steve Crawford
Any null values in first name??

-Steve

On Thu, Sep 10, 2015 at 12:35 PM, Ken Tanzer  wrote:

> Hi.  In a table that includes these columns:
>
> my_db=> \d tbl_client
> ...
>  name_last   | character varying(40)  | not null
>  name_first  | character varying(30)  | not null
> ...
>
> I am extremely puzzled by the sorting of the "CLARKE"s in this list:
>
> my_db=> SELECT name_last,length(name_last),name_last='CLARK' AS
> clark,name_last='CLARKE' AS clarke FROM tbl_client WHERE name_last ILIKE
> 'CLARK%' OR name_last ILIKE 'ADAMS%' ORDER BY name_last || ', ' ||
> name_first;
>  name_last | length | clark | clarke
> ---++---+
>  ADAMS |  5 | f | f
>  ADAMS |  5 | f | f
>  ADAMS |  5 | f | f
>  ADAMS |  5 | f | f
>  ADAMS |  5 | f | f
>  ADAMS |  5 | f | f
>  ADAMSON   |  7 | f | f
>  CLARK |  5 | t | f
>  CLARK |  5 | t | f
>  CLARKE|  6 | f | t
>  CLARKE|  6 | f | t
>  CLARKE|  6 | f | t
>  CLARK |  5 | t | f
>  CLARK |  5 | t | f
>  CLARK |  5 | t | f
>  CLARK |  5 | t | f
>  CLARK |  5 | t | f
> (17 rows)
>
> The ADAMS are included just to show a similar example is ordering
> correctly.  I put the length and equality test columns in to try to make
> sure there weren't some bizarre characters in the data.  This is only
> happening on one particular database.  I did a reindex on the table just
> for good measure.  If I remove the name_first piece of the ORDER BY (which
> doesn't seem like it should matter), it sorts as expected:
>
> my_db=> SELECT name_last,length(name_last),name_last='CLARK' AS
> clark,name_last='CLARKE' AS clarke FROM tbl_client WHERE name_last ILIKE
> 'CLARK%' OR name_last ILIKE 'ADAMS%' ORDER BY name_last || ', '
>
> ;
>  name_last | length | clark | clarke
> ---++---+
>  ADAMS |  5 | f | f
>  ADAMS |  5 | f | f
>  ADAMS |  5 | f | f
>  ADAMS |  5 | f | f
>  ADAMS |  5 | f | f
>  ADAMS |  5 | f | f
>  ADAMSON   |  7 | f | f
>  CLARK |  5 | t | f
>  CLARK |  5 | t | f
>  CLARK |  5 | t | f
>  CLARK |  5 | t | f
>  CLARK |  5 | t | f
>  CLARK |  5 | t | f
>  CLARK |  5 | t | f
>  CLARKE|  6 | f | t
>  CLARKE|  6 | f | t
>  CLARKE|  6 | f | t
> (17 rows)
>
> I tried selecting those 17 rows from tbl_client into a new table, and get
> the same odd behavior.  However, if I run with new data I get an expected
> order:
>
> CREATE TEMP TABLE test (
>   name_first VARCHAR(40),
>   name_last VARCHAR(30)
>
> );
>
> INSERT INTO test VALUES ('JOE','CLARKE'),('BILL','CLARK');
>
> SELECT * FROM test ORDER BY name_last;
> SELECT * FROM test ORDER BY name_last || ', ' || name_first;
>
> Any thoughts about what's going on, what to do about it, or what obvious
> point I missing?  Thanks in advance!
>
> my_db=> SELECT version();
> version
>
>
> 
>  PostgreSQL 9.2.13 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
> 4.4.7 20120313 (Red Hat 4.4.7-11), 64-bit
> (1 row)
>
> Ken
>
>
>
> --
> AGENCY Software
> A Free Software data system
> By and for non-profits
> *http://agency-software.org/ *
> *https://agency-software.org/demo/client
> *
> ken.tan...@agency-software.org
> (253) 245-3801
>
> Subscribe to the mailing list
>  to
> learn more about AGENCY or
> follow the discussion.
>


Re: [GENERAL] Public facing PostgreSQL hosting ?

2015-08-31 Thread Steve Crawford
On Mon, Aug 31, 2015 at 12:03 AM, essam Ganadily 
wrote:

> hi
> i do develop on PostgreSQL from home and from work. i need public facing
> PostgreSQL , something i can use Pgadmin from anywhere.
> performance and scalability is not important because i will be running
> like few operations per day.
>
> any idea where can i find that ?
> thankx
>

I'm unclear what you are looking for. Are you trying to access one of your
servers remotely? If so, a VPN or SSH port-forwarding are a much safer
option.

Or are you looking for a third-party service that provides PostgreSQL as a
service?

Cheers,
Steve


Re: [GENERAL] Problem with database connections timing out for long-running queries

2015-08-21 Thread Steve Crawford
You might check the stunnel settings. A quick search for "stunnel 12 hours"
indicates that this is stunnel's default timeout for idle connections.
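
If that is the culprit, the fix (untested here) is to raise stunnel's idle
timeout. I believe the relevant option is TIMEOUTidle and its default is
43200 seconds (12 hours), so something like this in your [postgres-host]
section:

[postgres-host]
...
; raise the idle timeout from the 12-hour default (seconds; pick what suits you)
TIMEOUTidle = 86400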

Cheers,
Steve


On Fri, Aug 21, 2015 at 11:16 AM, Rich Schaaf <rsch...@commoninf.com> wrote:

 I’m running into a problem where the connection between the application and
 its database is timing out after 12 hours.



 Here’s the context:

 · Two servers (one running the application and another hosting
 the database) in the Amazon AWS environment on different subnets within a
 virtual private cloud



 · The server hosting the application is running Ubuntu 14.04.2 LTS



 · The server hosting the database is running Ubuntu 14.04.2 LTS
 and PostgreSQL version 9.4.4



 · An stunnel connection is provisioned from the client to the
 database.  The app server is running stunnel version 4.53 with the
 following stunnel.conf settings:

 ; PID is created inside the chroot jail

 pid = /stunnel4.pid



 [postgres-host]

 client = yes

 protocol = pgsql

 accept = 5432

 connect = host:5432

 options = NO_TICKET

 retry = yes



 · The database is provisioned to only accept “hostssl” type
 connections from the application server



 I’ve enabled the logging of PostgreSQL connections and disconnections and
 I see that for connections that terminate before the associated query
 completes, the session time shown in the disconnect message is 12 hours
 plus three or four seconds.



 I see this problem with both of the default tcp_keepalives settings of:

#tcp_keepalives_idle = 0

#tcp_keepalives_interval = 0

#tcp_keepalives_count = 0



 And with the following settings:

tcp_keepalives_idle = 120

tcp_keepalives_interval = 120

tcp_keepalives_count = 5



 I don’t see the disconnect problem for long-running queries that use a
 local database connection.



 Any advice on what might be causing remote database connections to drop
 after 12 hours (and how to work around the issue)?



 Kind regards,
 Rich





Re: [GENERAL] regexp_matches for digit

2015-07-09 Thread Steve Crawford

On 07/09/2015 09:24 AM, Ramesh T wrote:

Hi,
  in oracle regexp_like(entered 
date,'[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}','i')


for postgres i have regexp_matches ,But i need how to match [:digit:] 
in postgres when we pass date..?

any help


The tilde operator works fine for me.

select '2014-05-05' ~ '[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}';
?column?
--
t

But if you are attempting to validate a date the regex is *way* too 
simplistic as it will match any manner of junk:

123456-78-901234
thisisan-00-00invaliddate
etc.

At a minimum you need to anchor the ends with ^ and $:
'^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}$'


If you can make reasonable assumptions about date ranges you can catch 
more errors with something like:

'^20[[:digit:]]{2}-[01][[:digit:]]{1}-[0123][[:digit:]]{1}$'

But trying to truly validate dates purely with a regex is more effort 
than I'm willing to put in. I don't recall where I ran across this 
snippet but it creates a function that ensures that the date is 
acceptable to PostgreSQL without raising an error:



CREATE OR REPLACE FUNCTION is_valid_date(text)
   RETURNS bool AS
'
begin
  return case when $1::date is null then false else true end;
exception when others then
  return false;
end;
'
LANGUAGE 'plpgsql' VOLATILE;
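
For example, with the function above:

select is_valid_date('2016-02-29');  -- true
select is_valid_date('2015-02-29');  -- false (not a leap year)
select is_valid_date('2015-13-01');  -- false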

Cheers,
Steve



[GENERAL] Re: [PERFORM] Query running slow for only one specific id. (Postgres 9.3) version

2015-06-05 Thread Steve Crawford

On 06/05/2015 10:54 AM, Sheena, Prabhjot wrote:


Postgresql 9.3 Version

Guys

  Here  is the issue that I’m facing for couple of weeks now. 
I have table (size  7GB)


*If I run this query with this specific registration id it is using 
the wrong execution plan and takes more than a minute to complete. 
Total number of rows for this registration_id is only 414 in this table*


explain analyze SELECT max(last_update_date) AS last_update_date FROM 
btdt_responses WHERE registration_id = 8718704208 AND response != 4;


QUERY PLAN

--

Result  (cost=2902.98..2903.01 rows=1 width=0) (actual 
time=86910.730..86910.731 rows=1 loops=1)


   InitPlan 1 (returns $0)

 -  Limit  (cost=0.57..2902.98 rows=1 width=8) (actual 
time=86910.725..86910.725 rows=1 loops=1)


   -  Index Scan Backward using btdt_responses_n5 on 
btdt_responses  (cost=0.57..6425932.41 rows=2214 width=8) (actual 
time=86910.723..86910.723 rows=1 loops=1)


 Index Cond: (last_update_date IS NOT NULL)

 Filter: ((response <> 4) AND (registration_id = 
8718704208::bigint))


 Rows Removed by Filter: 52145434

Total runtime: 86910.766 ms

*Same query with any other registration id will come back in milli 
seconds *


explain analyze SELECT max(last_update_date) AS last_update_date FROM 
btdt_responses WHERE registration_id = 8688546267 AND response != 4;


QUERY PLAN

--

Aggregate  (cost=529.75..529.78 rows=1 width=8) (actual 
time=19.723..19.723 rows=1 loops=1)


   -  Index Scan using btdt_responses_u2 on btdt_responses  
(cost=0.57..529.45 rows=119 width=8) (actual time=0.097..19.689 
rows=72 loops=1)


 Index Cond: (registration_id = 8688546267::bigint)

 Filter: (response <> 4)

 Rows Removed by Filter: 22

Total runtime: 19.769 ms


A couple initial questions:

1. Does the result change if you analyze the table and rerun the query?

2. Are there any non-default settings for statistics collection on your 
database?


-Steve



Re: [PERFORM] Re: [GENERAL] Re: Query running slow for only one specific id. (Postgres 9.3) version

2015-06-05 Thread Steve Crawford

On 06/05/2015 12:28 PM, Steve Crawford wrote:

On 06/05/2015 11:38 AM, Sheena, Prabhjot wrote:


When I run vacuum analyze it fixes the problem but after 1 or 2 days 
the problem comes back




Is autovacuum running and using what settings?

(select name, setting from pg_settings where name ~ 'autovacuum' or name ~ 'statistics';)


Cheers,
Steve

P.S. The convention on the PostgreSQL mailing lists is to bottom-post, 
not top-post replies.


And just to confirm, are there any table-specific overrides to the 
system-wide settings?


Cheers,
Steve


Re: [GENERAL] Re: Query running slow for only one specific id. (Postgres 9.3) version

2015-06-05 Thread Steve Crawford

On 06/05/2015 11:38 AM, Sheena, Prabhjot wrote:


When I run vacuum analyze it fixes the problem but after 1 or 2 days 
the problem comes back




Is autovacuum running and using what settings?

(select name, setting from pg_settings where name ~ 'autovacuum' or name ~ 'statistics';)


Cheers,
Steve

P.S. The convention on the PostgreSQL mailing lists is to bottom-post, 
not top-post replies.



Re: [GENERAL] date with month and year

2015-05-21 Thread Steve Crawford

On 05/21/2015 10:45 AM, Paul Jungwirth wrote:

You really shouldn't use WITHOUT TIME ZONE.


I'd like to know more about this. Can you say why?


Start by reading about the date and time data types with special 
attention to section 8.5.3:

www.postgresql.org/docs/current/static/datatype-datetime.html

Now go back and read it again and experiment a while until it makes 
sense. As Adrian Klaver so eloquently put it, "If I have learned 
anything about dealing with dates and times, is that it is a set of 
exceptions bound together by a few rules. Every time you think you have 
the little rascals cornered, one gets away." This is also a very good 
reason to avoid reinventing the wheel.


When you need a break, watch this:
https://www.youtube.com/watch?v=-5wpm-gesOY

His conclusion is a good one: be very happy that someone else has done 
the dirty work for you.


The Ruby article does make one good point which is that we are talking 
about what they call an instant or what I like to refer to as a point 
in time. The point in time is actually a better way of thinking of 
timestamp with time zone since the timestamp with time zone does not 
actually store any timezone information - it stores a point in time that 
can be manipulated in the time-zone of your choosing whereas timestamp 
without time zone is not a point in time and must be combined with other 
information to do proper manipulation.
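
A quick illustration (untested) - one stored instant, two renderings:

set time zone 'UTC';
select timestamptz '2015-05-21 17:00:00+00';   -- 2015-05-21 17:00:00+00
set time zone 'America/Los_Angeles';
select timestamptz '2015-05-21 17:00:00+00';   -- 2015-05-21 10:00:00-07

Same point in time; nothing about the original zone is stored in the value 
itself.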


The article does also display a couple attitudes that I feel are 
especially rampant in the web-development community. The first is that 
web developers shouldn't become educated about the capabilities of a 
database but rather use the database as a dumb data-store and redo 
everything themselves (often this includes an utter failure to use the 
data-integrity capabilities of the database).


The second is the assumption that they are the only users of the 
database and that nobody will ever access the data except through their 
custom-written Ruby/PHP/Perl/Python code and that no other programming 
language will ever be used. Woe be to the poor slob who has to deal with 
ad-hoc queries, analytics platforms or reporting systems that weren't so 
brilliantly reinvented or who wants to use range-types or other nice 
PostgreSQL features.


Internally PostgreSQL stores timestamp with time zone in UTC but that 
is entirely irrelevant. What is relevant is that you can provide an 
instant/point in time in whatever time-zone representation you want 
and get it back the same way. Want to use a Unix epoch in your code? Go 
ahead:

extract(epoch from yourtstzcol)
abstime(yourepochint)

Want to assume everything is UTC? No problem:
set timezone to 'UTC';

Then you can reinvent wheels to your heart's content without wrecking 
the ability to easily use other tools.


By the way, use full timezone names to avoid ambiguity. I don't know 
what Ruby cooked up but PostgreSQL uses industry-standard names:

select * from pg_timezone_names;

Your original question had to do with month/year. You will have to 
define this for your use-case but beware that it won't necessarily get 
you away from time-zone issues as the month ticks over on a zone-by-zone 
basis.


Also note that time-intervals can be a source of interesting 
side-effects. Operator precedence is important. For example, what is one 
month? 28-days? 29? 30? 31? Every system must make a judgment call. Add 
a month to January 31 and you will get February 28. But add/subtract a 
month from February 28 and you get January 28/March 28. So you can 
create a query that takes a date, adds a month and subtracts a month and 
results in a different date. There is nothing to do here but to read the 
docs and try things.
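
For example (untested, but this is the behavior described in the docs):

select date '2015-01-31' + interval '1 month';                       -- 2015-02-28 00:00:00
select date '2015-01-31' + interval '1 month' - interval '1 month';  -- 2015-01-28 00:00:00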


There are similar issues when crossing DST boundaries. If I want to push 
something out a day in my time-zone on the day that DST changes I can do 
it easily and understand that PostgreSQL will handle the extra/missing 
hour. Or I can use an explicit increment of '24 hours' if that is what I 
want. No extra steps of converting the timestamp without time zone to 
UTC, converting that to the desired local zone, doing the calculations, 
converting back to UTC and back to timestamp without time zone, all the 
while potentially adding an easy error such as doing things in the wrong 
order and checking for DST changeover in the wrong time-zone.


Cheers,
Steve



Re: [GENERAL] date with month and year

2015-05-21 Thread Steve Crawford

On 05/21/2015 10:01 AM, Daniel Torres wrote:
I everybody, I'm new in the Postgresql world, and have an easy 
question: Is it possible to have date type data that only contain 
month and year?, how can I obtain that from a timestamp (without time 
zone) column?...


Others have offered good tips but if you tell us more about the problem 
you are attempting to solve you may get some better advice and/or 
warnings about pitfalls.


-Steve





[GENERAL] ERROR: could not open relation with OID

2015-04-22 Thread Steve Crawford
This morning we got the following error from a daily script that 
produces a simple largest-table report:

ERROR: could not open relation with OID 597597503

I reran the script and it completed without error.

Our server is running 9.1.15 from PgDg Ubuntu repos and the query run by 
the script is:


SELECT
relname AS TABLE,
lpad ( pg_size_pretty ( pg_relation_size ( oid ) ), 9 ) AS SIZE,
( 100 * pg_relation_size ( oid ) /
( SELECT
sum ( pg_relation_size ( oid ) )
FROM
pg_class
WHERE
relkind = 'r' ) ) ::numeric ( 4, 1 ) AS percent
FROM
pg_class
WHERE
relkind = 'r'
ORDER BY
relpages DESC
LIMIT 60;

My research hasn't led to a definitive conclusion on how to trace this 
issue and how much I should be concerned about this error.


If it is of importance, it appears that a temporary table and temporary 
index were being created within the same second that the query was run.


Any advice?

Cheers,
Steve




Re: [GENERAL] ERROR: could not open relation with OID

2015-04-22 Thread Steve Crawford

On 04/22/2015 01:25 PM, Adrian Klaver wrote:



If it is of importance, it appears that a temporary table and temporary
index were being created within the same second that the query was run.

Any advice?



WHERE
relkind = 'r'
AND
relpersistence != 't'

So to confirm. Fix the query and don't worry?

Cheers,
Steve





Re: [GENERAL] now() vs 'epoch'::timestamp

2015-04-02 Thread Steve Crawford

On 04/02/2015 10:34 AM, David G. Johnston wrote:
On Thu, Apr 2, 2015 at 10:27 AM, James Cloos <cl...@jhcloos.com> wrote:


 SC == Steve Crawford <scrawf...@pinpointresearch.com> writes:

...
What I haven't determined is why converting back is off by 21600
seconds.


What timezone is your server set to - and/or the client requesting 
the calculation?


I haven't looked to see if that is a plausible explanation but if 
you are +/- 6hrs from UTC...


David J.

I was actually just looking at the microseconds being off. Now I'm 
curious again and haven't been able to come up with a plausible 
explanation. My client and server are in America/Pacific time zone. What 
I've seen so far:


First, there appears to be some lingering automatic casting:
select 'epoch';
 ?column?
--
 epoch

select 'epoch' at time zone 'UTC';
  timezone
-
 1970-01-01 00:00:00

In the Pacific time zone, I should be -07 from UTC but if I strip down 
James' statement to the following the result shows as -08, not -07:


select 'epoch'::timestamptz;
  timestamptz

 1969-12-31 16:00:00-08

Which we can see is correct:
select '1969-12-31 16:00:00-08'::timestamptz at time zone 'UTC';
  timezone
-
 1970-01-01 00:00:00

But something gets crossed up when we add a couple calculations:

select (now() - (now() - 'epoch')) ;
?column?

 1969-12-31 17:00:00-08

Now we are off by an hour:
select (now() - (now() - 'epoch')) at time zone 'UTC';
  timezone
-
 1970-01-01 01:00:00


select (now()::timestamp without time zone - (now()::timestamp without 
time zone - 'epoch'));

  ?column?
-
 1970-01-01 00:00:00

That's all I've discovered so far but I have to run to a meeting.

Cheers,
Steve


Re: [GENERAL] now() vs 'epoch'::timestamp

2015-04-01 Thread Steve Crawford

On 04/01/2015 11:50 AM, James Cloos wrote:

I've for some time used:

(now()::timestamp without time zone - 'epoch'::timestamp without time 
zone)::reltime::integer

to get the current seconds since the epoch.  The results are consistant
with date +%s.

(Incidently, is there a better way in 9.4?)

But I found the 'epoch'::timestamp + $THAT_VALUE::reltime was off.

I consitantly get 1970-01-01 06:00 plus a fraction of a second from:

select now() -  ((now()::timestamp without time zone - 'epoch'::timestamp 
without time zone)::reltime::integer)::reltime;

The machines on which I've tried it all have localtime == UTC.

Am I missing something obvious?


Very convoluted calculation as others have noted. As to why it is off, 
you are casting one part of the statement to an integer thus truncating 
the microseconds but are not doing the same on the other side of the 
calculation.




Also, is there any way to get the equiv of date +%s%N as a numeric or a
double precision?


Not exactly. PostgreSQL has resolution to the microsecond, not the 
nanosecond. But to get the correct number of digits just cast the 
following as needed for your application:


extract(epoch from now())*10^9


Cheers,
Steve





Re: [GENERAL] Timezone mismatch

2015-03-20 Thread Steve Crawford

On 03/20/2015 08:29 AM, Leonardo M. Ramé wrote:
Hi, I had to change the O.S. timezone and aparently PostgreSql 
continues using the old timezone, how can I force update it's time zone?.


Using PostgreSql 8.4 on Ubuntu Server 12.04.

To update the OS timezone I used sudo dpkg-reconfigure tzdata 


Did you reload/restart PostgreSQL after making the change?

What is the setting of timezone in postgresql.conf?

What is set as the timezone on the *client* side? (That's really more 
relevant for timestamp_tz data.)


Cheers,
Steve







[GENERAL] libs and upgrades

2015-03-17 Thread Steve Crawford

(reposting - should have originally posted here in general - sorry)

To prepare for server upgrades I'm planning to update the clients on a 
set of servers from 9.1 to 9.4. The servers on which the clients are 
installed are running CentOS 5 i386.


Somewhere between PostgreSQL 9.1 and 9.4 the compat-postgresql-libs, 
which offered libpq.so.4x, has disappeared from the repository causing 
dependency issues with packages that expect libpq.so.4.


Any recommendations on which of many approaches might be best to address 
this?


Cheers,
Steve





Re: [GENERAL] can you please share sample Postgres config file to enable max logging with syslog support?

2014-12-19 Thread Steve Crawford

On 12/18/2014 09:00 PM, M Tarkeshwar Rao wrote:


Hello friends,

can you please share sample Postgres config file to enable max logging 
with syslog support?


Actually we are facing some issues. We need to enable maximum logging.




The # ERROR REPORTING AND LOGGING section in postgresql.conf is fairly 
self-explanatory and commented. To get started just do the following:


Route to syslog:
log_destination = 'syslog'
syslog_facility = 'LOCAL2'
syslog_ident = 'postgres'

(Set facility and ident as appropriate for your syslogging.)

To log all statements:
log_min_duration_statement = 0

There are a number of other things you can log (connections, 
disconnections, checkpoints, ...) as well as some tweaks you can make to 
the log formats but it's all laid out in a reasonably straightforward 
way in the config file.
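
Beyond that, a few settings I generally find useful - treat these as a sketch 
to adjust, not a one-size-fits-all recommendation:

log_connections = on
log_disconnections = on
log_checkpoints = on
log_lock_waits = on
# syslog already records time and pid, so a minimal prefix is enough:
log_line_prefix = 'user=%u db=%d '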


If you described the issues you are facing the community could offer 
better advice on what information to capture and possible solutions. The 
PostgreSQL log is not your only source of troubleshooting information. 
SAR and other sources of system information can be important as well.


Cheers,
Steve



Re: [GENERAL] Modeling Friendship Relationships

2014-11-11 Thread Steve Crawford

On 11/11/2014 02:38 PM, Robert DiFalco wrote:
I have a question about modeling a mutual relationship. It seems basic 
but I can't decide, maybe it is 6 of one a half dozen of the other.


In my system any user might be friends with another user, that means 
they have a reciprocal friend relationship.


It seems I have two choices for modeling it.

1. I have a table with two columns userOne and userTwo. If John is 
friends with Jane there will be one row for both of them.
2. I have a table with two columns owner and friend. If John is 
friends with Jane there will be two rows, one that is {John, Jane} and 
another {Jane, John}.


The first option has the advantage of saving table size. But queries 
are more complex because to get John's friends I have to JOIN friends 
f ON  f.userA = John OR f.userB = John (not the real query, these 
would be id's but you get the idea).


In the second option the table rows would be 2x but the queries would 
be simpler -- JOIN friends f ON f.owner = John.


There could be 1M users. Each user would have 200 friends.

Thoughts? Do I just choose one or is there a clear winner? TIA!


What you are describing is basically an adjacency-list without any 
hierarchy information, i.e. there isn't a John reports to Dick reports 
to Jane type of tree.


One-million-users at 200 friends each would (order-of-magnitudeish) be 
200-million rows which tends to argue for saving space. It also reduces 
the number of rows impacted by deletes and avoids the risk of ending up 
with John,Jane without a corresponding Jane,John.


Getting John's friends isn't too complicated but I suspect the form of 
the query you gave won't lead to optimal query plans. For a two-column 
format I would imagine that ...userB as friend where userA='John' union 
userA as friend where userB='John'... would yield a more optimal plan 
assuming an index on each column.
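
Spelled out (untested; "friendship", userA and userB are placeholders for 
whatever names you end up using):

select userB as friend from friendship where userA = 'John'
union
select userA as friend from friendship where userB = 'John';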


I'm guessing that you will also want to study common-table-expressions 
and recursive queries (description and examples are available in the 
PostgreSQL docs) so you can start to write single queries to answer 
things like list everyone who is a friend of a friend of John.
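
As a teaser - untested, and assuming the two-row-per-relationship layout with 
columns (owner, friend) in a table named friends - a friend-of-a-friend query 
looks roughly like:

with recursive fof as (
    select friend, 1 as depth from friends where owner = 'John'
  union
    select f.friend, fof.depth + 1
    from friends f
    join fof on f.owner = fof.friend
    where fof.depth < 2
)
select distinct friend from fof where friend <> 'John';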


Cheers,
Steve





Re: [GENERAL] which Update quicker

2014-09-23 Thread Steve Crawford

On 09/23/2014 12:35 PM, Emi Lu wrote:

Hello list,

For a big table with more than 1,000,000 records, may I know which 
update is quicker please?


(1) update t1
  set c1 = a.c1
  from a
  where pk and
 t1.c1 <> a.c1;
 ..
  update t1
  set c_N = a.c_N
  from a
  where pk and
 t1.c_N <> a.c_N;


(2)  update t1
  set c1 = a.c1 ,
c2  = a.c2,
...
c_N = a.c_N
 from a
 where pk AND
   ( t1.c1 <> a.c1 OR t1.c2 <> a.c2 ... OR t1.c_N <> a.c_N)




We don't have any info about table structures, index availability and 
usage for query optimization, whether or not the updated columns are 
part of an index, amount of memory available, disk speed, portion of t1 
that will be updated, PostgreSQL settings, etc. so it's really anyone's 
guess. A million rows is pretty modest so I was able to try a couple 
variants of update...from... on million row tables on my aging desktop 
without coming close to the 60-second mark.


*Usually* putting statements into a single transaction is better (as 
would happen automatically in case 2). Also, to the extent that a given 
tuple would have multiple columns updated you will have less bloat and 
I/O using the query that updates the tuple once rather than multiple 
times. But a lot will depend on the efficiency of looking up the 
appropriate data in a.


Cheers,
Steve








Re: [GENERAL] PostgreSQL Portable

2014-09-10 Thread Steve Crawford

On 09/10/2014 02:00 PM, Daniel Begin wrote:


First, I am a Newbie regarding PostgreSQL …

I just started to look at PostgreSQL to implement a large GIS DB 
(1Tb).  The data must reside in an external disk with eSATA connection 
and may be moved to different locations (and Windows 
desktops/laptops). I was looking to install PostgreSQL and PostGIS 
extensions on each PC (setting-up the proper PGDATA directory to the 
external disk) until I read about PostgreSQL and PgAdmin Portable …


http://sourceforge.net/projects/pgadminportable/

http://sourceforge.net/projects/postgresqlportable/

Is that a viable alternative considering the expected size of the DB? 
Any comments or proposal would be appreciated J


Daniel



It appears you are looking to take the PostgreSQL data directory from 
machine to machine on an external drive. I fear you will run into some 
potential problems:


1. Performance (mentioned by others).

2. OS mismatch. Have you ensured that all client machines are running 
identical setups? The underlying files are not guaranteed portable 
between OS versions and 64/32-bit. In fact they probably won't be.


3. Backups. What happens when one user screws up the database?

Perhaps you could explain further the genesis of this requirement. The 
message list is littered with questions like this asking how to 
implement a certain solution when, given an understanding of the reason 
the question is being asked, a far better solution exists. This happens 
even more often when the person asking is a newbie.


Cheers,
Steve



Re: [GENERAL] Timezone difference between Oracle SYSDATE and PostgreSQL timestamp functions

2014-08-29 Thread Steve Crawford

On 08/28/2014 10:06 PM, Vinayak wrote:

Hello,
We have converted Oracle SYSDATE to PostgreSQL statement_timestamp() but
there is a difference in timezone.
SYSDATE returns the time on the server where the database instance is
running(returns operating system time) so the time depends on the OS
timezone setting.
while the timezone of postgreSQL
statement_timestamp()/now()/clock_timestamp() depends on the DBMS setting.
so I think timezone settings are different between DBMS and OS.

Any idea how can we set OS timezone on PostgreSQL?

If you mean setting the default time zone for interpreting non-qualified 
input and displaying output, start with the 'timezone' setting in 
postgresql.conf. Most installs have that default to 'localtime' which 
means to use the server's local timezone but you can set it to whatever 
timezone you prefer. Absent an override by the client, this will be the 
default.


Next, the PGTZ environment variable can set a local default for clients 
reading that variable. Finally, that can then be overridden within a 
connection through the 'set time zone...' statement.
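
For example, within a single session (untested):

set time zone 'Asia/Tokyo';
select statement_timestamp();   -- rendered with a +09 offset
set time zone 'America/Los_Angeles';
select statement_timestamp();   -- same instant, rendered with -07/-08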


Cheers,
Steve





Re: [GENERAL] NOT IN and NOT EXIST

2014-07-03 Thread Steve Crawford

On 07/03/2014 08:35 AM, Sameer Kumar wrote:


Hi,

Postgres optimizer automatically tries to convert an IN clause to Hash 
Join (something similar to EXISTS equivalent of that query).


Does a similar translation happen for NOT IN as well? Given that the 
column used is NOT NUL.


Select * from emp where deptid not in (select deptid from dept where 
deptLocation='New York');


Will this above statement be automatically converted to a plan which 
would match below statement?


Select * from emp where not exists (select 1 from dept where 
deptLocation='New York' and dept.deptid=emp.deptid);




Caveat: I am *not* a planner guru... However:

You can easily test this for yourself using explain.
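
For example, run both forms from your message and compare the plans (untested):

explain select * from emp
 where deptid not in (select deptid from dept where deptLocation = 'New York');

explain select * from emp
 where not exists (select 1 from dept
                   where deptLocation = 'New York' and dept.deptid = emp.deptid);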

But note that the planner doesn't just blindly and consistently map from 
one plan to another. It also evaluates things like distribution of 
values, availability of indexes, size of tables, etc. So the planner 
will likely choose one plan if dept is very small and another if it is 
huge. Similarly, it might choose to use an index on deptid if available 
but again use of that index will depend on table size and distributions 
for deptid. In other words, what you see on a small test data set may 
differ from what you get on a large live one.


Also note that depending on the version of PostgreSQL and the nature of 
your data it could be preferable to use WHERE IN... or to use EXISTS. 
Fortunately as the planner has been imbued with ever increasing smarts, 
the need to test and choose between the two seems to have diminished.


Cheers,
Steve







Re: [GENERAL] How can I replace the year of the created_at column with the current year dynamically ?

2014-07-02 Thread Steve Crawford

On 07/01/2014 11:27 PM, Arup Rakshit wrote:

Here is my try :

staging::= select  to_char(created_at,'DD/MM') || '/' || 
to_char(now(),'YYYY') as when from users;

when

 24/02/2014
...
 20/02/2014
(15 rows)

Can the same be done using any other clever trick ?



No tricks are springing to mind but a warning is. The above will produce 
illegal dates whenever you are an inconvenient number of years past 
February 29. I think this will fix that issue:


select created_at + ((extract(year from now()) - extract(year from 
created_at)) * '1 year'::interval);


Note that the above returns a date (assuming that created_at is a date). 
You may need to apply to_char to format to your desired specification.
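
For instance, to get back the DD/MM/YYYY text of your original attempt (a 
sketch only, using the table and column names from your message):

select to_char(created_at + ((extract(year from now()) - extract(year from
created_at)) * '1 year'::interval), 'DD/MM/YYYY')
from users;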


Cheers,
Steve



Re: [GENERAL] Not able to understand how to write group by

2014-07-02 Thread Steve Crawford

On 07/02/2014 09:55 AM, Arup Rakshit wrote:

SELECT users.gender,count(*) as participant,avg(daily_action_answers.value) as
value
FROM users INNER JOIN daily_action_answers ON
daily_action_answers.user_id = users.id
INNER JOIN measures ON measures.id = daily_action_answers.measure_id
WHERE (((daily_action_answers.day between now() and last_date_of_year) and
daily_action_answers.daily_action_id = 1))
GROUP BY users.gender, measures.option

This is producing the below

 gender | participants | value
--------+--------------+-------
 n      | 2            | 12
 n      | 1            | 3
 m      | 1            | 4
 m      | 4            | 12
 f      | 3            | 23
 f      | 4            | 15

Here n, m, f each come 2 times, because there are 2 possible answers. That's the
problem with my current query: I don't understand which average value is for
which answer.

Can we make the output as below ?

gender   participants   answer1_avg   answer2_avg
n        3              12            3
m        5              4             12
f        7              15            23



As mentioned by jared, the problem is the additional group by 
measures.option which needs to be eliminated. To better understand what 
is happening, just add measures.option to your list of output columns. 
Right now the grouping is hidden because you aren't showing that column.
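
Something along these lines (a sketch based on your query) makes the hidden 
grouping visible:

SELECT users.gender, measures.option, count(*) as participant,
       avg(daily_action_answers.value) as value
FROM users
INNER JOIN daily_action_answers ON daily_action_answers.user_id = users.id
INNER JOIN measures ON measures.id = daily_action_answers.measure_id
WHERE daily_action_answers.day between now() and last_date_of_year
  and daily_action_answers.daily_action_id = 1
GROUP BY users.gender, measures.option;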


Cheers,
Steve





Re: [GENERAL] Fast data, slow data

2014-06-26 Thread Steve Crawford

On 06/26/2014 02:29 AM, Tim Uckun wrote:
I have a use case in which the most recent data experiences a lot of 
transactions (inserts and updates) and then the churn kind of calms 
down.  Eventually the data is relatively static and will only be 
updated in special and sporadic events.


I was thinking about keeping the high churn data in a different table 
so that the vacuums on that table can go faster and the vacuums on the 
rest of the data will rarely be needed.


Ideally the high churn data would be in an in memory table but failing 
that I thought that an unlogged table would be good enough.  So now I 
need a way to flush the oldest data in this table to another logged 
table and do it in an efficient manner. I don't think partitioning is 
a good idea in this case because the partitions will be for small time 
periods (5 to 15 minutes).


Anybody have a similar problem? If so how did you solve it?


It's not at all unusual for a database to experience a high portion of 
its activity on a small subset of the data, so in that sense the 
problem is largely addressed by appropriately sizing and tuning the 
caching and other facilities already provided by the OS and DB.


Having said that, there is far too little information here to provide 
specific advice on tuning and other approaches that may be of use for 
you. In particular:


Is this currently a hypothetical question or do you have a running 
system that is experiencing performance problems?


In either case can you provide some parameters including rows and size 
of your large table(s), what you mean by high churn (rate of inserts, 
deletes, updates and how they are grouped into transactions), the size 
of the table that is heavily updated, how the database is used 
(transactional, data-mining, ...) and anything else you feel might help 
the list understand your use-case.


If you have a live database, any metrics/observations you can supply 
might be helpful. For example, are particular queries slow? What speed 
is required? Do you have lots of simultaneous connections or is 
everything through a single connection?


Finally,  confirm that the data is not important or is easily recreated 
(implied by your in-memory/unlogged-table comments).


Cheers,
Steve






Re: [GENERAL] How can I get first day date of the previous month ?

2014-06-20 Thread Steve Crawford

On 06/20/2014 12:11 AM, Arup Rakshit wrote:
Thanks for your answer. How to get the first day date of last 6 months 
from now then will be :


yelloday_development=# select date_trunc('month', now()) - interval '5 
month' as first_month;

first_month
---
 2014-01-01 00:00:00+05:30
(1 row)

Is it correct ? I am new pgdql DB :-) Awesome DB it is...

Welcome. And yes, it is awesome. Being new to the DB and mailing list, 
please note that the convention on all PostgreSQL mailing lists is to 
post your reply at the bottom and not to top-post.


The solution you gave will work but I'll offer a word of caution - date 
and time manipulation can get tricky and even the way it is handled in 
PostgreSQL has occasionally been tweaked between versions (a good reason 
to always read the release notes).


The three things that seem to cause the most confusion are time-zones, 
daylight saving time and irregular intervals. So if you assume that one 
day is 24 hours you can encounter trouble at DST changes. And 
PostgreSQL, like any system that manipulates time, needs to make certain 
assumptions about what an interval means (what is one month before March 
31) which can lead to this:


steve=> select '2014-03-31'::date - '1 month'::interval + '1 month'::interval;
      ?column?
---------------------
 2014-03-28 00:00:00

when you might have expected this:

steve=> select '2014-03-31'::date - ('1 month'::interval + '1 month'::interval);
      ?column?
---------------------
 2014-01-31 00:00:00

Have fun but read the docs, experiment and test - especially with dates 
and times.


Cheers,
Steve



Re: [GENERAL] Strange Error in postgresql 8.4

2014-06-20 Thread Steve Crawford

On 06/20/2014 01:18 AM, Dick Kniep wrote:

Hi list,

...
Now recently I have noticed a strange problem.

...
psycopg2.DatabaseError: SSL error: ccs received early
Did you try Googling "SSL error: ccs received early"? Like Adrian, I 
suspect your answer will be found in one of those messages. What recent 
updates were made to your systems - especially any related to SSL?


BTW, I suggest putting an upgrade on the front burner. 8.4 is 
end-of-life next month.


Cheers,
Steve







Re: [GENERAL] Overlapping ranges

2014-06-19 Thread Steve Crawford

On 06/18/2014 04:47 PM, Jason Long wrote:

I have a large table of access logs to an application.

I want is to find all rows that overlap startdate and enddate with any
other rows.

The query below seems to work, but does not finish unless I specify a
single id.

select distinct a1.id
from t_access a1,
 t_access a2
where tstzrange(a1.startdate, a1.enddate) &&
      tstzrange(a2.startdate, a2.enddate)





I'm not sure what you mean by specify a single id but a couple comments.

1. This query will return all ids since there is no constraint to 
prevent a1 from finding the matching record in a2 which will, of course, 
overlap. You need to add something like ...and a1.id != a2.id...


2. Even without the above issue there is a great potential to have this 
query run a very long time - especially if the indexes are such that 
each row on a1 requires scanning all rows in a2. I'd test it on a small 
table to make sure it gives the results you want and read up on what 
indexes are most appropriate to help speed it up. (I can't help much 
here as I haven't yet experimented enough with indexing on range types.)
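
Putting point 1 into your query, a minimal sketch (untested against your 
schema) would be:

select distinct a1.id
from t_access a1
join t_access a2
  on a1.id <> a2.id
 and tstzrange(a1.startdate, a1.enddate) && tstzrange(a2.startdate, a2.enddate);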


Cheers,
Steve





[GENERAL] unknown type name ‘pg_int64’

2014-06-11 Thread Steve Crawford
I am getting the error following errors trying to install pg_repack via 
pgxn on Ubuntu server 12.04.4 with all PostgreSQL packages from pgdg:


gcc -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat 
-Wformat-security -Werror=format-security -fPIC -pie 
-fno-omit-frame-pointer -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wformat-security 
-fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
-I/usr/include/postgresql -DREPACK_VERSION=1.2.1 -I. -I./ 
-I/usr/include/postgresql/9.1/server -I/usr/include/postgresql/internal 
-D_FORTIFY_SOURCE=2 -DLINUX_OOM_ADJ=0 -D_GNU_SOURCE 
-I/usr/include/libxml2 -I/usr/include/tcl8.5  -c -o pg_repack.o pg_repack.c

In file included from pgut/pgut.h:21:0,
 from pgut/pgut-fe.h:13,
 from pg_repack.c:25:
/usr/include/postgresql/libpq-fe.h:547:1: error: unknown type name 
‘pg_int64’
/usr/include/postgresql/libpq-fe.h:547:50: error: unknown type name 
‘pg_int64’
/usr/include/postgresql/libpq-fe.h:551:1: error: unknown type name 
‘pg_int64’
/usr/include/postgresql/libpq-fe.h:553:48: error: unknown type name 
‘pg_int64’


I found mention of a similar issue relating to repmgr reported by Josh 
Berkus here but no recommended solution:

https://groups.google.com/forum/#!topic/repmgr/0rSwzVFlPsM

On that thread someone requested the output of dpkg -l | grep postgres 
so I'll preemptively post it here:
ii  pgdg-keyring               2013.2                keyring for apt.postgresql.org
ii  postgresql-9.1             9.1.13-1.pgdg12.4+1   object-relational SQL database, version 9.1 server
ii  postgresql-client-9.1      9.1.13-1.pgdg12.4+1   front-end programs for PostgreSQL 9.1
ii  postgresql-client-common   154.pgdg12.4+1        manager for multiple PostgreSQL client versions
ii  postgresql-common          154.pgdg12.4+1        PostgreSQL database-cluster manager
ii  postgresql-contrib-9.1     9.1.13-1.pgdg12.4+1   additional facilities for PostgreSQL
ii  postgresql-doc-9.1         9.1.13-1.pgdg12.4+1   documentation for the PostgreSQL database management system
ii  postgresql-server-dev-9.1  9.1.13-1.pgdg12.4+1   development files for PostgreSQL 9.1 server-side programming


Cheers,
Steve





Re: [GENERAL] unknown type name ‘pg_int64’

2014-06-11 Thread Steve Crawford

On 06/11/2014 11:35 AM, Tom Lane wrote:

Steve Crawford scrawf...@pinpointresearch.com writes:

I am getting the error following errors trying to install pg_repack via
pgxn on Ubuntu server 12.04.4 with all PostgreSQL packages from pgdg:
/usr/include/postgresql/libpq-fe.h:547:1: error: unknown type name
‘pg_int64’

That type name ought to be defined in postgres_ext.h, if it's of a similar
vintage to libpq-fe.h.  I suspect you have a mishmash of Postgres
installations of different releases and the compiler is managing to find
the wrong version of postgres_ext.h to go with the libpq-fe.h it's
finding.


On that thread someone requested the output of dpkg -l | grep postgres
so I'll preemptively post it here:
[ lots of postgres 9.1 packages ]

Well, 9.1 didn't have pg_int64, but its libpq-fe.h didn't use that type
either.  Whatever /usr/include/postgresql/libpq-fe.h is from, it's not
one of the packages you show here...

You were correct. I have:
/usr/include/postgresql/postgres_ext.h
/usr/include/postgresql/9.1/server/postgres_ext.h

Seems to be a packaging/dependency issue. I installed 
postgresql-server-dev-9.1 (9.1.13-1.pgdg12.4+1) this morning which 
installed, as a dependency, libpq-dev (9.3.4-1.pgdg12.4+1) which uses a 
different directory convention for where to place postgres_ext.h. I 
guess a forced downgrade of libpq/libpq-dev is in order.


Cheers,
Steve







Re: [GENERAL] How can I select rows by comparing an array data type column with multiple values ?

2014-06-02 Thread Steve Crawford

On 06/02/2014 03:54 AM, Arup Rakshit wrote:

Hi,

Suppose, I have a table as below :-

 id |    title     |    content     |         tags
----+--------------+----------------+-----------------------
  1 | sample post  | lorem ipsum    | {apple,orange}
  2 | another post | fruits are bad | {apple,hamburger}
  3 | third post   | foo bar        | { banana, lemon }
  4 | third post   | foo baz        | { watermelon, lemon }

Now I want to select all rows for which tags contains either one 
or all of the values from the array [apple,banana]. How should I write 
the query using such a set?


output should select 1,2,3.
Regards,
Arup Rakshit


Perhaps the && (overlaps) operator will work best for you. Check out the 
available operators at:

http://www.postgresql.org/docs/current/static/functions-array.html
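
For your example data that would be something along these lines (the table 
name is just a placeholder since you didn't give one):

select id from your_table where tags && array['apple','banana'];

which should return ids 1, 2 and 3.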

Cheers,
Steve



[GENERAL] Force specific index disuse

2014-05-20 Thread Steve Crawford
Is there a way to force a specific index to be removed from 
consideration in planning a single query?


Specifically, on a 60-million-row table I have an index that is a 
candidate for removal. I have identified the sets of nightly queries 
that use the index but before dropping it I would like to run EXPLAIN 
and do timing tests on the queries to see the impact of not having that 
index available and rewrite the query to efficiently use other indexes 
if necessary.


Cheers,
Steve





Re: [GENERAL] Force specific index disuse

2014-05-20 Thread Steve Crawford

On 05/20/2014 09:44 AM, Seamus Abshere wrote:

On 5/20/14, 1:38 PM, Steve Crawford wrote:

Is there a way to force a specific index to be removed from
consideration in planning a single query?


hi Steve,

What is the query? Or at least a sanitized but complete version?


I've now resolved the issue with the one update query I was 
investigating this morning. But this involved building a test where I 
removed the index then ran explain and timed the query.


The question is actually general as I anticipate reviewing the benefit 
of dropping more indexes and it would be much more quick and convenient 
to do something akin to:


begin;
disable index foodex;
explain update bar set baz ;
commit;

I'm not sure what would be involved in adding this. It seems that simply 
hiding an index from the planner would be all that is necessary but I'm 
sure there are, as always, subtleties.


Cheers,
Steve





Re: [GENERAL] Force specific index disuse

2014-05-20 Thread Steve Crawford

On 05/20/2014 10:44 AM, Alvaro Herrera wrote:

Steve Crawford wrote:

Is there a way to force a specific index to be removed from
consideration in planning a single query?

Specifically, on a 60-million-row table I have an index that is a
candidate for removal. I have identified the sets of nightly queries
that use the index but before dropping it I would like to run
EXPLAIN and do timing tests on the queries to see the impact of not
having that index available and rewrite the query to efficiently use
other indexes if necessary.

If you can afford to lock the table for a while, the easiest is

BEGIN;
DROP INDEX bothersome_idx;
EXPLAIN your_query;
ROLLBACK;

Interesting. But what do you mean by a while? Does the above keep the 
index intact (brief lock) or does it have to rebuild it on rollback?


What would happen if you did:
BEGIN;
DROP INDEX bothersome_idx;
INSERT INTO indexed_table...;
ROLLBACK;

Cheers,
Steve





Re: [GENERAL] Force specific index disuse

2014-05-20 Thread Steve Crawford

On 05/20/2014 11:48 AM, Steve Crawford wrote:

...

What would happen if you did:
BEGIN;
DROP INDEX bothersome_idx;
INSERT INTO indexed_table...;
ROLLBACK;


Never mind. Thought it through.

Cheers,
Steve




Re: [GENERAL] Ubuntu Packages / Config Files

2014-05-01 Thread Steve Crawford

On 05/01/2014 11:40 AM, Stephan Fabel wrote:

I'm using Ubuntu 12.04 for these deployments at the moment. The Ubuntu
packages don't put the configuration files with the cluster data (by
default under /var/lib/postgresql/9.1/main under 12.04), but in
/etc/postgresql/9.1/main) and they start postgres with the -c option
pointing there.

Whenever I try to add a slave, first I stop the postgresql service, move
the above data directory to something like
/var/lib/postgresql/9.1/main.orig, create a new 'main' directory with
identical permissions/ownerships, and start pg_basebackup pointing
there. It will not copy the server.crt and server.key symlinks (by
default pointing to the snakeoil cert/key) so I re-create those. I
then put the appropriate recovery.conf into /etc/postgresql/9.1/main,
given that that's the configuration directory where everything is. I set
wal_level = hot_standby and hot_standby = on in postgresql.conf.
After that, I then start the postgresql service again.


I built a script that works in our system to create a standby. It's nice 
to have something ready-to-go and well tested when you need to quickly 
get a standby up and running. I leave the script in the PostgreSQL 
config directory where it's tracked, along with other config changes, in 
git.


I also keep recovery.conf and the certificate files in 
/etc/postgresql/9.1/main where they, too, are revision controlled and 
ready to copy into place by the script.


There was some discussion some time ago about changing the requirement 
to have recovery.conf and certs in the data directory but I lost track 
of what was decided for post-9.1 versions.


My recovery.conf mostly consists of:
standby_mode = on
primary_conninfo = 'host=10.9.8.7 user=standby'

The script, stripped of various error-checking, confirmation screens and 
other fluff basically looks like:


### Start
postgresql_replication_user='standby'
postgresql_owner='postgres'
master_ip='10.9.8.7'
pg_parent='/var/lib/postgresql/9.1'
data_dir=${pg_parent}/main
backup_dir=${pg_parent}/$(date +%Y%m%d_%H%M)
conf_dir='/etc/postgresql/9.1/main/'
server_crt='/etc/ssl/certs/ssl-cert-snakeoil.pem'
server_key='/etc/ssl/private/ssl-cert-snakeoil.key'

# Stop the server
echo Stopping PostgreSQL
/etc/init.d/postgresql stop

# Delete and remake the data directory
echo Moving data directory
mv ${data_dir} ${backup_dir}
mkdir ${data_dir}
chown ${postgresql_owner}:${postgresql_owner} ${data_dir}
chmod 700 ${data_dir}

# Switch to postgres user and run basebackup
echo Re-synchronizing database from master
su - postgres -c "
pg_basebackup \
--pgdata \"${data_dir}\" \
--xlog \
--host \"${master_ip}\" \
--user \"${postgresql_replication_user}\" \
--verbose \
--progress
"


# Relink the server keys
ln -s ${server_crt} ${data_dir}/server.crt
ln -s ${server_key} ${data_dir}/server.key

# Put recovery.conf into place
echo Setting up recovery.conf
cp -p ${conf_dir}/recovery.conf ${data_dir}

# Start the server
echo Starting standby server
/etc/init.d/postgresql start
### End

Hope this helps.

Cheers,
Steve





Re: [GENERAL] reindexdb

2014-04-29 Thread Steve Crawford

On 04/29/2014 07:22 AM, Steve Clark wrote:

Hello,

We have a small (2GB) 8.4.20 database. The prior maintainer ran a 
reindexdb
twice a day (originally started with 7.3.x). Can anybody see a reason 
why we need to

continue to do this?


I don't. It certainly appears to be a piece of duct-tape put on either 
due to a real or perceived ancient bug or due to somebody's 
misunderstanding of things in the distant past. (I would, however, start 
looking at upgrading from 8.4 which is EOL in about 3 months.)


Cheers,
Steve



Re: [GENERAL] timezone datetime issue

2014-04-16 Thread Steve Crawford

On 04/16/2014 03:56 AM, Glenn Pierce wrote:

Hi I have an issue with adjusting a timestamp.

I have a table  like

CREATE TABLE sensor_values
(
  ts timestamp with time zone NOT NULL,
  value double precision NOT NULL DEFAULT 'NaN'::real
)

It was intended that ts timestamps would be the time we wanted to 
store in UTC.
Clients would adjust their times to UTC before entering into the 
database.



Unfortunately some values have been added with BST times.
The DB thinks  they are UTC times but are an hour out this time of year.

Is there a way to adjust those times? I.e. offset the summer times back 
an hour?




I see two potential questions here.

If you are asking about the correct way to insert the data then you may 
be confusing the issue by adding conversions unnecessarily. I can't tell 
from your description but I'll hazard a guess that you may be doing 
something similar to the following (using my time zone on the US Pacific 
coast as an example):


1. You get a reading and a local time, say 2014-04-16 09:15:00. This 
really means 2014-04-16 09:15:00-07 since we're on Pacific Daylight Time.

2. You convert that to UTC which would be 2014-04-16 16:15:00.
3. You insert that value into your data: insert into sensor_values (ts, 
value) values ('2014-04-16 16:15:00', 1);
4. You note that the value in the database is not what you expected but 
rather is 2014-04-16 16:15:00-07 or 2014-04-16 23:15:00 UTC.


If this is the case then the problem is that you are double converting. 
The time stamp with time zone does not actually store any time zone 
information and is better thought of as a point in time. If I insert 
2014-04-16 09:15:00-07 into a table and someone else, who has their 
client set to UTC, views that record it will show 2014-04-16 16:15:00-00.


Further, if you insert data into a timestamptz column and omit the 
explicit time-zone offset, PostgreSQL will assume you mean that the 
value is your local time (or whatever you have set your client time zone 
to).


If you are taking readings in the UK it's quite possible that such an 
issue would not be apparent until the spring time change.


If the only issue is fixing incorrect data then you merely need to 
identify the incorrect records. If all readings come from sensors in a 
common time zone then you need to identify, probably by time range, the 
block of bad data and update it by subtracting '1 hour'::interval. 
Naturally the data-repair needs to be coordinated with fixing the bug 
that caused the incorrect entries in the first place. (It's easy to get 
things backward or choose incorrect blocks so I'd make a backup of the 
table first.)
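
The repair itself is then a one-liner of the form below; the timestamps are 
placeholders for whatever range you identify as bad:

update sensor_values
   set ts = ts - '1 hour'::interval
 where ts >= '2014-03-30 01:00+00' and ts < '2014-04-16 12:00+00';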


If the readings come from sensors across different time zones then you 
will be tasked with the issue of somehow identifying which records need 
correcting and which don't - possibly a large task.


Cheers,
Steve




Re: [GENERAL] openssl heartbleed

2014-04-10 Thread Steve Crawford

On 04/10/2014 01:01 AM, Albe Laurenz wrote:

Steve Crawford  wrote:


If you aren't and weren't running a vulnerable version or if the
vulnerable systems were entirely within a trusted network space with no
direct external access then you are probably at low to no risk and need
to evaluate the cost of updates against the low level of risk.

If you are in a totally trusted environment, why would you use SSL?



I didn't say *totally* trusted - that doesn't exist. We use secure 
connections inside our firewall all the time and sometimes 
authentication convenience is as much a driving factor as security.


I didn't suggest someone *avoid* updating keys/certificates - just to 
evaluate cost vs. risk as one must always do. But I'd submit that anyone 
seriously concerned about this attack being launched from within their 
internal network has a whole bunch of higher-priority security problems.


-Steve





Re: [GENERAL] openssl heartbleed

2014-04-09 Thread Steve Crawford

On 04/09/2014 08:54 AM, Gabriel E. Sánchez Martínez wrote:

Hi all,

Our server is running Ubuntu Server 13.10 (we will soon upgrade to 
14.04) and PostgreSQL 9.1.  We use certificates for all client 
authentication on remote connections.  The server certificate is 
self-signed.  In light of the heartbleed bug, should we create a new 
server certificate and replace all client certificates?  My guess is yes.


The answer is, of course, it depends. Here's my take:

If your connections are coming from the Internet or other untrusted 
sources *and* you are or were running a vulnerable version of OpenSSL 
then yes, you should change your keys, certificates and any other 
credentials that might have been found at some point in RAM including 
passwords/keys used to access the vulnerable server *or* which the 
vulnerable server stores and uses to access other systems. Of course 
this means that if you have PostgreSQL backing a vulnerable public 
webserver then you are at risk.


If you aren't and weren't running a vulnerable version or if the 
vulnerable systems were entirely within a trusted network space with no 
direct external access then you are probably at low to no risk and need 
to evaluate the cost of updates against the low level of risk.


Cheers,
Steve





Re: [GENERAL] Log file monitoring and event notification

2014-04-07 Thread Steve Crawford

On 04/05/2014 08:47 AM, Andy Colson wrote:

Hi All.

I've started using replication, and I'd like to monitor my logs for 
any errors or problems.  I don't want to do it manually, and I'm not 
interested in stats (a la PgBadger).


What I'd like, is the instant PG logs: FATAL: wal segment already 
removed (or some such bad thing), I'd like to get an email


As one component of our monitoring we route logging through syslog which 
has all messages go to one location for use by PgBadger and friends and 
simultaneously any message with a WARN or higher priority goes to a 
separate temporary postgresql_trouble.log.


A cron-job checks this file periodically (currently we use 5-minutes) 
for content. If the file has content the script sends the appropriate 
emails and truncates the trouble log.


Cheers,
Steve





Re: [GENERAL] SSD Drives

2014-04-04 Thread Steve Crawford

On 04/03/2014 12:44 PM, Brent Wood wrote:

Hi David,

Does the RAID 1 array give any performance benefits over a single 
drive? I'd guess that writes may be slower, reads may be faster (if 
balanced) but data security is improved.


I've been looking into upgrading to SSD and wondering about RAID and 
where to apply $$$ as well. In particular I'm curious about any 
real-world PostgreSQL-oriented performance and data-protections advice 
in the following areas:


1. With SSDs being orders of magnitude faster than spinning media, when 
does the RAID controller rather than the storage become the bottleneck?


2. Do I need both BBU on the RAID *and* capacitor on the SSD or just on 
one? Which one? I'm suspecting capacitor on the SSD and write-through on 
the RAID.


2. Current thoughts on hardware vs. software RAID - especially since 
many of the current SSD solutions plug straight into the bus.


3. Potential issues or conflicts with SSD-specific requirements like TRIM.

4. Manufacturers, models or technologies to seek out or avoid.

5. At what point do we consider the RAID controller an additional SPOF 
that decreases instead of increases reliability?


6. Thoughts on best bang for the buck? For example, am I better off 
dropping the RAID cards and additional drives and instead adding another 
standby server?


Cheers,
Steve



Re: [GENERAL] SSD Drives

2014-04-04 Thread Steve Crawford

On 04/04/2014 10:15 AM, Merlin Moncure wrote:

2. Do I need both BBU on the RAID *and* capacitor on the SSD or just on one?
Which one? I'm suspecting capacitor on the SSD and write-through on the
RAID.

You need both. The capacitor protects the drive, the BBU protects the
raid controller.
?? In write-through the controller shouldn't return success until it 
gets it from the drive so no BBU should be required. One LSI slide deck 
recommends write-back as the optimum policy for SSDs. But I could be 
wrong which is why I ask.

2. Current thoughts on hardware vs. software RAID - especially since many of
the current SSD solutions plug straight into the bus.

IMNSHO, software raid is a better bet.  The advantages are compelling:
Cost, TRIM support, etc. and the SSD drives do not benefit as much
from the write cache.   But hardware controllers offer very fast burst
write performance which is nice.

6. Thoughts on best bang for the buck? For example, am I better off
dropping the RAID cards and additional drives and instead adding another
standby server?
This is going to depend a lot on write patterns.  If you don't do much
writing, you can gear up accordingly.  For all around performance, the
S3700 (2.5$/gb) IMO held the crown for most of 2013 and I think is
still the one to buy.  The s3500 (1.25$/gb) came out and also looks
like a pretty good deal, and there are some decent competitors (600
pro for example).  If you're willing to spend more, there are a lot of
other options.  I don't think it's reasonable to spend less for a
write heavy application.


FWIW, the workload is somewhat over 50% writes and currently peaks at 
~1,600 queries/second after excluding set statements. This is 
currently spread across four 15k SATA drives in RAID 10.


Judicious archiving allows us to keep our total OS+data storage 
requirements under 100GB. Usually. So we should be able to easily stay 
in the $500/drive price range (200GB S3700) and still have plenty of 
headroom for wear-leveling.


One option I'm considering is no RAID at all but spend the savings from 
the controllers and extra drives toward an additional standby server.


Cheers,
Steve





Re: [GENERAL] Mysterious DB reset

2014-03-06 Thread Steve Crawford

On 03/06/2014 09:33 AM, Israel Brewster wrote:

For starters, this happened again this morning (no data prior to 4:45 am and 
sequence reset), so whatever is going on appears to be recurring. Also, I 
forgot to mention, in case it is significant: this is running on Slackware Linux 14.0


Also odd is that my cleanup script runs at 1am. I have records of there
being new data in the database up to 3:51am, but the oldest record
currently in the DB is from 4:45am (as specified by the default of now()
on the column). So I know records were added after my delete command
ran, but before this reset occurred.


A shot in the dark...

Have you searched /etc/crontab, root's crontab, PostgreSQL's crontab and 
the crontabs of any automatic scripts that connect. I'm not sure about 
Slackware but Red Hat and Centos run the cron.daily scripts at (wait for 
it...) just after 4am.


Some of the default daily scripts like logrotate can have side effects 
like restarting the service that writes to the log file being rotated.


Cheers,
Steve




Re: [GENERAL] Cancelling of autovacuums considered harmful

2014-02-26 Thread Steve Crawford

On 02/26/2014 08:56 AM, Alvaro Herrera wrote:

...
No matter how heavily updated, regular activity should not cause
autovacuum kills.  Only heavier operations would do that (say ALTER
TABLE, etc).


"Considered harmful" got my attention. What, if any, known harm is caused?

We have many errors of this type but in our case most are due to batch 
processes that have a vacuum embedded at appropriate points in the 
string of commands in order to avoid excessive bloat and to ensure the 
tables are analyzed for the following steps. Occasionally the autovacuum 
triggers before the manual but gets canceled.


Any harm?

Cheers,
Steve





Re: [GENERAL] Cancelling of autovacuums considered harmful

2014-02-26 Thread Steve Crawford

On 02/26/2014 04:40 PM, Steve Crawford wrote:

On 02/26/2014 08:56 AM, Alvaro Herrera wrote:

...
No matter how heavily updated, regular activity should not cause
autovacuum kills.  Only heavier operations would do that (say ALTER
TABLE, etc).


Considered harmful got my attention. What, if any, known harm is 
caused?


We have many errors of this type but in our case most are due to batch 
processes that have a vacuum embedded at appropriate points in the 
string of commands in order to avoid excessive bloat and to ensure the 
tables are analyzed for the following steps. Occasionally the 
autovacuum triggers before the manual but gets canceled.


Oops, I meant to say the processes have an ANALYZE embedded in them but 
this kills the vacuum. Typically they are bulk-loads into an empty table 
followed by analysis and distribution of the incoming data to other 
tables. The ANALYZE immediately follows the data load to ensure the 
planner has updated stats for the analysis and other following stages.


Cheers,
Steve




Re: [GENERAL] password-less access, without using pg_hba

2014-02-07 Thread Steve Crawford

On 02/06/2014 06:07 PM, Reece Hart wrote:
I'd like to provide public access, without a password, to a database 
hosted on Amazon RDS.


I'm familiar with using pg_hba.conf to enable trust (no) 
authentication for a user. pg_hba.conf is not available to DBAs on RDS.


Is there any other way to achieve password-less login in postgresql? I 
tried alter user password NULL.



Ignoring the scary security issues...

If you can't access pg_hba.conf how about just sticking pgbouncer or 
similar in the middle and have your users connect through that?


Cheers,
Steve




Re: [GENERAL] subtracting from a date

2014-02-06 Thread Steve Crawford

On 02/06/2014 09:25 AM, Jay Vee wrote:
I have reviewed working with dates documentation but some things are 
not clear and I cannot get an example to work for what I need.


I am passing a date into a stored function like '2013-04-01'

The stored function accepts this string as a date type.

Within the function, I need to:
1.  subtract one year from this date into another date type
2.  subtract one month from this date into another date type
3.  subtract one day from this date into another date type

Are there any examples of this?  This is not a timestamp type, but a 
date type.


thanks

The basics are easy. Cast the string to a date and subtract the 
appropriate interval:

'2013-04-01'::date - '1 year'::interval
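
Note that date minus interval yields a timestamp, so if you need the result 
back in a date variable inside your function, cast it. A minimal sketch:

select ('2013-04-01'::date - '1 year'::interval)::date  as minus_year,
       ('2013-04-01'::date - '1 month'::interval)::date as minus_month,
       ('2013-04-01'::date - '1 day'::interval)::date   as minus_day;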

BUT...
PostgreSQL, like all programs, makes certain assumptions about dates and 
intervals. Generally they are good but you need to be sure they match 
your requirements lest you be surprised. For example:


1 day may or may not be 24 hours (DST changeovers)

1 month has varying numbers of days

Same thing with 1 year (leap years)

Certain calculations will give different results depending on what 
timezone you set due to the different scheduling and handling of 
summer/winter time offsets.


Operator precedence is important. You might expect select 
'2012-02-29'::date - '1 year'::interval + '1 year'::interval; to return 
2012-02-29 but it will return 2012-02-28 since February 2011 has no 29th 
and 2011-02-28 plus a year is 2012-02-28.


For an amusing take on date/time calculations see:
http://www.youtube.com/watch?v=-5wpm-gesOY

Cheers,
Steve




Re: [GENERAL] A Simple web application

2014-01-27 Thread Steve Crawford

On 01/27/2014 07:19 AM, Raymond O'Donnell wrote:

On 27/01/2014 15:16, Edson Richter wrote:

Em 27/01/2014 10:42, Andreas Kretschmer escreveu:

Raymond O'Donnell r...@iol.ie wrote:


On 27/01/2014 12:11, Manoj Soni wrote:

Name, Age and Sex
Which tutorial, may help me in this regard.

I did a Google search on "tomcat netbeans postgresql tutorial" and
turned up some likely-looking links - should work for you too. :-)

Right, and as a hint: don't store the AGE of a person, store the
birthdate instead.


Andreas

Another hint: don't store sex (do/don't do?), but gender (even
better if tristate: male/female/not informed or other).
I already had lots of trouble with customers asking me to adjust my
systems to this new situation, and then I've had to change systems and
databases...

+1 to this... there was a lengthy thread on this list some years ago on
this very subject, and it was an eye-opener to see the possibilities
that emerged.

Ray.



Actually, to be pedantic, use whichever is appropriate to your use-case.

If you are looking at biological/physiological attributes (chromosomes, 
hormones, testicular-cancer, ...), use sex.


If you are tracking social attributes or characteristics (preferred 
dress-style, sexual-preference, income, prevalence in various 
professional roles, etc.), use gender.


See http://www.who.int/gender/whatisgender/en/

Either way, an expandable lookup-table may end up being useful. Just 
when you think you are strictly dealing with sex and you hard-code for 
male and female, someone will show up and ask you to rewrite your 
code to account for chimeras 
(http://en.wikipedia.org/wiki/Chimera_%28genetics%29).
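
A minimal sketch of such a lookup table (the names are purely illustrative):

create table genders (
    id    serial primary key,
    label text not null unique
);
insert into genders (label) values ('male'), ('female'), ('not informed');
-- and the person table then carries: gender_id integer references genders (id)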


Cheers,
Steve





[GENERAL] postgresql-common breaks logrotate

2013-12-26 Thread Steve Crawford
People who have attempted to install PGDG versions of PostgreSQL on 
recent Ubuntu releases have run into the cascading problem of postgresql 
wanting to destroy Ubuntu.


Based on the packaging:
postgresql depends on postgresql-common
postgresql-common breaks logrotate (3.8)
ubuntu-standard depends on logrotate
profit?...

It appears the reason for the claimed breakage is a very slight update 
to logrotate that requires one to tell logrotate about files/directories 
with non-standard ownership/permissions. Getting logrotate to stop 
complaining is trivially resolved by the user by adding a single su 
directive to the /etc/logrotate.d/postgresql-common or by having 
PostgreSQL log to syslog.


It seems that "breaks" is overkill and the hassle imposed by that 
declaration far exceeds any benefit therefrom.


Thoughts?

Cheers,
Steve






[GENERAL] Does --no-unlogged-table-data prevent table locking?

2013-12-02 Thread Steve Crawford
If a database has an unlogged table and that table is truncated during 
while a database dump with the --no-unlogged-table-data option is 
running, will the table be locked against a truncate or will the 
truncate succeed? The documentation is unclear on this detail.


Background:

We have a regular Bash script that essentially does the following:

1. Some preprocessing
2. psql: ...empty a table, populate the table, do some processing on the 
table...

3. More non PG steps in Bash
4. psql ...some remaining processing on the table and export the results...
5. Cleanup

We are now colliding with a nightly dump which is delaying this script. 
Note that since there is a break between the two calls to psql, we 
cannot use a temporary table but an unlogged table is perfectly safe for 
our purposes.


Cheers,
Steve





Re: [GENERAL] Junk date getting uploaded into date field

2013-11-05 Thread Steve Crawford

On 11/05/2013 05:29 AM, Albe Laurenz wrote:

Thomas Kellerer wrote:

bsreejithin, 05.11.2013 13:14:

Not able to post the attached details as a comment in the reply box, so
attaching it as an image file :
http://postgresql.1045698.n5.nabble.com/file/n5776987/Untitled.png

It would have much easier if you had simply used copy  paste to post a text 
version of that SQL.
Does your mail client not allow you to do that?

But your test case is essentially this:

select to_date('33-OCT-2013', 'dd-mon-yyyy')

which indeed returns 2013-11-02 (using 9.3.1)

I don't know if this is inteded or actually a bug - I can't find anything in 
the docs relating to that
behaviour.

There is a comment in utils/adt/formatting.c:

  * This function does very little error checking, e.g.
  * to_timestamp('20096040','YYYYMMDD') works


I think the place for such warnings in addition to the source-code is in 
the documentation. This or similar issues with to_date have popped up on 
the lists a number of times.


Perhaps a "see warnings below" note by the to_date description in the table:
http://www.postgresql.org/docs/9.3/static/functions-formatting.html#FUNCTIONS-FORMATTING-TABLE

Then under usage notes something like:

The to_date and to_timestamp functions do minimal input error-checking 
and are intended for conversion of non-standard formats that cannot be 
handled by casting. These functions will attempt to convert illegal 
dates to the best of their ability, e.g. to_date('33-OCT-2013', 
'dd-mon-yyyy') will return 2013-11-02. Users of these functions are 
advised to perform whatever external error-checking they deem prudent.


Cheers,
Steve





Re: [GENERAL] Table partitioning

2013-10-28 Thread Steve Crawford

On 10/28/2013 09:27 AM, Herouth Maoz wrote:

I have a rather large and slow table in Postgresql 9.1. I'm thinking of 
partitioning it by months, but I don't like the idea of creating and dropping 
tables all the time.
What is slow about it? Inserting? Selecting? Deleting? Partitioning can 
assist with some issues but does no good if what you really need is an 
index or better query. Partitioning shines as an option to manage 
archiving/purging of time-series data but only if you work with it, not 
against it.


What don't you like about creating and dropping tables? You can easily 
automate it: https://github.com/keithf4/pg_partman




I'm thinking of simply creating 12 child tables, in which the check condition 
will be, for example, date_part('month', time_arrived) = 1 (or 2 for February, 
3 for March etc.).

I'll just be deleting records rather than dropping tables, the same way I do in 
my current setup. I delete a week's worth every time.
You are missing out on one of the best aspects of partitioning. Compared 
to dropping or truncating a child table, deleting is far slower and 
causes table bloat which may impact future queries.


Second, when I delete (not drop!) from the mother table, are records deleted 
automatically from the child tables or do I need to create rules/triggers for 
that?



Yes, unless you use the keyword ONLY: "If specified, deletes rows from 
the named table only. When not specified, any tables inheriting from the 
named table are also processed."
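
In other words, sketching with the column from your message (the table name 
is a placeholder):

delete from mother_table where time_arrived < now() - '1 week'::interval;      -- also deletes from the children
delete from only mother_table where time_arrived < now() - '1 week'::interval; -- parent table only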


Cheers,
Steve





Re: [GENERAL] pg_dumpall from a script

2013-10-22 Thread Steve Crawford

On 10/21/2013 10:20 PM, James Sewell wrote:
That looks great, but it doesn't really help with my problem unless 
I'm missing something (very possible!)


I need a way to backup either from SQL in PSQL (possibly \!) or from a 
PG/PLSQL function to a file with a name set from a :variable.


This would be triggered by a certain action in the database (applying 
a patch).


Hope that's a bit clearer!

...

Perhaps a better description of the problem you want to solve would 
help. Are backups running on the same machine as the server? How soon 
after the triggering event must the backup be run? If there is a 
subsequent triggering event that occurs before the backup starts on the 
first event will there be ill effects? Is there a method in place to 
prevent subsequent events prior to the backup completing on the first 
event? Does the test have to be done in psql for some reason or are 
other clients acceptable?


If you are running the backup from within psql then something started 
psql. I assume you aren't trying to have a constantly connected psql 
script running a loop or using listen/notify. If you can test for your 
triggering event via a query then you can use a simple bash script to 
create a temporary file based on your query then test that file to 
determine if a backup is required. Run via cron every minute and you're 
done.


If you really need relatively instant response and especially if you can 
use a different client-side program then I suppose you might be able to 
cook up something with listen/notify or advisory locks.


Alternately, you might be able to use a foreign-data-wrapper that would 
allow you to create/modify a file (external file-based table) when your 
event occurs. By watching that external file you can determine that a 
backup is required. A daemonized bash script utilizing inotify-tools can 
reliably sit and wait for a change to a file, then immediately launch your 
backup script. Be careful that you have a mechanism to prevent too many 
overlapping backups.


Cheers,
Steve




Re: [GENERAL] Forms for entering data into postgresql

2013-10-14 Thread Steve Crawford

On 10/12/2013 01:57 PM, Adrian Klaver wrote:

...
In the text-based systems I am familiar with, there were keyboard 
shortcuts that took you directly to fields, and coding conventions that 
allowed direct entry of data. For example, at a plumbing supply house I 
went to the convention was something like:


pv150el90 = PVC 1.5 ell 90 degree
abs150el90 = ABS 1.5 ell 90 degree

It looks cumbersome, but in the hands of someone experienced in the 
system, data entry was very fast.
Actually it looks like a Unix command-line. :) Very fast and easy to use 
but not necessarily intuitive and easy to learn for the newbie.


I recall a former co-worker telling me about a friend who had started 
work at a large accounting firm. He said that employees were not allowed 
to have a mouse at their computer for the first few months of 
employement which forced them to learn to use the computer by keyboard 
alone. When they got the mouse back it was rarely used and the speedy 
and effortless way they navigated spreadsheets at audits repaid the time 
spent learning the shortcuts many times over.


Cheers,
Steve





Re: [GENERAL] Need some help on Performance 9.0.4

2013-10-11 Thread Steve Crawford

On 10/11/2013 08:56 AM, akp geek wrote:
We have been running 4 of our applications on 9.0.4; we are 
planning to update the database to 9.2.2 by the year end.


Sorry if I missed it but I couldn't find a question or description of a 
problem you are encountering.


Cheers,
Steve





Re: [GENERAL] ERROR: invalid value ???? for YYYY

2013-10-10 Thread Steve Crawford

On 10/09/2013 05:57 PM, Brian Wong wrote:
But from a user's perspective, why would it ever make sense that adding 
an additional where clause actually brings more data into the picture? 
If I have a query returning 100 rows, adding an additional where clause 
should only cut down the number of rows, not increase it.

...

It may be a bit surprising at first when one does not consider they are 
looking at a view, not at a table, but for the planner to do its job of 
returning the data you want efficiently it must be allowed to optimize 
your request by rewriting it into a logically equivalent form. For 
example, given something like ...where eventtime > 
abstime(12334554321)... the planner will detect that abstime(12334554321) 
will always return the same result and will calculate that value once, 
not redo it once for every row.


The idea that you are just reducing the number of rows doesn't mean that 
doing things in that order is efficient. Imagine that you had a view 
that returned the records of all males from a table of all people in the 
US that includes an indexed birthday field as MMDD. This view would 
return a bit over 155-million rows - roughly 50% of the approximately 
310-million US population. You then query that view to find men who have 
a birthday today. It is obviously suboptimal to first retrieve 
155-million records from disk then evaluate those for the appropriate 
birthday when you can, instead, start by grabbing 1/365th or less than 
1-million rows which can be efficiently returned using the index then 
evaluate those records for male/female.
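
As a sketch of what the planner effectively does with the view (names made 
up for illustration):

create view males as select * from people where sex = 'M';

select * from males where birthday = '0416';
-- is planned roughly like:
--   select * from people where sex = 'M' and birthday = '0416';
-- so an index on birthday can be used before the sex filter is applied.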


It's sort of like sending someone to the store with a shopping list - 
you have no expectation that they will fill the cart in the order you 
put things on the list, just that they will come back with the items on 
the list.


Cheers,
Steve




