Re: [GENERAL] mild modification to pg_dump

2017-11-18 Thread Matt Zagrabelny
On Fri, Nov 17, 2017 at 3:58 PM, marcelo  wrote:

> Again: knowing of .pgpass (thank you Scott) this is what I will do.
>
>
Just in case you don't already know: the permissions on the .pgpass file must
not allow group or world access. For instance:

chmod 0600 .pgpass
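
For reference (per the libpq documentation), each line of .pgpass takes the form:

hostname:port:database:username:password

so a hypothetical local entry might look like: localhost:5432:mydb:myuser:secret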

-m


Re: [GENERAL] expanded auto and header linestyle double

2017-11-16 Thread Matt Zagrabelny
Thanks for the reply, Pavel!

On Thu, Nov 16, 2017 at 1:01 AM, Pavel Stehule 
wrote:

> Hi
>
> 2017-11-15 23:37 GMT+01:00 Matt Zagrabelny :
>
>> Greetings,
>>
>> Using PG 10.1.
>>
>> In my .psqlrc I have:
>>
>> \x auto
>> \pset linestyle 'unicode'
>> \pset unicode_header_linestyle double
>>
>> and when the output is expanded, I do not see a double line for the first
>> record, but I do for all subsequent records. For example:
>>
>> % select  * from artist;
>> ─[ RECORD 1 ]─
>> artistid │ 1
>> name │ AC/DC
>> ═[ RECORD 2 ]═
>> artistid │ 2
>> name │ Accept
>> ═[ RECORD 3 ]═
>> artistid │ 3
>> name │ Aerosmith
>> ═[ RECORD 4 ]═
>> artistid │ 4
>> name │ Alanis Morissette
>> ═[ RECORD 5 ]═
>> artistid │ 5
>>
>> I would like to have the initial "RECORD 1" line have the same "double"
>> linestyle as the other records.
>>
>> Am I missing a config item?
>>
>
> yes - it is border line
>
> use \pset border 2
>
>
Hmmm I didn't use the "border" setting.


> and you understand
>
> you are missing
>
> \pset unicode_border_linestyle double
>

But I did use the above setting.

So my .psqlrc looks like:

\pset linestyle 'unicode'
\pset unicode_border_linestyle double
\pset unicode_header_linestyle double

═[ RECORD 1 ]
artistid │ 1
name │ AC/DC
═[ RECORD 2 ]
artistid │ 2
name │ Accept
═[ RECORD 3 ]
artistid │ 3
name │ Aerosmith

Thanks for helping me get it corrected.

It still feels strange that when using "expanded" display the first record
separator would be categorized under "border_linestyle" and the remaining
record separators would be categorized under "header_linestyle".

Cheers!

-m


[GENERAL] expanded auto and header linestyle double

2017-11-15 Thread Matt Zagrabelny
Greetings,

Using PG 10.1.

In my .psqlrc I have:

\x auto
\pset linestyle 'unicode'
\pset unicode_header_linestyle double

and when the output is expanded, I do not see a double line for the first
record, but I do for all subsequent records. For example:

% select  * from artist;
─[ RECORD 1 ]─
artistid │ 1
name │ AC/DC
═[ RECORD 2 ]═
artistid │ 2
name │ Accept
═[ RECORD 3 ]═
artistid │ 3
name │ Aerosmith
═[ RECORD 4 ]═
artistid │ 4
name │ Alanis Morissette
═[ RECORD 5 ]═
artistid │ 5

I would like to have the initial "RECORD 1" line have the same "double"
linestyle as the other records.

Am I missing a config item?

Thanks!

-m


Re: [GENERAL] Performance options for CPU bound multi-SUM query

2016-01-27 Thread Matt
Moving from NUMERIC to FLOAT(8) did indeed lower query times by about 
20%.


I will try fixeddecimal and agg() as time permits.
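
For anyone following along, the two routes to float8 arithmetic are roughly the
following (a sketch only; table and column names here are hypothetical):

ALTER TABLE hourly_agg ALTER COLUMN val01 TYPE double precision;
-- or, leaving the schema alone, cast inside the aggregate:
SELECT SUM(val01::double precision) FROM hourly_agg;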


On 25 Jan 2016, at 4:44, David Rowley wrote:


On 25 January 2016 at 15:45, Matt  wrote:
I have a warehousing case where data is bucketed by a key of an hourly
timestamp and 3 other columns. In addition there are 32 numeric columns. The
tables are partitioned on regular date ranges, and aggregated to the lowest
resolution usable.

The principal use case is to select over a range of dates or hours, filter
by the other key columns, and SUM() all 32 of the other columns. The
execution plan shows the primary key index limits row selection efficiently,
but the query appears CPU bound in performing all of those 32 SUM()
aggregates.



SUM(numeric) also has to work quite a bit harder than an aggregate
like sum(float8) too since the addition in numeric is implemented in
software.
It depends on the use case, but for some cases the float4 or float8
types might be an option and it will offer much faster aggregation.
There is also https://github.com/2ndQuadrant/fixeddecimal which may be
of some use if you need fixed precision up to a predefined scale. We
found that using fixeddecimal instead of numeric for the TPC-H
benchmark improved performance of query 1 significantly.

--
David Rowley   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] prefix package availability for 9.5

2016-01-26 Thread Hubbard, Matt R W
I was hoping to be able to use the prefix module with postgresql 9.5.

However, I'm not finding it in the redhat 7 repo:
https://download.postgresql.org/pub/repos/yum/9.5/redhat/rhel-7-x86_64/

It is available for 9.4:
https://download.postgresql.org/pub/repos/yum/9.4/redhat/rhel-7-x86_64/

It is available for debian:
https://packages.debian.org/sid/postgresql-9.5-prefix


I'm afraid I've no idea how the build system for the modules works. I had a 
quick search and found buildfarm.postgresql.org, but this doesn't seem to cover 
modules.

Is there something that needs fixing somewhere to get prefix95 published for 
rhel-7-x86_64? How would one find out?

Many thanks,
Matt



[GENERAL] Performance options for CPU bound multi-SUM query

2016-01-24 Thread Matt
I have a warehousing case where data is bucketed by a key of an hourly 
timestamp and 3 other columns. In addition there are 32 numeric columns. 
The tables are partitioned on regular date ranges, and aggregated to the 
lowest resolution usable.


The principal use case is to select over a range of dates or hours, 
filter by the other key columns, and SUM() all 32 of the other columns. 
The execution plan shows the primary key index limits row selection 
efficiently, but the query appears CPU bound in performing all of those 
32 SUM() aggregates.
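
For illustration, the query shape is roughly the following (a sketch; table and
column names are hypothetical, and only 3 of the 32 summed columns are shown):

SELECT SUM(val01), SUM(val02), SUM(val03) -- ... and so on through val32
FROM hourly_agg
WHERE bucket_ts >= '2016-01-01' AND bucket_ts < '2016-01-08'
  AND key_a = 'foo' AND key_b = 'bar' AND key_c = 'baz';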


I am looking at a couple of distributed PostgreSQL forks, but until 
those reach feature parity with 9.5 I am hoping to stay with single node 
PostgreSQL.


Are there any other options I can use to improve query times?

Server is 64GB RAM, with work_mem set to 1GB. All SSD storage, with 
separate RAID-10 volumes for OS, data, and indexes. Additional settings 
beyond the defaults are as follows.


~~~
default_statistics_target = 500
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
effective_cache_size = 48GB
work_mem = 1GB
wal_buffers = 16MB
checkpoint_segments = 128
shared_buffers = 16GB
max_connections = 20
~~~

Re: [GENERAL] pg_upgrade failing from 9.3 to 9.4 because "template0" already exists

2015-03-06 Thread Matt Landry

On 03/06/2015 02:43 PM, Stephen Frost wrote:

Right, as I mentioned, template0 shouldn't have datallowconn as 'true'.
That's why it's being included in the pg_dumpall.

On your test setup, run (as superuser):

update pg_database set datallowconn = false where datname = 'template0';

Then re-run the pg_upgrade.
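
As a quick sanity check before re-running pg_upgrade, the query from earlier in
the thread should now show datallowconn = f for template0:

select datname, datallowconn from pg_database where datname = 'template0';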


Glory and hallelujah, it works. 9.4 is now confirmed working, and all is well.

Thanks much for your help.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_upgrade failing from 9.3 to 9.4 because "template0" already exists

2015-03-06 Thread Matt Landry

On 03/06/2015 01:55 PM, Adrian Klaver wrote:


So on the original cluster, log in using psql and do \l and post the
results here. Thanks.

[...]

Meant to add to previous post, to check with issue that Stephen
mentioned do:

select datname, datallowconn from pg_database ;




postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 reporting | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)

postgres=# select datname, datallowconn from pg_database ;
  datname  | datallowconn
-----------+--------------
 template1 | t
 template0 | t
 postgres  | t
 reporting | t
(4 rows)

postgres=#



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] pg_upgrade failing from 9.3 to 9.4 because "template0" already exists

2015-03-06 Thread Matt Landry

On 03/06/2015 12:37 PM, Adrian Klaver wrote:

Agreed, I am just trying to figure out how you get:

CREATE DATABASE "template0" WITH TEMPLATE = template0 ..

Seems to be a snake eating its tail:)


Yes. It does. And it's pretty obvious why having this would be a 
problem...not quite so obvious how to not have it.


For the record, I inherited this DB configuration, but I can say with 
certainty that we have _one_ actual database with data in it, and it's 
not "template0". If it turns out that, in order to do what I need to do, 
we need to either drop (or rename, or whatever) template0 or somehow 
manually exclude the attempt to upgrade template0, and that for some 
reason we have local modifications to that template that will then have 
to be re-done, then that's a totally acceptable solution, from my 
perspective, as long as we (that is, me and all the people who read the 
pg_upgrade documentation in the future) know how to do that.


But figuring out how it managed to get into this condition in the first 
place, and how pg_upgrade can be made to programmatically cope with 
that, is probably more important for all the users who aren't me. :)



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] pg_upgrade failing from 9.3 to 9.4 because "template0" already exists

2015-03-06 Thread Matt Landry
Attempting to upgrade a large (>3TB) PostgreSQL database from 9.3 to 
9.4 on Ubuntu 14.04 LTS, but the process fails fairly early on. The 
error message instructs me to look at the last few lines of 
pg_upgrade_utility.log for more info, and the last two lines there (the 
only ones that don't succeed) are:


CREATE DATABASE "template0" WITH TEMPLATE = template0 OWNER = "postgres";
psql:pg_upgrade_dump_globals.sql:44: ERROR:  database "template0" 
already exists


For reference, the pg_upgrade command as I'm running it is:

/usr/lib/postgresql/9.4/bin/pg_upgrade -k -b /usr/lib/postgresql/9.3/bin 
-B /usr/lib/postgresql/9.4/bin -d /DB/postgres/data/9.3/main -D 
/DB/postgres/data/9.4/main -o '-c 
config_file=/etc/postgresql/9.3/main/postgresql.conf' -O '-c 
config_file=/etc/postgresql/9.4/main/postgresql.conf'


Searching google for useful advice yields...little. (Mostly articles 
about problems upgrading from 8.4 to 9.2, whose solutions -- when 
they're described -- don't help here. Also, there are a number of 
sources saying I should just do a pg_dumpall and then a 
pg_restore...which would be fine, probably, if I could afford to take my 
production database offline for a week.)


I do have a pg_dumpall backup done, and at the moment I'm doing all this 
on a test instance, so I can afford to experiment without risking data 
corruption or excessive downtime on the production server, but the whole 
point of the experimenting is to come up with a procedure that will work 
in production when it's time to do it there. Dump-and-restore definitely 
does not meet that requirement, even if it does eventually recreate a 
working database.


Any ideas? I'm sure I'm missing something obvious, here, but I can't 
seem to find any reference that tells me what it would be.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Localhost vs. Unix Domain Sockets?

2014-08-18 Thread Matt S
> To put it another way, keeping the two sets of names distinct is
incrementally more complex to manage.  Which might be worth it if there
really is any gain.  Is this a "best practice," or is it really a
manifestation of its closely-related cousin, the "silly practice?" :)

It's ultimately up to your use case. I generate my auth maps using CM tools
(i.e. Ansible) so the management overhead is minimal. My web applications
all run as the "deploy" (OS) user, but each have separate DB users
("baltar", "caprica", "leoben", etc.) and those DB users only have access
on the DBs they need to.

From a security perspective, any application compromise (say, a bug in an
SQL driver/lib) will therefore only affect the databases that user can
access, and not all the databases the OS user can access (which could be
many!).




On Tue, Aug 19, 2014 at 9:51 AM, John R Pierce  wrote:

> On 8/18/2014 6:45 PM, Ken Tanzer wrote:
>
>> Thanks.  I'm not really worried about this particular vulnerability, just
>> wondering about the more general idea that having db user name = os user
>> could reduce your security, even if only slightly.  Is it just as
>> conceivable that a vulnerability could come along that was more exploitable
>> only if the two names were _different_?
>>
>
> what I read on that vulnerability, it was talking about dbuser == dbname,
> not os user.   and frankly, I didn't get their rationale for that.
>
>
>
> --
> john r pierce  37N 122W
> somewhere on the middle of the left coast
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Localhost vs. Unix Domain Sockets?

2014-08-18 Thread Matt S
I went through the same process a little while ago - worth reading is the
pg_hba.conf documentation:
http://www.postgresql.org/docs/9.1/static/auth-pg-hba-conf.html

Specifically:

* Don't enable "trust" auth (i.e. any OS user as any DB user) - that's
rarely what you want on a multi-user machine.
* "peer" auth (OS user == DB user name) is typically the way to go in most
cases - you can lock down the DB user further with GRANT privileges as
required.
* Maps (http://www.postgresql.org/docs/9.1/static/auth-username-maps.html)
are useful if you want to allow multiple OS users access to a single DB
user, or if your OS usernames are detached from your DB user names.

I use a peer map=X configuration where my application user(s) (which my app
binaries run under) are mapped to DB user names. Those DB users have
CONNECT privs on their own DBs (only).
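
A minimal sketch of that last point, reusing the example database/user names
from the original post (app1/baltar):

REVOKE CONNECT ON DATABASE app1 FROM PUBLIC;
GRANT CONNECT ON DATABASE app1 TO baltar;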


On Tue, Aug 19, 2014 at 8:00 AM, John R Pierce  wrote:

> On 8/18/2014 4:55 PM, Ken Tanzer wrote:
>
>> So I've got two questions.  One is whether there are any downsides to
>> using sockets, or any "gotchas" to be aware of.  The second is whether
>> there is anything to do to increase the security of sockets?  (e.g.,
>> analogous to encrypting localhost connections with SSL?)  From the little I
>> saw, it sounds like sockets are "just inherently secure," but wanted to
>> confirm that or get another opinion!
>>
>
> localhost is plenty secure, only root can sniff it, and root can su to
> postgres and be in full ownership of your server anyways, so if you
> consider root a security risk, well, there's no cure for that.
>
> unix domain sockets are quite secure too.   they might be slightly faster
> than tcp/ip via localhost, but its probably not enough to matter.
>
>
>
> --
> john r pierce  37N 122W
> somewhere on the middle of the left coast
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


[GENERAL] pg_ident.hba on a single-user, multi-app machine

2014-08-16 Thread Matt Silverlock
Hi all.

Trying to rationalise my pg_hba.conf and pg_ident.conf configuration on a 
Debian/Ubuntu machine where:

* One primary application user (“deploy”) runs web applications
* postgres, nginx, et. al run under their own users
* Using a Unix socket for connecting to PostgreSQL on the same machine (if I 
split the machines up at some point in the future, I’ll just run TCP + SSL w/ 
strict IP filtering)

At the moment I’m using the following approach, where each database user 
(unique per application) only has permissions for its own database. Users are 
mapped to the “deploy” user so that peer authentication can work.

http://pastebin.com/ZAWvnKNW

# file: pg_hba.conf
# TYPE  DATABASE  USER      ADDRESS        METHOD
local   all       deploy                   peer map=appusers
local   all       postgres                 peer
host    all       all       127.0.0.1/32   md5
host    all       all       ::1/128        md5

# file: pg_ident.conf
# MAPNAME   SYSTEM-USERNAME  PG-USERNAME
appusers    deploy           baltar    # represents one application
appusers    deploy           caprica   # second app
# etc...
 
# via Ansible
- name: create app1 database user
  postgresql_user: db=app1 name=baltar priv=ALL
 
- name: create app2 database user
  postgresql_user: db=app2 name=caprica priv=ALL


What are the outstanding risks here? The only ‘likely’ scenario (short of the 
box itself being compromised) is if the app is compromised/flawed (i.e. some 
uncaught SQLi vuln in a lib) then it can drop its own tables, but not the 
tables of any other application running under the same OS user.

(Heck, can you even have multiple applications talking to the same Unix socket?)

Thanks in advance.

Re: [GENERAL] Table checksum proposal

2014-07-24 Thread matt
> On Thu, Jul 24, 2014 at 3:35 AM,  wrote:
>
>> I have a suggestion for a table checksumming facility within PostgreSQL.
>> The applications are reasonably obvious - detecting changes to tables,
>> validating data migrations, unit testing etc.  A possible algorithm is
>> as
>> follows:
>>
>> 1. For each row of the table, take the binary representations of the
>> values and serialise them to CSV.
>> 2. Calculate the MD5 sum of each CSV-serialised row.
>> 3. XOR the row MD5 sums together.
>> 4. CSV-serialise and MD5 a list of representations (of some sort) of the
>> types of the table's columns and XOR it with the rest.
>> 5. Output the result as the table's checksum.
>>
>> Advantages of this approach:
>>
>> 1. Easily implemented using SPI.
>> 2. Since XOR is commutative and associative, order of ingestion of rows
>> doesn't matter; therefore, unlike some other table checksumming methods,
>> this doesn't need an expensive ORDER BY *.  So, this should be pretty
>> much
>> as fast as a SELECT * FROM, which is probably as fast as a table
>> checksum
>> can be.
>> 3. Using a cursor in SPI, rows can be ingested a few at a time.  So
>> memory
>> footprint is low even for large tables.
>> 4. Output has a convenient fixed size of 128 bits.
>>
>> Questions:
>>
>> 1. Should this be a contrib module which provides a function, or should
>> it
>> be a built-in piece of functionality?
>> 2. Is MD5 too heavyweight for this?  Would using a non-cryptographic
>> checksum be worth the speed boost?
>> 3. Is there a risk of different architectures/versions returning
>> different
>> checksums for tables which could be considered identical?  If so, is
>> this
>> worth worrying about?
>>
>
> Hmm - Do you really think we need an extension for something that can be
> done using query as simple as:
>
> select md5(string_agg(md5(c::text), '' order by md5(c::text))) from
> pg_class c;
>
> (of course you can do it on any table, not only pg_class).
>
> If you want to use the xor idea (which make sense), all you need is to
> write xor aggregate.
>
> depesz
>

This is nice and neat but there are some major disadvantages with this
approach:

1. It can't detect differences in types, e.g. converting an INT column to
TEXT will leave the checksum unchanged.
2. The string_agg requires a string with length 32 * (number of rows) to
be created and then MD5ed.  So on a 100m row table that means using 3.2GB
of memory, which seems unnecessarily heavy.
3. You have used an ORDER BY, which adds more memory usage and time cost.
4. The existence of an in-built checksum facility lends some possibility
of a common standard, which is of course a major factor in making a
checksum useful - one can supply a database dump and a list of tables and
"standard" checksums without also supplying sample code.

Matthew


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Table checksum proposal

2014-07-23 Thread matt
I have a suggestion for a table checksumming facility within PostgreSQL. 
The applications are reasonably obvious - detecting changes to tables,
validating data migrations, unit testing etc.  A possible algorithm is as
follows:

1. For each row of the table, take the binary representations of the
values and serialise them to CSV.
2. Calculate the MD5 sum of each CSV-serialised row.
3. XOR the row MD5 sums together.
4. CSV-serialise and MD5 a list of representations (of some sort) of the
types of the table's columns and XOR it with the rest.
5. Output the result as the table's checksum.

Advantages of this approach:

1. Easily implemented using SPI.
2. Since XOR is commutative and associative, order of ingestion of rows
doesn't matter; therefore, unlike some other table checksumming methods,
this doesn't need an expensive ORDER BY *.  So, this should be pretty much
as fast as a SELECT * FROM, which is probably as fast as a table checksum
can be.
3. Using a cursor in SPI, rows can be ingested a few at a time.  So memory
footprint is low even for large tables.
4. Output has a convenient fixed size of 128 bits.

Questions:

1. Should this be a contrib module which provides a function, or should it
be a built-in piece of functionality?
2. Is MD5 too heavyweight for this?  Would using a non-cryptographic
checksum be worth the speed boost?
3. Is there a risk of different architectures/versions returning different
checksums for tables which could be considered identical?  If so, is this
worth worrying about?

I have knocked up some sample code if anyone is interested.

Regards,

Matthew


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Complex Recursive Query

2014-07-23 Thread matt
I wouldn't do this with recursion; plain old iteration is your friend
(yes, WITH RECURSIVE is actually iterative, not recursive...)

The algorithm goes like this:

1. Extend your graph relation to be symmetric and transitive.
2. Assign an integer group id to each node.
3. Repeatedly join the node list to the (extended) relation, updating each
node's group id to be the minimum of the group ids of every node it
touches.
4. Stop when the group ids stop changing.

Here's some example code, using your data:

DROP SCHEMA IF EXISTS test CASCADE;
CREATE SCHEMA test;
SET SEARCH_PATH TO test;

CREATE TABLE graph(key1 TEXT, key2 TEXT);

INSERT INTO graph VALUES
('a', 'x'),
('a', 'y'),
('b', 'w'),
('c', 't'),
('x', 'a'),
('y', 'a'),
('y', 'z'),
('z', 'y'),
('t', 'c'),
('w', 'b'),
('w', 'd'),
('d', 'w');

DO
$$
DECLARE
  prev INT = 0;
  curr INT;
BEGIN
  CREATE TABLE rel AS
  SELECT key1, key2 FROM graph
  UNION
  SELECT key2, key1 FROM graph
  UNION
  SELECT key1, key1 FROM graph
  UNION
  SELECT key2, key2 FROM graph;

  CREATE TABLE group_ids AS
  SELECT
key,
ROW_NUMBER() OVER (ORDER BY key) AS group_id
  FROM
(
  SELECT key1 AS key FROM graph
  UNION
  SELECT key2 FROM graph
) _;

  SELECT SUM(group_id) INTO curr FROM group_ids;
  WHILE prev != curr LOOP
prev = curr;
DROP TABLE IF EXISTS min_ids;
CREATE TABLE min_ids AS
SELECT
  a.key,
  MIN(c.group_id) AS group_id
FROM
  group_ids a
INNER JOIN
  rel b
ON
  a.key = b.key1
INNER JOIN
  group_ids c
ON
  b.key2 = c.key
GROUP BY
  a.key;

UPDATE
  group_ids
SET
  group_id = min_ids.group_id
FROM
  min_ids
WHERE
  group_ids.key = min_ids.key;

SELECT SUM(group_id) INTO curr FROM group_ids;
  END LOOP;

  DROP TABLE IF EXISTS rel;
  DROP TABLE IF EXISTS min_ids;
END
$$;

SELECT * FROM group_ids;


Hope it helps,

Matthew




> I have a collection of relationship rows of the form
>
> Table: graph
> key1 varchar
> key2 varchar
>
> A row of the form ('a','b') indicates that 'a' and 'b' are related.
> The table contains many relationships between keys, forming several
> disjoint sets. All relationships are bi-directional, and both
> directions are present.  I.e. the table contains a set of disjoint
> graphs specified as node pairs.
>
> For example the set of values
>
> key1key2
> -   -
>   a   x
>   a   y
>   b   w
>   c   t
>   x   a
>   y   a
>   y   z
>   z   y
>   t   c
>   w   b
>   w   d
>   d   w
>
> defines three disjoint groups of connected keys:
>
>   a x y z
>   c t
>   b w d
>
> What I would like to achieve is a single SQL query that returns
>
>   group key
>   - ---
> 1a
> 1x
> 1y
> 1z
> 2c
> 2t
> 3b
> 3w
> 3d
>
> I don't care about preserving the node-to-node relationships, only
> the group membership for each node.
>
> I've been playing with "WITH RECURSIVE" CTEs but haven't had any
> success.  I'm not really sure how to express what I want in SQL, and
> it's not completely clear to me that recursive CTEs will help here.
> Also I'm not sure how to generate the sequence numbers for the groups
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Re: [GENERAL] Re: [GENERAL] Fwd: Help!Why CPU Usage and LoadAverage Jump up Suddenly

2013-12-05 Thread Matt Daw
Is khugepaged running during the stalls?
http://www.postgresql.org/message-id/20130716195834.8fe5c79249cb2ff0d4270...@yahoo.es

Matt

On Thu, Dec 5, 2013 at 7:44 AM, Scott Marlowe  wrote:
> On Thu, Dec 5, 2013 at 1:46 AM, 吕晓旭  wrote:
>>
>>
>>
>> Hi, all
>> We've hit a strange problem on our production PostgreSQL system, and I
>> don't know how to resolve it.
>> We deployed PostgreSQL 9.2.4 in two system environments, and their
>> performance is completely different: one of them is perfect, while on the
>> other, CPU usage and load average jump up suddenly as concurrency rises
>> smoothly, and at the same time the average response time becomes unacceptable.
>> Could anyone give me some advice?
>>
>> The parameters of the system environment and PostgreSQL are listed below:
>
> Have you tried  monitoring your IO subsystem when this happens? I'd be
> interested in iostat, vmstat, iotop, and so on to see what the IO
> looks like.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Clone database using rsync?

2013-11-05 Thread matt
Copying the data folder should work as long as you stop the postgres
service on the production server before starting the copy and don't start
it up again until the copy finishes.  pg_dump and pg_restore (look them up
in the online docs) will get the job done without you having to take the
production server offline.

If you go with the folder copy and your installation has postgresql.conf,
pg_hba.conf and so on in your data folder, you'll probably want to edit
them after the copy - more logging, different security etc.

Matt


> I need to clone production database to development  server ? What is the
> best
> and simplest way to achieve that? Both my production and development
> postgres versions are same. Will copy over data folder using rsync work?
>
> Thanks in advance.
>
>
>
> --
> View this message in context:
> http://postgresql.1045698.n5.nabble.com/Clone-database-using-rsync-tp5777141.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] INSERT/UPDATE statements sometimes choosing not to use PK index?

2013-10-29 Thread Matt
Since this is an upsert, the staging table simply accepts a copy of
pipe-delimited data via a COPY FROM STDIN, and a subsequent SQL script
performs the UPDATE/INSERT. The staging table is then truncated (or emptied
with an unqualified DELETE) for the next run: truncate staging, COPY into staging,
update then insert destination from staging, repeat on the next 5-minute clock
cycle.

Since the staging table is fully loaded into the destination table, a full
scan was not a concern, but the lack of index usage on the destination. For
example, this is normally supported by the PK index, and performs well:

INSERT INTO destination (…)
SELECT (…) FROM staging
WHERE NOT EXISTS (SELECT * FROM destination WHERE destination.id = staging.id)

The above runs in expected time when the explain plan shows an index scan
on the destination primary key index, and a seq scan on the staging table.
This will continue for many runs, until something causes the engine to stop
using the destination PK index, and begin scanning both the destination and
staging. What triggers this is the interesting question.The only system
event I can (loosely) correlate with the problem is the start of a nightly
pg_dump, but as a read-only process, this would not be changing any index
stats, correct?

An ANALYZE on staging is possible after each load cycle, but time does not
permit on the destination table. I have been focusing on the destination
because it has the index that is not being used. Will stats on the staging
table affect index selection on the destination in a case like this?
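
For concreteness, a sketch of the load cycle with that ANALYZE step added (table
and column names are simplified placeholders):

TRUNCATE staging;
COPY staging (id, col01) FROM STDIN WITH (DELIMITER '|');
-- pipe-delimited rows follow, terminated by \.
ANALYZE staging;  -- refresh planner stats on the freshly loaded rows

UPDATE destination d SET col01 = s.col01
FROM staging s WHERE s.id = d.id;

INSERT INTO destination (id, col01)
SELECT s.id, s.col01 FROM staging s
WHERE NOT EXISTS (SELECT 1 FROM destination d WHERE d.id = s.id);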

In the process of attempting to diagnose this, both tables involved have
been vacuumed (full), analyzed. I have also moved staging to an SSD volume,
and created an equivalent index on staging - which is not used in any plan,
nor do I expect to as there is no filter criteria on staging, and the index
maintenance on staging would seem to be an unneeded overhead. But in this
case, is there any advantage to an index on staging?

For reference, it is possible (not enough test cycles to verify) that left
anti-join makes this situation worse, even though the explain plans appear
identical:

INSERT INTO destination (…)
SELECT (…) FROM staging
LEFT JOIN destination ON destination.id = staging.id
WHERE destination.id IS NULL


On 29 Oct 2013, at 9:45, Tom Lane wrote:

> Matt  writes:
>> In most cases, EXPLAIN and runtime tell me the index is utilized. However,
>> sometimes on back-to-back runs (5 minute intervals) the runtime explodes and
>> EXPLAIN shows me that the PK index is not used, as both the staging table
>> and destination table are sequentially scanned.
>
> You haven't explained what you do with the "staging table", but maybe you
> need to ANALYZE it after you've loaded fresh data into it.  The stats for
> the main table are presumably somewhat stable, but the stats for the
> staging table maybe not.  The apparent correlation to consecutive runs
> lends some support to this theory, as that would reduce the time window
> for auto-ANALYZE to fix it for you.
>
> regards, tom lane


[GENERAL] INSERT/UPDATE statements sometimes choosing not to use PK index?

2013-10-28 Thread Matt
I have a relatively simple data load script, which upserts (UPDATE existing
rows, INSERT new rows), which should be supported by the primary key index,
the only index on this table:

UPDATE destination SET ... FROM staging WHERE staging.pk =
destination.pk;
INSERT INTO destination SELECT ... FROM staging WHERE NOT EXISTS(SELECT
* FROM destination WHERE pk  = staging.pk);

In most cases, EXPLAIN and runtime tell me the index is utilized. However,
sometimes on back-to-back runs (5 minute intervals) the runtime explodes and
EXPLAIN shows me that the PK index is not used, as both the staging table
and destination table are sequentially scanned. The source data maintains the
same pattern, and this load job is the only write activity in this
particular database.

A left anti-join in the above pseudo-query explains the same, and seems to
make the problem occur more frequently.

What could cause PostgreSQL (9.1) to stop using an index in a case like
this, when it normally and expectedly uses it?


Re: [GENERAL] Question about using AggCheckCallContext in a C function

2013-08-13 Thread Matt Solnit
On Aug 12, 2013, at 12:47 PM, Tom Lane  wrote:

> Matt Solnit  writes:
>> 2. The function seems to work consistently when I do a SELECT
>> SUM(mycol) without any GROUP BY.  It's only when I add grouping that
>> the failures happen.  I'm not sure if this is a real clue or a red
>> herring.
> 
> That isn't enormously surprising, since the memory management for
> the transition values is different in the two cases.
> 
>> Finally, can you tell me what precisely happens when you call
>> datumCopy() with ArrayType?  If it's only returning a copy of
>> the TOAST reference, then how is it safe for the transition function
>> to modify the content?  I'm probably *completely* misunderstanding
>> how this works, so I would love to be enlightened :-).
> 
> You're right, datumCopy() won't expand a TOAST reference.  What does
> expand it is PG_GETARG_ARRAYTYPE_P().  So if you have a case where the
> system picks up a copy of an array input that happens to be toasted,
> it's the GETARG step in the next invocation of the aggregate transition
> function that expands the TOAST reference, and then after that you have an
> in-memory copy that's safe to modify.  Maybe you're missing that somehow?
> The code fragment you showed looked okay but ...
> 
>   regards, tom lane

I think I figured it out.  The problem is this line:

  Datum *arrayData1, *arrayData2;

Datum* was correct when I first started this journey, using deconstruct_array(),
but is incorrect when accessing the array's content directly using
ARR_DATA_PTR().  Changing these to int* fixes the problem, at least
on all the systems I've tried so far.

I've been wondering why the broken code worked without a GROUP BY,
and I think it was just dumb luck.  With no GROUP BY, I was only
overrunning a single buffer, and maybe the effects were not
immediately apparent.  With GROUP BY, however, there's a buffer
overrun for each group, and each one increases the chance of doing
something catastrophic.

Sincerely,
Matt Solnit

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Question about using AggCheckCallContext in a C function

2013-08-12 Thread Matt Solnit
On Aug 12, 2013, at 11:53 AM, Tom Lane 
 wrote:

> Matt Solnit  writes:
>> After poring over the code in nodeAgg.c, and looking at the int8inc()
>> function, I think I know what the problem is:  the typical use of
>> AggCheckCallContext() is not compatible with TOAST-able data types.
> 
> That's nonsense.  There are several standard aggregates that use
> that with array transition values.
> 
> Personally, I'd wonder about the blind-faith assumption in your code that
> all the input arrays are exactly the same length, with no NULL elements.
> At the very least a check for that would seem advisable.  An empty
> (zero-dimensional) array could also make this code crash, so I'd be
> inclined to put in a check that ARR_NDIM() is 1, too.
> 
>   regards, tom lane

Thanks for your reply, albeit gruffly-worded :-).  I'm certainly a
novice to the PostgreSQL source code, so I'm not surprised that my
hypothesis was wrong.

Regarding the assumptions, I am perfectly okay with them because I
have complete control over the inputs.  We're using this function
with a very precise data set.  I did, however, take a moment to
verify that *every* value in the table matches my assumptions, and
it does.

So where do I go from here?

Additional information that might be helpful:

1. When the crash occurs, and I inspect using gdb, I consistently
find that the first array's contents are "garbage".  For example:

  (gdb) print array1->dataoffset
  $6 = -1795162110

  (gdb) print array1->ndim
  $9 = 989856262

while the second array looks fine:

  (gdb) print array2->dataoffset
  $7 = 0

  (gdb) print array2->ndim
  $10 = 1

2. The function seems to work consistently when I do a SELECT
SUM(mycol) without any GROUP BY.  It's only when I add grouping that
the failures happen.  I'm not sure if this is a real clue or a red
herring.

Finally, can you tell me what precisely happens when you call
datumCopy() with ArrayType?  If it's only returning a copy of
the TOAST reference, then how is it safe for the transition function
to modify the content?  I'm probably *completely* misunderstanding
how this works, so I would love to be enlightened :-).

Sincerely,
Matt Solnit


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Question about using AggCheckCallContext in a C function

2013-08-12 Thread Matt Solnit
Hi everyone.  A while back, I wrote a function to sum array contents (think
1-D matrix addition), and am using it in a custom SUM aggregate.  Here's an
example:

CREATE TABLE foo (arr INT[]);
INSERT INTO foo VALUES ('{1, 2, 3}'), ('{4, 5, 6}');

SELECT SUM(arr) FROM foo;
   sum   
---------
 {5,7,9}
(1 row)

This works, but I got some great advice from Craig Ringer about how I can
make it even faster using AggCheckCallContext().  See 
http://stackoverflow.com/a/16996606/6198

I'm trying to implement this now, and I'm running into occasional memory
corruption.  Sometimes the back-end segfaults, sometimes I get seemingly-
random "ERROR:  could not open relation with OID xyz" errors.

Here's the relevant code snippet:

// Assumptions:
// 1. We will never be called with a null array.  The CREATE FUNCTION call 
should specify STRICT to prevent PostgreSQL from doing this.
// 2. The arrays will never contain null values.  It's up to the caller to 
ensure this.
// 3. The arrays will always be the same length.  It's up to the caller to 
ensure this.

ArrayType *array1, *array2;
Datum *arrayData1, *arrayData2;
int arrayLength1;

array2 = PG_GETARG_ARRAYTYPE_P(1);

if (AggCheckCallContext(fcinfo, NULL))
{
  // We are being called by an aggregate.

  if (PG_ARGISNULL(0))
  {
// This can happen on the very first call as PostgreSQL sets up the 
"temporary transition value" for the aggregate.

// NOTE: This never seems to happen!  Is it because the function is STRICT?

// Return a copy of the second array.
PG_RETURN_ARRAYTYPE_P(copy_intArrayType(array2));
  }
  else
  {
// This means that the first array is a "temporary transition value", and 
we can safely modify it directly with no side effects.
// This avoids the overhead of creating a new array object.

array1 = PG_GETARG_ARRAYTYPE_P(0);

// Get a direct pointer to each array's contents.
arrayData1 = ARR_DATA_PTR(array1);
arrayData2 = ARR_DATA_PTR(array2);

// Get the length of the first array (should be the same as the second).
arrayLength1 = (ARR_DIMS(array1))[0];

// Add the contents of the second array to the first.
for (i = 0; i < arrayLength1; i++)
{
  arrayData1[i] += arrayData2[i];
}

// Return the updated array.
PG_RETURN_ARRAYTYPE_P(array1);
  }
}
else
{
  // We are not being called by an aggregate.
  // 
} 

After poring over the code in nodeAgg.c, and looking at the int8inc()
function, I think I know what the problem is:  the typical use of
AggCheckCallContext() is not compatible with TOAST-able data types.

When the state transition function is STRICT (as mine is), and
advance_transition_function() comes across the first non-NULL value,
it makes a copy using datumCopy().  However, from what I can tell,
datumCopy() is not really "compatible" with TOAST for this
particular use case.

Am I on the right track here?  If so, what's the best way to solve
this problem?  I would appreciate any help you can offer.

Sincerely,
Matt Solnit 

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Differences in Unicode handling on Mac vs Linux?

2013-06-02 Thread Matt Daw
> Hm ... what does "\d shots" say about the spelling of the column name?

\d shots is the same on both systems:

 sg_poznÁmka  | text |


> OS X's Unicode locales are pretty crummy.  I'm suspicious that there's
> some sort of case-folding inconsistency here, but it's hard to say more
> (especially since you didn't actually tell us *which* locales you've
> selected on each machine).  If it is that, as a short-term fix it might
> help to double-quote the column name.

The locales are set to "en_US.UTF-8" and encodings to "UTF8". Double
quoting does solve the column case (see the example below the \d output),
but it's not helping with the query Rails generates:

SELECT a.attname, format_type(a.atttypid, a.atttypmod), d.adsrc, a.attnotnull
  FROM pg_attribute a LEFT JOIN pg_attrdef d
ON a.attrelid = d.adrelid AND a.attnum = d.adnum
 WHERE a.attrelid =
'asset_sg_kdo_děláassigned_to__connections'::regclass
   AND a.attnum > 0 AND NOT a.attisdropped
 ORDER BY a.attnum

... that produces:

ERROR:  relation "asset_sg_kdo_d�l�assigned_to__connections" does not exist

\d produces:

 public | asset_sg_kdo_děláassigned_to__connections
| table| matt
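
For reference, the double-quoted form that does work for the column case is:

select * from shots where "sg_poznÁmka" is NULL;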


For the short term, I think I'll boot up a Linux VM to troubleshoot my
production bug... but I'll submit a bug report with repro steps.

Thanks Tom!

Matt


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Differences in Unicode handling on Mac vs Linux?

2013-06-02 Thread Matt Daw
Howdy, I loaded a client's DB on my Mac to debug an unrelated bug, but
I'm blocked because my Mac is rejecting SQL that works on our Linux
production servers. Here's a simple case:

# select * from shots where sg_poznÁmka is NULL;
ERROR:  column "sg_pozn�mka" does not exist
LINE 1: select * from shots where sg_poznÁmka is NULL;

... as far as I can tell, all my encodings are consistent on both
sides, I've checked LC_COLLATE, LC_CTYPE, client_encoding,
server_encoding and the database encodings. I'm running 9.0.13 on both
machines (and I tried 9.2.4 on my Mac).

Anything else I could double-check? Or are there any known Mac-related
Unicode issues?

Thanks!

Matt


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID

2013-05-16 Thread Matt Brock
So a week after asking our HP dealer, they've finally replied to say that they 
can't tell us what manufacturer and model the SSDs are because "HP treat this 
information as company confidential". Not particularly helpful.

They have at least confirmed that the drives have "surprise power loss 
protection" and "tools to present information on the percentage of life used 
and amount of life remaining under the workload-to-date".

Given that these are enterprise class drives, and given that they have the high 
availability features that we would need in database servers, and given that 
the deadline on this project is very tight so I don't really have time to do 
any testing on third-party drives, I'm guessing we'll go with the HP drives, 
even though they most likely are a little behind the times. Whilst we will 
perhaps lose a little performance compared to the latest Intel
drives, we will gain in terms of high-availability reassurance and simplicity
of deployment, which is crucial for this project given its tight deadline.
However, after going through all the advice on this thread and having had time 
to think, I'll probably go for a four-disk RAID 10 array with SLCs, rather than 
a two-disk RAID 1 array with MLCs (for the OS) and a two-disk RAID 1 array with 
SLCs (for the database).

If I had more time and resources for testing I would likely end up going a 
different route, however.

Many thanks to all who've contributed their thoughts and opinions - much 
appreciated.

Matt.

On 13 May 2013, at 14:49, Merlin Moncure  wrote:

> On Sun, May 12, 2013 at 8:20 PM, John R Pierce  wrote:
>> On 5/12/2013 6:13 PM, David Boreham wrote:
>>> 
>>> 
>>> Not quite. More like : a) I don't know where to buy SLC drives in 2013
>>> (all the drives for example for sale on newegg.com are MLC) and b) today's
>>> MLC drives are quite good enough for me (and I'd venture to say any
>>> database-related purpose).
>> 
>> 
>> Newegg wouldn't know 'enterprise' if it bit them.   they just sell mass
>> market consumer stuff and gamer kit.
>> 
>> the real SLC drives end up OEM branded in large SAN systems, such as sold by
>> Netapp, EMC, and are made by companies like STEC that have zero presence in
>> the 'whitebox' resale markets like Newegg.
> 
> 
> The industry decided a while back that MLC was basically the way to go
> in terms of cost and engineering trade-offs, at least in cases where
> you needed a lot of storage. Yes, you can still get SLC in mid-tier
> and up storage but:
> 
> *) a lot of these drives are simply re-branded intel etc
> *) When it comes to SSD, I have zero confidence in vendor provided
> hardware specs (lifetime, iops, etc).  The lack of 3rd party test
> coverage and performance benchmarking is a big problem for me.  Ever
> bought a SAN and have had it not do what it was supposed to?
> *) The faster moving white box market has chosen MLC.  Three years
> back, the jury was still out.  This suggests to me that SAN vendors
> are still behind the curve in terms of SSD, which is typical of
> enterprise storage vendors. But,
> *) In many cases, the performance of the latest MLC drives is so fast
> that many applications that would have needed to scale up to high end
> storage would no longer need to do so.   A software raid of say for
> s3700 drives would probably outperform most <100k SANs from a couple
> years back.
> 
> merlin
> 
> 
> -- 
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
> 



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID

2013-05-11 Thread Matt Brock
On 10 May 2013, at 16:25, David Boreham  wrote:

> I've never looked at SLC drives in the past few years and don't know anyone 
> who uses them these days.

Because SLCs are still more expensive? Because MLCs are now almost as good as 
SLCs for performance/endurance?

I should point out that this database will be the backend for a 
high-transaction gaming site with very heavy database usage including a lot of 
writes. Disk IO on the database server has always been our bottleneck so far. 

Also, the database is kept comparatively very small - about 25 GB currently, 
and it will grow to perhaps 50 GB this year as a result of new content and 
traffic coming in.

So whilst MLCs might be almost as good as SLCs now, the price difference for us 
is so insignificant that if we can still get a small improvement with SLCs then 
we might as well do so.

> Could you post some specific drive models please ? HP probably doesn't make 
> the drives, and it really helps to know what devices you're using since they 
> are not nearly as generic in behavior and features as magnetic drives.

I've asked our HP dealer for this information since unfortunately it doesn't 
appear to be available on the HP website - hopefully it will be forthcoming at 
some point.

Matt.




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID

2013-05-10 Thread Matt Brock
After googling this for a while, it seems that High Endurance MLC is only 
starting to rival SLC for endurance and write performance in the very latest, 
cutting-edge hardware. In general, though, it seems it would be fair to say 
that SLCs are still a better bet for databases than MLC?

The number and capacity of drives is small in this instance, and the price 
difference between the two for HP SSDs isn't very wide, so cost isn't really an 
issue. We just want to use whichever is better for the database.

On 10 May 2013, at 15:20, Merlin Moncure  wrote:

> On Fri, May 10, 2013 at 9:14 AM, Matt Brock  wrote:
>> Hello.
>> 
>> We're intending to deploy PostgreSQL on Linux with SSD drives which would be 
>> in a RAID 1 configuration with Hardware RAID.
>> 
>> My first question is essentially: are there any issues we need to be aware 
>> of when running PostgreSQL 9 on CentOS 6 on a server with SSD drives in a 
>> Hardware RAID 1 configuration? Will there be any compatibility problems 
>> (seems unlikely)? Should we consider alternative configurations as being 
>> more effective for getting better use out of the hardware?
>> 
>> The second question is: are there any SSD-specific issues to be aware of 
>> when tuning PostgreSQL to make the best use of this hardware and software?
>> 
>> The specific hardware we're planning to use is the HP ProLiant DL360 Gen8 
>> server with P420i RAID controller, and two MLC SSDs in RAID 1 for the OS, 
>> and two SLC SSDs in RAID 1 for the database - but I guess it isn't necessary 
>> to have used this specific hardware setup in order to have experience with 
>> these general issues. The P420i controller appears to be compatible with 
>> recent versions of CentOS, so drivers should not be a concern (hopefully).
> 
> The specific drive models play a huge impact on SSD performance.  In
> fact, the point you are using SLC drives suggests you might be using
> antiquated (by SSD standards) hardware.   All the latest action is on
> MLC now (see here:
> http://www.intel.com/content/www/us/en/solid-state-drives/solid-state-drives-dc-s3700-series.html).
> 
> merlin
> 
> 
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
> 



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Deploying PostgreSQL on CentOS with SSD and Hardware RAID

2013-05-10 Thread Matt Brock
Hello.

We're intending to deploy PostgreSQL on Linux with SSD drives which would be in 
a RAID 1 configuration with Hardware RAID.

My first question is essentially: are there any issues we need to be aware of 
when running PostgreSQL 9 on CentOS 6 on a server with SSD drives in a Hardware 
RAID 1 configuration? Will there be any compatibility problems (seems 
unlikely)? Should we consider alternative configurations as being more 
effective for getting better use out of the hardware?

The second question is: are there any SSD-specific issues to be aware of when 
tuning PostgreSQL to make the best use of this hardware and software?

The specific hardware we're planning to use is the HP ProLiant DL360 Gen8 
server with P420i RAID controller, and two MLC SSDs in RAID 1 for the OS, and 
two SLC SSDs in RAID 1 for the database - but I guess it isn't necessary to 
have used this specific hardware setup in order to have experience with these 
general issues. The P420i controller appears to be compatible with recent 
versions of CentOS, so drivers should not be a concern (hopefully).

Any insights anyone can offer on these issues would be most welcome.

Regards,

Matt.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] regex help wanted

2013-04-28 Thread matt
> On 2013-04-25, Karsten Hilbert  wrote:
>> On Thu, Apr 25, 2013 at 10:32:26AM -0400, Tom Lane wrote:
>>
>>> Karsten Hilbert  writes:
>>> > What I don't understand is: Why does the following return a
>>> > substring ?
>>>
>>> >   select substring ('junk $<allergy::test::99>$ junk' from
>>> '\$<[^<]+?::[^:]+?>\$');
>>>
>>> There's a perfectly valid match in which [^<]+? matches allergy::test
>>> and [^:]+? matches 99.
>>
>> Tom, thanks for helping !
>>
>> I would have thought "<[^<]+?:" should mean:
>>
>>  match a "<"
>>  followed by 1-n characters as long as they are not "<"
>>  until the VERY NEXT ":"
>
>
> if you want that say:  "<[^<:]+:"
>
>> The "?" should make the "+" after "[^<]" non-greedy and thus
>> stop at the first occurrence of ":", right ?  Or am I
>> misunderstanding that part ?

Greediness and non-greediness of operators are like hints - they are only
honoured if there is a choice in the matter.  In your case, if the
<[^<]+?: stopped at the first ":", it would be impossible to match the
rest of the pattern.
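
To illustrate with the test string from Tom's description (a quick sketch):

-- the original pattern still matches, because the engine can backtrack past
-- the first ':' to satisfy the rest of the pattern:
select substring('junk $<allergy::test::99>$ junk' from '\$<[^<]+?::[^:]+?>\$');

-- with ':' excluded from the class, as suggested above, there is no longer a
-- choice, so the match fails and NULL is returned:
select substring('junk $<allergy::test::99>$ junk' from '\$<[^<:]+?::[^:]+?>\$');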


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Maintaining state across function calls

2012-11-19 Thread matt
> m...@byrney.com writes:
>> The question is: what's the "best practice" way of letting a
>> C/C++-language function hang onto internal state across calls?
>
> A static variable for that is a really horrid idea.  Instead use
> fcinfo->flinfo->fn_extra to point to some workspace palloc'd in the
> appropriate context.  If you grep the PG sources for fn_extra you'll
> find plenty of examples.
>
>   regards, tom lane
>

Thanks for this.  Out of curiosity, why is a static a bad way to do this?


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Maintaining state across function calls

2012-11-19 Thread matt
> On 11/19/2012 08:41 PM, m...@byrney.com wrote:
>> I want to process all the records in a table through a C-language (well,
>> C++) function (i.e. one function call per row of the table) in such a
>> way
>> that the function hangs onto its internal state across calls.  Something
>> like
>>
>> SELECT my_function(a, b, c) FROM my_table ORDER BY d;
>>
>> The value returned in the last row of the table would be the result I'm
>> looking for.  (This could be neatened up by using a custom aggregate and
>> putting my calculation in the sfunc but that's a minor detail).
> [snip]
>> Alternatively, use this in a custom aggregate and make the ffunc do the
>> garbage collection, which should prevent leakage altogether.
> You don't generally need to do this cleanup yourself. Use appropriate
> palloc memory contexts and it'll be done for you when the memory context
> is destroyed.
>
> I would want to implement this as an aggregate using the standard
> aggregate / window function machinery. Have a look at how the existing
> aggregates like string_agg are implemented in the Pg source code.

Thanks for your reply.  A follow-up question: to use the palloc/pfree
functions with a C++ STL container, do I simply give the container an
allocator which uses palloc and pfree instead of the default allocator,
which uses new and delete?

Matt


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Maintaining state across function calls

2012-11-19 Thread matt
I want to process all the records in a table through a C-language (well,
C++) function (i.e. one function call per row of the table) in such a way
that the function hangs onto its internal state across calls.  Something
like

SELECT my_function(a, b, c) FROM my_table ORDER BY d;

The value returned in the last row of the table would be the result I'm
looking for.  (This could be neatened up by using a custom aggregate and
putting my calculation in the sfunc but that's a minor detail).

The question is: what's the "best practice" way of letting a
C/C++-language function hang onto internal state across calls?  So far I'm
thinking something along the lines of:

Datum my_function(int a, int b, int c, int reset)
{
  static my_data *p = NULL;
  if (reset) //(re)initialise internal state
  {
delete p;
p = NULL;
  }
  else
  {
if (!p)
{
  p = new my_data;
}
//make use of internal state to do calculations or whatever
  }
}

The user would be responsible for calling my_function with "reset" set to
true to wipe previous internal state before using the function in a new
query; doing this also frees the memory associated with the function. 
This system is of course prone to leakage if the user forgets to wipe the
internal state after use, but it will only leak sizeof(my_data) per
connection, and the OS will garbage-collect all that when the connection
dies anyway.

Alternatively, use this in a custom aggregate and make the ffunc do the
garbage collection, which should prevent leakage altogether.

Is this a reasonable thing to do?  What are the risks?  Is there a more
"best-practice" way to achieve the same result?

Many thanks,

Matt
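
(For the archive, the aggregate wrapping mentioned above might look roughly
like this; my_sfunc and my_ffunc are hypothetical C functions that keep the
running state in an internal-typed transition value:)

CREATE AGGREGATE my_agg(integer, integer, integer) (
    SFUNC     = my_sfunc,    -- hypothetical: called once per row, carries the state forward
    STYPE     = internal,    -- opaque state managed by the C functions
    FINALFUNC = my_ffunc     -- hypothetical: produces the result and cleans up the state
);

-- aggregate-level ORDER BY needs 9.0 or later:
SELECT my_agg(a, b, c ORDER BY d) FROM my_table;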




[GENERAL] pg_hba.conf directory?

2012-11-08 Thread Matt Zagrabelny
Hello,

I've searched the mailing list archives and google regarding using a
directory to contain pg_hba.conf snippets. Does such a feature exist
for any version of PG?

Would this be a better question for a pg dev mailing list?

Please Cc me, I am not (yet) subscribed to the list.

Thanks!

-Matt Zagrabelny




[GENERAL] streaming replication and data file consistency

2012-10-22 Thread Matt Savona
Hi all,

I am currently running Postgresql 9.2.1 with streaming replication: one
primary, one standby.  Once an hour I have a job which compares
pg_current_xlog_location on the primary against
pg_last_xlog_replay_location on the standby to ensure the standby is not
lagging too far behind the primary. So far everything is working great.
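
(For the archive, a rough sketch of that kind of check; on 9.2 the difference
can also be expressed in bytes with pg_xlog_location_diff. The literal
location below is just a placeholder for a value captured on the primary:)

-- on the primary:
SELECT pg_current_xlog_location();

-- on the standby:
SELECT pg_last_xlog_replay_location();

-- on the standby, lag in bytes behind the captured primary location:
SELECT pg_xlog_location_diff('0/3000000', pg_last_xlog_replay_location());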

I noticed, however, that despite the fact that the cluster is consistently
in sync the md5sums and modified timestamps on many of my data files
differ. For example:

PRIMARY

# stat pgsql/data/base/16385/17600
  File: `pgsql/data/base/16385/17600'
  Size: 3112960 Blocks: 6080   IO Block: 4096   regular file
Device: fd02h/64770dInode: 39167976Links: 1
Access: (0600/-rw---)  Uid: (   26/postgres)   Gid: (   26/postgres)
Access: 2012-10-22 10:05:29.314607927 -0400
Modify: 2012-10-22 09:48:03.770209170 -0400
Change: 2012-10-22 09:48:03.770209170 -0400

# md5sum pgsql/data/base/16385/17600
5fb7909ea14ab7aa9636b31df5679bd4  pgsql/data/base/16385/17600

STANDBY

# stat pgsql/data/base/16385/17600
  File: `pgsql/data/base/16385/17600'
  Size: 3112960 Blocks: 6080   IO Block: 4096   regular file
Device: fd02h/64770dInode: 134229639   Links: 1
Access: (0600/-rw---)  Uid: (   26/postgres)   Gid: (   26/postgres)
Access: 2012-10-22 10:05:25.361235742 -0400
Modify: 2012-10-22 09:50:29.674567827 -0400
Change: 2012-10-22 09:50:29.674567827 -0400

# md5sum pgsql/data/base/16385/17600
9deeb7b446c12fbb5745d4d282113d3c  pgsql/data/base/16385/17600

The reason I am curious about this is because when both systems are healthy
and I wish to swap primaries, I will bring the primary and the standby down
and do a full rsync of the data/ directory from old primary to new primary.
However, because the data files are different, the rsync run takes a very
long time.

My questions are:
  1) While the xlog location between primary and standby remains
consistent, are the data files, internally, structured differently between
primary and standby?
  2) Is this expected, and if so, what causes them to diverge?

Thanks in advance for helping me understand this behavior!

- Matt


Re: [GENERAL] indexes no longer used after shutdown during reindexing

2012-01-20 Thread Matt Dew

On 01/13/2012 02:49 PM, Tomas Vondra wrote:

On 13.1.2012 22:20, Tom Lane wrote:

Matt Dew  writes:

An interesting sidenote we realized.  the nice system shutdown script
/etc/init.d/postgres doesn't actually wait for the db to be down, it
just waits for pg_ctl to return.


By default, "pg_ctl stop" does wait for the server to shut down ...


Not really. It waits for up to 60 seconds and if the shutdown was not
successful (as there was a connected client), it prints a message to the log

pg_ctl: server does not shut down
HINT: The "-m fast" option immediately disconnects sessions
rather than

and returns 1.

If you really need to wait for shutdown, you need to add "-w" to the
command line, use "-m fast" or "-m immediate".

But even ignoring the return value should not cause corruption IMHO.


Thanks Tom and Tomas,
I remember -w   now,  but I'd long forgotten about it.
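
(For the archive, the waiting form looks roughly like this; the 60-second
timeout is just illustrative:)

pg_ctl stop -D "$PGDATA" -m fast -w -t 60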

If pg_ctl returns a 1 but the machine physically powers off, there 
is a chance for corruption though, right?  Postgres is trying to write 
stuff to disk and clean up and BAM, power goes out?


There is a chance for corruption though if the machine physically powers 
off after the pg_ctl return





Re: [GENERAL] indexes no longer used after shutdown during reindexing

2012-01-13 Thread Matt Dew

On 01/12/2012 01:21 PM, Tom Lane wrote:

Matt Dew  writes:

On 01/11/2012 04:29 PM, Tom Lane wrote:

What exactly is your definition of a "clean shutdown"?

Is a reboot command considered a clean shutdown?  It's a redhat box
which called /etc/init.d/postgresql stop, which does:  pg_ctl stop -D
'$PGDATA' -s -m fast

Well, a fast-mode stop would abort the reindex operation, but that
should certainly have left the catalog entries in the same state as
before, so there's no obvious reason here why the indexes would've
stopped being used.


We're using v8.3.9

That's a tad old.  Please consult
http://www.postgresql.org/docs/8.3/static/release.html
for reasons why an update might be a good idea.  I don't recall any
8.3.x bugs that might be related to this, but I haven't trawled the
commit logs to see what I've forgotten, either.


I'm in a rabbit hole. I dug in more and learned that that problem may
have existed before the shutdown.  I believe the root problem is still
the same though; having to recreate the table to get it to use indexes.

Hmm.  If that's the case then we don't have to explain how an aborted
reindex operation could have affected the usability of the old indexes,
so I'm inclined to believe that it didn't.  Which seems to mean that you
have a garden variety "why won't the planner use my index" issue, not
something unusual.  If you no longer have the original table then it may
be impossible to investigate further; but if you can recreate the state
where it's not using the index(es), please see
http://wiki.postgresql.org/wiki/Slow_Query_Questions
and pursue the issue on pgsql-performance.

regards, tom lane


Thanks Tom.

An interesting sidenote we realized: the nice system shutdown script 
/etc/init.d/postgres doesn't actually wait for the db to be down, it 
just waits for pg_ctl to return.


I'm guessing it's not good when the box shuts down before postgres does.

Matt



Re: [GENERAL] indexes no longer used after shutdown during reindexing

2012-01-13 Thread Matt Dew

On 01/12/2012 01:21 PM, Tom Lane wrote:

Matt Dew  writes:

On 01/11/2012 04:29 PM, Tom Lane wrote:

What exactly is your definition of a "clean shutdown"?

Is a reboot command considered a clean shutdown?  It's a redhat box
which called /etc/init.d/postgresql stop, which does:  pg_ctl stop -D
'$PGDATA' -s -m fast

Well, a fast-mode stop would abort the reindex operation, but that
should certainly have left the catalog entries in the same state as
before, so there's no obvious reason here why the indexes would've
stopped being used.


We're using v8.3.9

That's a tad old.  Please consult
http://www.postgresql.org/docs/8.3/static/release.html
for reasons why an update might be a good idea.  I don't recall any
8.3.x bugs that might be related to this, but I haven't trawled the
commit logs to see what I've forgotten, either.


I'm in a rabbit hole. I dug in more and learned that that problem may
have existed before the shutdown.  I believe the root problem is still
the same though; having to recreate the table to get it to use indexes.

Hmm.  If that's the case then we don't have to explain how an aborted
reindex operation could have affected the usability of the old indexes,
so I'm inclined to believe that it didn't.  Which seems to mean that you
have a garden variety "why won't the planner use my index" issue, not
something unusual.  If you no longer have the original table then it may
be impossible to investigate further; but if you can recreate the state
where it's not using the index(es), please see
http://wiki.postgresql.org/wiki/Slow_Query_Questions
and pursue the issue on pgsql-performance.



Thanks Tom.   I have the original database stored away for investigation.
This was a serious problem so we're investigating how to prevent this in 
the future.  It's strange because even though it looks like this problem 
did happen before the reboot, it was only once in a while. After the reboot 
it was every time and the application completely stopped working.


Plus after the reboot even a simple query against the table:
select * from tbl  where id=1;  was/is doing sequential scans.

It's a smallish table, just under 5 million rows.
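
(For the archive, a sketch of the usual diagnostics for that situation, using
the table and column from the query above:)

EXPLAIN ANALYZE SELECT * FROM tbl WHERE id = 1;

-- make sure the statistics are current:
ANALYZE tbl;

-- check whether the planner can use the index at all (diagnostic only):
SET enable_seqscan = off;
EXPLAIN ANALYZE SELECT * FROM tbl WHERE id = 1;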

Thanks for the link. I'm using that.

It's on a netapp if that matters. (Not my choice.)






Re: [GENERAL] indexes no longer used after shutdown during reindexing

2012-01-12 Thread Matt Dew

On 01/11/2012 04:29 PM, Tom Lane wrote:

Matt Dew  writes:

 I have a database that was shut down, cleanly, during an 'reindex
table'  command.  When the database came back up, queries against that
table started doing sequential scans instead of using the indexes as
they had been up until that point.

What exactly is your definition of a "clean shutdown"?  At the very
least you'd have had to abort the session running the reindex.  Also,
what PG version is this, and what are the index definitions?
Is a reboot command considered a clean shutdown?  It's a redhat box 
which called /etc/init.d/postgresql stop, which does:  pg_ctl stop -D 
'$PGDATA' -s -m fast


We're using v8.3.9

"idx1" UNIQUE, btree (id)
"idx_2" btree (homeaddress)
"idx_3" btree (f3)
"idx_4" btree (lower(firstname::text) varchar_pattern_ops)
"idx_5" btree (lower(lastname::text) varchar_pattern_ops)
"idx_6" btree (lower(lastname::text) varchar_pattern_ops, 
lower(firstname::text) varchar_pattern_ops, id, f5)

"idx_7" btree (s2id)
"idx_8" btree (sid, lower(memberusername::text) 
varchar_pattern_ops, lower(email::text) varchar_pattern_ops, birthdate)

"idx_9" btree (id, f5) WHERE f5 = false

I'm in a rabbit hole. I dug in more and learned that that problem may 
have existed before the shutdown.  I believe the root problem is still 
the same though; having to recreate the table to get it to use indexes.


thanks for any help,
Matt




regards, tom lane





Re: [GENERAL] indexes no longer used after shutdown during reindexing

2012-01-11 Thread Matt Dew

On 01/11/2012 11:07 AM, Scott Marlowe wrote:

On Wed, Jan 11, 2012 at 10:42 AM, Matt Dew  wrote:

Hello all,
   I have a database that was shut down, cleanly, during an 'reindex table'
  command.  When the database came back up, queries against that table
started doing sequential scans instead of using the indexes as they had been
up until that point.

We tried:
1) vacuuming the table (vacuum tblName)
2) reindexing the table (reindex table tblName)
3) dropping and recreating the indexes

but none of those actions helped.   We ended up recreating the table by
renaming the table and doing a create table as select * from oldTable and
readding the indexes.  This worked.

This problem presented itself as an application timing out. It took several
people, several hours to track this down and solve it.

Several months ago I had two other tables also stopped using their indexes.
  Those times however I don't know if a database shutdown caused the problem.

Has anyone had this problem?  If so, what specifically is the cause?  Is
shutting down a database during a table rebuild or vacuum an absolute no-no?

Any and all help or insight would be appreciated,
Matt

You likely had an invalid index, I've seen that crop up when doing a
create index concurrently.  Just a guess.  What did or does \d of the
table and its indexes show?  Look for invalid in the output.


Hi Scott,
  The output of \d looked normal.  Nothing weird or different than before.





[GENERAL] indexes no longer used after shutdown during reindexing

2012-01-11 Thread Matt Dew

Hello all,
   I have a database that was shut down, cleanly, during a 'reindex 
table' command.  When the database came back up, queries against that 
table started doing sequential scans instead of using the indexes as 
they had been up until that point.


We tried:
1) vacuuming the table (vacuum tblName)
2) reindexing the table (reindex table tblName)
3) dropping and recreating the indexes

but none of those actions helped.   We ended up recreating the table by 
renaming the table, doing a create table as select * from oldTable, 
and re-adding the indexes.  This worked.


This problem presented itself as an application timing out. It took 
several people, several hours to track this down and solve it.


Several months ago I had two other tables that also stopped using their 
indexes.  Those times, however, I don't know if a database shutdown caused 
the problem.


Has anyone had this problem?  If so, what specifically is the cause?  Is 
shutting down a database during a table rebuild or vacuum an absolute no-no?


Any and all help or insight would be appreciated,
Matt



[GENERAL] New Application Development Announcement

2011-03-24 Thread matt jones
This AGPLv2 application will be designed to bring the control and data
management capabilities of an RDBMS to the average user in a simple,
easy-to-use, OS-independent application. The content will be delivered in an
easy-to-use and easy-to-manipulate format within the convenience of a modern browser. The
initial focus of the application will be on cataloging a comic book
collection but the ability to catalog other collections can and will be
added at a later date as resources become available.  If you want to catalog
or track a specific collection, all that will be needed is a SQL table
structure that follows simple guidelines, to be discussed at a later time.
 Once completed, the framework and templates, written in PHP/AJAX, will be
capable of interacting with this table structure with minimal changes.

The application will use Postgres for the storage of the necessary data
objects. Developers will interact with the data through the use of PHP, SQL,
and AJAX. The user will be provided with a web application allowing the
entering of values into the various tables, columns, and rows, with a
minimal amount of typing, to track the data and organize it graphically so
that they may be able to see the extent of or missing areas of, their
collection. Advanced users and developers will be able to manipulate the
structure of the database as well. Print, export, and import capabilities
will be available in a format compatible with other programs including
Microsoft Excel, Apple Numbers, Google Docs, or Open Office Calc as
development continues.

This will be available as a stand-alone application run from a hosted web
server or the user will have the ability to install it on their machine
through the use of a virtual appliance and access it locally or across a
network. If they wish to build the server themselves, the code base for the
application and corresponding website is also available for download
under the same license.

There are other websites and programs that do these tasks quite well but
none function as both an on-line service and a standalone program that share
a common interface and have the ability to sync with each other and other
public databases. While many other sites are low cost or free and provide
other benefits such as a marketplace or forum, none provide the core
collection service as both a downloadable pre-configured application stack
and an on-line service with an Open Source license. The standalone programs
provide good functionality but none are released under an open source
license and most require a yearly fee to receive comic issue updates on top
of the fee charged for an upgraded version of the initial software. Many of
the interfaces to these programs are unnecessarily complicated as well, due
to their feature lists. This application will have a minimalist approach to
the user interface but without sacrificing the abilities inherent with a
database or the design. The goal is to keep the interface as clean and
simple as possible only allowing the average user as much control as the
need, more advanced users or those that desire more features will be able to
implement them on an as needed basis.

For infomation on how to help, suggestions, or the current state of
development you can check out the site http://collectablesdb.net or
github.com/CollectablesDB

Matt


Re: [GENERAL] Web Hosting

2011-03-07 Thread Matt
Thanks, but I tried that originally and the companies that come up have
either poor ratings, won't support postgres, won't allow me the freedom to
run my own software, or after talking with them I realized there were PEBKAC
issues with their support staff.  I also, as stated earlier, won't go with
the cheap, low-end companies due to common sense issues such as server load,
service/support issues, and cheap hardware.

On Mon, Mar 7, 2011 at 1:00 PM, Benjamin Smith wrote:

>  Try this:
>
> http://lmgtfy.com/?q=web+hosting+postgresql
>
>  On Sunday, March 06, 2011 11:33:01 am Eduardo wrote:
>
> > At 17:24 06/03/2011, you wrote:
>
> > >On 3/5/2011 4:08 PM, matty jones wrote:
>
> > >>I already have a domain name but I am looking for a hosting company
>
> > >>that I can use PG with. The few I have contacted have said that
>
> > >>they support MySQL only and won't give me access to install what I
>
> > >>need or they want way to much. I don't need a dedicated host which
>
> > >>so far seems the only way this will work, all the companies I have
>
> > >>researched so far that offer shared hosting or virtual hosting only
>
> > >>use MySQL. I will take care of the setup and everything myself but
>
> > >>I have already written my code using PG/PHP and I have no intention
>
> > >>of switching.
>
> > >>
>
> > >>Thanks.
>
> > >
>
> > >http://hub.org/
>
> >
>
> > +1
>
>
> --
> This message has been scanned for viruses and
> dangerous content by *MailScanner* , and is
> believed to be clean.


Re: [GENERAL] Unprivileged access to pgsql functions?

2011-03-04 Thread Matt Warner
On Fri, Mar 4, 2011 at 2:03 PM, John R Pierce  wrote:

> On 03/04/11 1:57 PM, Matt Warner wrote:
>
>> Not sure. I believe public and pg_catalog are in the path by default. Most
>> of the create function declarations prepend pg_catalog, and I believe I saw
>> somewhere that pg_catalog is the default. But I may be misunderstanding
>> that...
>>
>
> CREATE FUNCTION nvl(anyelement, anyelement)
> RETURNS anyelement
> AS '$libdir/orafunc','ora_nvl'
> LANGUAGE C IMMUTABLE;
>
> doesn't specify any schema, so that function is created in the current
> schema in the current database
>
>
>
Yes, my bad for thinking of this as an Oracle schema as opposed to a
different database.


Re: [GENERAL] Unprivileged access to pgsql functions?

2011-03-04 Thread Matt Warner
On Fri, Mar 4, 2011 at 1:56 PM, Bosco Rama  wrote:

> Matt Warner wrote:
> >
> > The function cannot be defined in the user's DB because "language C" is
> > considered a security risk, so only the superuser can do that. Or that's
> > what I get from reading anyway...
>
> psql -U postgres -d user_db
>
> will allow the superuser to then define the function in the user's DB.
>
> Once there, you can grant access, if needed.
>
> Bosco.
>

Success! Too much exposure to Oracle had me forgetting that the databases
are separate in Postgres. D'oh!

Thank you very much!

Matt
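
(For the archive, a minimal sketch of that fix, reusing the nvl declaration
from this thread: connect to the user's database as the superuser, then
define the function there and grant execute.)

-- psql -U postgres -d offload
CREATE FUNCTION nvl(anyelement, anyelement)
RETURNS anyelement
AS '$libdir/orafunc','ora_nvl'
LANGUAGE C IMMUTABLE;

GRANT EXECUTE ON FUNCTION nvl(anyelement, anyelement) TO PUBLIC;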


Re: [GENERAL] Unprivileged access to pgsql functions?

2011-03-04 Thread Matt Warner
On Fri, Mar 4, 2011 at 1:51 PM, Andrew Sullivan  wrote:

> On Fri, Mar 04, 2011 at 01:41:34PM -0800, Matt Warner wrote:
> > No luck:
> >
> > *** as postgres
> > postgres=# GRANT all on function nvl(anyelement,anyelement) to public;
> > GRANT
> > postgres=#
> >
> > *** as unprivileged user
> > offload=> select nvl(0,1);
> > ERROR:  function nvl(integer, integer) does not exist
> > LINE 1: select nvl(0,1);
> >^
> > HINT:  No function matches the given name and argument types. You might
> need
> > to add explicit type casts.
>
> Is the function in your search_path?
>
> A
>
>
> --
> Andrew Sullivan
> a...@crankycanuck.ca
>
>
Not sure. I believe public and pg_catalog are in the path by default. Most
of the create function declarations prepend pg_catalog, and I believe I saw
somewhere that pg_catalog is the default. But I may be misunderstanding
that...


Re: [GENERAL] Unprivileged access to pgsql functions?

2011-03-04 Thread Matt Warner
On Fri, Mar 4, 2011 at 1:49 PM, John R Pierce  wrote:

> On 03/04/11 1:41 PM, Matt Warner wrote:
>
>> No luck:
>>
>> *** as postgres
>> postgres=# GRANT all on function nvl(anyelement,anyelement) to public;
>> GRANT
>> postgres=#
>>
>> *** as unprivileged user
>> offload=> select nvl(0,1);
>> ERROR:  function nvl(integer, integer) does not exist
>> LINE 1: select nvl(0,1);
>>   ^
>> HINT:  No function matches the given name and argument types. You might
>> need to add explicit type casts.
>>
>>
> what schema is this function defined in?  is that schema in the search
> path?
>
>


The function was loaded as user postgres since it's a C-language module. The
definitions for most of the functions in the script have pg_catalog as the
prefix (e.g. CREATE FUNCTION pg_catalog.to_date(str text)).


Re: [GENERAL] Unprivileged access to pgsql functions?

2011-03-04 Thread Matt Warner
On Fri, Mar 4, 2011 at 1:48 PM, Bosco Rama  wrote:

> Matt Warner wrote:
> > No luck:
> >
> > *** as postgres
> > postgres=# GRANT all on function nvl(anyelement,anyelement) to public;
> > GRANT
> > postgres=#
> >
> > *** as unprivileged user
> > offload=> select nvl(0,1);
> > ERROR:  function nvl(integer, integer) does not exist
> > LINE 1: select nvl(0,1);
> >^
> > HINT:  No function matches the given name and argument types. You might
> need
> > to add explicit type casts.
>
> This looks like you defined the function in template1 *after* you created
> the
> user's DB.  Or never defined it there at all.
>
> Try defining the function in the user's DB itself or, if able, recreate the
> user's DB after defining it in template1.
>
> HTH.
>
> Bosco.
>

The function cannot be defined in the user's DB because "language C" is
considered a security risk, so only the superuser can do that. Or that's
what I get from reading anyway...

Recreating the user DB is problematic because there are already tables in
place.

Matt


Re: [GENERAL] Unprivileged access to pgsql functions?

2011-03-04 Thread Matt Warner
Here's how the script is defining the function, if that helps:

CREATE FUNCTION nvl(anyelement, anyelement)
RETURNS anyelement
AS '$libdir/orafunc','ora_nvl'
LANGUAGE C IMMUTABLE;


On Fri, Mar 4, 2011 at 1:41 PM, Matt Warner wrote:

> No luck:
>
> *** as postgres
> postgres=# GRANT all on function nvl(anyelement,anyelement) to public;
> GRANT
> postgres=#
>
> *** as unprivileged user
> offload=> select nvl(0,1);
> ERROR:  function nvl(integer, integer) does not exist
> LINE 1: select nvl(0,1);
>^
> HINT:  No function matches the given name and argument types. You might
> need to add explicit type casts.
>
>
> On Fri, Mar 4, 2011 at 1:34 PM, Vibhor Kumar <
> vibhor.ku...@enterprisedb.com> wrote:
>
>>
>> On Mar 5, 2011, at 2:50 AM, John R Pierce wrote:
>>
>> > On 03/04/11 1:11 PM, Matt Warner wrote:
>> >> Good afternoon.
>> >>
>> >> I've been looking at the Oracle Functionality package. ...
>> >
>> > what is this?  doesn't sound like anything in PostgreSQL I'm familiar
>> with.  Is this part of EntepriseDB's Postgres+ package or something?  You
>> should probably contact them via their support channels.
>>
>>
>> GRANT all on function nvl to public would help.
>>
>>
>> Thanks & Regards,
>> Vibhor Kumar
>> EnterpriseDB Corporation
>> The Enterprise PostgreSQL Company
>> vibhor.ku...@enterprisedb.com
>> Blog:http://vibhork.blogspot.com
>>
>>
>>
>
>


Re: [GENERAL] Unprivileged access to pgsql functions?

2011-03-04 Thread Matt Warner
No luck:

*** as postgres
postgres=# GRANT all on function nvl(anyelement,anyelement) to public;
GRANT
postgres=#

*** as unprivileged user
offload=> select nvl(0,1);
ERROR:  function nvl(integer, integer) does not exist
LINE 1: select nvl(0,1);
   ^
HINT:  No function matches the given name and argument types. You might need
to add explicit type casts.


On Fri, Mar 4, 2011 at 1:34 PM, Vibhor Kumar
wrote:

>
> On Mar 5, 2011, at 2:50 AM, John R Pierce wrote:
>
> > On 03/04/11 1:11 PM, Matt Warner wrote:
> >> Good afternoon.
> >>
> >> I've been looking at the Oracle Functionality package. ...
> >
> > what is this?  doesn't sound like anything in PostgreSQL I'm familiar
> with.  Is this part of EntepriseDB's Postgres+ package or something?  You
> should probably contact them via their support channels.
>
>
> GRANT all on function nvl to public would help.
>
>
> Thanks & Regards,
> Vibhor Kumar
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
> vibhor.ku...@enterprisedb.com
> Blog:http://vibhork.blogspot.com
>
>
>


Re: [GENERAL] Unprivileged access to pgsql functions?

2011-03-04 Thread Matt Warner
To be clear, this is open source Postgres I'm using, not the enterprise
product.

Matt

On Fri, Mar 4, 2011 at 1:29 PM, Matt Warner wrote:

> It's a "contrib" module:
>
> http://pgfoundry.org/projects/orafce/
>
> Matt
>
>
> On Fri, Mar 4, 2011 at 1:20 PM, John R Pierce  wrote:
>
>> On 03/04/11 1:11 PM, Matt Warner wrote:
>>
>>> Good afternoon.
>>>
>>> I've been looking at the Oracle Functionality package. ...
>>>
>>
>> what is this?  doesn't sound like anything in PostgreSQL I'm familiar
>> with.  Is this part of EntepriseDB's Postgres+ package or something?  You
>> should probably contact them via their support channels.
>>
>>
>>
>>
>
>


Re: [GENERAL] Unprivileged access to pgsql functions?

2011-03-04 Thread Matt Warner
It's a "contrib" module:

http://pgfoundry.org/projects/orafce/

Matt

On Fri, Mar 4, 2011 at 1:20 PM, John R Pierce  wrote:

> On 03/04/11 1:11 PM, Matt Warner wrote:
>
>> Good afternoon.
>>
>> I've been looking at the Oracle Functionality package. ...
>>
>
> what is this?  doesn't sound like anything in PostgreSQL I'm familiar with.
>  Is this part of EntepriseDB's Postgres+ package or something?  You should
> probably contact them via their support channels.
>
>
>
>


[GENERAL] Unprivileged access to pgsql functions?

2011-03-04 Thread Matt Warner
Good afternoon.

I've been looking at the Oracle Functionality package. It's very
interesting. However, the one place I'm stuck is that while user Postgres
can access the functions, no other user seems to have access. I'm sure this
is something simple I'm missing, but so far Google hasn't shown me the
answer.

I've already tried explicitly granting execute permissions on the function,
and the search path, as I understand it, is already supposed to be looking
in pg_catalog.

Any pointers?

*** Non-privileged user ***
offload=> select nvl(null,1);
ERROR:  function nvl(unknown, integer) does not exist
LINE 1: select nvl(null,1);
   ^
HINT:  No function matches the given name and argument types. You might need
to add explicit type casts.
offload=> \q
-bash-3.00$

*** Postgres user ***
-bash-3.00$ psql
psql (9.0.3)
Type "help" for help.

postgres=# select nvl(null,1);
 nvl
-
   1
(1 row)

postgres=# \q

TIA,

Matt


Re: [GENERAL] Looking for Suggestion on Learning

2011-02-06 Thread Matt
On Sun, Feb 6, 2011 at 11:14 AM, ray joseph  wrote:

>  Matt,
>
>
>
> Thank you for your insightful view.  I do not have a design for any of my
> design opportunities.  This is one reason I was looking for a design tool.
>  I have many work processes that are inter related, generated by different
> groups that must transcribe data from each others artifacts.  I do have
> Visio but I have never used it for this purpose.  Since one of my objectives
> is to learn about db design, maybe I can find some training material using
> Visio. I wonder if Visio will generate SQL.  If I recall, only the
> enterprise version of Visio produces SQL, so I would like to find a FOS tool
> for this.  A tool and associated tutorial would be great.
>
>
>
As was mentioned earlier, pencil and paper are a great tool to start with.
 The first several drafts of my database were done with graph paper and a
pencil.  I find it much quicker in the beginning to do it this way than to
use a computer and make changes through a program. When you get a design
nearly complete, then I would recommend a computer program.  If you go
the route of Visio there are templates for databases built in, so this may
save you some time.  As far as code generation goes I have no idea, as I
prefer to do all my generation by hand.


> BTW, I do use Notepad++.  I have used gnome.org/dia, but I find it much
> weaker than Visio.  I did not realize that PHP admin required a web server,
> but I have recently installed Apache for SVN.  I looked at PHP admin even
> though my preference is Python.  I have also looked at Maestro but have had
> a similar problem with tutorials.
>

PHP is a server-side language and thus is usually not run from the
command line like Ruby or Python; it is generally run through a web server
such as Apache or IIS.

>
>
> Thank you for the link to the ‘docs’ site.  I have been there many times
> over the past couple of years but now I see it in a new light (I’m slow).
>
>
>
When you hand code SQL with Notepad++, how do you launch the code?
>
>
There are several ways to launch the code.  I use phpPgAdmin and load
the file into the GUI and launch it that way.  I haven't really gotten far
enough into my development yet to need to do this on a large scale; for
now that method works for me.  I am sure there are other ways to do it
through the PG command-line interface.

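(For the archive, a minimal example of doing that from the command line,
assuming a database named mydb, a role named myuser, and a file of SQL named
schema.sql; all three names are placeholders:)

psql -d mydb -U myuser -f schema.sql
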
>
>
> I really appreciate your efforts.
>
>
>
> ray
>
>
>   --
>
> *From:* urlu...@gmail.com [mailto:urlu...@gmail.com] *On Behalf Of *Matt
> *Sent:* Saturday, February 05, 2011 10:19 PM
> *To:* ray joseph
> *Subject:* Re: [GENERAL] Looking for Suggestion on Learning
>
>
>
> I too am in a similar situation.  My company currently uses M$ Access and
> the solution is no longer viable and needs to be dealt with.  Form what I
> have been reading and learning the last few weeks trying to compare Access
> and PG is like comparing a go-kart and a race car.  They both do the same
> thing more or less but the race car is capable of much more but also needs
> more attention.  I am assuming that you already have a structural design for
> your database, tables, keys etc.  If not this is a good place to start and
> is where I am currently at in my project.
>
> On Sat, Feb 5, 2011 at 9:22 PM, ray joseph  wrote:
>
>
> > On Saturday, February 05, 2011 9:30:13 am ray wrote:
> > > I have built a few databases with MS Access and I would like to learn
> > > how to use pgsql.  I have found some examples but they have been too
> > > complex to follow or to abstract with no specific details.
>
>
>
> Use the online documentation at http://www.postgresql.org/docs/ for a
> basic tutorial on how to create tables, queries, and the like.  The docs go
> much further in detail then that but this is a good place to start.  It
> gives real examples of working with tables that are easy to follow and
> it doesn't require previous knowledge.
>
>
>
>  > >
> > > I would like to find a simple example that would take me from an open
> > > source design tool to a simple method to implement the design.
>
>
>
> What do you mean by a design tool?  Are you looking for a program to help
> you map out the table structure of your db?  Are you looking for a GUI to
> access your db and modify it?  I am using Viso to create my maps at work
> right now but you may want to check out Dia http://projects.gnome.org/dia/,
> it is a good piece of software but I haven't used it for this purpose yet.
>  When you install PG it comes with the GUI PGAdmin that gives you basic
> control over some aspects of your db and allows you to implement various
> things.  I am using PHPAdmin mys

Re: [GENERAL] Full Text Index Scanning

2011-01-30 Thread Matt Warner
Thanks. pg_trgm looks interesting, but after installing the pg_trgm.sql, I
get error messages when following the documentation.

sggeeorg=> create index test_idx on test using gist(columnname
gist_trgm_ops);
ERROR:  operator class "gist_trgm_ops" does not exist for access method
"gist"
STATEMENT:  create index test_idx on test using
gist(columnname gist_trgm_ops);
ERROR:  operator class "gist_trgm_ops" does not exist for access method
"gist"



On Sun, Jan 30, 2011 at 10:36 AM, Tom Lane  wrote:

> Matt Warner  writes:
> > If I understand this, it looks like this approach allows me to match the
> > beginnings and endings of words, but not the middle sections.
>
> Yeah, probably.  You might consider using contrib/pg_trgm instead if
> you need arbitrary substrings.
>
>regards, tom lane
>


Re: [GENERAL] Full Text Index Scanning

2011-01-30 Thread Matt Warner
If I understand this, it looks like this approach allows me to match the
beginnings and endings of words, but not the middle sections. Is that
correct? That is, if I search for "jag" I will find "jaeger" but not
"lobenjager".

Or am I (again) not understanding how this works?

TIA,

Matt

On Sun, Jan 30, 2011 at 9:59 AM, Matt Warner wrote:

> Aha! Thanks for pointing that out. It's indexing now.
>
> Thanks!
>
> Matt
>
>
> On Sun, Jan 30, 2011 at 9:12 AM, Tom Lane  wrote:
>
>> Matt Warner  writes:
>> > Doesn't seem to work either. Maybe something changed in 9.1?
>> > create index test_idx on testtable using gin(to_tsvector(wordcolumn||'
>> > '||reverse(wordcolumn)));
>> > ERROR:  functions in index expression must be marked IMMUTABLE
>>
>> That's not the same case he tested.  The single-parameter form of
>> to_tsvector isn't immutable, because it depends on the default text
>> search configuration parameter.  It should work, AFAICS, with the
>> two-parameter form.
>>
>>regards, tom lane
>>
>
>


Re: [GENERAL] Full Text Index Scanning

2011-01-30 Thread Matt Warner
Aha! Thanks for pointing that out. It's indexing now.

Thanks!

Matt

On Sun, Jan 30, 2011 at 9:12 AM, Tom Lane  wrote:

> Matt Warner  writes:
> > Doesn't seem to work either. Maybe something changed in 9.1?
> > create index test_idx on testtable using gin(to_tsvector(wordcolumn||'
> > '||reverse(wordcolumn)));
> > ERROR:  functions in index expression must be marked IMMUTABLE
>
> That's not the same case he tested.  The single-parameter form of
> to_tsvector isn't immutable, because it depends on the default text
> search configuration parameter.  It should work, AFAICS, with the
> two-parameter form.
>
>regards, tom lane
>


Re: [GENERAL] Full Text Index Scanning

2011-01-30 Thread Matt Warner
Doesn't seem to work either. Maybe something changed in 9.1?

create index test_idx on testtable using gin(to_tsvector(wordcolumn||'
'||reverse(wordcolumn)));
ERROR:  functions in index expression must be marked IMMUTABLE


On Sun, Jan 30, 2011 at 3:28 AM, Oleg Bartunov  wrote:

> I used 9.1dev, but you can try immutable function (from
> http://andreas.scherbaum.la/blog/archives/10-Reverse-a-text-in-PostgreSQL.html
> )
>
> create function reverse(text) returns text as $$
>
> select case when length($1)>0
>
> then substring($1, length($1), 1) || reverse(substring($1, 1,
> length($1)-1))
>
> else '' end $$ language sql immutable strict;
>
>
>
> On Sat, 29 Jan 2011, Matt Warner wrote:
>
>  9.0.2
>>
>> On Sat, Jan 29, 2011 at 9:35 AM, Oleg Bartunov  wrote:
>>
>>  What version of Pg you run ? Try latest version.
>>>
>>> Oleg
>>>
>>>
>>> On Sat, 29 Jan 2011, Matt Warner wrote:
>>>
>>>  Reverse isn't a built-in Postgres function, so I found one and installed
>>>
>>>> it.
>>>> However, attempting to use it in creating an index gets me the message
>>>> "ERROR:  functions in index expression must be marked IMMUTABLE", even
>>>> though the function declaration already has the immutable argument.
>>>>
>>>> Is there a specific version of the reverse function you're using? Or am
>>>> I
>>>> just missing something obvious? This is Postgres 9, BTW.
>>>>
>>>> Thanks,
>>>>
>>>> Matt
>>>>
>>>> On Sat, Jan 29, 2011 at 6:46 AM, Matt Warner >>>
>>>>> wrote:
>>>>>
>>>>
>>>>  Thanks Oleg. I'm going to have to experiment with this so that I
>>>>
>>>>> understand
>>>>> it better.
>>>>>
>>>>> Matt
>>>>>
>>>>>
>>>>> On Fri, Jan 28, 2011 at 1:12 PM, Oleg Bartunov 
>>>>> wrote:
>>>>>
>>>>>  Matt, I'd try to use prefix search on original string concatenated
>>>>> with
>>>>>
>>>>>> reverse string:
>>>>>>
>>>>>> Just tried on some spare table
>>>>>>
>>>>>> knn=# \d spot_toulouse
>>>>>>  Table "public.spot_toulouse"
>>>>>> Column|   Type| Modifiers
>>>>>> -+---+---
>>>>>>  clean_name  | character varying |
>>>>>>
>>>>>>
>>>>>> 1. create index knn=# create index clean_name_tlz_idx on spot_toulouse
>>>>>> using gin(to_tsvector('french', clean_name || ' ' ||
>>>>>> reverse(clean_name)));
>>>>>> 2.
>>>>>> select clean_name from spot_toulouse where to_tsvector('french',
>>>>>> clean_name|| ' ' ||  reverse(clean_name) ) @@
>>>>>> to_tsquery('french','the:*
>>>>>> |
>>>>>> et:*');
>>>>>>
>>>>>> Select looks cumbersome, but you can always write wrapper functions.
>>>>>> The
>>>>>> only drawback I see for now is that ranking function will a bit
>>>>>> confused,
>>>>>> since coordinates of original and reversed words will be not the same,
>>>>>> but
>>>>>> again, it's possible to obtain tsvector by custom function, which
>>>>>> aware
>>>>>> about reversing.
>>>>>>
>>>>>> Good luck and let me know if this help you.
>>>>>>
>>>>>> Oleg
>>>>>>
>>>>>>
>>>>>> On Fri, 28 Jan 2011, Matt Warner wrote:
>>>>>>
>>>>>>  I'm in the process of migrating a project from Oracle to Postgres and
>>>>>>
>>>>>>  have
>>>>>>> run into a feature question. I know that Postgres has a full-text
>>>>>>> search
>>>>>>> feature, but it does not allow scanning the index (as opposed to the
>>>>>>> data).
>>>>>>> Specifically, in Oracle you can do "select * from table where
>>>>>>> contains(colname,'%part_of_word%')>1". While thi

Re: [GENERAL] Full Text Index Scanning

2011-01-29 Thread Matt Warner
9.0.2

On Sat, Jan 29, 2011 at 9:35 AM, Oleg Bartunov  wrote:

> What version of Pg you run ? Try latest version.
>
> Oleg
>
>
> On Sat, 29 Jan 2011, Matt Warner wrote:
>
>  Reverse isn't a built-in Postgres function, so I found one and installed
>> it.
>> However, attempting to use it in creating an index gets me the message
>> "ERROR:  functions in index expression must be marked IMMUTABLE", even
>> though the function declaration already has the immutable argument.
>>
>> Is there a specific version of the reverse function you're using? Or am I
>> just missing something obvious? This is Postgres 9, BTW.
>>
>> Thanks,
>>
>> Matt
>>
>> On Sat, Jan 29, 2011 at 6:46 AM, Matt Warner > >wrote:
>>
>>  Thanks Oleg. I'm going to have to experiment with this so that I
>>> understand
>>> it better.
>>>
>>> Matt
>>>
>>>
>>> On Fri, Jan 28, 2011 at 1:12 PM, Oleg Bartunov  wrote:
>>>
>>>  Matt, I'd try to use prefix search on original string concatenated with
>>>> reverse string:
>>>>
>>>> Just tried on some spare table
>>>>
>>>> knn=# \d spot_toulouse
>>>>   Table "public.spot_toulouse"
>>>>  Column|   Type| Modifiers
>>>> -+---+---
>>>>  clean_name  | character varying |
>>>>
>>>>
>>>> 1. create index knn=# create index clean_name_tlz_idx on spot_toulouse
>>>> using gin(to_tsvector('french', clean_name || ' ' ||
>>>> reverse(clean_name)));
>>>> 2.
>>>> select clean_name from spot_toulouse where to_tsvector('french',
>>>> clean_name|| ' ' ||  reverse(clean_name) ) @@ to_tsquery('french','the:*
>>>> |
>>>> et:*');
>>>>
>>>> Select looks cumbersome, but you can always write wrapper functions. The
>>>> only drawback I see for now is that ranking function will a bit
>>>> confused,
>>>> since coordinates of original and reversed words will be not the same,
>>>> but
>>>> again, it's possible to obtain tsvector by custom function, which aware
>>>> about reversing.
>>>>
>>>> Good luck and let me know if this help you.
>>>>
>>>> Oleg
>>>>
>>>>
>>>> On Fri, 28 Jan 2011, Matt Warner wrote:
>>>>
>>>>  I'm in the process of migrating a project from Oracle to Postgres and
>>>>
>>>>> have
>>>>> run into a feature question. I know that Postgres has a full-text
>>>>> search
>>>>> feature, but it does not allow scanning the index (as opposed to the
>>>>> data).
>>>>> Specifically, in Oracle you can do "select * from table where
>>>>> contains(colname,'%part_of_word%')>1". While this isn't terribly
>>>>> efficient,
>>>>> it's much faster than full-scanning the raw data and is relatively
>>>>> quick.
>>>>>
>>>>> It doesn't seem that Postgres works this way. Attempting to do this
>>>>> returns
>>>>> no rows: "select * from table where to_tsvector(colname) @@
>>>>> to_tsquery('%part_of_word%')"
>>>>>
>>>>> The reason I want to do this is that the partial word search does not
>>>>> involve dictionary words (it's scanning names).
>>>>>
>>>>> Is this something Postgres can do? Or is there a different way to do
>>>>> scan
>>>>> the index?
>>>>>
>>>>> TIA,
>>>>>
>>>>> Matt
>>>>>
>>>>>
>>>>>Regards,
>>>>   Oleg
>>>> _
>>>> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
>>>> Sternberg Astronomical Institute, Moscow University, Russia
>>>> Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
>>>> phone: +007(495)939-16-83, +007(495)939-23-83
>>>>
>>>>
>>>
>>>
>>
>Regards,
>Oleg
> _
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>


Re: [GENERAL] Full Text Index Scanning

2011-01-29 Thread Matt Warner
Reverse isn't a built-in Postgres function, so I found one and installed it.
However, attempting to use it in creating an index gets me the message
"ERROR:  functions in index expression must be marked IMMUTABLE", even
though the function declaration already has the immutable argument.

Is there a specific version of the reverse function you're using? Or am I
just missing something obvious? This is Postgres 9, BTW.

Thanks,

Matt

On Sat, Jan 29, 2011 at 6:46 AM, Matt Warner wrote:

> Thanks Oleg. I'm going to have to experiment with this so that I understand
> it better.
>
> Matt
>
>
> On Fri, Jan 28, 2011 at 1:12 PM, Oleg Bartunov  wrote:
>
>> Matt, I'd try to use prefix search on original string concatenated with
>> reverse string:
>>
>> Just tried on some spare table
>>
>> knn=# \d spot_toulouse
>>Table "public.spot_toulouse"
>>   Column|   Type| Modifiers
>> -+---+---
>>  clean_name  | character varying |
>>
>>
>> 1. create index knn=# create index clean_name_tlz_idx on spot_toulouse
>> using gin(to_tsvector('french', clean_name || ' ' || reverse(clean_name)));
>> 2.
>> select clean_name from spot_toulouse where to_tsvector('french',
>> clean_name|| ' ' ||  reverse(clean_name) ) @@ to_tsquery('french','the:* |
>> et:*');
>>
>> Select looks cumbersome, but you can always write wrapper functions. The
>> only drawback I see for now is that ranking function will a bit confused,
>> since coordinates of original and reversed words will be not the same, but
>> again, it's possible to obtain tsvector by custom function, which aware
>> about reversing.
>>
>> Good luck and let me know if this help you.
>>
>> Oleg
>>
>>
>> On Fri, 28 Jan 2011, Matt Warner wrote:
>>
>>  I'm in the process of migrating a project from Oracle to Postgres and
>>> have
>>> run into a feature question. I know that Postgres has a full-text search
>>> feature, but it does not allow scanning the index (as opposed to the
>>> data).
>>> Specifically, in Oracle you can do "select * from table where
>>> contains(colname,'%part_of_word%')>1". While this isn't terribly
>>> efficient,
>>> it's much faster than full-scanning the raw data and is relatively quick.
>>>
>>> It doesn't seem that Postgres works this way. Attempting to do this
>>> returns
>>> no rows: "select * from table where to_tsvector(colname) @@
>>> to_tsquery('%part_of_word%')"
>>>
>>> The reason I want to do this is that the partial word search does not
>>> involve dictionary words (it's scanning names).
>>>
>>> Is this something Postgres can do? Or is there a different way to do scan
>>> the index?
>>>
>>> TIA,
>>>
>>> Matt
>>>
>>>
>>Regards,
>>Oleg
>> _
>> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
>> Sternberg Astronomical Institute, Moscow University, Russia
>> Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
>> phone: +007(495)939-16-83, +007(495)939-23-83
>>
>
>


Re: [GENERAL] Full Text Index Scanning

2011-01-29 Thread Matt Warner
Thanks Oleg. I'm going to have to experiment with this so that I understand
it better.

Matt

On Fri, Jan 28, 2011 at 1:12 PM, Oleg Bartunov  wrote:

> Matt, I'd try to use prefix search on original string concatenated with
> reverse string:
>
> Just tried on some spare table
>
> knn=# \d spot_toulouse
>Table "public.spot_toulouse"
>   Column|   Type| Modifiers
> -+---+---
>  clean_name  | character varying |
>
>
> 1. create index knn=# create index clean_name_tlz_idx on spot_toulouse
> using gin(to_tsvector('french', clean_name || ' ' || reverse(clean_name)));
> 2.
> select clean_name from spot_toulouse where to_tsvector('french',
> clean_name|| ' ' ||  reverse(clean_name) ) @@ to_tsquery('french','the:* |
> et:*');
>
> Select looks cumbersome, but you can always write wrapper functions. The
> only drawback I see for now is that ranking function will a bit confused,
> since coordinates of original and reversed words will be not the same, but
> again, it's possible to obtain tsvector by custom function, which aware
> about reversing.
>
> Good luck and let me know if this help you.
>
> Oleg
>
>
> On Fri, 28 Jan 2011, Matt Warner wrote:
>
>  I'm in the process of migrating a project from Oracle to Postgres and have
>> run into a feature question. I know that Postgres has a full-text search
>> feature, but it does not allow scanning the index (as opposed to the
>> data).
>> Specifically, in Oracle you can do "select * from table where
>> contains(colname,'%part_of_word%')>1". While this isn't terribly
>> efficient,
>> it's much faster than full-scanning the raw data and is relatively quick.
>>
>> It doesn't seem that Postgres works this way. Attempting to do this
>> returns
>> no rows: "select * from table where to_tsvector(colname) @@
>> to_tsquery('%part_of_word%')"
>>
>> The reason I want to do this is that the partial word search does not
>> involve dictionary words (it's scanning names).
>>
>> Is this something Postgres can do? Or is there a different way to do scan
>> the index?
>>
>> TIA,
>>
>> Matt
>>
>>
>Regards,
>Oleg
> _
> Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
> Sternberg Astronomical Institute, Moscow University, Russia
> Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(495)939-16-83, +007(495)939-23-83
>


[GENERAL] Full Text Index Scanning

2011-01-28 Thread Matt Warner
I'm in the process of migrating a project from Oracle to Postgres and have
run into a feature question. I know that Postgres has a full-text search
feature, but it does not allow scanning the index (as opposed to the data).
Specifically, in Oracle you can do "select * from table where
contains(colname,'%part_of_word%')>1". While this isn't terribly efficient,
it's much faster than full-scanning the raw data and is relatively quick.

It doesn't seem that Postgres works this way. Attempting to do this returns
no rows: "select * from table where to_tsvector(colname) @@
to_tsquery('%part_of_word%')"

The reason I want to do this is that the partial word search does not
involve dictionary words (it's scanning names).

Is this something Postgres can do? Or is there a different way to do scan
the index?

TIA,

Matt


[GENERAL] A few [Python] tools for postgres

2010-11-01 Thread Matt Harrison
Hey Folks-

I've got 2 projects out that I'm finding useful, so I thought I'd share with
the wider postgres community.

The first is PgPartition [0].  This (python) tool eliminates the monotony of
dealing with partitions.  It generates SQL to create/index/remove/alter
partitions.

The second is PgTweak [1].  This is a somewhat nascent (also python)
project, but I hope for it to be more useful.  The main idea is to try out
different setting combinations and to see what effect they have on query
performance.  It does this right now.  I'd like for it to analyze the
EXPLAIN ANALYZE output and make somewhat intelligent suggestions.

Any feedback is appreciated.  Hopefully these tools are useful to others.
I'll be at PgWest this week, if anyone wants to discuss these (or pgtune, or
python,  etc)

cheers,

matt

http://panela.blog-city.com/

0 - http://github.com/mattharrison/PgPartition
1 - http://github.com/mattharrison/PgTweak


[GENERAL] Tools for partitioning and query optimization

2010-10-19 Thread Matt Harrison
Hey Folks-

I've got 2 projects out that I'm finding useful, so I thought I'd share with
the wider postgres community.

The first is PgPartition [0].  This (python) tool eliminates the monotony of
dealing with partitions.  It generates SQL to create/index/remove/alter
partitions.

The second is PgTweak [1].  This is a somewhat nascent (also python)
project, but I hope for it to be more useful.  The main idea is to try out
different setting combinations and to see what effect they have on query
performance.  It does this right now.  I'd like for it to analyze the
EXPLAIN ANALYZE output and make somewhat intelligent suggestions.

Any feedback is appreciated.  Hopefully these tools are useful to others.

cheers,

matt

0 - http://github.com/mattharrison/PgPartition
1 - http://github.com/mattharrison/PgTweak


Re: [GENERAL] postgres startup failure

2010-05-20 Thread Matt Bartolome
On Thu, May 20, 2010 at 3:36 PM, Tom Lane  wrote:

> Matt Bartolome  writes:
> > Setting the breakpoint (b exit) got me a little farther...
>
> > DEBUG:  invoking IpcMemoryCreate(size=32595968)
> > DEBUG:  max_safe_fds = 980, usable_fds = 1000, already_open = 10
> > Detaching after fork from child process 19066.
> > Detaching after fork from child process 19067.
>
> > Breakpoint 1, 0x0046efb6 in exit () from /lib/libc.so.6
> > (gdb) bt
> > #0  0x0046efb6 in exit () from /lib/libc.so.6
> > #1  0x082286de in proc_exit ()
> > #2  0x0820bbdf in ?? ()
> > #3  0x0820eb63 in ?? ()
> > #4  
> > #5  0x003ec424 in __kernel_vsyscall ()
> > #6  0x0051433d in ___newselect_nocancel () from /lib/libc.so.6
> > #7  0x0820d489 in ?? ()
> > #8  0x08210489 in PostmasterMain ()
> > #9  0x081b92df in main ()
>
> Hmm.  Not tremendously helpful --- if you were to install the
> appropriate postgresql-debuginfo RPM, the stack trace might get
> more useful.  However, this does appear to confirm the theory
> that postmaster.c is doing proc_exit(1), and I don't see any code
> paths in there where there isn't a error message logged first.
> So right now I'm wondering whether you're not failing to notice
> a relevant log message(s).  You've evidently managed to launch
> the syslogger --- where is it configured to write the postmaster
> log?
>
>
Gosh, you know I never even bothered looking in pg_log. Stupid! I was
expecting to see some sort of error when doing:
$ postgres -d 3 -D /data/postgres/

I even have this written down in my own recovery instructions:

4. Create pg_xlog directory as postgres user, remove backup_label

I had forgotten that when doing a backup with WAL archiving enabled you want
to exclude the pg_xlog directory, because you are restoring from the WALs
written to the standby server in the event of failure. In pg_log it clearly prints the
error.
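
(For the archive, a minimal sketch of that recovery step, using the data
directory from this thread; pg_xlog needs its archive_status subdirectory too:)

# as the postgres user, after extracting the backup:
mkdir -p /data/postgres/pg_xlog/archive_status
chmod 700 /data/postgres/pg_xlog
rm /data/postgres/backup_label    # step 4 of the notes quoted above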


>regards, tom lane
>

Thanks Tom, I probably would have been running in circles for days without
your help.

-Matt


Re: [GENERAL] postgres startup failure

2010-05-20 Thread Matt Bartolome
On Thu, May 20, 2010 at 2:44 PM, Tom Lane  wrote:

> Matt Bartolome  writes:
> > gdb output...
>
> > DEBUG:  -
> > DEBUG:  invoking IpcMemoryCreate(size=32595968)
> > DEBUG:  max_safe_fds = 980, usable_fds = 1000, already_open = 10
> > Detaching after fork from child process 18310.
> > Detaching after fork from child process 18311.
>
> > Program exited with code 01.
>
> Huh.  So it called exit(1) somewhere, without generating any error
> message first.  That's pretty unfriendly.  Try setting a breakpoint
> at exit(), and if you can get it to stop there, get a stack trace
> from that point.
>
> Dunno if you're familiar with gdb, but the quick way to do this is
> to say
>b exit
> and then the "run" command.  It's possible that the attempt to set
> the breakpoint will fail because libc.so isn't loaded yet, in
> which case you need two steps:
>b main
>run ...
>when control stops at main:
>b exit
>continue
>

Setting the breakpoint (b exit) got me a little farther...

DEBUG:  invoking IpcMemoryCreate(size=32595968)
DEBUG:  max_safe_fds = 980, usable_fds = 1000, already_open = 10
Detaching after fork from child process 19066.
Detaching after fork from child process 19067.

Breakpoint 1, 0x0046efb6 in exit () from /lib/libc.so.6
(gdb) bt
#0  0x0046efb6 in exit () from /lib/libc.so.6
#1  0x082286de in proc_exit ()
#2  0x0820bbdf in ?? ()
#3  0x0820eb63 in ?? ()
#4  
#5  0x003ec424 in __kernel_vsyscall ()
#6  0x0051433d in ___newselect_nocancel () from /lib/libc.so.6
#7  0x0820d489 in ?? ()
#8  0x08210489 in PostmasterMain ()
#9  0x081b92df in main ()
(gdb) quit


>
>regards, tom lane
>


Thanks Tom,
Matt


Re: [GENERAL] postgres startup failure

2010-05-20 Thread Matt Bartolome
Hi Tom,

On Thu, May 20, 2010 at 11:33 AM, Tom Lane  wrote:

> Matt Bartolome  writes:
> > I'm attempting to start postgres on a standby machine from a backup. Both
> > the primary and standby are running postgres 8.4, fedora 12.
>
> 8.4.what exactly?
>

I'm running 8.4.2 on the primary and 8.4.4 on the standby via yum. I
wouldn't think a minor release change would cause what I'm seeing... I've
also built 8.4.2 from source on the standby and I get the same debug output.
I should also mention these are both VM's (vmware) and are running under a
windows host OS.

 > DEBUG:  invoking IpcMemoryCreate(size=32595968)
> > DEBUG:  max_safe_fds = 984, usable_fds = 1000, already_open = 6
> > DEBUG:  logger shutting down
> > DEBUG:  shmem_exit(0): 0 callbacks to make
> > DEBUG:  proc_exit(0): 0 callbacks to make
> > DEBUG:  exit(0)
> > DEBUG:  shmem_exit(-1): 0 callbacks to make
> > DEBUG:  proc_exit(-1): 0 callbacks to make
>
> Hmm.  I'm thinking that the postmaster must have simply crashed at some
> time between starting the logger subprocess and starting the startup
> subprocess.  You could try running the postmaster under gdb to see
> if you can get a stack trace, along the lines of
>
>$ gdb /usr/bin/postgres
>gdb> run -d 3 -D /data/postgres/
>... crash
>gdb> bt
>... printout
>gdb> quit
>
>
gdb output...

DEBUG:  -
DEBUG:  invoking IpcMemoryCreate(size=32595968)
DEBUG:  max_safe_fds = 980, usable_fds = 1000, already_open = 10
Detaching after fork from child process 18310.
Detaching after fork from child process 18311.

Program exited with code 01.
DEBUG:  logger shutting down
DEBUG:  shmem_exit(0): 0 callbacks to make
DEBUG:  proc_exit(0): 0 callbacks to make
DEBUG:  exit(0)
DEBUG:  shmem_exit(-1): 0 callbacks to make
DEBUG:  proc_exit(-1): 0 callbacks to make
(gdb) bt
No stack.


You might need to "continue" if it decides to stop at the point of
> forking off the logger.
>
>regards, tom lane
>

Do you have any other suggestions?

Thank you,
Matt


[GENERAL] postgres startup failure

2010-05-20 Thread Matt Bartolome
Hi,
I'm attempting to start postgres on a standby machine from a backup. Both
the primary and standby are running postgres 8.4, fedora 12.

The backup on the primary is done nightly using the following commands:

# psql -d operations -U postgres -c "SELECT
pg_start_backup('/data/postgres')";
# tar zcvf /mnt/thresheresc/postgres/backup.tar.gz /data/postgres/ --exclude
"/data/postgres/pg_xlog";
# psql -d operations -U postgres -c "SELECT pg_stop_backup()";

I also have WAL enabled on the primary and writing to the standby but it
doesn't seem to make a difference whether I'm using recovery.conf or not so
I have left that step out of the equation for now while I figure out why the
plain backup doesn't work.

After extracting the backup onto the standby machine and verifying the
correct postgres user permissions it refuses to start.

-bash-4.0$ postgres -d 3 -D /data/postgres/
DEBUG:  postgres: PostmasterMain: initial environ dump:
DEBUG:  -
DEBUG:  HOSTNAME=xxx.xxx.xxx
DEBUG:  SHELL=/bin/bash
DEBUG:  TERM=xterm
DEBUG:  HISTSIZE=1000
DEBUG:  USER=postgres
DEBUG:
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.tbz=01;31:*.tbz2=01;31:*.bz=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:
DEBUG:  MAIL=/var/spool/mail/postgres
DEBUG:
PATH=/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
DEBUG:  PWD=/var/lib/pgsql
DEBUG:  LANG=en_US.UTF-8
DEBUG:  SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
DEBUG:  HISTCONTROL=ignoreboth
DEBUG:  SHLVL=1
DEBUG:  HOME=/var/lib/pgsql
DEBUG:  LOGNAME=postgres
DEBUG:  PGDATA=/data/postgres
DEBUG:  LESSOPEN=|/usr/bin/lesspipe.sh %s
DEBUG:  G_BROKEN_FILENAMES=1
DEBUG:  _=/usr/bin/postgres
DEBUG:  PGLOCALEDIR=/usr/share/locale
DEBUG:  PGSYSCONFDIR=/etc/sysconfig/pgsql
DEBUG:  LC_COLLATE=en_US.UTF-8
DEBUG:  LC_CTYPE=en_US.UTF-8
DEBUG:  LC_MESSAGES=en_US.UTF-8
DEBUG:  LC_MONETARY=C
DEBUG:  LC_NUMERIC=C
DEBUG:  LC_TIME=C
DEBUG:  -
DEBUG:  invoking IpcMemoryCreate(size=32595968)
DEBUG:  max_safe_fds = 984, usable_fds = 1000, already_open = 6
DEBUG:  logger shutting down
DEBUG:  shmem_exit(0): 0 callbacks to make
DEBUG:  proc_exit(0): 0 callbacks to make
DEBUG:  exit(0)
DEBUG:  shmem_exit(-1): 0 callbacks to make
DEBUG:  proc_exit(-1): 0 callbacks to make


There is nothing in the pgstartup.log file... If I initdb on an empty
/data/postgres directory it starts up fine. I just can't start the database
using the backup.

Anybody have a clue what is going on here? The memory messages seem
suspicious to me...

Thank you,
Matt


[GENERAL] parse tree in XML format

2009-12-28 Thread matt
Is there some way to export the PostgreSQL query parse tree in XML format? I
cannot locate the API/tool, etc., to do that...

thanks
-Matt
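
(Not an answer from the thread, and not XML, but for reference: the closest
built-in facility dumps the parse, rewrite and plan trees to the server log.
A minimal sketch, assuming a session where you are allowed to change these
settings:)

set debug_print_parse = on;
set debug_print_rewritten = on;
set debug_print_plan = on;
set debug_pretty_print = on;
select 1;   -- the node trees now show up in the server log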


  



Re: [GENERAL] stuck spinlock (0x2aac3678b0e0) detected at dynahash.c:876

2009-11-20 Thread Matt Solnit
Hi Merlin.  Thanks very much for your reply.  We are not using the "High-CPU"
instance type, so these kernel recommendations do not apply to us.  Here is
what we're running:

$ uname -a
Linux domU-12-31-39-09-E8-21 2.6.21.7-2.fc8xen #1 SMP Fri Feb 15 12:34:28 EST 
2008 x86_64 x86_64 x86_64 GNU/Linux

Regarding Xen, I'll look around a bit, but one of my first Google hits was the 
following thread:  
http://archives.postgresql.org/pgsql-general/2006-09/msg00503.php
Some quotes:
"PostgreSQL performs very, very well on Xen even in DomU. It is one of the 
things that lends to the Xen credibility because they use us in their 
benchmarks."
"... I've had zero issues running postgres inside a domU."

Granted, this was in 2006.

-- Matt

On Nov 20, 2009, at 9:54 AM, Merlin Moncure wrote:

> On Fri, Nov 20, 2009 at 12:15 PM, Matt Solnit  wrote:
>> 
>> We are running PostgreSQL 8.3.8 (64-bit) on a dedicated Fedora Core 8 
>> machine, in Amazon EC2.  This was using an "extra-large" instance, which 
>> means 4 Xeon cores (2.66 GHz) and 15.5 GB of memory.
> 
> considering that ec2 is a virtualized environment, the first
> conclusion that everyone is going to jump to is that this is some type
> of issue with ec2.  IIRC ec2 runs xen, did you search for any related
> issues with xen and postgresql?
> 
> are you running the correct kernel?
> http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1535
> 
> "We strongly recommend using the 2.6.18 Xen stock kernel with the
> c1.medium and c1.xlarge instances. Although the default Amazon EC2
> kernels will work, the new kernels provide greater stability and
> performance for these instance types. For more information about
> kernels, refer to the Amazon Elastic Compute Cloud Developer Guide."
> 
> merlin




[GENERAL] stuck spinlock (0x2aac3678b0e0) detected at dynahash.c:876

2009-11-20 Thread Matt Solnit
Hi everyone.  The following error appeared in our log yesterday:

2009-11-19 13:39:40 PST:10.211.97.171(63815):[25668]: PANIC:  stuck spinlock 
(0x2aac3678b0e0) detected at dynahash.c:876

Followed by:

2009-11-19 13:44:24 PST::@:[1381]: LOG:  server process (PID 25668) was 
terminated by signal 6: Aborted
2009-11-19 13:44:24 PST::@:[1381]: LOG:  terminating any other active server 
processes

Followed by:

2009-11-19 13:44:24 PST:10.211.1.171(8016):[29736]: WARNING:  terminating 
connection because of crash of another server process
2009-11-19 13:44:24 PST:10.211.1.171(8016):[29736]: DETAIL:  The postmaster has 
commanded this server process to roll back the current transaction and exit, 
because another server process exited abnormally and possibly corrupted shared 
memory.
(repeated for every open connection).

Followed by:

2009-11-19 13:44:36 PST:[local]:postg...@postgres:[29780]: FATAL:  the database 
system is in recovery mode
(repeated several times).

Finally:

2009-11-19 13:47:15 PST::@:[223331]: LOG:  autovacuum launcher started
2009-11-19 13:47:15 PST::@:[1381]: LOG:  database system is ready to accept 
connections

The system was under a good amount of load:  approximately 80 connections, each 
generating large numbers of batched inserts (using JDBC).  This includes the 
one that crashed.  Once the database re-initialized, everything went back to 
normal.  Is this due to a bug in PostgreSQL?  Is there anything we can do about 
it? :-)

We are running PostgreSQL 8.3.8 (64-bit) on a dedicated Fedora Core 8 machine, 
in Amazon EC2.  This was using an "extra-large" instance, which means 4 Xeon 
cores (2.66 GHz) and 15.5 GB of memory.

Sincerely,
Matt Solnit


[GENERAL] Libpq binary mode SELECT ... WHERE ID IN ($1) Question

2009-11-12 Thread Matt Sanchez
Hello,

Suppose I have a table:
create table foo ( id int4, name varchar(50) );

I want to prepare and execute in binary mode:
select name from foo where id in ($1);

Execute works when I have a single value for $1, and treat it as a
normal INT4. However, when I attempt to send multiple values for the
parameter, I get no meaningful results.

My code snippets are below.

When setting up an array of numbers to pass as a parameter, is there
something special that needs to be done? The documentation is quite
vague; I did poke around the source and found in contrib some int_array
code, but don't fully understand it.

I suspect I need to do something like ...
(a) when I prepare, do something to tell postgres that I will have an
array of values, instead of a single value, and/or
(b) perhaps encapsulate the array of integers in some manner.

If I could be pointed to an example or documentation, it would be much
appreciated.

Thanks,
Matt Sanchez

The prepare code snippet:

Oid oids[1] = { 23 };   //INT4OID

result = PQprepare( pgconn, "getname",
"select name from foo where id in ($1)",
1, oids );

The execute code snippet:

int ids[4] = { 3, 5, 6, 8 };// param values we want to setup

/* convert numbers to network byte order ... */

char *  paramvalues[1] = { (char *) ids };
int paramlengths[1];
int paramformats[1] = { 1 };

paramlengths[0] = 4 * sizeof( int );

result = PQexecPrepared( pgconn,
"getname",  // statement name
1,  // number of params
paramvalues,
paramlengths,
paramformats,
1 );
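
(For what it's worth, a commonly used alternative, not something suggested in
this thread, is to pass all the ids as a single int4[] parameter and test
membership with ANY instead of IN. A minimal SQL-level sketch, with the array
supplied as a text-format parameter value:)

prepare getname (int4[]) as
select name from foo where id = any ($1);

execute getname ('{3,5,6,8}');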




[GENERAL] Postgresql Web Hosting

2009-09-29 Thread Matt Friedman
Hi,

I'm trying to migrate a site to a new hosting company. The backend
uses postgresql 8 and php.

Anyone have thoughts on decent hosting companies for this sort of
thing? I'm just looking at shared hosting as this isn't a resource
intensive site.

Thanks,
Matt



Re: [GENERAL] enabling join_collapse_limit for a single query only

2009-07-23 Thread Matt Harrison

Hi, and thanks for the reply.

We're issuing queries from within our application where transactions
span a whole unit of work, so wrapping a single query in a transaction
just to set the param isn't really feasible except on a separate
connection, which I am loath to do.


I was really hoping to be able to set join_collapse_limit=1 *just* on  
the single query, as a kind of query hint, eg:


/* !hint:join_collapse_limit=1  */ SELECT ...

I take it this is not possible in postgres?

cheers,
Matt h

On 23/07/2009, at 09:50, Albe Laurenz wrote:


groovefillet wrote:

Is it possible to set the runtime parameter 'join_collapse_limit' for
a single query only without setting/unsetting it before/after?


Yes:

START TRANSACTION;

SET LOCAL join_collapse_limit = 42;

SELECT .

COMMIT;

Yours,
Laurenz Albe





Re: [GENERAL] queries on xmin

2009-06-11 Thread Matt Amos
On Thu, Jun 11, 2009 at 2:48 PM, Marko Kreen wrote:
> On 6/11/09, Matt Amos  wrote:
>> On Thu, Jun 11, 2009 at 1:13 PM, Brett Henderson wrote:
>>  >> See pgq.batch_event_sql() function in Skytools [2] for how to
>>  >> query txids between snapshots efficiently and without being affected
>>  >> by long transactions.
>>  >
>>  > I'll take a look.
>>
>> it was looking at the skytools stuff which got me thinking about using
>>  txids in the first place. someone on the osm-dev list had suggested
>>  using PgQ, but we weren't keen on the schema changes that would have
>>  been necessary.
>
> Except the trigger, PgQ does not need any schema changes?

i've been having a look and it seems to me that PgQ requires some
extra tables as well as the trigger. am i missing something?

PgQ might be a good solution, but i'm worried that after calling
pgq.finish_batch() the batch is released. this would mean it wouldn't
be possible to regenerate older files (e.g: a few days to a week) in
case something unexpected went wrong. it might not be a major problem,
though.

i think we could get the same functionality without the extra daemons,
by putting a trigger on those tables for insert and recording the
object id, version and 64-bit txid in another table. but if we're
going to alter the schema we might as well put the txid column
directly into those tables...
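
(For reference, a rough sketch of that trigger idea; all table, column and
function names here are illustrative, not from the thread:)

create table node_changes (id bigint, version int, txid bigint);

create function log_node_change() returns trigger as $$
begin
insert into node_changes values (new.id, new.version, txid_current());
return new;
end;
$$ language plpgsql;

create trigger node_change_log after insert on nodes
for each row execute procedure log_node_change();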

cheers,

matt



Re: [GENERAL] queries on xmin

2009-06-11 Thread Matt Amos
On Thu, Jun 11, 2009 at 1:13 PM, Brett Henderson wrote:
> Marko Kreen wrote:
> Sorry, I'm not sure what you're suggesting with txid_current().  We're
> currently using the |txid_current_snapshot|() method which returns us the
> maximum transaction id plus in-flight transactions.  We specifically exclude
> transactions that are in-flight from the query, then include them on
> subsequent queries when they have committed.

i think what marko is suggesting is a new column, e.g:

alter table X add column txid bigint default current_txid();

the current_txid() function returning an int8 which is the 32-bit txid
in the lower word and a 32-bit "epoch" counter in the higher word so
that it doesn't wrap-around.
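
(A quick sketch of the built-ins this maps onto, assuming the standard 8.3/8.4
names, where the function is spelled txid_current() rather than current_txid():)

alter table nodes add column txid bigint default txid_current();

-- snapshot helpers used when dumping between two points in time
select txid_current_snapshot();-- e.g. 123:456:130,177
select txid_snapshot_xmin(txid_current_snapshot()),
   txid_snapshot_xmax(txid_current_snapshot());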

>> See pgq.batch_event_sql() function in Skytools [2] for how to
>> query txids between snapshots efficiently and without being affected
>> by long transactions.
>
> I'll take a look.

it was looking at the skytools stuff which got me thinking about using
txids in the first place. someone on the osm-dev list had suggested
using PgQ, but we weren't keen on the schema changes that would have
been necessary.

cheers,

matt



Re: [GENERAL] queries on xmin

2009-06-11 Thread Matt Amos
On Thu, Jun 11, 2009 at 12:59 PM, Brett Henderson wrote:
> Greg Stark wrote:
>> Another option to consider would be including a boolean column
>> "dumped" defaulted to false. Then you could have a partial index on
>> the primary key or date "WHERE NOT dumped". Then when you dump you can
>> "SELECT FOR UPDATE * WHERE NOT dumped" and then when you're done
>> "UPDATE SET dumped = 't' ". Alternately you could use "UPDATE SET
>> dumped='t' WHERE NOT dumped RETURNING *" which is basically
>> equivalent.
>>
>
> I have a couple of hesitations with using this approach:
> 1. We can only run the replicator once.
> 2. We can only run a single replicator.
> 3. It requires write access to the db.

4. it requires a schema change to support this use case.

again, this isn't a massive issue. but for the time being we're at
least making an effort to not extend the schema beyond what the rails
code itself requires.
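
(For reference, a minimal sketch of the partial-index approach quoted above;
the table, column and index names are illustrative:)

alter table nodes add column dumped boolean not null default false;
create index nodes_undumped_idx on nodes (id) where not dumped;

-- one replication pass: grab the new rows and mark them in a single statement
update nodes set dumped = true where not dumped returning id, version;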

> 3 is also a major consideration, it makes everybody's life easier if we can
> avoid updates being made to the db by the replicator.

for safety's sake i think this makes a lot of sense.

> I hope I don't sound too negative.  My gut also tells me that what we're
> doing is not the "right" solution and I've had fairly similar arguments with
> Matt already :-)  But having spent some time playing with it I can't find
> any reason why it won't work, and from a performance point of view I suspect
> it will win ...

it seems right to me to use postgres' existing features to support
this. we've got pretty close to a working system without needing
schema changes, so it would be a shame if they turn out to be
necessary.

cheers,

matt



[GENERAL] queries on xmin

2009-06-11 Thread Matt Amos
the openstreetmap project (http://osm.org/) recently moved from using
mysql to postgres and we're trying to improve some of our tools using
the new functionality that postgres provides.

in particular, we are dumping changes to the database at short
intervals (currently every minute, hour and day [1,2]) so that 3rd
party sites can use this to keep up-to-date with the main database. it
previously worked by examining the timestamp of each modified element,
but this is no longer practical due to new features in the
openstreetmap API which can cause long-running transactions [3].

we've been working out a scheme based on taking txid_snapshots at
short intervals and dumping the new rows (due to the way it's
implemented, all edits are inserted rows) and querying xmin. the query
looks something like this:

select id,version from (nodes|ways|relations) where timestamp > (now()
- '1 hour'::interval) and xmin in (...)

and we build up the txid list from the two snapshots we're dumping
between on the client. however, we're finding that this becomes much
less efficient as the txid list becomes longer. in an effort to reduce
the query time we're looking to index the xmin column. it seems that
hash indexes are already supported on the txid type, but btree are not
[4].

the queries we're doing would usually be of the form "xmin in
previous_unfinished_txids or (xmin > previous_max_txid and xmin <=
current_max_txid and not in current_unfinished_txids)" except when
wrap-around occurs, so it would seem that a btree index would be
superior to building this list client-side and using a hash index.
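
(Roughly the shape of that filter, sketched against a bigint txid column of the
kind discussed earlier in the thread; the snapshot literals and names are
illustrative:)

-- rows not yet visible in snapshot A ('123:456:130') but visible in snapshot B
select id, version
from nodes
where txid >= txid_snapshot_xmin('123:456:130'::txid_snapshot)
and not txid_visible_in_snapshot(txid, '123:456:130'::txid_snapshot)
and txid_visible_in_snapshot(txid, '789:790:'::txid_snapshot);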

what problems are we going to create for ourselves if we create a
btree index on xmin casted to int4? would it be as efficient to use a
hash index, create a temporary table of txids that we're querying with
a hash index and do an explicit join? have i missed the point
entirely?

many thanks,

matt

[1] http://wiki.openstreetmap.org/wiki/Planet.osm/diffs
[2] http://wiki.openstreetmap.org/wiki/OsmChange
[3] 
http://wiki.openstreetmap.org/wiki/OSM_Protocol_Version_0.6#Diff_upload:_POST_.2Fapi.2F0.6.2Fchangeset.2F.23id.2Fupload
[4] http://archives.postgresql.org/pgsql-general/2004-10/msg01474.php



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
> No, explain analyze for the query that wouldn't execute before but now
> does, with, I assume, a large work_mem.  I'd like to see how it
> differes from the one with smaller work_mem.

Ah, I pasted that in an earlier email, sent February 10, 2009 9:58:00 AM
GMT+13:00... that plan was the one still using the 128MB of work_mem after
changing the overcommit_ratio to 80.

Regards,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
>> Yes... and indeed changing vm.overcommit_ratio to 80 does allow that
>> previously-failing query to execute successfully. Do you think this is
>> also what caused the out-of-memory error we saw today just when a
>> transaction was initiated?
>
> Curious, what's the explain analyze look like for that one?

Do you mean the transaction initiation? I'm not sure how to get an EXPLAIN
for that; the application never got to do anything. From the application
side it failed with out-of-memory while trying to open the connection. Or,
the most precise information I have is that in the JDBC driver, it failed at

Caused by: org.postgresql.util.PSQLException: FATAL: out of memory
at org.postgresql.core.v3.ConnectionFactoryImpl.readStartupMessages(ConnectionFactoryImpl.java:444)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:99)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:124)
at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:30)
at org.postgresql.jdbc3g.Jdbc3gConnection.<init>(Jdbc3gConnection.java:24)
at org.postgresql.Driver.makeConnection(Driver.java:386)
at org.postgresql.Driver.connect(Driver.java:260)

And the corresponding error in the Postgres log is

2009-02-09 13:52:13 CST lms_kia FATAL:  out of memory
TopMemoryContext: 53952 total in 6 blocks; 7048 free (7 chunks); 46904 used
  Relcache by OID: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  CacheMemoryContext: 91024 total in 4 blocks; 3856 free (9 chunks); 87168
used
  MdSmgr: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  LOCALLOCK hash: 8192 total in 1 blocks; 3744 free (0 chunks); 4448 used
  Timezones: 53584 total in 2 blocks; 3744 free (0 chunks); 49840 used
  ErrorContext: 8192 total in 1 blocks; 8160 free (4 chunks); 32 used
2009-02-09 13:52:29 CST lms_kia FATAL:  out of memory
2009-02-09 13:52:29 CST lms_kia DETAIL:  Failed on request of size 2456.





Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
> I don't think changing work_mem down is actually going to reduce the
> memory allocated without changing the plan to something less optimal.
> In the end, all of this is putting off the inevitable, if you get enough
> PGs going and enough requests and whatnot, you're going to start running
> out of memory again.  Same if you get larger data sets that take up more
> hash table space or similar.  Eventually you might need a bigger box,
> but let's try to get everything in the current box to at least be used
> first..

Yes... and indeed changing vm.overcommit_ratio to 80 does allow that
previously-failing query to execute successfully. Do you think this is
also what caused the out-of-memory error we saw today just when a
transaction was initiated?

Regards,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
> I'd do both.  But only after I'd reduced work_mem.  Given that
> reducing work_mem removed the problem, it looks to me like pgsql is
> requesting several large blocks of ram, then only using a small port
> of them.  But overcommit set to 2 means that the OS will not allow an
> overcommit of memory to these allocations, the allocations fail, and
> you get your error.

OK, I've re-configured work_mem, and set vm.overcommit_ratio to 80.
Without restarting Postgres, I was now able to run that big query posted
earlier in this thread that failed... so the overcommit_ratio adjustment
helped there. The EXPLAIN ANALYZE for that I've included below, where I
can see it did use in-memory sorting for some of the sorts.


                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Sort  (cost=2297842.92..2297943.76 rows=40336 width=234) (actual
time=370440.041..370483.133 rows=49317 loops=1)
   Sort Key: s.sale_date, s.vin
   Sort Method:  quicksort  Memory: 9357kB
   ->  Hash Left Join  (cost=1450365.72..2294757.26 rows=40336 width=234)
(actual time=293212.004..369857.956 rows=49317 loops=1)
 Hash Cond: (ml.lead_id = phone.lead_id)
 ->  Hash Left Join  (cost=1341348.27..2140418.89 rows=40336
width=219) (actual time=286374.489..362880.702 rows=49317
loops=1)
   Hash Cond: (ml.lead_id = email.lead_id)
   ->  Nested Loop Left Join  (cost=1237270.73..2000634.35
rows=40336 width=204) (actual time=278652.051..355022.014
rows=49317 loops=1)
 ->  Nested Loop Left Join 
(cost=1237270.73..1807035.53 rows=40336 width=141)
(actual time=278635.414..323774.871 rows=49317
loops=1)
   ->  Nested Loop Left Join 
(cost=1237270.73..1587513.22 rows=40336
width=126) (actual time=278632.427..317952.620
rows=49317 loops=1)
 ->  Nested Loop Left Join 
(cost=1237270.73..1414103.19 rows=40336
width=118) (actual
time=278606.034..297951.038 rows=49317
loops=1)
   ->  Merge Left Join 
(cost=1237270.73..1238315.78
rows=40336 width=89) (actual
time=278587.812..279498.796
rows=49317 loops=1)
 Merge Cond: (ml.lead_id =
lrdm.lead_id)
 ->  Sort 
(cost=46384.08..46484.92
rows=40336 width=78) (actual
time=6200.810..6240.526
rows=49317 loops=1)
   Sort Key: ml.lead_id
   Sort Method:  quicksort
 Memory: 8472kB
   ->  Hash Left Join 
(cost=9430.33..43298.42
rows=40336 width=78)
(actual
time=1079.869..6084.010
rows=49317 loops=1)
 Hash Cond:
((s.dealer_code)::text
=
(d.dealer_code)::text)
 ->  Hash Left
Join 
(cost=9094.04..42407.50
rows=40336
width=60) (actual
time=1074.170..5947.646
rows=49317
loops=1)
   Hash Cond:
(s.id =
ml.sale_id)
   Filter:
(((s.sale_type
=
'd'::bpchar)
AND (NOT
ml.lost_sale))
OR
((s.sale_type
=
'c'::bpchar)
AND
(ml.lead_pos
= 0)) OR
(s.sale_type
=
'0'::bpchar))
   ->  Index
Scan using
mb_sale_sale_date_idx
on mb_sale
s 
(cost=0.00..14027.94
rows=43804
width=50)
(actual
time=55.663..4683.901
rows=49152
loops=1)
 Index
Cond:
((sale_date
>=
'2009-01-01'::date)
AND
(sale_date
<=
'2009-01-31'::date))
   ->  Hash 
(cost=5577.35..5577.35
rows=281335
width=26)
(actual
time=1018.108..1018.108
rows=281335
loops=1)
 -> 
Seq
Scan
on
mb_lead
ml 
(cost=0.00..5577.35
rows=281335
width=26)
(actual
time=3.451..516.245
rows=281335
loops=1)
 ->  Hash 
(cost=321.13..321.13
rows=1213
width=23) (actual
time=5.577..5.577
rows=1213
loops=1)
   ->  Seq
Scan on
dealer d 
(cost=0.00..321.13
rows=1213
width=23)
(actual
time=0.056..3.684
rows=1213
loops=1)
 ->  Sort 
(cost=1190886.66..1191208.43
rows=128709 width=19) (actual
time=270075.460..271851.519
rows=1442527 loops=1)
   Sort Key: lrdm.lead_id
   Sort Method:  external
sort  Disk: 56072kB
   

Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
> with 100 concurrent postgres connections, if they all did something
> requiring large amounts of work_mem, you could allocate 100 * 125MB (I
> believe that's what you said it was set to?) which is like 12GB :-O
>
> in fact a single query that's doing multiple sorts of large datasets for
> a messy join (or other similar activity) can involve several instances
> of work_mem.  multiply that by 100 queries, and ouch.
>
> have you considered using a connection pool to reduce the postgres
> process count?

We do have a connection pool here, and as I mentioned about a quarter of
these are Slony-controlled processes for replication. Most connections are
not doing complex queries of this sort, in fact when this query runs it is
the only query running, most connections are idle (kept open by a
connection pool) or performing short insert/update operations.

But I see your point about the large work_mem, it was set that high to
help speed up big queries such as this one.

-- m@



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
> I suspect this may be it...  Apparently, while you're only using about
> 2G, you've got 10G or so of outstanding commitments, and Linux is
> refusing to allocate more.
>
> You probably want to up your overcommit_ratio, esp. in light of the fact
> that you've only got 2G of swap on this box.  I'd probably say up it to
> 80, which would give you 14.8G of commitable memory, leaving some room
> in-memory (1.2G) for cache/buffers and whatnot.  Alternativly, you could
> go for 90, which would allow commits up to 16.4G, so if everyone used
> all their memory, you'd be into swap.

Also, by adjusting this, would I possibly just be delaying the problem we
currently have (i.e. over time, we start to run out of memory)? I just
wonder why the system is reaching this limit at all... do you feel it is
quite normal for a system with this memory configuration to be configured
with the ratio set to 80? I'm not terribly familiar with these VM
parameters, so I apologize if I sound vague.

Regards,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
> * Matt Magoffin (postgresql@msqr.us) wrote:
>> [r...@170226-db7 ~]# cat /proc/meminfo
>> CommitLimit:  10312588 kB
>> Committed_AS:  9760756 kB
>
> I suspect this may be it...  Apparently, while you're only using about
> 2G, you've got 10G or so of outstanding commitments, and Linux is
> refusing to allocate more.
>
> You probably want to up your overcommit_ratio, esp. in light of the fact
> that you've only got 2G of swap on this box.  I'd probably say up it to
> 80, which would give you 14.8G of commitable memory, leaving some room
> in-memory (1.2G) for cache/buffers and whatnot.  Alternativly, you could
> go for 90, which would allow commits up to 16.4G, so if everyone used
> all their memory, you'd be into swap.

Thanks for the advice. Should we have more than 2GB of swap available? I
thought the goal for a Postgres system was to avoid swap use at all cost?
Would it be better for us to add more swap, or adjust this
overcommit_ratio as you discuss?

Regards,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
ss_relname_nsp_index: 3072 total in 2 blocks; 1600 free (2
chunks); 1472 used
pg_attribute_relid_attnam_index: 3072 total in 2 blocks; 1648 free (2
chunks); 1424 used
pg_class_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376
used
pg_amproc_fam_proc_index: 3072 total in 2 blocks; 1384 free (2
chunks); 1688 used
pg_operator_oprname_l_r_n_index: 3072 total in 2 blocks; 1384 free (2
chunks); 1688 used
pg_index_indexrelid_index: 3072 total in 2 blocks; 1696 free (2
chunks); 1376 used
pg_type_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376
used
pg_rewrite_rel_rulename_index: 3072 total in 2 blocks; 1648 free (2
chunks); 1424 used
pg_authid_rolname_index: 3072 total in 2 blocks; 1696 free (2 chunks);
1376 used
pg_auth_members_member_role_index: 3072 total in 2 blocks; 1648 free
(2 chunks); 1424 used
pg_enum_typid_label_index: 3072 total in 2 blocks; 1648 free (2
chunks); 1424 used
pg_constraint_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks);
1328 used
pg_conversion_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks);
1328 used
pg_ts_template_tmplname_index: 3072 total in 2 blocks; 1648 free (2
chunks); 1424 used
pg_ts_config_map_index: 3072 total in 2 blocks; 1624 free (3 chunks);
1448 used
pg_namespace_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks);
1376 used
pg_type_typname_nsp_index: 3072 total in 2 blocks; 1600 free (2
chunks); 1472 used
pg_operator_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks);
1376 used
pg_amop_opr_fam_index: 3072 total in 2 blocks; 1600 free (2 chunks);
1472 used
pg_proc_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376
used
pg_opfamily_am_name_nsp_index: 3072 total in 2 blocks; 1624 free (3
chunks); 1448 used
pg_ts_template_oid_index: 3072 total in 2 blocks; 1744 free (3
chunks); 1328 used
  MdSmgr: 8192 total in 1 blocks; 6752 free (0 chunks); 1440 used
  LOCALLOCK hash: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used
  Timezones: 53584 total in 2 blocks; 3744 free (0 chunks); 49840 used
  ErrorContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used

Does this provide any useful information? I have other queries that are
failing as well, and I can provide explain output for those if it might
help.

Regards,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
> Err, yea, overcommit_memory=2, a small amount of swap space, and a low
> overcommit_ratio could cause this to happen...  The default
> ratio is 50 though, which should mean, on this system, there is about
> 10G available for user processes, but his usage shows only a bit over 2G
> being used outside of buffers/caches (based on free output)..
>
> Matt, can you provide the output from these:
> cat /proc/sys/vm/overcommit_memory
>
> cat /proc/sys/vm/overcommit_ratio
>
> cat /proc/meminfo

Sure, here you go:

[r...@170226-db7 ~]# cat /proc/sys/vm/overcommit_memory
2

[r...@170226-db7 ~]# cat /proc/sys/vm/overcommit_ratio
50

[r...@170226-db7 ~]# cat /proc/meminfo
MemTotal: 16432296 kB
MemFree:160188 kB
Buffers:164764 kB
Cached:   14006456 kB
SwapCached:  0 kB
Active:6984144 kB
Inactive:  8471456 kB
HighTotal:   0 kB
HighFree:0 kB
LowTotal: 16432296 kB
LowFree:160188 kB
SwapTotal: 2096440 kB
SwapFree:  2095992 kB
Dirty: 540 kB
Writeback:   0 kB
AnonPages: 1270076 kB
Mapped:4258192 kB
Slab:   211448 kB
PageTables: 559248 kB
NFS_Unstable:0 kB
Bounce:  0 kB
CommitLimit:  10312588 kB
Committed_AS:  9760756 kB
VmallocTotal: 34359738367 kB
VmallocUsed:263556 kB
VmallocChunk: 34359474475 kB
HugePages_Total: 0
HugePages_Free:  0
HugePages_Rsvd:  0
Hugepagesize: 2048 kB

Regards,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
> So did the backend crashed on this one, or just produced 'out of
> memory ' message ?

No crash, just the error message.

-- m@



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
>> We have 100+ postgres processes running, so for an individual process,
>> could the 1024 file limit be doing anything to this query? Or would I
>> see
>> an explicit error message regarding this condition?
>
> you would get one of "Open files rlimit 1024 reached for uid " in
> syslog (which you should checkout anyhow).

No... nothing like this in syslog.

Regards,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
> [r...@170226-db7 ~]# su -l postgres -c "ulimit -a"
> core file size  (blocks, -c) 0
> data seg size   (kbytes, -d) unlimited
> max nice(-e) 0
> file size   (blocks, -f) unlimited
> pending signals (-i) 139264
> max locked memory   (kbytes, -l) 32
> max memory size (kbytes, -m) unlimited
> open files  (-n) 1024
> pipe size(512 bytes, -p) 8
> POSIX message queues (bytes, -q) 819200
> max rt priority (-r) 0
> stack size  (kbytes, -s) 10240
> cpu time   (seconds, -t) unlimited
> max user processes  (-u) 139264
> virtual memory  (kbytes, -v) unlimited
> file locks  (-x) unlimited

I just noticed something: the "open files" limit lists 1024, which is the
default for this system. A quick count of open data files currently in use
by Postgres returns almost 7000, though.

[r...@170226-db7 ~]# lsof -u postgres |egrep
'(/pg_data|/pg_index|/pg_log)' |wc -l
6749

We have 100+ postgres processes running, so for an individual process,
could the 1024 file limit be doing anything to this query? Or would I see
an explicit error message regarding this condition?

Regards,
Matt





Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
> erp, that stinks.  Must be on an older kernel?  I've got it under (a
> Debian-built) 2.6.26.  I can't recall if there's another way to get
> limit info for an active process..  Could use Tom's suggestion of
> echo'ing ulimit -a out to a file somewhere during database start-up.

Yes, this is a RHEL 5 box with kernel 2.6.18... must not exist for this
ancient kernel :-)

Regards,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-09 Thread Matt Magoffin
> Yeah, I remember we found a few xml-related leaks based on your reports.
> However, there's not anything here to suggest that this query is
> straining the capabilities of a 64-bit system with lots o RAM.  It seems
> certain you're hitting some artificial process-size limit, and the only
> one I know about is ulimit.
>
> I wasn't aware of /proc//limits before, but now that I've heard
> of it, checking that for the postmaster and/or a backend seems like
> a great idea.

This doesn't seem to exist for any process on this box:

[r...@170226-db7 ~]# ls /proc/*/limit*
ls: /proc/*/limit*: No such file or directory

If this were a system-defined process-size limit, then should the query
still run out of memory after restarting Postgres? Most likely we'll have
to restart Postgres soon, and I'll retry this query after doing so. Based
on past experience, I'd expect the query to complete at that time.

From what we experience, Postgres seems to be slowly accumulating memory
in the fashion of a small memory leak and things start to fail with
out-of-memory errors after the server has been running for some time (e.g.
roughly 4-6 weeks). Restarting Postgres clears out the problems (after a
restart we can immediately run queries that were failing before the
restart)... but then the cycle starts again.

I just bring this up wondering if there is something possibly accumulating
within Postgres that isn't getting freed and might cause an out-of-memory
error like this in some way.

Regards,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-08 Thread Matt Magoffin
> Agreed, that ulimit isn't reflecting any such limit, but is that really
> the same environment the postmaster gets started in?  I wouldn't trust
> a system startup script to be launched in the same environment that a
> login shell gets.  You might try adding
>   ulimit -a >/tmp/something
> to the startup script to find out for sure.

Our startup script is calling

su -l postgres ...

So I thought the limits would be the same (from the -l flag)? I then tried
to mimic this with the following:

[r...@170226-db7 ~]# su -l postgres -c "ulimit -a"
core file size  (blocks, -c) 0
data seg size   (kbytes, -d) unlimited
max nice(-e) 0
file size   (blocks, -f) unlimited
pending signals (-i) 139264
max locked memory   (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files  (-n) 1024
pipe size(512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
max rt priority (-r) 0
stack size  (kbytes, -s) 10240
cpu time   (seconds, -t) unlimited
max user processes  (-u) 139264
virtual memory  (kbytes, -v) unlimited
file locks  (-x) unlimited

which I think should accurately reflect what the postmaster environment
should be seeing.

Regards,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-08 Thread Matt Magoffin
  
/lib64/libnss_files-2.5.so
2accc000-2e299000 r--p  08:07 5229018   
/usr/lib/locale/locale-archive
2e299000-2aabafc41000 rw-s  00:08 0 
/SYSV0063da81 (deleted)
7fff21fda000-7fff21fef000 rw-p 7fff21fda000 00:00 0 
[stack]
ff60-ffe0 ---p  00:00 0     
[vdso]

Hope this helps,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-08 Thread Matt Magoffin
> Hmm ... a gig here, a gig there, pretty soon you're talking about real
> memory?  He's got several sorts and hashes that are each taking over
> 100MB according to the memory context dump, so it seems impossible that
> it all fits into a strict 32-bit address space.  There's surely no harm
> in double-checking the executable's file type though.

I did reply to this in a different email in this thread, but just to be
safe, yes, I did verify Postgres is a 64-bit binary:

[postg...@170226-db7 ~]$ file /opt/lms/postgres-8.3_64/bin/postgres
/opt/lms/postgres-8.3_64/bin/postgres: ELF 64-bit LSB executable, AMD
x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses
shared libs), for GNU/Linux 2.6.9, not stripped

Regards,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-08 Thread Matt Magoffin
> I think it must be compiled 64-bit, or he'd not be able to get
> shared_buffers that high to start with.  However, it's possible that the
> postmaster's been started under a ulimit setting that constrains each
> backend to just a few hundred meg of per-process memory.

Here's the output of ulimit -a by the "postgres" user the database is
running under:

[postg...@170226-db7 ~]$ ulimit -a
core file size  (blocks, -c) 0
data seg size   (kbytes, -d) unlimited
max nice(-e) 0
file size   (blocks, -f) unlimited
pending signals (-i) 139264
max locked memory   (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files  (-n) 1024
pipe size(512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
max rt priority (-r) 0
stack size  (kbytes, -s) 10240
cpu time   (seconds, -t) unlimited
max user processes  (-u) 139264
virtual memory  (kbytes, -v) unlimited
file locks  (-x) unlimited

I think this means it does not have an artificial memory limit imposed,
but is there a specific setting beyond these I could check, do you think?

Regards,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-08 Thread Matt Magoffin
>>  total   used   free sharedbuffers
>> cached
>> Mem:  16432296   16273964 158332  0 173536
>> 14321340
>> -/+ buffers/cache:1779088   14653208
>> Swap:  20964405602095880
>
> That certainly looks fine..  And you've got 14G or so which should be
> available for this query.  Was this near the time the query was running?
> Could you give us what 'free' returns when the query is close to the
> out-of-memory error?  I'd expect the 2nd row under 'free' to be getting
> low for the allocation to fail.

It was near the time... here's free just before executing the query:

 total   used   free sharedbuffers cached
Mem:  16432296   16342056  90240  0 116868   14561200
-/+ buffers/cache:1663988   14768308
Swap:  20964404482095992

And then here is free close to the time the query runs out of memory (but
while the query is still running):

 total   used   free sharedbuffers cached
Mem:  16432296   16348260  84036  0  41344   14167384
-/+ buffers/cache:2139532   14292764
Swap:  20964404482095992

> Uhh..  I saw that your system was 64-bit, but is your PG process
> compiled as 64bit?  Maybe you're hitting an artificial 32-bit limit,
> which isn't exactly helped by your shared_buffers being set up so high
> to begin with?

Yes, it's compiled as 64-bit:

/opt/lms/postgres-8.3_64/bin/postgres: ELF 64-bit LSB executable, AMD
x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses
shared libs), for GNU/Linux 2.6.9, not stripped

We've had nagging memory-related issues with 8.3 that manifest themselves
like memory leaks... some posts I've made in the past have led to some
leaks getting fixed... but I've not been able to track down more specific
causes. It's just that over time Postgres seems to accumulate memory and
then we start to see out-of-memory errors like this one. Again, this query
could have nothing to do with the root cause, but this is just the example
I have to work with at the moment.

Regards,
Matt



Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-08 Thread Matt Magoffin
>> PostgreSQL 8.3.5 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
>> 4.1.2 20070626 (Red Hat 4.1.2-14)
>
> Does the result from 'free' look reasonable on this box?

I think so:

 total   used   free sharedbuffers cached
Mem:  16432296   16273964 158332  0 173536   14321340
-/+ buffers/cache:1779088   14653208
Swap:  20964405602095880


> How many PG processes are running, generally?

I see about 90 currently, of which 24 are Slony connections and the rest
are client connections.

> Do you see the PG process running this query growing to fill the
> available memory?  That query really
> didn't look like it was chewing up all that much memory to me..

Just running top, it does appear to chew through a fair amount of memory.
Here's a snapshot from top of the postgres process running this query,
from just before it ran out of memory:

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
 4486 postgres  18   0 4576m 3.6g 3.3g R   90 23.1   0:34.23 postgres:
postgres lms_nna [local] EXPLAIN

These values did start out low, for example the RES memory started in the
130MB range, then climbed to the 3.6GB you see here.

> Have
> you tried reducing your work_mem setting to see if that makes the errors
> go away?  It might cause a different plan which might be less efficient
> for some queries, not sure if you have anything which would be
> dramatically affected by that..Do the row counts in the plan match up
> with what you'd expect?  Can you provide output from 'explain analyze'
> on a successful run?

I set the work_mem to 2MB, and the query does actually complete (explain
analyze output below). So does this mean that the query simply uses too
much memory for sorting/joining, and that we'd have to either allocate enough
work_mem to let the query complete in memory, or use a smaller work_mem as
shown here to make the query fall back to slower disk-based sorting? The row
counts match what we'd expect from this query.
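
(For reference, that kind of per-run change doesn't need a restart; a minimal
sketch of scoping it to one session or one transaction:)

set work_mem = '2MB';   -- affects only the current session

-- or, scoped to a single transaction:
begin;
set local work_mem = '2MB';
-- run the EXPLAIN ANALYZE here
commit;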



                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Sort  (cost=2345301.47..2345402.31 rows=40336 width=234) (actual
time=203429.526..203599.840 rows=49317 loops=1)
   Sort Key: s.sale_date, s.vin
   Sort Method:  external merge  Disk: 4776kB
   ->  Hash Left Join  (cost=1469244.28..2337665.81 rows=40336 width=234)
(actual time=162362.791..202883.728 rows=49317 loops=1)
 Hash Cond: (ml.lead_id = phone.lead_id)
 ->  Hash Left Join  (cost=1349360.82..2159151.44 rows=40336
width=219) (actual time=155499.666..194941.423 rows=49317
loops=1)
   Hash Cond: (ml.lead_id = email.lead_id)
   ->  Nested Loop Left Join  (cost=1236733.28..180.90
rows=40336 width=204) (actual time=141890.479..186344.717
rows=49317 loops=1)
 ->  Nested Loop Left Join 
(cost=1236733.28..1806406.02 rows=40336 width=141)
(actual time=141890.419..166782.070 rows=49317
loops=1)
   ->  Nested Loop Left Join 
(cost=1236733.28..1586899.03 rows=40336
width=126) (actual time=141890.368..166124.253
rows=49317 loops=1)
 ->  Nested Loop Left Join 
(cost=1236733.28..1413543.83 rows=40336
width=118) (actual
time=141890.281..156284.612 rows=49317
loops=1)
   ->  Merge Left Join 
(cost=1236733.28..1237778.33
rows=40336 width=89) (actual
time=141890.184..143717.900
rows=49317 loops=1)
 Merge Cond: (ml.lead_id =
lrdm.lead_id)
 ->  Sort 
(cost=43356.21..43457.05
rows=40336 width=78) (actual
time=1722.385..1794.668
rows=49317 loops=1)
   Sort Key: ml.lead_id
   Sort Method:  external
merge  Disk: 4152kB
   ->  Hash Left Join 
(cost=11354.33..38475.05
rows=40336 width=78)
(actual
time=919.319..1578.556
rows=49317 loops=1)
 Hash Cond:
((s.dealer_code)::text
=
(d.dealer_code)::text)
 ->  Hash Left
Join 
(cost=11018.04..37584.13
rows=40336
width=60) (actual
time=914.936..1445.926
rows=49317
loops=1)
   Hash Cond:
(s.id =
ml.sale_id)
   Filter:
(((s.sale_type
=
'd'::bpchar)
AND (NOT
ml.lost_sale))
OR
((s.sale_type
=
'c'::bpchar)
AND
(ml.lead_pos
= 0)) OR
(s.sale_type
=
'0'::bpchar))
   ->  Index
Scan using
mb_sale_sale_date_idx

Re: [GENERAL] Out of memory on SELECT in 8.3.5

2009-02-08 Thread Matt Magoffin
> erm..  How much memory do you have in the system?

This system has 16GB of RAM, and Postgres is basically the only service
running on the box.

>> shared_buffers = 4000MB
>
> I hope you've got a fair bit more than 4G of memory if you're going to
> use 4G for shared buffers...  Once that memory is dedicated to shared
> buffers it's not going to be available for other usage.

Yep, we've got 16GB to work with here. I should have also mentioned the
architecture in my original post, sorry. SELECT version() returns this:

PostgreSQL 8.3.5 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.2 20070626 (Red Hat 4.1.2-14)

Regards,
Matt



[GENERAL] Out of memory on SELECT in 8.3.5

2009-02-08 Thread Matt Magoffin
We've been having persistent out-of-memory errors occur in our production
8.3 deployment, which is now running 8.3.5. I'm not sure the query here is
the cause of the problem, but this is our most-recent example which
triggered an out-of-memory error for us.

Perhaps our configuration needs tweaking somewhere...

shared_buffers = 4000MB
#temp_buffers = 8MB
work_mem = 128MB
maintenance_work_mem = 256MB
#max_stack_depth = 2MB

I was wondering if somebody could spot anything odd looking in the query
plan or the memory parameters dumped to the log that might be causing
this?

explain SELECT
s.sale_date,
s.vin,
s.dealer_region as sale_region,
s.dealer_area as sale_area,
s.dealer_code as sale_dealer,
d.name as sale_dealer_name,
s.model as sale_model,
s.sale_type,
ml.lead_id,
l.created,
lrd.reporting_date as lead_date,
ml.lead_pos as lead_order,
ml.lost_sale,
ml.range,
lrd.dealer_region as lead_region,
lrd.dealer_area as lead_area,
lrd.dealer_code as lead_dealer,
lrd.model as lead_model,
l.source as lead_source,
lrd.oem_type as lead_type,
lrd.oem_subtype as lead_subtype,
lrd.unique_all as lead_unique_all,
lrd.unique_region as lead_unique_region,
lrd.unique_area as lead_unique_area,
lrd.unique_dealer as lead_unique_dealer,
lrd.customer_fname,
lrd.customer_lname,
lrdm.item_value as lead_tracking_code,
address_street.street as owner_street,
address.city as owner_city,
address.state as owner_state,
address.postal_code as owner_postal_code,
address.country as owner_country,
email.list_value as owner_email,
phone.list_value as owner_phone

FROM mb_sale s
LEFT OUTER JOIN dealer d on d.dealer_code = s.dealer_code
LEFT OUTER JOIN mb_lead ml on ml.sale_id = s.id
LEFT OUTER JOIN lead_reporting_data lrd ON lrd.lead_id = ml.lead_id
LEFT OUTER JOIN lead_reporting_meta lrdm
ON lrdm.lead_id = ml.lead_id
AND lrdm.item_key = '[ALS:prospectid]TrackingCode'
AND lrdm.pos=1
LEFT OUTER JOIN lead l ON l.id = ml.lead_id

LEFT OUTER JOIN lead_reporting_address address ON address.lead_id =
ml.lead_id
LEFT OUTER JOIN lead_reporting_street address_street
ON address_street.address_id = address.id AND 
address_street.pos = 0

LEFT OUTER JOIN lead_reporting_list_data phone
ON phone.lead_id = ml.lead_id AND phone.list_type = 'p' AND 
phone.pos = 0

LEFT OUTER JOIN lead_reporting_list_data email
ON email.lead_id = ml.lead_id AND email.list_type = 'e' AND 
email.pos = 0

WHERE
((s.sale_type = 'd' and ml.lost_sale = FALSE) OR (s.sale_type = 
'c' and
ml.lead_pos = 0) OR (s.sale_type = '0'))
AND s.sale_date BETWEEN date('2009-01-01') AND 
date('2009-01-31')
ORDER BY s.sale_date, s.vin

                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Sort  (cost=2297525.76..2297626.60 rows=40336 width=234)
   Sort Key: s.sale_date, s.vin
   ->  Hash Left Join  (cost=1450193.63..2294440.10 rows=40336 width=234)
 Hash Cond: (ml.lead_id = phone.lead_id)
 ->  Hash Left Join  (cost=1341176.18..2140101.73 rows=40336
width=219)
   Hash Cond: (ml.lead_id = email.lead_id)
   ->  Nested Loop Left Join  (cost=1237098.64..2000317.19
rows=40336 width=204)
 ->  Nested Loop Left Join 
(cost=1237098.64..1806747.26 rows=40336 width=141)
   ->  Nested Loop Left Join 
(cost=1237098.64..1633400.88 rows=40336
width=133)
 ->  Nested Loop Left Join 
(cost=1237098.64..1413899.40 rows=40336
width=118)
   ->  Merge Left Join 
(cost=1237098.64..1238143.69
rows=40336 width=89)
 Merge Cond: (ml.lead_id =
lrdm.lead_id)
 ->  Sort 
(cost=46384.08..46484.92
rows=40336 width=78)
   Sort Key: ml.lead_id
   ->  Hash Left Join 
(cost=9430.33..43298.42
rows=40336 width=78)
 Hash C
