Re: [GENERAL] Linux vs FreeBSD

2014-04-11 Thread Achilleas Mantzios

Basically it goes beyond what people would describe as OS holy wars.
If one chooses to go with FreeBSD, then he had better be prepared to handle the
burden: both the part that is imposed by the OS administration itself, and the
part that is a side effect of the different base system.

Example of the admin part:
Generally, compiling PostgreSQL from source gives more freedom than being stuck
with the OS's ports or PKGng system (the latter being a very handy and welcome
addition to FreeBSD).
Now what if, e.g., the user wants pgsql software X (e.g. pgadmin3, p5-Pg, etc.),
only to find out that most of those ports need the postgresql client as a
dependency? He/she must be prepared to work his way through:
- manual installations (gmake config && gmake && gmake install)
- /usr/ports
- PKG binary installations
in decreasing order of freedom but increasing order of ease, and in many cases
work through a combination of the above.

Example of the base system part:
Recently I had to install PL/Java on my FreeBSD workstation. There was a
problem with libthr: PostgreSQL had to be recompiled with -lpthread explicitly
set in /usr/local/src/postgresql-9.3.4/src/backend/Makefile; without this the
backend would simply hang at the very first invocation of a Java function. This
came after detailed follow-up and email exchanges with various hacker groups on
both the pgsql and FreeBSD lists, to describe the issue as accurately as
possible, to help debug as much as possible, to talk to the right people, to
give them an incentive to answer back, etc.

I don't mean to scare the OP, but FreeBSD is not for everyone.

On 11/04/2014 00:50, Jan Wieck wrote:

On 04/10/14 17:25, Christofer C. Bell wrote:

I'm not wanting to get after anyone here, but I want it on the record
that I am not the source of the above quote discouraging the use of
Ubuntu in a server role.  That would be Bruce Momjian. While Bruce is
entitled to his opinion, it's not one I agree with and I don't want a
Google search years from now to tie my name to that viewpoint.


Who (in their right mind) would ever think of anything but BSD in a server role?

shaking head


Jan




--
Achilleas Mantzios
Head of IT DEV
IT DEPT
Dynacom Tankers Mgmt





[GENERAL] Re: [postgis-users] design question: how to geocode multiple dynamic city, country?

2014-04-11 Thread Rémi Cura
Hey,
my 2 cents :
If you give write access, beware of SQL injection:
http://en.wikipedia.org/wiki/SQL_injection

Cheers,
Rémi-C
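
For the write-access part, one hedged sketch (all names invented for
illustration) is to keep user-supplied locations in their own schema and expose
a single SECURITY DEFINER function as the only write path, so the web role
never gets INSERT rights on the table itself and the values always arrive as
parameters rather than as concatenated SQL:

CREATE SCHEMA user_input;
CREATE TABLE user_input.locations (
    id       serial PRIMARY KEY,
    city     text NOT NULL,
    country  text NOT NULL,
    added_at timestamptz NOT NULL DEFAULT now()
);

CREATE FUNCTION user_input.add_location(p_city text, p_country text)
RETURNS integer AS $$
    INSERT INTO user_input.locations (city, country)
    VALUES (p_city, p_country)
    RETURNING id;
$$ LANGUAGE sql SECURITY DEFINER;

REVOKE ALL ON SCHEMA user_input FROM public;
GRANT USAGE ON SCHEMA user_input TO web_user;    -- web_user is hypothetical
GRANT EXECUTE ON FUNCTION user_input.add_location(text, text) TO web_user;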



2014-04-10 21:48 GMT+02:00 zach cruise zachc1...@gmail.com:

 i accept multiple city, country from users on-the-fly, and want to
 dynamically map them.

 i could create a table where i insert their multiple entries, and then
 geocode that table for display.

 but i also want to avoid giving write permission to the web user.

 i could create a schema and restrict write to that schema.

 or something better?
 ___
 postgis-users mailing list
 postgis-us...@lists.osgeo.org
 http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users



[GENERAL] Correct syntax

2014-04-11 Thread Victor Sterpu

How would I write such a query?
SELECT to_timestamp('10-10-2013 15:00', 'DD-MM-YYYY HH24:MI') +
interval REPLACE('1.30', '.', ':')||' hours'

This gives error at REPLACE.
Thank you.


Re: [GENERAL] Correct syntax

2014-04-11 Thread Vik Fearing
On 04/11/2014 10:58 AM, Victor Sterpu wrote:
 How would I write such a query?
 SELECT to_timestamp('10-10-2013 15:00', 'DD-MM-YYYY HH24:MI') +
 interval REPLACE('1.30', '.', ':')||' hours'
 This gives error at REPLACE.
 Thank you.

The way you have interval, it expects a constant.  You need to cast your
expression like this:

SELECT to_timestamp('10-10-2013 15:00', 'DD-MM-YYYY HH24:MI') +
(REPLACE('1.30', '.', ':')||' hours')::interval
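
If the '1.30' notation always means hours.minutes, a variant (just a sketch,
not tested against your data) can drop the ' hours' suffix, since 'H:MM'
already parses as an interval of hours and minutes:

SELECT to_timestamp('10-10-2013 15:00', 'DD-MM-YYYY HH24:MI')
       + REPLACE('1.30', '.', ':')::interval;   -- '1:30'::interval = 01:30:00

which should give 2013-10-10 16:30:00 in your session time zone.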

-- 
Vik



Re: [GENERAL] Linux vs FreeBSD

2014-04-11 Thread Alban Hertroys
On 11 Apr 2014, at 8:04, Achilleas Mantzios ach...@matrix.gatewaynet.com 
wrote:

 Basically it goes beyond what ppl would describe as OS holly wars.
 If one chooses to go by FreeBSD, then he better be prepared to handle the 
 burden, both the part that is
 imposed by the OS administration itself, as well as the part that is a side 
 effect of the different base system.
 
 Example of admin part :
 Generally, compiling postgresql from source gives more freedom than be stuck 
 on the OS's ports or PKGng
 system. (the later being a very handy and welcome addition to FreeBSD).
 Now what if e.g. the user wants pgsql software X (e.g. pgadmin3, p5-Pg, 
 etc...) only to find out that most of those
 ports need postgresql client as a dependency. He/she must be prepared to work 
 his way through :
 - manual installations (gmake config && gmake && gmake install)
 - /usr/ports
 - PKG binary installations
 in decreasing order of freedom but increasing order of easiness, and in many 
 cases work through a combination
 of the above.

That argument holds for any package system on any OS I know of. Once you start 
custom compiling things outside the control of the package management system, 
you’re on your own.

Custom compiling may give more freedom, but it’s hardly ever necessary on 
FreeBSD. For example, the only ports that I ever had to custom compile were 
ports for software I was developing, which of course no package management 
system can keep track of.

In general, the various options the port Makefile provides for customisation 
are quite sufficient. It’s a plus to the ports system that you get any options 
at all.

 Example of base system part :
 Recently I had to install pl-java on my FreeBSD workstation. There was a 
 problem with libtrh, postgresql should be recompiled
 with explicitly setting : -lpthread in 
 /usr/local/src/postgresql-9.3.4/src/backend/Makefile, without this the 
 backend would simply hang
 at the very first invocation of a java function. This came after detailed 
 following or email exchange of various hackers groups
 in both pgsql and FreeBSD lists, to describe the issue as accurately as 
 possible, to help debug as most as possible, to talk
 to the right people, to give them incentive to answer back, etc.

It seems to me that the reason you were custom compiling Postgres in the first 
place was a problem with the port. I’m sure tracking down the problem wasn’t 
easy, but that is not really relevant to the topic. Ports break sometimes (on 
any OS) and it would have been sufficient to contact the port maintainer about 
the issue.

For a quick (temporary) fix, you could probably have fixed the port by editing 
the port Makefile. With that, there’s no reason anymore to “custom compile” 
postgres and it leaves the dependency tracking of the port in place. Editing 
Makefiles is indeed not for everyone, but at least you _can_ do that on 
FreeBSD. Not every package management system will let you do that.

And yes, I have edited Makefiles, although the need hasn’t arisen recently.

 I don't mean to scare the OP, but FreeBSD is not for everyone.

And that (again) could be said about any OS. Even Windows or OS X.
It depends on what you intend to use it for and what prior experience, 
preconceptions and expectations you might have.

Oh, and please try not to top-post when replying on this list.

 On 11/04/2014 00:50, Jan Wieck wrote:
 On 04/10/14 17:25, Christofer C. Bell wrote:
 I'm not wanting to get after anyone here, but I want it on the record
 that I am not the source of the above quote discouraging the use of
 Ubuntu in a server role.  That would be Bruce Momjian. While Bruce is
 entitled to his opinion, it's not one I agree with and I don't want a
 Google search years from now to tie my name to that viewpoint.
 
 Who (in their right mind) would ever think of anything but BSD in a server 
 role?
 
 shaking head
 
 
 Jan

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.





Re: [GENERAL] Linux vs FreeBSD

2014-04-11 Thread Achilleas Mantzios

On 11/04/2014 13:05, Alban Hertroys wrote:

On 11 Apr 2014, at 8:04, Achilleas Mantzios ach...@matrix.gatewaynet.com 
wrote:


Basically it goes beyond what ppl would describe as OS holly wars.
If one chooses to go by FreeBSD, then he better be prepared to handle the 
burden, both the part that is
imposed by the OS administration itself, as well as the part that is a side 
effect of the different base system.

Example of admin part :
Generally, compiling postgresql from source gives more freedom than be stuck on 
the OS's ports or PKGng
system. (the later being a very handy and welcome addition to FreeBSD).
Now what if e.g. the user wants pgsql software X (e.g. pgadmin3, p5-Pg, etc...) 
only to find out that most of those
ports need postgresql client as a dependency. He/she must be prepared to work 
his way through :
- manual installations (gmake config && gmake && gmake install)
- /usr/ports
- PKG binary installations
in decreasing order of freedom but increasing order of easiness, and in many 
cases work through a combination
of the above.

That argument holds for any package system on any OS I know of. Once you start 
custom compiling things outside the control of the package management system, 
you’re on your own.

I am not against FreeBSD in any way; as a matter of fact I have been striving
for about 20 years to keep it alive at least in my working environment, as my
primary development workstation.

Custom compiling may give more freedom, but it’s hardly ever necessary on 
FreeBSD. For example, the only ports that I ever had to custom compile were 
ports for software I was developing, which of course no package management 
system can keep track of.

Try to install/setup PgSQL-backed openldap with unixODBC when your KDE has 
iodbc as a prerequisite.
Or try to install pljava, for which of course no OS port/OS package/PgSQL 
extension exists, yet.
Good luck with any of those.

In general, the various options the port Makefile provides for customisation 
are quite sufficient. It’s a plus to the ports system that you get any options 
at all.


Example of base system part :
Recently I had to install pl-java on my FreeBSD workstation. There was a 
problem with libtrh, postgresql should be recompiled
with explicitly setting : -lpthread in 
/usr/local/src/postgresql-9.3.4/src/backend/Makefile, without this the backend 
would simply hang
at the very first invocation of a java function. This came after detailed 
following or email exchange of various hackers groups
in both pgsql and FreeBSD lists, to describe the issue as accurately as 
possible, to help debug as most as possible, to talk
to the right people, to give them incentive to answer back, etc.

It seems to me that the reason you were custom compiling Postgres in the first 
place was a problem with the port. I’m sure tracking down the problem wasn’t 
easy, but that is not really relevant to the topic. Ports break sometimes (on 
any OS) and it would have been sufficient to contact the port maintainer about 
the issue.

No, I wasn't compiling PostgreSQL from the standard distribution because of a
problem with the port (although the port had the exact same behavior).
I always run PostgreSQL compiled by hand, since I see no reason to sacrifice my
peace of mind for the short-lived joy of going with the ports or PKGng system.
As a matter of fact, PostgreSQL is among the few software packages for which I
would strongly advise against using ports or pkgs of any kind. That might work
in Debian; I would not risk it in FreeBSD.

For a quick (temporary) fix, you could probably have fixed the port by editing 
the port Makefile. With that, there’s no reason anymore to “custom compile” 
postgres and it leaves the dependency tracking of the port in place. Editing 
Makefiles is indeed not for everyone, but at least you _can_ do that on 
FreeBSD. Not every package management system will let you do that.

Sure, but the way to do this is not by simply editing a Makefile, but by
writing an extra patch inside /usr/ports/databases/postgresql93-server/files/,
which is more of a burden than a convenience.

And yes, I have edited Makefiles, although the need hasn’t risen recently.

With plain vanilla ports it is rarely needed.

I don't mean to scare the OP, but FreeBSD is not for everyone.

And that (again) could be said about any OS. Even Windows or OS X.
It depends on what you intend to use it for and what prior experience, 
preconceptions and expectations you might have.

Playing with words aside, going with FreeBSD is not for the average Ubuntu user.


Oh, and please try not to top-post when replying on this list.

I did, just for this message, because I did not feel it was appropriate to
quote anything that the previous poster wrote.

On 11/04/2014 00:50, Jan Wieck wrote:

On 04/10/14 17:25, Christofer C. Bell wrote:

I'm not wanting to get after anyone here, but I want it on the record
that I am not the source of the above quote discouraging the use of
Ubuntu in a server role.  That would be Bruce 

Re: [GENERAL] Linux vs FreeBSD

2014-04-11 Thread Alban Hertroys
On 11 Apr 2014, at 12:39, Achilleas Mantzios ach...@matrix.gatewaynet.com 
wrote:

I moved this bit of the conversation up as it’s relevant to the OP:

 On 11/04/2014 13:05, Alban Hertroys wrote:
 On 11 Apr 2014, at 8:04, Achilleas Mantzios ach...@matrix.gatewaynet.com 
 wrote:

 I don't mean to scare the OP, but FreeBSD is not for everyone.
 And that (again) could be said about any OS. Even Windows or OS X.
 It depends on what you intend to use it for and what prior experience, 
 preconceptions and expectations you might have.
 Playing with words aside, going with FreeBSD is not for the average Ubuntu 
 user.

(Words are fun!)

That was actually what I was getting at. A user coming from Ubuntu will have 
some preconceived ideas about how things work (because that’s how Ubuntu does 
things) that get in their way when moving to a different OS where they 
sometimes work just slightly differently.

My advice to the OP:

Install FreeBSD on a system to play around with, get a feel for how it works 
and whether you like it or not.
See how it performs with Postgres on different file-systems; UFS2 or ZFS - UFS 
is the faster of the two, but ZFS makes up for that big time in maintainability 
if you provide it with enough memory.
If you require locale-specific collations (native language specific sort 
ordering), check that it does what you expect.

If you instead put your database on FreeBSD straight away, you’ve suddenly made 
yourself responsible for a system that you do not know how to maintain or tune 
properly.

There are solutions to that of course, such as hiring someone to maintain it 
for you or to educate you in how to maintain it yourself (which might not be a 
bad idea for your current situation with Ubuntu either).

 Example of base system part :
 Recently I had to install pl-java on my FreeBSD workstation. There was a 
 problem with libtrh, postgresql should be recompiled
 with explicitly setting : -lpthread in 
 /usr/local/src/postgresql-9.3.4/src/backend/Makefile, without this the 
 backend would simply hang
 at the very first invocation of a java function. This came after detailed 
 following or email exchange of various hackers groups
 in both pgsql and FreeBSD lists, to describe the issue as accurately as 
 possible, to help debug as most as possible, to talk
 to the right people, to give them incentive to answer back, etc.
 It seems to me that the reason you were custom compiling Postgres in the 
 first place was a problem with the port. I’m sure tracking down the problem 
 wasn’t easy, but that is not really relevant to the topic. Ports break 
 sometimes (on any OS) and it would have been sufficient to contact the port 
 maintainer about the issue.
 No, i wasn't compiling postgresql from standard distribution because of a 
 problem with the port.
 (although the port had the same exact behavior)
 I always run postgresql compiled by hand, since I see no reason to sacrifice 
 my peace of mind
 for a short-lived joy going with the ports or PKGng system.
 As a matter of fact, PostgreSQL is among the few software packages that i 
 would advice strongly
 against using ports or pkgs of any kind. Might work in Debian. Would not risk 
 this in FreeBSD.

Although it is getting a bit specific, would you care to elaborate on why you
would advise so strongly against using ports or packages for Postgres on FreeBSD?
Because that’s a rather strong statement you’re making and so far the only 
argument I’ve seen is that there is no port for pl/java. I’m curious as to why 
you are so strongly set on custom-compiling Postgres.

BTW, isn’t the usual solution to a missing port to create your own (local) 
port? I can’t claim I have ever done that (never needed to), but apparently 
that’s the way to go about it. The obvious benefit is that it will fit in with 
the package management system, while you could even provide the port to others 
if you’d be willing to take responsibility for maintaining that port.

 And yes, I have edited Makefiles, although the need hasn’t risen recently.
 With plain vanilla ports it is rarely needed.

True, that’s usually only necessary in the rare case that a port isn’t 
compiling or when you’re stubborn about not wanting certain dependencies (I 
know someone who used to be dead-set against gettext, for example).

 Oh, and please try not to top-post when replying on this list.
 I did just for this message, because i did not feel appropriate to quote 
 anything that the previous
 poster wrote.

Fair enough.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.





[GENERAL] efficient way to do fuzzy join

2014-04-11 Thread Rémi Cura
Hey dear List,

I'm looking for some advice about the best way to perform a fuzzy join,
that is, joining two tables based on approximate matching.

It is about temporal matching:
given a table A with rows containing data and a control_time (for instance
1; 5; 6; ... sec, not necessarily rounded or evenly spaced)

given another table B with rows on no precise timing (e.g. control_time =
2.3; 5.8; 6.2 for example)

How do I join every row of B to A based on
min(@(A.control_time-B.control_time))
(that is, for every row of B, get the row of A that is temporally the
closest),
in an efficient way?
(to be explicit, 2.3 would match to 1, 5.8 to 6, 6.2 to 6)

Optionally, how do I get interpolation efficiently (meaning one has to get
the previous time and next time for 1st-order interpolation, 2 before and
2 after for 2nd-order interpolation, and so on)?
(to be explicit, 5.8 would match to 5 and 6, the weights being 0.2 and 0.8
respectively)


Currently my data is spatial, so I use a PostGIS function to interpolate a
point on a line, but it is far from efficient or general, and I don't have
control over the interpolation (only the spatial values are interpolated).


Cheers,
Rémi-C


Re: [GENERAL] Linux vs FreeBSD

2014-04-11 Thread Steve Litt
On Fri, 11 Apr 2014 14:05:43 +0200
Alban Hertroys haram...@gmail.com wrote:

 My advice to the OP:
 
 Install FreeBSD on a system to play around with, get a feel for how
 it works and whether you like it or not. See how it performs with
 Postgres on different file-systems; UFS2 or ZFS - UFS is the faster
 of the two, but ZFS makes up for that big time in maintainability if
 you provide it with enough memory. If you require locale-specific
 collations (native language specific sort ordering), check that it
 does what you expect.

Curious: Why not consider OpenBSD also?

SteveT

Steve Litt*  http://www.troubleshooters.com/
Troubleshooting Training  *  Human Performance




Re: [GENERAL] Linux vs FreeBSD

2014-04-11 Thread D'Arcy J.M. Cain
On Fri, 11 Apr 2014 09:16:04 -0400
Steve Litt sl...@troubleshooters.com wrote:
 Curious: Why not consider OpenBSD also?

Or NetBSD.

-- 
D'Arcy J.M. Cain da...@druid.net |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 788 2246 (DoD#0082)(eNTP)   |  what's for dinner.
IM: da...@vex.net, VoIP: sip:da...@druid.net




Re: [GENERAL] Linux vs FreeBSD

2014-04-11 Thread Achilleas Mantzios

On 11/04/2014 15:05, Alban Hertroys wrote:
Although it is getting a bit specific, would you care to elaborate why you would advice strongly against using ports or packages for Postgres on FreeBSD? Because that’s a rather strong statement 
you’re making and so far the only argument I’ve seen is that there is no port for pl/java. I’m curious as to why you are so strongly set on custom-compiling Postgres. BTW, isn’t the usual solution 
to a missing port to create your own (local) port? I can’t claim I have ever done that (never needed to), but apparently that’s the way to go about it. The obvious benefit is that it will fit in 
with the package management system, while you could even provide the port to others if you’d be willing to take responsibility for maintaining that port. 


PL/Java has nothing to do with this. The argument against using packages/ports
for PostgreSQL upgrades is that upgrades in general involve:
- reading HISTORY thoroughly and understanding every bit of it, especially the
migration part and the changes part
- backing up the current database
- installing the new binaries
- running pg_upgrade
- solving the problems that pg_upgrade detects and trying again
- testing your in-house C/Java/etc. functions
- testing your whole app + utilities against the new version

Now, tell me, how much of this can the /usr/ports/databases/postgresqlXX-server
port do?
Would you trust the system to do this for you in an automated, maybe weekly,
pkg upgrade task that would handle e.g. cdrecord and postgresql-xxx in the
same manner?

Now, about writing ports, I can tell you this is a PITA. It's a great concept,
but you must truly commit a slice of your life to maintaining the port you
submitted. This could be fun at first, but in the long run it is not easy.


Fair enough.


You are welcome :)


Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.






--
Achilleas Mantzios
Head of IT DEV
IT DEPT
Dynacom Tankers Mgmt





[GENERAL] CLOB & BLOB limitations in PostgreSQL

2014-04-11 Thread Jack . O'Sullivan
I am working for a client who is interested in migrating from Oracle to 
Postgres. Their database is currently ~20TB in size, and is growing. The 
biggest table in this database is effectively a BLOB store and currently 
has around 1 billion rows.

From reading around Postgres, there are a couple of limits which are 
concerning in terms of being able to migrate this database. We are not up 
against these limits just yet, but it is likely that they will be a 
potential blocker within the next few years.

1) Table can be maximum of 32TB  (http://www.postgresql.org/about/)

2) When storing bytea or text datatypes there is a limit of 4 billion 
entries per table (https://wiki.postgresql.org/wiki/BinaryFilesInDB)

With both of these, are they hard limits or can they be worked around with 
partitioning of tables? Could we set the table up in such a way that each 
child table was limited, but there was no limit on the number of children?

With point two, does this mean that any table with a bytea datatype is 
limited to 4 billion rows (which would seem in conflict with the 
unlimited rows shown by http://www.postgresql.org/about)? If we had rows 
where the bytea was a null entry would they contribute towards this 
total or is it 4 billion non-null entries?

Thanks.
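
Neither limit need be a hard wall for the data set as a whole if the store is
partitioned: each child table is a separate table with its own storage (and its
own TOAST table), so the per-table ceilings should apply to each child
separately. A rough sketch of the inheritance-based partitioning available in
2014-era PostgreSQL (table, column, and date-range choices are made up for
illustration):

CREATE TABLE blob_store (
    id      bigint NOT NULL,
    created date   NOT NULL,
    payload bytea
);

CREATE TABLE blob_store_2014 (
    CHECK (created >= DATE '2014-01-01' AND created < DATE '2015-01-01')
) INHERITS (blob_store);

CREATE OR REPLACE FUNCTION blob_store_insert() RETURNS trigger AS $$
BEGIN
    IF NEW.created >= DATE '2014-01-01' AND NEW.created < DATE '2015-01-01' THEN
        INSERT INTO blob_store_2014 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for %', NEW.created;
    END IF;
    RETURN NULL;   -- suppress the insert into the parent table
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER blob_store_insert_trg
    BEFORE INSERT ON blob_store
    FOR EACH ROW EXECUTE PROCEDURE blob_store_insert();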

Re: [GENERAL] efficient way to do fuzzy join

2014-04-11 Thread Andy Colson

On 4/11/2014 7:50 AM, Rémi Cura wrote:

Hey dear List,

I'm looking for some advice about the best way to perform a fuzzy
join, that is joining two table based on approximate matching.

It is about temporal matching
given a table A with rows containing data and a control_time (for
instance 1 ; 5; 6;  .. sec, not necessarly rounded of evenly-spaced)

given another table B with lines on no precise timing (eg control_time =
2.3 ; 5.8 ; 6.2 for example)

How to join every row of B to A based on
min(@(A.control_time-B.control_time))
(that is, for every row of B, get the row of A that is temporaly the
closest),
in an efficient way?
(to be explicit, 2.3 would match to 1, 5.8 to 6, 6.2 to 6)

Optionnaly, how to get interpolation efficiently (meaning one has to get
the previous time and next time for 1 st order interpolation, 2 before
and 2 after for 2nd order interpolation, and so on)?
(to be explicit 5.8 would match to 5 and 6, the weight being 0.2 and 0.8
respectively)


Currently my data is spatial so I use Postgis function to interpolate a
point on a line, but is is far from efficient or general, and I don't
have control on interpolation (only the spatial values are interpolated).


Cheers,
Rémi-C



Have you seen the range type?

http://www.postgresql.org/docs/9.3/static/rangetypes.html

Not fuzzy, but is indexable.
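
For reference, a rough sketch of what that could look like here (numeric t
columns assumed, as in the examples later in this thread; the +/- 2 window is
arbitrary):

CREATE TABLE a (t numeric, data text);
-- wrap the scalar time in a range so a GiST index can answer
-- "within +/- 2 seconds" containment tests
CREATE INDEX a_t_range_idx ON a USING gist (numrange(t, t, '[]'));

SELECT *
FROM a
WHERE numrange(t, t, '[]') <@ numrange(5.8 - 2, 5.8 + 2, '[]');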

-Andy




Re: [GENERAL] CLOB & BLOB limitations in PostgreSQL

2014-04-11 Thread Andy Colson

On 4/11/2014 9:45 AM, Jack.O'sulli...@tessella.com wrote:

I am working for a client who is interested in migrating from Oracle to
Postgres. Their database is currently ~20TB in size, and is growing. The
biggest table in this database is effectively a BLOB store and currently
has around 1 billion rows.

 From reading around Postgres, there are a couple of limits which are
concerning in terms of being able to migrate this database. We are not
up against these limits just yet, but it is likely that they will be a
potential blocker within the next few years.

1) Table can be maximum of 32TB  (http://www.postgresql.org/about/)

2) When storing bytea or text datatypes there is a limit of 4 billion
entries per table (https://wiki.postgresql.org/wiki/BinaryFilesInDB)

With both of these, are they hard limits or can they be worked around
with partitioning of tables? Could we set the table up in such a way
that each child table was limited, but there was no limit on the number
of children?

With point two, does this mean that any table with a bytea datatype is
limited to 4 billion rows (which would seem in conflict with the
unlimited rows shown by http://www.postgresql.org/about)? If we had
rows where the bytea was a null entry would they contribute towards
this total or is it 4 billion non-null entries?

Thanks.


Sorry I can't answer any of your questions, but I do have a few more to
raise:


1) I assume Oracle is pretty efficient on disk.  You might wanna do a 
quick test of a million rows or so and compare the on disk size of an 
Oracle db vs PG.  It wouldn't surprise me if PG used more space.  (I 
mean regular varchar, integer, etc.)


2) Does the Oracle blob compress?  PG will compress but I'd bet they 
compress differently.  Again, you might wanna dump out a million blobs 
and compare their space usage.  At 20TB, a 10% increase in disk usage is 
quite a bit.


3) There are two ways to store blob data: bytea in your table and Large
Object support (in a separate table).  Googling "postgres bytea vs large
object" might offer useful reading.


I don't know if bytea or large object offer more efficient storage, but 
it might be another thing you can test.  Large object might be a little 
more work to use, but if it saves lots of disk space, it might be worth it.


4) is this blob data binary'ish?  We have json/hstore if its text'ish, 
which might make it more usable.


-Andy





Re: [GENERAL] efficient way to do fuzzy join

2014-04-11 Thread Rémi Cura
Hey,
thanks for your answer.

I think you are right, a range type with an index could at least provide fast
matching,
thus avoiding the numrow(A) * numrow(B) complexity.

Though I don't see how to use it to interpolate for more than 1st order.

Cheers,
Rémi-C


2014-04-11 17:09 GMT+02:00 Andy Colson a...@squeakycode.net:

 On 4/11/2014 7:50 AM, Rémi Cura wrote:

 Hey dear List,

 I'm looking for some advice about the best way to perform a fuzzy
 join, that is joining two table based on approximate matching.

 It is about temporal matching
 given a table A with rows containing data and a control_time (for
 instance 1 ; 5; 6;  .. sec, not necessarly rounded of evenly-spaced)

 given another table B with lines on no precise timing (eg control_time =
 2.3 ; 5.8 ; 6.2 for example)

 How to join every row of B to A based on
 min(@(A.control_time-B.control_time))
 (that is, for every row of B, get the row of A that is temporaly the
 closest),
 in an efficient way?
 (to be explicit, 2.3 would match to 1, 5.8 to 6, 6.2 to 6)

 Optionnaly, how to get interpolation efficiently (meaning one has to get
 the previous time and next time for 1 st order interpolation, 2 before
 and 2 after for 2nd order interpolation, and so on)?
 (to be explicit 5.8 would match to 5 and 6, the weight being 0.2 and 0.8
 respectively)


 Currently my data is spatial so I use Postgis function to interpolate a
 point on a line, but is is far from efficient or general, and I don't
 have control on interpolation (only the spatial values are interpolated).


 Cheers,
 Rémi-C



 Have you seen the range type?

 http://www.postgresql.org/docs/9.3/static/rangetypes.html

 Not fuzzy, but is indexable.

 -Andy



Re: [GENERAL] CLOB & BLOB limitations in PostgreSQL

2014-04-11 Thread Jack . O'Sullivan
Hi Andy,

Thanks for getting those thoughts out so quickly.

As far as points 1 and 2 go, they are definitely something to think about,
but they are largely tangential to what I need to worry about at this
moment. I am less concerned about "how much disk do we need to store
this" than "is it even possible to store this".

If I'm understanding your point 3 correctly, you are referring to the
OID/LOB type, which, from what I have read, is even more restrictive than
bytea or text in that you have a limit of 4 billion objects per database (
https://wiki.postgresql.org/wiki/BinaryFilesInDB). The fact that each
object can be 2G rather than 1G is not much help to us, as our objects are
unlikely to ever be more than ~100k.

The table in question supports storing information either as XML, which we 
store in a text/char type column, or binary compressed xml, which is in 
the BLOB type column. The particular client I'm doing this for uses the 
compressed version, so all of their data in this table is binary.

Thanks

Jack



From:   Andy Colson a...@squeakycode.net
To: Jack.O'sulli...@tessella.com, pgsql-general@postgresql.org, 
Date:   11/04/2014 16:24
Subject: Re: [GENERAL] CLOB & BLOB limitations in PostgreSQL



On 4/11/2014 9:45 AM, Jack.O'sulli...@tessella.com wrote:
 I am working for a client who is interested in migrating from Oracle to
 Postgres. Their database is currently ~20TB in size, and is growing. The
 biggest table in this database is effectively a BLOB store and currently
 has around 1 billion rows.

  From reading around Postgres, there are a couple of limits which are
 concerning in terms of being able to migrate this database. We are not
 up against these limits just yet, but it is likely that they will be a
 potential blocker within the next few years.

 1) Table can be maximum of 32TB  (http://www.postgresql.org/about/)

 2) When storing bytea or text datatypes there is a limit of 4 billion
 entries per table (https://wiki.postgresql.org/wiki/BinaryFilesInDB)

 With both of these, are they hard limits or can they be worked around
 with partitioning of tables? Could we set the table up in such a way
 that each child table was limited, but there was no limit on the number
 of children?

 With point two, does this mean that any table with a bytea datatype is
 limited to 4 billion rows (which would seem in conflict with the
 unlimited rows shown by http://www.postgresql.org/about)? If we had
 rows where the bytea was a null entry would they contribute towards
 this total or is it 4 billion non-null entries?

 Thanks.

Sorry I cant answer any of your questions, but I do have a few more to 
raise:

1) I assume Oracle is pretty efficient on disk.  You might wanna do a 
quick test of a million rows or so and compare the on disk size of an 
Oracle db vs PG.  It wouldn't surprise me if PG used more space.  (I 
mean regular varchar, integer, etc.)

2) Does the Oracle blob compress?  PG will compress but I'd bet they 
compress differently.  Again, you might wanna dump out a million blobs 
and compare their space usage.  At 20TB, a 10% increase in disk usage is 
quite a bit.

3) There are two ways to store blob data.  Bytea in your table and Large 
Object support (in a separate table).  Google postgres bytea vs large 
object might offer useful reading.

I don't know if bytea or large object offer more efficient storage, but 
it might be another thing you can test.  Large object might be a little 
more work to use, but if it saves lots of disk space, it might be worth 
it.

4) is this blob data binary'ish?  We have json/hstore if its text'ish, 
which might make it more usable.

-Andy




Re: [GENERAL] efficient way to do fuzzy join

2014-04-11 Thread Andy Colson

2014-04-11 17:09 GMT+02:00 Andy Colson a...@squeakycode.net:

On 4/11/2014 7:50 AM, Rémi Cura wrote:

Hey dear List,

I'm looking for some advice about the best way to perform a fuzzy
join, that is joining two table based on approximate matching.

It is about temporal matching
given a table A with rows containing data and a control_time (for
instance 1 ; 5; 6;  .. sec, not necessarly rounded of evenly-spaced)

given another table B with lines on no precise timing (eg
control_time =
2.3 ; 5.8 ; 6.2 for example)

How to join every row of B to A based on
min(@(A.control_time-B.control_time))
(that is, for every row of B, get the row of A that is temporaly the
closest),
in an efficient way?
(to be explicit, 2.3 would match to 1, 5.8 to 6, 6.2 to 6)

Optionnaly, how to get interpolation efficiently (meaning one
has to get
the previous time and next time for 1 st order interpolation, 2
before
and 2 after for 2nd order interpolation, and so on)?
(to be explicit 5.8 would match to 5 and 6, the weight being 0.2
and 0.8
respectively)


Currently my data is spatial so I use Postgis function to
interpolate a
point on a line, but is is far from efficient or general, and I
don't
have control on interpolation (only the spatial values are
interpolated).


Cheers,
Rémi-C



Have you seen the range type?

http://www.postgresql.org/docs/9.3/static/rangetypes.html

Not fuzzy, but is indexable.

-Andy




On 4/11/2014 10:57 AM, Rémi Cura wrote:
 Hey,
 thanks for your answer.

 I think you are right, range type with index could at least provide a
 fast matching,
 thus avoiding the numrow(A) * numrow(B) complexity .

 Though I don't see how to use it to interpolate for more than 1st order.

 Cheers,
 Rémi-C




Hum.. Would you like to set an upper bound on the number of seconds the
join would match?  Maybe range types could give you an indexed upper
bound (match within +/- 2 seconds only), then use another match to
find the actual min.  (I do something like this in PostGIS: use the bounding
box to do a quick index lookup, then st_distance to find the nearest.)
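
A sketch of that idea using 9.3's LATERAL (numeric t columns assumed, and the
+/- 2 window is arbitrary):

SELECT b.t AS b_t, near.t AS a_t
FROM b
CROSS JOIN LATERAL (
    SELECT a.t
    FROM a
    WHERE a.t BETWEEN b.t - 2 AND b.t + 2   -- indexable window
    ORDER BY abs(a.t - b.t)                 -- then the exact nearest
    LIMIT 1
) AS near;

(A LEFT JOIN LATERAL ... ON true would keep B rows that have no A within the
window.)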


I can see two row's in A matching the same row in B.  Would that be ok?

TableA
--
1
5
6

TableB
--
0.9
1.1
6.6
7.7

How should those two tables join?

-Andy




[GENERAL] Need query

2014-04-11 Thread Gaurav Jindal
Can anybody help me with a query?

I have 3 tables- a,b,c

a has relation with b and b has relation with c
a does not have a relation with c


I need to find the top-performing item in table 'a'. It is based on a count:
the higher the count, the better it performs. We can count only based on
actions in table 'c'. I can put a 'where' clause on table 'c' and find those
values. I can't join tables 'a' and 'c' as they don't have any relation.


select b.top
FROM b,c
WHERE c.action IN ('blah', 'blah', 'any values')  and b.id = c.id;

'top' in the query is a common column  in table 'a' named 'a_top'. We need
to count column 'track' in table 'a' based on this information.


Re: [GENERAL] postgresql.conf question... CPU spikes

2014-04-11 Thread Bala Venkat
As Andy mentioned, after tuning a query, everything settled down. Now the
CPU utilization has come down a lot.

Thanks a lot for the help. I will certainly use the tool, pg_top

kind regards


On Fri, Apr 11, 2014 at 12:35 AM, Venkata Balaji Nagothi
vbn...@gmail.com wrote:


 On Thu, Apr 10, 2014 at 12:43 AM, Bala Venkat akpg...@gmail.com wrote:

 Hi all -

We are running postgres 9.0 (32-bit) + postgis 1.5.2 on Solaris
 Sparc M5000 with 64GB.  Recently we are seeing CPU utilization reach 99%.

 In the config file


 shared_buffers=2GB.
 work_mem = 128MB
 effective_cache_size=48GB
 maintenance_work_mem = 500MB
 max_connections = 300

 When the CPU spike happens and I look at pg_stat_activity, the
 queries where current_query is not like '%IDLE%' number between 100 and 110.

 Do you think I have to reduce effective_cache_size and work_mem for this?


 What does the load average say ? What about memory usage and disk IO ?

 The best way to look at a CPU spike issue is through top or an equivalent
 utility, which helps us find the PIDs of the top resource-consuming
 processes; the process/session info for those PIDs can then be pulled in
 from pg_stat_activity.
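
 A hedged example of that second step, using 9.0-era column names
 (procpid / current_query) and made-up PIDs:

 SELECT procpid, usename, waiting, query_start, current_query
 FROM pg_stat_activity
 WHERE procpid IN (12345, 12346)   -- PIDs taken from top / pg_top
 ORDER BY query_start;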

 Another way - and I feel it is the best tool - is pg_top. pg_top is an
 excellent tool which helps us identify the top resource-consuming queries
 responsible for high CPU consumption or high disk IO.

 Once you identify resource consuming processes or queries, things can be
 taken from there.

 Regards,

 Venkata Balaji N
 Fujitsu Australia



Re: [GENERAL] Linux vs FreeBSD

2014-04-11 Thread D'Arcy J.M. Cain
On Fri, 11 Apr 2014 17:15:28 +0300
Achilleas Mantzios ach...@matrix.gatewaynet.com wrote:
 pl/java has nothing to do with this. The argument against using
 packages/ports for postgresql upgrades, is that upgrades in general
 involve :
 - reading HISTORY thoroughly and understanding every bit of it,
 especially the migration part, and the changes part

Do this before installing anything.

 - backing up the current database

Also this.

 - installing the new binaries

This is the only thing that the ports or pkgsrc (NetBSD) facility does
for you.

 - running pg_upgrade
 - solving problems that pg_upgrade detects and trying again
 - testing your in house C/Java/etc... functions
 - testing your whole app + utilities against the new version

After installing the upgrade all of this has to be done.  None of it is
going to be done by ports/pkgsrc/apt-get or any other install
facility including your own brain sweat.  All ports/pkgsrc does for you
is the install part.

 Now, tell me, how much of this can
 the /usr/ports/databases/postgresqlXX-server port do? Would you

Step 3.  Period.

 trust the system to do this for you in an automated maybe weekly  pkg
 upgrade task that would handle e.g. cdrecord and postgresql-xxx in
 the same manner ?

I wouldn't trust any system to blindly install any program, and I certainly
wouldn't automate any upgrades on a production server.  The most I
would do automatically is build the latest package for manual
installation.

 Now about writing ports, i can say to you this is a PITA. Its a great
 concept, but you must truly commit to having a part of your life slot
 maintaining the port you submitted. This could be fun at first, but
 in the long run, this is not easy.

Why?  Once you submit your port to the community you can update it when
you are able to, or poke the community to do it for you.  All hail open source.

-- 
D'Arcy J.M. Cain da...@druid.net |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 788 2246 (DoD#0082)(eNTP)   |  what's for dinner.
IM: da...@vex.net, VoIP: sip:da...@druid.net




Re: [GENERAL] efficient way to do fuzzy join

2014-04-11 Thread Andy Colson

On 4/11/2014 7:50 AM, Rémi Cura wrote:

Hey dear List,

I'm looking for some advice about the best way to perform a fuzzy
join, that is joining two table based on approximate matching.

It is about temporal matching
given a table A with rows containing data and a control_time (for
instance 1 ; 5; 6;  .. sec, not necessarly rounded of evenly-spaced)

given another table B with lines on no precise timing (eg control_time =
2.3 ; 5.8 ; 6.2 for example)

How to join every row of B to A based on
min(@(A.control_time-B.control_time))
(that is, for every row of B, get the row of A that is temporaly the
closest),
in an efficient way?
(to be explicit, 2.3 would match to 1, 5.8 to 6, 6.2 to 6)

Optionnaly, how to get interpolation efficiently (meaning one has to get
the previous time and next time for 1 st order interpolation, 2 before
and 2 after for 2nd order interpolation, and so on)?
(to be explicit 5.8 would match to 5 and 6, the weight being 0.2 and 0.8
respectively)


Currently my data is spatial so I use Postgis function to interpolate a
point on a line, but is is far from efficient or general, and I don't
have control on interpolation (only the spatial values are interpolated).


Cheers,
Rémi-C



OK, here is a just-SQL way.  No ranges.  No idea if it's right.  A first
pass, so to speak.




create table a(t float, data text);
create table b(t float, data text);

insert into a values (1), (5), (6);
insert into b values (2.3), (5.8), (6.2);


select a.t, b.t
from (
  select t, least( least(t, mint), least(t, maxt)) as t2 from (
select t,
 (select t from a where a.t >= b.t order by a.t limit 1) as mint,
 (select t from a where a.t < b.t order by a.t desc limit 1) as maxt
  from b
  ) as tmp
) as tmp2
inner join a on (tmp2.t2 = a.t)
inner join b on (tmp2.t = b.t)




The middle part is the magic:

select t,
 (select t from a where a.t >= b.t order by a.t limit 1) as mint,
 (select t from a where a.t < b.t order by a.t desc limit 1) as maxt
from b

The rest is just to make it usable.  If t is indexed, it'll probably be 
fast too.


-Andy






Re: [GENERAL] Need query

2014-04-11 Thread David G Johnston
Gaurav Jindal wrote
 a has relation with b and b has relation with c

Given this statement, what specifically are you having difficulty with?  You
are allowed to perform multiple joins, whether explicit or via FROM/WHERE,
in the same query.

SELECT ...
FROM a,b,c
WHERE a=b and b=c

OR

SELECT ...
FROM c
JOIN b on c=b
JOIN a ON b=a
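
A hedged sketch of the whole thing for the stated goal; the linking columns
and the action values are assumptions, since the schema wasn't given:

SELECT a.a_top, count(*) AS action_count
FROM a
JOIN b ON b.a_id = a.id          -- assumed a<->b link
JOIN c ON c.b_id = b.id          -- assumed b<->c link
WHERE c.action IN ('blah', 'any values')
GROUP BY a.a_top
ORDER BY action_count DESC
LIMIT 1;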

David J.









[GENERAL] Problem with query

2014-04-11 Thread Susan Cassidy
I have a query with several joins, where I am searching for specific data
in certain columns.  If I do this:

SELECT distinct on (s.description, st1.description, s.scene_id)
s.description, s.scene_id, to_char(s.time_of_creation, 'MM/DD/YY
HH24:MI:SS'),
 position_0_0_0_info, st.scene_thing_id, si.description,
st.description, m.description
from scenes s
left outer join scene_thing_instances si on s.scene_id = si.scene_id
left outer join scene_things st on si.scene_thing_id =
st.scene_thing_id
left outer join materials m on st.material_id = m.material_id
left outer join scene_things st1 on st.ultimate_parent_id =
st1.ultimate_parent_id
 where  st.description ilike '%bread%' or st1.description ilike
'%bread%'
 group by s.description, st1.description, s.scene_id,
st.scene_thing_id, si.description, m.description order by s.description

No results are found, but if I just do this:

SELECT st.description, st1.description
from
scene_things st
left outer join scene_things st1 on st.ultimate_parent_id =
st1.ultimate_parent_id
where st1.description ilike '%bread%'
group by st.description, st1.description order by st.description

I get the results I expect (several hits).

What is the first query doing wrong?

I've tried adding st1.description to the SELECT list, and the GROUP BY
clause, with no luck.

Thanks,
Susan


Re: [GENERAL] efficient way to do fuzzy join

2014-04-11 Thread Andy Colson

On 4/11/2014 12:16 PM, Andy Colson wrote:

On 4/11/2014 7:50 AM, Rémi Cura wrote:

Hey dear List,

I'm looking for some advice about the best way to perform a fuzzy
join, that is joining two table based on approximate matching.

It is about temporal matching
given a table A with rows containing data and a control_time (for
instance 1 ; 5; 6;  .. sec, not necessarly rounded of evenly-spaced)

given another table B with lines on no precise timing (eg control_time =
2.3 ; 5.8 ; 6.2 for example)

How to join every row of B to A based on
min(@(A.control_time-B.control_time))
(that is, for every row of B, get the row of A that is temporaly the
closest),
in an efficient way?
(to be explicit, 2.3 would match to 1, 5.8 to 6, 6.2 to 6)

Optionnaly, how to get interpolation efficiently (meaning one has to get
the previous time and next time for 1 st order interpolation, 2 before
and 2 after for 2nd order interpolation, and so on)?
(to be explicit 5.8 would match to 5 and 6, the weight being 0.2 and 0.8
respectively)


Currently my data is spatial so I use Postgis function to interpolate a
point on a line, but is is far from efficient or general, and I don't
have control on interpolation (only the spatial values are interpolated).


Cheers,
Rémi-C



Ok, here is a just sql way.  No ranges.  No idea if its right.  A first
pass, so to speak.



create table a(t float, data text);
create table b(t float, data text);

insert into a values (1), (5), (6);
insert into b values (2.3), (5.8), (6.2);


select a.t, b.t
from (
   select t, least( least(t, mint), least(t, maxt)) as t2 from (
 select t,
  (select t from a where a.t >= b.t order by a.t limit 1) as mint,
  (select t from a where a.t < b.t order by a.t desc limit 1) as maxt
   from b
   ) as tmp
) as tmp2
inner join a on (tmp2.t2 = a.t)
inner join b on (tmp2.t = b.t)




The middle part is the magic:

select t,
  (select t from a where a.t >= b.t order by a.t limit 1) as mint,
  (select t from a where a.t < b.t order by a.t desc limit 1) as maxt
from b

The rest is just to make it usable.  If t is indexed, it'll probably be
fast too.

-Andy






Here is a guess with ranges:

select a.t, (select t from b where b.t <@ numrange(a.t-2, a.t+2, '[]')
order by abs(a.t-b.t) limit 1)

from a


It returns:
 t   t
--  --
 1 2.3
 5 5.8
 6 5.8


which is different from the previous SQL, but it's not wrong.  6 is the
same distance from 5.8 as from 6.2, so either is a correct choice.


I had to change my tables (or type cast a lot):

create table a(t numeric);
create table b(t numeric);

insert into a values (1), (5), (6);
insert into b values (2.3), (5.8), (6.2);


-Andy





Re: [GENERAL] efficient way to do fuzzy join

2014-04-11 Thread Rémi Cura
Wow many thanks!

I had thought about the order by and limit because it is the natural way to
express the problem,
but I had discarded it for fear of such bad complexity
(theoretically, for each row of B, compute the distance to every other row
of A!).

And it's okay if 2 rows from B share the same joined row from A, because
when interpolating it will be different.

Here is the test env with realistic numbers; your solution is very fast, I
have to raise my hat (4 sec!)
---

--usefull function to fill with random text
CREATE OR REPLACE FUNCTION rc_random_string(INTEGER )
RETURNS text AS $$
SELECT array_to_string(
ARRAY(
SELECT
substring('ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789' FROM
(random()*36)::int + 1 FOR 1)
FROM generate_series(1,$1)
)
,'')
$$ LANGUAGE sql;


--creating tables
DROP TABLE IF EXISTS a;
DROP TABLE IF EXISTS b;
create table a(t float, data text);
create table b(t float, data text);
CREATE INDEX ON a (t);
CREATE INDEX ON b (t);


--filling tables
WITH the_serie AS (
SELECT  s+random()/2 AS s, rc_random_string(100) AS data
FROM generate_series(1,10) AS s

)
insert into a SELECT s, data
FROM the_serie;

WITH the_serie AS (
SELECT  s+(random()-0.5)*2 AS s, rc_random_string(100) AS data
FROM generate_series(1,10) AS s

)
insert into b SELECT s, data
FROM the_serie;

ANALYZE a;
ANALYZE b;

--computing result
DROP TABLE IF EXISTS t;
CREATE TABLE t AS
select a.t As a_t, b.t as b_t
from (
  select t, least( least(t, mint), least(t, maxt)) as t2 from (
select t,
 (select t from a where a.t >= b.t order by a.t limit 1) as mint,
 (select t from a where a.t < b.t order by a.t desc limit 1) as maxt
  from b
  ) as tmp
) as tmp2
inner join a on (tmp2.t2 = a.t)
inner join b on (tmp2.t = b.t)
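
For the interpolation part of the original question, the two bracketing rows
can also be fetched directly and turned into first-order weights; a sketch
using 9.3's LATERAL (no handling of ties or of B rows outside A's time span;
the 0.2/0.8 comments refer to the original toy data a = 1, 5, 6 and b.t = 5.8):

SELECT b.t,
       prev.t AS prev_t,
       next.t AS next_t,
       (next.t - b.t) / (next.t - prev.t) AS weight_prev,   -- 0.2 for b.t = 5.8
       (b.t - prev.t) / (next.t - prev.t) AS weight_next    -- 0.8 for b.t = 5.8
FROM b
CROSS JOIN LATERAL (SELECT t FROM a WHERE a.t <= b.t ORDER BY a.t DESC LIMIT 1) AS prev
CROSS JOIN LATERAL (SELECT t FROM a WHERE a.t >  b.t ORDER BY a.t ASC  LIMIT 1) AS next;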




2014-04-11 19:16 GMT+02:00 Andy Colson a...@squeakycode.net:

 On 4/11/2014 7:50 AM, Rémi Cura wrote:

 Hey dear List,

 I'm looking for some advice about the best way to perform a fuzzy
 join, that is joining two table based on approximate matching.

 It is about temporal matching
 given a table A with rows containing data and a control_time (for
 instance 1 ; 5; 6;  .. sec, not necessarly rounded of evenly-spaced)

 given another table B with lines on no precise timing (eg control_time =
 2.3 ; 5.8 ; 6.2 for example)

 How to join every row of B to A based on
 min(@(A.control_time-B.control_time))
 (that is, for every row of B, get the row of A that is temporaly the
 closest),
 in an efficient way?
 (to be explicit, 2.3 would match to 1, 5.8 to 6, 6.2 to 6)

 Optionnaly, how to get interpolation efficiently (meaning one has to get
 the previous time and next time for 1 st order interpolation, 2 before
 and 2 after for 2nd order interpolation, and so on)?
 (to be explicit 5.8 would match to 5 and 6, the weight being 0.2 and 0.8
 respectively)


 Currently my data is spatial so I use Postgis function to interpolate a
 point on a line, but is is far from efficient or general, and I don't
 have control on interpolation (only the spatial values are interpolated).


 Cheers,
 Rémi-C



 Ok, here is a just sql way.  No ranges.  No idea if its right.  A first
 pass, so to speak.



 create table a(t float, data text);
 create table b(t float, data text);

 insert into a values (1), (5), (6);
 insert into b values (2.3), (5.8), (6.2);


 select a.t, b.t
 from (
   select t, least( least(t, mint), least(t, maxt)) as t2 from (
 select t,
  (select t from a where a.t >= b.t order by a.t limit 1) as mint,
  (select t from a where a.t < b.t order by a.t desc limit 1) as maxt
   from b
   ) as tmp
 ) as tmp2
 inner join a on (tmp2.t2 = a.t)
 inner join b on (tmp2.t = b.t)




 The middle part is the magic:

 select t,
  (select t from a where a.t >= b.t order by a.t limit 1) as mint,
  (select t from a where a.t < b.t order by a.t desc limit 1) as maxt
 from b

 The rest is just to make it usable.  If t is indexed, it'll probably be
 fast too.

 -Andy





Re: [GENERAL] Problem with query

2014-04-11 Thread David G Johnston
Susan Cassidy-3 wrote
 I have a query with several joins, where I am searching for specific data
 in certain columns.  

While accurate as far as describing a typical query goes, it doesn't really
tell us its intent.


 What is the first query doing wrong?

No idea, though it may have something to do with not properly handling NULL,
or, more likely, simply the join order semantics are wrong.

The first thing I would do is break up the query into smaller components, each
contained within its own CTE (WITH) section.  I'd also minimize the number
of OUTER JOINs; it is doubtful that you need as many as you seem to think.
At best you should INNER JOIN first and then OUTER JOIN to the combination.
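
A rough sketch of that restructuring applied to the query upthread (column
list trimmed; whether materials and the parent lookup really need to stay
OUTER is a guess):

WITH bread_things AS (
    SELECT DISTINCT st.scene_thing_id, st.description AS thing_description,
           st.material_id
    FROM scene_things st
    LEFT JOIN scene_things st1
           ON st.ultimate_parent_id = st1.ultimate_parent_id
    WHERE st.description ILIKE '%bread%'
       OR st1.description ILIKE '%bread%'
)
SELECT s.description, s.scene_id, bt.thing_description, m.description AS material
FROM bread_things bt
JOIN scene_thing_instances si ON si.scene_thing_id = bt.scene_thing_id
JOIN scenes s                 ON s.scene_id = si.scene_id
LEFT JOIN materials m         ON m.material_id = bt.material_id;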

I also rarely require the use of DISTINCT ON if I've coded my queries
properly, so removing that from the select-list would occur as well.

As would probably adding at least a COALESCE(s.description,
st1.description) somewhere.

And outputting si.description and st.description causes an ambiguity
since the resultant column name will not have the table prefix; typically
only the column name survives on the outside (or higher up).


 I've tried adding st1.description to the SELECT list, and the GROUP BY
 clause, with no luck.

And why did you think doing these things would have any impact on the number
of rows returned (or, at least, returning some rows where previously there
were none)?

David j.








Re: [GENERAL] Problem with query

2014-04-11 Thread Chris Curvey
On Fri, Apr 11, 2014 at 1:50 PM, Susan Cassidy 
susan.cass...@decisionsciencescorp.com wrote:

 I have a query with several joins, where I am searching for specific data
 in certain columns.  If I do this:

 SELECT distinct on (s.description, st1.description, s.scene_id)
 s.description, s.scene_id, to_char(s.time_of_creation, 'MM/DD/YY
 HH24:MI:SS'),
  position_0_0_0_info, st.scene_thing_id, si.description,
 st.description, m.description
 from scenes s
 left outer join scene_thing_instances si on s.scene_id =
 si.scene_id
 left outer join scene_things st on si.scene_thing_id =
 st.scene_thing_id
 left outer join materials m on st.material_id = m.material_id
 left outer join scene_things st1 on st.ultimate_parent_id =
 st1.ultimate_parent_id
  where  st.description ilike '%bread%' or st1.description ilike
 '%bread%'
  group by s.description, st1.description, s.scene_id,
 st.scene_thing_id, si.description, m.description order by s.description

 No results are found, but if I just do this:

 SELECT st.description, st1.description
 from
 scene_things st
 left outer join scene_things st1 on st.ultimate_parent_id =
 st1.ultimate_parent_id
 where st1.description ilike '%bread%'
 group by st.description, st1.description order by st.description

 I get the results I expect (several hits).

 What is the first query doing wrong?

 I've tried adding st1.description to the SELECT list, and the GROUP BY
 clause, with no luck.

 Thanks,
 Susan


First query goes

scenes -> scene_thing_instances -> scene_things

second query goes

scene_things -> scene_things

So they're not comparable queries.

My bet would be that scene_thing_instances is missing some rows that you
want/need.




-- 
I asked the Internet how to train my cat, and the Internet told me to get a
dog.


Re: [GENERAL] Problem with query

2014-04-11 Thread Michael Nolan
On 4/11/14, Chris Curvey ch...@chriscurvey.com wrote:
 On Fri, Apr 11, 2014 at 1:50 PM, Susan Cassidy 
 susan.cass...@decisionsciencescorp.com wrote:

 I have a query with several joins, where I am searching for specific data
 in certain columns.

Have you tried running each of your joins separately to see if there
are row values common to both tables, ie:

select count(*) from scenes s
left outer join scene_thing_instances si on s.scene_id =
si.scene_id

then

select count(*) from scene_thing_instances si
left outer join scene_things st on si.scene_thing_id =
st.scene_thing_id

etc.

I find when building complex queries (I've written some that ran over
100 lines and involved a dozen or more joined tables), I need to build
them up, testing them as I build.
--
Mike Nolan




Re: [GENERAL] Database Design: Maintain Audit Trail of Changes

2014-04-11 Thread Moshe Jacobson
I know this is a terribly old thread, but if you are still looking for
software to provide an audit trail of changes in the database, please see
Cyan Audit at http://pgxn.org/dist/cyanaudit. I think it will do just what
you're looking for.
(Full disclosure: I am the author of this software)


Moshe Jacobson
Manager of Systems Engineering, Nead Werx Inc. http://www.neadwerx.com
2323 Cumberland Parkway · Suite 201 · Atlanta, GA 30339

Quality is not an act, it is a habit. -- Aristotle


On Thu, Jan 3, 2013 at 10:38 AM, Rich Shepard rshep...@appl-ecosys.com wrote:

   I have the need to develop an application that will use postgres as the
 back end, and most of the design has been worked out, but I've one issue
 left to resolve and want help in this. If this is not the appropriate forum
 for this type of question, please point me in the right direction.

   For several reasons (including operational and legal) once data are
 entered in a table they cannot be changed or deleted without an audit trail
 of the change, when it occurred, who made the change, and the reason for
 it.
 Tables might contain laboratory or instrument measurement values or the
 names of regulatory staff.

   My current thoughts are that there needs to be a separate table, perhaps
 called 'changes', with attribute columns for the source table, identifying
 value(s) for the original row, the new value, the date of change, the person
 making the change, and the reason for the change. The original table should
 have an attribute flag to indicate that a row has been changed.
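
 A minimal sketch of such a 'changes' table plus a row-level trigger that
 fills it (generic, and not how Cyan Audit does it; the audited table
 'measurements', its 'id' key column, and the idea that the middleware fills
 in 'reason' itself are all assumptions here):

 CREATE TABLE changes (
     change_id    bigserial PRIMARY KEY,
     source_table text        NOT NULL,
     row_key      text        NOT NULL,
     old_row      text,
     new_row      text,
     changed_at   timestamptz NOT NULL DEFAULT now(),
     changed_by   text        NOT NULL DEFAULT current_user,
     reason       text
 );

 CREATE OR REPLACE FUNCTION log_change() RETURNS trigger AS $$
 BEGIN
     INSERT INTO changes (source_table, row_key, old_row, new_row)
     VALUES (TG_TABLE_NAME, OLD.id::text, OLD::text, NEW::text);
     RETURN NEW;
 END;
 $$ LANGUAGE plpgsql;

 CREATE TRIGGER measurements_audit
     BEFORE UPDATE ON measurements          -- 'measurements' is a made-up table
     FOR EACH ROW EXECUTE PROCEDURE log_change();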

   The middleware of the application needs to check this table when data are
 to be viewed in the UI and present only the current row contents. A
 separate
 view would display a history of changes for that row.

   All thoughts, suggestions, and recommendations based on your expertise
 and
 experience will be most welcome.

 TIA,

 Rich






[GENERAL] Is it possible to load two different versions of the JDBC driver in the same application?

2014-04-11 Thread Tecno Brain
I have a Java application that is using Postgres 9.1, and we use the latest
9.1 JDBC driver.
The same Java app requires access now to a Redshift cluster. Amazon
recommends to use a Postgres 8 JDBC driver for connecting to Redshift.

Can I load both drivers in the same application ?

I currently use the Apache database connection pool configured through
Tomcat's context.xml to load the driver and handle the connection pool.

-Jorge


Re: [GENERAL] Is it possible to load two different versions of the JDBC driver in the same application?

2014-04-11 Thread Rob Sargent

On 04/11/2014 03:31 PM, Tecno Brain wrote:
I have a Java application that is using Postgres 9.1, and we use the 
latest 9.1 JDBC driver.
The same Java app requires access now to a Redshift cluster. Amazon 
recommends to use a Postgres 8 JDBC driver for connecting to Redshift.


Can I load both drivers in the same application ?

I currently use the Apache database connection pool configured through 
Tomcat's context.xml to load the driver and handle the connection pool.


-Jorge



Can you easily isolate the two?  I.e., can you easily know which driver
you want for each db access?  If so, I suspect some classloader games
are in your future. If not, tell Amazon you'll be moving to Rackspace.


Re: [GENERAL] Is it possible to load two different versions of the JDBC driver in the same application?

2014-04-11 Thread Dave Cramer
Ya, they probably don't like that we use server-side prepared statements.

It might be possible to just mess with classpaths to get it done.

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


On 11 April 2014 17:39, Rob Sargent robjsarg...@gmail.com wrote:

  On 04/11/2014 03:31 PM, Tecno Brain wrote:

   I have a Java application that is using Postgres 9.1, and we use the
 latest 9.1 JDBC driver.
  The same Java app requires access now to a Redshift cluster. Amazon
 recommends to use a Postgres 8 JDBC driver for connecting to Redshift.

  Can I load both drivers in the same application ?

  I currently use the Apache database connection pool configured through
 Tomcat's context.xml to load the driver and handle the connection pool.

  -Jorge



   Can you easily isolate the two?  i.e. can you easily know which driver
 you want for each db access.  If so, I suspect some classloader games are
 in your future. If not, tell Amazon you'll be moving to Rackspace.