Re: [Pljava-dev] [HACKERS] Re: Should creating a new base type require superuser status?

2009-02-18 Thread Thomas Hallgren

Kris Jurka wrote:

Thomas Hallgren wrote:

Kris Jurka wrote:


3) By value: pljava does not correctly handle passed by value types 
correctly, allowing access to random memory.


This is simply not true. There's no way a Java developer can access 
random memory through PL/Java.


No, the point is that the Java developer can provide some data which 
can convince postgresql to fetch random data for the user.


Consider the attached type which is simply an int4 equivalent. 
Depending on how you define it as passed by value or passed by 
reference it will or will not work (attached).


This looks like it works:


jurka=# select '1'::intbyref, '2'::intbyval;
 intbyref | intbyval
--+--
 1| 2
(1 row)

But it doesn't really:

jurka=# create table inttest (a intbyref, b intbyval);
CREATE TABLE
jurka=# insert into inttest values ('1', '2');
INSERT 0 1
jurka=# select * from inttest;
 a | b
---+
 1 | 2139062143
(1 row)
It seems the pointer is confused for the actual value which means that 
writing the value back will corrupt the pointer. That's bad of course 
but I would classify this as a bug rather then a general security problem.


PL/Java is designed to do handle all types securely and completely hide 
the concept of 'by value' or 'by reference' from the Java developer 
since such concepts are meaningless in Java.


- thomas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Pljava-dev] [HACKERS] Re: Should creating a new base type require superuser status?

2009-02-17 Thread Thomas Hallgren

Kris Jurka wrote:


3) By value: pljava does not correctly handle passed by value types 
correctly, allowing access to random memory.


This is simply not true. There's no way a Java developer can access 
random memory through PL/Java.


- thomas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [Pljava-dev] Should creating a new base type require superuser status?

2008-08-03 Thread Thomas Hallgren

Tom Lane wrote:


This is a non-issue in PL/Java. An integer parameter is never passed by 
reference and there's no way the PL/Java user can get direct access to 
backend memory.



So what exactly does happen when the user deliberately specifies wrong
typlen/typbyval/typalign info when creating a type based on PL/Java
functions?

  
Everything is converted into instances of Java classes such as String, 
byte[], etc.


I think that assumption is without ground. Java doesn't permit you to 
access memory unless you use Java classes (java.nio stuff) that is 
explicitly designed to do that and you need native code to set such 
things up. A PL/Java user can not do that unless he is able to link in 
other shared objects or dll's to the backend process.



PL/Java itself must be doing unsafe things in order to interface with
PG at all.  So what your argument really is is that you have managed to
securely sandbox the user-written code you are calling.  That might or
might not be true, but I don't think that worrying about it is without
foundation.

  
I would be presumptuous to claim that I provide the sandbox. All PL/Java 
does is to provide the type mapping. The sandbox as such is implicit in 
Java, much in the same way that it does it for web-browsers etc.


Regardless of that, I think there's some difference in expressing a 
worry that might or might not have a foundation versus claiming that 
there indeed must be a security hole a mile wide ;-)


- thomas


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [Pljava-dev] Should creating a new base type require superuser status?

2008-08-02 Thread Thomas Hallgren

Tom Lane wrote:

Thomas Hallgren [EMAIL PROTECTED] writes:
  

Tom, could you please elaborate where you see a security hole?



The problem that we've seen in the past shows up when the user lies in
the CREATE TYPE command, specifying type representation properties that
are different from what the underlying functions expect.  In particular,
if it's possible to pass a pass-by-value integer to a function
that's expecting a pass-by-reference datum, you can misuse the function
to access backend memory.

  
This is a non-issue in PL/Java. An integer parameter is never passed by 
reference and there's no way the PL/Java user can get direct access to 
backend memory.



I gather from looking at the example that Kris referenced that there's
some interface code in between the SQL function call and the user's Java
code, and that that interface code is itself looking at the declared
properties of the SQL type to decide what to do.  So to the extent that
that code is (a) bulletproof against inconsistencies and (b) not
subvertible by the PL/Java user, it might be that there's no hole in
practice.  But assumption (b) seems pretty fragile to me.

  
I think that assumption is without ground. Java doesn't permit you to 
access memory unless you use Java classes (java.nio stuff) that is 
explicitly designed to do that and you need native code to set such 
things up. A PL/Java user can not do that unless he is able to link in 
other shared objects or dll's to the backend process.


Based on that, I claim that your statement about a security hole a mile 
wide is incorrect. PL/Java is not subject to issues relating to misuse 
of backend memory.


Regards,
Thomas Hallgren


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [Pljava-dev] Should creating a new base type require superuser status?

2008-08-01 Thread Thomas Hallgren

It seems perfectly safe to me too for the reason that Kris mentions.

Tom, could you please elaborate where you see a security hole?

Regards,
Thomas Hallgren

Tom Lane wrote:

Kris Jurka [EMAIL PROTECTED] writes:
  

On Wed, 30 Jul 2008, Alvaro Herrera wrote:


I do agree that creating base types should require a superuser though.
It too seems dangerous just on principle, even if today there's no
actual hole (that we already know of).
  


  
pl/java already allows non-superusers to create functions returning 
cstring and base types built off of these functions.



So in other words, if pl/java is installed we have a security hole
a mile wide.

regards, tom lane
___
Pljava-dev mailing list
[EMAIL PROTECTED]
http://pgfoundry.org/mailman/listinfo/pljava-dev
  



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCHES] Patch for UUID datatype (beta)

2006-09-27 Thread Thomas Hallgren

[EMAIL PROTECTED] wrote:

On Tue, Sep 19, 2006 at 11:21:51PM -0400, Alvaro Herrera wrote:

[EMAIL PROTECTED] wrote:

On Tue, Sep 19, 2006 at 08:20:13AM -0500, Jim C. Nasby wrote:

On Mon, Sep 18, 2006 at 07:45:07PM -0400, [EMAIL PROTECTED] wrote:

I would not use a 100% random number generator for a UUID value as was
suggested. I prefer inserting the MAC address and the time, to at
least allow me to control if a collision is possible. This is not easy
to do using a few lines of C code. I'd rather have a UUID type in core
with no generation routine, than no UUID type in core because the code
is too complicated to maintain, or not portable enough.

As others have mentioned, using MAC address doesn't remove the
possibility of a collision.

It does, as I control the MAC address.

What happens if you have two postmaster running on the same machine?


Could be bad things. :-)

For the case of two postmaster processes, I assume you mean two
different databases? If you never intend to merge the data between the
two databases, the problem is irrelevant. There is a much greater
chance that any UUID form is more unique, or can be guaranteed to be
unique, within a single application instance, than across all
application instances in existence. If you do intend to merge the
data, you may have a problem.

You may. But it's not very likely. Since a) there is a 13-bit random number in addition to 
the MAC address (the clock sequence) and b) the timestamp has a granularity of 100 nanosec. 
An implementation could be made to prevent clock-sequence collisions on the same machine and 
thereby avoid this altogether.


Kind Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] TODO item: GUID

2006-09-18 Thread Thomas Hallgren

[EMAIL PROTECTED] wrote:

... I *like* sorting by time, as it allows
the UUID to be used similar to sequence, leaving older, lesser accessed
UUIDs in the past. 


and don't forget, an automatic timestamp of when a record is created might be useful for 
other purposes.


Regards,
Thomas Hallgren

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-18 Thread Thomas Hallgren

Gevik Babakhani wrote:

 To my opinion GUIDs type need to provide the following in the database.

1. GUID type must accept the correct string format(s), with of without
extra '-'
2. GUID type must internally be stored as small as possible.
3. GUID type must be comparable with == , != , LIKE and (NOT) IS NULL
4. GUID type must have the ability to be indexed, grouped, ordered,
DISTINCT... but not MAX(), MIN() or SUM()


Where do you see a need for LIKE on a GUID?

Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-18 Thread Thomas Hallgren

Gevik Babakhani wrote:

LIKE could come handy if someone wants to abuse the uuid datatype to
store MD5 hash values. However I am not going to implement it if there
is no need for that (assuming it will pass the acceptance test)

  
Perhaps providing LIKE just to encourage abuse is not such a good idea? 
IMHO, a GUID should be comparable for equality and NULL only, not LIKE. 
I also think that ordering is feasible only when looking at parts of the 
GUID, i.e. order by the result of a function that extracts a timestamp 
or a node-address. Magnitude comparison on the GUID as a whole makes no 
sense to me.


Regards,
Thomas Hallgren


On Mon, 2006-09-18 at 10:06 +0200, Thomas Hallgren wrote:
  

Gevik Babakhani wrote:


 To my opinion GUIDs type need to provide the following in the database.

1. GUID type must accept the correct string format(s), with of without
extra '-'
2. GUID type must internally be stored as small as possible.
3. GUID type must be comparable with == , != , LIKE and (NOT) IS NULL
4. GUID type must have the ability to be indexed, grouped, ordered,
DISTINCT... but not MAX(), MIN() or SUM()

  

Where do you see a need for LIKE on a GUID?

Regards,
Thomas Hallgren





  



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] UUID/GUID discussion leading to request for hexstring bytea?

2006-09-18 Thread Thomas Hallgren

[EMAIL PROTECTED] wrote:

On Mon, Sep 18, 2006 at 11:12:54AM +0200, Gevik Babakhani wrote:
Magnitude comparison on the GUID as a whole makes no 
sense to me.

I agree. Any kind of comparison except equality has no meaning for the
GUID. (And this is discussed before) I rather have the option to sort
and group for the sake of consistency and compatibility.


Thomas: The ability to sort / comparison is required for use with
B-Tree index. I prefer a fast comparison over one with more meaning.
memcmp() is fine with me and it is how I implement it in my UUID
PostgreSQL library.

Fair enough. Although the magnitudes as such makes little sense, the ability to order will 
make it possible to compare results from different queries etc. Very difficult to do with 
random order.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Proposal for GUID datatype

2006-09-11 Thread Thomas Hallgren

Jan de Visser wrote:

On Friday 08 September 2006 15:18, Gevik Babakhani wrote:

2a) Three input formats are supported.
example:
insert into tbl (fld) values('1dfb39af-b56a-40b8-a903-b5b31567c3ce');
insert into tbl (fld) values('{1dfb39af-b56a-40b8-a903-b5b31567c3ce}');
insert into tbl (fld) values('1dfb39afb56a40b8a903b5b31567c3ce');


Please extend your list. java's RMI package generates UIDs with the following 
format:


[head order 21:19]$ bsh.sh
BeanShell 1.3.0 - by Pat Niemeyer ([EMAIL PROTECTED])
bsh % print(new java.rmi.server.UID());
3b732da7:10d9029b3eb:-8000
bsh %  


So forms that use colons instead of dashes seem appropriate.

Or better still, make it configurable.

jan

RMI UID's has nothing in common with UUID's so I fail to see how they 
have any relevance here. I've never seen any other representation of the 
UUID's that the two that Gevik wants to support. Please note that UUID 
is a very well known concept and not an arbitrary 128 bit storage.


http://en.wikipedia.org/wiki/UUID is a good source of information. The 
appointed RFC actually contains source code.


Kind Regards,
Thomas Hallgren

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] GUC with units, details

2006-07-26 Thread Thomas Hallgren

Simon Riggs wrote:

don't ever need to say that K = 1000, AFAICS. I think we are safe to
assume that 


kB = KB = kb = Kb = 1024 bytes

mB = MB = mb = Mb = 1024 * 1024 bytes

gB = GB = gb = Gb = 1024 * 1024 * 1024 bytes

There's no value in forcing the use of specific case and it will be just
confusing for people.

It's fairly common to use 'b' for 'bits' and 'B' for 'bytes'. My suggestion would be to be 
much more restrictive and avoid small caps:


KB = 1024 bytes
MB = 1024 KB
GB = 1024 KB
TB = 1024 GB

Although I don't expect to see bit-rates or fractions ('m' == 'milli') in GUC, it might be 
good to use consistent units everywhere.


Regards,
Thomas Hallgren

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] plPHP and plRuby

2006-07-18 Thread Thomas Hallgren

Marc G. Fournier wrote:

Actually it would be nice to have the not-included PLs present in
src/pl/ as their own directories with a README.TXT containing fetch and
build instructions

So we would have

src/pl/plphp/README.TXT
src/pl/pljava/README.TXT
src/pl/plj/README.TXT

and anybody looking for pl-s would find the info in a logical place


*That* idea I like ...

ISTM that a clear strategy for how to deal with core, contrib, add-ons, etc. is long overdue 
and that's the reason why these discussions pop up over and over again. The question What 
are the criterion's for core inclusion? has not yet been answered. I though PL/Java 
fulfilled those criterion's but a new threshold for the #lines of code and a concern for 
code in unmaintainable language made it impossible.


The result of an unclear strategy can be perceived as somewhat unjust. There seem to be a 
very unanimous consensus that PL/pgsql belongs in core. Large object support, free text 
search and some others also receive support by everyone. These add-ons clearly belong where 
they are. The historical reasons to continuously include others are, IMHO, not so obvious 
and the result undoubtedly creates first- and second class citizens in the module flora. The 
split doesn't correlate very well with feature richness or popularity.


I have a suggestion that might help clearing things up a bit :-)

A couple of specialized teams need to be established (or rather, formalized since they 
already exists to some extent) that can be thought of as core subsidiary's. The idea is 
that such a team would take on the maintenance of one specialized area of PostgreSQL. Java, 
for instance, is such an area. PostgreSQL has a huge number of Java users. They all use the 
JDBC driver and a few use PL/Java. There's been talk about Eclipse tool support and some 
will have an interest in XA-compliance in order to gain JTA support, etc. Today, it's 
scattered all over the place. Other subsidiary teams should be formed around odbc (or .net 
perhaps), php, ruby, replication/clustering, etc. to take control over those areas.


A very important part of my suggestion is that for the normal user, it would appear that 
what a core subsidiary team contribute really is *part of* the database proper and not 
something maintained by a third-party contributor or commercial vendor.


The team would maintain their own website (although all layout would be centralized), source 
code control system, mailing list etc. but they would share a lot more of the PostgreSQL 
infrastructure then what is shared today. Important things would be:


- Documentation. Inclusion of a subsidiary module should mean that some chapters are added 
(automatically) to the user manual.

- Build farm support.
- Packaging and downloads
- Server infrastructure
- Seamless navigation from the PostgreSQL main web-site.

PgFoundry would live on like today, targeted on third-party modules and serving as an 
incubator for modules that aim to be included in core or into one of its subsidiaries.


Kind Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] monolithic distro

2006-07-14 Thread Thomas Hallgren

Andrew Dunstan wrote:


The topic here is NOT what features are missing from postgres.


Of course it is ;-)

Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Three weeks left until feature freeze

2006-07-13 Thread Thomas Hallgren

[EMAIL PROTECTED] wrote:

On the subject of 38K lines of code, much that isn't C (going by memory,
I apologize if this is wrong), how many of these lines could be/should be
shared between PL/Java and PL/J? It seems to me that the general concepts
should be in common, and that it is only how the Java interfaces with the
backend that changes. Could they not be one PL, with two mechanisms for
speaking to the backend?

By all means. An embedded JVM solution should share as much as possible with one that uses a 
remote JVM. From the users perspective there should be no difference at all. PL/Java is 
designed with this in mind. The class loader and the utility commands are based on JDBC, the 
security manager that enables the choice of trusted/untrusted execution is Java standard. A 
set of interfaces for non-standard access (PostgreSQL TriggerData in particular) was 
abstracted in order to allow different implementations. Etc.


That said, there is also code that deals with tight backend integration and is highly 
specialized to fit the embedded solution. This code is designed around the fact that 
function calls to the backend are very cheap. As an example, PL/Java contains a JDBC driver 
that is written directly on top of the SPI API. The involved C-structures are rarely copied 
or streamed. They are accessed directly using JNI functions.


I've spent some time lately, investigating what it would take to complement PL/Java with a 
remote JVM option. The major challenge lays in the impedance mismatch caused by concerns 
that one must consider when using RPC (limit the number of calls) compared to the current 
design (avoid copying and streaming).


Kind regards,
Thomas Hallgren

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Three weeks left until feature freeze

2006-07-13 Thread Thomas Hallgren

Joshua D. Drake wrote:


JDBC is different, in that it doesn't require the PostgreSQL core to
build. It's 100% native Java, and as such, I see benefit to it being
distributed separately.


PLJava does not need PostgreSQL core to build either. It needs:

pgxs + Postgresql libs + PostgreSQL headers

In essence the PostgreSQL SDK.

If I read what Thomas wrote (late) last night correctly.


You did.

Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Three weeks left until feature freeze

2006-07-13 Thread Thomas Hallgren

Jonah H. Harris wrote:


But, I can't find anything there to download ... just a pointer to a 
Wiki,

which, I'm sorry, would definitely not be my first thought to go look at
for a downloads ...


Hmm, yes... just saw that and it is a bit odd.  Thomas, I like the
layout of the Wiki... but could we move the project files to pgfoundry
for hosting and set the project's home page as the wiki?

Yes, that sounds reasonable. I'll look into that. What I really would 
like is to move the whole project (aside from the Wiki) from gborg to 
pgfoundry.


Kind regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Three weeks left until feature freeze

2006-07-13 Thread Thomas Hallgren

Marc G. Fournier wrote:


So, let's try ftp ...

ftp.postgresql.org:/pub/projects/gborg/pljava/stable:

Nothing there newer then November 2005:

ftp ls -lt
227 Entering Passive Mode (66,98,251,159,248,251)
150 Opening ASCII mode data connection for /bin/ls.
total 23026
-rw-r--r--  1 80  1009  206134 Nov 20  2005 pljava-src-1.2.0.tar.gz
-rw-r--r--  1 80  1009  522895 Nov 20  2005 
pljava-i686-pc-mingw32-pg8.1-1.2.0.tar.gz
-rw-r--r--  1 80  1009  522955 Nov 20  2005 
pljava-i686-pc-mingw32-pg8.0-1.2.0.tar.gz
-rw-r--r--  1 80  1009  421717 Nov 20  2005 
pljava-i686-pc-linux-gnu-pg8.1-1.2.0.tar.gz
-rw-r--r--  1 80  1009  421999 Nov 20  2005 
pljava-i686-pc-linux-gnu-pg8.0-1.2.0.tar.gz


so, if there is a newer version (I actually eventually went to the 
wiki, so know there is a 1.3.0), its not taking advantage of the 
PostgreSQL file distribution network that has been developed over the 
years ...


How would I go about taking advantage of that? And who did the 1.2.0 
upload? I certainly didn't.


Kind Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Three weeks left until feature freeze

2006-07-13 Thread Thomas Hallgren

Marc G. Fournier wrote:
How would I go about taking advantage of that? And who did the 1.2.0 
upload? I certainly didn't.


There is alot more then then just 1.2.0 ... check out the FTP site ...

As for taking advantage of that ... upload files to the file section 
in *either* gborg or pgfoundry, and they get auto-included as part of 
the ftp network ...


The PL/Java 1.3.0 release has been on gborg for several weeks but only 
available through the wiki (the gborg generated 'download' page is messy 
and I'm not able to remove stuff that shouldn't be there). I guess 
that's why it was not mirrored. And yes, I agree wholeheartedly, a wiki 
is not the most intuitive place for downloads. Per Jonahs suggestion 
I've just uploaded everything to pgfoundry too.


Thanks for uploading the 1.3.0 to the ftp.

Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Three weeks left until feature freeze

2006-07-13 Thread Thomas Hallgren

Marc G. Fournier wrote:


I'm confused here ... has been on gborg for several weeks, but only 
available through the wiki ...


On: http://gborg.postgresql.org/project/pljava/projdisplay.php ... I 
can't find any way of downloading 1.3.0 (or, older releases even) ... 
have you been uploading, but nobody activated teh Files section to 
download? :(


Yes, I've been uploading to gborg and the links provided on the wiki 
appoints those files. I don't *want* to activate the downloads section 
since it exposes a page with a lot of files that I doesn't belong there. 
Unfortunately, there's no way to remove them. The Files section on 
pgfoundry looks a lot better :-)


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Three weeks left until feature freeze

2006-07-13 Thread Thomas Hallgren

Marc G. Fournier wrote:
... the only reason 'NetBSD doesn't offer pl/java now' is because nobody 
a) is using it under NetBSD or b) submitted a port to their system


Should be fairly straight forward if the PostgreSQL SDK and gcj 4.0 or later is installed. 
Download the PL/Java source tarball, make sure pg_ctl is in your path and type 'make 
USE_GCJ=1 release'


Alternatively, set JAVA_HOME to appoint some other JRE and just type 'make 
release'

Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Three weeks left until feature freeze

2006-07-13 Thread Thomas Hallgren

Marc G. Fournier wrote:
But Thomas, that means finding someone willing to do the work to build 
the port ... :)


PL/java should be very easy to port. In fact, I'm not sure any specific porting is needed. 
There might be some minor makefile quirk (that is what has bitten me on other platforms). I 
don't have access to a FreeBSD machine so I can't try it.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Three weeks left until feature freeze

2006-07-12 Thread Thomas Hallgren

Hi Dave,
Sorry I missed you at the Summit. I would've liked to discuss PL/J 
versus PL/Java with you.


What is the status of PL/J? I haven't seen much activity there over the 
last 10 months. Does it run on Windows yet? Are you planning a first 
release anytime soon? Do you have any active users? Does the project 
still have over 40 dependencies to other components? The last time I 
looked (August last year) a beta-0.1.1 was planned. I didn't manage to 
built it and it didn't seem anywhere close production readiness.


Perhaps it's no surprise that I disagree when you say PL/J could be 
considered in the same light as PL/Java. Then again, I'm fairly biased ;-)


Regards,
Thomas Hallgren


Dave Cramer wrote:

Absolutely PL/J should be considered in the same light as PL/Java.

Consider this a request for PL/J to be included in the core.

Dave
On 11-Jul-06, at 12:50 PM, Josh Berkus wrote:


David,


It's good to integrate things with the core as needed.  What plans do
we have to integrate PL/J?


None, if the PL/J team doesn't speak up.  So far I have yet to see a 
request for PL/J or even a release notice.


--Josh

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings






---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Three weeks left until feature freeze

2006-07-12 Thread Thomas Hallgren

Andrew Dunstan wrote:
There is in effect no API at all, other than what is available to all 
backend modules. If someone wants to create an API which will be both 
sufficiently stable and sufficiently complete to meet the needs of the 
various PLs (especially, as Hannu rightly observes, any new PLs that 
come along) then  we can revisit this question. Until then I suggest 
that it is at best premature. I am not even sure such a thing is 
actually possible.


I concur with this. The needs for a module like PL/Java is very different then the needs of 
PL/Perl so let's get some more PL's in before we do a refactoring effort to create common 
API's. Personally, I'm not sure what would be included. The call handler API's together with 
the SPI API's are in essence what you need. The rest is fairly specialized anyway.



Also there is this: speaking as someone who actually does some work in 
this area, I very much appreciate having the eagle eyes of people like 
Tom, Neil and Joe on what's going on, and keeping things on the straight 
and narrow. I at least would feel lots less comfortable about 
maintaining things without such help.


This is partly why I'd like to get PL/Java included. Not that I expect any of them to devote 
resources to PL/Java but I think that they, from time to time, will visit the code. If not 
for anything else then to see why some other change caused build failures. It's always 
easier to have discussions around code that you know they all have on disk.



The Postgres hacker community is small. I am not sure there is an 
adequate pool of people who will maintain the momentum of each 
sub-project that we might choose to orphan. If we had thousands of eager 
code cutters it might be different, but we don't, really.


As the project grows for various reasons, the number of hackers in the community will grow 
as well. PL/Java for instance, does not come without resources :-)


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Three weeks left until feature freeze

2006-07-12 Thread Thomas Hallgren

Joshua D. Drake wrote:


Well I know it isn't an API per say, but one interesting tid bit as an example 
is that PLphp does not need the PostgreSQL source to compile. It only needs 
pgxs and the relevant headers etc...


Perhaps that is one way to go... All PLs use pgxs?

  
PL/Java does. No source needed. So yes, there's already a fairly good 
API that assists in the module build process. It does however still 
include all header files needed by the backend and thus, leaves the 
backend wide open (in a matter of speech). If a refactoring effort was 
to start later on, that would be a good place to start. I.e. divide 
headers into the ones available for external modules and the ones for 
internal backend use only.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Three weeks left until feature freeze

2006-07-12 Thread Thomas Hallgren

Dave Cramer wrote:


I expect to see a new release shortly.

Dave, I tried to obtain the source but whenever I try I get:

 [thhal]$ cvs -d 
:pserver:[EMAIL PROTECTED]:/home/projects/plj/scm login
 Logging in to 
:pserver:[EMAIL PROTECTED]:2401/home/projects/plj/scm

 CVS password:
 /home/projects/plj/scm: no such repository

I can browse the source using the web interface though. Judging from 
that, there's been no CVS activity since I last tried, i.e. august last 
year. Is the source being maintained somewhere else? How do I obtain the 
latest CVS? Judging from your statement a lot must have happened that 
would be interesting to look at.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Three weeks left until feature freeze

2006-07-12 Thread Thomas Hallgren

Tom Lane wrote:

...  equal claim to inclusion
in core.  Perhaps more, because it gives us an extra layer of insulation
from JVM licensing questions.


Tom,
Why to you persist talking about licensing issues with PL/Java? There are none. PL/Java 
builds and runs just fine with gcj and the above statement is completely false.


Dave,
What JVM requirements does PL/J currently have? What license implications are imposed by the 
components that it depends upon?


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Three weeks left until feature freeze

2006-07-12 Thread Thomas Hallgren

Josh Berkus wrote:


Perhaps it's no surprise that I disagree when you say PL/J could be 
considered in the same light as PL/Java. Then again, I'm fairly 
biased ;-)


This attitude does you no credit, Thomas.

My diplomatic skills are somewhat limited :-) I might be blunt at times. 
I'm sure there are other more subtle ways to get the message through. 
I'm trying to be honest and up-front. IMO, that should count for something.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Three weeks left until feature freeze

2006-07-12 Thread Thomas Hallgren

Tom Lane wrote:

Thomas Hallgren [EMAIL PROTECTED] writes:
  
Why to you persist talking about licensing issues with PL/Java? There are none. PL/Java 
builds and runs just fine with gcj and the above statement is completely false.



Um ... if you use it with gcj, there may or may not be any licensing
problems (please recall we are trying to be a BSD-only project, not a
BSD-and-LGPL project),

You have no problems using gcc, gnu-make, etc. What's the difference?


 but what of people who use some other JVM?
It's not like gcj works for everyone yet.

  
What of them? If they decide to use another JVM, well, then let them. I 
don't see where that becomes a licensing problem from PostgreSQL.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Three weeks left until feature freeze

2006-07-12 Thread Thomas Hallgren

Joshua D. Drake wrote:
What happens when the FSF inevitably removes the license clause and 
makes it pure GPL?


I'm sorry but I don't follow. You're saying that it's inevitable that 
FSF will remove the 'libgcc' exception from libgcj? Why on earth would 
they do that? My guess is that it will go the other way (i.e. LGPL). 
What's the logic in having different licenses on libg++ and libgcj?


Now all of this being said, I doubt there is actually an issue here 
because:


It doesn't HAVE TO BE BUILT, it is not a derivative product.

Well, assume that FSF indeed did remove the exception. It would take me 
30 minutes or so to create a substitute BSD licensed dummy JNI library 
with associated headers that would allow PL/Java to be built without any 
external modules at all. It's then completely up to the user what he/she 
wants to slot in as a replacement.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Three weeks left until feature freeze

2006-07-12 Thread Thomas Hallgren

Josh Berkus wrote:

Thomas,

I'm starting to have second thoughts about this suggestion.  I was 
enthusiastic about it at the summit, but I was unaware of the sheer size 
of PL/Java.  38,000 lines of code is 8% of the total size of Postgresql 
... for *one* PL.


Dave Cramer acquainted me with some of the difficulties of doing a Java 
PL today, and I understand why it needs to be that large.  However, 
38,000 lines of code -- much of it in a non-C language -- presents a 
possible debugging/maintenance major headache, especially if you someday 
left the project for some reason.


OK. You're the one that suggested this submission attempt. There's not much point in 
pursuing it if you have second thoughts.


 Maybe we do need to look at a plug-in build tool, instead.

Right, something that would allow PL/Java to participate in a build farm. That would be cool 
and also resolve a some of the issues.


Regards,
Thomas Hallgren

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Three weeks left until feature freeze

2006-07-12 Thread Thomas Hallgren

Joshua D. Drake wrote:


Well, assume that FSF indeed did remove the exception. It would take 
me 30 minutes or so to create a substitute BSD licensed dummy JNI 
library with associated headers that would allow PL/Java to be built 
without any external modules at all. It's then completely up to the 
user what he/she wants to slot in as a replacement.


Do we want to do that? I mean (and I am not saying it is, I am asking) 
is that a bit grey? I would prefer it be black and white.


The JNI API is an open standard so I have every right to create a BSD 
licensed dummy for it. The user may choose a JVM from IBM, Sun, BEA, or 
other (like GCJ) to run. That's the essence of having a standardized 
API. What can possibly be 'grey' about that?


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Three weeks left until feature freeze

2006-07-11 Thread Thomas Hallgren
I'd like to submit PL/Java into core for 8.2 if possible. Personally, I see the following 
action items to make it happen:


1. A hackers discussion to resolve any issues with the submission.

Provided that #1 has a positive outcome:

2. The PL/Java CVS must be moved from gborg and become part of the PostgreSQL CVS (can this 
be done with version history intact?).

3. The regression tests need some work in order to fit in with the build farm.
4. Documentation must be ripped from the PL/Java Wiki and transformed into the format used 
by PostgreSQL.

5. I'll need committer rights to the PL/Java part in order to maintain it.
6. The pljava-dev mailing list, currently at gborg, must (perhaps) be moved also. An 
alternative is to remove it and instead refer to jdbc, general, and hackers.


Given guidance, I'll do the steps #3 and #4.

External dependencies:
Platforms where PL/Java is ported must either support GCJ 4.0 or higher or have a Java 
Runtime Environment 1.4.2 or higher installed.


Regards,
Thomas Hallgren

Bruce Momjian wrote:

There are roughly three weeks left until the feature freeze on August 1.
If people are working on items, they should be announced before August
1, and the patches submitted by August 1.  If the patch is large, it
should be discussed now and an intermediate patch posted to the lists
soon.

FYI, we don't have many major features ready for 8.2.

--
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Three weeks left until feature freeze

2006-07-11 Thread Thomas Hallgren

Josh Berkus wrote:

Tom,

What about licensing issues?  Does PL/Java work with any 
entirely-open-source

JVMs?  If not, what is the legal situation for distributing PG+PL/Java?


Actually, Sun has re-licensed the JRE to make it OSS-compatible (it's 
now available for Debian, for example)  They're doing a Java licensing 
session at  OSCON if you have any specific questions, or I can ping 
the Java Licensing Guru directly.  But even if other JRE's aren't 
supported, licensing shouldn't be an obstacle.


I don't see any license issue at all regardless. PL/Java is satisfied 
with GCJ 4.0 or higher and compiling with that doesn't affect the binary 
more then using gcc does. No JVM is required when using GCJ.




I'm also a bit concerned about size.  By my count, lines of source code:

plpgsql19890
plperl4902
plpython4163
pltcl4498
pljava 1.3.038711

IOW pljava is (already) bigger than the other four PLs put together.


That is odd.  Thomas?


It's not that odd really:

1. the mapping is strongly typed, i.e. each scalar type in PostgreSQL 
has a set of functions that maps it to the correct primitive in Java 
(int4 is a java int, double precision is a double etc.). PL/Java will 
resort to string coercion only when no other option is left.
2. a type mapping is provided for *all* types. Scalar, composite, 
pseudo, array types, and result sets.
3. new Java mappings can be created on the fly. Both for scalar and 
composite types.
4. you can create new scalar types in PostgreSQL that uses IO functions 
written in Java.
5. the Java code contains it's own API documentation (standard java-doc 
comments on classes and methods).
6. the code is written to conform to standard interfaces such as the 
JDBC interfaces (from a #lines perspective, perhaps not always the most 
optimal way of doing it but it does bring a bunch of other advantages).
7. extensive error handling is included that allow try/catch semantics 
when checkpoints are used.
8. extreme measures has been taken to ensure that the backend is never 
exposed to more then one thread at a time.

...
(from the top of my head, there are probably more reasons)

IMHO, this is yet another reason to actually include it in core. I'm not 
an expert on the other PL's but my guess is that PL/Java is far more 
sensitive to API changes in the backend core.


Regards,
Thomas Hallgren





---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Three weeks left until feature freeze

2006-07-11 Thread Thomas Hallgren

Hi Hannu,

Hannu Krosing wrote:

Maybe this functionality could be lifted out of PL/Java and made
available to all PL-s ?

At least at some API level.

  
I think that what could be shared are the ideas and the semantics. The 
API's that the backend currently expose will give you what's needed to 
do the specialized implementations.


The actual PL mappings are all different because their respective 
executor is different. Some languages are typed, others are not. Some 
languages support classes and objects, others don't. Other, more 
esoteric details like the use of a garbage collector or traditional 
alloc/free semantics also affects the actual implementation.


I'm afraid there's not much in the PL/Java type system that could be 
generalized and shared. Perhaps if we had other languages with very 
similar capabilities (like C# for instance) but even then I have some 
doubts. The good news in my opinion is that if PL/Java would make it to 
the core it could make a good reference implementation for other equally 
advanced language mappings.


Regards,
Thomas Hallgren






---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] [GENERAL] UUID's as primary keys

2006-07-06 Thread Thomas Hallgren

Greg Stark wrote:


In answer to your question, though my opinion carries no special weight at
all, I would suggest adding a bare bones 16-byte data type to core and a
second binary-compatible data type based on it that parsed/output as uuids.
The extended uuid libraries should only go in pgfoundry/contrib.


I second that.

Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as

2006-06-30 Thread Thomas Hallgren

Josh Berkus wrote:
 Jim,

 I agree about splitting the utilities, except that I think the database
 should be able to generate UUIDs somehow.

 There is a GUID add-in, and someone is working on a 2nd one.  UUIDs 
are not part of the SQL standard, and we've only seen sporadic demand 
for them (and different types each time) so I can't imagine one making 
it further than contrib real soon.


 Also, one could argue that UUIDs are a foot gun, so they're not 
exactly the type of thing we want to advocate in advance of demand.



Martijn van Oosterhout wrote:
 It seems to me that maybe the backend should include a 16-byte fixed
 length object (after all, we've got 1, 2, 4 and 8 bytes already) and
 then people can use that to build whatever they like, using domains,
 for example...

So how about the split? I.e. just add a 16 byte data type and forget all 
about UUID's for now.


Regards,
Thomas Hallgren



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as

2006-06-29 Thread Thomas Hallgren

[EMAIL PROTECTED] wrote:

On Wed, Jun 28, 2006 at 01:12:17PM -0500, Jim C. Nasby wrote:

On Wed, Jun 28, 2006 at 01:49:55PM -0400, Andrew Dunstan wrote:
Personally I don't buy the misuse objection - we already have plenty of 
things that can be misused. As long as there is a reasonable valid use 
and we can make it portable enough, I think there is a good case for 
including it.

Well, since Mark has one, how about we consider adding it in?
If nothing else, can you please put your stuff on pgFoundry so others
can find it, Mark?


It was written by Nathan Wagner [EMAIL PROTECTED] and myself, and
is based off the OSSP ( http://www.ossp.org/ ) UUID implementation.
I'm not an expert on the license, but it seems acceptable to me:

 Permission to use, copy, modify, and distribute this software for
  any purpose with or without fee is hereby granted, provided that
  the above copyright notice and this permission notice appear in all
  copies.

I haven't tested to see how portable the OSSP UUID implementation is.
This is their words:

 OSSP uuid was already written with maximum portability in mind, so
  there should be no great effort required to get it running on any Unix
  platform with a reasonable POSIX API. Additionally, the portability
  was tested by successfully building and running it on the following
  particular Unix platforms (syntax is cpu-os (compiler)):

  alpha-tru644.0 (cc)
  alpha-tru645.1 (gcc, cc)
  hppa-hpux11.11 (cc)
  ia64-hpux11.23 (cc)
  ix86-debian2.2 (gcc, icc)
  ix86-debian3.0 (gcc)
  ix86-debian3.1 (gcc)
  ix86-freebsd4.9 (gcc)
  ix86-freebsd5.2 (gcc, icc)
  ix86-netbsd1.6 (gcc)
  ix86-qnx6.2 (gcc)
  ix86-solaris10 (gcc)
  ix86-unixware7.1.3 (cc)
  mips64-irix6.5 (gcc)
  sparc64-solaris8 (gcc, forte)
  sparc64-solaris9 (gcc)

I've put it through a fair amount of testing, including using it
within compound indexes, expecting the index to be used for at
least '=', constructing many UUIDs quickly, in a sequence, and
converting it to and from string form. We chose to implement our
own encode / decode routines for performance reasons. With the
exception of testing it on a wider range of platforms, I would
call the module stable.

If there is interest - I'm sure Nathan and I would be willing to put
it on pgfoundry, and at some point give it up for inclusion into
PostgreSQL.


One requirement would be that it runs on Windows. Is that something you have 
tested?

Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as

2006-06-29 Thread Thomas Hallgren

[EMAIL PROTECTED] wrote:

On Thu, Jun 29, 2006 at 02:02:32AM -0400, [EMAIL PROTECTED] wrote:

It was written by Nathan Wagner [EMAIL PROTECTED] and myself, and
is based off the OSSP ( http://www.ossp.org/ ) UUID implementation.
I'm not an expert on the license, but it seems acceptable to me:
...
If there is interest - I'm sure Nathan and I would be willing to put
it on pgfoundry, and at some point give it up for inclusion into
PostgreSQL.


This might require a little bit of research. It appears that the
development version of OSSP UUID may provide its own PostgreSQL
'bindings'. I may try and contact the author of the OSSP UUID and
see whether any changes we have that he does not, can be rolled
into his version...

Cheers,
mark

I'm thinking ahead on possible objections to inclusion in core. One objection might be that 
a fully blown UUID implementation is a lot of code. Code that needs to be maintained and it 
increases the size of the binary etc. A solution to that might be to break the whole thing 
up in two:


1 The actual type
A plain scalar type that stores 16 bytes. It's complete with standard operators for 
comparison (natural order) and the text representation would be a 32 character hexadecimal 
string. This type should make no interpretation whatsoever on what it stores and its only 
association with UUID's is the storage size.


2 UUID utilities
Various ways of representing, generating, and extract partial information from UUID's. 
Should have support for variants #0, #1, and #2 (the OSSP based code sounds like a good 
candidate).



The split make sense since clients often have powerful UUID utilities handy and hence have 
limited or no use for such utilities in the database (true for all .NET and Java clients). 
Some PL's will also enable such packages out of the box.


The actual type would be extremely generic, lightweight, and easy to implement. No 
portability issues whatsoever. The only difficulty that I see is naming it :-).


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] [GENERAL] UUID's as primary keys

2006-06-29 Thread Thomas Hallgren

Greg Stark wrote:

Martijn van Oosterhout kleptog@svana.org writes:

  

To be honest, it seems like a lot of work to save the four bytes of
overhead for the varlena structure on disk if you're going to need it
in memory anyway. And anything like RAW(16) which people want for
UUIDs, if it's going to have a lot of functions associated with it, may
as well just be a new type. 



For large databases storage density leads directly to speed. Saving four bytes
of overhead on a 16-byte data structure would mean a 20% speed increase. Even
if that's only helpful on a tenth of the columns you're still talking about a
2% speed increase for all queries on the table. A lot of databases use CHAR(1)
for flags. The overhead is even worse there.

  
I have to concur with this. Assume you use a bytea for a UUID that in 
turn is used as a primary key. The extra overhead will be reflected in 
all indexes, all foreign keys, etc. In a normalized database some tables 
may consist of UUID columns only.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] [GENERAL] UUID's as primary keys

2006-06-29 Thread Thomas Hallgren

Martijn van Oosterhout wrote:

On Thu, Jun 29, 2006 at 03:54:36PM +0200, Thomas Hallgren wrote:
  
I have to concur with this. Assume you use a bytea for a UUID that in 
turn is used as a primary key. The extra overhead will be reflected in 
all indexes, all foreign keys, etc. In a normalized database some tables 
may consist of UUID columns only.



So you create a UUID type. It's cheap enough to create new types after
all, that's one of postgresql's strengths.

It would be a whole lot easier if I could use a domain.


 What I'm saying is that it's
easier to create new fixed length types for the cases that need it,
than it is to redo the entire type handling of the backend.

  
Of course. But it's a matter of who does what. Your reasoning push the 
burden to the users.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as primary keys

2006-06-28 Thread Thomas Hallgren

Martijn van Oosterhout wrote:

On Wed, Jun 28, 2006 at 01:56:47PM +0200, Thomas Hallgren wrote:
A user that is trusted with installing a C-function in the backend is 
free to scan the process memory anyway so in what way did that increase 
the security? IMHO, the only relevant security in that context is to 
have trusted people install trusted modules. I'm surprised that 
something like that made you remove significant functionality.


You're missing the point. The type output function is not generally a
priveledged function. Think bpcharout, text_out, numeric_out, etc...
These can be called by users directly and the input to those functions
cannot be trusted.

Ah, OK that makes sense. An alternative solution when the signature was changed could 
perhaps have been to pass one single argument, a structure appointing the data and its 
associated type. My idea would work if the data and its type lived together always from the 
moment its instantiated (read from disk or otherwise) and until death do them apart (or the 
data is stored on disk, in which case the tupledesc knows what it is). I guess that would 
imply a major rewrite and that my desire to have a RAW fixed length type isn't enough 
motivation to do that :-)


Instead, I would like to humbly request the inclusion of a UUID datatype (or an opaque 128 
bit datatype) in the core package. It's increasingly common and some databases (MS 
SQLServer) already have built in support for it.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] refcount leak warnings

2006-06-27 Thread Thomas Hallgren
I have a PL/Java user that performs some lengthy operations. Eventually, 
he get warnings like:


WARNING: buffer refcount leak: [779] (rel=1663/16440/52989, blockNum=3, 
flags=0x27, refcount=1 2)


I traced this to the function PrintBufferLeakWarning. AFAICS, it's only 
called from the function ResourceOwnerReleaseInternal under the 
following comment:


* During a commit, there shouldn't be any remaining pins --- that
* would indicate failure to clean up the executor correctly --- so
* issue warnings.In the abort case, just clean up quietly.

I have no idea where to go from here. What should I look for when trying 
to find the cause of such warnings?


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] [Fwd: Re: [Pljava-dev] char with trailing space, PreparedStatement.setObject SetString]

2006-06-27 Thread Thomas Hallgren
There's an inconsistency between the handling of trailing whitespace in 
query parameters in the client jdbc driver compared to the PL/Java SPI 
based driver. According to Jean-Pierre, the former apparently trims the 
trailing spaces before passing the query (see below). What is the 
correct behavior?


Regards,
Thomas Hallgren


 Original Message 
Subject: 	Re: [Pljava-dev] char with trailing space, 
PreparedStatement.setObject  SetString

Date:   Tue, 27 Jun 2006 12:07:19 -0400
From:   JEAN-PIERRE PELLETIER [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]



Hi Thomas,

There are very few char columns on my system, I can easily live with 
explicit trim

in my application code, I only wanted to let you guys know.

I am not sure which of the two JDBC implementations is right.
psql and pgadmin would both handle char as expected.
As for JDBC, you might want to know how other dbms are handling this.

Thanks for your reply.
Jean-Pierre Pelletier


From: Thomas Hallgren [EMAIL PROTECTED]
To: JEAN-PIERRE PELLETIER [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
Subject: Re: [Pljava-dev] char with trailing 
space,	PreparedStatement.setObject  SetString

Date: Tue, 27 Jun 2006 17:47:24 +0200

Hi Jean-Pierre,
I'm not sure this is incorrect behavior. There's nothing in the spec that 
indicates that String values should be trimmed by setString and setObject. 
On the contrary. Some datatypes (the CHAR in particular) are sensitive to 
whitespace according to the SQL standard. Perhaps the client jdbc driver is 
doing something wrong here?


Regards,
Thomas Hallgren


JEAN-PIERRE PELLETIER wrote:

Hi,

Trailing space are not handled properly by setObject  setString.

PreparedStatement pstmt = connection.prepareStatement(
   select * from mytable where mycharcolumn = ?);

String myString = abc ;
pstmt.setObject(1, myString); // or setObject(1, myString, Types.CHAR) or 
setString(1, myString)


No rows are returned, but using trim works fine as in:
pstmt.setObject(1, myString.trim());

My environment is Pl/Java 1.3, Sun JDK 1.5.07,  PostgreSQL 8.1.4, Windows 
XP SP2


With PostgreSQL own (non pl/java) jdbc driver, setObject on char works 
fine without the trim.


Thanks,
Jean-Pierre Pelletier


___
Pljava-dev mailing list
[EMAIL PROTECTED]
http://gborg.postgresql.org/mailman/listinfo/pljava-dev





---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [Fwd: Re: [Pljava-dev] char with trailing space, PreparedStatement.setObject

2006-06-27 Thread Thomas Hallgren

Sorry, wrong list... I reposted this on pgsql-jdbc instead.



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[HACKERS] Shared library conflicts

2006-06-23 Thread Thomas Hallgren
There was a discussion some time back concerning the linking of the 
postgres backend. Some libraries where linked although they where not 
needed, mainly because it was convenient. I had a problem with PL/Java 
since a Sun JVM ships with their own version of libz.so (they call it 
libzip.so). Sun fixed the problem, simply by adding a namespace to all 
functions in their own library so I didn't stress the issue any further. 
Now this bites me again. Sun forgot to remove the old functions in their 
libzip.so on the amd64 platform.


I have of course reported this to Sun and I expect it to be fixed in 
future releases of Java but shouldn't this be fixed for PostgreSQL too? 
Perhaps we should consider this a more generic problem. The more 
libraries that are linked with the backend, the greater the risk that 
add-on modules will run into conflicts. What effort would be involved to 
fix this once and for all?


Another related question. What happens when I use --without-zlib? Does 
it have any effect on besides disabling compression for the dump/restore 
utilities? Is there anyway to make it affect the backend only?


Kind Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Shared library conflicts

2006-06-23 Thread Thomas Hallgren
That's good news. Any chance of getting this fix backported to 8.1? Or 
at least, the libz part of it?


Regards,
Thomas Hallgren


Martijn van Oosterhout wrote:

On Fri, Jun 23, 2006 at 10:45:29AM +0200, Thomas Hallgren wrote:
  
There was a discussion some time back concerning the linking of the 
postgres backend. Some libraries where linked although they where not 
needed, mainly because it was convenient.



AIUI, this was fixed in -HEAD. In the Makefile there's a filter command
to strip libraries not needed. One of them is libz.

From src/backend/Makefile:
# The backend doesn't need everything that's in LIBS, however
LIBS := $(filter-out -lz -lreadline -ledit -ltermcap -lncurses -lcurses, 
$(LIBS))

Have a nice day,
  



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[HACKERS] Unable to initdb using HEAD on Windows XP

2006-06-18 Thread Thomas Hallgren
I just compiled a fresh copy from CVS head. I then tried to do an initdb 
as user 'postgres' (non admin user on my system). I get the following error:


 C:\Tada\Workspaceinitdb -D data
 The files belonging to this database system will be owned by user 
postgres.

 This user must also own the server process.

 The database cluster will be initialized with locale English_United 
States.1252.



 creating directory data ... ok
 creating subdirectories ... initdb: could not create directory 
data/global: Permission denied

 initdb: removing data directory data
 could not open directory data: No such file or directory
 initdb: failed to remove data directory

AFAICS, no data directory is ever created so the 'creating directory 
data ... ok' message is probably incorrect. I even tried to change the 
permissions on the parent directory so that user 'postgres' has full 
control. It doesn't help. I didn't think it would since I am able to 
create a database in this directory if I'm using version 8.1.4.


Regards,
Thomas Hallgren



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Unable to initdb using HEAD on Windows XP

2006-06-18 Thread Thomas Hallgren

Some more info. If I manually create the data directory first, the output is 
different:

  C:\Tada\Workspacemkdir data

  C:\Tada\Workspaceinitdb -D data
  The files belonging to this database system will be owned by user postgres.
  This user must also own the server process.

  The database cluster will be initialized with locale English_United 
States.1252.


  fixing permissions on existing directory data ... ok
  creating subdirectories ... ok
  selecting default max_connections ... 100
  selecting default shared_buffers/max_fsm_pages ... 4000/20
  creating configuration files ... ok
  creating template1 database in data/base/1 ... ok
  initializing pg_authid ... child process was terminated by signal 5
  initdb: removing contents of data directory data


Thomas Hallgren wrote:
I just compiled a fresh copy from CVS head. I then tried to do an initdb 
as user 'postgres' (non admin user on my system). I get the following 
error:


 C:\Tada\Workspaceinitdb -D data
 The files belonging to this database system will be owned by user 
postgres.

 This user must also own the server process.

 The database cluster will be initialized with locale English_United 
States.1252.



 creating directory data ... ok
 creating subdirectories ... initdb: could not create directory 
data/global: Permission denied

 initdb: removing data directory data
 could not open directory data: No such file or directory
 initdb: failed to remove data directory

AFAICS, no data directory is ever created so the 'creating directory 
data ... ok' message is probably incorrect. I even tried to change the 
permissions on the parent directory so that user 'postgres' has full 
control. It doesn't help. I didn't think it would since I am able to 
create a database in this directory if I'm using version 8.1.4.


Regards,
Thomas Hallgren



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Unable to initdb using HEAD on Windows XP

2006-06-18 Thread Thomas Hallgren

Never mind. I scrubbed my folders and obtained a new fresh copy from CVS. Now 
it works.

Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] PG_MODULE_MAGIC

2006-06-17 Thread Thomas Hallgren
A module magic patch was added recently and I'm a bit uncertain what the implications are 
for the external PL modules. Does it affect them at all? Will I need to provide separate 
binaries for each bug fix release even though the API's do not change? Exactly how is the 
magic determined?


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] PG_MODULE_MAGIC

2006-06-17 Thread Thomas Hallgren

Tom Lane wrote:

No, each major release (8.2, 8.3, etc).  There are hardly ever any major
releases where you wouldn't need a new compilation anyway ...

  

True. I'm all in favor of a magic used this way. It will save me some grief.

Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Proposal for debugging of server-side stored procedures

2006-06-11 Thread Thomas Hallgren

Mark Cave-Ayland wrote:


I think that Java is quite unusual in that the design of JPDA is inherently
client/server based to the point where they have defined the platform to
allow you interact with the JVM via a socket. Unfortunately the same can't
be said for Perl/Python - as you suggest passing the parameters into the
interpreter is a trivial exercise but because the debugging classes in both
languages are console interactive, the only thing you could do is redirect
the console output to a socket (see
http://search.cpan.org/dist/perl/lib/perl5db.pl and the RemotePort option in
the case of Perl).

Obviously I'm not a Perl nor Python hacker. I just find it hard to believe that languages 
that has matured over a decade doesn't have a remote debug option that can be used. Sockets, 
shared-memory, pipes, or whatever. You will be able to attach to it somehow given it's been 
launched with the correct flags.


I think you're wrong in thinking that client/server based debugging is in any way unusual. 
Googling for perl+debug+remote tells me that client/server based Perl debugging is common. 
There are a number of solutions. The same is true for Python.




What I would like to see is some form of IDE that our developers can use,
probably under Windows, that would enable them to step through and debug
stored procedures on a remote server on another network. Using simple
console redirection would involve parsing text output which strikes as being
something that would break easily with new releases. 

Good IDE's exists already, for both perl, python, and others (you mention Komodo further 
down). In addition to that you have well tested Emacs mappings and command line clients that 
people like to use. The more prominent ones will always provide upgrades when there are 
protocol changes.




And then of course, there is the problem of security whereby anyone could
connect to the socket. For example, imagine a group of developers all
debugging different functions simultaneously; if one of them connected to
the wrong console socket then it could be confusing as the developer wanders
why their code never stops at a breakpoint.


In my experience you have two use-cases.

1. You debug during development and have either have your own database instance to play 
around with or a shared sandbox database where the security is very low.
2. You debug a running instance on a live server and the sysadmin is well aware what you're 
doing. You will be given required privileges as needed.


I would argue that the times when security becomes an issue when debugging are extremely 
rare and not worth spending lots of time and effort on. It is enough to prevent anyone but a 
super-user (or even a sysadmin) to start a remote debug session.




... the only thing that PL developers would have to is
provide an API for things like step, set breakpoint, read variable, and
eval.

Well, the API isn't worth much without an implementation. So in essence, you're saying that 
the only thing the PL developers would have to do is to provide a full blown debug server ;-)




A promising option at the moment would be to implement the DBGP protocol for
Perl/Python/Tcl as suggested by Lukas, mainly because it appears ActiveState
have already written the modules to do this (see
http://aspn.activestate.com/ASPN/Downloads/Komodo/RemoteDebugging).


There you go! Perl, PHP, Python, and Tcl all taken care of. IDE and all!



The only
issue I can see with this is again related to security in that the debugger
would not respect the ACLs within PostgreSQL which could potentially allow a
user to break inside a function that wasn't his/her own.

I'd use the Komodo IDE and implement the ability to start the PL using a GUC setting per my 
original suggestion (with super-user requirement). Trivial solution, minimum effort, and 
very useful. KISS principle.


It would be great if we could agree on a GUC flag (or flags) that would control debugging 
for all PL's. At present, all PL implementors must use their own (module specific) flags.


Kind regards,
Thomas Hallgren


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Proposal for debugging of server-side stored procedures

2006-06-09 Thread Thomas Hallgren

Some thoughts from another Tom...

Mark Cave-Ayland wrote:

... debugging in Perl is initiated with perl -d somefile.pl which then
becomes a special interactive interpreter session. The same is also true
with Python which is launched in a similar way, python -m pdb somefile.py.

All PL's are launched someway or another . It would probably be very simple to add the '-d' 
flag to the PL/Perl launcher and the '-m' flag to the PL/Python launcher and control it with 
module specific GUC settings. PL/Java already does this. The SQL to initialize debugging 
looks like this:


SET pljava.vmoptions TO '-agentlib:jdwp=transport=dt_shmem,server=y,suspend=y';

This tells the VM to load in-process debugging libraries and specifies the kind of 
connection to be made. As soon as the first java function is called, the process is 
suspended and waits for a client debugger to attach.


The amount of work needed to create similar solutions for perl, python, tcl, etc. is 
probably limited and fairly trivial.



However, both Perl and Python offer the ability to hook into the interpreter
to detect events. For example, Python offers a C API here [1] that allows
you to supply a callback when particular events occur; this would allow us
to execute a callback after the interpreter executes each line.

And it would force you to write your own proprietary debugger backend for each language. 
That's a major thing to undertake. Have you considered the maintenance aspect of it? Not 
something that would make the author of a PL module scream with joy. It might be wise to 
also consider what debugger preferences a skilled perl/python/whatever language developer 
might have. A home brewed debugger in the form of a PostgreSQL add-on might not be a natural 
choice.




Perl seems a little more messy in that I can't find a documented C API to
hook into the interpreter, but it looks as if it may be possible to cook
something up with writing a new DB package [2] which uses XS call a C
callback. The other issue is that unlike Python, the debug capability must
be specified when creating the instance of interpreter rather than being
able to enable/disable debugging on the fly so it may mean that two
instances of the perl interpreter need to be maintained in memory - a
standard instance and a debugging instance.

Plpgsql isn't really a concern as we can simply code up whichever model we
eventually decide to use. The only bad news I can see is that it appears Tcl
may need to have the source patched in order to add debug hooks into the
interpreter [3].

You'll find more bad news as you go along. I have sincere doubts that inventing your own 
multi-language debugger is the right way to go.




Also, I'm wondering how to design the mechanism to be extendable in the
future beyond debugging server-side functions, such as hooking into more of
the executor mechanisms. For example, I could see a use case for allowing a
profiler to use the debug API to attach to the postmaster to receive
notifications whenever a new SQL query starts and ends; this would allow
someone to write a rather neat app that could rank the most popular queries
in terms of processing time really easily, but then maybe this could
considered treading on the toes of the existing stats process...


SQL debugging and hooking into the executor sounds really interesting and something that 
would really be worth the effort. I doubt there's a gain mixing that with debugging of pl's 
in general. Having multiple debugger clients, one for each language, and one for SQL, might 
be a good thing.


Regards,
Thomas Hallgren

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Inefficient bytea escaping?

2006-05-29 Thread Thomas Hallgren

Marko Kreen wrote:

On 5/28/06, Martijn van Oosterhout kleptog@svana.org wrote:

With -lpthread
lock.enabled 323s
lock.disabled 50s
lock.unlocked 36s


I forgot to test with -lpthread, my bad.  Indeed by default
something less expensive that full locking is going on.


The crux of the matter is though, if you're calling something a million
times, you're better off trying to find an alternative anyway. There is
a certain amount of overhead to calling shared libraries and no amount
of optimisation of the library is going save you that.


The crux of the matter was if its possible to use fwrite
as easy string combining mechanism and the answer is no,
because it's not lightweight enough.

IIRC the windows port make use of multi-threading to simulate signals and it's likely that 
some add-on modules will bring in libs like pthread. It would be less ideal if PostgreSQL 
was designed to take a significant performance hit when that happens. Especially if a viable 
alternative exists.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Proposal for debugging of server-side stored procedures

2006-05-29 Thread Thomas Hallgren

Tom Lane wrote:

Mark Cave-Ayland [EMAIL PROTECTED] writes:
... So basically yeah, what we need
is a debug subchannel in the FE/BE protocol.  I'd suggest inventing
a single Debug message type (sendable in both directions) with the
contents being specified by a separate protocol definition.  Or perhaps
invert that and imagine the FE/BE protocol as embedded in a debug
protocol.

I think this is a bad idea. PL/Java will use either shared memory or a socket to attach and 
as you already mentioned, when using C, a gdb will attach directly using the pid. I wouldn't 
be too surprised if Perl, Python, and PHP all have a similar solution and thus have no 
benefit from additions to the FE/BE protocol. IMO, debugging should be language specific and 
take place in a separate channel. There's no gain whatsoever mixing it with the FE/BE protocol.


Regards,
Thomas Hallgren



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Proposal for debugging of server-side stored procedures

2006-05-29 Thread Thomas Hallgren

Tom Lane wrote:

Thomas Hallgren [EMAIL PROTECTED] writes:

I think this is a bad idea. PL/Java will use either shared memory or a
socket to attach and as you already mentioned, when using C, a gdb
will attach directly using the pid. I wouldn't be too surprised if
Perl, Python, and PHP all have a similar solution and thus have no
benefit from additions to the FE/BE protocol. IMO, debugging should be
language specific and take place in a separate channel. There's no
gain whatsoever mixing it with the FE/BE protocol.


It may well be that for plperl and friends we can kick the problem off
to language-specific debuggers --- indeed, one of the first things we
need to do is look at those to see what we can avoid reinventing.
But what of plpgsql?

Ideally, all pl's should use the same protocol. It should be language agnostic and allow 
different regions of the code to origin from different languages. That way, it would be 
possible to single step a plpgsql function that in turn calls a function in pljava. 
Incidentally, the JDWP (Java Debug Wire Protocol) was designed to do just that. But I think 
it would be very complicated to cross language boundaries even if we did use that.


The JDWP and the architecture that surrounds it might be a good source for inspiration 
though. See: http://java.sun.com/j2se/1.5.0/docs/guide/jpda/architecture.html.



Also, any solution of this type probably requires that the person doing
debugging have database superuser access (in fact, be logged directly
into the server machine as the postgres user).  It'd be nice to have an
approach that could be used by non-superusers to debug their trusted-PL
functions.

Indeed. In my case, it's a matter of who starts the VM and what options that are passed to 
it (given certain options, the JVM will listen to a port or a semaphore that controls a 
region of shared memory). That in turn is controlled using GUC settings so for PL/Java I 
think it would be possible to set it up that way.


Regards,
Thomas Hallgren



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] API changes in patch release

2006-05-23 Thread Thomas Hallgren
The world is not perfect and I know that you are normally very 
restrictive in what is back-patched from head into bug-fix branches. The 
8.1.4 release however, did introduce a problem. You changed the API 
function inv_open() with the comment Revise large-object access 
routines to avoid running with CurrentMemoryContext.


This change will force me to a) introduce patch level sensitive 
conditionals in the code, and b) have two PostgreSQL 8.1.n compatible 
releases of PL/Java. One where n  4 and another where n = 4. I would 
like to avoid this in the future if possible. API's should remain stable 
during patch releases.


Having said that, I've been in the game long enough to know that Utopia 
doesn't exist. You probably had a very good reason to break the 
compatibility.


Kind Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Porting MSSQL to PGSQL (Was: [OT] MySQL is bad, but THIS bad?)

2006-05-23 Thread Thomas Hallgren

Tom Lane wrote:


I think the hard part of this task is designing the API for access to
the rowsets from triggers.

My preference would be something similar to two Portal instances (the NEW and OLD). I could 
then map it in the same way that I map the result of a query. If the API actually used two 
real Portal instances, the PL/Java implementation would take minutes.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [OT] MySQL is bad, but THIS bad?

2006-05-19 Thread Thomas Hallgren

Christopher Kings-Lynne wrote:
If you want to get users to swtich to your software from your 
competitors, you have to eliminate barriers, and a big one for any 
database is getting locked into a specific one.  People aren't going 
to take the time to try switching to postgresql if they can't easily 
make it back to thier former database. It's one of the reasons why 
PostgreSQL's standards compliance is so important; if you want to 
swtich to a new database, your best bet is to give PostgreSQL a shot, 
because even if you don't like it, we're not going to try and trap you 
into our software with bunches of non-standard knobs. Low barrier to 
exit == low barrier to entry. 


Another reason why a tool to export from pgsql to mysql is just as 
important as the vice versa...


If that's really true, then let's create a bidirectional compatibility layer as a joint 
venture with people from the MySQL camp. Should be a win-win situation. I somehow doubt that 
is the case. Important yes. But just as important? No way.


We would loose big time on the export side since the vendor lock-in aspect is seriously out 
balanced by current levels of standards compliance. On the other hand, we'd win by order of 
magnitude on the import side. I bet the MySQL people would be utterly uninterested in such a 
venture. I think that if anything should be done, we should concentrate on import and let 
the MySQL people worry about going the other way. Once it becomes just as imporant, they will.


Regards,
Thomas Hallgren



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] Question about casts

2006-05-18 Thread Thomas Hallgren
Just out of curiosity (and most likely, ignorance). Why can't I cast an 
array of strings into a string? I.e.


 thhal=# select ('{a,b}'::varchar[])::varchar;
 ERROR:  cannot cast type character varying[] to character varying

or a cstring into a varchar, i.e.

 thhal=# select array_out('{a,b}'::varchar[])::varchar;
 ERROR:  cannot cast type cstring to character varying

ISTM, the implementation of such casts should be fairly simple and 
straight forward and sometimes even useful. Every data type comes with 
string coercion routines anyway right?


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Question about casts

2006-05-18 Thread Thomas Hallgren

Martijn van Oosterhout wrote:

On Thu, May 18, 2006 at 05:41:14PM +0200, Thomas Hallgren wrote:
  
Just out of curiosity (and most likely, ignorance). Why can't I cast an 
array of strings into a string? I.e.


 thhal=# select ('{a,b}'::varchar[])::varchar;
 ERROR:  cannot cast type character varying[] to character varying



Why would you need to? What would you expect to happen? Joined with a
seperator, no seperator, with parenthesis?

  

Well, let's assume I use JDBC. I write code like:

 ResultSet rs = stmt.executeQuery(SELECT arrValue ...);
 while(rs.next())
 String v = rs.getString(1);

The tuples received by the result set contains String[]. If I let 
PL/Java convert it (I don't currently), it will be according to Java 
semantics. I'd like to convert it using PostgreSQL semantics instead. So 
I change my statement to:


 SELECT array_out(arrValue) ...

that works of course. What baffles me is that I cannot write

 SELECT arrValue::varchar



What's the benefit of a cast over a function call?

  
None whatsoever. But PostgreSQL enables a lot of casts for some reason 
or another right? Why not this one?



Every cast costs space and lookup time. Any user can add their own
casts if they want, but the system generally only includes the ones
useful to many people or those required for standards complience.

  
OK. I can live with that. I would have thought that casting into the 
string types was something that could be hardwired since the backing 
functions are mandatory.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [OT] MySQL is bad, but THIS bad?

2006-05-18 Thread Thomas Hallgren

John DeSoi wrote:


Right, you'll definitely need to hack the C source code to force 
PostgreSQL to accept invalid dates ;)


http://sql-info.de/mysql/gotchas.html#1_14

Couldn't we just install something that replaced invalid dates with a randomly generated but 
otherwise correct dates? That way they would become completely invisible. No one could even 
tell that the date was invalid to start with.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] [OT] MySQL is bad, but THIS bad?

2006-05-18 Thread Thomas Hallgren

Lukas Smith wrote:

.. but spouting outdated FUD is really making you two look foolish.

Wow. On a scale from 1 to 10 measuring seriousness, I'd put my posting on -4. I'd advice you 
to take a step back and get some distance if you consider it outdated FUD.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] [OT] MySQL is bad, but THIS bad?

2006-05-18 Thread Thomas Hallgren

Chris Browne wrote:

[EMAIL PROTECTED] (Marc G. Fournier) writes:

To give someone a running chance at migrating it to PostgreSQL, a
'MySQL compatibility module' would allow them to just plug the
existing DB in, and then work at improving sections of the code over
time ...

Hell, if done well, the module should be able to dump appropriately
'clean' PgSQL schemas ... as in your example elow about the domains ...


You can't have that because you essentially need to throw out four
aspects of fairly vital data validation functionality:

1.  Dates cease to be validatable.

2.  NULL and 0 and '' are all roughly equivalent, even though they
aren't.

3.  Foreign key constraints have to be ignored.

4.  You have to fabricate a locale offering a case-insensitive sort
order.

I suppose #4 isn't vital data validation...

But after you gut the PostgreSQL-based system of those four aspects
of data integrity, I'm not sure there's any remaining benefit to
having PostgreSQL in play...


Assuming the objective with a transition would be to improve on things, an alternative 
approach could be to offer a three step migration path:


1. A dump/restore utility that dumps a MySQL database and restores it into a PostgreSQL 
database. This utility must have plugin capabilities where logic can be added that deals 
with cases #1, #2, and #3 above. It might be as simple as just logging incorrect records to 
a file and skip them. A pre-defined set of generic plugins could be supplied that did just 
that. A user would have the chance to replace them with customized plugins to cover for 
special cases in his own app. Perl or PHP would probably be good candidates for plugin language.


2. Provide an add-on to the PostgreSQL parser that would make it accept MySQL syntax. The 
database would still run untainted underneath so from this point on, no more invalid dates 
or foreign keys can be entered. Some other add-ons are needed as well to cater for some sane 
but non-standard MySQL behavior that PostgreSQL is lacking.


3. A good user guide that helps the user to, over time, move away from the non standard 
MySQL specific expressions. The objective being to at some point skip the MySQL syntax layer 
altogether.


Regards,
Thomas Hallgren

(dead serious this time)


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] audit table containing Select statements submitted

2006-05-16 Thread Thomas Hallgren
Some users of PL/Java make use of a non-default connection from within a Trigger in order to 
do this. In essence, they load the client JDBC package into the backend to let the backend 
as such become a client. The second connection is then maintained for the lifetime of the 
first. Perhaps not the most efficient way of doing it but it works.


Regards,
Thomas Hallgren

Gurjeet Singh wrote:
   Do we have any plans of introducing 'AUTONOMOUS TRANSACTION'  like 
feature?


   Again, it might not be a part of the standard but it is very
helpful in situations like these!!! You can run a trigger with an
autonomous transaction attached to it, which guarantees that the work
done by trigger persists even though the calling transaction rolls
back (potentially a hacker trying to cover his tracks)!!!

(http://asktom.oracle.com/~tkyte/autonomous/index.html)

Gurjeet.

On 5/16/06, Josh Berkus josh@agliodbs.com wrote:

Doug,

 But what if the user calls the access function, sees the data, then
 issues a ROLLBACK?  The audit record would be rolled back as well (as
 Tom pointed out earlier).


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] psql feature thought

2006-05-15 Thread Thomas Hallgren

Tom Lane wrote:


Quite aside from the compatibility and how-useful-is-it-really
arguments, I think this'd be a bad idea in the abstract.  SQL is not one
of those languages that assigns semantic significance to the shape of
whitespace [1].  We should NOT introduce any such concept into psql,
because it'd fundamentally break the lexical structure of the language.


+1

Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[HACKERS] Creating a SHELL type [Was: User Defined Types in Java]

2006-05-14 Thread Thomas Hallgren
I just discovered that it is now possible to create a SHELL type using 
the 'CREATE TYPE type name' syntax using cvs head. Will this come in 
8.2 only or will it also be included in 8.1.4?


Kind Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Inheritance, Primary Keys and Foreign Keys

2006-05-14 Thread Thomas Hallgren

Albert Cervera Areny wrote:
 ...What do you mean with triggers that maintain cascade behavior?


It ties on to how references are handled. Since they currently ignore the inheritance 
aspect, you need triggers that enforce 'on cascade delete/update'. They will become obsolete 
if that changes (i.e. SELECT FROM instead of SELECT FROM ONLY).


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Inheritance, Primary Keys and Foreign Keys

2006-05-13 Thread Thomas Hallgren

Albert Cervera Areny wrote:

Of course, that's an option for my case. Just wanted to know if this solution
could be useful for PostgreSQL in general. Mainly because I'll add some
triggers to check what maybe PostgreSQL should do itself but it's
unimplemented.

If that's not interesting or a proper solution for PostgreSQL I'll add it
using the existing DDL in my application and that's all.

What do you think?

I think that if you want the database to improve its current inheritance behavior, then this 
trigger set is too limited. You need triggers that maintain both unique and primary keys and 
triggers that maintain cascade behavior.


In order to make it really good, you would also need to add some functionality to the 
mechanisms that maintain references. Today, they don't recognize inheritance at all.


Personally, I use Hibernate. It tries to compensate for the lack of these features but since 
it is a middle-tier (or client) solution, it's not ideal. Another client can still violate 
the rules and to maintain integrity in the client is negative from a performance standpoint. 
I think it would be great if PostgreSQL could provide a more complete set of features that 
would enable inheritance. A good start would be to extend it with the functionality needed 
to maintain references, cascade actions, and enforce unique constraints.


On the other hand, inheritance is a tricky business and a good OO-RDB mapper will give you 
several choices of how it should be mapped. There's no one size fits all. The best 
solution is probably if someone (you perhaps?) writes an external OO-RDB mapper module that 
executes in the backend. The author of such a tool would of course need some new nifty 
backend API's in order to do whats needed with references etc.


I actually wrote something similar using Oracle a couple of years ago. It was based on type 
inheritance and views rather then tables and used 'instead of' actions on all views (Oracles 
own mechanisms where far to limited). In some respect, I think that is a better solution. 
Inheritance and all that comes with it is more a 'type' thing then a 'table' thing in my 
world. A view is then used to _map_ the types to persistent storage, i.e. the 'tables'.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] any, anyelement, and anyarray

2006-05-12 Thread Thomas Hallgren

I have some questions regarding any, anyelement, and anyarray.

- Why does PostgreSQL declare three different generic types? Isn't one 
enough? ISTM it would be far simpler to use constructs like 'any' and 
'any[]' but neither of them are permitted.


- Why isn't the 'anyarray' declared as an array using the elemenent type 
'anyelement' in pg_type?


- Why can't I write 'anyelement[]'. Shouldn't that be the same thing as 
'anyarray'?


- The 'any' is listed as a type but I get a syntax error whenever I try 
to use it. If I could use it, what would be the difference between 'any' 
and 'anyelement'? The only thing I can think of is if 'anyelement' was 
restricted to non-arrays, but apparently it isn't.  I've tried and 
there's nothing stopping me from passing an 'int[]' to a function that 
takes an 'anyelement'.


Regards,
Thomas Hallgren




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] any, anyelement, and anyarray

2006-05-12 Thread Thomas Hallgren

Tom Lane wrote:

Thomas Hallgren [EMAIL PROTECTED] writes:
  
- Why does PostgreSQL declare three different generic types? Isn't one 
enough? ISTM it would be far simpler to use constructs like 'any' and 
'any[]' but neither of them are permitted.



any isn't the same as anyelement, because it doesn't have the
property of constraining different argument positions to be the same
type.  For instance, compare(any,any) and compare(anyelement,anyelement)
would accept different sets of input types.

  
I've read that the anyelement will constrain all parameters and the 
return type to be of the same type, or an array of that type. I 
understand that concept since it will give the executor an ability to 
infer the return type by looking at the parameters. If any doesn't do 
that then I understand the difference.


I've been trying to use any with no luck. How do I declare a function 
that takes an any as a parameter? I know how it would be implemented 
internally but the SQL for it eludes me.


I realize that it would be trickier to return an any since the 
expected return type must somehow be derived from the context where the 
function was called. But similar things are done with record already, 
right?


- Why can't I write 'anyelement[]'. Shouldn't that be the same thing as 
'anyarray'?



No, you're confusing these with actual datatypes.  They are pseudotypes,
which means they're only allowed as function argument/result type
placeholders.

  
I understand that. But I'd consider it an implementation detail and I 
think the average SQL user finds it a bit confusing. Wouldn't it be a 
good idea to let the SQL parser recognize the anyelement[] construct as 
a synonym for anyarray? I.e. allow me to write:


CREATE FUNCTION makeSingleElementArray(anyelement) RETURNS anyelement[] 
AS ...



Regards,
Thomas Hallgren

PS. I'm happy to announce that PL/Java now handles records, domains, 
user defined types, anyelement, and anyarray without problems.




---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[HACKERS] Number of dimensions of an array parameter

2006-05-08 Thread Thomas Hallgren

I can create a function that takes a two dimension int array:

 CREATE FUNCTION twodims(int[][]) RETURNS void AS ...

but there's nothing stopping me from calling this function with an 
arbitrary number of dimensions on the array.


I'd like to map a parameter like the one above to a corresponding 
representation in Java (it would be int[][] there too). As it turns out, 
I can't do that. PostgreSQL will not store any information that can tell 
me how many dimensions that where used in the declaration, i.e. it's 
impossible to write a language VALIDATOR that, based on the information 
in pg_proc, builds a signature where the number of dimensions is reflected.


This leaves me with two choices:

Only allow arrays with one dimension unless the parameter is of a domain 
type (domains are apparently stored with the actual number of 
dimensions). Any call that uses an array parameter with more then one 
dimension will yield an exception.

  --OR--
Always map to Object[] instead of mapping to the correct type, . This 
will work since an array in Java is also an Object and all primitive 
types can be represented as objects (i.e. int can be a 
java.lang.Integer). The strong typing and the ability to use primitives 
are lost however.


I'm leaning towards #1 and hoping that PostgreSQL will enhance the 
parameter type declarations to include the dimensions in future releases.


... After some more testing ...

Unfortunately, I run into problems even when I use domains. Consider the 
following:


thhal=# CREATE DOMAIN twodims as int[][];
CREATE DOMAIN
thhal=# SELECT typndims FROM pg_type WHERE typname = 'twodims';
typndims
--
   2
(1 row)

thhal=# SELECT array_dims('{{{1,2,3},{3,4,3}},{{5,3,2},{9,9,9}}}'::twodims);
  array_dims   
-

[1:2][1:2][1:3]
(1 row)

IMO, there is something seriously wrong here. Clearly the number of 
dimensions is a property of the type. Any array with a different number 
of dimensions should yield an error or at least be coerced into the 
right number of dimensions.


Kind Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Number of dimensions of an array parameter

2006-05-08 Thread Thomas Hallgren

Stefan Kaltenbrunner wrote:

while it would be nice to improve that - it is actually documented quite
clearly.

http://www.postgresql.org/docs/current/static/arrays.html has:

However, the current implementation does not enforce the array size
limits — the behavior is the same as for arrays of unspecified length.

Actually, the current implementation does not enforce the declared
number of dimensions either. Arrays of a particular element type are all
considered to be of the same type, regardless of size or number of
dimensions. So, declaring number of dimensions or sizes in CREATE TABLE
is simply documentation, it does not affect run-time behavior. 

  
A documented flaw is much better than an undocumented one but it's still 
a flaw, and a pretty bad one at that. It's like having a compiler that 
doesn't complain when you define a C-function that takes an int** and 
then pass an int*.


Would it be hard to enforce a real check? The implementation could use 
GUC settings like 'enforce_array_dimensions' and 'enforce_array_lengths' 
that could be set to false for the legacy implementations that rely on 
the current behavior. I know Tom added the ability to have NULL values 
in the arrays. Perhaps now is the time to improve the type semantics as 
well?


Regards,
Thomas Hallgren



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Number of dimensions of an array parameter

2006-05-08 Thread Thomas Hallgren

Tom Lane wrote:

Thomas Hallgren [EMAIL PROTECTED] writes:
  
Would it be hard to enforce a real check? The implementation could use 
GUC settings like 'enforce_array_dimensions' and 'enforce_array_lengths' 
that could be set to false for the legacy implementations that rely on 
the current behavior.



The fact that it doesn't exactly match Java semantics does not make it
legacy behavior.  I don't agree that it's a bug; I think it's a
feature, precisely because many functions can work on arrays of
different dimensions.  Why should we change to make PL/Java happier,
when it will move us further away from the semantics of, say, PL/R?

  
Would it really? The way I see it, the choice of language is irrelevant. 
Either you support dimensions or you don't. The way PostgreSQL does it, 
you get the impression that it is supported while in fact it's not. I 
can't see how anyone would consider that a feature. If you want the 
ability to use an arbitrary number of dimensions, then you should have a 
syntax that supports that particular use-case. An int[][] cannot be 
anything but a two dimensional int array. Not in my book anyway. That 
opinion has nothing to do with Java.



I think reasonable choices for PL/Java would be to reject
multidimensional array arguments, or to silently ignore the
dimensionality and treat the data as 1-D in storage order
(as I think plperl for instance already does).

  

I agree. That's the way I'll do it.

Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Number of dimensions of an array parameter

2006-05-08 Thread Thomas Hallgren

Martijn van Oosterhout wrote:

On Mon, May 08, 2006 at 07:31:14PM +0200, Thomas Hallgren wrote:
  
Would it be hard to enforce a real check? The implementation could use 
GUC settings like 'enforce_array_dimensions' and 'enforce_array_lengths' 
that could be set to false for the legacy implementations that rely on 
the current behavior. I know Tom added the ability to have NULL values 
in the arrays. Perhaps now is the time to improve the type semantics as 
well?



The big probem is where do you store the number of declared dimensions?
It's not stored anywhere, so there's nowhere to check against either.
If we can fix that first we might get to the checking part.

test=# create function foo(int[][]) returns int4 as 'select 1' language sql;
CREATE FUNCTION
test=# \df foo
   List of functions
 Result data type | Schema | Name | Argument data types 
--++--+-

 integer  | public | foo  | integer[]
(1 row)

  
Let each type have it's own entry in pg_type. I.e. let the int[] and 
int[][] be two distinct types (like int and int[] already are). In 
addition, perhaps introduce a new syntax that denotes 'arbitrary number 
of dimensions' and let that too be a distinct type.


Regards,
Thomas Hallgren




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Fwd: [pgsql-hackers-win32] Build with Visual Studio MSVC

2006-05-07 Thread Thomas Hallgren

Gurjeet Singh wrote:

   Thomas, I love the idea of eclipse; any platform, any language,
one IDE. I am downloading it right now. Can you please send in the
steps that you perform to setup the environment, including mingw
toolkit; I will try to grow on that.


My setup is pretty basic. I can't give you an exact step by step instruction since it's been 
a while since I last did it. I installed msys, mingw, Eclipse. and then, using the Eclipse 
update manager and the 'Callisto' site, I installed the C/C++ plugin.


I also made sure that the PATH in effect for the Eclipse IDE contains entries for the 
%MSYS_HOME%\bin and %MINGW_HOME%\bin. That's it basically.




The idea of this effort is to
have a GUI IDE, with a slew of features that MSVC offers: Memory
window that allows you to edit memory inplace, call-stack window,
watches, quick-expression evaluater with class/struct support, etc.
etc. . If Eclipse can offer all these, then I dont think anyone would
mind using it insead of MSVC.

Eclipse won't offer all of these. Not yet anyway. What you get is a fair C/C++ editor and 
parsers for your make output that will annotate your files with errors and warnings. There's 
said to be some debugging support too on top of gdb, but to be honest, I've never tried it 
on Windows. I do my C-debugging using gdb on Linux. My attempts to use gdb on Windows have 
been quite futile so far. Then again, I'm not using the latest MinGW version so perhaps 
there's still hope.


All in all, Eclipse C/C++ has some way to go before it can match up with MSVC. My point was 
that you can do Windows development without MSVC and you can do it fairly well. If you are a 
Linux hacker, you might even prefer doing it that way. So as a platform, Windows is not by 
any means left alone.


I really think that what you and others are trying to accomplish is very valuable. If not 
for me (since I'm mixing Java and C and work on multiple platforms) then certainly for many 
others. Personally, I'd rather see a Visual Studio port than one for VC++6.0. I wish you the 
best of luck.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] pseudo-type record arguments for PL-functions

2006-05-06 Thread Thomas Hallgren

David Fetter wrote:

On Thu, May 04, 2006 at 09:02:02PM +0200, Thomas Hallgren wrote:
  

Tom Lane wrote:


Why can PLs not handle pseudo-types?


No one's done the work to figure out which ones are sensible to
support and then add the logic needed to support them.

  

PL/Java will handle the RECORD type correctly. I'm just finalizing a
new, more flexible, type mapping implementation for PL/Java and it
would be easy to add support for more pseudo types too. But what
others would make sense?



Ideally, some way to get all kinds of user-defined types.  DOMAINs,
too. :)

  
OK, got them covered as well. Only thing that remain now is arrays. I 
have a hard time figuring out how to manage them. I'm looking at the 
arrayutils.c. The thing that makes me a bit confused is the 
ArrayMetaState. The functions obtain it using:


   my_extra = (ArrayMetaState *) fcinfo-flinfo-fn_extra;

which is fine if there's only one array parameter. What happens if I 
have two? And how do I declare a function that takes, say, an array of 
int's as a parameter (in SQL that is)?


I find very little information about how to write functions that deals 
with arrays. My only source of information right now is the 
arrayutils.c. Other pointers to docs and code are greatly appreciated.


Kind Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] pseudo-type record arguments for PL-functions

2006-05-06 Thread Thomas Hallgren

Tom Lane wrote:

Make a struct that can hold two ArrayMetaStates.  Or whatever else you
need.  What a C function keeps in fn_extra is its own affair.

  
Yes, of course. I see that now. I was unaware that a function had an 
associated user data. What's the semantics associated with the 
fn_extra? Does it retain its setting throughout a session (i.e. the 
lifetime of the backend process)?  PL/Java associates a structure with a 
function using a hash map lookup on the function Oid. Seems I could use 
the fn_extra and remove that map altogether.


Then again, there are times when I need to invalidate the associated 
structure of all java functions due to reload of jar files. Is there any 
way that I can list all functions for a specific language and get hold 
of their current setting of the fn_extra?


Regards,
Thomas Hallgren



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] pseudo-type record arguments for PL-functions

2006-05-06 Thread Thomas Hallgren

Martijn van Oosterhout wrote:

Looking at contrib/intarray/_int_op.c might help. It does something
like this:

ArrayType  *a = (ArrayType *) 
DatumGetPointer(PG_DETOAST_DATUM_COPY(PG_GETARG_DATUM(0)));

The file src/include/utils/array.h also seems to have many useful
functions.

Hope this helps,
  
Yes, the intarray stuff was very helpful but also somewhat confusing. 
Why are there two ways of representing some of the array types? I mean, 
why is there an _int4 when you could just as well write int4[]? I'm 
probably missing the point altogether.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [pgsql-hackers-win32] Build with Visual Studio MSVC

2006-05-05 Thread Thomas Hallgren

Gurjeet Singh wrote:

   My main grudge is that if we are supporting almost all flovours of
nixens and compilers (close to 34 according to official website), then
why are we leaving Windows platform alone? This will bring in quite a
lot more developers.

You should look at MinGW as a development toolkit, not a platform. PostgreSQL builds and 
runs just fine on the Windows platform. Personally, I use Eclipse C/C++ with MinGW since it 
brings me a number of advantages. The most prominent one is that I only need to master one 
IDE regardless of platform.




   I am sure it's not going to be easy, but I am sure with this great
community suppport, we sure can achieve it.

Seems some people has done a lot of work to get things working with VC++ already. Search for 
the word MSVC on this list.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] pseudo-type record arguments for PL-functions

2006-05-04 Thread Thomas Hallgren

Tom Lane wrote:

Why can PLs not handle pseudo-types?


No one's done the work to figure out which ones are sensible to support
and then add the logic needed to support them.

PL/Java will handle the RECORD type correctly. I'm just finalizing a new, more flexible, 
type mapping implementation for PL/Java and it would be easy to add support for more pseudo 
types too. But what others would make sense?


Kind Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Is a SERIAL column a black box, or not?

2006-04-30 Thread Thomas Hallgren

Rod Taylor wrote:


If SERIAL is going to be kept long term, then it should be the macro
version so it doesn't appear too duplicated.

I concur with this. But to really break out from the current middle ground, you must 
implement the IDENTITY and also document the SERIAL macro as deprecated.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Finding the correct type

2006-04-30 Thread Thomas Hallgren
I'm writing a UDT that takes a varchar argument that represents the name 
of a type. The caller may optionally qualify with a namespace, i.e. 
pg_catalog.varchar, or public.address. Is there a c-function 
somewhere that will return the pg_type that corresponds to the name 
(with respect to the current setting of search_path in case the name is 
unqualified)?


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Finding the correct type

2006-04-30 Thread Thomas Hallgren

Martijn van Oosterhout wrote:

On Sun, Apr 30, 2006 at 12:50:23PM +0200, Thomas Hallgren wrote:
I'm writing a UDT that takes a varchar argument that represents the name 
of a type. The caller may optionally qualify with a namespace, i.e. 
pg_catalog.varchar, or public.address. Is there a c-function 
somewhere that will return the pg_type that corresponds to the name 
(with respect to the current setting of search_path in case the name is 
unqualified)?


If you want it as a C string, something like format_type_be() would
suffice. Not it's designed for use in error messages so it makes no
particular to clean up after itself.

Another possibility is the output function for the regtype type, ie
regtypeout.

Hope this helps,
Well, regtypeout led me to regtypein which in turn led me to parseTypeString which seems to 
be exactly what I want.


Thanks,
Thomas Hallgren


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] stdout when running as a Windows service

2006-04-29 Thread Thomas Hallgren
I'm having problems with crashes whenever I attempt to write something 
on stdout or stderr from a UDT whenever the postmaster is running as a 
windows service. Not sure what I'm doing wrong. Is this a known problem?


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Get explain output of postgresql in Tables

2006-04-12 Thread Thomas Hallgren

Richard Huxton wrote:

Tom Lane wrote:


I dislike the thought of encouraging people to post stuff in a
not-easily-readable format.  They won't do it anyway, if it's not
default; look how we still can't get people to send EXPLAIN ANALYZE
output the first time.


It certainly needs to be one format for both purposes.


One idea that comes to mind is to work up some trivial little script
that undoes the more common forms of cut-and-paste damage.

I wonder if it would help much just to change EXPLAIN to indent with
something other than spaces?  Maybe instead of

Nested Loop  (cost=1.06..40.43 rows=5 width=244)
  Join Filter: (public.tenk1.unique2 = int4_tbl.f1)
  -  HashAggregate  (cost=1.06..1.11 rows=5 width=4)

print

Nested Loop  (cost=1.06..40.43 rows=5 width=244)
--Join Filter: (public.tenk1.unique2 = int4_tbl.f1)
 HashAggregate  (cost=1.06..1.11 rows=5 width=4)

Not sure what would look nice, but this would at least remove the hazard
from stuff that thinks whitespace isn't significant.


That's the sort of thing I was thinking of, or even something like:
1 Nested Loop ...
1.1 Join Filter...
1.1.1 HashAggregate...
1.2 etc


Why not go all the way. Here's the above using Satoshi's suggestion:

NestedLoop cost=1.06..40.43 rows=5 width=244
   JoinFilter publicTenk1Unique2=int4_tbl.f1
  HashAggregate cost=1.06..1.11 rows=5 width=4/
   /JoinFilter
/NestedLoop

Easy to copy/paste and whitespace doesn't matter. Easy to read (well, to some at least) and 
can be even easier if you have access to an XML viewer.


Regards,
Thomas Hallgren

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] plpgsql by default

2006-04-11 Thread Thomas Hallgren

Tom Lane wrote:

In the end it's only one small component of security, but any security
expert will tell you that you take all the layers of security that you
can get.  If you don't need a given bit of functionality, it shouldn't
get installed.

I think any security expert would say that if let non trustworthy people get so far as to 
create their own SQL statements, you're in big trouble. Plpgsql or not. I fail to see what 
the real issue is here. Your argument is analog to saying don't install bash on a Linux 
system by default. People might do bad things with it.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] pg_class catalog question...

2006-04-02 Thread Thomas Hallgren

Jim C. Nasby wrote:

On Sat, Apr 01, 2006 at 05:42:34PM +0200, Thomas Hallgren wrote:
  
Why not simply a fixed number of bytes, i.e. byte(16) or octet(16)? 
Hexadecimal is just a convenient human-readable representation.



Well, hex is much easier to deal with in many regards than raw bytes,
though. But yes, the idea is that you'd just store raw bytes on disk.
byte or octet would work fine if they existed.
  
IIRC, Oracle actually uses the term RAW. It makes sense I think. No 
conversion applied, no nothing. Just simple raw data.


- thomas


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


[HACKERS] Remote PL/Java, Summary

2006-04-01 Thread Thomas Hallgren

Hi all,
And thanks for very good input regarding a remote alternative to PL/Java 
(thread titled Shared Memory). I'm convinced that such an alternative 
would be a great addition to PL/Java and increase the number of users. 
The work to create such a platform that has the stability and quality of 
todays PL/Java is significant (I really do think it is a 
production-grade product today). So significant in fact, that I'm 
beginning to think of a third alternative. An alternative that would 
combine the performance of using in-process calls with the benefits of 
sharing a JVM. The answer is of course to make the backend multi-threaded.


This question has been debated before and always promptly rejected. One 
major reason is of course that it will not bring any benefits over the 
current multi-process approach on a majority of the platforms where 
PostgreSQL is used. A process-switch is just as fast as a thread-switch 
on Linux based systems. Over the last year however, something has happen 
that certainly speaks in the favor of multi-threading. PostgreSQL is 
getting widely adopted on Windows. On Windows, a process-switch is at 
least 5 times more expensive then a thread-switch. In order to 
appropriate locking, PostgreSQL is forced to do a fair amount of 
switching during transaction processing so the gain in using a 
multi-threaded approach on Windows is probably significant. The same is 
true for other OS'es where process-switching is relatively expensive.


There are other benefits as well. PostgreSQL would no longer need shared 
memory and semaphores and lot more resources could be shared between 
backend processes. The one major drawback of a multi-threaded approach 
(the one that's been the main argument for the defenders of the current 
approach) is vulnerability. If one thread is messing things up, then the 
whole system will be brought to a halt (on the other hand, that can be 
said about the current shared-memory approach as well). The cure for 
this is to have a system that, to the extent possible, prevents this 
from happening. How would that be possible? Well, such systems are 
widely used today. Huge companies use them in mission critical 
applications all over the world. They are called Virtual Machines. Two 
types in particular are gaining more an more ground. The .NET based CLR 
and the Java VM.


Although there's an Open Source initiative called Mono that implements 
the CLR, I still don't see it as a viable alternative to create a 
production-grade multi-platform database. Microsofts CLR is of course 
confined to Microsoft platforms. The Java VM's are however a different 
matter altogether. And with the java.nio.channels package that was 
introduced in Java 1.4 and the java.util.concurrent package from Java 
5.0, Java has taken a major steps forward in being a very feasible 
platform for a database implementation. There's actually nothing 
stopping you from doing a high-performance MVCC system using Java today. 
A SQL parser would be based on JavaCC technology (the grammar is already 
written although it needs small adjustments to comply with the 
PostgreSQL dialect). Lots of technology is there out-of-the-box such as 
regular expressions, hash-maps, linked lists, etc. Not to forget an 
exceptionally great threading system, now providing atomic operations, 
semaphores, copy-on-write arrays etc. In short, everything that a 
database implementor could ever wish for.


The third alternative for PL/Java, an approach that gets more viable 
every minute I think about it, is to implement the PostgreSQL backend 
completely in Java. I'm involved in the development of one of the 
commercial JVM's. I know that an enormous amount of resources are 
constantly devoted to performance optimizations. The days when a complex 
system written in C or C++ could outperform a JVM have passed. A static 
optimizer can only do so well. A JVM, that collects heuristics, 
communicates with the CPU about cache usage etc., can be a great deal 
smarter on how the final machine code will be optimized, and 
re-optimized should the conditions change. It would be great if 
PostgreSQL could benefit from all this research.


If a commercial JVM is perceived as a problem, then combine^h^h^hpile 
the code with GNU gcj instead of gcc like today.


The list of advantages can be made a mile long. There's no point in 
listing everything here. From my own standpoint, I'm of course thinking 
first and foremost about the advantages with PL/Java. It will become the 
absolute most efficient PL of them all. Other languages, for which no 
good Java implementation exists (I'm thinking Jython for Python, etc.), 
can be implemented using JNI. The most common functions used by say, 
PL/Perl could probably be implemented as callbacks into the Java domain 
in order to make the changes in the respective PL minimal.


Opinions? Suggestions?

Kind Regards,
Thomas Hallgren



---(end of broadcast

Re: [HACKERS] pg_class catalog question...

2006-04-01 Thread Thomas Hallgren

Jim C. Nasby wrote:

On Fri, Mar 31, 2006 at 11:29:15AM -0500, Tom Lane wrote:

This argument falls flat when you consider that the width of a CHAR
entry is measured in characters, not bytes, and therefore its physical
size is not fixed even if its logical width is.


True, but in every case I've used char it was to store something that
would never be multi-byte, like a GUID, or a SHA1. Though I guess in
retrospect, what would really be handy is 'hex' datatype, that stores a
hex string (possibly with a custom format, such as a GUID) in it's
native binary format.


Why not simply a fixed number of bytes, i.e. byte(16) or octet(16)? Hexadecimal is just a 
convenient human-readable representation.


Regards,
Thomas Hallgren

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Remote PL/Java, Summary

2006-04-01 Thread Thomas Hallgren

Andrew Dunstan wrote:


and a happy April 1 to you too, btw.


;-)

- thomas


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Shared memory

2006-03-28 Thread Thomas Hallgren

Hi Simon,
Thanks for your input. All good points. I actually did some work using Java stored 
procedures on DB2 a while back but I had managed to forget (or repress :-) ) all about the 
FENCED/NOT FENCED stuff. The current discussion definitely puts it in a different 
perspective. I think PL/Java has a pretty good 'NOT FENCED' implementation, as does many 
other PL's, but no PL has yet come up with a FENCED solution.


This FENCED/NOT FENCED terminology would be a good way to differentiate between the two 
approaches. Any chance of that syntax making it into the PostgreSQL grammar, should the need 
arise?


Some more comments inline:

Simon Riggs wrote:

Just some thoughts from afar: DB2 supports in-process and out-of-process
external function calls (UDFs) that it refers to as UNFENCED and FENCED
procedures. For Java only, IBM have moved to supporting *only* FENCED
procedures for Java functions, i.e. having a single JVM for all
connections.


Are you sure about this? As I recall it a FENCED stored procedure executed in a remote JVM 
of it's own. A parameter could be used that either caused a new JVM to be instantiated for 
each stored procedure call or to be kept for the duration of the session. The former would 
yield really horrible performance but keep memory utilization at a minimum. The latter would 
get a more acceptable performance but waste more memory (in par with PL/Java today).




Each connection's Java function runs as a thread on a
single dedicated JVM-only process. 

If that was true, then different threads could share dirty session data. I wanted to do that 
using DB2 but found it impossible. That was a while back though.



That approach definitely does increase the invocation time, but it
significantly reduces the resources associated with the JVM, as well as
allowing memory management to be more controllable (bliss...). So the
overall picture could be more CPU and memory resources for each
connection in the connection pool.

My very crude measurements indicate that the overhead of using a separate JVM is between 
6-15MB of real memory per connection. Today, you get about 10MB/$ and servers configured 
with 4GB RAM or more are not uncommon.


I'm not saying that the overhead doesn't matter. Of course it does. But the time when you 
needed to be extremely conservative with memory usage has passed. It might be far less 
expensive to buy some extra memory then to invest in SMP architectures to minimize IPC overhead.


My point is, even fairly large app-servers (using connection pools with up to 200 
simultaneous connections) can run using relatively inexpensive boxes such as an AMD64 based 
server with 4GB RAM and show very good throughput with the current implementation.




If you have a few small Java functions centralisation would not be good,
but if you have a whole application architecture with many connections
executing reasonable chunks of code then this can be a win.

One thing to remembered is that a 'chunk of code' that executes in a remote JVM and uses 
JDBC will be hit by the IPC overhead on each interaction over the JDBC connection. I.e. the 
overhead is not just limited to the actual call of the UDF, it's also imposed on all 
database accesses that the UDF makes in turn.




In that environment we used Java for major database functions, with SQL
functions for small extensions.


My guess is that those major database functions did a fair amount of JDBC. Am I 
right?



Also the Java invocation time we should be celebrating is that by having
Java in the database the Java-DB time is much less than it would be if
we had a Java stack sitting on another server.



I think the cases when you have a Tomcat or JBoss sitting on the same physical server as the 
actual database are very common. One major reason being that you don't want network overhead 
between the middle tier and the backend. Moving logic into the database instead of keeping 
it in the middle tier is often done to get rid of the last hurdle, the overhead of IPC.



Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Shared memory

2006-03-28 Thread Thomas Hallgren

Tom Lane wrote:

Thomas Hallgren [EMAIL PROTECTED] writes:
  

This FENCED/NOT FENCED terminology would be a good way to
differentiate between the two approaches. Any chance of that syntax
making it into the PostgreSQL grammar, should the need arise?



Of what value would it be to have it in the grammar?  The behavior would
be entirely internal to any particular PL in any case.

  
Not necessarily but perhaps the term FENCED is incorrect for the concept 
that I have in mind.


All languages that are implemented using a VM could benefit from the 
same remote UDF protocol. Java, C#, perhaps even Perl or Ruby. The flag 
that I'd like to have would control 'in-process' versus 'remote'.


I'm not too keen on the term FENCED, since it, in the PL/Java case will 
lead to poorer isolation. Multiple threads running in the same JVM will 
be able to share data and a JVM crash will affect all connected sessions.


Then again, perhaps it's a bad idea to have this in the function 
declaration in the first place. A custom GUC parameter might be a better 
choice. It will not be possible to have some functions use the 
in-process approach and others to execute remotely but I doubt that will 
matter that much.


I'm still eager to hear what it is in the current PL/Java that you 
consider fundamental unresolvable problems.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Shared memory

2006-03-28 Thread Thomas Hallgren

Dave Cramer wrote:


What exactly is a FENCED solution ? If it is simply a remote 
connection to a single JVM then pl-j already does that.
Last time I tried to use pl-j (in order to build a mutual test 
platform), I didn't manage to make it compile due to missing artifacts 
and it wasn't ported to Windows. Lazslo filed a JIRA bug on that but 
since then (August last year) I've seen no activity in the project. Is 
it still alive? Is anyone using it?


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Shared memory

2006-03-28 Thread Thomas Hallgren

Dave Cramer wrote:




I'm not too keen on the term FENCED, since it, in the PL/Java case 
will lead to poorer isolation. Multiple threads running in the same 
JVM will be able to share data and a JVM crash will affect all 
connected sessions.

When was the last time you saw a JVM crash ? These are very rare now.
I think that's somewhat dependent on what JVM you're using. For the 
commercial ones, BEA, IBM, and Sun, i fully agree.


In any case if it does fail, it's a JVM bug and can happen to any code 
running and take the server down if it is in process.
Crash is perhaps not the right word. My point concerned level of 
isolation. Code that is badly written may have serious impact on other 
threads in the same JVM. Let's say you cause an OutOfMemoryException or 
an endless loop. The former will render the JVM completely useless and 
the latter will cause low scheduling prio. If the same thing happens 
using an in-process JVM, the problem is isolated to that one session.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Shared memory

2006-03-27 Thread Thomas Hallgren

Martijn,

I tried a Socket approach. Using the new IO stuff that arrived with Java 1.4 (SocketChannel 
etc.), the performance is really good. Especially on Linux where an SMP machine show a 1 to 
1.5 ratio between one process doing ping-pong between two threads and two processes doing 
ping-pong using a socket. That's acceptable overhead indeed and I don't think I'll be able 
to trim it much using a shared memory approach (the thread scenario uses Java monitor locks. 
That's the most efficient lightweight locking implementation I've come across).


One downside is that on a Windows box, the ratio between the threads and the processes 
scenario seems to be 1 to 5 which is a bit worse. I've heard that Solaris too is less 
efficient then Linux in this respect.


The real downside is that a call from SQL to PL/Java using the current in-process approach 
is really fast. It takes about 5 micro secs on my 2.8GHz i386 box. The overhead of an 
IPC-call on that box is about 18 micro secs on Linux and 64 micro secs on Windows. That's an 
overhead of between 440% and 1300% due to context switching alone. Yet, for some 
applications, perhaps that overhead is acceptable? It should be compared to the high memory 
consumption that the in-process approach undoubtedly results in (which in turn might lead to 
less optimal use of CPU caches and, if memory is insufficient, more time spent doing swapping).


Given those numbers, it would be interesting to hear what the community as a whole thinks 
about this.


Kind Regards,
Thomas Hallgren


Martijn van Oosterhout wrote:

On Fri, Mar 24, 2006 at 11:51:30AM +0100, Thomas Hallgren wrote:

Hi,
I'm currently investigating the feasibility of an alternative PL/Java 
implementation that would use shared memory to communicate between a JVM 
and the backend processes. I would very much like to make use of the 
routines provided in shmem.c but I'm a bit uncertain how to add a segment 
for my own use.


I'm wondering if a better way to do it would be similar to the way X
does it. The client connects to the X server via a pipe (tcp/ip or unix
domain). This is handy because you can block on a pipe. The client then
allocates a shared memory segment and sends a message to the server,
who can then also connect to it.

The neat thing about this is that the client can put data in the shared
memory segment and send one byte through the pipe and then block on a
read. The JVM which has a thread waiting on the other end wakes up,
processes the data, puts the result back and writes a byte to the pipe
and waits. This wakes up the client who can then read the result.

No locking, no semaphores, the standard UNIX semantics on pipes and
sockets make sure everything works.

In practice you'd probably end up sending small responses exclusively
via the pipe and only use the shared memory for larger blocks of data
but that's your choice. In X this is mostly used for image data and
such.


My questions are:
1. Do you see something right away that invalidates this approach?


Nothing direct, though a single segment just for finding the JVM seems
a lot. A socket approach would work better I think.

2. Is using the shared memory functionality that the backend provides a 
good idea (I'm thinking shmem functions, critical sections, semaphores, 
etc.). I'd rather depend on them then having conditional code for different 
operating systems.


That I don't know. However, ISTM a lock-free approach is better
wherever possible. If you can avoid the semaphores altogether...

3. Would it be better if the Postmaster allocated the global segment and 
started the JVM (based on some config parameter)?


I don't know about the segment but the postmaster should start. I
thought the tsearch guys had an approach using a co-process. I don't
know how they start it up but they connected via pipes.

Hope this helps,



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Shared memory

2006-03-27 Thread Thomas Hallgren

Martijn van Oosterhout wrote:

On Mon, Mar 27, 2006 at 10:57:21AM +0200, Thomas Hallgren wrote:

Martijn,

I tried a Socket approach. Using the new IO stuff that arrived with Java 
1.4 (SocketChannel etc.), the performance is really good. Especially on 
Linux where an SMP machine show a 1 to 1.5 ratio between one process doing 
ping-pong between two threads and two processes doing ping-pong using a 
socket. That's acceptable overhead indeed and I don't think I'll be able to 
trim it much using a shared memory approach (the thread scenario uses Java 
monitor locks. That's the most efficient lightweight locking implementation 
I've come across).


Yeah, it's fairly well known that the distinctions between processes
and threads on linux is much smaller than on other OSes. Windows is
pretty bad, which is why threading is much more popular there.

The real downside is that a call from SQL to PL/Java using the current 
in-process approach is really fast. It takes about 5 micro secs on my 
2.8GHz i386 box. The overhead of an IPC-call on that box is about 18 micro 
secs on Linux and 64 micro secs on Windows. That's an overhead of between 
440% and 1300% due to context switching alone. Yet, for some applications, 


snip

This might take some more measurements but AIUI the main difference
between in-process and intra-process is that one has a JVM per
connection, the other one JVM shared. In that case might thoughts are
as follows:

- Overhead of starting JVM. If you can start the JVM in the postmaster
you might be able to avoid this. However, if you have to restart the
JVM each process, that's a cost.

- JIT overhead. For often used classes JIT compiling can help a lot
with speed. But if every class needs to be reinterpreted each time,
maybe that costs more than your IPC.

- Memory overhead. You meantioned this already.

- Are you optimising for many short-lived connections or a few
long-lived connections?

My gut feeling is that if someone creates a huge number of server-side
java functions that performence will be better by having one always
running JVM with highly JIT optimised code than having each JVM doing
it from scratch. But this will obviously need to be tested.

The use case with a huge number of short-lived connections is not feasible at all with 
PL/Java as it stands today. This is partly the reason for my current research. Another 
reason is that it's sometimes desirable to share resources between your connections. 
Dangerous perhaps, but an API that encourages separation and allows sharing in a controlled 
way might prove very beneficial.


The ideal use-case for PL/Java is a client that utilizes a connection pool. And most servlet 
containers and EJB servers do. Scenarios where you have just a few and fairly long lived 
clients are OK too.



One other thing is that seperate processes give you the ability to
parallelize. For example, if a Java function does an SPI query, it can
receive and process results in parallel with the backend generating
them. This may not be easy to acheive with an in-process JVM.



It is fairly easy to achieve using threads. Only one thread at a time may of course execute 
an SPI query but that's true when multiple processes are in place too since the backend is 
single-threaded, and since the logical thread in PL/Java must utilize the same backend as 
where the call originated (to maintain the transaction boundaries). Any result must also 
sooner or later be delivered using that same backend which further limits the ability to 
parallelize.




Incidently, there are compilers these days that can compile Java to
native. Is this Java stuff setup in such a way that you can compile your
classes to native and load directly for the real speed-freaks?


PL/Java can be used with GCJ although I don't think the GCJ compiler outranks the JIT 
compiler in a modern JVM. It can only do static optimizations whereas the JIT has runtime 
heuristics to base its optimizations on. In the test results I've seen so far, the GCJ 
compiler only gets the upper hand in very simple tests. The JIT generated code is faster 
when things are more complicated.


GCJ is great if you're using short-lived connections (less startup time and everything is 
optimized from the very start) but the native code that it produces still needs a JVM of 
some sort. No interpreter of course but classes must be initialized, a garbage collector 
must be running etc. The shared native code results in some gain in memory consumption but 
it's not as significant as one might think.




In that
case, maybe you should concentrate on relibility and flexibility and
still have a way out for functions that *must* be high-performance.



Given time and enough resources, I'd like to provide the best of two worlds and give the 
user a choice whether or not the JVM should be external. Ideally, this should be controlled 
using configuration parameters so that its easy to test which scenario that works best. It's 
a lot of work though

Re: [HACKERS] Shared memory

2006-03-27 Thread Thomas Hallgren

Tom Lane wrote:

It's only that much difference?  Given all the other advantages of
separating the JVM from the backends, I'd say you should gladly pay
that price.

If I'm right, and the most common scenario is clients using connection pools, then it's very 
likely that you don't get any advantages at all. Paying for nothing with a 440% increase in 
calling time (at best) seems expensive :-)


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Shared memory

2006-03-27 Thread Thomas Hallgren

Tom Lane wrote:

Thomas Hallgren [EMAIL PROTECTED] writes:
  

Tom Lane wrote:


It's only that much difference?  Given all the other advantages of
separating the JVM from the backends, I'd say you should gladly pay
that price.

  
If I'm right, and the most common scenario is clients using connection pools, then it's very 
likely that you don't get any advantages at all. Paying for nothing with a 440% increase in 
calling time (at best) seems expensive :-)



You are focused too narrowly on a few performance numbers.  In my mind
the primary advantage is that it will *work*.  I do not actually believe
that you'll ever get the embedded-JVM approach to production-grade
reliability, because of the fundamental problems with threading, error
processing, etc.
  
My focus with PL/Java over the last year has been to make it a 
production-grade product and I think I've succeeded pretty well. The 
current list of open bugs is second to none. What fundamental problems 
are you thinking of that hasn't been solved already?


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


  1   2   3   4   5   >