date:20070809

Re: [HACKERS] Compilation of pg 7.4.17 fails on HP-UX

2007-08-09 Thread Tom Lane

"Adrian Maier" <[EMAIL PROTECTED]> writes:
> On 8/10/07, Tom Lane <[EMAIL PROTECTED]> wrote:
>> [ digs around a bit... ]  However, I might have been right for the
>> wrong reasons.  I'm thinking Adrian is trying to build for 64-bit
>> HPPA (a beast the 7.4 supported-platforms matrix doesn't know about),
>> and is falling foul of this problem:
>> http://archives.postgresql.org/pgsql-hackers/2004-03/msg01196.php

> So it looks like this is simply a known issue that hasn't been backpatched.

You didn't confirm in so many words: were you trying to build 64-bit?
I'd have expected a 32-bit build to work.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] Compilation of pg 7.4.17 fails on HP-UX

2007-08-09 Thread Adrian Maier

On 8/10/07, Tom Lane <[EMAIL PROTECTED]> wrote:
> Decibel! <[EMAIL PROTECTED]> writes:
> > On Thu, Aug 09, 2007 at 11:46:47AM -0400, Tom Lane wrote:
> >> "Adrian Maier" <[EMAIL PROTECTED]> writes:
> >>> I have just tried to compile postgresql 7.4.17  on a HP-UX 11.11 box
> >>> (PA-RISC)   and the compliation fails in gist.
> >>
> >> I do not think anyone cares about making 7.4.x run on platforms it did
> >> not support before.  Use a newer PG release.
>
> > Actually, this is PA-RISC, not the numerous emails we've gotten this
> > week about HPUX on ia64 (what is it with the HPUX guys this week?)
>
> Oh, my mistake --- obviously hadn't consumed enough caffeine this
> morning.
>
> [ digs around a bit... ]  However, I might have been right for the
> wrong reasons.  I'm thinking Adrian is trying to build for 64-bit
> HPPA (a beast the 7.4 supported-platforms matrix doesn't know about),
> and is falling foul of this problem:
> http://archives.postgresql.org/pgsql-hackers/2004-03/msg01196.php
> Teodor fixed that in the 8.0 devel cycle, but couldn't back-patch it
> because it meant an on-disk layout change of gist indexes.
>
> What I suspect is that gcc 3.3.3 doesn't have the specific error checks
> I complained of in the above message, but simply generates bogus
> assembly code for the incorrect C code :-(

So it looks like this is simply a known issue that hasn't been backpatched.
I'll simply use 8.2 .


Thanks for your answers,
Adrian Maier

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

[HACKERS] pgcheck - data integrity check

2007-08-09 Thread Rober Mach

Hi,

I am working on a data integrity check tool (pgcheck).
I would like to discuss the following issues:

Right now I am working on a function, which should check the validity of
pages in relation. The relation is passed to the function as its
argument (its oid). For the integrity check of a page, I am using an
AccessShare lock on a relation as you can see on
http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgcheck/pokus/pgcheck_page/
. In near future, I would like to extend the functionality also with a
recovery tool which would be able to repair broken pages. Should the
function also repair the corrupted data on a page, the AccessShare lock
on a relation would not be sufficient. But on the other hand
AccessExclusive lock on the entire relation could significantly
influence the performance of a database.
So far I see these possibilities:
1- use AccessShare lock for the integrity check function and in case of
faulty page, pass this page to a special function, which would lock the
page using AccessExclusiveLock on the relation for correction.
+ it would not influence the performance so much
- higher complexity
2- use one function which would lock the relation with AccessExclusive
lock, check the integrity of data and in case of faulty pages, it would
repair it at once.
+ easier to implement
- could cause performance downturn because of relatively long
Exclusive lock on some relations.
3- use the AccessShare lock for the integrity check and use special
"single-user mode" for recovery of data
+ safest option for recovery of data
- long down time of database


Furhter, I would like to know your opinions on what should be checked
next in order to check the integrity of data in database. I am thinking
of checking:
-in case of variable-length data, compare the formal and actual size of data
-check whether constrains applied on items are fulfilled
-compare data in indexes with indexed tables, whether they are correct


Robert

P.S. Any comments to the c-funtion I made so far a welcome


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Unexpected VACUUM FULL failure

2007-08-09 Thread Gregory Stark


"Tom Lane" <[EMAIL PROTECTED]> writes:

> "Simon Riggs" <[EMAIL PROTECTED]> writes:
>
>> ISTM that if we call HeapTupleSatisfiesVacuum() with an additional
>> boolean parameter, force, we can tell VF to always set the hint bits in
>> every case, not just HEAP_MOVED_IN and HEAP_MOVED_OUT.
>
> Surely this approach is no good: won't it allow hint bits to reach disk
> in advance of their transaction?

I don't think so since it sounds like he's saying to still sync the log and
VACUUM FULL has an exclusive lock on the table. So any committed (or aborted)
changes it sees in the table must have been committed or aborted before the
log sync.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] Unexpected VACUUM FULL failure

2007-08-09 Thread Tom Lane

"Simon Riggs" <[EMAIL PROTECTED]> writes:
> Good hunch. I plugged this hole earlier, but on further inspection I can
> see the plug wasn't wide enough. XLogAsyncCommitFlush() is good enough,
> but HeapTupleSatisfiesVacuum() still allowed the inexact bookkeeping to
> sometimes skip hint bit setting, when executed with concurrent
> transactions touching other tables.

> ISTM that if we call HeapTupleSatisfiesVacuum() with an additional
> boolean parameter, force, we can tell VF to always set the hint bits in
> every case, not just HEAP_MOVED_IN and HEAP_MOVED_OUT.

Surely this approach is no good: won't it allow hint bits to reach disk
in advance of their transaction?

I think it'd be safer, and a lot less ugly, to recast the tests in
VACUUM FULL.  If we make the first pass clear any old MOVED_IN/MOVED_OUT
bits then the last pass can key off those instead of assuming that
XMIN_COMMITTED is set everywhere.  Then we'd not need
XLogAsyncCommitFlush, which is a kluge anyway.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Compilation of pg 7.4.17 fails on HP-UX

2007-08-09 Thread Tom Lane

Decibel! <[EMAIL PROTECTED]> writes:
> On Thu, Aug 09, 2007 at 11:46:47AM -0400, Tom Lane wrote:
>> "Adrian Maier" <[EMAIL PROTECTED]> writes:
>>> I have just tried to compile postgresql 7.4.17  on a HP-UX 11.11 box
>>> (PA-RISC)   and the compliation fails in gist.
>>
>> I do not think anyone cares about making 7.4.x run on platforms it did
>> not support before.  Use a newer PG release.

> Actually, this is PA-RISC, not the numerous emails we've gotten this
> week about HPUX on ia64 (what is it with the HPUX guys this week?)

Oh, my mistake --- obviously hadn't consumed enough caffeine this
morning.

[ digs around a bit... ]  However, I might have been right for the
wrong reasons.  I'm thinking Adrian is trying to build for 64-bit
HPPA (a beast the 7.4 supported-platforms matrix doesn't know about),
and is falling foul of this problem:
http://archives.postgresql.org/pgsql-hackers/2004-03/msg01196.php
Teodor fixed that in the 8.0 devel cycle, but couldn't back-patch it
because it meant an on-disk layout change of gist indexes.

What I suspect is that gcc 3.3.3 doesn't have the specific error checks
I complained of in the above message, but simply generates bogus
assembly code for the incorrect C code :-(

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Wrote a connect-by feature

2007-08-09 Thread Joshua D. Drake

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Andrew Dunstan wrote:
> 
> 
> Bertram Scharpf wrote:
>> Hi,
>>
>> Am Mittwoch, 08. Aug 2007, 14:04:23 +0200 schrieb Bertram Scharpf:
>>  
>>> Am Mittwoch, 08. Aug 2007, 00:52:41 +0200 schrieb Bertram Scharpf:
>>>
   
   
>>
>> Nobody's answering just a line. Is it such a bore?

I have it downloaded but have not had time to test it.

>>
>>
>>
>>   
> 
> Isn't connect by the Oracle non-standard way of doing recursive queries?
> Please review the mailing list archives on this subject.
> 
> Also, many people have their heads down trying to punch out a release,
> right now.
> 
> cheers
> 
> andrew
> 
> ---(end of broadcast)---
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>   choose an index scan if your joining column's datatypes do not
>   match
> 


- --

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGu7mdATb/zqfZUUQRAgFSAJ0U2bPpRfV1G1TXbyBmakwqC4WfhwCcD3B2
wNWKLIF7FqFVo9KMVKMXtt8=
=LPls
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] Wrote a connect-by feature

2007-08-09 Thread Andrew Dunstan




Bertram Scharpf wrote:

Hi,

Am Mittwoch, 08. Aug 2007, 14:04:23 +0200 schrieb Bertram Scharpf:
  

Am Mittwoch, 08. Aug 2007, 00:52:41 +0200 schrieb Bertram Scharpf:


  
  


Nobody's answering just a line. Is it such a bore?



  


Isn't connect by the Oracle non-standard way of doing recursive queries? 
Please review the mailing list archives on this subject.


Also, many people have their heads down trying to punch out a release, 
right now.


cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [HACKERS] createlang/droplang -l outputs

2007-08-09 Thread Tom Lane

Tomoaki Sato <[EMAIL PROTECTED]> writes:
> The header in the list of already installed languages shown by
> createlang/droplang with the -l option is not printed from 8.2.

Ooops.  I seem to have missed these uses of printQuery() when we put in
the FETCH_COUNT patch last summer :-(.

> + popt.topt.start_table = true;

It looks like stop_table has to be set true as well to match
the former formatting; compare
http://developer.postgresql.org/cvsweb.cgi/pgsql/src/bin/psql/startup.c.diff?r1=1.136;r2=1.137

Patched in HEAD and 8.2 branch.  Thanks for the report!

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] Wrote a connect-by feature

2007-08-09 Thread Bertram Scharpf

Hi,

Am Mittwoch, 08. Aug 2007, 14:04:23 +0200 schrieb Bertram Scharpf:
> Am Mittwoch, 08. Aug 2007, 00:52:41 +0200 schrieb Bertram Scharpf:
> >   

Nobody's answering just a line. Is it such a bore?

Bertram


-- 
Bertram Scharpf
Stuttgart, Deutschland/Germany
http://www.bertram-scharpf.de

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Joshua D. Drake

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hans-Juergen Schoenig wrote:
> 
> On Aug 9, 2007, at 5:03 PM, Greg Smith wrote:
> 
>> On Thu, 9 Aug 2007, Andrew Dunstan wrote:
>>
>>> There are also some fairly impressive code obfuscators about, that
>>> your clients might find useful.
>>
>> All they really need is to find a sufficiently clever PL/Perl programmer.
> 
> 
> we should make this a PL/Brainfuck implementation then ;)

PL/Whitespace

http://compsoc.dur.ac.uk/whitespace/

> 
> hans
> 
> 
> -- 
> Cybertec Geschwinde & Schönig GmbH
> Gröhrmühlgasse 26, 2700 Wiener Neustadt
> Tel: +43/1/205 10 35 / 340
> www.postgresql.at, www.cybertec.at
> 
> 
> 


- --

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGu6dbATb/zqfZUUQRAm5lAKCvNjklINez5AfioKTJ2FT4BeZ1yQCfbvXz
//mGwmoSYWsiAbdAtOR9/vU=
=QaI0
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Decibel!

On Thu, Aug 09, 2007 at 11:41:02AM -0400, Andrew Dunstan wrote:
> 
> 
> Decibel! wrote:
> >This is also related to the desire to be able to restrict access to the
> >catalog tables. Doing so could potentially solve this problem; it 
> >solves other issues (such as being able to see all the databases that
> >exist on a server, something that hosting environments care about).
> >  
> 
> You can hide the catalogs, albeit at the cost of some functionality. I 
> did some experimentation a couple of years back with removing public 
> access from the catalogs, removing information_schema and the public 
> schema, etc, and it worked quite well. I set up a user who had access to 
> a single schema, which only contained functions, and the user wasn't 
> able (so far as I could determine) to see anything other than those 
> functions - no tables, no catalogs, no databases, no users. The user was 
> still able to function exactly as intended. The intended scenario was 
> for a web app user, where the web server was subverted, the aim being to 
> restrict the amount of information the intruder could steal.
> 
> That doesn't help with information leaking in shared hosting setups, I 
> agree.

No, but that combined with row-level security might. Actually, if we had
a standard set of views that all the tools were expected to use instead
of the raw catalogs, it wouldn't be hard at all to secure things in a
hosted environment.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgpIPQ0HPLZrD.pgp
Description: PGP signature

Re: [HACKERS] Compilation of pg 7.4.17 fails on HP-UX

2007-08-09 Thread Decibel!

On Thu, Aug 09, 2007 at 11:46:47AM -0400, Tom Lane wrote:
> "Adrian Maier" <[EMAIL PROTECTED]> writes:
> > I have just tried to compile postgresql 7.4.17  on a HP-UX 11.11 box
> > (PA-RISC)   and the compliation fails in gist.
> 
> I do not think anyone cares about making 7.4.x run on platforms it did
> not support before.  Use a newer PG release.

Actually, this is PA-RISC, not the numerous emails we've gotten this
week about HPUX on ia64 (what is it with the HPUX guys this week?)

http://www.postgresql.org/docs/7.4/interactive/supported-platforms.html
indicates that HPUX on PA-RISC should work...
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgpGfcyDpEN3F.pgp
Description: PGP signature

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Robert Treat

On Thursday 09 August 2007 11:30, Joshua D. Drake wrote:
> Andrew Sullivan wrote:
> > On Thu, Aug 09, 2007 at 04:42:19PM +0200, Hans-Juergen Schoenig wrote:
> >> the problem here is that vendors of appliances don't want people to
> >> spider their codes. this is a fact - it is not the idea of open
> >> source to do so but bloody reality. in addition to that people are
> >> not willing to code everything in C just to hide.
> >
> > Well, then, they're out of luck.
>
> This whole thread can be summed up as: plpgsql is an interpretive
> language, not a compiled on.
>
> You can and do see the source, just like perl, ruby, python, bash etc...
>
> If you don't like it, don't use it and use C instead.
>
> :)
>
> Keep in mind that the obfuscation techniques that can be used by python
> and perl are all reversible.
>

Hmm I wonder if you could wire plphp through something like Zend Gaurd? 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

[HACKERS] 2D partitioning of VLDB - sane or not?

2007-08-09 Thread Jason Nerothin

I am building up a schema for storing a bunch of data about proteins, which
on a certain level can be modelled with quite simple tables. The problem is
that the database I am building needs to house lots of it >10TB and growing,
with one table in particular threatening to top 1TB. In the case of the
table and in the case of the overall database, the size can be expected to
grow quickly (and most of it can never be deleted).

In the past, with smaller tables, I have had success partitioning on a
64-bit crc hash that takes a more or less uniform distribution of input data
and pumps out a more-or-less uniform distribution of partitioned data with a
very small probability of collision. The hash itself is implemented as a c
add-on library, returns a BIGINT and serves as a candidate key for what for
our purposes we can call a protein record.

Now back to the big table, which relates two of these records (in a
theoretically symmetric way). Assuming I set the the table up as something
like:

CREATE TABLE big_protein_relation_partition_dimA_dimB{
protein_id_a BIGINTEGER NOT NULL CHECK( bin_num(protein_id_a) = dimA ), ---
key (hash) from some table
protein_id_a BIGINTEGER NOT NULL CHECK( bin_num(protein_id_b) = dimB ), ---
key (hash) from some table
...
}

and do a little c bit-twiddling and define some binning mechanism on the
BIGINTEGERs.

As near I can tell, binning out along the A and B dimensions into 256 bins,
I shouldn't be in any danger of running out of OIDs or anything like that
(despite having to deal with 2^16 tables). Theoretically, at least, I should
be able to do UNIONS along each axis (to avoid causing the analyzer too much
overhead) and use range exclusion to make my queries zip along with proper
indexing.

Aside from running into a known bug with "too many triggers" when creating
gratuitous indices on these tables, I feel as it may be possible to do what
I want without breaking everything. But then again, am I taking too many
liberties with technology that maybe didn't have use cases like this one in
mind?

Jason

-- 

Jason Nerothin
Programmer/Analyst IV - Database Administration
UCLA-DOE Institute for Genomics & Proteomics
Howard Hughes Medical Institute

611 C.E. Young Drive East   | Tel: (310) 206-3907
105 Boyer Hall, Box 951570  | Fax: (310) 206-3914
Los Angeles, CA 90095. USA | Mail: [EMAIL PROTECTED]

http://www.mbi.ucla.edu/~jason

Re: [HACKERS] RIP: Buildfarm member Baiji ??

2007-08-09 Thread Chris Mair




The extinction of a dolphin ... sign of things to come?



I've got a dolphin too (somebody's doing this on purpose ;):
http://www.pgbuildfarm.org/cgi-bin/show_history.pl?nm=tucuxi&br=HEAD

A few hours ago I realized the vmware instance it's running in had
been suspended in a snapshot for the last few days...

1 extinct, 1 in coma ->  not going too well for dolphins these days ;)


Bye,
Chris.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig




Make a loadable PL "plpgsqlsec" or something like that that's just
a thin wrapper around the plpgsql call handler, and all it does is
decrypt the source text.



perfect idea, simple perfect.
i did not consider that yet.
i was hoping for some enlightenment like that.

would be a nice module for contrib or pgfoundry ...

many thanks,

hans


--
Cybertec Geschwinde & Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

Re: [HACKERS] default_text_search_config and expression indexes

2007-08-09 Thread tomas

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Thu, Aug 09, 2007 at 02:03:13PM +0400, Oleg Bartunov wrote:
> On Thu, 9 Aug 2007, [EMAIL PROTECTED] wrote:
> 
> >Maybe I'm missing something [...]

> I'm tired to repeat - index itself doesn't know about configuration !
> It just index tsvector data type. tsvector in turn can be obtained
> using various ways:
> 1. manually
> 2. to_tsvector
> 
> tsvector can be stored in a separate attribute or be fully virtual
> like in expressional index.
> 
> Moreover, tsvector can be obtained using various configurations depending
> on your application.

Yep. I think I got that.

Now what will be the "normal" case for a new user?

* build an index with a virtual tsvector (using a configuration).
  Then I would expect the "index" to know the "whole" function to
  calculate its entries -- that would include the config used.

  This would be more "compatible" with the approach stated elsewhere to
  always mention explicitly the config.

* manually. Would a novice do that? Or is that advanced stuff?

Regards -- and sorry for my stupid questions :)

- -- tomás
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFGu1ciBcgs9XrR2kYRAnVqAJ9mSc463I44JxLLDPpUZ/JirUVH5ACeOhUr
2/4aXs0ukMnvP8YCq8pamwQ=
=IgfC
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Tom Lane

"Merlin Moncure" <[EMAIL PROTECTED]> writes:
> ... IIRC the general
> conclusion was that if you want to truly encrypt the sources for your
> functions, the basic idea is to create a new stored procedure language
> that wraps pl/pgsql and handles encryption there.

> This would be relatively easy to support as an external module, I think.

Yeah, I was about to make that same suggestion: it's not clear to me
that you need any support whatsoever from the core project for this.
Make a loadable PL "plpgsqlsec" or something like that that's just
a thin wrapper around the plpgsql call handler, and all it does is
decrypt the source text.

It's not clear exactly where the *en*cryption should happen, but
one possibility is to have the new PL have a validator function that
does an immediate UPDATE on the pg_proc row if the text isn't already
encrypted.

The encrypted representation would need to work like MD5 passwords:
it's ASCII so that dumps will work, and it's possible to tell whether a
given string is encrypted or not.

Given the very limited use-case for what's being suggested here,
I don't feel a need to put it in core.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] change name of redirect_stderr?

2007-08-09 Thread Tom Lane

Andrew Dunstan <[EMAIL PROTECTED]> writes:
> Before I wrap up the CSVlog stuff, we need to decide whether or not to 
> change the name of the redirect_stderr setting, and if so to what. The 
> reason is that with CSVlogs it will no longer apply just to stderr (we 
> will require it to be on for CSVlogs, in fact).

> I suggest "redirect_logs", although it's arguably too general as it 
> doesn't apply to syslog/eventlog.

Perhaps it should be named analogously to stats_start_collector,
ie think of the syslogger process as a "log collector".  I don't
much like "log_start_collector" though --- "start_log_collector"
seems far less confusing as to where the verb is.

No strong opinion here, just tossing out some ideas.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] HOT patch, missing things

2007-08-09 Thread Tom Lane

Gregory Stark <[EMAIL PROTECTED]> writes:
> I agree about the costs for evaluating the expressions. But a COLD update is
> certainly going to have to evaluate both expressions once. The only additional
> cost here is that HOT is going to have to evaluate the *old* expression as
> well. So it's at worst twice as expensive as a normal COLD update.

What's bothering me is the case where we evaluate the expression twice,
find it doesn't match, and fall through to the COLD update logic which
will do it a third time.

> I think I'm leaning towards doing a binary comparison of the
> parameters to the expressions.

Yeah, we could simply insist on no change to any column that's used by
any of the expressions.  That would be cheap to test.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] HOT patch, missing things

2007-08-09 Thread Gregory Stark

"Tom Lane" <[EMAIL PROTECTED]> writes:

> We have so far managed to avoid any really strong dependencies on the
> requirement of index-function immutability --- your queries may not work
> very well if the function isn't immutable, but you are not at risk of
> system-level data corruption. With this, you will be.

Wait, how would this be any more vulnerable to system-level data corruption
than a normal update? The worst case I can see is that you have a properly
updated tuple but the new tuple version is indexed incorrectly just as would
be the case if you have a functional index or expression index which had
changed value since the update was performed.

I agree about the costs for evaluating the expressions. But a COLD update is
certainly going to have to evaluate both expressions once. The only additional
cost here is that HOT is going to have to evaluate the *old* expression as
well. So it's at worst twice as expensive as a normal COLD update.

I think I'm leaning towards doing a binary comparison of the parameters to the
expressions. That won't catch as many cases as comparing the results of the
expressions -- and I can think of cases where that would be disappointing --
but it's in keeping with how it determines whether a tuple is eligible for a
HOT update in the first place.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

[HACKERS] createlang/droplang -l outputs

2007-08-09 Thread Tomoaki Sato

Hi,

The header in the list of already installed languages shown by
createlang/droplang with the -l option is not printed from 8.2.

PostgreSQL 8.2:
  $ createlang -l
   plpgsql | yes

PostgreSQL 8.1:
  $ createlang -l
  Procedural Languages
Name   | Trusted? 
  -+--
   plpgsql | yes

The structure member 'start_table' is added to the 'printTableOpt' in
the 'src/bin/psql/print.h' from 8.2. But the value of the
'start_table' is not set to true.

Regards,

Index: src/bin/scripts/createlang.c
===
RCS file: /projects/cvsroot/pgsql/src/bin/scripts/createlang.c,v
retrieving revision 1.24
diff -u -r1.24 createlang.c
--- src/bin/scripts/createlang.c14 Jul 2006 14:52:27 -  1.24
+++ src/bin/scripts/createlang.c9 Aug 2007 14:46:31 -
@@ -138,6 +138,7 @@
memset(&popt, 0, sizeof(popt));
popt.topt.format = PRINT_ALIGNED;
popt.topt.border = 1;
+   popt.topt.start_table = true;
popt.topt.encoding = PQclientEncoding(conn);
popt.title = _("Procedural Languages");
printQuery(result, &popt, stdout, NULL);
Index: src/bin/scripts/droplang.c
===
RCS file: /projects/cvsroot/pgsql/src/bin/scripts/droplang.c,v
retrieving revision 1.21
diff -u -r1.21 droplang.c
--- src/bin/scripts/droplang.c  31 May 2006 11:02:42 -  1.21
+++ src/bin/scripts/droplang.c  9 Aug 2007 14:46:31 -
@@ -149,6 +149,7 @@
memset(&popt, 0, sizeof(popt));
popt.topt.format = PRINT_ALIGNED;
popt.topt.border = 1;
+   popt.topt.start_table = true;
popt.topt.encoding = PQclientEncoding(conn);
popt.title = _("Procedural Languages");
printQuery(result, &popt, stdout, NULL);



Tomoaki Sato <[EMAIL PROTECTED]>
SRA OSS, Inc. Japan

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Problem with locks

2007-08-09 Thread Gregory Stark

"Gregory Stark" <[EMAIL PROTECTED]> writes:

> "Tom Lane" <[EMAIL PROTECTED]> writes:
>
>> Gregory Stark <[EMAIL PROTECTED]> writes:
>>> We're seeing a problem where occasionally a process appears to be granted a
>>> lock but miss its semaphore signal.
>>
>> Kernel bug maybe?  What's the platform?
>
> It does sound like it given the way my description went. I was worried it may
> be some code path not setting waitStatus properly or the compiler caching it
> incorrectly somehow.
>
> But now that I check I see it's a pretty old kernel version (Linux 2.6.5) 

For what it's worth we've reproduced the problem with 2.6.16.21 which is
"only" about a year old. I want to rerun this with a shiny new 2.6.22 kernel
but really 2.6.16 is recent enough that I don't know of any major bugs fixed
in IPC handling since then (with the exception of hugetlb interaction which
we're not using on this machine) .

So now this is probably either an ongoing kernel bug affecting Postgres or
it's elsewhere -- either in Postgres or GCC.

I'm really concerned about this because while the behaviour with
deadlock_timeout set quite high (we have it set to 60s on this machine) is bad
enough -- the behaviour with it set to the default 1s is far more scary.

On the default 1s timeout on a machine undergoing lock waits which are mostly
under 1s you will probably never notice anything recognizably similar to this.
You'll occasionally have some lock waits which last a second for no good
reason but you'll never notice that. 

*But* if you should have a lock wait which lasts more than 1s before it's
granted, then when it's granted the semaphore gets lost you're in serious doo
doo. The deadlock timeout only fires once and then nothing's going to wake up
that process ever again.

IIRC we've actually gotten a couple reports of people claiming they've got a
"deadlock" when there was no evidence of a deadlock in pg_locks. We always
chalked it down to a single long-lived process holding the lock and blocking,
but never did much analysis on those reports to see if that was really the
case. It's quite possible we had users already observing this problem.

If it's a real problem then we're in a bit of a bind. Even if we find and fix
a Linux kernel problem we'll still have users on versions of the kernel prior
to 2.6.23 or whatever has the bug fixed. We may be best off including an
option to have the deadlock timer refire every deadlock_timeout interval
instead of just firing once. Then we could print a message any time it occurs
and include a HINT about upgrading to a kernel with the bug fixed.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Merlin Moncure

On 8/9/07, Andrew Dunstan <[EMAIL PROTECTED]> wrote:
> Decibel! wrote:
> > This is also related to the desire to be able to restrict access to the
> > catalog tables. Doing so could potentially solve this problem; it
> > solves other issues (such as being able to see all the databases that
> > exist on a server, something that hosting environments care about).
 >
>
> You can hide the catalogs, albeit at the cost of some functionality. I
> did some experimentation a couple of years back with removing public
> access from the catalogs, removing information_schema and the public
> schema, etc, and it worked quite well. I set up a user who had access to
> a single schema, which only contained functions, and the user wasn't
> able (so far as I could determine) to see anything other than those
> functions - no tables, no catalogs, no databases, no users. The user was
> still able to function exactly as intended. The intended scenario was
> for a web app user, where the web server was subverted, the aim being to
> restrict the amount of information the intruder could steal.

This works very well to stop casual browsing of functions from psql, etc.

That said, I am in the camp that securing system catalogs (including
pg_proc) is a good and necessary feature.  This debate came up a while
back with all the usual arguments pro- and con-.  IIRC the general
conclusion was that if you want to truly encrypt the sources for your
functions, the basic idea is to create a new stored procedure language
that wraps pl/pgsql and handles encryption there.

This would be relatively easy to support as an external module, I think.

merlin

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] HOT patch, missing things

2007-08-09 Thread Tom Lane

"Pavan Deolasee" <[EMAIL PROTECTED]> writes:
> I started with this. ISTM to support partial indexes, we must check
> the old and new tuple against partiality match.
> ...
> For functional index, we should apply the function to the old and new
> tuple and compare the outcome. If the results are same, HOT update
> is feasible.

I don't like either of these.  They are going to be extremely expensive
if the function or predicate is expensive (because you're going to be
doing two evaluations that you might get no benefit from).  Also, if the
function is not as immutable as it's supposed to be, you will soon have
an utterly corrupt index, with entries pointing at rows they in fact
don't match and never did.  We have so far managed to avoid any really
strong dependencies on the requirement of index-function immutability
--- your queries may not work very well if the function isn't immutable,
but you are not at risk of system-level data corruption.  With this, you
will be.  Since we are entirely dependent on users to mark immutable
functions correctly (and have not always gotten it right ourselves :-(),
I don't think this is an acceptable risk.

If we can't do better than that (and offhand I don't see how to), then
I agree with the current approach of disabling HOT when functional or
partial indexes are present.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Compilation of pg 7.4.17 fails on HP-UX

2007-08-09 Thread Tom Lane

"Adrian Maier" <[EMAIL PROTECTED]> writes:
> I have just tried to compile postgresql 7.4.17  on a HP-UX 11.11 box
> (PA-RISC)   and the compliation fails in gist.

I do not think anyone cares about making 7.4.x run on platforms it did
not support before.  Use a newer PG release.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Andrew Dunstan




Decibel! wrote:

This is also related to the desire to be able to restrict access to the
catalog tables. Doing so could potentially solve this problem; it 
solves other issues (such as being able to see all the databases that

exist on a server, something that hosting environments care about).
  


You can hide the catalogs, albeit at the cost of some functionality. I 
did some experimentation a couple of years back with removing public 
access from the catalogs, removing information_schema and the public 
schema, etc, and it worked quite well. I set up a user who had access to 
a single schema, which only contained functions, and the user wasn't 
able (so far as I could determine) to see anything other than those 
functions - no tables, no catalogs, no databases, no users. The user was 
still able to function exactly as intended. The intended scenario was 
for a web app user, where the web server was subverted, the aim being to 
restrict the amount of information the intruder could steal.


That doesn't help with information leaking in shared hosting setups, I 
agree.


cheers

andrew

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Andrew Dunstan




Andrew Sullivan wrote:

  And they can always write it in C and deliver only compiled
object code; although whether anyone would be foolish enough to run
such object code without inspecting its safety is an excercise in
speculation I don't care to pursue.
  


The intended use for this is an appliance, not a general purpose server. 
They wouldn't have any option :-)


cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Joshua D. Drake

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Andrew Sullivan wrote:
> On Thu, Aug 09, 2007 at 04:42:19PM +0200, Hans-Juergen Schoenig wrote:
> 
>> the problem here is that vendors of appliances don't want people to  
>> spider their codes. this is a fact - it is not the idea of open  
>> source to do so but bloody reality. in addition to that people are  
>> not willing to code everything in C just to hide.
> 
> Well, then, they're out of luck.

This whole thread can be summed up as: plpgsql is an interpretive
language, not a compiled on.

You can and do see the source, just like perl, ruby, python, bash etc...

If you don't like it, don't use it and use C instead.

:)

Keep in mind that the obfuscation techniques that can be used by python
and perl are all reversible.

Sincerely,

Joshua D. Drake

- --

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGuzMUATb/zqfZUUQRAn0uAJ9evWtJkzS5hMiJDNSi+1jxMobdoACghm3D
r7Tmmz801bKntXAipJBrvy4=
=yRz/
-END PGP SIGNATURE-

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Andrew Sullivan

On Thu, Aug 09, 2007 at 05:16:43PM +0200, Hans-Juergen Schoenig wrote:
> we are shipping appliances - users can only use SQL; no compilers and  
> no SSH logins allowed ...

If you have shipped a physical thing to someone else, then they can
do what they want with it irrespective of what "rules" you think you
have imposed.  If you believe otherwise, then you are deluding
yourself about the security of your approach.  It was this sort of
magical thinking that caused SIM cards, WEP keys, DVD copy
protection, and other such "guaranteed safe" technologies to be
revealed as encumbered with needless weight that may foil casual
users, but that are trivially broken by anyone actually interested in
doing the breaking.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism. 
--Brad Holland

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Andrew Sullivan

On Thu, Aug 09, 2007 at 04:42:19PM +0200, Hans-Juergen Schoenig wrote:

> the problem here is that vendors of appliances don't want people to  
> spider their codes. this is a fact - it is not the idea of open  
> source to do so but bloody reality. in addition to that people are  
> not willing to code everything in C just to hide.

Well, then, they're out of luck.

> so, there has to be a concept to achieve this for stored procedures  
> somehow.

Just because you want something does not mean that there must be a
way to achieve it.

> i am afraid the source level encryption is the easiest thing and most  
> understandable thing to do.

It's easy and understandable, but it won't do anything for you,
because the code has to be decrypted automatically in order to run. 
If it can be decrypted by the system, then it can obviously be
decrypted by someone who has superuser access to the system too.  I
have seen systems that attempt this sort of "protection" anyway. 
(Indeed, one might argue that all of the hopeless and failed copy
protection systems CDs and DVDs have come with are exactly this sort
of trick.)  The fundamental fact is that, if you want users to be able
to use code that is interpreted, then you're never going to be able
to guarantee that the users can't also see that code.

Other equally useless, in my experience, techniques include code
obfuscation.  You could probably bind things up in enough levels of
nested calls to various functions that mapping it out would be tricky
and tedious.  Of course, this also means that maintaining the code is
a fairly significant pain.

I'll also note that I am not actually convinced these contortions do
anything to protect the system from people who want to figure out how
it works. 

None of that, of course, does anything to relieve the pressure on you
to deliver the requested feature.  But perhaps you can reason with
them.  And they can always write it in C and deliver only compiled
object code; although whether anyone would be foolish enough to run
such object code without inspecting its safety is an excercise in
speculation I don't care to pursue.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The plural of anecdote is not data.
--Roger Brinner

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Decibel!

On Thu, Aug 09, 2007 at 04:50:53PM +0200, Zdenek Kotala wrote:
> Hans-Juergen Schoenig napsal(a):
> >the idea is basically to hide codes - many companies want that and ask 
> >for it again and again.
> >
> >i would suggest keys to reside in $PGDATA. we do this for SSL and so 
> >already.
> >
> >initdb could create such keys so that they are unique to every database 
> >instance.
> >decrypting could be avoided as much as possible basically we should just 
> >decrypt on first all and when it changes.
> 
> But, Companies want to hide code also because they distribute their 
> software. If you store key somewhere on server, user will be able to 
> decrypt the original code.

If folks keep that argument up, we'll never, ever get this useful
feature.

I don't care *what* security you implement (maybe short of quantum
physics), given enough time and motivation, someone can break it. That's
not the point here... the point is just to make it more difficult for
someone to obtain the source code for procedures. Right now, anyone who
can connect to the database can see *everything*.

This is also related to the desire to be able to restrict access to the
catalog tables. Doing so could potentially solve this problem; it also
solves other issues (such as being able to see all the databases that
exist on a server, something that hosting environments care about).
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgpY53aG0Hgml.pgp
Description: PGP signature

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig



The basic problem is this: if you have to decrypt the code in order  
to give it to a compiler (PL/pgSQL, Java, Perl, ...) then there is  
a point in time where the source code is in plaintext form - it  
would be trivial to add an fprintf( stderr, "%s", plainTextForm )  
to the PL handler to steal the code.


  -- Korry



if somebody is allowed to recompile on the box GRANT and REVOKE are  
all useful.

you did not get the point, i think.
we are shipping appliances - users can only use SQL; no compilers and  
no SSH logins allowed ...


hans


--
Cybertec Geschwinde & Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

[HACKERS] change name of redirect_stderr?

2007-08-09 Thread Andrew Dunstan



Before I wrap up the CSVlog stuff, we need to decide whether or not to 
change the name of the redirect_stderr setting, and if so to what. The 
reason is that with CSVlogs it will no longer apply just to stderr (we 
will require it to be on for CSVlogs, in fact).


I suggest "redirect_logs", although it's arguably too general as it 
doesn't apply to syslog/eventlog. But maybe that doesn't matter, as we 
can note it in the docs and the sample conf file.


thoughts?

cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread korry.douglas



My suggestion is to extend PL API and each PL language should offer 
wrap or encrypt function which generate encrypted code and this code 
will be store in the pg_proc. PL language will be responsible to 
detect if it raw or crypted code. PG_Dump will dump crypted procedure 
and author is responsible keep his uncrypted version in source 
repository.
That strategy assumes that there is no way to reproduce the source form 
from the encrypted form.  That means that the PL compiler must be able 
to recognize two different languages (the unencrypted form and the 
encrypted form).  That may work for PL/pgSQL but it won't work for any 
language where a third-party compiler is involved (unless the decryption 
function produces an obfuscated source form that represents legitimate 
code for the compiler in question).


The basic problem is this: if you have to decrypt the code in order to 
give it to a compiler (PL/pgSQL, Java, Perl, ...) then there is a point 
in time where the source code is in plaintext form - it would be trivial 
to add an fprintf( stderr, "%s", plainTextForm ) to the PL handler to 
steal the code.


  -- Korry

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [HACKERS] comunication protocol

2007-08-09 Thread Omar Bettin




Merlin Moncure wrote:

On 8/9/07, Omar Bettin <[EMAIL PROTECTED]> wrote:
  

Merlin Moncure wrote:


AFAIK, the fastest possible way to get data off the server, skipping
all data and text processing is to write a SPI routine, and stream the
data out locally to the server.   I am doing exactly this in a
particular problem that requires high performance and I can tell you
that SPI is fast.

  

Interesting,do you thing that is possible to implement some send()
in the interface?



SPI is an interface which allows you to make sql calls from C code
(PostgreSQL allows you to link C code compiled as a .so to the server
and call -- see numerous examples in contrib).  The routine you need
to exploit is SPI_getbinval which gives you Datum (essentially a
variant) pointing to the internal binary representation of your field.
 In theory you could collect the data into a buffer and send() it off
although thats a lot of work IMO.  Also, I would only advise this for
fast dumps from a single table (no joins, etc).
  

why not joins?
  

   I have tried ZeosLib and for me is unusable (too slow), I use a
strong modified PostgresDAC.



I'm suprised -- I know the ZeosLib internals and it's a very thin
layer over libpq.  Here is what I suggest:

* turn on statement logging in the server (set log_min_duration_statement)
* make some operations in the app which you suggest are slow  -- they
will show up in the log
* 'explain analyze' the query from the psql console

make note of the times and post back (maybe move this thread to the
-performance list)

merlin

  

So, you aren't agree with the compression...

I have sow the network statistics and in some cases, the network traffic 
is very big.
Probably a better setup could increase the performance by a few points 
percent but

I think a compressed protocol could increase the communication by 2/3 times.

I think the biggest bottleneck in the whole system is just that.

:..try to get a query from a remote server with a 56k modem! :)

Omar


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

  


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig



On Aug 9, 2007, at 5:03 PM, Greg Smith wrote:


On Thu, 9 Aug 2007, Andrew Dunstan wrote:

There are also some fairly impressive code obfuscators about, that  
your clients might find useful.


All they really need is to find a sufficiently clever PL/Perl  
programmer.



we should make this a PL/Brainfuck implementation then ;)

hans


--
Cybertec Geschwinde & Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Decibel!

On Thu, Aug 09, 2007 at 04:34:48PM +0200, Peter Eisentraut wrote:
> Am Donnerstag, 9. August 2007 16:09 schrieb Hans-Juergen Schoenig:
> > the idea is basically to hide codes - many companies want that and ?
> > ask for it again and again.
> 
> If you want to design a security feature, you need to offer a threat and risk 
> analysis, not just the whining of customers.

This isn't as much a security feature as a means for people to keep
their code private. To some degree you could just get away with storing
nothing but a parsed representation of the code, though of course
someone could always decompile that.

There's a non-trivial amount of work involved in handling key
management, etc, so I think we don't want to try and build that in. What
would be interesting is providing hooks for en/decrypting function code
and having the backend call those hooks as appropriate. That should
allow someone to develop the ability to encrypt the code in the database
outside of the backend.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgpZN0BTyzThA.pgp
Description: PGP signature

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Greg Smith


On Thu, 9 Aug 2007, Andrew Dunstan wrote:

There are also some fairly impressive code obfuscators about, that your 
clients might find useful.


All they really need is to find a sufficiently clever PL/Perl programmer.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig



On Aug 9, 2007, at 4:47 PM, korry.douglas wrote:



the idea is basically to hide codes - many companies want that and  
ask for it again and again.

Hide code from who (or is that whom?)?



the code should be hidden from the guy who is actually executing the  
function.

so:
some user is doing: select func();

the backend loads the keys from PGDATA, decrypts the codes executes  
them. as a normal user cannot look "into" the backend the code is safe.

the keys are only visible to the sysadmis but not at SQL level.




The PL compiler(s) will need to decrypt the code.



no, the backend will pass the decrypted codes to the call handler.
there is no need for Perl, Python or so to be aware of this issue.


If a compiler can decrypt it, then anyone can decrypt it (because  
the compilers are open-source).
And the problem is that any user that can run a function must be  
able to compile that function, and therefore, any user that can run  
a function must have the decryption key for that function.  So, I'm  
not sure you've secured the source code from any user that can run  
the function.



why that? the backend is doing the job. the user does not pass the  
keys. it is a database internal thing. the only idea is to make sure  
that pg_proc does not contain user readable code.





Of course, if your goal is to hide the code from someone snooping  
through the pg_proc relation (on disk), then encryption will  
certainly help (provided the key is properly protected).





no, somebody who has access to the filesystem is not relevant.
just think of shipping some database inside some router or inside  
some other stuff. the vendor just wants to make sure that other  
people don't fully understand the "magic" going on.


hans


--
Cybertec Geschwinde & Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

Re: [HACKERS] GUC for default heap fillfactor

2007-08-09 Thread Decibel!

On Thu, Aug 09, 2007 at 09:57:48AM +0900, ITAGAKI Takahiro wrote:
> > If HOT gets into 8.3, we might need a GUC to set database wide heap
> > fillfactor to an appropriate value.
> 
> I have no objection to do that, but we will need other default values soon,
> something like default_[heap|btree|hash|gist|gin]_fillfactor. Some of us
> might feel it is mess to add random guc variables.

I think we'd want to handle indexes with a different mechanism, probably
one that makes changes to pg_am.

In any case, how important is it to do this before 8.3? We were supposed
to release this month, after all.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgpkDxwB7dNwA.pgp
Description: PGP signature

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Zdenek Kotala


Hans-Juergen Schoenig napsal(a):
the idea is basically to hide codes - many companies want that and ask 
for it again and again.


i would suggest keys to reside in $PGDATA. we do this for SSL and so 
already.


initdb could create such keys so that they are unique to every database 
instance.
decrypting could be avoided as much as possible basically we should just 
decrypt on first all and when it changes.


But, Companies want to hide code also because they distribute their software. If 
you store key somewhere on server, user will be able to decrypt the original 
code.  If I remember correctly Oracle wrap generates something like bytecode and 
each Oracle installation is able to understand them. But It is not possible 
decode it back to original form.



My suggestion is to extend PL API and each PL language should offer wrap or 
encrypt function which generate encrypted code and this code will be store in 
the pg_proc. PL language will be responsible to detect if it raw or crypted 
code. PG_Dump will dump crypted procedure and author is responsible keep his 
uncrypted version in source repository.



Zdenek

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] HOT patch, missing things

2007-08-09 Thread Decibel!

On Thu, Aug 09, 2007 at 01:25:14PM +0100, Gregory Stark wrote:
> "Pavan Deolasee" <[EMAIL PROTECTED]> writes:
> 
> > HOT update is feasible iff
> >
> > - old and new tuples, both match the partiality condition OR
> > - old and new tuples, both don't match the condition
> ...
> > For functional index, we should apply the function to the old and new
> > tuple and compare the outcome. If the results are same, HOT update
> > is feasible.
> 
> This is debatable. We could compare the columns used in the partial condition
> expression or function expression directly. If they're the same then the
> expression or function must return the same value. If the function is quite
> expensive then that might be cheaper. 
> 
> On the other hand if it's not expensive and the columns change frequently but
> the results don't then we might be doing a lot of work for nothing.

If we're going to get this into 8.3 I think we should be leaning towards
whatever is the simplest way to do it...
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgpPF8E4Vwj64.pgp
Description: PGP signature

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread korry.douglas



the idea is basically to hide codes - many companies want that and ask 
for it again and again.
Hide code from who (or is that whom?)?  The PL compiler(s) will need to 
decrypt the code.  If a compiler can decrypt it, then anyone can decrypt 
it (because the compilers are open-source). 

And the problem is that any user that can run a function must be able to 
compile that function, and therefore, any user that can run a function 
must have the decryption key for that function.  So, I'm not sure you've 
secured the source code from any user that can run the function.


Of course, if your goal is to hide the code from someone snooping 
through the pg_proc relation (on disk), then encryption will certainly 
help (provided the key is properly protected).


Note: I think source-code encryption is a useful idea, it just seems 
tricky to implement in an open environment.


   -- Korry



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig



On Aug 9, 2007, at 4:34 PM, Peter Eisentraut wrote:


Am Donnerstag, 9. August 2007 16:09 schrieb Hans-Juergen Schoenig:

the idea is basically to hide codes - many companies want that and
ask for it again and again.


If you want to design a security feature, you need to offer a  
threat and risk

analysis, not just the whining of customers.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/




well, the complete analysis is easy - the solution is not.
currently we have basically no option to reduce access to the system  
tables. this would be hard anyway as we need those tables for  
basically all kinds of operations.
the problem here is that vendors of appliances don't want people to  
spider their codes. this is a fact - it is not the idea of open  
source to do so but bloody reality. in addition to that people are  
not willing to code everything in C just to hide.
so, there has to be a concept to achieve this for stored procedures  
somehow.
i am afraid the source level encryption is the easiest thing and most  
understandable thing to do.


so, better ideas are welcome.

hans



--
Cybertec Geschwinde & Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Andrew Dunstan



Wouldn't it be better just to hide prosrc from the users you want to 
hide it from? (Note: my SOC student is working on column level privs, 
although unfortunately his work doesn't currently work with SELECT 
privs). In fact, in many cases, depending on the app, you can simply 
remove SELECT priv from public on pg_proc.


There are also some fairly impressive code obfuscators about, that your 
clients might find useful. Of course, debugging the obfuscated code is a 
nightmare, but that's their affair if they want to use it ;-)


cheers

andrew

Hans-Juergen Schoenig wrote:
the idea is basically to hide codes - many companies want that and ask 
for it again and again.


i would suggest keys to reside in $PGDATA. we do this for SSL and so 
already.


initdb could create such keys so that they are unique to every 
database instance.
decrypting could be avoided as much as possible basically we should 
just decrypt on first all and when it changes.

for pg_dump i would suggest two options:
a.) pass the keys to dump in a decrypted way
b.) dump in encrypted way.

i would think that this is a quite valuable features. would be nice to 
have it.
maybe we can agree on a nice mechanism here which will be implemented 
then.


hans



On Aug 9, 2007, at 3:57 PM, Andrew Dunstan wrote:




Hans-Juergen Schoenig wrote:

hello everybody,

one of our customers wants to store the code of interpreted 
procedures (PL/pgSQL, PL/Perl) and so in an encrypted way.
so the idea we had to add one more column to pg_proc telling us 
whether prosrc is encrypted or not. people could chose then whether 
to crypt codes there or not (speed of decryption can be an issue).


should not be hard to implement ...

what do people think about this feature?




Perhaps you could give us a justification for it. Are you intending 
to have stored procs contain security sensitive information? Or is 
this an attempt to hide closed source code from prying eyes? Where 
would the encryption keys be stored? And how would it work with pg_dump?


This doesn't sound very well thought out, frankly.

cheers

andrew

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate




--
Cybertec Geschwinde & Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at 




---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Peter Eisentraut

Am Donnerstag, 9. August 2007 16:09 schrieb Hans-Juergen Schoenig:
> the idea is basically to hide codes - many companies want that and  
> ask for it again and again.

If you want to design a security feature, you need to offer a threat and risk 
analysis, not just the whining of customers.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig

the idea is basically to hide codes - many companies want that and  
ask for it again and again.


i would suggest keys to reside in $PGDATA. we do this for SSL and so  
already.


initdb could create such keys so that they are unique to every  
database instance.
decrypting could be avoided as much as possible basically we should  
just decrypt on first all and when it changes.


for pg_dump i would suggest two options:
a.) pass the keys to dump in a decrypted way
b.) dump in encrypted way.

i would think that this is a quite valuable features. would be nice  
to have it.
maybe we can agree on a nice mechanism here which will be implemented  
then.


hans



On Aug 9, 2007, at 3:57 PM, Andrew Dunstan wrote:




Hans-Juergen Schoenig wrote:

hello everybody,

one of our customers wants to store the code of interpreted  
procedures (PL/pgSQL, PL/Perl) and so in an encrypted way.
so the idea we had to add one more column to pg_proc telling us  
whether prosrc is encrypted or not. people could chose then  
whether to crypt codes there or not (speed of decryption can be an  
issue).


should not be hard to implement ...

what do people think about this feature?




Perhaps you could give us a justification for it. Are you intending  
to have stored procs contain security sensitive information? Or is  
this an attempt to hide closed source code from prying eyes? Where  
would the encryption keys be stored? And how would it work with  
pg_dump?


This doesn't sound very well thought out, frankly.

cheers

andrew

---(end of  
broadcast)---

TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate




--
Cybertec Geschwinde & Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Andrew Dunstan




Hans-Juergen Schoenig wrote:

hello everybody,

one of our customers wants to store the code of interpreted procedures 
(PL/pgSQL, PL/Perl) and so in an encrypted way.
so the idea we had to add one more column to pg_proc telling us 
whether prosrc is encrypted or not. people could chose then whether to 
crypt codes there or not (speed of decryption can be an issue).


should not be hard to implement ...

what do people think about this feature?




Perhaps you could give us a justification for it. Are you intending to 
have stored procs contain security sensitive information? Or is this an 
attempt to hide closed source code from prying eyes? Where would the 
encryption keys be stored? And how would it work with pg_dump?


This doesn't sound very well thought out, frankly.

cheers

andrew

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread hubert depesz lubaczewski

On Thu, Aug 09, 2007 at 03:39:06PM +0200, Hans-Juergen Schoenig wrote:
>so the idea we had to add one more column to pg_proc telling us whether
>prosrc is encrypted or not. people could chose then whether to crypt codes
>there or not (speed of decryption can be an issue).
>should not be hard to implement ...
>what do people think about this feature?

how would you decrypt the source before using the function?

depesz

-- 
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

[HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Hans-Juergen Schoenig


hello everybody,

one of our customers wants to store the code of interpreted  
procedures (PL/pgSQL, PL/Perl) and so in an encrypted way.
so the idea we had to add one more column to pg_proc telling us  
whether prosrc is encrypted or not. people could chose then whether  
to crypt codes there or not (speed of decryption can be an issue).


should not be hard to implement ...

what do people think about this feature?

many thanks,

hans


--
Cybertec Geschwinde & Schönig GmbH
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at

Re: [HACKERS] HOT patch, missing things

2007-08-09 Thread Gregory Stark

"Pavan Deolasee" <[EMAIL PROTECTED]> writes:

> HOT update is feasible iff
>
> - old and new tuples, both match the partiality condition OR
> - old and new tuples, both don't match the condition
...
> For functional index, we should apply the function to the old and new
> tuple and compare the outcome. If the results are same, HOT update
> is feasible.

This is debatable. We could compare the columns used in the partial condition
expression or function expression directly. If they're the same then the
expression or function must return the same value. If the function is quite
expensive then that might be cheaper. 

On the other hand if it's not expensive and the columns change frequently but
the results don't then we might be doing a lot of work for nothing.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] comunication protocol

2007-08-09 Thread Merlin Moncure

On 8/9/07, Omar Bettin <[EMAIL PROTECTED]> wrote:
> Merlin Moncure wrote:
> > AFAIK, the fastest possible way to get data off the server, skipping
> > all data and text processing is to write a SPI routine, and stream the
> > data out locally to the server.   I am doing exactly this in a
> > particular problem that requires high performance and I can tell you
> > that SPI is fast.
> >
> Interesting,do you thing that is possible to implement some send()
> in the interface?

SPI is an interface which allows you to make sql calls from C code
(PostgreSQL allows you to link C code compiled as a .so to the server
and call -- see numerous examples in contrib).  The routine you need
to exploit is SPI_getbinval which gives you Datum (essentially a
variant) pointing to the internal binary representation of your field.
 In theory you could collect the data into a buffer and send() it off
although thats a lot of work IMO.  Also, I would only advise this for
fast dumps from a single table (no joins, etc).

>I have tried ZeosLib and for me is unusable (too slow), I use a
> strong modified PostgresDAC.

I'm suprised -- I know the ZeosLib internals and it's a very thin
layer over libpq.  Here is what I suggest:

* turn on statement logging in the server (set log_min_duration_statement)
* make some operations in the app which you suggest are slow  -- they
will show up in the log
* 'explain analyze' the query from the psql console

make note of the times and post back (maybe move this thread to the
-performance list)

merlin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Function structure in formatting.c

2007-08-09 Thread Brendan Jurd

On 8/9/07, Jaime Casanova <[EMAIL PROTECTED]> wrote:
>
> take your time, this seems like it will be for 8.4 anyway
>

I hear you, unfortunately "taking my time" usually means I forget
about it for eight months and by the time I come back to it I've
forgotten what I was doing =)

I wasn't really expecting this to make it into 8.3.  I just need to
get it done so I can free up the headspace for other projects.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] HOT patch, missing things

2007-08-09 Thread Simon Riggs

On Thu, 2007-08-09 at 15:46 +0530, Pavan Deolasee wrote:
> 
> 
> On 8/9/07, Simon Riggs <[EMAIL PROTECTED]> wrote:
> 
> Whether I got the exact details of frugging & depruning
> correct or not:
> if a tuple version is removed, then VACUUM doesn't need to
> remove it
> later, so any non-VACUUM removal of rows must defer a VACUUM.
> 
> 
> 
> ISTM that you are worried about the cases where a tuple is HOT updated
> and hence can be pruned/defragged, but only if we revisit the page at
> a later time.
> 
> What if we just track the amount of potentially dead space in the
> relation 
> (somebody had suggested that earlier in the thread) ? Every committed
> UPDATE/DELETE and aborted UPDATE/INSERT would increment
> the dead space. Whenever page fragmentation is repaired, either during
> normal operation or during vacuum, the dead space is reduced by the 
> amount of reclaimed space. Autovacuum triggers whenever the percentage
> of dead space increases beyond a threshold.
> 
> We can some fine tuning to track the space consumed by redirect-dead
> line pointers.

Sounds great.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] HOT patch, missing things

2007-08-09 Thread Pavan Deolasee

On 8/9/07, Simon Riggs <[EMAIL PROTECTED]> wrote:
>
>
> Whether I got the exact details of frugging & depruning correct or not:
> if a tuple version is removed, then VACUUM doesn't need to remove it
> later, so any non-VACUUM removal of rows must defer a VACUUM.
>
>

ISTM that you are worried about the cases where a tuple is HOT updated
and hence can be pruned/defragged, but only if we revisit the page at
a later time.

What if we just track the amount of potentially dead space in the relation
(somebody had suggested that earlier in the thread) ? Every committed
UPDATE/DELETE and aborted UPDATE/INSERT would increment
the dead space. Whenever page fragmentation is repaired, either during
normal operation or during vacuum, the dead space is reduced by the
amount of reclaimed space. Autovacuum triggers whenever the percentage
of dead space increases beyond a threshold.

We can some fine tuning to track the space consumed by redirect-dead
line pointers.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

Re: [HACKERS] default_text_search_config and expression indexes

2007-08-09 Thread Oleg Bartunov


On Thu, 9 Aug 2007, [EMAIL PROTECTED] wrote:


Maybe I'm missing something, but it seems to me that the configuration
is more attached to a column/index thatn to the whole database. If
there's a default in an expression, I'd rather expect this default to be
drawn from the index involved than from a global value (like a functional
index does now).


I'm tired to repeat - index itself doesn't know about configuration !
It just index tsvector data type. tsvector in turn can be obtained
using various ways:
1. manually
2. to_tsvector

tsvector can be stored in a separate attribute or be fully virtual
like in expressional index.

Moreover, tsvector can be obtained using various configurations depending
on your application.


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] comunication protocol

2007-08-09 Thread Omar Bettin




Gustavo Tonini wrote:

Em Quarta 08 Agosto 2007 12:02, Omar Bettin escreveu:
  

Hi,
I have installed postgresql to my client as a server for a progam 
(libpq.dll/VCL based) that I wrote for them.



Borland VCL? What component are you using?

Gustavo.
  

  I use a strong modified PostgresDAC component.

 Omar


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] comunication protocol

2007-08-09 Thread Omar Bettin




Merlin Moncure wrote:

On 8/8/07, Omar Bettin <[EMAIL PROTECTED]> wrote:
  

Hi,
I have installed postgresql to my client as a server for a progam
(libpq.dll/VCL based) that I wrote for them.

Every is working fine, but I noted some general slowness,  compared with an
older database system (iAnywhere ADS)
due (I think) to the  text based communication protocol.



you should maybe report a couple of specific things (explain analyze,
etc) for analysis and make sure your expectations are reasonable.  It
is possible simple configuration issues or query changes might be the
answer here, then again, maybe not.
  
   I think this is not necessary because I have used the same 
application with two different servers and you can see the speed 
differences just using it.
  

I know there is the possibility to adopt a ssl compressed connection but I
think a pure compressed connections could be better.



I think you are looking in the wrong direction here.

  

So, I have studied the postgresql sources and I have tried to implement some
compression between the backend and the frontend,
using pglz_compress/pglz_decompress on be-secure.c and fe-secure.c.

At the moment is working good on a local configuration, got some problems on
a remote connection due I think a different way to communicate.



AFAIK, the fastest possible way to get data off the server, skipping
all data and text processing is to write a SPI routine, and stream the
data out locally to the server.   I am doing exactly this in a
particular problem that requires high performance and I can tell you
that SPI is fast.

http://developer.postgresql.org/pgdocs/postgres/spi-examples.html

That way you will bypass the protocol completely.  On my computer, I
get roughly 300k records/sec raw read performance using libpq and
about 1.3m records sec using a hacked SPI and streaming to disk.  This
may not be helpful for your application but if you are exploring ways
to bypass protocol overhead this is where I would start.
  
   Interesting,do you thing that is possible to implement some send() 
in the interface?

By the way, your problem might be the VCL driver you are using to
access the database.  The highest performance driver I have used
(which wraps libpq) is the Zeos library which is very fast.

  
  I have tried ZeosLib and for me is unusable (too slow), I use a 
strong modified PostgresDAC.

There are other ways (other than be-secure and fe-secure) with which the
backend comunicate with the frontend?
And, do you think this solution could speed up something?



Once again, I would start by looking at your application and posting
here to make sure you are looking at the right bottlenecks (you
_suspect_ the protocol is the problem, but is it really?).

this means:
* explain analyze/queries (w/how fast you think it should be going)
* relevant .conf settings
* time measurements from the app

merlin

  
   I just switch form Application1 (IAnywhere Ads) to Application2 
(Postgresql) optimizing the VCL (strong modifications to PostgresDAC 
sources) and sow the results.
   My application needs to full open some tables and with this protocol 
is like to download a long text file.



   omar

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [HACKERS] default_text_search_config and expression indexes

2007-08-09 Thread tomas

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Thu, Aug 09, 2007 at 02:36:41AM -0400, Bruce Momjian wrote:
> Oleg Bartunov wrote:
> > > Yea, seems more work than is necessary.  If we require the configuration
> > > to be always supplied, and document that mismatches are a problem, I
> > > think we are in good shape.
> > 
> > We should agree that all you describe is only for DUMMY users. 
> > >From authors point of view I dislike your approach to treat text searching 
> > as a very limited tool [...]

[...]

> I am glad we are moving this interface discussion forward.  It seems 
> Heikki has similar concerns about the interface being error-prone.
> 
> It would be nice to have a novice and advanced interface, but we would
> have to document both, and then that is going to be confusing for users.
> 
> As I see it, specifying the configuration name in every function call is
> the novice interface, and avoids the most common errors.  I can see
> defaulting the interface name as being an advanced interface, but I
> don't think it has enough of a feature to be worth documenting and
> implementing.
> 
> If we figure out something better in 8.4 we can implement it, but at
> this point I can't think of any good solution to not specifying the
> configuration name every time.

Maybe I'm missing something, but it seems to me that the configuration
is more attached to a column/index thatn to the whole database. If
there's a default in an expression, I'd rather expect this default to be
drawn from the index involved than from a global value (like a functional
index does now).

Regards
- -- tomás
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFGuuAoBcgs9XrR2kYRAqiiAJsFL+Iu/b/xYaLza5ozmi839Qh5awCeOp+f
SZHKDPUHZ3u99XzLBn2ZKjw=
=twEt
-END PGP SIGNATURE-


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] HOT patch, missing things

2007-08-09 Thread Pavan Deolasee

On 8/7/07, Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
>
> There's three things clearly missing in the patch:
>
> 1. HOT updates on tables with expression or partial indexes. Hasn't been
> done yet because it should be pretty straightforward and we've had more
> important things to do. Though not critical, should be finished before
> release in my opinion.
>
>

I started with this. ISTM to support partial indexes, we must check
the old and new tuple against partiality match.

HOT update is feasible iff

- old and new tuples, both match the partiality condition OR
- old and new tuples, both don't match the condition

In either case, we either had an index entry which can serve for
the new tuple OR we did not have an index entry for the old tuple,
but neither the new tuple needs it.

Of course, we still need to apply all other criteria to finally decide
whether to do HOT or COLD update.

For functional index, we should apply the function to the old and new
tuple and compare the outcome. If the results are same, HOT update
is feasible.

We still need to think about the best way to do this without any
modularity invasion and least possible overhead, but can anybody
see any issue with the broader approach ?

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

[HACKERS] Compilation of pg 7.4.17 fails on HP-UX

2007-08-09 Thread Adrian Maier

Hello,

I have just tried to compile postgresql 7.4.17  on a HP-UX 11.11 box
(PA-RISC)   and the compliation fails in gist.

make[4]: Entering directory
`/home/adrian/postgresql-7.4.17/src/backend/access/gist'
gcc -O2 -fno-strict-aliasing -Wall -Wmissing-prototypes
-Wmissing-declarations -I../../../../src/include
-D_XOPEN_SOURCE_EXTENDED   -c -o gist.o gist.c
/var/tmp//ccxYASP0.s: Assembler messages:
/var/tmp//ccxYASP0.s:1528: Error: Field not properly aligned [8] (52).
/var/tmp//ccxYASP0.s:1528: Error: Invalid operands
/var/tmp//ccxYASP0.s:1531: Error: Field not properly aligned [8] (44).
/var/tmp//ccxYASP0.s:1531: Error: Invalid operands
/var/tmp//ccxYASP0.s:1534: Error: Field not properly aligned [8] (60).
/var/tmp//ccxYASP0.s:1534: Error: Invalid operands
make[4]: *** [gist.o] Error 1
make[4]: Leaving directory
`/home/adrian/postgresql-7.4.17/src/backend/access/gist'
make[3]: *** [gist-recursive] Error 2
make[3]: Leaving directory `/home/adrian/postgresql-7.4.17/src/backend/access'
make[2]: *** [access-recursive] Error 2
make[2]: Leaving directory `/home/adrian/postgresql-7.4.17/src/backend'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/home/adrian/postgresql-7.4.17/src'
make: *** [all] Error 2

This doesn't happen in HEAD.


$ uname -a
HP-UX hpdev B.11.11 U 9000/800 822196241 unlimited-user license

$ gcc --version
gcc (GCC) 3.3.3


Cheers,
Adrian Maier

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Unexpected VACUUM FULL failure

2007-08-09 Thread Simon Riggs

On Wed, 2007-08-08 at 23:23 -0400, Tom Lane wrote:
> I wrote:
> > ... Since we've whacked the tqual.c logic around recently,
> > the problem might actually lie there...
> 
> In fact, I bet this is a result of the async-commit patch.  The places
> where vacuum.c bleats "HEAP_MOVED_OFF was expected" are all places where
> it is looking at a tuple not marked XMIN_COMMITTED; it expects that
> after its first pass over the table, *every* tuple is either
> XMIN_COMMITTED or one that it moved.  Async commit changed tqual.c
> so that tuples that are in fact known committed might not get marked
> XMIN_COMMITTED right away.  The patch tries to prevent this from
> happening within VACUUM FULL by means of
> 
> /* 
>  * VACUUM FULL assumes that all tuple states are well-known prior to
>  * moving tuples around --- see comment "known dead" in repair_frag(),
>  * as well as simplifications in tqual.c.  So before we start we must
>  * ensure that any asynchronously-committed transactions with changes
>  * against this table have been flushed to disk.  It's sufficient to do
>  * this once after we've acquired AccessExclusiveLock.
>  */
> XLogAsyncCommitFlush();
> 
> but I bet lunch that that's not good enough.  I still haven't reproduced
> it, but I'm thinking that the inexact bookkeeping that we created for
> clog page LSNs allows tuples to not get marked if the right sort of
> timing of concurrent transactions happens.
> 
> Not sure about the best solution for this.

Good hunch. I plugged this hole earlier, but on further inspection I can
see the plug wasn't wide enough. XLogAsyncCommitFlush() is good enough,
but HeapTupleSatisfiesVacuum() still allowed the inexact bookkeeping to
sometimes skip hint bit setting, when executed with concurrent
transactions touching other tables.

ISTM that if we call HeapTupleSatisfiesVacuum() with an additional
boolean parameter, force, we can tell VF to always set the hint bits in
every case, not just HEAP_MOVED_IN and HEAP_MOVED_OUT.

Patch enclosed, but a little crufty. Gotta run now, talk later.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com
Index: src/backend/commands/vacuum.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/commands/vacuum.c,v
retrieving revision 1.354
diff -c -r1.354 vacuum.c
*** src/backend/commands/vacuum.c	1 Aug 2007 22:45:08 -	1.354
--- src/backend/commands/vacuum.c	9 Aug 2007 07:24:41 -
***
*** 1384,1390 
  			tuple.t_len = ItemIdGetLength(itemid);
  			ItemPointerSet(&(tuple.t_self), blkno, offnum);
  
! 			switch (HeapTupleSatisfiesVacuum(tuple.t_data, OldestXmin, buf))
  			{
  case HEAPTUPLE_DEAD:
  	tupgone = true;		/* we can delete the tuple */
--- 1384,1390 
  			tuple.t_len = ItemIdGetLength(itemid);
  			ItemPointerSet(&(tuple.t_self), blkno, offnum);
  
! 			switch (HeapTupleSatisfiesVacuumFull(tuple.t_data, OldestXmin, buf))
  			{
  case HEAPTUPLE_DEAD:
  	tupgone = true;		/* we can delete the tuple */
***
*** 1998,2004 
  		break;
  	}
  	/* must check for DEAD or MOVED_IN tuple, too */
! 	nextTstatus = HeapTupleSatisfiesVacuum(nextTdata,
  		   OldestXmin,
  		   nextBuf);
  	if (nextTstatus == HEAPTUPLE_DEAD ||
--- 1998,2004 
  		break;
  	}
  	/* must check for DEAD or MOVED_IN tuple, too */
! 	nextTstatus = HeapTupleSatisfiesVacuumFull(nextTdata,
  		   OldestXmin,
  		   nextBuf);
  	if (nextTstatus == HEAPTUPLE_DEAD ||
Index: src/backend/utils/time/tqual.c
===
RCS file: /projects/cvsroot/pgsql/src/backend/utils/time/tqual.c,v
retrieving revision 1.103
diff -c -r1.103 tqual.c
*** src/backend/utils/time/tqual.c	1 Aug 2007 22:45:09 -	1.103
--- src/backend/utils/time/tqual.c	9 Aug 2007 07:24:43 -
***
*** 77,82 
--- 77,85 
  TransactionId RecentGlobalXmin = InvalidTransactionId;
  
  /* local functions */
+ static HTSV_Result HeapTupleSatisfiesVacuumInternal(HeapTupleHeader tuple, 
+ 		TransactionId OldestXmin, Buffer buffer, bool locked);
+ 
  static bool XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot);
  
  
***
*** 1045,1054 
   * deleted by XIDs >= OldestXmin are deemed "recently dead"; they might
   * still be visible to some open transaction, so we can't remove them,
   * even if we see that the deleting transaction has committed.
   */
  HTSV_Result
! HeapTupleSatisfiesVacuum(HeapTupleHeader tuple, TransactionId OldestXmin,
! 		 Buffer buffer)
  {
  	/*
  	 * Has inserting transaction committed?
--- 1048,1060 
   * deleted by XIDs >= OldestXmin are deemed "recently dead"; they might
   * still be visible to some open transaction, so we can't remove them,
   * even if we see that the deleting transaction has committed.
+  *
+  * If the heap we are che

Re: [HACKERS] HOT patch, missing things

2007-08-09 Thread Simon Riggs

On Thu, 2007-08-09 at 12:39 +0530, Pavan Deolasee wrote:

> No, pruning removes all dead tuples, irrespective of whether they are 
> HOT or COLD updated and whether they are heap-only or not. It
> handles line pointer redirection and marks
> all dead tuples as ~LP_USED. Defragging just repairs the fragmentation
> by rearranging LP_USED tuples.
> 
> 
> It's easily possible that 100% of the cold updates and deletes
> are
> removed by defragging because of HOT updates. It would be
> pointless to 
> trigger a VACUUM when it might find nothing to clear up. Its
> also
> possible that the deletes are at one end of the table and the
> updates at
> the other. So we really need to keep track of the effects of
> defragging
> dead cold updates and deletes, so they can be subtracted from
> the cold
> update + deletes.

Whether I got the exact details of frugging & depruning correct or not:
if a tuple version is removed, then VACUUM doesn't need to remove it
later, so any non-VACUUM removal of rows must defer a VACUUM. 

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] GUC for default heap fillfactor

2007-08-09 Thread Simon Riggs

On Thu, 2007-08-09 at 12:17 +0530, Pavan Deolasee wrote:

> OK. I wish to have a config parameter so that DBA can set
> the system level default and then override that (if required) for
> each table. It will also be useful for experimenting with different
> fillfactor. The only way today is either to recompile your sources 
> or change the parameter in every CREATE TABLE statement.

Why not just put an extra file into your test script that issues ALTER
TABLE statements on appropriate tables? If you run this between CREATE
and COPY you won't have to edit a thing. You'll be able to more easily
control the tables you want to set, so you'll be in less danger of
having a gain on one table wiped out by a loss on another table.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] HOT patch, missing things

2007-08-09 Thread Pavan Deolasee

On 8/8/07, Simon Riggs <[EMAIL PROTECTED]> wrote:
>
>
>
> So pruning removes dead hot updated tuples, while defragging will remove
> dead cold updated tuples and deletes, as well as rearranging space.



No, pruning removes all dead tuples, irrespective of whether they are
HOT or COLD updated and whether they are heap-only or not. It
handles line pointer redirection and marks
all dead tuples as ~LP_USED. Defragging just repairs the fragmentation
by rearranging LP_USED tuples.

It's easily possible that 100% of the cold updates and deletes are
> removed by defragging because of HOT updates. It would be pointless to
> trigger a VACUUM when it might find nothing to clear up. Its also
> possible that the deletes are at one end of the table and the updates at
> the other. So we really need to keep track of the effects of defragging
> dead cold updates and deletes, so they can be subtracted from the cold
> update + deletes.
>
> The argument for including HOT updates is weaker, but I'm still thinking
> that they should eventually cause a VACUUM to take place. A very slowly
> updated table with spread out updates might bloat a table to 200 times
> its size, even with HOT, in the worst case. Perhaps we should count each
> HOT update as 1/5th of a row for autovacuum purposes, or some other
> discounting factor?
>
>
>
I can't see how a table might bloat to 200 times its size assuming only
HOT updates and without long running transactions. And even if the table is
really bloating that way, it must be because of COLD updates and they
are discounted in triggering autovac. Am I missing something ?

Thanks,
Pavan


-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

67 matches

Mail list logo