Re: [HACKERS] HOT is applied

2007-09-21 Thread Bruce Momjian
Heikki Linnakangas wrote:
> Tom Lane wrote:
> > I've committed the HOT patch.
> 
> Thanks, much easier to work with it now that it's in.
> 
> >  I'd still like to think about whether we
> > can be smarter about when to invoke pruning, but that's a small enough
> > issue that the patch can go in without it.
> 
> Yeah. I'm doing some micro-benchmarking, and the attached test case is
> much slower with HOT. It's spending a lot of time trying to prune, only
> to find out that it can't.
> 
> Instead of/in addition to avoiding pruning when it doesn't help, maybe
> we could make HeapTupleSatisfiesVacuum cheaper.
> 
> I'm going to continue testing, this is just a heads-up that HOT as
> committed seriously hurts performance in some cases. (though one can
> argue that this test case isn't a very realistic one.)

This might be a simplistic question but if the page is +90% full and
there is a long-lived transaction, isn't Postgres going to try pruning
on each page read access?

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Improving the Performance of Full Table Updates

2007-09-21 Thread Gokulakannan Somsundaram
Hi Tom,
  Thanks for the feedback. Let me clarify my intention again.
This was thought for improving the performance of the Data Warehousing
applications

Full table updates similar to
"Update dd set n2=n2+1"

When you talked about firing triggers, i looked into the implementation of
triggers and the approach i suggested may not work fine with the Triggers.
Since we cannot hold write locks and fire triggers, triggers should get
disabled for this. But Remember in Data Warehousing applications, people
won't be usually having Row-level triggers.

My alternate suggestion would be to make this kind of update an optional one
to provide the speed up.

If a page contains 100 rows, then the current scenario takes 1 read lock +
100 write locks to complete the full table update. In the suggested
scenario, it takes one Write Lock. Also it reduces the 101 Logical I/Os to1
Logical I/O. This might provide the same kind of benefit the Bitmap Index
Scan provided.

Again there are some specific cases
a) If the tuple is locked / Concurrently being updated:
Currently we release the buffer, take a lock on the
tuple and wait for the transaction to complete in case of concurrent
updates. In the Full table update also, we will do the same.

b) If the page contains lot of deleted tuples:
 This is a tricky scenario. Say if we have 100 tuples
and we have 20% of them deleted. In the current scenario, we will find that
out during the read lock and we will not waste time in those tuples during
the write lock. But in the suggested scenario, we will be wasting time in
those with the write lock on the buffer. In order to circumvent that, we can
resort to a one read lock + one write lock combination.

Again if this full table updates are thought with the OLTP applications in
mind, then this is not at all a suitable option. This will only benefit the
people with Data Warehouses.

Expecting some more replies


Thanks,
Gokul.




On 9/20/07, Tom Lane <[EMAIL PROTECTED]> wrote:
>
> "Gokulakannan Somsundaram" <[EMAIL PROTECTED]> writes:
> > I propose to change this row-by-row approach, when it is a full table
> > update. I plan to send a extra flag(which will be set for Full table
> > Deletes/Updates). this would make the access method directly acquire the
> > exclusive lock and update the existing record.
>
> This sounds like a recipe for utter destruction of what little
> modularity and layering we've got left.  And I rather doubt it will buy
> anything interesting performance-wise.
>
> To cite just one concrete objection: surely the tuple-fetch code has got
> no business firing triggers.
>
> regards, tom lane
>


[HACKERS] Query

2007-09-21 Thread Eswar
Hi 
 
I have just installed PostgreSQL 8.1 on a Windows XP system with a 
NTFS filesystem. Everything seemed to work fine until I executed the 
command promt option from the PostgreSQL start button. 
 
I wanted to create my first database by typing "createdb mydb" and 
was asked for my password. Everytime I type it in I get the error message
"createdb: could not connect to database postgres: FATAL: password 
authentication failed for user "x" ". 
 
Another problem is that the user x in the error message is not the 
user specified in the installation (I used the default user "postgres" 
given in the installation) but my  username on my computer and it has
administrative rights.
 
I dont really know why PostgreSQL asks for my username and not the one
specified 
in the installation.   
 
 
If anybody can tell me why I get this error everytime I would greatly 
appreciate it. 

 

 

-

Thanks & Regards

V.Eswar

Senior Software Engineer

Patni Computer Systems Ltd.
Level 0, Tower III 
Cyber City, Magarpatta City Hadapsar
Pune 411 028.
Tel: +91 20 39842000, x 2057

Link : 9-621-402

 


http://www.patni.com
World-Wide Partnerships. World-Class Solutions.
_

This e-mail message may contain proprietary, confidential or legally
privileged information for the sole use of the person or entity to
whom this message was originally addressed. Any review, e-transmission
dissemination or other use of or taking of any action in reliance upon
this information by persons or entities other than the intended
recipient is prohibited. If you have received this e-mail in error
kindly delete  this e-mail from your records. If it appears that this
mail has been forwarded to you without proper authority, please notify
us immediately at [EMAIL PROTECTED] and delete this mail. 
_


Re: [HACKERS] HOT is applied

2007-09-21 Thread Pavan Deolasee
On 9/21/07, Bruce Momjian <[EMAIL PROTECTED]> wrote:
>
>
> This might be a simplistic question but if the page is +90% full and
> there is a long-lived transaction, isn't Postgres going to try pruning
> on each page read access?
>
>
The way it stands today, yes. Thats one reason why we are seeing
the performance drop in this particular case.

I liked the earlier proposed idea of storing the xid in the page header
to quickly tell us whether its worth pruning the page.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Bruce Momjian wrote:
> This might be a simplistic question but if the page is +90% full and
> there is a long-lived transaction, isn't Postgres going to try pruning
> on each page read access?

Yes :(. That's why we earlier talked about stored the xid of the oldest
deleted tuple on the page in the page header. That way we could skip the
fruitless pruning attempts until that xid < OldestXmin.

Another approach is to try to make HeapTupleSatisfiesVacuum cheaper, so
that the fruitless pruning attempts wouldn't hurt that much.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Tom Lane wrote:
> "Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
>> Tom Lane wrote:
>>> I'd still like to think about whether we
>>> can be smarter about when to invoke pruning, but that's a small enough
>>> issue that the patch can go in without it.
> 
>> Yeah. I'm doing some micro-benchmarking, and the attached test case is
>> much slower with HOT. It's spending a lot of time trying to prune, only
>> to find out that it can't.
> 
> Not sure if that's an appropriate description or not.  oprofile
> (on a dual Xeon running Fedora 6) shows me this:
> 
> ...
> samples  %symbol name
> 1070003  29.8708  LWLockAcquire
> 1015097  28.3380  LWLockRelease
> 2835147.9147  heap_page_prune
> ...
> so basically it's all about the locking.  Maybe the problem is that with
> HOT we lock the buffer too often?  heap_page_prune_opt is designed to
> not take the buffer lock unless there's a good probability of needing
> to prune, but maybe that's not working as intended.

If you look at the callgraph, you'll see that those
LWLockAcquire/Release calls are coming from HeapTupleSatisfiesVacuum ->
TransactionIdIsInProgress, which keeps trashing the ProcArrayLock. A
"if(TransactionIdIsCurrentTransactionId(xid)) return true;" check in
TransactionIdIsInProgress would speed that up, but I wonder if there's a
more general solution to make HeapTupleSatisfiesVacuum cheaper. For
example, we could cache the in-progress status of tuples.

> Shouldn't we be able to prune rows that have been inserted and deleted
> by the same transaction?  I'd have hoped to see this example use only
> one heap page ...

Well maybe, but that's a separate issue. Wouldn't we need the "snapshot
bookkeeping" we've discussed in the past, to notice that there's no
snapshot in our own backend that needs to see the tuples?

Nevertheless, the fruitless pruning attempts would still hurt us in
slightly more complex scenarios, even if we fixed the above.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] like/ilike improvements

2007-09-21 Thread Guillaume Smet
Andrew,

On 9/20/07, Andrew Dunstan <[EMAIL PROTECTED]> wrote:
> Please try the attached patch, which goes back to using a special case
> for single-byte ILIKE. I want to make sure that at the very least we
> don't cause a performance regression with the code done this release. I
> can't see an obvious way around the problem for multi-byte case -
> lower() then requires converting to and from wchar, and I don't see a
> way of avoiding calling lower(). If this is a major blocker I would
> suggest you look at an alternative to using ILIKE for your UTF8 data.

I tested your patch with latin1 and C encoding.

It's better but still slower than 8.2.

C results:
cityvox_c=# SELECT e.numeve FROM evenement e WHERE e.libgeseve LIKE
'%hocus pocus%';
 numeve

(0 rows)

Time: 113.655 ms

cityvox_c=# SELECT e.numeve FROM evenement e WHERE e.libgeseve ILIKE
'%hocus pocus%';
  numeve
---
 900024298
 87578
(2 rows)

Time: 124.829 ms

Latin1 results:
cityvox_latin1=# SELECT e.numeve FROM evenement e WHERE e.libgeseve
LIKE '%hocus pocus%';
 numeve

(0 rows)

Time: 113.207 ms

cityvox_latin1=# SELECT e.numeve FROM evenement e WHERE e.libgeseve
ILIKE '%hocus pocus%';
  numeve
---
 900024298
 87578
(2 rows)

Time: 123.163 ms

And to answer your IRC question about switching to regexp, it's even
slower than the new UTF-8 ILIKE of 8.3 so I don't think it's the way
to go :).

Regards,

--
Guillaume

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Beginning Tamil Community for Postgre

2007-09-21 Thread Heikki Linnakangas
ஆமாச்சு wrote:
> I am Sri Ramadoss from, Chennai, India interested in forming & taking up the 
> responsibility of Localising Postgre into my native language Tamil.
> 
> I am also interested in forming a Community for Postgre in TamilNadu, a State 
> of India. Tamil is spoken by around 70 million people across the globe.
> 
> Please help me as to how to take this up further and guide me. My previous 
> experiences in Localisation include KDE Localisation and GNU Website 
> Localisation. 

Translations are handled through the pgtranslation project, see
http://pgtranslation.projects.postgresql.org/contributing.html to get
started.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] like/ilike improvements

2007-09-21 Thread ITAGAKI Takahiro

"Guillaume Smet" <[EMAIL PROTECTED]> wrote:

> It's better but still slower than 8.2.

It probablly comes from 'var-varlena' feature in 8.3. Now we store
text fields in a compact format on disks and extract them on access.
It consumes some CPU cycles. If all of data are in buffer cache
and the encoding of database is single-byte encodings, the performance
of LIKE in 8.3 was 10-20% slower than 8.2 on my tests.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Improving the Performance of Full Table Updates

2007-09-21 Thread Heikki Linnakangas
Gokulakannan Somsundaram wrote:
> Again if this full table updates are thought with the OLTP applications in
> mind, then this is not at all a suitable option. This will only benefit the
> people with Data Warehouses.
> 
> Expecting some more replies

Start with profiling.

I just ran a quick oprofile run of a full-table UPDATE on a simple table
with one index, and it looks like RelationGetBufferForTuple uses 4.53%
of the CPU time. Out of that, 2.86 percentage points are spent in
ReadBuffer_common. That means that write-locking the heap pages takes at
most 4.53 - 2.86 = 1.67 % of the total CPU time.

That's the upper limit of the benefit from the scheme you're proposing.
Surely the effort would be better spent on something else. For example,
if you kept the insertion target page just pinned over the calls, which
wouldn't have the problems with triggers etc, you could save that 2.86%.
Which still isn't much. Or take a look at WAL logging. XLogInsert took
16.06% of the CPU time. Earlier tests have suggested that a big chunk of
that time is spent in CRC calculation. Alternative CRC methods have been
suggested in the past, or perhaps that could time could be offloaded to
the WAL writer process, speeding up the UPDATE on a multi-CPU server.

Also, if we're talking about data warehousing, we're talking about big
tables that don't fit in memory. That means that you're likely
bottlenecked by I/O speed, not CPU. If that's the case, saving some CPU
time makes no difference whatsoever. What would help with I/O bottleneck
is to try to make the disk footprint smaller, or make better use of the
I/O bandwidth available.

Three steps to improve throughput:

1. Identify the hardware component that's the bottleneck.
2. Profile the workload to see what's using the bottlenecked resource
the most.
3. Figure out how to make that piece of code cheaper.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] like/ilike improvements

2007-09-21 Thread Gregory Stark
"ITAGAKI Takahiro" <[EMAIL PROTECTED]> writes:

> "Guillaume Smet" <[EMAIL PROTECTED]> wrote:
>
>> It's better but still slower than 8.2.
>
> It probablly comes from 'var-varlena' feature in 8.3. Now we store
> text fields in a compact format on disks and extract them on access.
> It consumes some CPU cycles. If all of data are in buffer cache
> and the encoding of database is single-byte encodings, the performance
> of LIKE in 8.3 was 10-20% slower than 8.2 on my tests.

Hm, it does seem I missed like.c when I converted all the text operators to
avoid detoasting packed varlenas. I'll send a patch in a few minutes to do
that. I'm surprised it would have such a large effect though.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] ecpg PREPARE is not thread safe

2007-09-21 Thread Michael Meskes
On Fri, Sep 21, 2007 at 11:05:47AM +0900, ITAGAKI Takahiro wrote:
> PREPARE statements concurrently in several threads. The cause seems to
> be the global variable 'prep_stmts' in ecpg/ecpglib/prepare.c .
> It is accessed without any locks.

And it is global, right. This has to be fixed, you're right.

> I think the proper approach is 3, because server-side prepared statements
> are independent in each connection. For that matter, are there any problems

Right now the prepared statements are not considered connection
specific. I'm not sure whether the standard says anything about this.
But moving this data shoudln't be a major problem.

> Even if we have some kinds of exclusive controls, current ecpg might not
> good at prepared statements when we use multiple connections in a signle
> thread or do multiple PREPARE in multiple threads. If so, 1 and 2 are not
> correct fixes.

Sorry, I don't get this. What exactly are you talking about here?

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] TODO/exotic features/sql*net

2007-09-21 Thread Abhijit Menon-Sen
Regarding this item in the TODO:

SQL*Net listener that makes PostgreSQL appear as an Oracle database
to clients

I recently had (an unrelated) reason to look into the SQL*Net protocol
and discovered that no documentation for it is publicly available, and
reverse-engineering it is (supposedly) a violation of the Oracle user
agreement. People have wanted it for years, but Oracle (for whatever
reason) thinks it's a secret worth guarding closely.

I've discussed this item with various people, so I thought I'd mention
this for the sake of the archives.

(IMO, the TODO item should be dropped.)

-- ams, who is eternally grateful for protocol.html

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] ecpg PREPARE is not thread safe

2007-09-21 Thread ITAGAKI Takahiro

Michael Meskes <[EMAIL PROTECTED]> wrote:

> Right now the prepared statements are not considered connection
> specific. I'm not sure whether the standard says anything about this.
> But moving this data shoudln't be a major problem.
> 
> > Even if we have some kinds of exclusive controls, current ecpg might not
> > good at prepared statements when we use multiple connections in a signle
> > thread or do multiple PREPARE in multiple threads. If so, 1 and 2 are not
> > correct fixes.
> 
> Sorry, I don't get this. What exactly are you talking about here?

I'm worried that prepared statements are used in another connection.
ECPG does not consider in which connection the statements is prepared.
Are there any mix-up problem here? If no, the TSD approach is enough
to fix the race condition. If yes, per-connection approach is needed.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Merlin Moncure wrote:
> On 9/20/07, Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
>> Yeah. I'm doing some micro-benchmarking, and the attached test case is
>> much slower with HOT. It's spending a lot of time trying to prune, only
>> to find out that it can't.
>>
>> Instead of/in addition to avoiding pruning when it doesn't help, maybe
>> we could make HeapTupleSatisfiesVacuum cheaper.
>>
>> I'm going to continue testing, this is just a heads-up that HOT as
>> committed seriously hurts performance in some cases. (though one can
>> argue that this test case isn't a very realistic one.)
> 
> well, I ran your test on my box and here are the results:
> pre hot:
> run 1: 3617.641 ms
> run 2: 5195.215 ms
> run 3: 6760.449 ms
> after vacuum:
> run 1: 4171.362 ms
> run 2: 5513.317 ms
> run 3: 6884.125 ms
> post hot:
> run 1: Time: 7286.292 ms
> run 2: Time: 7477.089 ms
> run 3: Time: 7701.229 ms
> 
> those results aren't exactly terrible, and this case is highly artificial.

Your runtimes seem to be increasing as you repeat the test. Did you
remove the "DROP TABLE" from the beginning? On my laptop, post hot takes
~2x as long as pre hot, even when repeated, which matches the results of
your first runs.

Yeah, it's highly artificial.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] TODO/exotic features/sql*net

2007-09-21 Thread Peter Eisentraut
Am Freitag, 21. September 2007 schrieb Abhijit Menon-Sen:
> Regarding this item in the TODO:
>
> SQL*Net listener that makes PostgreSQL appear as an Oracle database
> to clients

> (IMO, the TODO item should be dropped.)

Yeah, if at all, this should be an external proxy server.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Merlin Moncure wrote:
> On 9/20/07, Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
>> Yeah. I'm doing some micro-benchmarking, and the attached test case is
>> much slower with HOT. It's spending a lot of time trying to prune, only
>> to find out that it can't.
>>
>> Instead of/in addition to avoiding pruning when it doesn't help, maybe
>> we could make HeapTupleSatisfiesVacuum cheaper.
>>
>> I'm going to continue testing, this is just a heads-up that HOT as
>> committed seriously hurts performance in some cases. (though one can
>> argue that this test case isn't a very realistic one.)
> 
> well, I ran your test on my box and here are the results:
> pre hot:
> run 1: 3617.641 ms
> run 2: 5195.215 ms
> run 3: 6760.449 ms
> after vacuum:
> run 1: 4171.362 ms
> run 2: 5513.317 ms
> run 3: 6884.125 ms
> post hot:
> run 1: Time: 7286.292 ms
> run 2: Time: 7477.089 ms
> run 3: Time: 7701.229 ms
> 
> those results aren't exactly terrible,

Your runtimes seem to be increasing as you repeat the test. Did you
remove the "DROP TABLE" from the beginning? On my laptop, post hot takes
~2x as long as pre hot, even when repeated, which matches the results of
your first runs.

> and this case is highly artificial.

Yeah.

I'm going through a bunch of CPU test cases a colleague of mine wrote.
Some of the test cases shows much worse performance with HOT.
Unfortunately they're quite complex and overlapping, so instead of
posting them as is, I'm going through the ones that HOT performs badly
at, and try to reduce them to simpler test cases like the above.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] Re: compiler warnings in ecpglib/execute.c (uninitialized local variable 'prepname' used)

2007-09-21 Thread Michael Meskes
On Thu, Sep 20, 2007 at 02:05:57PM +0200, Hannes Eder wrote:
> while rebuilding pgsql with msvc 2005 I noticed this compiler warning:
>
> .\src\interfaces\ecpg\ecpglib\execute.c(1495): warning C4700: uninitialized 
> local variable 'prepname' used

This free() call seems to be absolutely superfluous, so I removed it.

Fixed in CVS head.

Thanks for the report.

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Dynamically adding index types (was GIT indexes)

2007-09-21 Thread Simon Riggs
On Wed, 2007-09-19 at 10:37 -0400, Tom Lane wrote: 
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > We're able to dynamically add AMs in the way you suggest, but there is
> > no way to alter the RMgrTable to either add a new RM or re-assign one of
> > the unused RMs.
> 
> Hmmm...

> > 1. Remove the "Const" in front of RmgrTable in rmgr.c. That would allow
> > re-assignment of the two existing unused RMs.
> 
> Useless, as there's no way for an add-on AM to cause the value to be
> changed before recovery begins.

OK, sounds like the only way is to have a dedicated plug-in. 


if (resource_manager_hook)
   set RmgrTable in plugin
else
   normal static definition (but no longer Const)

...or variations of the above depending upon whether we want to allow
redefining existing Rmgrs - not something I personally want.

Plus something to say "xlog record found for unknown Rmgr".

Plus changes to all places that refer to RM_MAX_ID and replace with a
global value of RmgrTableEntries.

We can then get rid of the two reserved Rmgr values...


Will that do it?


-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Schema access in PL/PGSQL for custom objects - i.e. type access?

2007-09-21 Thread Andrew Dunstan


This appears to be a usage question. -hackers is about development of 
Postgres, not use. Please ask in the correct forum.


Also, when you do ask on the correct list, please add more detail to 
questions like this. In particular, show code that you have tried and 
that didn't work like you expected. Mailing list readers aren't 
mind-readers.


cheers

andrew

blay bloo wrote:

I was wondering how to access the schema of a type in PL/PGSQL

Basically, we've created some custom objects in the database, which
are somewhat similar to database/composite types (i.e. Create type..)-
essentially a named list of name/type pairs.

In PL/PGSQL you can define variables to be custom database types, but
when we try to do it with our custom objects we get:
ERROR: "myet is not a table"

Essentially we just want some way  to create a variable (type
instance, or even row) with the appropriate name/type pairs for OUR
objects in PL/PGSQL.

Where would we start looking to 'hack' the code for this - i.e.
how/where in pgsql can i either a) add my object to be recognised as a
'type' or b) make it seem the equivalent to a custom db type


Thanks
BB

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

  


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] HOT is applied

2007-09-21 Thread Pavan Deolasee
On 9/21/07, Tom Lane <[EMAIL PROTECTED]> wrote:
>
>
> so this example is getting past the heuristic tests in
> heap_page_prune_opt almost every time.  Why is that?  Too tired to poke
> at it more tonight.
>
>
I guess you already know the answer now, but anyways: Since we are
updating a single tuple in a single transaction, each update is preceded
by a sequential scan. All but last pages are completely full and marked
prunable, so HOT would try to (unsuccessfully) prune every (except may
be last) page.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


[HACKERS] Schema access in PL/PGSQL for custom objects - i.e. type access?

2007-09-21 Thread blay bloo
I was wondering how to access the schema of a type in PL/PGSQL

Basically, we've created some custom objects in the database, which
are somewhat similar to database/composite types (i.e. Create type..)-
essentially a named list of name/type pairs.

In PL/PGSQL you can define variables to be custom database types, but
when we try to do it with our custom objects we get:
ERROR: "myet is not a table"

Essentially we just want some way  to create a variable (type
instance, or even row) with the appropriate name/type pairs for OUR
objects in PL/PGSQL.

Where would we start looking to 'hack' the code for this - i.e.
how/where in pgsql can i either a) add my object to be recognised as a
'type' or b) make it seem the equivalent to a custom db type


Thanks
BB

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] HOT is applied

2007-09-21 Thread Pavan Deolasee
On 9/21/07, Tom Lane <[EMAIL PROTECTED]> wrote:
>
>
> It shouldn't, though --- the hint bit should get cleared on the first
> try.  I think I probably broke something in the last round of revisions
> to heap_page_prune_opt, but haven't looked yet ...



We set the hint bit (prunable) again when we see a RECENTLY_DEAD
or DELETE_IN_PROGRESS tuple. This is correct in normal circumstances
because otherwise we would never be able to prune the page.

On a second thought, may be we can choose not to set the bit again. In
the worst case, the next update on the page would be a COLD update.
That would set the bit again and we shall retry prune in the next visit to
the page.

This scheme would work as long as we don't put in mechanism to
update FSM after pruning. But if we choose to update FSM (I am
inclined to do this to avoid repeated relation extension because
of COLD updates), we would need to improve this. Otherwise a
page full of only DEAD tuples may never be pruned and vacuum
would be required to reclaim that space.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
> Bruce Momjian wrote:
>> This might be a simplistic question but if the page is +90% full and
>> there is a long-lived transaction, isn't Postgres going to try pruning
>> on each page read access?

> Yes :(

It shouldn't, though --- the hint bit should get cleared on the first
try.  I think I probably broke something in the last round of revisions
to heap_page_prune_opt, but haven't looked yet ...

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Tom Lane wrote:
> I don't much like the idea of adding an xid to the page header --- for
> one thing, *which* xid would you put there, and what would you test it
> against?

I was thinking that you would put the smallest in-progress xmax on the
page there, and you would test it against OldestXmin. If all
transactions commit, there surely isn't anything to prune until that xid
falls beyond the OldestXmin horizon. If an inserting transaction aborts,
we could prune the aborted tuple earlier, but optimizing for aborts
doesn't seem that important.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
> If you look at the callgraph, you'll see that those
> LWLockAcquire/Release calls are coming from HeapTupleSatisfiesVacuum ->
> TransactionIdIsInProgress, which keeps trashing the ProcArrayLock. A
> "if(TransactionIdIsCurrentTransactionId(xid)) return true;" check in
> TransactionIdIsInProgress would speed that up, but I wonder if there's a
> more general solution to make HeapTupleSatisfiesVacuum cheaper. For
> example, we could cache the in-progress status of tuples.

Dunno about "more general", but your idea reduces the runtime of this
example by about 50% (22.2s to 10.5s) for me.  I'm worried though that
it would be a net negative in more typical situations, especially if
you've got a lot of open subtransactions.

regards, tom lane

*** src/backend/storage/ipc/procarray.c.origSat Sep  8 16:31:15 2007
--- src/backend/storage/ipc/procarray.c Fri Sep 21 11:08:34 2007
***
*** 341,346 
--- 341,353 
return false;
}
  
+   /*
+* Also, we can detect our own transaction without any access to shared
+* memory.
+*/
+   if (TransactionIdIsCurrentTransactionId(xid))
+   return true;
+ 
/* Get workspace to remember main XIDs in */
xids = (TransactionId *) palloc(sizeof(TransactionId) * 
arrayP->maxProcs);
  

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
"Pavan Deolasee" <[EMAIL PROTECTED]> writes:
> On 9/21/07, Tom Lane <[EMAIL PROTECTED]> wrote:
>> 
>> so this example is getting past the heuristic tests in
>> heap_page_prune_opt almost every time.  Why is that?  Too tired to poke
>> at it more tonight.
>> 
> I guess you already know the answer now, but anyways: Since we are
> updating a single tuple in a single transaction, each update is preceded
> by a sequential scan. All but last pages are completely full and marked
> prunable, so HOT would try to (unsuccessfully) prune every (except may
> be last) page.

Hmm ... the problem really is that heap_page_prune turns the hint back
on when it sees a DELETE_IN_PROGRESS tuple.  Maybe that's a bad idea.

I don't much like the idea of adding an xid to the page header --- for
one thing, *which* xid would you put there, and what would you test it
against?

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Tom Lane wrote:
> "Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
>> Bruce Momjian wrote:
>>> This might be a simplistic question but if the page is +90% full and
>>> there is a long-lived transaction, isn't Postgres going to try pruning
>>> on each page read access?
> 
>> Yes :(
> 
> It shouldn't, though --- the hint bit should get cleared on the first
> try.

Think so? That would mean that you only get a single chance to prune
after an update. Not sure how big a problem that is, but I do feel that
would make HOT a lot less effective in some usage patterns.

The idea of having a "prunable xmin" in the page header sounds more and
more attractive to me...

>  I think I probably broke something in the last round of revisions
> to heap_page_prune_opt, but haven't looked yet ...

Pavan's patch was like that as well; you don't clear the flag (or
rather, you set it again while pruning) until there's nothing left to
prune on the page.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Schema access in PL/PGSQL for custom objects - i.e. type access?

2007-09-21 Thread Tom Lane
"blay bloo" <[EMAIL PROTECTED]> writes:
> I was wondering how to access the schema of a type in PL/PGSQL
> Basically, we've created some custom objects in the database, which
> are somewhat similar to database/composite types (i.e. Create type..)-
> essentially a named list of name/type pairs.

Is there a particularly good reason why you don't just use composite
types?  It sounds to me like you want to pile one hack atop another,
when you could avoid either by switching to the standard facility.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Tom Lane wrote:
> "Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
>> If you look at the callgraph, you'll see that those
>> LWLockAcquire/Release calls are coming from HeapTupleSatisfiesVacuum ->
>> TransactionIdIsInProgress, which keeps trashing the ProcArrayLock. A
>> "if(TransactionIdIsCurrentTransactionId(xid)) return true;" check in
>> TransactionIdIsInProgress would speed that up, but I wonder if there's a
>> more general solution to make HeapTupleSatisfiesVacuum cheaper. For
>> example, we could cache the in-progress status of tuples.
> 
> Dunno about "more general", but your idea reduces the runtime of this
> example by about 50% (22.2s to 10.5s) for me.  I'm worried though that
> it would be a net negative in more typical situations, especially if
> you've got a lot of open subtransactions.

Yeah. I played with this a bit more, and came up with a couple of other
micro-optimizations:

1. Instead of pallocing and pfreeing a new array in
TransactionIdIsInProgress, we could just malloc the array once and reuse
it. That palloc/pfree alone was consuming around 8% of CPU time in the
test case.

2. About ~30% of CPU time is spent in HeapTupleSatisfiesMVCC ->
TransactionIdDidAbort. This is in fact not related to HOT, but it caught
my eye while looking at the profile. We have this piece of code in
HeapTupleSatisfiesMVCC:

>   /* deleting subtransaction aborted? */
>   /* FIXME -- is this correct w.r.t. the cmax of the tuple? */
>   if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
>   {
>   SetHintBits(tuple, buffer, HEAP_XMAX_INVALID,
>   InvalidTransactionId);
>   return true;
>   }
> 
>   
> Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple)));

We've already checked that the xmin is our own transaction id, so we
check if the xmax is an aborted subtransaction of our own transaction. A
TransactionIdDidAbort call seems like an awfully expensive way to check
that. We could call TransactionIdIsCurrentTransactionId instead, which
doesn't need to access any shared memory structures (but might be
expensive if you have a lot of active subxacts, as you pointed out). Or
we could keep a list of aborted subtransactions in backend private
memory, and just check against that. In this particular case, a simple
"if(xmin==xmax)" check would be effective as well, since we know that
the xmin is not aborted at that point.

3. One more general alternative to the little patch I sent earlier would
be to build an array of in-progress-xids in TransactionIdInProgress,
like we do in GetSnapshotData, and use that for the in-progress checks
in HeapTupleSatisfiesVacuum. That should work, if we build the cached
array when we lock the page for pruning, and use it until we unlock. If
a transaction commits/abort after we build the cached array, we might
return DELETE/INSERT_IN_PROGRESS for a tuple that is in fact already
DEAD, but that's ok. Since we're holding a lock on the page, it's not
possible for new transaction to start and subsequently put it's xid on
the page where we would bump into it. I believe this would have roughly
the same performance effect in this test case as calling
TransactionIdIsCurrentTransactionId from TransactionIdIsInProgress.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
I wrote:
> Dunno about "more general", but your idea reduces the runtime of this
> example by about 50% (22.2s to 10.5s) for me.  I'm worried though that
> it would be a net negative in more typical situations, especially if
> you've got a lot of open subtransactions.

Actually ... the only way that TransactionIdIsCurrentTransactionId can
take a meaningful amount of time is if you've got lots of
subtransactions, and in that case your own subxids cache has certainly
overflowed, which is likely to force TransactionIdIsInProgress into the
"slow answer" path.  But if we use TransactionIdIsCurrentTransactionId
to handle our own xids then we can just ignore MyProc altogether inside
the loop, thus very possibly (if we are the only overflowed-subxids proc)
saving us from going through the slow answer path.  So on reflection
I can't see how this isn't a win.  I'll clean it up and apply it.

I'm also starting to come around to liking the page-header-xid field
a bit more.  I suggest that it could replace the "page is prunable"
flag bit altogether --- to mark the page prunable, you must store
some appropriate xid into the header field.  This would avoid a useless
prune attempt immediately after a page is marked prunable.

A possible problem with it, if we treat it as a non-WAL-logged hint,
is that corruption of the value could lead to a page being marked with
an xid that's way in the future, keeping it from getting pruned.
However, if the page header gets corrupted this is the least of your
worries, so it doesn't seem like an enormous objection.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] HOT is applied

2007-09-21 Thread Merlin Moncure
On 9/21/07, Heikki Linnakangas <[EMAIL PROTECTED]> wrote:
> Merlin Moncure wrote:
> > pre hot:
> > run 1: 3617.641 ms
> > run 2: 5195.215 ms
> > run 3: 6760.449 ms
> > after vacuum:
> > run 1: 4171.362 ms
> > run 2: 5513.317 ms
> > run 3: 6884.125 ms
> > post hot:
> > run 1: Time: 7286.292 ms
> > run 2: Time: 7477.089 ms
> > run 3: Time: 7701.229 ms
> >
> > those results aren't exactly terrible, and this case is highly artificial.
>
> Your runtimes seem to be increasing as you repeat the test. Did you
> remove the "DROP TABLE" from the beginning? On my laptop, post hot takes
> ~2x as long as pre hot, even when repeated, which matches the results of
> your first runs.

correct.

Well, my first round of results are so far not showing the big gains I
saw with hot in some of the earlier patches...so far, it looks
approximately to be a wash although with the reduced need to vacuum.
i'll test some more when things settle down.

merlin

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] like/ilike improvements

2007-09-21 Thread Guillaume Smet
Gregory,

On 9/21/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
> Hm, it does seem I missed like.c when I converted all the text operators to
> avoid detoasting packed varlenas. I'll send a patch in a few minutes to do
> that. I'm surprised it would have such a large effect though.

The patch doesn't seem to apply cleanly on head (I have a problem with
oracle_compat.c). I tested it though with latin1 encoding.

The LIKE case is better:
cityvox_latin1=# SELECT e.numeve FROM evenement e WHERE e.libgeseve
LIKE '%hocus pocus%';
 numeve

(0 rows)

Time: 98.995 ms

-> it seems to be as fast as 8.2 was, now.

The ILIKE case seems to go into an infinite loop: postmaster takes
100% of CPU and the query never finishes.

--
Guillaume

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Merlin Moncure wrote:
> Well, my first round of results are so far not showing the big gains I
> saw with hot in some of the earlier patches...so far, it looks
> approximately to be a wash although with the reduced need to vacuum.
> i'll test some more when things settle down.

Oh... Which version did you test earlier? What are you testing, pgbench?

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] HOT is applied

2007-09-21 Thread Heikki Linnakangas
Tom Lane wrote:
> Actually ... the only way that TransactionIdIsCurrentTransactionId can
> take a meaningful amount of time is if you've got lots of
> subtransactions, and in that case your own subxids cache has certainly
> overflowed, which is likely to force TransactionIdIsInProgress into the
> "slow answer" path.  But if we use TransactionIdIsCurrentTransactionId
> to handle our own xids then we can just ignore MyProc altogether inside
> the loop, thus very possibly (if we are the only overflowed-subxids proc)
> saving us from going through the slow answer path.  So on reflection
> I can't see how this isn't a win.  I'll clean it up and apply it.

Good point.

I noted that most callers of TransactionIdIsInProgress in tqual.c
already call TransactionIdIsCurrentTransactionId before
TransactionIdIsInProgress. In those cases we could just skip the test
for our own xids altogether, if it's worth code mangling to tell
TransactionIdIsInProgress whether it's safe to skip it or not.

> I'm also starting to come around to liking the page-header-xid field
> a bit more.  I suggest that it could replace the "page is prunable"
> flag bit altogether --- to mark the page prunable, you must store
> some appropriate xid into the header field.  This would avoid a useless
> prune attempt immediately after a page is marked prunable.

Yeah, we don't need the flag if we have an xid.

> A possible problem with it, if we treat it as a non-WAL-logged hint,
> is that corruption of the value could lead to a page being marked with
> an xid that's way in the future, keeping it from getting pruned.
> However, if the page header gets corrupted this is the least of your
> worries, so it doesn't seem like an enormous objection.

Agreed. Next attempt to prune or vacuum the page will fix it anyway.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
> Yeah. I played with this a bit more, and came up with a couple of other
> micro-optimizations:

> 1. Instead of pallocing and pfreeing a new array in
> TransactionIdIsInProgress, we could just malloc the array once and reuse
> it. That palloc/pfree alone was consuming around 8% of CPU time in the
> test case.

Good idea --- not only faster, but we can get rid of all the goto's and
the "locked" flag, if we're willing to have a couple more LWLockRelease
calls here.  I'll incorporate this in the patch I'm working up.

> We've already checked that the xmin is our own transaction id, so we
> check if the xmax is an aborted subtransaction of our own transaction. A
> TransactionIdDidAbort call seems like an awfully expensive way to check
> that. We could call TransactionIdIsCurrentTransactionId instead, which
> doesn't need to access any shared memory structures (but might be
> expensive if you have a lot of active subxacts, as you pointed out).

I like that idea too ...

> 3. One more general alternative to the little patch I sent earlier would
> be to build an array of in-progress-xids in TransactionIdInProgress,
> like we do in GetSnapshotData, and use that for the in-progress checks
> in HeapTupleSatisfiesVacuum. That should work, if we build the cached
> array when we lock the page for pruning, and use it until we unlock.

This seems like it uglifies a whole lot of APIs, though, for what's
probably not going to be a lot of gain.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] like/ilike improvements

2007-09-21 Thread Gregory Stark

"Guillaume Smet" <[EMAIL PROTECTED]> writes:

> Gregory,
>
> On 9/21/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
>> Hm, it does seem I missed like.c when I converted all the text operators to
>> avoid detoasting packed varlenas. I'll send a patch in a few minutes to do
>> that. I'm surprised it would have such a large effect though.
>
> The patch doesn't seem to apply cleanly on head (I have a problem with
> oracle_compat.c). I tested it though with latin1 encoding.

Huh, I'll check. You have updated recently right? Because Andrew's changes to
ascii and char and so on just went in very recently.

> The LIKE case is better:
> cityvox_latin1=# SELECT e.numeve FROM evenement e WHERE e.libgeseve
> LIKE '%hocus pocus%';
>  numeve
> 
> (0 rows)
>
> Time: 98.995 ms
>
> -> it seems to be as fast as 8.2 was, now.
>
> The ILIKE case seems to go into an infinite loop: postmaster takes
> 100% of CPU and the query never finishes.

Can you send me the test cases you're using? It seems to be working for me and
it passes all the regression tests (no idea if they test ilike though).

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] HOT is applied

2007-09-21 Thread Pavan Deolasee
On 9/21/07, Tom Lane <[EMAIL PROTECTED]> wrote:
>
>
>
> I'm also starting to come around to liking the page-header-xid field
> a bit more.  I suggest that it could replace the "page is prunable"
> flag bit altogether --- to mark the page prunable, you must store
> some appropriate xid into the header field.  This would avoid a useless
> prune attempt immediately after a page is marked prunable.
>
>
Agree. I was thinking of minimum of the following:

1 xmin of INSERT_IN_PROGRESS tuple
2 xmax of DELETE_IN_PROGRESS tuple
3 xmax of RECENTLY_DEAD_TUPLE

When we attempt a prune, we can set xid to the minimum of the above
three. If none of the above tuples exist in the page, xid can be set to
FrozenXid
A page need not be pruned if its xid is set to FrozenXid or is greater
than OldtestXmin.

In addition, I guess we can even set the xid also when we insert,
delete or update from the page unless its already set to a lower value.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
> We've already checked that the xmin is our own transaction id, so we
> check if the xmax is an aborted subtransaction of our own transaction. A
> TransactionIdDidAbort call seems like an awfully expensive way to check
> that. We could call TransactionIdIsCurrentTransactionId instead, which
> doesn't need to access any shared memory structures (but might be
> expensive if you have a lot of active subxacts, as you pointed out).

Applied --- seems to buy another 50% savings in the example with all the
pruning activity.  I'll get after the pruning activity itself in a bit,
but it seems worth squeezing the tqual functions first while we have
this example that stresses them so much.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Text <-> C string

2007-09-21 Thread Brendan Jurd
Hi hackers,

I've noticed that there is a lot of code, particularly in src/backend,
that goes through the motions of making a text datum into a cstring to
perform some work on it, and likewise for making a cstring into a text
datum.

Is there not a nice macro somewhere to handle this consistently?  And
if not, shouldn't there be?

I noticed a comment for StrNCpy() in src/include/c.h that seems related:

/* BTW: when you need to copy a non-null-terminated string (like a
text datum) and add a null, do not do it with StrNCpy [snip] Do it
honestly with "memcpy(dst,src,len); dst[len] = '\0'; instead."

Okay, I can see why using StrNCpy is a bad idea, but why not "Do it
honestly with TEXT_CSTRING(src, dst)", or similar?

Surely having the exact same four lines of code written out in dozens
of places is a Bad Thing, but perhaps there is some reasoning behind
this that I am missing?

Thanks for your time,
BJ

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] HOT is applied

2007-09-21 Thread Tom Lane
"Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
> I noted that most callers of TransactionIdIsInProgress in tqual.c
> already call TransactionIdIsCurrentTransactionId before
> TransactionIdIsInProgress. In those cases we could just skip the test
> for our own xids altogether, if it's worth code mangling to tell
> TransactionIdIsInProgress whether it's safe to skip it or not.

At least in this example, it seems it wouldn't buy anything.
HeapTupleSatisfiesVacuum doesn't call
TransactionIdIsCurrentTransactionId, and while HeapTupleSatisfiesMVCC
does, it falls out at that point and doesn't get to
TransactionIdIsInProgress.  So I'm not inclined to contort the API
unless we can come up with an example where it's important.

I'm off to work on the page-header-xid idea ...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Text <-> C string

2007-09-21 Thread Gregory Stark
"Brendan Jurd" <[EMAIL PROTECTED]> writes:

> Surely having the exact same four lines of code written out in dozens
> of places is a Bad Thing, but perhaps there is some reasoning behind
> this that I am missing?

The canonical way to do it is with

DatumGetCString(DirectFunctionCall1(textout, t))


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Text <-> C string

2007-09-21 Thread Brendan Jurd
On 9/22/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
> The canonical way to do it is with
>
> DatumGetCString(DirectFunctionCall1(textout, t))

Ah, I see.  Thanks.

In that case, would it be helpful if I submitted a patch for the
various code fragments that do this locally, updating them to use
DatumGetCString?

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] like/ilike improvements

2007-09-21 Thread Andrew Dunstan



Guillaume Smet wrote:

Gregory,

On 9/21/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
  

Hm, it does seem I missed like.c when I converted all the text operators to
avoid detoasting packed varlenas. I'll send a patch in a few minutes to do
that. I'm surprised it would have such a large effect though.



The patch doesn't seem to apply cleanly on head (I have a problem with
oracle_compat.c). 
  


It applied cleanly for me.

cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Text <-> C string

2007-09-21 Thread Gregory Stark

"Brendan Jurd" <[EMAIL PROTECTED]> writes:

> On 9/22/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
>> The canonical way to do it is with
>>
>> DatumGetCString(DirectFunctionCall1(textout, t))
>
> Ah, I see.  Thanks.
>
> In that case, would it be helpful if I submitted a patch for the
> various code fragments that do this locally, updating them to use
> DatumGetCString?

I would be interested in seeing just a list of such places if you have it
handy. I don't think we consider it wrong to violate the text data type
abstraction barrier like you describe though. 

I'm interested because any such code is possibly either failing to take into
account toasted data or is unnecessarily detoasting packed varlenas.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Text <-> C string

2007-09-21 Thread Brendan Jurd
On 9/22/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
> The canonical way to do it is with
>
> DatumGetCString(DirectFunctionCall1(textout, t))

I just noticed a couple of macros defined in src/include/tsearch/ts_utils.h:

#define TextPGetCString(t)
DatumGetCString(DirectFunctionCall1(textout, PointerGetDatum(t)))
#define CStringGetTextP(c) DatumGetTextP(DirectFunctionCall1(textin,
CStringGetDatum(c)))

Seems these would actually be convenient in quite a lot of places in
the backend.  Is there any downside to moving these two into
src/include/postgres.h?

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Text <-> C string

2007-09-21 Thread Brendan Jurd
Well, a couple of specific cases that I came across are
quote_identifier() in src/backend/utils/adt/quote.c, and
do_to_timestamp() in src/backend/utils/adt/formatting.c (line 3349).

I was getting a rough notion of how common the duplication was using

$ egrep -Rn -C 2 'memcpy.*VARDATA' src/backend

Not all of these are genuine duplications of textout and textin (you
have to eyeball them individually to work that out) but it's a
reasonable starting point.

The files matched under src/backend are as follows.

src/backend/libpq/be-fsstubs.c
src/backend/utils/mb/mbutils.c
src/backend/utils/adt/timestamp.c
src/backend/utils/adt/nabstime.c
src/backend/utils/adt/xml.c
src/backend/utils/adt/quote.c
src/backend/utils/adt/oracle_compat.c
src/backend/utils/adt/varchar.c
src/backend/utils/adt/ruleutils.c
src/backend/utils/adt/varlena.c
src/backend/utils/adt/tsginidx.c
src/backend/utils/adt/cash.c
src/backend/utils/adt/date.c
src/backend/utils/adt/genfile.c
src/backend/utils/adt/network.c
src/backend/utils/adt/selfuncs.c
src/backend/utils/adt/formatting.c
src/backend/utils/adt/version.c
src/backend/utils/adt/pgstatfuncs.c
src/backend/access/heap/tuptoaster.c
src/backend/access/common/heaptuple.c
src/backend/storage/large_object/inv_api.c
src/backend/executor/execQual.c
src/backend/catalog/pg_conversion.c

On 9/22/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
>
> "Brendan Jurd" <[EMAIL PROTECTED]> writes:
>
> > On 9/22/07, Gregory Stark <[EMAIL PROTECTED]> wrote:
> >> The canonical way to do it is with
> >>
> >> DatumGetCString(DirectFunctionCall1(textout, t))
> >
> > Ah, I see.  Thanks.
> >
> > In that case, would it be helpful if I submitted a patch for the
> > various code fragments that do this locally, updating them to use
> > DatumGetCString?
>
> I would be interested in seeing just a list of such places if you have it
> handy. I don't think we consider it wrong to violate the text data type
> abstraction barrier like you describe though.
>
> I'm interested because any such code is possibly either failing to take into
> account toasted data or is unnecessarily detoasting packed varlenas.
>
> --
>   Gregory Stark
>   EnterpriseDB  http://www.enterprisedb.com
>
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings
>

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Open issues for HOT patch

2007-09-21 Thread Alvaro Herrera
Gregory Stark wrote:
> 
> "Heikki Linnakangas" <[EMAIL PROTECTED]> writes:
> 
> > There is one wacky idea I haven't dared to propose yet:
> >
> > We could lift the limitation that you can't defragment a page that's
> > pinned, if we play some smoke and mirrors in the buffer manager. When
> > you prune a page, make a *copy* of the page you're pruning, and keep
> > both versions in the buffer cache. Old pointers keep pointing to the old
> > version. Any new calls to ReadBuffer will return the new copy, and the
> > old copy can be dropped when its pin count drops to zero.
> 
> Fwiw when Heikki first mentioned this idea I thought it was the craziest thing
> I ever heard. But the more I thought about it the more I liked it. I've come
> to the conclusion that while it's a wart, it's not much worse than the wart of
> the super-exclusive lock which it replaces. In fact it's arguably cleaner in
> some ways.

I read this as very similar to RCU.  In some scenarios it makes a lot of
sense.  I don't think it's a thing to be attacked in 8.3 though, since
it is a big change to the bufmgr.

-- 
Alvaro Herrerahttp://www.advogato.org/person/alvherre
"The Postgresql hackers have what I call a "NASA space shot" mentality.
 Quite refreshing in a world of "weekend drag racer" developers."
(Scott Marlowe)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Text <-> C string

2007-09-21 Thread Tom Lane
"Brendan Jurd" <[EMAIL PROTECTED]> writes:
> I just noticed a couple of macros defined in src/include/tsearch/ts_utils.h:

> #define TextPGetCString(t)
> DatumGetCString(DirectFunctionCall1(textout, PointerGetDatum(t)))
> #define CStringGetTextP(c) DatumGetTextP(DirectFunctionCall1(textin,
> CStringGetDatum(c)))

> Seems these would actually be convenient in quite a lot of places in
> the backend.  Is there any downside to moving these two into
> src/include/postgres.h?

I think if you look around you'll find several similar things in various
contrib modules.  It would make some sense to try to unify all this.
I'm not particularly for making it macros in postgres.h though ---
that's no help if the macros require referencing stuff in builtins.h.

On grounds of code-space savings I think it might be worth making
these things be simple functions declared in builtins.h; that would
also make it much easier to change their implementations.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend