[HACKERS] Tracking a snapshot on PITR slaves

2007-08-02 Thread Florian G. Pflug

Hi

Since my attempts to find a simple solution for the read-only query
locking problems (one that doesn't need full WAL logging of lock
requests) haven't been successful yet, I've decided to turn to the
problem of tracking a snapshot on the slaves for now. (First, because
such a snapshot is needed for any kind of concurrent recovery anyway, and
second, because any non-simplistic solution of the locking problems
will quite likely benefit from such a snapshot.)

The idea is to create a special kind of snapshot that works basically
like an MVCC snapshot, but with the meaning of the xip array inverted.
Usually, if an xid is *not* in the xip array of a snapshot, and greater
than the xmin of that snapshot, the clog state of the xid determines
tuple visibility. This is not well suited for queries running during
replay, because the effects of an xlog record with a (to the slave)
previously unknown xid shouldn't be visible to concurrently running
queries.

Therefore, a flag xip_inverted will be added to SnapshotData. It causes
HeapTupleSatisfiesMVCC to assume that any xid >= xmin and *not* in the
xip array is in progress.

This allows the following to work:
.) Store RecentXmin with every xlog record, in a new field xl_xmin.
   (Wouldn't be needed in *every* record, but for now keeping it
   directly inside XLogRecord makes things easier, and it's just 4 bytes.)
.) Maintain a global snapshot template in shmem during replay, with the xmin
   being the highest xmin seen so far in any xlog record. That template
   is copied whenever a readonly query needs to obtain a snapshot.
.) Upon replaying a COMMIT or COMMIT_PREPARED record, the xmin of the
   to-be-committed transaction is added to the global snapshot,
   making the commit visible to all further copies of that snapshot.
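
A minimal sketch of the inverted-xip rule and the three steps above, as a toy Python model rather than backend code. The names Snapshot, ReplayState, replay_record and xid_in_progress are hypothetical. The sketch assumes that the commit step records the committing transaction's xid in the xip array, and it ignores xid wraparound.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """Toy stand-in for a snapshot; not PostgreSQL's SnapshotData."""
    xmin: int
    xip: set = field(default_factory=set)
    xip_inverted: bool = False


def xid_in_progress(snap, xid):
    """Return True if the snapshot must treat xid as still running."""
    if xid < snap.xmin:
        return False                 # older than xmin: clog alone decides
    if snap.xip_inverted:
        return xid not in snap.xip   # inverted: unknown xids count as running
    return xid in snap.xip           # usual MVCC meaning of the xip array


class ReplayState:
    """Global snapshot template maintained while replaying xlog records."""

    def __init__(self):
        self.template = Snapshot(xmin=0, xip_inverted=True)

    def replay_record(self, xl_xmin, committed_xid=None):
        # Keep the highest xmin seen so far in any record.
        if xl_xmin > self.template.xmin:
            self.template.xmin = xl_xmin
            # Entries below xmin no longer need to be remembered.
            self.template.xip = {x for x in self.template.xip if x >= xl_xmin}
        # Assumption: a COMMIT record adds the committing xid to xip.
        if committed_xid is not None:
            self.template.xip.add(committed_xid)

    def get_snapshot(self):
        # Read-only queries work on a private copy of the template.
        return Snapshot(self.template.xmin, set(self.template.xip), True)
```

A snapshot copied before the commit record is replayed keeps treating the xid as in progress, while copies taken afterwards see the commit.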

If you can shoot this down, you're welcome to do so ;-)

greetings, Florian Pflug


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] .NET driver

2007-08-02 Thread Andrew Dunstan



Brar Piening wrote:
> Robert Treat wrote:
>> That would be nice. Of course none of this seems relevant to hackers,
>> so I'd
>
> You're right - of course.
>
> But sometimes I wish 'hackers' would care a little more about their
> interfaces as the best backend can't be good without good interfaces
> and some of the PostgreSQL interfaces don't reach the standard that
> interfaces for other databases reach.
> As a Windows user I still can't drag and drop a Dataset in VS.Net with
> Npgsql and I still have to build a single-threaded perl if I want to
> use DBD::Pg (I know about DBD::PgPP).

The latter is simply not true.

ActiveState Perl is threaded and DBD::Pg works just fine with it. In 
fact, you don't need to build your own - just get the one from pgfoundry:


point your ppm at: http://dbdpgppm.projects.postgresql.org//DBD-Pg-5.8.ppd

cheers

andrew





Re: [HACKERS] .NET driver

2007-08-02 Thread Brar Piening

Robert Treat wrote:
> That would be nice. Of course none of this seems relevant to hackers, so I'd

You're right - of course.

But sometimes I wish 'hackers' would care a little more about their
interfaces as the best backend can't be good without good interfaces and
some of the PostgreSQL interfaces don't reach the standard that
interfaces for other databases reach.
As a Windows user I still can't drag and drop a Dataset in VS.Net with
Npgsql and I still have to build a single-threaded perl if I want to use
DBD::Pg (I know about DBD::PgPP).


I'm really happy with the backend right now and I could perhaps convince 
the decision makers at my job to use my personal favorite (in addition 
to MSSQL) - but not as long as the interface doesn't look like the one 
they are used to.


If C# does not go above 5-10% in the
http://www.postgresql.org/community/survey.13 statistic, PostgreSQL will
not be able to cover all the markets it could.
See: 
http://radar.oreilly.com/archives/2006/08/programming_language_trends_1.html


As I know that this is off-topic here, I'm not going to discuss this
any further on this list but I'll respond to personal mails or follow an 
invitation to 'advocacy' (to which I'm not yet subscribed) or any other 
convenient list.


Regards,

Brar






Re: [HACKERS] .NET driver

2007-08-02 Thread Brar Piening

Andrei Kovalevski wrote:
> I have experience writing an ODBC driver for PostgreSQL
> (https://projects.commandprompt.com/public/odbcng/). I would be happy
> to help the community improve the .NET data provider.



Please join the Npgsql Project at http://pgfoundry.org/projects/npgsql

Francisco Figueiredo Jr. (fxjrlists[at]yahoo[dot]com[dot]br) will be 
happy about some new support.


I once did some initial VS.Net 2002/3 integration but ran out of time
halfway through.
It is quite a bit of a pain, as Microsoft has marked some important
classes as sealed, so you will find yourself reinventing some wheels
they have already implemented.
Plus - as Merlin stated before - VS.Net/ADO.Net is a somewhat moving
target for data provider implementations.


Brar



Re: [HACKERS] GIT patch

2007-08-02 Thread Tom Lane
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Alvaro Herrera wrote:
>> At this
>> point I feel like the patch still needs some work and reshuffling before
>> it is in an acceptable state.  The fact that there are some API changes
>> for which the patch needs to be adjusted makes me feel like we should
>> put this patch on hold for 8.4.  So we would first get the API changes
>> discussed and done and then adapt this patch to them.

> I hate to say it but I agree.

I concur with putting this whole area off till 8.4.  We do not have any
consensus on what the API should be, which is exactly why the patch was
never finished.  All the proposals are pretty ugly.

Another problem: frankly I'm pretty dissatisfied with the entire concept
of not storing all the index keys, especially in the proposed way which
would eliminate any outside control over whether keys are dropped or
not.  Two problems I can see with it are:

1. The performance hit for functional indexes could be really steep,
since you'd need to recompute a potentially expensive function to
recheck matches.

2. This would forever cut off any development of indexscans that make
use of index key data beyond what btree itself knows how to do.  An
example of the sort of thing I'm thinking about is applying a LIKE or
regex pattern match operator against the index key before visiting the
heap --- not just a derived >= or <= condition, but the actual pattern
match.  We've discussed adding an index AM call that returns the key
values, which'd allow the executor to apply non-btree operators to them
before visiting the heap.  But that idea is DOA if the planner can't
tell in advance whether the entries will be available.
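
A rough sketch of the second point, as a toy Python model and not the index AM API. It assumes index entries are (key, tid) pairs; CountingHeap and both scan functions are hypothetical names. It only contrasts how many heap fetches each approach needs.

```python
import re


class CountingHeap:
    """Toy heap that counts how many tuples were fetched."""

    def __init__(self, rows):
        self.rows = rows      # tid -> stored text value
        self.fetches = 0

    def fetch(self, tid):
        self.fetches += 1
        return self.rows[tid]


def scan_with_key_filter(index_entries, pattern, fetch_heap):
    """Keys are stored in the index: match first, then visit the heap."""
    rx = re.compile(pattern)
    return [fetch_heap(tid) for key, tid in index_entries if rx.search(key)]


def scan_with_heap_recheck(index_entries, pattern, fetch_heap):
    """Keys are not available: fetch every candidate and recheck it."""
    rx = re.compile(pattern)
    rows = (fetch_heap(tid) for _key, tid in index_entries)
    return [row for row in rows if rx.search(row)]
```

With four entries of which two match, the first variant touches the heap twice and the second four times. The planner can only cost the first variant if it knows in advance that the keys are present.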

So instead of pressing to try to get something into 8.3, I would rather
we stand back and think about it some more.

regards, tom lane



Re: [HACKERS] GIT patch

2007-08-02 Thread Heikki Linnakangas
Alexey Klyukin wrote:
> Well, then should we return to the review of your 'bitmapscan changes'
> patch ? I've posted a version which applies (or applied to the cvs head
> at the time of post) cleanly there:
> http://archives.postgresql.org/pgsql-patches/2007-06/msg00204.php

Yes, that's probably a good place to start.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] clog_buffers to 64 in 8.3?

2007-08-02 Thread Greg Smith

On Thu, 2 Aug 2007, Tom Lane wrote:

> I find it entirely likely that simply changing the [NUM_CLOG_BUFFERS]
> constant would be a net loss on many workloads.


Would it be reasonable to consider changing it to a compile-time option 
before the 8.3 beta?  From how you describe the potential downsides, it 
sounds to me like something that specific distributors might want to 
adjust based on their target customer workloads and server scale.  That 
would make it available as a tunable to those aiming at larger systems 
with enough CPU/memory throughput that the additional overhead of more 
linear searches is trumped by the reduced potential for locking 
contention, as appears to be the case in Sun's situation here.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] GIT patch

2007-08-02 Thread Bruce Momjian
Alvaro Herrera wrote:
> Heikki Linnakangas wrote:
> > Alvaro Herrera wrote:
> > > I've started reading the GIT patch to see if I can help with the review.
> 
> > As the patch stands, I tried to keep it as non-invasive as possible,
> > with minimum changes to existing APIs. That's because in the winter we
> > were discussing changes to the indexam API to support the bitmap index
> > am, and also GIT. I wanted to just have a patch to do performance
> > testing with, without getting into the API changes.
> 
> Hmm, do say, doesn't it seem like the lack of feedback and the failed
> bitmap patch played against final development of this patch?  At this
> point I feel like the patch still needs some work and reshuffling before
> it is in an acceptable state.  The fact that there are some API changes
> for which the patch needs to be adjusted makes me feel like we should
> put this patch on hold for 8.4.  So we would first get the API changes
> discussed and done and then adapt this patch to them.

As Heikki mentioned, this was discussed back in March/April with no
movement.   At this point we have at least a month until beta so please
try to move it forward as much as possible.  It isn't going to be any
easier during 8.4.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] clog_buffers to 64 in 8.3?

2007-08-02 Thread Tom Lane
Josh Berkus <[EMAIL PROTECTED]> writes:
> Tom,
>> I don't actually think that what Jignesh is testing is a particularly
>> realistic scenario, and so I object to making performance decisions on
>> the strength of that one measurement.

> What do you mean by "not realistic"?  What would be a realistic scenario?

The difference between maxing out at 1200 sessions and 1300 sessions
doesn't excite me a lot --- in most environments you'd be well advised
to use many fewer backends and a connection pooler.  But in any case
the main point is that this is *one* benchmark on *one* platform.  Does
anyone outside Sun even know what the benchmark is, beyond the fact that
it's running a whole lot of sessions?

Also, you should not imagine that boosting NUM_CLOG_BUFFERS has zero
cost.  The linear searches used in slru.c start to look pretty
questionable if we want more than a couple dozen buffers.  I find it
entirely likely that simply changing the constant would be a net loss
on many workloads.
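
To illustrate the cost argument, here is a toy buffer pool that is searched linearly. It is not a transcription of slru.c: LinearBufferPool is a hypothetical name and the least-recently-used victim choice is an assumption. The point is only that every miss compares against all buffers.

```python
class LinearBufferPool:
    """Toy buffer pool with a linear slot search; illustrative only."""

    def __init__(self, num_buffers):
        self.slots = [None] * num_buffers    # page number held by each slot
        self.last_used = [0] * num_buffers   # logical time of last access
        self.clock = 0
        self.comparisons = 0                 # slot comparisons done so far

    def lookup(self, page):
        """Return the slot holding page, loading it on a miss."""
        self.clock += 1
        for i, held in enumerate(self.slots):
            self.comparisons += 1
            if held == page:
                self.last_used[i] = self.clock
                return i
        # Miss: every slot was inspected. Evict the least recently used one.
        victim = min(range(len(self.slots)), key=lambda i: self.last_used[i])
        self.slots[victim] = page
        self.last_used[victim] = self.clock
        return victim
```

In this model a miss costs 8 comparisons with 8 buffers and 64 with 64, so a larger pool trades fewer page replacements for a longer search on each access.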

regards, tom lane



Re: [HACKERS] clog_buffers to 64 in 8.3?

2007-08-02 Thread Josh Berkus
Tom,

> I don't actually think that what Jignesh is testing is a particularly
> realistic scenario, and so I object to making performance decisions on
> the strength of that one measurement.

What do you mean by "not realistic"?  What would be a realistic scenario?

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] clog_buffers to 64 in 8.3?

2007-08-02 Thread Tom Lane
Josh Berkus <[EMAIL PROTECTED]> writes:
> Through the User Concurrency Thread on -performance [1], Tom and
> Jignesh found that our proximate bottleneck on SMP multi-user scaling
> is clog_buffers.

I don't actually think that what Jignesh is testing is a particularly
realistic scenario, and so I object to making performance decisions on
the strength of that one measurement.

regards, tom lane



[HACKERS] clog_buffers to 64 in 8.3?

2007-08-02 Thread Josh Berkus
All,

Through the User Concurrency Thread on -performance [1], Tom and Jignesh found 
that our proximate bottleneck on SMP multi-user scaling is clog_buffers.  
Increasing clog_buffers to 64 improved this scaling by 30% per Jignesh:

===

8.3 + HOT = did not differ much from the 8.2 numbers and hit the CLOG
problem at 1100 users (instead of 1000 for 8.2).

8.3 + HOT + CLOG = peaked at 1350 users with 137364 txn and held
steady until 1450 users before it started dropping.

The best 8.2 + CLOG patch result is 128638 txn at 1250 users; at the
same user count 8.3 did 131265 txn. So per-user transactions also seem
to have improved: good, but only roughly 2% at the same user count. In
terms of peak value (scalability), the improvement is 6.7%.

Pristine 8.2 could do about 950 users at 100828 txn. At the same user
count, 8.3 + HOT + CLOG gives about 102058 txn, a 1.2% gain, while in
terms of peak throughput (scalability) we get a huge boost of 36.2%.

So if we get the CLOG patch integrated into the 8.3 + HOT release, the
overall gain over our pristine 8.2.4 release will be about 36% out of
the box.

===
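
The percentages quoted above can be rechecked from the transaction counts in the post. This is plain arithmetic; pct_gain is a hypothetical helper and the variable names are labels chosen for this sketch.

```python
def pct_gain(new, old):
    """Relative improvement of new over old, in percent."""
    return (new / old - 1.0) * 100.0


# Transaction counts as quoted in the post.
peak_83_hot_clog = 137364    # 8.3 + HOT + CLOG, peak at 1350 users
best_82_clog = 128638        # 8.2 + CLOG patch, 1250 users
same_users_83 = 131265       # 8.3 at 1250 users
pristine_82 = 100828         # pristine 8.2, about 950 users
same_users_950 = 102058      # 8.3 + HOT + CLOG at about 950 users

same_count_vs_82_clog = pct_gain(same_users_83, best_82_clog)
peak_vs_82_clog = pct_gain(peak_83_hot_clog, best_82_clog)
same_count_vs_pristine = pct_gain(same_users_950, pristine_82)
peak_vs_pristine = pct_gain(peak_83_hot_clog, pristine_82)
```

These work out to roughly 2.0%, 6.8%, 1.2% and 36.2%, so the quoted figures hold up, with the 6.7% apparently truncated rather than rounded.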

So:

1) Is there any potential negative impact to increasing the number of CLOG 
buffers?

2) Is this a small enough change that we can make it during beta?

---Josh

[1] http://archives.postgresql.org/pgsql-performance/2007-07/msg00237.php

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] .NET driver

2007-08-02 Thread Robert Treat
On Thursday 02 August 2007 08:57, Andrei Kovalevski wrote:
> Merlin Moncure wrote:
> > On 8/2/07, Hannu Krosing <[EMAIL PROTECTED]> wrote:
> >> On one fine day, Thu, 2007-08-02 at 11:24, Rohit Khare wrote:
> >>> I used NPGSQL .NET driver to connect PGSQL 8.2.4 database to VB.NET.
> >>> As stated on NPGSQL page, it doesn't seem to provide seamless
> >>> integration and performance with .NET. Instead when I used ODBC, the
> >>> performance was comparatively better. What's the reason? When can we
> >>> expect .NET driver that provides seamless integration.
> >>
> >> What kind of "seamless integration" are you looking for ?
> >
> > The .net data provider is not as good when working with typed datasets
> > in terms of support from the ide.  Normally for other providers the
> > IDE does everything for you, writing update statements and  such in a
> > ORM fashion.   This is kind of a pain for some of the report designers
> > and other things that want to work with a typed set.  It's possible to
> > work around this, it's just a pain, and changes with each release of
> > visual studio.  Also, the connection pooling portions are buggy
> > (google LOG: incomplete startup packet).
> >
> > The ODBC driver works pretty good actually.  I can't speak about the
> > performance though.
> >
> > merlin
>
> I have experience writing an ODBC driver for PostgreSQL
> (https://projects.commandprompt.com/public/odbcng/). I would be happy to
> help the community improve the .NET data provider.
>

That would be nice. Of course none of this seems relevant to hackers, so I'd 
ask those interested to check out the .net project page at 
http://pgfoundry.org/projects/npgsql/  

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] .NET driver

2007-08-02 Thread Andrei Kovalevski

Merlin Moncure wrote:
> On 8/2/07, Hannu Krosing <[EMAIL PROTECTED]> wrote:
>> On one fine day, Thu, 2007-08-02 at 11:24, Rohit Khare wrote:
>>> I used NPGSQL .NET driver to connect PGSQL 8.2.4 database to VB.NET.
>>> As stated on NPGSQL page, it doesn't seem to provide seamless
>>> integration and performance with .NET. Instead when I used ODBC, the
>>> performance was comparatively better. What's the reason? When can we
>>> expect .NET driver that provides seamless integration.
>>
>> What kind of "seamless integration" are you looking for ?
>
> The .net data provider is not as good when working with typed datasets
> in terms of support from the ide.  Normally for other providers the
> IDE does everything for you, writing update statements and  such in a
> ORM fashion.   This is kind of a pain for some of the report designers
> and other things that want to work with a typed set.  It's possible to
> work around this, it's just a pain, and changes with each release of
> visual studio.  Also, the connection pooling portions are buggy
> (google LOG: incomplete startup packet).
>
> The ODBC driver works pretty good actually.  I can't speak about the
> performance though.
>
> merlin

I have experience writing an ODBC driver for PostgreSQL
(https://projects.commandprompt.com/public/odbcng/). I would be happy to
help the community improve the .NET data provider.


Andrei.




Re: [HACKERS] .NET driver

2007-08-02 Thread Merlin Moncure
On 8/2/07, Hannu Krosing <[EMAIL PROTECTED]> wrote:
> On one fine day, Thu, 2007-08-02 at 11:24, Rohit Khare wrote:
> > I used NPGSQL .NET driver to connect PGSQL 8.2.4 database to VB.NET.
> > As stated on NPGSQL page, it doesn't seem to provide seamless
> > integration and performance with .NET. Instead when I used ODBC, the
> > performance was comparatively better. What's the reason? When can we
> > expect .NET driver that provides seamless integration.
>
> What kind of "seamless integration" are you looking for ?

The .net data provider is not as good when working with typed datasets
in terms of support from the ide.  Normally for other providers the
IDE does everything for you, writing update statements and  such in a
ORM fashion.   This is kind of a pain for some of the report designers
and other things that want to work with a typed set.  It's possible to
work around this, it's just a pain, and changes with each release of
visual studio.  Also, the connection pooling portions are buggy
(google LOG: incomplete startup packet).

The ODBC driver works pretty good actually.  I can't speak about the
performance though.

merlin



Re: [HACKERS] .NET driver

2007-08-02 Thread Hannu Krosing
On one fine day, Thu, 2007-08-02 at 11:24, Rohit Khare wrote:
> I used NPGSQL .NET driver to connect PGSQL 8.2.4 database to VB.NET.
> As stated on NPGSQL page, it doesn't seem to provide seamless
> integration and performance with .NET. Instead when I used ODBC, the
> performance was comparatively better. What's the reason? When can we
> expect .NET driver that provides seamless integration. 

What kind of "seamless integration" are you looking for ?

Which is more important to you "seamless integration" or performance ?

--
Hannu





Re: [HACKERS] GIT patch

2007-08-02 Thread Alexey Klyukin
Heikki Linnakangas wrote:
> Alvaro Herrera wrote:
> > Hmm, do say, doesn't it seem like the lack of feedback and the failed
> > bitmap patch played against final development of this patch?  
> 
> Yes :(. That's one reason why I tried to help with the review of that
> patch.
> 
> > At this
> > point I feel like the patch still needs some work and reshuffling before
> > it is in an acceptable state.  The fact that there are some API changes
> > for which the patch needs to be adjusted makes me feel like we should
> > put this patch on hold for 8.4.  So we would first get the API changes
> > discussed and done and then adapt this patch to them.
> 
> I hate to say it but I agree. Getting the API changes discussed and
> committed was my plan in February/March. Unfortunately it didn't happen
> back then.
> 
> There's a few capabilities we need from the new API:
> 
> 1. Support for candidate matches. Because a clustered index doesn't
> contain the key for every heap tuple, when you search for a value we
> don't know exactly which ones match. Instead, you get a bunch of
> candidate matches, which need to be rechecked after fetching the heap
> tuple. Oleg & Teodor pointed out that GiST could use the capability as
> well. I also proposed a while ago to change the hash index
> implementation so that it doesn't store the index key in the index, but
> just the hash of it. That again would need the support for candidate
> matches. And there's range-encoded bitmap indexes, if we implement them
> in a more distant future.

Well, then should we return to the review of your 'bitmapscan changes'
patch ? I've posted a version which applies (or applied to the cvs head
at the time of post) cleanly there:
http://archives.postgresql.org/pgsql-patches/2007-06/msg00204.php

> 
> 2. Support to sort the heap tuples represented by one index tuple, in
> normal index scans, if we go with alternative 1. Or support to do binary
> searches over them, if we go with alternative 2 or 3. As the patch
> stands, the sorting is done within b-tree, but that's quite ugly.
-- 
Alexey Klyukin http://www.commandprompt.com/
The PostgreSQL Company - Command Prompt, Inc.




Re: [HACKERS] GIT patch

2007-08-02 Thread Heikki Linnakangas
Alvaro Herrera wrote:
> Hmm, do say, doesn't it seem like the lack of feedback and the failed
> bitmap patch played against final development of this patch?  

Yes :(. That's one reason why I tried to help with the review of that
patch.

> At this
> point I feel like the patch still needs some work and reshuffling before
> it is in an acceptable state.  The fact that there are some API changes
> for which the patch needs to be adjusted makes me feel like we should
> put this patch on hold for 8.4.  So we would first get the API changes
> discussed and done and then adapt this patch to them.

I hate to say it but I agree. Getting the API changes discussed and
committed was my plan in February/March. Unfortunately it didn't happen
back then.

There's a few capabilities we need from the new API:

1. Support for candidate matches. Because a clustered index doesn't
contain the key for every heap tuple, when you search for a value we
don't know exactly which ones match. Instead, you get a bunch of
candidate matches, which need to be rechecked after fetching the heap
tuple. Oleg & Teodor pointed out that GiST could use the capability as
well. I also proposed a while ago to change the hash index
implementation so that it doesn't store the index key in the index, but
just the hash of it. That again would need the support for candidate
matches. And there's range-encoded bitmap indexes, if we implement them
in a more distant future.

2. Support to sort the heap tuples represented by one index tuple, in
normal index scans, if we go with alternative 1. Or support to do binary
searches over them, if we go with alternative 2 or 3. As the patch
stands, the sorting is done within b-tree, but that's quite ugly.
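
A minimal sketch of what "candidate matches" means, using the hash-index variant mentioned in point 1, where only the hash is stored. This is a toy Python model: toy_hash, build_hash_only_index, candidate_matches and index_scan are hypothetical names, and the heap is assumed to be a dict from tid to row.

```python
def toy_hash(key):
    """Deliberately weak hash so that collisions are easy to produce."""
    return sum(key.encode()) % 8


def build_hash_only_index(heap):
    """Map hash value -> list of tids; the key itself is not stored."""
    index = {}
    for tid, row in heap.items():
        index.setdefault(toy_hash(row["key"]), []).append(tid)
    return index


def candidate_matches(index, key):
    """All tids whose hash equals the hash of key; may include false hits."""
    return index.get(toy_hash(key), [])


def index_scan(index, heap, key):
    """Fetch each candidate and recheck the real key on the heap tuple."""
    return [tid for tid in candidate_matches(index, key)
            if heap[tid]["key"] == key]
```

The index alone cannot tell colliding keys apart, so the recheck against the heap tuple is what makes the result exact.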

> Of the three proposals you suggest, I think the first one
> 
>> 1. A grouped index tuple contains a bitmap of offsetnumbers,
>> representing a bunch of heap tuples stored on the same heap page, that
>> all have a key between the key stored on the index tuple and the next
>> index tuple. We don't keep track of the ordering of the heap tuples
>> represented by one group index tuple. When doing a normal, non-bitmap,
>> index scan, they need to be sorted. This is what the patch currently
>> implements.
> 
> makes the most sense -- the index is kept simple and fast, and doing the
> sorting during an indexscan seems a perfectly acceptable compromise,
> knowing that the number of tuples possibly returned for sort is limited
> by the heap blocksize.

The overhead is shown in the CPU test results, particularly in the
select_range* tests, I put up on the git web site:
http://community.enterprisedb.com/git/.

The other alternatives might be simpler, though.
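
For reference, a rough sketch of alternative 1 as described in the quoted text: a grouped index tuple carries a bitmap of offset numbers on one heap page, and a normal index scan has to fetch, recheck and sort those tuples. This is a toy Python model, not the patch's data layout. GroupedIndexTuple, offsets and scan_group are hypothetical names, and the heap page is assumed to be a dict from offset to key.

```python
from dataclasses import dataclass


@dataclass
class GroupedIndexTuple:
    key: int            # lowest key covered by this group
    block: int          # heap page the group points at
    offset_bitmap: int  # bit i set => heap offset i belongs to the group


def offsets(bitmap):
    """Offset numbers whose bit is set in the bitmap, in ascending order."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


def scan_group(group, heap_page, lo, hi):
    """Fetch the group's heap tuples, recheck the range, then sort by key."""
    keys = [heap_page[off] for off in offsets(group.offset_bitmap)]
    return sorted(k for k in keys if lo <= k <= hi)
```

The sort input is bounded by the number of tuples that fit on one heap page, which is the compromise discussed above.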

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
