[HACKERS] Tracking a snapshot on PITR slaves
Hi

Since my attempts to find a simple solution for the read-only query locking problems (one that doesn't need full WAL logging of lock requests) haven't been successful yet, I've decided to turn to the problem of tracking a snapshot on the slaves for now. (First, because such a snapshot is needed for any kind of concurrent recovery anyway, and second, because any non-simplistic solution of the locking problems will quite likely benefit from such a snapshot.)

The idea is to create a special kind of snapshot that works basically like an MVCC snapshot, but with the meaning of the xip array inverted. Usually, if an xid is *not* in the xip array of a snapshot, and greater than the xmin of that snapshot, the clog state of the xid determines tuple visibility. This is not well suited for queries running during replay, because the effects of an xlog record with a (to the slave) previously unknown xid shouldn't be visible to concurrently running queries.

Therefore, a flag xip_inverted will be added to SnapshotData that causes HeapTupleSatisfiesMVCC to assume that any xid >= xmin and *not* in the xip array is in progress.

This allows the following to work:

.) Store RecentXmin with every xlog record, in a new field xl_xmin. (It wouldn't be needed in *every* record, but for now keeping it directly inside XLogRecord makes things easier, and it's just 4 bytes.)

.) Maintain a global snapshot template in shmem during replay, with the xmin being the highest xmin seen so far in any xlog record. That template is copied whenever a read-only query needs to obtain a snapshot.

.) Upon replaying a COMMIT or COMMIT_PREPARED record, the xid of the to-be-committed transaction is added to the global snapshot, making the commit visible to all further copies of that snapshot.

If you can shoot this down, you're welcome to do so ;-)

greetings, Florian Pflug

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
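The scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PostgreSQL code: the names `InvertedSnapshot`, `ReplayState`, `replay_record` and `xid_visible` are hypothetical, the clog is modelled as a plain dictionary, and xid wraparound, subtransactions and an upper bound (xmax) are ignored. It also assumes that the value added to the snapshot at commit time is the xid of the committing transaction.

```python
import copy
from dataclasses import dataclass, field

COMMITTED = "committed"  # stand-in for a clog status value


@dataclass
class InvertedSnapshot:
    """Snapshot whose xip array has the inverted meaning (hypothetical model)."""
    xmin: int = 0
    # With xip_inverted set, xids listed here are the ones whose clog
    # state may be consulted; every other xid >= xmin counts as in progress.
    xip: set = field(default_factory=set)


def xid_visible(snap, xid, clog):
    """Return True if the effects of xid are visible under snap."""
    if xid >= snap.xmin and xid not in snap.xip:
        return False  # unknown to the snapshot: treat as still in progress
    return clog.get(xid) == COMMITTED


class ReplayState:
    """Global snapshot template maintained while replaying xlog records."""

    def __init__(self):
        self.template = InvertedSnapshot()
        self.clog = {}

    def replay_record(self, xl_xmin, commit_xid=None):
        # Track the highest xmin seen so far in any record.
        self.template.xmin = max(self.template.xmin, xl_xmin)
        if commit_xid is not None:
            # COMMIT / COMMIT_PREPARED: mark the xid committed and make it
            # visible to all snapshots copied from now on.
            self.clog[commit_xid] = COMMITTED
            self.template.xip.add(commit_xid)

    def get_snapshot(self):
        # A read-only query copies the template; later commits do not
        # change a copy that was already handed out.
        return copy.deepcopy(self.template)
```

A snapshot taken before a commit record is replayed keeps treating that xid as in progress, while a snapshot taken afterwards sees it as committed.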
Re: [HACKERS] .NET driver
Brar Piening wrote:
> Robert Treat wrote:
>> That would be nice. Of course none of this seems relevant to hackers, so I'd
>
> You're right - of course.
>
> But sometimes I wish 'hackers' would care a little more about their interfaces as the best backend can't be good without good interfaces and some of the PostgreSQL-interfaces don't reach the standard they are reaching for other databases.
> As a windows-user I still can't drag and drop a Dataset in VS.Net with Npgsql and I still have to build a single-threaded perl if I want to use DBD::Pg (I know about DBD::PgPP).

This latter is simply not true. ActiveState Perl is threaded and DBD::Pg works just fine with it. In fact, you don't need to build your own - just get the one from pgfoundry: point your ppm at:

http://dbdpgppm.projects.postgresql.org//DBD-Pg-5.8.ppd

cheers

andrew
Re: [HACKERS] .NET driver
Robert Treat wrote:
> That would be nice. Of course none of this seems relevant to hackers, so I'd

You're right - of course.

But sometimes I wish 'hackers' would care a little more about their interfaces as the best backend can't be good without good interfaces and some of the PostgreSQL-interfaces don't reach the standard they are reaching for other databases.

As a windows-user I still can't drag and drop a Dataset in VS.Net with Npgsql and I still have to build a single-threaded perl if I want to use DBD::Pg (I know about DBD::PgPP).

I'm really happy with the backend right now and I could perhaps convince the decision makers at my job to use my personal favorite (in addition to MSSQL) - but not as long as the interface doesn't look like the one they are used to.

If C# does not go above 5-10% in this http://www.postgresql.org/community/survey.13 statistic, PostgreSQL will not be able to cover all the markets it could.

See: http://radar.oreilly.com/archives/2006/08/programming_language_trends_1.html

As I know that this is off-topic here I'm not going to discuss this any further on this list but I'll respond to personal mails or follow an invitation to 'advocacy' (to which I'm not yet subscribed) or any other convenient list.

Regards,

Brar
Re: [HACKERS] .NET driver
Andrei Kovalevski wrote:
> I have experience writing an ODBC driver for PostgreSQL (https://projects.commandprompt.com/public/odbcng/). I would be happy to help the community improve the .NET data provider.

Please join the Npgsql Project at http://pgfoundry.org/projects/npgsql

Francisco Figueiredo Jr. (fxjrlists[at]yahoo[dot]com[dot]br) will be happy about some new support.

I once did some initial VS.Net 2002/3 integration but ran out of time halfway. It is quite a bit of a pain as Microsoft has marked some important classes as sealed, so you will find yourself reimplementing some wheels they have already implemented.

Plus - as Merlin stated before - VS.Net/ADO.Net is a somewhat moving target for data provider implementations.

Brar
Re: [HACKERS] GIT patch
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> Alvaro Herrera wrote:
>> At this point I feel like the patch still needs some work and reshuffling before it is in an acceptable state. The fact that there are some API changes for which the patch needs to be adjusted makes me feel like we should put this patch on hold for 8.4. So we would first get the API changes discussed and done and then adapt this patch to them.
>
> I hate to say it but I agree.

I concur with putting this whole area off till 8.4. We do not have any consensus on what the API should be, which is exactly why the patch was never finished. All the proposals are pretty ugly.

Another problem: frankly I'm pretty dissatisfied with the entire concept of not storing all the index keys, especially in the proposed way which would eliminate any outside control over whether keys are dropped or not. Two problems I can see with it are:

1. The performance hit for functional indexes could be really steep, since you'd need to recompute a potentially expensive function to recheck matches.

2. This would forever cut off any development of indexscans that make use of index key data beyond what btree itself knows how to do. An example of the sort of thing I'm thinking about is applying a LIKE or regex pattern match operator against the index key before visiting the heap --- not just a derived >= or <= condition, but the actual pattern match. We've discussed adding an index AM call that returns the key values, which'd allow the executor to apply non-btree operators to them before visiting the heap. But that idea is DOA if the planner can't tell in advance whether the entries will be available.

So instead of pressing to try to get something into 8.3, I would rather we stand back and think about it some more.
regards, tom lane
Re: [HACKERS] GIT patch
Alexey Klyukin wrote:
> Well, then should we return to the review of your 'bitmapscan changes' patch? I've posted a version which applies (or applied to the cvs head at the time of post) cleanly there:
> http://archives.postgresql.org/pgsql-patches/2007-06/msg00204.php

Yes, that's probably a good place to start.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] clog_buffers to 64 in 8.3?
On Thu, 2 Aug 2007, Tom Lane wrote:
> I find it entirely likely that simply changing the [NUM_CLOG_BUFFERS] constant would be a net loss on many workloads.

Would it be reasonable to consider changing it to a compile-time option before the 8.3 beta? From how you describe the potential downsides, it sounds to me like something that specific distributors might want to adjust based on their target customer workloads and server scale. That would make it available as a tunable to those aiming at larger systems with enough CPU/memory throughput that the additional overhead of more linear searches is trumped by the reduced potential for locking contention, as appears to be the case in Sun's situation here.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [HACKERS] GIT patch
Alvaro Herrera wrote:
> Heikki Linnakangas wrote:
> > Alvaro Herrera wrote:
> > > I've started reading the GIT patch to see if I can help with the review.
> >
> > As the patch stands, I tried to keep it as non-invasive as possible, with minimum changes to existing APIs. That's because in the winter we were discussing changes to the indexam API to support the bitmap index am, and also GIT. I wanted to just have a patch to do performance testing with, without getting into the API changes.
>
> Hmm, do say, doesn't it seem like the lack of feedback and the failed bitmap patch played against final development of this patch? At this point I feel like the patch still needs some work and reshuffling before it is in an acceptable state. The fact that there are some API changes for which the patch needs to be adjusted makes me feel like we should put this patch on hold for 8.4. So we would first get the API changes discussed and done and then adapt this patch to them.

As Heikki mentioned, this was discussed back in March/April with no movement. At this point we have at least a month until beta so please try to move it forward as much as possible. It isn't going to be any easier during 8.4.

--
Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Re: [HACKERS] clog_buffers to 64 in 8.3?
Josh Berkus <[EMAIL PROTECTED]> writes:
> Tom,
>> I don't actually think that what Jignesh is testing is a particularly realistic scenario, and so I object to making performance decisions on the strength of that one measurement.
>
> What do you mean by "not realistic"? What would be a realistic scenario?

The difference between maxing out at 1200 sessions and 1300 sessions doesn't excite me a lot --- in most environments you'd be well advised to use many fewer backends and a connection pooler. But in any case the main point is that this is *one* benchmark on *one* platform. Does anyone outside Sun even know what the benchmark is, beyond the fact that it's running a whole lot of sessions?

Also, you should not imagine that boosting NUM_CLOG_BUFFERS has zero cost. The linear searches used in slru.c start to look pretty questionable if we want more than a couple dozen buffers. I find it entirely likely that simply changing the constant would be a net loss on many workloads.

regards, tom lane
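To make the cost argument about linear searches concrete, here is a small Python sketch. It is not the code from slru.c; `find_slot` is a hypothetical helper that models a buffer pool as a list of page numbers and counts how many slots are inspected per lookup. The pool sizes 8 and 64 are illustrative (64 is the value proposed in this thread; 8 is an assumed baseline).

```python
def find_slot(page_numbers, wanted):
    """Linearly scan the buffer slots for a page.

    Returns (slot, comparisons), with slot == -1 when the page is not
    buffered. The number of comparisons grows with the pool size.
    """
    comparisons = 0
    for slot, pageno in enumerate(page_numbers):
        comparisons += 1
        if pageno == wanted:
            return slot, comparisons
    return -1, comparisons


if __name__ == "__main__":
    for nbuffers in (8, 64):
        pool = list(range(nbuffers))       # pages 0 .. nbuffers-1 are buffered
        _, miss_cost = find_slot(pool, 10_000)   # a page that is not buffered
        print(f"{nbuffers} buffers: a miss inspects {miss_cost} slots")
```

Every miss, and every hit on a page sitting in the last slot, inspects all slots, so the per-lookup work scales with the number of buffers. Whether that outweighs the reduced contention depends on the workload, which is the point being argued above.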
Re: [HACKERS] clog_buffers to 64 in 8.3?
Tom,

> I don't actually think that what Jignesh is testing is a particularly realistic scenario, and so I object to making performance decisions on the strength of that one measurement.

What do you mean by "not realistic"? What would be a realistic scenario?

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [HACKERS] clog_buffers to 64 in 8.3?
Josh Berkus <[EMAIL PROTECTED]> writes:
> Through the User Concurrency Thread on -performance [1], Tom and Jignesh found that our proximate bottleneck on SMP multi-user scaling is clog_buffers.

I don't actually think that what Jignesh is testing is a particularly realistic scenario, and so I object to making performance decisions on the strength of that one measurement.

regards, tom lane
[HACKERS] clog_buffers to 64 in 8.3?
All,

Through the User Concurrency Thread on -performance [1], Tom and Jignesh found that our proximate bottleneck on SMP multi-user scaling is clog_buffers. Increasing clog_buffers to 64 improved this scaling by 30% per Jignesh:

===
8.3 + HOT = did not differ much from the 8.2 numbers and hit the CLOG problem at 1100 users (instead of 1000 for 8.2).

8.3 + HOT + CLOG = got a 1350-user peak of 137364 txn and it held steady till 1450 before it started dropping.

The best 8.2 + CLOG patch is at 1250 users at 128638 txn, which at the same users in 8.3 did 131265. So per-user transactions also seem to have improved. Good, but roughly 2% at the same user count. But for peak value in terms of scalability the improvement is 6.7%.

Pristine 8.2 could do about 950 users at 100828 txn. So at the same user count 8.3+HOT+CLOG gives about 102058 txn = 1.2%, while in terms of scalability throughput we get a huge boost of 36.2%.

So if we get the CLOG patch integrated in the 8.3+HOT+CLOG release then overall the gain from our pristine 8.2.4 release will be about 36% out of the box.
===

So:
1) Is there any potential negative impact to increasing the number of CLOG buffers?
2) Is this a small enough change that we can make it during beta?

---Josh

[1] http://archives.postgresql.org/pgsql-performance/2007-07/msg00237.php

--
Josh Berkus
PostgreSQL @ Sun
San Francisco
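The percentages quoted above can be checked with a few lines of arithmetic. The sketch below only uses the transaction counts given in the message; the helper name `pct_gain` and the labels are made up for illustration.

```python
def pct_gain(new, old):
    """Relative improvement of new over old, in percent."""
    return (new / old - 1.0) * 100.0


# (label, new txn count, baseline txn count), all taken from the message
comparisons = [
    ("8.3+HOT+CLOG peak vs pristine 8.2 peak", 137364, 100828),
    ("8.3+HOT+CLOG vs pristine 8.2, same user count", 102058, 100828),
    ("8.3+HOT+CLOG peak vs best 8.2+CLOG peak", 137364, 128638),
    ("8.3+HOT+CLOG vs 8.2+CLOG at 1250 users", 131265, 128638),
]

if __name__ == "__main__":
    for label, new, old in comparisons:
        print(f"{label}: {pct_gain(new, old):.1f}%")
```

The results agree with the quoted figures to within about a tenth of a percentage point, so the 36% headline is peak throughput of 8.3+HOT+CLOG against peak throughput of pristine 8.2, each at its own user count.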
Re: [HACKERS] .NET driver
On Thursday 02 August 2007 08:57, Andrei Kovalevski wrote:
> Merlin Moncure wrote:
> > On 8/2/07, Hannu Krosing <[EMAIL PROTECTED]> wrote:
> >> One fine day, Thu, 2007-08-02 at 11:24, Rohit Khare wrote:
> >>> I used NPGSQL .NET driver to connect PGSQL 8.2.4 database to VB.NET. As stated on NPGSQL page, it doesn't seem to provide seamless integration and performance with .NET. Instead when I used ODBC, the performance was comparatively better. What's the reason? When can we expect .NET driver that provides seamless integration.
> >>
> >> What kind of "seamless integration" are you looking for ?
> >
> > The .net data provider is not as good when working with typed datasets in terms of support from the ide. Normally for other providers the IDE does everything for you, writing update statements and such in an ORM fashion. This is kind of a pain for some of the report designers and other things that want to work with a typed set. It's possible to work around this, it's just a pain, and changes with each release of visual studio. Also, the connection pooling portions are buggy (google LOG: incomplete startup packet).
> >
> > The ODBC driver works pretty good actually. I can't speak about the performance though.
> >
> > merlin
>
> I have experience writing an ODBC driver for PostgreSQL (https://projects.commandprompt.com/public/odbcng/). I would be happy to help the community improve the .NET data provider.

That would be nice. Of course none of this seems relevant to hackers, so I'd ask those interested to check out the .net project page at http://pgfoundry.org/projects/npgsql/

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] .NET driver
Merlin Moncure wrote:
> On 8/2/07, Hannu Krosing <[EMAIL PROTECTED]> wrote:
>> One fine day, Thu, 2007-08-02 at 11:24, Rohit Khare wrote:
>>> I used NPGSQL .NET driver to connect PGSQL 8.2.4 database to VB.NET. As stated on NPGSQL page, it doesn't seem to provide seamless integration and performance with .NET. Instead when I used ODBC, the performance was comparatively better. What's the reason? When can we expect .NET driver that provides seamless integration.
>>
>> What kind of "seamless integration" are you looking for ?
>
> The .net data provider is not as good when working with typed datasets in terms of support from the ide. Normally for other providers the IDE does everything for you, writing update statements and such in an ORM fashion. This is kind of a pain for some of the report designers and other things that want to work with a typed set. It's possible to work around this, it's just a pain, and changes with each release of visual studio. Also, the connection pooling portions are buggy (google LOG: incomplete startup packet).
>
> The ODBC driver works pretty good actually. I can't speak about the performance though.
>
> merlin

I have experience writing an ODBC driver for PostgreSQL (https://projects.commandprompt.com/public/odbcng/). I would be happy to help the community improve the .NET data provider.

Andrei.
Re: [HACKERS] .NET driver
On 8/2/07, Hannu Krosing <[EMAIL PROTECTED]> wrote:
> One fine day, Thu, 2007-08-02 at 11:24, Rohit Khare wrote:
> > I used NPGSQL .NET driver to connect PGSQL 8.2.4 database to VB.NET. As stated on NPGSQL page, it doesn't seem to provide seamless integration and performance with .NET. Instead when I used ODBC, the performance was comparatively better. What's the reason? When can we expect .NET driver that provides seamless integration.
>
> What kind of "seamless integration" are you looking for ?

The .net data provider is not as good when working with typed datasets in terms of support from the ide. Normally for other providers the IDE does everything for you, writing update statements and such in an ORM fashion. This is kind of a pain for some of the report designers and other things that want to work with a typed set. It's possible to work around this, it's just a pain, and changes with each release of visual studio. Also, the connection pooling portions are buggy (google LOG: incomplete startup packet).

The ODBC driver works pretty good actually. I can't speak about the performance though.

merlin
Re: [HACKERS] .NET driver
One fine day, Thu, 2007-08-02 at 11:24, Rohit Khare wrote:
> I used NPGSQL .NET driver to connect PGSQL 8.2.4 database to VB.NET. As stated on NPGSQL page, it doesn't seem to provide seamless integration and performance with .NET. Instead when I used ODBC, the performance was comparatively better. What's the reason? When can we expect .NET driver that provides seamless integration.

What kind of "seamless integration" are you looking for ?

Which is more important to you, "seamless integration" or performance ?

--
Hannu
Re: [HACKERS] GIT patch
Heikki Linnakangas wrote:
> Alvaro Herrera wrote:
> > Hmm, do say, doesn't it seem like the lack of feedback and the failed bitmap patch played against final development of this patch?
>
> Yes :(. That's one reason why I tried to help with the review of that patch.
>
> > At this point I feel like the patch still needs some work and reshuffling before it is in an acceptable state. The fact that there are some API changes for which the patch needs to be adjusted makes me feel like we should put this patch on hold for 8.4. So we would first get the API changes discussed and done and then adapt this patch to them.
>
> I hate to say it but I agree. Getting the API changes discussed and committed was my plan in February/March. Unfortunately it didn't happen back then.
>
> There's a few capabilities we need from the new API:
>
> 1. Support for candidate matches. Because a clustered index doesn't contain the key for every heap tuple, when you search for a value we don't know exactly which ones match. Instead, you get a bunch of candidate matches, which need to be rechecked after fetching the heap tuple. Oleg & Teodor pointed out that GiST could use the capability as well. I also proposed a while ago to change the hash index implementation so that it doesn't store the index key in the index, but just the hash of it. That again would need the support for candidate matches. And there's range-encoded bitmap indexes, if we implement them in a more distant future.

Well, then should we return to the review of your 'bitmapscan changes' patch? I've posted a version which applies (or applied to the cvs head at the time of post) cleanly there:
http://archives.postgresql.org/pgsql-patches/2007-06/msg00204.php

> 2. Support to sort the heap tuples represented by one index tuple, in normal index scans, if we go with alternative 1. Or support to do binary searches over them, if we go with alternative 2 or 3. As the patch stands, the sorting is done within b-tree, but that's quite ugly.

--
Alexey Klyukin http://www.commandprompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] GIT patch
Alvaro Herrera wrote:
> Hmm, do say, doesn't it seem like the lack of feedback and the failed bitmap patch played against final development of this patch?

Yes :(. That's one reason why I tried to help with the review of that patch.

> At this point I feel like the patch still needs some work and reshuffling before it is in an acceptable state. The fact that there are some API changes for which the patch needs to be adjusted makes me feel like we should put this patch on hold for 8.4. So we would first get the API changes discussed and done and then adapt this patch to them.

I hate to say it but I agree. Getting the API changes discussed and committed was my plan in February/March. Unfortunately it didn't happen back then.

There's a few capabilities we need from the new API:

1. Support for candidate matches. Because a clustered index doesn't contain the key for every heap tuple, when you search for a value we don't know exactly which ones match. Instead, you get a bunch of candidate matches, which need to be rechecked after fetching the heap tuple. Oleg & Teodor pointed out that GiST could use the capability as well. I also proposed a while ago to change the hash index implementation so that it doesn't store the index key in the index, but just the hash of it. That again would need the support for candidate matches. And there's range-encoded bitmap indexes, if we implement them in a more distant future.

2. Support to sort the heap tuples represented by one index tuple, in normal index scans, if we go with alternative 1. Or support to do binary searches over them, if we go with alternative 2 or 3. As the patch stands, the sorting is done within b-tree, but that's quite ugly.

> Of the three proposals you suggest, I think the first one
>
>> 1. A grouped index tuple contains a bitmap of offsetnumbers, representing a bunch of heap tuples stored on the same heap page, that all have a key between the key stored on the index tuple and the next index tuple. We don't keep track of the ordering of the heap tuples represented by one group index tuple. When doing a normal, non-bitmap, index scan, they need to be sorted. This is what the patch currently implements.
>
> makes the most sense -- the index is kept simple and fast, and doing the sorting during an indexscan seems a perfectly acceptable compromise, knowing that the amount of tuples possibly returned for sort is limited by the heap blocksize.

The overhead is shown in the CPU test results, particularly in the select_range* tests, that I put up on the git web site: http://community.enterprisedb.com/git/. The other alternatives might be simpler, though.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
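The first alternative discussed above (a grouped index tuple carrying a bitmap of offset numbers, with candidate matches rechecked against the heap and sorted at scan time) can be illustrated with a short Python sketch. This is a toy model under stated assumptions, not the patch's code: `GroupedIndexTuple`, `offsets_in` and `scan_group` are hypothetical names, keys are plain integers, and the heap is a dictionary mapping (block, offset) to a key.

```python
from dataclasses import dataclass


@dataclass
class GroupedIndexTuple:
    key: int      # lowest key covered by this group
    block: int    # heap page that all grouped tuples live on
    offsets: int  # bitmap: bit n set means offset number n belongs to the group


def offsets_in(bitmap):
    """Expand the offset bitmap into a list of offset numbers."""
    return [n for n in range(bitmap.bit_length()) if bitmap >> n & 1]


def scan_group(group, heap, lo, hi):
    """Return the heap tuples of one group with lo <= key <= hi, in key order.

    The index does not store a key per heap tuple, so every offset in the
    bitmap is only a candidate match: the heap tuple is fetched and the
    condition rechecked. The group keeps no ordering either, so the
    survivors are sorted before being returned.
    """
    matches = []
    for off in offsets_in(group.offsets):
        tid = (group.block, off)
        key = heap[tid]               # fetch the heap tuple
        if lo <= key <= hi:           # recheck the scan condition
            matches.append((key, tid))
    matches.sort()                    # at most one heap page worth of tuples
    return matches
```

Because a group never spans more than one heap page, the sort input is bounded by the number of tuples that fit on a page, which is the compromise Alvaro refers to. The recheck step is also where the cost Tom raises for functional indexes would show up, since the key would have to be recomputed for each candidate.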