Re: [HACKERS] Adding new joining algorithm to postgresql
On Friday, July 19, 2013 7:17 PM tubadzin wrote:
> Hi. I'm a little confused.
> 1. I have the 9.2.4 source code from http://www.postgresql.org/ftp/source/
> 2. I want to add a new algorithm to the index nested loop join, merge join and hash join. I have the Executor directory in src containing nodeHash.c, nodeHashjoin.c, nodeMergejoin.c and nodeNestloop.c.
> 3. After the changes, I want to compile PostgreSQL and use it.
> 4. The problem is:
> a) I do not know which library is responsible for this functionality. I understand that I have to compile src and replace the library (I don't know which library) in the path where PostgreSQL is installed: C:\Program Files (x86)\PostgreSQL\9.2

I think you would need to copy postgres.exe. Ideally you need to copy all the libraries that changed due to your source code change. In the link below, you can even find how to create an installation from source.

> b) I don't know how to use the files/library (which library?) with Visual Studio 2010 and how to compile it.

Find the instructions for how to build on Windows at the link below:
http://www.postgresql.org/docs/devel/static/install-windows.html

With Regards,
Amit Kapila.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses
Hi

With your tests did you try to write the hot buffers first? i.e. buffers with a high refcount, either by sorting them on refcount or at least sweeping the buffer list in reverse?

In my understanding there's an 'impedance mismatch' between what PostgreSQL wants and what the OS offers. PostgreSQL wants a set of buffers, selected quickly at checkpoint start time, written to disk; but when it calls fsync() the OS only offers to write all the dirty buffers for the file. Not exactly the same contract: on a loaded server with checkpoint spreading, the difference can be big. In the worst case the checkpoint wants 8KB written and fsync writes 1GB.

There are 150 years of math behind control systems, up to Maxwell himself: adding energy (packets) at random to a control system whose actuators you can't measure does not make for good control.

By writing the buffers least likely to be recycled first, the OS may have less work to do at fsync time; hopefully they have been written by the OS background task during the spread and are not re-dirtied by other backends.

Didier
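The write-ordering idea above can be sketched in miniature. This is a hedged illustration, not backend code: the buffer fields, the refcount-as-hotness proxy, and the function name are assumptions for the sake of the example.

```python
# Sketch: hand a checkpoint's dirty buffers to the OS hottest-first
# (highest refcount), so the buffers least likely to be recycled soon
# get the whole checkpoint spread for background writeback to flush.
from dataclasses import dataclass

@dataclass
class Buffer:
    buf_id: int
    refcount: int      # stand-in for how "hot" the buffer is
    dirty: bool

def checkpoint_write_order(buffers):
    """Return dirty buffers sorted hottest-first."""
    dirty = [b for b in buffers if b.dirty]
    return sorted(dirty, key=lambda b: b.refcount, reverse=True)

buffers = [Buffer(0, 1, True), Buffer(1, 5, True),
           Buffer(2, 3, False), Buffer(3, 4, True)]
print([b.buf_id for b in checkpoint_write_order(buffers)])  # [1, 3, 0]
```

Whether this ordering actually helps depends, as the rest of the thread discusses, on how much pressure the OS write cache is under.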
[HACKERS] create index does change pages?
hello. Does a CREATE INDEX on a large table change the place of some tuples in some pages?
Re: [HACKERS] confusing typedefs in jsonfuncs.c
On Thu, 2013-07-18 at 21:34 -0400, Tom Lane wrote: Yeah, this is randomly different from everywhere else in PG. The more usual convention if you want typedefs for both the struct and the pointer type is that the pointer type is FooBar and the struct type is FooBarData. I think that is more useful if you have a really good abstraction around FooBar, like for Relation, for example. That's not really the case here. This is more like node types, perhaps.
Re: [HACKERS] create index does change pages?
On 20.07.2013 12:56, mohsen soodkhah mohammadi wrote:
> hello. Does a CREATE INDEX on a large table change the place of some tuples in some pages?

No, creating an index doesn't move any tuples in the heap.

- Heikki
Re: [HACKERS] create index does change pages?
Sent from my iPad

On 20-Jul-2013, at 15:26, mohsen soodkhah mohammadi mohsensoodk...@gmail.com wrote:
> hello. Does a CREATE INDEX on a large table change the place of some tuples in some pages?

No, making an index involves keeping some metadata about your tables in an auxiliary data structure. It will require some extra memory for the auxiliary data structure, but there is no moving of the actual tuples.

Regards,
Atri
Re: [HACKERS] [v9.4] row level security
2013/7/19 Stephen Frost sfr...@snowman.net:
> * Greg Smith (g...@2ndquadrant.com) wrote:
>> On 7/18/13 7:57 PM, Karol Trzcionka wrote:
>>> Current head 4cbe3ac3e86790d05c569de4585e5075a62a9b41 - the patch applies correctly (only a change needed in parallel_schedule). However it fails on its own regression tests (other tests pass).
>>
>> I got a rejected hunk in src/backend/nodes/nodeFuncs.c as well as that parallel_schedule issue. Maybe you didn't get the nodeFuncs change but didn't notice that? That might explain why the tests didn't work for you either.
>
> The nodeFuncs.c hunk seems likely to have been impacted by the patch I committed today (WITH CHECK OPTION), so I doubt that was the issue.
>
>> Attached is an updated patch where I tried to only fix the two small hunks of bit rot. I get "All 140 tests passed" here, on a Mac no less.
>
> Thanks for updating the patch, I ran into the failed hunks too and expected to have to deal with them. :)

Thanks for pointing out this problem. I synchronized my local master with the upstream one, then adjusted the row-security branch. I'll submit the patch soon, with fixes for the portion that Tom pointed out.

Thanks,
Re: [HACKERS] [v9.4] row level security
2013/7/19 Dean Rasheed dean.a.rash...@gmail.com: On 19 July 2013 04:09, Greg Smith g...@2ndquadrant.com wrote: On 7/18/13 11:03 PM, Stephen Frost wrote: Wasn't there a wiki page about this feature which might also help? Bigger question- is it correct for the latest version of the patch? https://wiki.postgresql.org/wiki/RLS has accumulated a lot of older debris that may or may not be useful here. This useful piece was just presented at PGCon: http://www.pgcon.org/2013/schedule/attachments/273_PGcon2013-kaigai-row-level-security.pdf That is very up to date intro to the big picture issues. Hi, I took a look at this patch too. I didn't read all the code in detail, but there was one area I wondered about --- is it still necessary to add the new field rowsec_relid to RangeTblEntry? As far as I can see, it is only used in the new function getrelid() which replaces the old macro of the same name, so that it can work if it is passed the index of the sourceRelation subquery RTE instead of the resultRelation. This seems to be encoding a specific assumption that a subquery RTE is always going to come from row-level security expansion. Is it the case that this can only happen from expand_targetlist(), in which case why not pass both source_relation and result_relation to expand_targetlist(), which I think will make that code neater. As it stands, what expand_targetlist() sees as result_relation is actually source_relation, which getrelid() then handles specially. Logically I think expand_targetlist() should pass the actual result_relation to getrelid(), opening the target table, and then pass source_relation to makeVar() when building new TLEs. With that change to expand_targetlist(), the change to getrelid() may be unnecessary, and then also the new rowsec_relid field in RangeTblEntry may not be needed. Hmm. I didn't have this idea. It seems to me fair enough and kills necessity to enhance RangeTblEntry and getrelid() indeed. 
I'll try to fix up this implementation according to your suggestion.

Thanks,
--
KaiGai Kohei kai...@kaigai.gr.jp
[HACKERS] Error compiling Postgresql source code in Visual Studio 2010
Hi.

1. I downloaded the PostgreSQL 9.2.4 source code from http://www.postgresql.org/ftp/source/
2. Next, I downloaded ActivePerl and installed it. Then I used the Visual Studio x64 Win64 command prompt (2010) to get into the source path C:\Postgresql-9-2-4\src\tools\msvc, and I ran the build command as in the instructions below:
http://www.postgresql.org/docs/9.0/static/install-windows-full.html#AEN23979
3. OK, after 3 minutes the source code compiled and I have a Visual Studio solution with 112 projects; I was so optimistic :)
C:\Postgresql-9-2-4\pgsql.sln
4. And now I have a big problem :( I want to add a new algorithm to nodeHash.c in the postgres project in Visual Studio and later compile the source code. But at the beginning I didn't implement anything. On start, Visual Studio generates 1473 errors like these (can I ignore these errors and still compile? Maybe some settings in Visual Studio are required?):

Error 1498 IntelliSense: identifier "BLCKSZ" is undefined  C:\Postgresql-9-2-4\src\include\access\relscan.h 53 28
Error 35 error C2466: cannot allocate an array of constant size 0  C:\Postgresql-9-2-4\src\backend\access\gin\ginentrypage.c 568 1 postgres
Error 675 error C2229: struct 'SpGistScanOpaqueData' has an illegal zero-sized array  C:\Postgresql-9-2-4\src\include\access\spgist_private.h 158 1 postgres
Error 180 error C2229: struct 'GISTScanOpaqueData' has an illegal zero-sized array  C:\Postgresql-9-2-4\src\include\access\gist_private.h 156 1 postgres
Error 402 error C2229: struct '<unnamed-tag>' has an illegal zero-sized array  C:\Postgresql-9-2-4\src\backend\access\heap\pruneheap.c 40 1 postgres
Error 1159 error C2198: '_tarWriteHeader' : too few arguments for call  C:\Postgresql-9-2-4\src\backend\replication\basebackup.c 758 1 postgres
Error 1009 error C2146: syntax error : missing ')' before identifier 'PG_MAJORVERSION'  C:\Postgresql-9-2-4\src\backend\catalog\catalog.c 189 1 postgres
Error 51 error C2133: 'vector' : unknown size  C:\Postgresql-9-2-4\src\backend\access\gin\gindatapage.c 481 1 postgres
Error 722 error C2133: 'toPlaceholder' : unknown size  C:\Postgresql-9-2-4\src\backend\access\spgist\spgvacuum.c 131 1 postgres
Error 1438 error C2099: initializer is not a constant  C:\Postgresql-9-2-4\src\backend\utils\misc\guc.c 2203 1 postgres
Error 885 error C2065: 'XLOG_SEG_SIZE' : undeclared identifier  C:\Postgresql-9-2-4\src\backend\access\transam\xlogfuncs.c 542 1 postgres
Error 990 error C2065: 'XLOG_BLCKSZ' : undeclared identifier  C:\Postgresql-9-2-4\src\backend\access\transam\xlog.c 4053 1 postgres
Error 1387 error C2065: 'FLOAT8PASSBYVAL' : undeclared identifier  C:\Postgresql-9-2-4\src\backend\utils\adt\float.c 1906 1 postgres
Error 13 error C2065: 'BLCKSZ' : undeclared identifier  C:\Postgresql-9-2-4\src\backend\access\gin\ginbtree.c 248 1 postgres
Error 1467 error C1189: #error : BLCKSZ must be between 1KB and 1MB  C:\Postgresql-9-2-4\src\backend\utils\misc\guc.c 4809 1 postgres
Error 1147 error C1083: Cannot open source file: 'src\backend\replication\repl_gram.c': No such file or directory  C:\Postgresql-9-2-4\c1 postgres
Error 892 error C1083: Cannot open source file: 'src\backend\bootstrap\bootparse.c': No such file or directory  C:\Postgresql-9-2-4\c1 postgres
Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses
On 7/20/13 4:48 AM, didier wrote:
> With your tests did you try to write the hot buffers first? ie buffers with a high refcount, either by sorting them on refcount or at least sweeping the buffer list in reverse?

I never tried that version. After a few rounds of seeing that all changes I tried were just rearranging the good and bad cases, I got pretty bored with trying new changes in that same style.

> by writing the buffers least likely to be recycled first it may have less work to do at fsync time, hopefully they have been written by the OS background task during the spread and are not re-dirtied by other backends.

That is the theory. In practice write caches are so large now, there is almost no pressure forcing writes to happen until the fsync calls show up. It's easily possible to enter the checkpoint fsync phase only to discover there are 4GB of dirty writes ahead of you, ones that have nothing to do with the checkpoint's I/O.

Backends are constantly pounding the write cache with new writes in situations with checkpoint spikes. The writes and fsync calls made by the checkpoint process are only a fraction of the real I/O going on. The volume of data being squeezed out by each fsync call is based on total writes to that relation since the checkpoint. That's connected to the writes to that relation happening during the checkpoint, but the checkpoint writes can easily be the minority there.

It is not a coincidence that the next feature I'm working on attempts to quantify the total writes to each 1GB relation chunk. That's the most promising path forward on the checkpoint problem I've found.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support   www.2ndQuadrant.com
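The per-chunk accounting Greg describes can be sketched roughly. This is a hedged Python illustration, not the actual feature: BLCKSZ and RELSEG_SIZE are set to PostgreSQL's defaults, and the function and relation names are made up.

```python
# Tally writes per 1GB relation segment (RELSEG_SIZE blocks of BLCKSZ
# bytes), i.e. per file that a checkpoint-time fsync() would hit.
from collections import Counter

BLCKSZ = 8192
RELSEG_SIZE = 131072            # blocks per 1GB segment at 8kB blocks

writes = Counter()

def record_write(relation, block_num):
    segment = block_num // RELSEG_SIZE
    writes[(relation, segment)] += 1

for blk in (0, 5, 131071, 131072, 262144):
    record_write("pgbench_accounts", blk)

# bytes written since the checkpoint, grouped by segment file
for (rel, seg), n in sorted(writes.items()):
    print(f"{rel}.{seg}: {n * BLCKSZ} bytes")
```

A counter like this makes visible how much data each fsync call will actually squeeze out, rather than just how many buffers the checkpointer itself wrote.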
Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses
On 20.07.2013 19:28, Greg Smith wrote:
> On 7/20/13 4:48 AM, didier wrote:
>> With your tests did you try to write the hot buffers first? ie buffers with a high refcount, either by sorting them on refcount or at least sweeping the buffer list in reverse?
>
> I never tried that version. After a few rounds of seeing that all changes I tried were just rearranging the good and bad cases, I got pretty bored with trying new changes in that same style.

It doesn't seem like we're getting anywhere with minor changes to the existing logic. The reason I brought up sorting the writes in the first place is that it allows you to fsync() each segment after it's written, rather than doing all the writes first, and then fsyncing all the relations. Mitsumasa-san, since you have the test rig ready, could you try the attached patch please? It scans the buffer cache several times, writing out all the dirty buffers for segment A first, then fsyncs it, then all dirty buffers for segment B, and so on.

The patch is ugly, but if it proves to be helpful, we can spend the time to clean it up.

- Heikki

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 8079226..14149a9 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -1210,10 +1210,15 @@ static void
 BufferSync(int flags)
 {
 	int			buf_id;
-	int			num_to_scan;
+	int			buf_id_start;
 	int			num_to_write;
 	int			num_written;
 	int			mask = BM_DIRTY;
+	bool		target_chosen = false;
+	RelFileNode	target_rnode = {0, 0, 0};
+	ForkNumber	target_forknum = 0;
+	int			target_segno = 0;
+	int			target_bufs = 0;
 
 	/* Make sure we can handle the pin inside SyncOneBuffer */
 	ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
@@ -1275,10 +1280,10 @@ BufferSync(int flags)
 	 * Note that we don't read the buffer alloc count here --- that should be
 	 * left untouched till the next BgBufferSync() call.
 	 */
-	buf_id = StrategySyncStart(NULL, NULL);
-	num_to_scan = NBuffers;
+	buf_id_start = 0;
 	num_written = 0;
-	while (num_to_scan-- > 0)
+	target_chosen = false;
+	for (buf_id = 0; buf_id < NBuffers; buf_id++)
 	{
 		volatile BufferDesc *bufHdr = BufferDescriptors[buf_id];
 
@@ -1294,7 +1299,12 @@ BufferSync(int flags)
 		 * write the buffer though we didn't need to. It doesn't seem worth
 		 * guarding against this, though.
 		 */
-		if (bufHdr->flags & BM_CHECKPOINT_NEEDED)
+		/* the above reasoning applies to the target checks too */
+		if ((!target_chosen ||
+			 (RelFileNodeEquals(target_rnode, bufHdr->tag.rnode) &&
+			  target_forknum == bufHdr->tag.forkNum &&
+			  target_segno == bufHdr->tag.blockNum / RELSEG_SIZE)) &&
+			bufHdr->flags & BM_CHECKPOINT_NEEDED)
 		{
 			if (SyncOneBuffer(buf_id, false) & BUF_WRITTEN)
 			{
@@ -1320,11 +1330,34 @@ BufferSync(int flags)
 				 * Sleep to throttle our I/O rate.
 				 */
 				CheckpointWriteDelay(flags, (double) num_written / num_to_write);
+
+				/* Find other buffers belonging to this segment */
+				if (!target_chosen)
+				{
+					LockBufHdr(bufHdr);
+					target_rnode = bufHdr->tag.rnode;
+					target_forknum = bufHdr->tag.forkNum;
+					target_segno = bufHdr->tag.blockNum / RELSEG_SIZE;
+					UnlockBufHdr(bufHdr);
+					target_bufs = 0;
+					buf_id_start = buf_id;
+					target_chosen = true;
+				}
+				target_bufs++;
 			}
 		}
-		if (++buf_id >= NBuffers)
-			buf_id = 0;
+		if (buf_id == NBuffers - 1 && target_chosen)
+		{
+			if (log_checkpoints)
+				elog(LOG, "checkpoint sync: wrote %d buffers for target, syncing",
+					 target_bufs);
+
+			smgrsyncrel(target_rnode, target_forknum, target_segno);
+			target_chosen = false;
+			/* continue the scan where we left before we chose the target */
+			buf_id = buf_id_start;
+		}
 	}
 
 /*
@@ -1825,10 +1858,11 @@ CheckPointBuffers(int flags)
 {
 	TRACE_POSTGRESQL_BUFFER_CHECKPOINT_START(flags);
 	CheckpointStats.ckpt_write_t = GetCurrentTimestamp();
+	smgrsyncbegin();
 	BufferSync(flags);
 	CheckpointStats.ckpt_sync_t = GetCurrentTimestamp();
 	TRACE_POSTGRESQL_BUFFER_CHECKPOINT_SYNC_START();
-	smgrsync();
+	smgrsyncend();
 	CheckpointStats.ckpt_sync_end_t = GetCurrentTimestamp();
 	TRACE_POSTGRESQL_BUFFER_CHECKPOINT_DONE();
 }
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index e629181..4e4eb03 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -187,6 +187,7 @@ static MdfdVec *_mdfd_getseg(SMgrRelation reln, ForkNumber forkno,
 			 BlockNumber blkno, bool skipFsync, ExtensionBehavior behavior);
 static BlockNumber _mdnblocks(SMgrRelation reln, ForkNumber forknum,
 			   MdfdVec *seg);
+static void mdsyncguts(PendingOperationEntry *entry, ForkNumber forknum, int segno);
 
 /*
@@ -235,7 +236,8 @@ SetForwardFsyncRequests(void)
 	/* Perform any pending fsyncs we may have queued up, then drop table */
 	if (pendingOpsTable)
 	{
-		mdsync();
+		mdsyncbegin();
+		mdsyncend();
 		hash_destroy(pendingOpsTable);
 	}
 	pendingOpsTable = NULL;
@@ -970,27 +972,18
[HACKERS] InvokeObjectPostAlterHook() vs. CommandCounterIncrement()
objectaccess.h has this to say: * OAT_POST_ALTER should be invoked just after the object is altered, * but before the command counter is incremented. An extension using the * hook can use SnapshotNow and SnapshotSelf to get the old and new * versions of the tuple. That's a clever design, but it precludes CommandCounterIncrement() appearing between an applicable catalog change and the InvokeObjectPostAlterHook() call. Otherwise, both SnapshotNow and SnapshotSelf see the new version. I find at least one violation of that requirement. AlterTSConfiguration() calls makeConfigurationDependencies(), which typically issues a CCI, before InvokeObjectPostAlterHook(). A second example, arguably harmless, is AlterEnum(). It can get a CCI through RenumberEnumType(), so SnapshotNow sees the post-renumbered entries while SnapshotSelf sees those plus the added one. sepgsql does not currently intercede for text search configurations or enums, so only a third-party object_access_hook consumer could encounter the problem. This is a dangerously-easy problem to introduce; unrelated development could add a CCI to some intermediate code in a DDL implementation and quietly break object_access_hook consumers. Can we detect that mistake? One possibility is to call a macro, say StartObjectAlter(), before the first catalog modification associated with a particular hook invocation. This macro would stash the current CID. Then, RunObjectPostAlterHook() would elog(ERROR) if the CID had changed since that moment. But how would one actually fix the code after such a check fires? It's rarely easy to avoid adding a CCI. I'm thinking we would be better off saying the firing point needs to appear right after the heap_{insert,update,delete} + CatalogUpdateIndexes(). The hooks already have the form of applying to a single catalog row rather than a complete DDL operation, and many of the firing sites are close to that placement already. 
If these hooks will need to apply to a larger operation, I think that mandates a different means to reliably expose the before/after object states.

Thanks,
nm
--
Noah Misch  EnterpriseDB  http://www.enterprisedb.com
Re: [HACKERS] updated emacs configuration
On Thu, Jun 27, 2013 at 10:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> AFAIR, no one has ever done a serious comparison to anything except GNU indent, and (at least at the time) it seemed to have bugs as bad as pgindent's, just different ones. I'm certainly open to another choice as long as we don't lose on portability of the tool. But who will do the legwork to test something else?

Fwiw I played with clang-format a bit the other day. But I couldn't get it to resemble our coding style. It was strongly of the opinion that spacing within lines should be completely consistent, and that meant it eliminated all the spaces that lined up initializers and such. In theory this tool is promising though, since it works by looking at the llvm bytecode to determine what the real syntax is. It should be able to handle the typedef issues we have with most of the tools.

-- greg
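For reference, a `.clang-format` along these lines is roughly where such an experiment would start. This is a hedged sketch: the option names below are from clang-format releases later than the 2013-era tool discussed here, and, as Greg observed, it still does not reproduce pgindent behaviors like column-aligned declarations exactly.

```yaml
# Approximation of PostgreSQL style -- not an exact pgindent substitute.
BasedOnStyle: LLVM
IndentWidth: 4
TabWidth: 4
UseTab: Always
ColumnLimit: 79
BreakBeforeBraces: Allman
AlignConsecutiveDeclarations: true
```

In practice the alignment of struct initializers and local variable declarations is where such configs diverge most visibly from pgindent output.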
[HACKERS] proposal - psql - show longest tables
Hello all,

I very often use slightly adjusted psql system queries to detect the top N largest tables. I think this can be useful for all users, so I propose a few new commands:

\dts [N|size] ... show the N largest tables | show tables larger than size, ordered by size
\dis [N|size] ... show the N largest indexes | show indexes larger than size, ordered by size
\dtst [N|size] ... show the N largest tables by total size | show tables whose total size is larger than size, ordered by total size
\dtr [N] ... show the N largest tables (ordered by rows)

example:

\dts 10 --- top 10 tables ordered by size
\dts 10MB -- tables larger than 10MB ordered by size

 Schema |    Name     | Type  |    Owner    | Size
--------+-------------+-------+-------------+-------
 public | eshop_users | table | eshop_owner | 16 kB
 public | zamestnanci | table | eshop_owner | 16 kB

What do you think about this proposal? Comments, notes?

Regards

Pavel Stehule
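The N-vs-size argument form could be disambiguated as sketched below. This is a hedged Python illustration of the parsing logic only; the function name, accepted units, and grammar are assumptions, not part of Pavel's proposal.

```python
# A bare integer means "top N"; an integer with a unit suffix means
# "only objects larger than this size".
import re

UNITS = {"kB": 1024, "MB": 1024**2, "GB": 1024**3}

def parse_dts_arg(arg):
    """Return ('limit', n) or ('threshold', size_in_bytes)."""
    m = re.fullmatch(r"(\d+)\s*(kB|MB|GB)?", arg.strip())
    if not m:
        raise ValueError(f"unrecognized argument: {arg!r}")
    n, unit = int(m.group(1)), m.group(2)
    if unit is None:
        return ("limit", n)
    return ("threshold", n * UNITS[unit])

print(parse_dts_arg("10"))      # ('limit', 10)
print(parse_dts_arg("10MB"))    # ('threshold', 10485760)
```

The two result shapes map naturally onto a LIMIT clause versus a WHERE filter on pg_relation_size() in the generated catalog query.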
Re: [HACKERS] Error compiling Postgresql source code in Visual Studio 2010
On Saturday, July 20, 2013 9:53 PM tubadzin wrote:
> Hi. 1. I downloaded the PostgreSQL 9.2.4 source code from http://www.postgresql.org/ftp/source/ 2. Next, I downloaded ActivePerl and installed it. Then I used the Visual Studio x64 Win64 command prompt (2010) to get into the source path C:\Postgresql-9-2-4\src\tools\msvc and ran the build command as in the instructions below. http://www.postgresql.org/docs/9.0/static/install-windows-full.html#AEN23979 3. OK, after 3 minutes the source code compiled and I have a Visual Studio solution with 112 projects; I was so optimistic :) C:\Postgresql-9-2-4\pgsql.sln 4. And now I have a big problem :( I want to add a new algorithm to nodeHash.c in the postgres project in Visual Studio and later compile the source code. But at the beginning I didn't implement anything. On start, Visual Studio generates 1473 errors like this (can I ignore these errors and still compile? Maybe some settings in Visual Studio are required?):

If the build works for you from the command prompt, then change the source code and build again from the command prompt; your changes will get compiled. However, it should build from Visual Studio as well. Can you check by adding the perl\bin path to the VC++ Directories for executable files (Tools->Options->Projects and Solutions->VC++ Directories)? Also check whether the platform SDK is present in the path, and the paths to the bison and flex executables.

With Regards,
Amit Kapila.
Re: [HACKERS] Re: Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls
On Fri, 2013-07-19 at 09:39 -0700, Josh Berkus wrote: So ... are you doing a final review of this for the CF, Jeff? We need to either commit it or bounce it to the next CF. I am going on vacation tomorrow, and I just didn't quite find time to take this to commit. Sorry about that, Nicholas. The patch improved a lot this CF though, so we'll get it in quickly and I don't foresee any problem with it making it in 9.4. (For that matter, am I not supposed to commit between 'fests? Or is it still an option for me to finish up with this after I get back even if we close the CF?) Regards, Jeff Davis