Re: [HACKERS] Development with Eclipse - Wrong error messages in IDE
> On Feb 3, 2016, at 2:38 AM, Peter Moser <pitiz...@gmail.com> wrote:
>
> Has anyone had similar problems? Do I have to configure Eclipse to
> understand the PG_RMGR macro, or is there another way to teach Eclipse
> these macros?

I just built 9.6 under Eclipse CDT to try this out and was able to open e.g. heapam.c without any error markers.

I added PostgreSQL as a “Makefile Project with Existing Code” after running ./configure from the command line. After that, I built the project from within Eclipse by adding the ‘all’ make target and running it.

One setting I usually change: right-click the project, pick Properties, then drill down through C/C++ General -> Preprocessor Include Paths. In the Provider pane there is an entry for “CDT GCC Build Output Parser”. I’m not sure this is strictly necessary, but I set its “Container to keep discovered entries” setting to “File”. Basically, Eclipse scans the make output for -I flags and notes the includes used to build each file, so the static analyzer, etc. can have more accurate information. (It is crucial that the “Compiler command pattern” in this window be a regex that matches the compiler binary you actually use: if the build invokes /usr/local/bin/gcc and the pattern is just “gcc”, you are in for trouble.)

After running the build, Eclipse should know which includes are used for each file and stop whining. If it ever seems to have problems, you can kick it by running a clean target, then all, then picking “Project -> C/C++ Index -> Rebuild” (I think).

--
Jason Petersen
Software Engineer | Citus Data
303.736.9255
ja...@citusdata.com
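P.S. For what it’s worth, a compiler command pattern along these lines matches both bare command names and full paths (this regex is my own illustration, not CDT’s shipped default — adjust it for whatever compilers your build actually invokes):

    (.*/)?(gcc|g\+\+|cc|c\+\+|clang)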
[HACKERS] Query Deparsing Support
The DDL deparse support that just landed looks impressive, but I’ve needed query deparsing for some time now. Within pg_shard (https://github.com/citusdata/pg_shard) we took the fast-and-dirty approach of merging in a modified ruleutils.c, though I’d (obviously) like to get away from that.

Many of ruleutils.c’s functions already have a public interface; would there be any objection to exposing get_query_def to provide a way to turn a Query back into an SQL string? A more structured approach analogous to the new DDL stuff would make my life easier, but simply having the Query-to-SQL function would be a huge improvement for now.

I trawled through the archives but couldn’t find any discussion of why this function was kept static.

--
Jason Petersen
Software Engineer | Citus Data
303.736.9255
ja...@citusdata.com
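P.S. To make the ask concrete: something like the function below, living in ruleutils.c, is all pg_shard would need. The name pg_get_querydef is my invention, and I’m going from memory on get_query_def’s parameter list, so treat this strictly as a sketch:

char *
pg_get_querydef(Query *query)
{
    StringInfoData buf;

    initStringInfo(&buf);

    /*
     * get_query_def() is static in ruleutils.c today; this assumes its
     * current internal signature (query, output buffer, parent namespace
     * list, result tupdesc, pretty flags, wrap column, start indent).
     */
    get_query_def(query, &buf, NIL, NULL, 0, WRAP_COLUMN_DEFAULT, 0);

    return buf.data;
}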
[HACKERS] BuildTupleFromCStrings Memory Documentation?
Within the core codebase, BuildTupleFromCStrings is often called within a temporary memory context cleared after the call. In dblink.c, this is justified as being needed to “[clean up] not only the data we have direct access to, but any cruft the I/O functions might leak”.

I wrote a pretty minimal case to call BuildTupleFromCStrings in a loop (attached) and found that I was using 40GB of RAM in a few minutes, though I was not allocating any memory myself and immediately freed the tuple it returned.

Is the need to wrap this call in a protective context documented anywhere? Portions of the documentation use BuildTupleFromCStrings in examples without mentioning this precaution. Is it just well-known, or did I miss a README or comment somewhere?

--
Jason Petersen
Software Engineer | Citus Data
303.736.9255
ja...@citusdata.com

Attachment: tup_memleak.c
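P.S. For the record, wrapping the call the way dblink does did fix my test case. The pattern, roughly (the function here and its arguments are invented for illustration):

#include "postgres.h"
#include "access/htup_details.h"
#include "funcapi.h"
#include "utils/memutils.h"

/*
 * Build many tuples without bloating: run each BuildTupleFromCStrings()
 * call inside a short-lived context, then reset it once the tuple has
 * been copied out or otherwise consumed.
 */
static void
build_tuples_safely(AttInMetadata *attinmeta, char **values, int ntuples)
{
    MemoryContext tupcontext;
    MemoryContext oldcontext;
    int         i;

    tupcontext = AllocSetContextCreate(CurrentMemoryContext,
                                       "tuple build context",
                                       ALLOCSET_DEFAULT_MINSIZE,
                                       ALLOCSET_DEFAULT_INITSIZE,
                                       ALLOCSET_DEFAULT_MAXSIZE);

    for (i = 0; i < ntuples; i++)
    {
        HeapTuple   tuple;

        oldcontext = MemoryContextSwitchTo(tupcontext);
        tuple = BuildTupleFromCStrings(attinmeta, values);
        MemoryContextSwitchTo(oldcontext);

        /* ... copy or consume the tuple here; it lives in tupcontext ... */

        /* frees the tuple *and* any cruft the input functions leaked */
        MemoryContextReset(tupcontext);
    }

    MemoryContextDelete(tupcontext);
}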
[HACKERS] Clarification of FDW API Documentation
I've been deep in the FDW APIs lately and have come up with a couple of questions about the [user-facing documentation][1].

# Requirement that DELETE use junk columns

The bit I'm confused by is the parenthetical at the end of the section on `AddForeignUpdateTargets`:

> If the AddForeignUpdateTargets pointer is set to NULL, no extra target
> expressions are added. (This will make it impossible to implement DELETE
> operations, though UPDATE may still be feasible if the FDW relies on an
> unchanging primary key to identify rows.)

Later on, the section on `ExecForeignDelete` says (emphasis mine):

> The junk column(s) **must** be used to identify the tuple to be deleted.

Why is this a requirement? At the moment, `postgres_fdw` asks remote machines for the `ctid` of tuples during a scan so it can use that `ctid` to create a targeted `DELETE` during `ExecForeignDelete`, but I think an alternative could avoid the use of the `ctid` junk column altogether...

Imagine if `BeginForeignScan` set up a remote cursor and `IterateForeignScan` fetched _one tuple at a time_ (unlike the current behavior, where tuples are fetched in batches). The tuple would be passed to `ExecForeignDelete` (as is required), but the remote cursor would remain positioned on that tuple. Couldn't `ExecForeignDelete` just run `DELETE FROM table WHERE CURRENT OF cursor` to delete that tuple? A sketch of this idea follows at the end of this mail.

Even if there is no guarantee that `IterateForeignScan` is called exactly once before each `ExecForeignDelete` call (which would remove the ability to have them cooperate using this single cursor), one could easily devise other storage backends that don't need junk columns to perform `DELETE` operations. So why the strong language around this functionality?

# Examples of `NULL` return after modification

Each of the `ExecForeign`- functions needs to return a tuple representing the row inserted, deleted, or modified. But each function's documentation contains an aside similar to this:

> The return value is either a slot containing the data that was actually
> inserted (this might differ from the data supplied, for example as a
> result of trigger actions), or NULL if no row was actually inserted
> (again, typically as a result of triggers).

Is this even accurate in PostgreSQL 9.3? Can triggers fire against foreign tables? If so, can someone provide an example where the foreign scan has found a tuple, passed it to `ExecForeignDelete`, and then no delete takes place (i.e. `ExecForeignDelete` returns `NULL`)? As far as I can reason, if the foreign scan has found a tuple, the update and delete actions need to do _something_ with it.

Maybe I'm missing something.

--Jason

[1]: http://www.postgresql.org/docs/9.3/static/fdw-callbacks.html
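P.S. Here's the sketch promised above. Everything named Example- is invented (the state struct, the remote-execution helper); it's meant only to show the shape of a junk-column-free DELETE against a remote cursor:

#include "postgres.h"
#include "executor/tuptable.h"
#include "nodes/execnodes.h"

/* Hypothetical per-relation FDW state, carried over from the scan. */
typedef struct ExampleFdwModifyState
{
    void       *conn;            /* connection to the remote server */
    const char *remote_relname;  /* qualified remote relation name */
    const char *cursor_name;     /* cursor opened by BeginForeignScan */
} ExampleFdwModifyState;

extern void example_exec_remote(void *conn, const char *sql);   /* made up */

/*
 * Delete the row the remote cursor is currently positioned on. Because
 * IterateForeignScan fetched exactly one row, the cursor still points at
 * the tuple in "slot" -- no ctid junk column required.
 */
static TupleTableSlot *
example_ExecForeignDelete(EState *estate,
                          ResultRelInfo *rinfo,
                          TupleTableSlot *slot,
                          TupleTableSlot *planSlot)
{
    ExampleFdwModifyState *fmstate = rinfo->ri_FdwState;
    char        sql[1024];

    snprintf(sql, sizeof(sql), "DELETE FROM %s WHERE CURRENT OF %s",
             fmstate->remote_relname, fmstate->cursor_name);
    example_exec_remote(fmstate->conn, sql);

    return slot;                /* report the deleted row */
}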
Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?
> On Apr 18, 2014, at 1:51 PM, Atri Sharma <atri.j...@gmail.com> wrote:
>
> Counting clock sweeps is an interesting idea. I think one concern was
> tracking hot buffers in cases where there is no memory pressure, and
> hence the clock sweep isn't running --- I am not sure how this would
> help in that case.

Yes, we obviously want a virtual clock. Focusing on the use of gettimeofday seems silly to me: it was something quick for the prototype.

The problem with the clock sweeps is that they don’t actually track the progression of “time” within the PostgreSQL system. What’s wrong with using a transaction number or some similar sequence? It would accurately track “age” in the sense we care about: how long ago, in units of real work being done by the DB, something was added.

—Jason
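P.S. To illustrate what I mean by a virtual clock — sketch only: the stamp struct is invented (BufferDesc has no such field), and the calls assume the atomics API from Andres’s in-progress work:

#include "postgres.h"
#include "port/atomics.h"

/*
 * A "virtual clock": a shared counter bumped once per buffer allocation
 * (or per transaction) instead of reading gettimeofday(). Counter
 * initialization is elided.
 */
static pg_atomic_uint64 virtual_clock;

typedef struct ExampleBufferStamp
{
    uint64      last_touched;   /* virtual time of last access */
} ExampleBufferStamp;

static inline void
stamp_buffer(ExampleBufferStamp *stamp)
{
    stamp->last_touched = pg_atomic_fetch_add_u64(&virtual_clock, 1);
}

/* "Age" in units of real work: allocations since this buffer was touched. */
static inline uint64
buffer_age(ExampleBufferStamp *stamp)
{
    return pg_atomic_read_u64(&virtual_clock) - stamp->last_touched;
}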
[HACKERS] Buffer Allocation Concurrency Limits
In December, Metin (a coworker of mine) discussed an inability to scale a simple task (parallel scans of many independent tables) to many cores (it’s here). As a ramp-up task at Citus I was tasked with figuring out what the heck was going on here. I have a pretty extensive writeup here (whose length is more a result of my inexperience with the workings of PostgreSQL than anything else) and was looking for some feedback.

In short, my conclusion is that a working set larger than memory results in backends piling up on BufFreelistLock. As much as possible I removed anything that could be blamed for this:

  * Hyper-Threading is disabled
  * zone reclaim mode is disabled
  * numactl was used to ensure interleaved allocation
  * kernel.sched_migration_cost was set high to disable migration
  * kernel.sched_autogroup_enabled was disabled
  * transparent hugepage support was disabled

For a way forward, I was thinking the buffer allocation sections could use some of the atomics Andres added here. Rather than workers grabbing BufFreelistLock to iterate the clock hand until they find a victim, the algorithm could be rewritten in a lock-free style, allowing workers to move the clock hand in tandem. Alternatively, the clock iteration could be moved off to a background process, similar to what Amit Kapila proposed here.

Is this assessment accurate? I know 9.4 changes a lot about lock organization, but last I looked I didn’t see anything that could alleviate this contention: are there any plans to address this?

—Jason
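P.S. To be concrete about the lock-free idea — a sketch only; the struct is an invented stand-in for BufferStrategyControl, and the calls assume the atomics API from Andres’s patch:

#include "postgres.h"
#include "port/atomics.h"
#include "storage/bufmgr.h"     /* for NBuffers */

/* Invented: nextVictimBuffer recast as an atomic, out from under the lock. */
typedef struct ExampleStrategyControl
{
    pg_atomic_uint32 nextVictimBuffer;
} ExampleStrategyControl;

static ExampleStrategyControl *StrategyCtl;     /* lives in shared memory */

/*
 * Advance the clock hand without BufFreelistLock: each backend claims a
 * distinct tick, so concurrent sweepers never serialize on the hand
 * itself (usage-count checks on each victim still happen per buffer).
 */
static inline int
ClockSweepTick(void)
{
    uint32      victim;

    victim = pg_atomic_fetch_add_u32(&StrategyCtl->nextVictimBuffer, 1);
    return (int) (victim % NBuffers);
}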
Re: [HACKERS] Move unused buffers to freelist
Bump. I’m interested in many of the issues that were discussed in this thread. Was this patch ever wrapped up (I can’t find it in any CF), or did this thread die off?

—Jason

> On Aug 6, 2013, at 12:18 AM, Amit Kapila <amit.kap...@huawei.com> wrote:
>
> On Friday, June 28, 2013 6:20 PM Robert Haas wrote:
>> On Fri, Jun 28, 2013 at 12:52 AM, Amit Kapila <amit.kap...@huawei.com> wrote:
>>> Currently it wakes up based on the bgwriter_delay config parameter,
>>> which is by default 200ms, so you mean we should think of waking up
>>> bgwriter based on allocations and the number of elements left in the
>>> freelist?
>>
>> I think that's what Andres and I are proposing, yes.
>
>>> As per my understanding, a summarization of the points raised by you
>>> and Andres which this patch should address to have a bigger win:
>>>
>>> 1. Bgwriter needs to be improved so that it can help in reducing usage
>>> count and finding the next victim buffer (run the clock sweep and add
>>> buffers to the free list).
>>
>> Check.
>
> I think one way to handle it is that while moving buffers to the
> freelist, if we find that there are not enough buffers (<= high
> watermark) which have zero usage count, then move through the buffer
> list and reduce usage count. Now here I think it is important how we
> decide how many times we should circulate the buffer list to reduce
> usage count. Currently I have kept it proportional to the number of
> times it failed to move enough buffers to the freelist.
>
>>> 2. SetLatch for bgwriter (wake up bgwriter) when elements in the
>>> freelist are low.
>>
>> Check. The way to do this is to keep a variable in shared memory in the
>> same cache line as the spinlock protecting the freelist, and update it
>> when you update the free list.
>
> Added a new variable freelistLatch in BufferStrategyControl.
>
>>> 3. Split the workdone global lock (BufFreelistLock) in
>>> StrategyGetBuffer (a spinlock for the freelist, and an lwlock for the
>>> clock sweep).
>>
>> Check.
>
> Added a new variable freelist_lck in BufferStrategyControl, which will be
> used to protect the freelist. BufFreelistLock will still be used to
> protect the clock sweep part of StrategyGetBuffer.
>
>>> 4. Separate processes for writing dirty buffers and moving buffers to
>>> the freelist.
>>
>> I think this part might be best pushed to a separate patch, although I
>> agree we probably need it.
>
>>> 5. Bgwriter needs to be more aggressive; the logic by which it
>>> calculates how many buffers it needs to process needs to be improved.
>>
>> This is basically overlapping with points already made. I suspect we
>> could just get rid of bgwriter_delay, bgwriter_lru_maxpages, and
>> bgwriter_lru_multiplier altogether. The background writer would just
>> have a high and a low watermark. When the number of buffers on the
>> freelist drops below the low watermark, the allocating backend sets the
>> latch and bgwriter wakes up and begins adding buffers to the freelist.
>> When the number of buffers on the free list reaches the high watermark,
>> the background writer goes back to sleep. Some experimentation might be
>> needed to figure out what values are appropriate for those watermarks.
>> In theory this could be a configuration knob, but I suspect it's better
>> to just make the system tune it right automatically.
>
> Currently in the patch I have used a low watermark of 1/6 and a high
> watermark of 1/3 of NBuffers. The values are hardcoded for now, but I
> will change them to GUCs or #defines. As far as I can tell there is no
> way to find the number of buffers on the freelist, so I have added one
> more variable to maintain it. Initially I thought that I could use the
> existing variables firstFreeBuffer and lastFreeBuffer to calculate it,
> but that may not be accurate, as once buffers are moved to the freelist
> these don't give an exact count.
>
> The main doubt here is: what if, after traversing all buffers, it didn't
> find enough buffers to meet the high watermark? Currently I just move
> out of the loop and try to reduce usage count as explained in point 1.
>
>>> 6. There can be contention around buffer mapping locks, but we can
>>> focus on it later.
>>>
>>> 7. Cacheline bouncing around the buffer header spinlocks: is there
>>> anything we can do to reduce this?
>>
>> I think these are points that we should leave for the future.
>
> This is just a WIP patch. I have kept the older code in comments. I need
> to further refine it and collect performance data. I have prepared one
> script (perf_buff_mgmt.sh) to collect performance data for different
> shared_buffers/scale factor/number of clients settings.
>
> Top-level points which still need to be taken care of:
>
> 1. Choose an optimistically used buffer in StrategyGetBuffer(). Refer to
> Simon's patch: https://commitfest.postgresql.org/action/patch_view?id=743
> 2. Don't bump the usage count every time a buffer is pinned. This idea I
> got when reading archives about improvements in this area.
>
> With Regards,
> Amit Kapila.
>
> Attachments: changed_freelist_mgmt.patch, perf_buff_mgmt.sh
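P.S. For anyone skimming this thread: the watermark scheme Robert describes, with Amit's 1/6 and 1/3 values, boils down to roughly the following. The names here are invented stand-ins (the actual WIP patch differs):

#include "postgres.h"
#include "storage/bufmgr.h"     /* for NBuffers */
#include "storage/latch.h"

/* Watermarks as in the WIP patch: 1/6 and 1/3 of NBuffers. */
#define FREELIST_LOW_WATERMARK(nbuf)    ((nbuf) / 6)
#define FREELIST_HIGH_WATERMARK(nbuf)   ((nbuf) / 3)

/* Invented stand-in for the patched BufferStrategyControl. */
typedef struct ExampleStrategyControl
{
    int         numFreeListBuffers;     /* the extra counter Amit mentions */
    Latch       freelistLatch;          /* per the patch */
} ExampleStrategyControl;

/* Allocating backend, after popping a buffer off the freelist. */
static void
maybe_wake_bgwriter(ExampleStrategyControl *ctl)
{
    if (ctl->numFreeListBuffers < FREELIST_LOW_WATERMARK(NBuffers))
        SetLatch(&ctl->freelistLatch);  /* bgwriter: refill the freelist */
}

/* Bgwriter: keep adding buffers until the high watermark, then sleep. */
static bool
freelist_needs_refill(ExampleStrategyControl *ctl)
{
    return ctl->numFreeListBuffers < FREELIST_HIGH_WATERMARK(NBuffers);
}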
Re: [HACKERS] Suspicion of a compiler bug in clang: using ternary operator in ereport()
I realize Postgres’ codebase is probably intractably large to begin using a tool like splint (http://www.splint.org), but this is exactly the sort of thing it’ll catch. I’m pretty sure it would have warned in this case that the code relies on an ordering of side effects that the C standard leaves unspecified (and that, as seen here, two compilers implement differently).

The workaround is to make separate assignments in separate statements; the end of each full expression is a sequence point, so the statement boundaries impose a total order on the side effects in question.

—Jason

> On Jan 28, 2014, at 2:12 PM, Christian Kruse <christ...@2ndquadrant.com> wrote:
>
> Hi,
>
> On 28/01/14 16:43, Christian Kruse wrote:
>>     ereport(FATAL,
>>             (errmsg("could not map anonymous shared memory: %m"),
>>              (errno == ENOMEM) ?
>>              errhint("This error usually means that PostgreSQL's request "
>>                      "for a shared memory segment exceeded available "
>>                      "memory or swap space. To reduce the request size "
>>                      "(currently %zu bytes), reduce PostgreSQL's shared "
>>                      "memory usage, perhaps by reducing shared_buffers "
>>                      "or max_connections.", *size) : 0));
>>
>> did not emit an errhint when using clang, although errno == ENOMEM was
>> true. The same code works with gcc.
>
> According to http://llvm.org/bugs/show_bug.cgi?id=18644#c5 this is not a
> compiler bug but a difference between gcc and clang. Clang seems to use
> a left-to-right order of evaluation while gcc uses a right-to-left order
> of evaluation. So if errmsg changes errno, this leads to errno == ENOMEM
> evaluating to false.
>
> I added a watch point on errno, and it turns out that exactly this
> happens: in src/common/psprintf.c line 114
>
>     nprinted = vsnprintf(buf, len, fmt, args);
>
> errno gets set to 0. This means that we will miss errhint/errdetail if
> we use errno in a ternary operator under clang.
>
> Should we work on this issue?
>
> Best regards,
>
> --
> Christian Kruse
> http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
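P.S. Concretely, the rearrangement I mean is just this (sketch; I've shortened the hint text). If memory serves, %m is unaffected either way, since elog.c saves errno in errstart() before the argument expressions run:

    int         save_errno = errno;     /* statement boundary = sequence point */

    ereport(FATAL,
            (errmsg("could not map anonymous shared memory: %m"),
             (save_errno == ENOMEM) ?
             errhint("This error usually means that PostgreSQL's request "
                     "for a shared memory segment exceeded available "
                     "memory or swap space.") : 0));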