Attached is a first WIP saving Timing Events via a new hook, grabbed by
a new pg_timing_events contrib module. This only implements a small
portion of the RFP spec I presented earlier this month, and is by no
means finished code looking for a regular review. This is just enough
framework to do something useful while looking for parts that I suspect
have tricky details.
Right now this saves just a single line of text and inserts the hook
into the checkpoint code, so output looks like this:
$ psql -c "select * from pg_timing_events" -x
-[ RECORD 1 ]-
seq | 0
event | check/restartpoint complete: wrote 0 buffers (0.0%); 0
transaction log file(s) added, 0 removed, 0 recycled; write=0.000 s,
sync=0.000 s, total=0.001 s; sync files=0, longest=0.000 s, average=0.000 s
Extending this to save the key/value set and most of the other data I
mentioned before is pretty straightforward. There is a fair amount of
debris in here from the pg_stat_statements code this started as.  I've
left a lot of it in, commented out, because it gives examples of
grabbing things like the user and database that I'll need soon.
There's a modest list of things that I've done or am facing soon that
involve less obvious choices though, and I would appreciate feedback on
these things in particular:
-This will eventually aggregate data from clients running queries, the
checkpointer process, and who knows what else in the future. The most
challenging part of this so far is deciding how to handle the memory
management side.  I've tried just dumping everything into
TopMemoryContext, and the fact that this doesn't work as hoped is the
biggest known bug right now.  I'm undecided on whether to continue doing
that but resolve the bug; whether there is an alternate context that
makes more sense for a contrib module like this; or whether I should fix
the size of the data and allocate everything at load time.  The size of
the data from log_lock_waits in particular will be hard to estimate in
advance.
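As a concrete reference point for the allocate-everything-at-load-time option, here's a standalone sketch (plain C rather than the real shmem calls; MAX_EVENTS, EVENT_TEXT_SIZE, and all the names are mine).  The total size is computable up front, which is exactly what RequestAddinShmemSpace() wants, at the cost of truncating anything bigger than the fixed slot--which is the log_lock_waits sizing problem again:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_EVENTS		1000	/* stands in for pg_timing_events.max */
#define EVENT_TEXT_SIZE	1024	/* would track track_activity_query_size */

/* Fixed-width entry: no per-event palloc/strdup needed at runtime */
struct fixed_event
{
	int		serial;
	char	text[EVENT_TEXT_SIZE];
};

/* Everything is sized up front, so load-time allocation is one number */
static size_t
shared_area_size(void)
{
	return MAX_EVENTS * sizeof(struct fixed_event);
}

/* Storing an event is just a bounded copy into a preallocated slot */
static void
store_event(struct fixed_event *slot, int serial, const char *text)
{
	slot->serial = serial;
	strncpy(slot->text, text, EVENT_TEXT_SIZE - 1);
	slot->text[EVENT_TEXT_SIZE - 1] = '\0';
}
```
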
-The event queue is a simple array accessed in circular fashion.
After realizing that output needed to navigate in the opposite order of
element addition, I ended up just putting all the queue navigation code
directly into the add/output routines. I'd be happy to use a more
formal Postgres type here instead--I looked at SHM_QUEUE for
example--but I haven't found something yet that makes this any simpler.
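To make the navigation question easier to discuss, here's the add/output logic pulled out into a standalone sketch (my own simplified slot layout, and a tiny QUEUE_SIZE standing in for pg_timing_events.max); the front/full bookkeeping matches what the patch does:

```c
#include <assert.h>
#include <string.h>

#define QUEUE_SIZE 4			/* stands in for pg_timing_events.max */

struct slot
{
	int		serial;
	char	desc[64];
};

static struct slot queue[QUEUE_SIZE];
static int	front = -1;			/* last slot written; -1 while empty */
static int	full = 0;			/* set once the array has wrapped */
static int	serial = 0;

/* Add one event, overwriting the oldest slot once the array is full */
static void
add_event(const char *desc)
{
	front = (front + 1) % QUEUE_SIZE;
	if (front == 0 && serial > 0)
		full = 1;
	queue[front].serial = serial++;
	strncpy(queue[front].desc, desc, sizeof(queue[front].desc) - 1);
	queue[front].desc[sizeof(queue[front].desc) - 1] = '\0';
}

/* Copy the live entries out oldest-first; returns how many there are */
static int
read_events(struct slot *out)
{
	int		pos = full ? (front + 1) % QUEUE_SIZE : 0;
	int		count = full ? QUEUE_SIZE : front + 1;
	int		i;

	for (i = 0; i < count; i++)
	{
		out[i] = queue[pos];
		pos = (pos + 1) % QUEUE_SIZE;
	}
	return count;
}
```

SHM_QUEUE would replace the modular arithmetic with list links, but as noted I haven't seen it make this any simpler.
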
-I modeled the hook here on the logging one that went into 9.2. It's
defined in its own include file and it gets initialized by the logging
system. No strong justification for putting it there, it was just
similar to the existing hook and that seemed reasonable. This is in
some respects an alternate form of logging after all.  The data sent by
the hook itself will need to be more verbose in order to handle the full
spec; right now I'm just asking about placement.
-I'm growing increasingly worried about allowing concurrent reads of
this data (which might be large and take a while to return to the
client) without blocking insertions. The way that was split to improve
pg_stat_statements concurrency doesn't convert naturally to this data.
The serial number field in here is part of one idea I had. I might grab
the header with an exclusive lock for only a moment and lookup the
serial number of the last record I should display. Then I could drop to
a share lock looping over the elements, stopping if I find one with a
serial number over that.
--
Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
diff --git a/contrib/pg_timing_events/Makefile b/contrib/pg_timing_events/Makefile
new file mode 100644
index 0000000..3e7f1ab
--- /dev/null
+++ b/contrib/pg_timing_events/Makefile
@@ -0,0 +1,18 @@
+# contrib/pg_timing_events/Makefile
+
+MODULE_big = pg_timing_events
+OBJS = pg_timing_events.o
+
+EXTENSION = pg_timing_events
+DATA = pg_timing_events--1.0.sql pg_timing_events--unpackaged--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/pg_timing_events
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/pg_timing_events/pg_timing_events--1.0.sql b/contrib/pg_timing_events/pg_timing_events--1.0.sql
new file mode 100644
index 0000000..ccf6372
--- /dev/null
+++ b/contrib/pg_timing_events/pg_timing_events--1.0.sql
@@ -0,0 +1,43 @@
+/* contrib/pg_timing_events/pg_timing_events--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_timing_events" to load this file. \quit
+
+-- Register functions.
+CREATE FUNCTION pg_timing_events_reset()
+RETURNS void
+AS 'MODULE_PATHNAME'
+LANGUAGE C;
+
+CREATE FUNCTION pg_timing_events(
+ OUT seq int8,
+ OUT event text
+-- OUT occured timestamp
+
+-- OUT userid oid,
+-- OUT dbid oid,
+-- OUT query text,
+-- OUT calls int8,
+-- OUT total_time float8,
+-- OUT rows int8,
+-- OUT shared_blks_hit int8,
+-- OUT shared_blks_read int8,
+-- OUT shared_blks_written int8,
+-- OUT local_blks_hit int8,
+-- OUT local_blks_read int8,
+-- OUT local_blks_written int8,
+-- OUT temp_blks_read int8,
+-- OUT temp_blks_written int8
+)
+RETURNS SETOF record
+AS 'MODULE_PATHNAME'
+LANGUAGE C;
+
+-- Register a view on the function for ease of use.
+CREATE VIEW pg_timing_events AS
+ SELECT * FROM pg_timing_events();
+
+GRANT SELECT ON pg_timing_events TO PUBLIC;
+
+-- Don't want this to be available to non-superusers.
+REVOKE ALL ON FUNCTION pg_timing_events_reset() FROM PUBLIC;
diff --git a/contrib/pg_timing_events/pg_timing_events--unpackaged--1.0.sql b/contrib/pg_timing_events/pg_timing_events--unpackaged--1.0.sql
new file mode 100644
index 0000000..0c9b1d8
--- /dev/null
+++ b/contrib/pg_timing_events/pg_timing_events--unpackaged--1.0.sql
@@ -0,0 +1,8 @@
+/* contrib/pg_timing_events/pg_timing_events--unpackaged--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_timing_events" to load this file. \quit
+
+ALTER EXTENSION pg_timing_events ADD function pg_timing_events_reset();
+ALTER EXTENSION pg_timing_events ADD function pg_timing_events();
+ALTER EXTENSION pg_timing_events ADD view pg_timing_events;
diff --git a/contrib/pg_timing_events/pg_timing_events.c b/contrib/pg_timing_events/pg_timing_events.c
new file mode 100644
index 0000000..d7558e6
--- /dev/null
+++ b/contrib/pg_timing_events/pg_timing_events.c
@@ -0,0 +1,716 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_timing_events.c
+ *	  Save timing events delivered via the timing event hook for later review.
+ *
+ * Note about locking issues: to create or delete an entry in the shared
+ * hashtable, one must hold pgte->lock exclusively. Modifying any field
+ * in an entry requires the same. To look up an entry, one must hold
+ * the lock shared.
+ *
+ * Portions copyright (c) 2012, Salesforce.com
+ * Portions copyright (c) 2008-2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/pg_timing_events/pg_timing_events.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include <unistd.h>
+
+#include "access/hash.h"
+#include "catalog/pg_type.h"
+#include "executor/executor.h"
+#include "executor/instrument.h"
+#include "funcapi.h"
+#include "mb/pg_wchar.h"
+#include "miscadmin.h"
+#include "pgstat.h"
+#include "storage/fd.h"
+#include "storage/ipc.h"
+#include "storage/spin.h"
+#include "tcop/utility.h"
+#include "utils/builtins.h"
+#include "utils/hsearch.h"
+#include "utils/guc.h"
+#include "utils/memutils.h"
+#include "utils/timing_event.h"
+
+
+PG_MODULE_MAGIC;
+
+/* Location of stats file */
+#define PGTE_DUMP_FILE "global/pg_timing_events.stat"
+
+/* This constant defines the magic number in the stats file header */
+static const uint32 PGTE_FILE_HEADER = 0x20100108;
+
+/*
+ * Statistics per statement
+ *
+ * NB: see the file read/write code before changing field order here.
+ */
+typedef struct pgteEntry
+{
+	/* TODO Remove key - it's just to cut down on compiler errors while refactoring to eliminate it */
+ int key;
+
+ Timestamp entry_time;
+ char *description;
+ int serial_num;
+
+ /* Old entries just to use as examples */
+ Oid userid; /* user OID */
+ Oid dbid; /* database OID */
+ int encoding; /* query encoding */
+	int			query_len;		/* # of valid bytes in query string */
+ const char *query_ptr; /* query string proper */
+
+ slock_t mutex; /* protects the counters only */
+	char		query[1];		/* VARIABLE LENGTH ARRAY - MUST BE LAST */
+ /* Note: the allocated length of query[] is actually pgte->query_size */
+} pgteEntry;
+
+/*
+ * Global shared state
+ */
+typedef struct pgteSharedState
+{
+	LWLockId	lock;			/* protects queue lookup/modification */
+ int serial_num;
+ int front;
+ unsigned char full; /* really a boolean */
+ char last_query[1024];
+ int queue_size;
+	int			query_size;		/* max query length in bytes */
+} pgteSharedState;
+
+/*---- Local variables ----*/
+
+/* Saved hook values in case of unload */
+static shmem_startup_hook_type prev_shmem_startup_hook = NULL;
+static timing_event_hook_type prev_timing_event_hook = NULL;
+
+/* Links to shared memory state */
+static pgteSharedState *pgte = NULL;
+static pgteEntry *PGTE_queue = NULL;
+
+/*---- GUC variables ----*/
+
+static int PGTE_max; /* max # statements to track */
+static int PGTE_track = 1; /* tracking level */
+static bool PGTE_save;			/* whether to save stats across shutdown */
+
+
+/* TODO Add another GUC for disabling collection */
+#define PGTE_enabled() \
+ (PGTE_track == 1 )
+
+/*---- Function declarations ----*/
+
+void _PG_init(void);
+void _PG_fini(void);
+
+Datum pg_timing_events_reset(PG_FUNCTION_ARGS);
+Datum pg_timing_events(PG_FUNCTION_ARGS);
+
+PG_FUNCTION_INFO_V1(pg_timing_events_reset);
+PG_FUNCTION_INFO_V1(pg_timing_events);
+
+static void PGTE_shmem_startup(void);
+static void PGTE_shmem_shutdown(int code, Datum arg);
+static void PGTE_timing_event(const char *desc);
+static void PGTE_store(const char *query);
+static Size PGTE_memsize(void);
+static void entry_reset(void);
+
+
+/*
+ * Module load callback
+ */
+void
+_PG_init(void)
+{
+	/*
+	 * In order to create our shared memory area, we have to be loaded via
+	 * shared_preload_libraries.  If not, fall out without hooking into
+	 * any of the main system.  (We don't throw error here because it
+	 * seems useful to allow the pg_timing_events functions to be created
+	 * even when the module isn't active.  The functions must protect
+	 * themselves against being called then, however.)
+	 */
+ if (!process_shared_preload_libraries_in_progress)
+ return;
+
+ /*
+ * Define (or redefine) custom GUC variables.
+ */
+ DefineCustomIntVariable("pg_timing_events.max",
+ "Sets the maximum number of events tracked by pg_timing_events.",
+ NULL,
+ &PGTE_max,
+ 1000,
+ 100,
+ INT_MAX,
+ PGC_POSTMASTER,
+ 0,
+ NULL,
+ NULL,
+ NULL);
+
+ DefineCustomBoolVariable("pg_timing_events.save",
+							 "Save pg_timing_events statistics across server shutdowns.",
+ NULL,
+ &PGTE_save,
+ true,
+ PGC_SIGHUP,
+ 0,
+ NULL,
+ NULL,
+ NULL);
+
+ EmitWarningsOnPlaceholders("pg_timing_events");
+
+	/*
+	 * Request additional shared resources.  (These are no-ops if we're
+	 * not in the postmaster process.)  We'll allocate or attach to the
+	 * shared resources in PGTE_shmem_startup().
+	 */
+ RequestAddinShmemSpace(PGTE_memsize());
+ RequestAddinLWLocks(1);
+
+ /*
+ * Install hooks.
+ */
+ prev_shmem_startup_hook = shmem_startup_hook;
+ shmem_startup_hook = PGTE_shmem_startup;
+
+ prev_timing_event_hook = timing_event_hook;
+ timing_event_hook = PGTE_timing_event;
+}
+
+/*
+ * Module unload callback
+ */
+void
+_PG_fini(void)
+{
+ /* Uninstall hooks. */
+ shmem_startup_hook = prev_shmem_startup_hook;
+ timing_event_hook = prev_timing_event_hook;
+
+}
+
+/*
+ * shmem_startup hook: allocate or attach to shared memory,
+ * then load any pre-existing statistics from file.
+ */
+static void
+PGTE_shmem_startup(void)
+{
+ bool found;
+ FILE *file;
+ uint32 header;
+ int32 num;
+ int32 i;
+ int query_size;
+ int buffer_size;
+ char *buffer = NULL;
+
+ if (prev_shmem_startup_hook)
+ prev_shmem_startup_hook();
+
+ /* reset in case this is a restart within the postmaster */
+ pgte = NULL;
+
+ /*
+ * Create or attach to the shared memory state, including hash table
+ */
+ LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
+
+ pgte = (pgteSharedState *) ShmemInitStruct("pg_timing_events",
+ sizeof(pgteSharedState),
+ &found);
+
+ if (!found)
+ {
+ /* First time through ... */
+ pgte->lock = LWLockAssign();
+ pgte->query_size = pgstat_track_activity_query_size;
+ pgte->front = -1;
+ pgte->full = 0;
+ }
+
+ /* Be sure everyone agrees on the table entry sizes */
+ query_size = pgte->query_size;
+ pgte->queue_size = PGTE_max;
+
+	PGTE_queue = (pgteEntry *)
+		ShmemInitStruct("Timing Event Queue",
+						PGTE_max * (Size) sizeof(pgteEntry),
+						&found);
+ if (!found)
+ {
+ /*
+ * TODO If the queue logic is right, there should be no need
+ * to initialize any of its elements here?
+ */
+ }
+
+ LWLockRelease(AddinShmemInitLock);
+
+	/*
+	 * If we're in the postmaster (or a standalone backend...), set up a
+	 * shmem exit hook to dump the statistics to disk.
+	 */
+ if (!IsUnderPostmaster)
+ on_shmem_exit(PGTE_shmem_shutdown, (Datum) 0);
+
+	/*
+	 * Attempt to load old statistics from the dump file, if this is the
+	 * first time through and we weren't told not to.
+	 */
+ if (found || !PGTE_save)
+ return;
+
+	/*
+	 * TODO Below is the old pg_stat_statements reading code, but since
+	 * we've disabled writes we should never find a file to read yet for
+	 * it to break on.  Return anyway just to be safe.
+	 */
+ return;
+
+	/*
+	 * Note: we don't bother with locks here, because there should be no
+	 * other processes running when this code is reached.
+	 */
+ file = AllocateFile(PGTE_DUMP_FILE, PG_BINARY_R);
+ if (file == NULL)
+ {
+ if (errno == ENOENT)
+			return;				/* ignore not-found error */
+ goto error;
+ }
+
+ buffer_size = query_size;
+ buffer = (char *) palloc(buffer_size);
+
+ if (fread(&header, sizeof(uint32), 1, file) != 1 ||
+ header != PGTE_FILE_HEADER ||
+ fread(&num, sizeof(int32), 1, file) != 1)
+ goto error;
+
+ for (i = 0; i < num; i++)
+ {
+ pgteEntry temp;
+/*
+ pgteEntry *entry;
+*/
+ if (fread(&temp, offsetof(pgteEntry, mutex), 1, file) != 1)
+ goto error;
+
+ /* Encoding is the only field we can easily sanity-check */
+/*
+ if (!PG_VALID_BE_ENCODING(temp.key.encoding))
+ goto error;
+*/
+ /* Previous incarnation might have had a larger query_size */
+/*
+ if (temp.key.query_len >= buffer_size)
+ {
+ buffer = (char *) repalloc(buffer, temp.key.query_len +
1);
+ buffer_size = temp.key.query_len + 1;
+ }
+
+ if (fread(buffer, 1, temp.key.query_len, file) !=
temp.key.query_len)
+ goto error;
+ buffer[temp.key.query_len] = '\0';
+*/
+ /* Clip to available length if needed */
+/*
+ if (temp.key.query_len >= query_size)
+ temp.key.query_len =
pg_encoding_mbcliplen(temp.key.encoding,
+
buffer,
+
temp.key.query_len,
+
query_size - 1);
+ temp.key.query_ptr = buffer;
+*/
+		/* make the hashtable entry (discards old entries if too many) */
+/*
+ entry = entry_alloc(&temp);
+*/
+ }
+
+ pfree(buffer);
+ FreeFile(file);
+
+ /*
+ * Remove the file so it's not included in backups/replication
+ * slaves, etc. A new file will be written on next shutdown.
+ */
+ unlink(PGTE_DUMP_FILE);
+
+ return;
+
+error:
+	ereport(LOG,
+			(errcode_for_file_access(),
+			 errmsg("could not read pg_timing_events file \"%s\": %m",
+					PGTE_DUMP_FILE)));
+ if (buffer)
+ pfree(buffer);
+ if (file)
+ FreeFile(file);
+ /* If possible, throw away the bogus file; ignore any error */
+ unlink(PGTE_DUMP_FILE);
+}
+
+/*
+ * shmem_shutdown hook: Dump statistics into file.
+ *
+ * Note: we don't bother with acquiring lock, because there should be no
+ * other processes running when this is called.
+ */
+static void
+PGTE_shmem_shutdown(int code, Datum arg)
+{
+ FILE *file;
+
+/*
+ TODO Don't need to store state until code is rewritten
+ HASH_SEQ_STATUS hash_seq;
+ int32 num_entries;
+ pgteEntry *entry;
+*/
+
+ /* Don't try to dump during a crash. */
+ if (code)
+ return;
+
+ /* Safety check ... shouldn't get here unless shmem is set up. */
+ if (!pgte || !PGTE_queue)
+ return;
+
+ /* Don't dump if told not to. */
+ if (!PGTE_save)
+ return;
+
+	/* TODO Cannot execute the below until it's corrected to write queue entries */
+ return;
+
+ file = AllocateFile(PGTE_DUMP_FILE ".tmp", PG_BINARY_W);
+ if (file == NULL)
+ goto error;
+
+ if (fwrite(&PGTE_FILE_HEADER, sizeof(uint32), 1, file) != 1)
+ goto error;
+
+/* TODO Replace with a queue iterator
+ num_entries = queue_get_num_entries(pgteSharedState);
+ if (fwrite(&num_entries, sizeof(int32), 1, file) != 1)
+ goto error;
+*/
+
+/*
+ hash_seq_init(&hash_seq, PGTE_hash);
+ while ((entry = hash_seq_search(&hash_seq)) != NULL)
+ {
+ int len = entry->key.query_len;
+
+ if (fwrite(entry, offsetof(pgteEntry, mutex), 1, file) != 1 ||
+ fwrite(entry->query, 1, len, file) != len)
+ goto error;
+ }
+*/
+
+ if (FreeFile(file))
+ {
+ file = NULL;
+ goto error;
+ }
+
+ /*
+ * Rename file into place, so we atomically replace the old one.
+ */
+ if (rename(PGTE_DUMP_FILE ".tmp", PGTE_DUMP_FILE) != 0)
+		ereport(LOG,
+				(errcode_for_file_access(),
+				 errmsg("could not rename pg_timing_events file \"%s\": %m",
+						PGTE_DUMP_FILE ".tmp")));
+
+ return;
+
+error:
+	ereport(LOG,
+			(errcode_for_file_access(),
+			 errmsg("could not write pg_timing_events file \"%s\": %m",
+					PGTE_DUMP_FILE ".tmp")));
+ if (file)
+ FreeFile(file);
+ unlink(PGTE_DUMP_FILE ".tmp");
+}
+
+/*
+ * Timing event hook: save the event
+ */
+static void
+PGTE_timing_event(const char *desc)
+{
+ if (PGTE_enabled())
+ {
+ PGTE_store(desc);
+ }
+}
+
+/*
+ * Store a timing event in the queue.
+ */
+static void
+PGTE_store(const char *query)
+{
+ Assert(query != NULL);
+
+ /* Safety check... */
+ if (!pgte || !PGTE_queue)
+ return;
+
+ /* Must acquire exclusive lock to add a new entry. */
+ LWLockAcquire(pgte->lock, LW_EXCLUSIVE);
+
+ pgte->front++;
+ if (pgte->front == PGTE_max)
+ {
+ pgte->front = 0;
+		/*
+		 * Noting that the queue has filled is redundant but harmless
+		 * on passes after the first time it happens.
+		 */
+ pgte->full = 1;
+ }
+
+	/* Using a plain strdup here results in junk events from the background writer */
+	PGTE_queue[pgte->front].description =
+		MemoryContextStrdup(TopMemoryContext, query);
+ PGTE_queue[pgte->front].serial_num=pgte->serial_num;
+ pgte->serial_num++;
+
+	/*
+	 * TODO Cheat and save a copy in a static buffer.  This must be removed
+	 * once the memory is correctly moved to the output via the
+	 * MemoryContextStrdup above (or its replacement).
+	 */
+	strlcpy(pgte->last_query, query, sizeof(pgte->last_query));
+
+ /* TODO Examples of how to grab some state data we'll need to add later
+ key.userid = GetUserId();
+ key.dbid = MyDatabaseId;
+ key.encoding = GetDatabaseEncoding();
+ key.query_len = strlen(query);
+ if (key.query_len >= pgte->query_size)
+ key.query_len = pg_encoding_mbcliplen(key.encoding,
+
query,
+
key.query_len,
+
pgte->query_size - 1);
+ key.query_ptr = query;
+ */
+
+ LWLockRelease(pgte->lock);
+}
+
+/*
+ * Reset all stored timing events.
+ */
+Datum
+pg_timing_events_reset(PG_FUNCTION_ARGS)
+{
+ if (!pgte || !PGTE_queue)
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("pg_timing_events must be loaded via shared_preload_libraries")));
+ entry_reset();
+ PG_RETURN_VOID();
+}
+
+#define pg_timing_events_COLS 2
+
+/*
+ * Retrieve stored timing events.
+ */
+Datum
+pg_timing_events(PG_FUNCTION_ARGS)
+{
+ ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
+ TupleDesc tupdesc;
+ Tuplestorestate *tupstore;
+ MemoryContext per_query_ctx;
+ MemoryContext oldcontext;
+ /*
+ Oid userid = GetUserId();
+ bool is_superuser = superuser();
+ */
+ int count;
+ int position;
+ int offset;
+
+ if (!pgte || !PGTE_queue)
+		ereport(ERROR,
+				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+				 errmsg("pg_timing_events must be loaded via shared_preload_libraries")));
+
+ /* check to see if caller supports us returning a tuplestore */
+	if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("set-valued function called in context that cannot accept a set")));
+	if (!(rsinfo->allowedModes & SFRM_Materialize))
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("materialize mode required, but it is not "
+						"allowed in this context")));
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ per_query_ctx = rsinfo->econtext->ecxt_per_query_memory;
+ oldcontext = MemoryContextSwitchTo(per_query_ctx);
+
+ tupstore = tuplestore_begin_heap(true, false, work_mem);
+ rsinfo->returnMode = SFRM_Materialize;
+ rsinfo->setResult = tupstore;
+ rsinfo->setDesc = tupdesc;
+
+ MemoryContextSwitchTo(oldcontext);
+
+ /* TODO Fake data insertion, remove once real events are being stored */
+ /*PGTE_store("dummy data");*/
+
+	/*
+	 * We can co-exist with other readers, but will block concurrent
+	 * additions.
+	 */
+ LWLockAcquire(pgte->lock, LW_SHARED);
+
+ /*
+ * Determine start and how many active entries are in the queue.
+ * Once the array holding the queue is full, the next position after
+ * the front of the queue holds the oldest entry. But before the queue
+ * fills, the oldest entry is at position 0.
+ */
+ if (pgte->full)
+ {
+ position=(pgte->front + 1) % PGTE_max;
+ count = PGTE_max;
+ }
+ else
+ {
+ position=0;
+ count = pgte->front + 1;
+ }
+
+ offset=0; /* Just for debug printing, not navigation */
+
+ for (; count>0; count--)
+ {
+ Datum values[pg_timing_events_COLS];
+ bool nulls[pg_timing_events_COLS];
+ int col = 0;
+ pgteEntry *entry = &PGTE_queue[position];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, 0, sizeof(nulls));
+
+		/* printf("position[%d]='%s' (element %d)\n", offset, PGTE_queue[position].description, i); */
+
+		/*
+		 * TODO Quietly upcasting the temporary 32 bit serial_num to
+		 * 64 bits.  Eventually the serial number needs to be a true
+		 * 64 bit value if it is kept in the final design.
+		 */
+ values[col++] = Int64GetDatumFast(entry->serial_num);
+
+/* Debugging hack if we want to see how long the description string is */
+/* values[col++] = Int64GetDatumFast(strlen(entry->description));*/
+
+/* Code that should work but doesn't yet */
+/* values[col++] = CStringGetTextDatum(entry->description); */
+
+ values[col++] = CStringGetTextDatum(pgte->last_query);
+
+/*
+ values[col++] = ObjectIdGetDatum(entry->key.userid);
+ values[col++] = ObjectIdGetDatum(entry->key.dbid);
+*/
+
+/*
+ if (is_superuser || entry->userid == userid)
+ {
+ char *qstr;
+
+ qstr = (char *)
+ pg_do_encoding_conversion((unsigned char *)
entry->query,
+
entry->query_len,
+
entry->encoding,
+
GetDatabaseEncoding());
+ values[col++] = CStringGetTextDatum(qstr);
+ if (qstr != entry->query)
+ pfree(qstr);
+ }
+ else
+ values[col++] = CStringGetTextDatum("<insufficient
privilege>");
+*/
+ Assert(col == pg_timing_events_COLS);
+
+ tuplestore_putvalues(tupstore, tupdesc, values, nulls);
+
+		/* Move to next queue position, wrapping around the array at the end */
+ position++;
+ if (position == PGTE_max)
+ position=0;
+ offset++;
+ }
+
+ LWLockRelease(pgte->lock);
+
+ /* clean up and return the tuplestore */
+ tuplestore_donestoring(tupstore);
+
+ return (Datum) 0;
+}
+
+/*
+ * Estimate shared memory space needed.
+ */
+static Size
+PGTE_memsize(void)
+{
+ Size size;
+ Size entrysize;
+
+ size = MAXALIGN(sizeof(pgteSharedState));
+	entrysize = offsetof(pgteEntry, query) + pgstat_track_activity_query_size;
+ size = add_size(size, hash_estimate_size(PGTE_max, entrysize));
+
+ return size;
+}
+
+/*
+ * Release all entries.
+ */
+static void
+entry_reset(void)
+{
+ pgteEntry *entry;
+
+ LWLockAcquire(pgte->lock, LW_EXCLUSIVE);
+ entry=PGTE_queue;
+
+/* TODO Iterate over queue array
+ while ((entry = hash_seq_search(&hash_seq)) != NULL)
+ {
+ hash_search(PGTE_hash, &entry->key, HASH_REMOVE, NULL);
+ }
+*/
+
+ LWLockRelease(pgte->lock);
+
+}
diff --git a/contrib/pg_timing_events/pg_timing_events.control
b/contrib/pg_timing_events/pg_timing_events.control
new file mode 100644
index 0000000..7db2d12
--- /dev/null
+++ b/contrib/pg_timing_events/pg_timing_events.control
@@ -0,0 +1,5 @@
+# pg_timing_events extension
+comment = 'save timing events for later review'
+default_version = '1.0'
+module_pathname = '$libdir/pg_timing_events'
+relocatable = true
diff --git a/src/backend/access/transam/xlog.c
b/src/backend/access/transam/xlog.c
index bf76f6d..8abce9a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -58,6 +58,7 @@
#include "utils/relmapper.h"
#include "utils/snapmgr.h"
#include "utils/timestamp.h"
+#include "utils/timing_event.h"
#include "pg_trace.h"
@@ -6730,6 +6731,7 @@ LogCheckpointEnd(bool restartpoint)
longest_usecs,
average_usecs;
uint64 average_sync_time;
+ StringInfoData timing_descr;
CheckpointStats.ckpt_end_t = GetCurrentTimestamp();
@@ -6774,6 +6776,31 @@ LogCheckpointEnd(bool restartpoint)
average_secs = (long) (average_sync_time / 1000000);
average_usecs = average_sync_time - (uint64) average_secs *1000000;
+ if (timing_event_hook)
+ {
+ initStringInfo(&timing_descr);
+		appendStringInfo(&timing_descr,
+				"check/restartpoint complete: wrote %d buffers (%.1f%%); "
+				"%d transaction log file(s) added, %d removed, %d recycled; "
+				"write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; "
+				"sync files=%d, longest=%ld.%03d s, average=%ld.%03d s",
+				CheckpointStats.ckpt_bufs_written,
+				(double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
+				CheckpointStats.ckpt_segs_added,
+				CheckpointStats.ckpt_segs_removed,
+				CheckpointStats.ckpt_segs_recycled,
+				write_secs, write_usecs / 1000,
+				sync_secs, sync_usecs / 1000,
+				total_secs, total_usecs / 1000,
+				CheckpointStats.ckpt_sync_rels,
+				longest_secs, longest_usecs / 1000,
+				average_secs, average_usecs / 1000);
+
+ elog(LOG,"Timing event message: %s",timing_descr.data);
+
+ (*timing_event_hook) (timing_descr.data);
+ }
+
if (restartpoint)
elog(LOG, "restartpoint complete: wrote %d buffers (%.1f%%); "
 			 "%d transaction log file(s) added, %d removed, %d recycled; "
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 68b7ab3..1deb31c 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -77,6 +77,7 @@
#include "utils/guc.h"
#include "utils/memutils.h"
#include "utils/ps_status.h"
+#include "utils/timing_event.h"
#undef _
@@ -104,6 +105,11 @@ extern bool redirection_done;
*/
emit_log_hook_type emit_log_hook = NULL;
+/*
+ * Hook for notifying that a timing event happened.
+ */
+timing_event_hook_type timing_event_hook = NULL;
+
/* GUC parameters */
int Log_error_verbosity = PGERROR_VERBOSE;
char	   *Log_line_prefix = NULL; /* format for extra log line info */
diff --git a/src/include/utils/timing_event.h b/src/include/utils/timing_event.h
new file mode 100644
index 0000000..7c516f7
--- /dev/null
+++ b/src/include/utils/timing_event.h
@@ -0,0 +1,20 @@
+/*-------------------------------------------------------------------------
+ *
+ * timing_event.h
+ * Timing event definitions.
+ *
+ * Portions Copyright (c) 1996-2012, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/timing_event.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef TIMEEVENT_H
+#define TIMEEVENT_H
+
+/* Hook for sending timing event messages */
+typedef void (*timing_event_hook_type) (const char *event_text);
+extern PGDLLIMPORT timing_event_hook_type timing_event_hook;
+
+#endif /* TIMEEVENT_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a831a1e..9c8356f 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2234,6 +2234,7 @@ timeKEY
time_t
timerCA
timezone_extra
+timing_event_hook_type
tlist_vinfo
transferMode
trgm
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers