Re: [HACKERS] Logical to physical page mapping

2012-10-30 Thread Markus Wanner
On 10/29/2012 12:05 PM, Robert Haas wrote:
 OK, I'll stop babbling now...

Not perceived as babbling here. Thanks for that nice round-up of options
and ideas around the torn page problem.

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP checksums patch

2012-10-30 Thread jesper
 On 10/1/12 12:22 PM, Josh Berkus wrote:
 Perhaps we don't allow this to be turned per page, but rather per
 cluster, and per-cluster would require the entire cluster to be
 rewritten.
 We discussed this last year, and options which require a total rewrite of
 the database in order to turn on the option were rejected as impractical
 for users.

 For whatever it's worth... we (and presumably others) still use londiste
 (or Slony) as our upgrade path, so we could tolerate a cluster-wide
 setting. We'd just set it when building new clusters via londiste and
 forget about it.

 So I'd rather see this get in at a cluster level than not make it at all
 while we wait for something better.

Some still do dump/restore between major releases, so it is also
applicable (and preferable) there.

I do know a lot of things were discussed last year; an API I could
easily imagine working:

ALTER TABLE tablename SET ENABLE_CHECKSUMMING=YES
(would make the server compute checksums on all writes to the table and
its indices going forward; here, verification of checksums would still be
enabled, but only on already-checksummed pages).
ALTER TABLE tablename SET VERIFY_CHECKSUM=YES
(would cause the server to scan the table and indices, and rewrite the
parts that hadn't been updated with a checksum already).

Perhaps the names are not well-chosen, but it is based on the assumption
that checksumming overhead is measurable and it is therefore desirable to
be able to enable or disable it at a per-table level.

Then I could let my system go 6-12 months between the above two commands,
and the amount of rewriting would hopefully fall well short of rewriting
the entire 2.5TB of PGDATA.

Jesper

-- 
Jesper Krogh



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] out of memory

2012-10-30 Thread Tatsuo Ishii
 I have a SQL file (its size is 1GB).
 When I execute it, the "String of 987098801 bytes is too long for
 encoding conversion" error occurs.
 Please give me a solution.

You hit the upper limit of PostgreSQL's internal memory allocation. IMO,
there's no way to avoid the error except to use a client encoding
identical to the backend's.

Hackers:
The particular limit seems to have been chosen with TOAST in mind (from
include/utils/memutils.h):

 * XXX This is deliberately chosen to correspond to the limiting size
 * of varlena objects under TOAST.  See VARSIZE_4B() and related macros
 * in postgres.h.  Many datatypes assume that any allocatable size can
 * be represented in a varlena header.

IMO the SQL string size limit is a different matter from TOAST entirely.
Shouldn't we have a separate limit for SQL strings?
(MAX_CONVERSION_GROWTH is a different story, of course.)
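
For reference, the limit in question is defined right below that comment
(quoted from include/utils/memutils.h):

#define MaxAllocSize	((Size) 0x3fffffff)		/* 1 gigabyte - 1 */

#define AllocSizeIsValid(size)	((Size) (size) <= MaxAllocSize)
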
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] What are the advantages of not being able to access multiple databases with one connection?

2012-10-30 Thread crocket
MySQL permits a connection to access multiple databases.
But PostgreSQL restricts a connection to one database.
I think the PostgreSQL database connection is somewhat limited.

Is it an old and decrepit design, or does it deserve some appreciation?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What are the advantages of not being able to access multiple databases with one connection?

2012-10-30 Thread Will Crawford
On 30 October 2012 12:37, crocket crockabisc...@gmail.com wrote:
 MySQL permits a connection to access multiple databases.
 But PostgreSQL restricts a connection to one database.
 I think the PostgreSQL database connection is somewhat limited.

 Is it an old and decrepit design, or does it deserve some appreciation?

Actually, a "database" in PostgreSQL supports multiple "schemas", and
the "databases" in MySQL are more or less congruent to those. To put
it another way, PostgreSQL has a three-level namespace which can be
accessed through the same host and port combination; MySQL only has
two levels. Is that better?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What are the advantages of not being able to access multiple databases with one connection?

2012-10-30 Thread Hannu Krosing

On 10/30/2012 01:37 PM, crocket wrote:

MySQL permits a connection to access multiple databases.
But PostgreSQL restricts a connection to one database.
I think the PostgreSQL database connection is somewhat limited.

Is it an old and decrepit design, or does it deserve some appreciation?

It's an old and decrepit design ;)

But nobody has found the ability to use many databases from
the same connection valuable enough to do the rewrite needed.

If you have ideas on how to enable access to multiple databases
from the same connection - both at the technical and logical level (i.e.
what it would _mean_ and how it would be expressed in SQL) - then
surely people would be ready to hear you out.

It would be valuable at least for some tools doing monitoring and
maintenance, even if simultaneous SQL access to multiple databases
were not available.

For example it would be nice to get pg_stat_user_tables() info for
all databases in a single query or at least to switch databases (say using
SET DATABASE dbname) without reconnecting.
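
In purely hypothetical syntax (neither command below exists; this is only
meant to make the idea concrete, and the function name is invented):

-- hypothetical: switch databases on an existing connection
SET DATABASE otherdb;

-- hypothetical: statistics for every database from one query
SELECT datname, relname, n_live_tup
  FROM pg_stat_user_tables_all_databases();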


Hannu


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 3/8] Add support for a generic wal reading facility dubbed XLogReader

2012-10-30 Thread Andres Freund
On Monday, October 29, 2012 08:58:53 PM Alvaro Herrera wrote:
 Heikki Linnakangas escribió:
  Hmm. I was thinking that making this work in a non-backend context
  would be too hard, so I didn't give that much thought, but I guess
  there aren't many dependencies on backend functions after all.
  palloc/pfree are straightforward to replace with malloc/free. That's
  what we could easily do with the error messages too, just malloc a
  suitably sized buffer.
  
  How does a non-backend program get access to xlogreader.c? Copy
  xlogreader.c from the source tree at build time and link into the
  program? Or should we turn it into a shared library?

 Andres commented elsewhere about reading xlog records, processing them
  as they came in, and doing a running CRC while we're still reading it.  I
 think this is a mistake; we shouldn't do anything with a record until
 the CRC has been verified.  Otherwise we risk reading arbitrarily
 corrupt data.

Uhm. xlog.c does just the same. It reads the header and if it looks valid it 
uses its length information to read the full record and only computes the CRC 
at the end.

 Overall, I think I like Heikki's minimal patch better than Andres'
 original proposal, but I'll keep looking at both.

I'll defer to you two on that, no point in fighting that many experienced 
people ;)

Andres

-- 
Andres Freund   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [PATCH 3/8] Add support for a generic wal reading facility dubbed XLogReader

2012-10-30 Thread Alvaro Herrera
Andres Freund escribió:
 On Monday, October 29, 2012 08:58:53 PM Alvaro Herrera wrote:
  Heikki Linnakangas escribió:

  Andres commented elsewhere about reading xlog records, processing them
  as they came in, and doing a running CRC while we're still reading it.  I
  think this is a mistake; we shouldn't do anything with a record until
  the CRC has been verified.  Otherwise we risk reading arbitrarily
  corrupt data.
 
 Uhm. xlog.c does just the same. It reads the header and if it looks valid it 
 uses its length information to read the full record and only computes the CRC 
 at the end.

Uh.  Correct.

Am I the only one who finds this rather bizarre?  Maybe this was okay
when xlog data would only come from WAL files stored in the data
directory at recovery, but if we're now receiving these from a remote
sender over the network I wonder if we should be protecting against
malicious senders.  (This is not related to this patch anyway.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 3/8] Add support for a generic wal reading facility dubbed XLogReader

2012-10-30 Thread Andres Freund
On Tuesday, October 30, 2012 03:20:03 PM Alvaro Herrera wrote:
 Andres Freund escribió:
  On Monday, October 29, 2012 08:58:53 PM Alvaro Herrera wrote:
   Heikki Linnakangas escribió:
   
   Andres commented elsewhere about reading xlog records, processing them
   as they came in, and doing a running CRC while we're still reading it.  I
   think this is a mistake; we shouldn't do anything with a record until
   the CRC has been verified.  Otherwise we risk reading arbitrarily
   corrupt data.
  
  Uhm. xlog.c does just the same. It reads the header and if it looks valid
  it uses its length information to read the full record and only computes
  the CRC at the end.
 
 Uh.  Correct.
 
 Am I the only one who finds this rather bizarre?  Maybe this was okay
 when xlog data would only come from WAL files stored in the data
 directory at recovery, but if we're now receiving these from a remote
 sender over the network I wonder if we should be protecting against
 malicious senders.  (This is not related to this patch anyway.)

How should this work otherwise? The CRC is over the whole data, so we obviously
need to read the whole data to compute the CRC. Would you prefer protecting
the header with a separate CRC?
You can't use a CRC against malicious users anyway; it's not cryptographically
secure in any meaning of the word, and it's trivial to generate different
content resulting in the same CRC. The biggest user of the CRC checking code
we have is making sure we're not reading beyond the end of the WAL...
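
To make the "not cryptographically secure" point concrete, here is a small
self-contained sketch (a plain bitwise CRC-32 over the IEEE/zlib polynomial,
not PostgreSQL's actual CRC code): CRC is linear over XOR, which is exactly
what makes same-CRC forgeries cheap to construct.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* bitwise, reflected CRC-32 (IEEE/zlib polynomial) */
static uint32_t
crc32_buf(const uint8_t *buf, size_t len)
{
	uint32_t	crc = 0xFFFFFFFF;

	for (size_t i = 0; i < len; i++)
	{
		crc ^= buf[i];
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0xEDB88320 & -(crc & 1));
	}
	return crc ^ 0xFFFFFFFF;
}

int
main(void)
{
	uint8_t		a[32], b[32], x[32], z[32];

	memset(z, 0, sizeof(z));
	for (int i = 0; i < 32; i++)
	{
		a[i] = (uint8_t) rand();
		b[i] = (uint8_t) rand();
		x[i] = a[i] ^ b[i];		/* a third, different message */
	}
	/* affine property: crc(a ^ b) == crc(a) ^ crc(b) ^ crc(all zeros) */
	if (crc32_buf(x, 32) ==
		(crc32_buf(a, 32) ^ crc32_buf(b, 32) ^ crc32_buf(z, 32)))
		printf("CRC is linear: colliding messages are easy to construct\n");
	return 0;
}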

Greetings,

Andres
-- 
Andres Freund   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] September 2012 commitfest

2012-10-30 Thread Greg Stark
On Mon, Oct 29, 2012 at 3:27 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 * tuplesort memory usage: grow_memtuples
   Greg Stark signed up for this

I'll commit this later this week. I looked at it briefly at the
conference but I think it actually does need some minor tweaks.

 * Trim trailing NULL columns
   Josh Berkus was going to do performance testing, but if he published
   anything I can't find it.  Robert said, in the previous commitfest,
   that if the benchmarks were right then this patch is ready to go in.
   It's been a long time since any committer weighed in on this thread, though.

 I vaguely recall Greg Stark signed up for another patch recently, but I
 can't readily find which one it was.

In the commitfest app I put my name down on the trim trailing NULL
columns patch. I'm generally for the idea and the benchmarks looked
convincing so unless there's something obviously broken I'll probably
commit it more or less as-is.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 3/8] Add support for a generic wal reading facility dubbed XLogReader

2012-10-30 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On Tuesday, October 30, 2012 03:20:03 PM Alvaro Herrera wrote:
 Am I the only one who finds this rather bizarre?  Maybe this was okay
 when xlog data would only come from WAL files stored in the data
 directory at recovery, but if we're now receiving these from a remote
 sender over the network I wonder if we should be protecting against
 malicious senders.  (This is not related to this patch anyway.)

  You can't use a CRC against malicious users anyway; it's not cryptographically
  secure in any meaning of the word,

More to the point, if a bad guy has got control of your WAL stream,
crashing the startup process with a ridiculous length word is several
orders of magnitude less than the worst thing he can find to do to you.
So the above argument seems completely nonsensical to me.  Anybody who's
worried about that type of scenario is better advised to be setting up
SSL to ensure that the stream is coming from the server they think it's
coming from.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 3/8] Add support for a generic wal reading facility dubbed XLogReader

2012-10-30 Thread Alvaro Herrera
Tom Lane escribió:
 Andres Freund and...@2ndquadrant.com writes:
  On Tuesday, October 30, 2012 03:20:03 PM Alvaro Herrera wrote:
  Am I the only one who finds this rather bizarre?  Maybe this was okay
  when xlog data would only come from WAL files stored in the data
  directory at recovery, but if we're now receiving these from a remote
  sender over the network I wonder if we should be protecting against
  malicious senders.  (This is not related to this patch anyway.)
 
  You can't use a CRC against malicious users anyway; it's not
  cryptographically secure in any meaning of the word,
 
 More to the point, if a bad guy has got control of your WAL stream,
 crashing the startup process with a ridiculous length word is several
 orders of magnitude less than the worst thing he can find to do to you.
 So the above argument seems completely nonsensical to me.

Well, I wasn't talking just about the record length, but about the
record in general.  The length just came up because it's what I noticed.

And yeah, I was thinking of one sum for the header and another one for
the data.  If we're using CRC to detect end of WAL, what sense does it
make to have to read the whole record if we can detect the end by just
looking at the header?  (I obviously see that we need to checksum the
data as well; and having a second CRC field would bloat the record.)


 Anybody who's worried about that type of scenario is better advised to
 be setting up SSL to ensure that the stream is coming from the server
 they think it's coming from.

Okay.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 3/8] Add support for a generic wal reading facility dubbed XLogReader

2012-10-30 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 And yeah, I was thinking of one sum for the header and another one for
 the data.

I don't think it's worth the space.

 If we're using CRC to detect end of WAL, what sense does it
 make to have to read the whole record if we can detect the end by just
 looking at the header?

Er, what?  The typical case where CRC shows us it's end of WAL is that
the overall CRC doesn't match, ie, torn record.  Record header
corruption as such is just about nonexistent AFAIK.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH 3/8] Add support for a generic wal reading facility dubbed XLogReader

2012-10-30 Thread Andres Freund
On Tuesday, October 30, 2012 04:24:21 PM Alvaro Herrera wrote:
 Tom Lane escribió:
  Andres Freund and...@2ndquadrant.com writes:
   On Tuesday, October 30, 2012 03:20:03 PM Alvaro Herrera wrote:
   Am I the only one who finds this rather bizarre?  Maybe this was okay
   when xlog data would only come from WAL files stored in the data
   directory at recovery, but if we're now receiving these from a remote
   sender over the network I wonder if we should be protecting against
   malicious senders.  (This is not related to this patch anyway.)
   
   You can't use a CRC against malicious users anyway; it's not
   cryptographically secure in any meaning of the word,
  
  More to the point, if a bad guy has got control of your WAL stream,
  crashing the startup process with a ridiculous length word is several
  orders of magnitude less than the worst thing he can find to do to you.
  So the above argument seems completely nonsensical to me.
 
 Well, I wasn't talking just about the record length, but about the
 record in general.  The length just came up because it's what I noticed.
 
  And yeah, I was thinking of one sum for the header and another one for
 the data.  If we're using CRC to detect end of WAL, what sense does it
 make to have to read the whole record if we can detect the end by just
 looking at the header?  (I obviously see that we need to checksum the
 data as well; and having a second CRC field would bloat the record.)

Well, the header is written first. In the header we can detect somewhat
accurately that we're beyond the current end-of-WAL by looking at ->xl_prev
and doing some validity checks, but that's not applicable for the data. A
valid-looking header doesn't mean that the whole, potentially multi-megabyte,
record data is valid; we could have crashed while writing the data.
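
For readers following along, the record header under discussion has roughly
this shape (a from-memory sketch with stand-in typedefs, not a verbatim
quote; the exact field order differs across versions):

#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint64_t XLogRecPtr;	/* two 32-bit halves in older releases */
typedef uint8_t RmgrId;
typedef uint32_t pg_crc32;

typedef struct XLogRecord
{
	uint32_t	xl_tot_len;		/* total length of entire record */
	TransactionId xl_xid;		/* xact id */
	uint32_t	xl_len;			/* length of resource-manager data */
	uint8_t		xl_info;		/* flag bits */
	RmgrId		xl_rmid;		/* resource manager for this record */
	XLogRecPtr	xl_prev;		/* position of previous record */
	pg_crc32	xl_crc;			/* CRC covering the rest of the record */
} XLogRecord;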

Greetings,

Andres
-- 
Andres Freund   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-30 Thread Christian Kruse
Hi,

On 29/10/12 21:14, Peter Geoghegan wrote:
 I have a few initial observations on this.

Thanks for your feedback.


 * I think you should be making the new GUC PGC_INTERNAL on platforms
 where MAP_HUGETLB is not defined or available. See also,
 effective_io_concurrency. This gives sane error handling.

Ok, sounds good to me.

 * Why aren't you failing when HUGE_TLB_ON and the mmap fails? You do
 the same thing as HUGE_TLB_TRY, contrary to your documentation:

  +if (huge_tlb_pages == HUGE_TLB_ON || huge_tlb_pages == 
 HUGE_TLB_TRY)

No, what I did was mmap() the memory with MAP_HUGETLB and fall
back to mmap() w/o MAP_HUGETLB when that failed. But there was
another bug: I didn't mmap() at all when huge_tlb_pages == off.

 * I think you should add MAP_HUGETLB to PG_MMAP_FLAGS directly, rather
 than XOR'ing after the fact. We already build the flags that comprise
 PG_MMAP_FLAGS using another set of intermediary flags, based on
 platform considerations, so what you're doing here is just
 inconsistent.

This would mean that I have to disable the bit when huge_tlb_pages ==
off. I think it is better to enable it if wanted and to just leave the
flags as they are when this feature is turned off; do you
disagree?

 Don't wrap the mmap() code in ifdefs, and instead rely
 on the GUC being available on all platforms, with setting the GUC
 causing an error on unsupported platforms, in the style of
 effective_io_concurrency, as established in commit
 3b07182e613a0e63ff662e3dc7f820f010923ec7 . Build a PG_MAP_HUGETLB
 intermediary flag on all platforms.

Ok, this sounds good. It will remove complexity from the code.

 * Apparently you're supposed to do this for the benefit of Itanic [1]:

 /* Only ia64 requires this */
 #ifdef __ia64__
 #define ADDR (void *)(0x8000UL)
 #define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_FIXED)
 #else
 #define ADDR (void *)(0x0UL)
 #define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB)
 #endif

Hm… is this enough? Or do we have to do more work for ia64?

 * You aren't following our style guide with respect to error messages [2].

Thanks, I wasn't aware of this. I attached a new version of the patch.

Greetings,
 CK


pgp3AmMtkv8SY.pgp
Description: PGP signature


Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-30 Thread Christian Kruse
Hi,

On 29/10/12 16:33, Tom Lane wrote:
  I created a patch which implements MAP_HUGETLB for sysv shared memory
  segments (PGSharedMemoryCreate). It is based on tests by Tom Lane and
  Andres Freund; I added error handling, huge page size detection and a
  GUC variable.

 My recollection is we'd decided not to pursue that because of fears that
 it could make performance horribly worse on some platforms.

This is why I created a GUC for turning this feature off.

Greetings,
 CK


pgpADXeFXTb5Z.pgp
Description: PGP signature


Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-30 Thread Christian Kruse
Hey,

Oh man, first I didn't send the email to the list and now I forgot the
attachment. I should really get some sleep, sorry for any
inconvenience :(

Greetings,
 CK
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b4fcbaf..66ed10f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1049,6 +1049,37 @@ include 'filename'
   </listitem>
  </varlistentry>
 
+ <varlistentry id="guc-huge-tlb-pages" xreflabel="huge_tlb_pages">
+  <term><varname>huge_tlb_pages</varname> (<type>enum</type>)</term>
+  <indexterm>
+   <primary><varname>huge_tlb_pages</> configuration parameter</primary>
+  </indexterm>
+  <listitem>
+   <para>
+    Enables/disables the use of huge tlb pages. Valid values are
+    <literal>on</literal>, <literal>off</literal> and <literal>try</literal>.
+    The default value is <literal>try</literal>.
+   </para>
+
+   <para>
+    With <varname>huge_tlb_pages</varname> set to <literal>on</literal>
+    <symbol>mmap()</symbol> will be called with <symbol>MAP_HUGETLB</symbol>.
+    If the call fails the server will fail fatally.
+   </para>
+
+   <para>
+    With <varname>huge_tlb_pages</varname> set to <literal>off</literal> we
+    will not use <symbol>MAP_HUGETLB</symbol> at all.
+   </para>
+
+   <para>
+    With <varname>huge_tlb_pages</varname> set to <literal>try</literal>
+    we will try to use <symbol>MAP_HUGETLB</symbol> and fall back to
+    <symbol>mmap()</symbol> without <symbol>MAP_HUGETLB</symbol>.
+   </para>
+  </listitem>
+ </varlistentry>
+
 <varlistentry id="guc-temp-buffers" xreflabel="temp_buffers">
  <term><varname>temp_buffers</varname> (<type>integer</type>)</term>
  <indexterm>
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index df06312..73cfdef 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -27,10 +27,14 @@
 #ifdef HAVE_SYS_SHM_H
 #include <sys/shm.h>
 #endif
+#ifdef MAP_HUGETLB
+#include <dirent.h>
+#endif
 
 #include "miscadmin.h"
 #include "storage/ipc.h"
 #include "storage/pg_shmem.h"
+#include "utils/guc.h"
 
 
 typedef key_t IpcMemoryKey;		/* shared memory key passed to shmget(2) */
@@ -61,6 +65,19 @@ typedef int IpcMemoryId;		/* shared memory ID returned by shmget(2) */
 #define MAP_FAILED ((void *) -1)
 #endif
 
+#ifdef MAP_HUGETLB
+#  ifdef __ia64__
+#define PG_HUGETLB_BASE_ADDR (void *)(0x8000UL)
+#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
+#  else
+#define PG_HUGETLB_BASE_ADDR (void *)(0x0UL)
+#define PG_MAP_HUGETLB MAP_HUGETLB
+#  endif
+#else
+#  define PG_MAP_HUGETLB 0
+#endif
+
+
 
 unsigned long UsedShmemSegID = 0;
 void  *UsedShmemSegAddr = NULL;
@@ -342,6 +359,154 @@ PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2)
 }
 
 
+#ifdef MAP_HUGETLB
+#define HUGE_PAGE_INFO_DIR  "/sys/kernel/mm/hugepages"
+
+/*
+ * static long InternalGetFreeHugepagesCount(const char *name)
+ *
+ * Attempt to read the number of available hugepages from
+ * /sys/kernel/mm/hugepages/hugepages-<size>/free_hugepages
+ * Will fail (return -1) if file could not be opened, 0 if no pages are available
+ * and > 0 if there are free pages
+ *
+ */
+static long
+InternalGetFreeHugepagesCount(const char *name)
+{
+	int			fd;
+	char		buff[1024];
+	size_t		len;
+	long		result;
+	char	   *ptr;
+
+	len = snprintf(buff, 1024, "%s/%s/free_hugepages", HUGE_PAGE_INFO_DIR, name);
+	if (len == 1024) /* I don't think that this will happen ever */
+	{
+		ereport(WARNING,
+				(errmsg("Filename %s/%s/free_hugepages is too long", HUGE_PAGE_INFO_DIR, name),
+				 errcontext("while checking hugepage size")));
+		return -1;
+	}
+
+	fd = open(buff, O_RDONLY);
+	if (fd <= 0)
+	{
+		ereport(WARNING,
+				(errmsg("Could not open file %s: %s", buff, strerror(errno)),
+				 errcontext("while checking hugepage size")));
+		return -1;
+	}
+
+	len = read(fd, buff, 1024);
+	if (len <= 0)
+	{
+		ereport(WARNING,
+				(errmsg("Error reading from file %s: %s", buff, strerror(errno)),
+				 errcontext("while checking hugepage size")));
+		close(fd);
+		return -1;
+	}
+
+	/*
+	 * If the content of free_hugepages is longer than or equal to 1024 bytes
+	 * the rest is irrelevant; we simply want to know if there are any
+	 * hugepages left
+	 */
+	if (len == 1024)
+	{
+		buff[1023] = 0;
+	}
+	else
+	{
+		buff[len] = 0;
+	}
+
+	close(fd);
+
+	result = strtol(buff, &ptr, 10);
+
+	if (ptr == NULL)
+	{
+		ereport(WARNING,
+				(errmsg("Could not convert contents of file %s/%s/free_hugepages 

Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-30 Thread Andres Freund
On Tuesday, October 30, 2012 08:20:33 PM Christian Kruse wrote:
 Hey,
 
 Oh man, first I didn't sent the email to the list and now I forgot the
 attachment. I should really get some sleep, sorry for any
 inconveniences :(

+#ifdef MAP_HUGETLB
+#  ifdef __ia64__
+#define PG_HUGETLB_BASE_ADDR (void *)(0x8000UL)
+#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
+#  else

Not your fault, but that looks rather strange to me. The level of documentation 
around the requirement of using MAP_FIXED is subpar...

+long
+InternalGetHugepageSize()
+{
+...
+   if (biggest_size < size && InternalGetFreeHugepagesCount(ent->d_name) > 0)
+   {
+   biggest_size = size;
+   }

I dislike automatically using the biggest size here. There are platforms with 
16GB hugepages... I think we should rather always use the smallest size and 
leave it to the kernel to choose whatever pagesize he wants to use.

+#ifdef MAP_HUGETLB
+   long	pagesize = InternalGetHugepageSize();
+
+   if (pagesize <= 0)
+   	pagesize = sysconf(_SC_PAGE_SIZE);
+#else
    long	pagesize = sysconf(_SC_PAGE_SIZE);
+#endif

That's going to log two warnings if the current system doesn't have hugepage 
support compiled in and thus no /sys/kernel/mm/hugepages directory.
I think you should 1. only test this if TRY or ON is configured, 2. make the 
messages in InternalGetHugepageSize DEBUG1 at least for the TRY case.
 
+   {
+   {huge_tlb_pages,
+#ifdef MAP_HUGETLB
+   PGC_SUSET,
+#else
+   PGC_INTERNAL,
+#endif
+   RESOURCES_MEM,
+   gettext_noop("Enable/disable the use of the hugepages feature"),
+   NULL
+   },
+   huge_tlb_pages,
+   HUGE_TLB_TRY, huge_tlb_options,
+   NULL, NULL, NULL
+   },

you use HUGE_TLB_TRY without an ifdef here.


Looking forward to having this...

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-30 Thread Christian Kruse
Hey,

On 30/10/12 20:33, Andres Freund wrote:
 +#ifdef MAP_HUGETLB
 +#  ifdef __ia64__
 +#define PG_HUGETLB_BASE_ADDR (void *)(0x8000UL)
 +#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
 +#  else

 Not your fault, but that looks rather strange to me. The level of 
 documentation
 around the requirement of using MAP_FIXED is subpar...

Yeah, looks strange to me, too. This is why I asked if this is really
all we have to do.

 I dislike automatically using the biggest size here. There are platforms with
 16GB hugepages... I think we should rather always use the smallest size and
 leave it to the kernel to choose whatever pagesize he wants to use.

Good point. I will change this to the smallest available page size.

 That's going to log two warnings if the current system doesn't have hugepage
 support compiled in and thus no /sys/kernel/mm/hugepages directory.
 I think you should 1. only test this if TRY or ON is configured, 2. make the
 messages in InternalGetHugepageSize DEBUG1 at least for the TRY case.

Ok, point taken.

 […]
 + HUGE_TLB_TRY, huge_tlb_options,
 + NULL, NULL, NULL
 + },

 you use HUGE_TLB_TRY without an ifdef here.

Ah, right. Setting it to HUGE_TLB_OFF when no MAP_HUGETLB is available.

Greetings,
 CK


pgp6PpeR5Kpxp.pgp
Description: PGP signature


Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-30 Thread Christian Kruse
Hey,

ok, I think I implemented all of the changes you requested. All but
the ia64-dependent one; I have to do more research for that.


Greetings,
 CK
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index b4fcbaf..66ed10f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1049,6 +1049,37 @@ include 'filename'
   </listitem>
  </varlistentry>
 
+ <varlistentry id="guc-huge-tlb-pages" xreflabel="huge_tlb_pages">
+  <term><varname>huge_tlb_pages</varname> (<type>enum</type>)</term>
+  <indexterm>
+   <primary><varname>huge_tlb_pages</> configuration parameter</primary>
+  </indexterm>
+  <listitem>
+   <para>
+    Enables/disables the use of huge tlb pages. Valid values are
+    <literal>on</literal>, <literal>off</literal> and <literal>try</literal>.
+    The default value is <literal>try</literal>.
+   </para>
+
+   <para>
+    With <varname>huge_tlb_pages</varname> set to <literal>on</literal>
+    <symbol>mmap()</symbol> will be called with <symbol>MAP_HUGETLB</symbol>.
+    If the call fails the server will fail fatally.
+   </para>
+
+   <para>
+    With <varname>huge_tlb_pages</varname> set to <literal>off</literal> we
+    will not use <symbol>MAP_HUGETLB</symbol> at all.
+   </para>
+
+   <para>
+    With <varname>huge_tlb_pages</varname> set to <literal>try</literal>
+    we will try to use <symbol>MAP_HUGETLB</symbol> and fall back to
+    <symbol>mmap()</symbol> without <symbol>MAP_HUGETLB</symbol>.
+   </para>
+  </listitem>
+ </varlistentry>
+
 <varlistentry id="guc-temp-buffers" xreflabel="temp_buffers">
  <term><varname>temp_buffers</varname> (<type>integer</type>)</term>
  <indexterm>
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index df06312..f9de239 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -27,10 +27,14 @@
 #ifdef HAVE_SYS_SHM_H
 #include <sys/shm.h>
 #endif
+#ifdef MAP_HUGETLB
+#include <dirent.h>
+#endif
 
 #include "miscadmin.h"
 #include "storage/ipc.h"
 #include "storage/pg_shmem.h"
+#include "utils/guc.h"
 
 
 typedef key_t IpcMemoryKey;		/* shared memory key passed to shmget(2) */
@@ -61,6 +65,19 @@ typedef int IpcMemoryId;		/* shared memory ID returned by shmget(2) */
 #define MAP_FAILED ((void *) -1)
 #endif
 
+#ifdef MAP_HUGETLB
+#  ifdef __ia64__
+#define PG_HUGETLB_BASE_ADDR (void *)(0x8000UL)
+#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
+#  else
+#define PG_HUGETLB_BASE_ADDR (void *)(0x0UL)
+#define PG_MAP_HUGETLB MAP_HUGETLB
+#  endif
+#else
+#  define PG_MAP_HUGETLB 0
+#endif
+
+
 
 unsigned long UsedShmemSegID = 0;
 void  *UsedShmemSegAddr = NULL;
@@ -73,7 +90,6 @@ static void IpcMemoryDelete(int status, Datum shmId);
 static PGShmemHeader *PGSharedMemoryAttach(IpcMemoryKey key,
 IpcMemoryId *shmid);
 
-
 /*
  * InternalIpcMemoryCreate(memKey, size)
  *
@@ -342,6 +358,155 @@ PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2)
 }
 
 
+#ifdef MAP_HUGETLB
+#define HUGE_PAGE_INFO_DIR  "/sys/kernel/mm/hugepages"
+
+/*
+ * static long InternalGetFreeHugepagesCount(const char *name)
+ *
+ * Attempt to read the number of available hugepages from
+ * /sys/kernel/mm/hugepages/hugepages-<size>/free_hugepages
+ * Will fail (return -1) if file could not be opened, 0 if no pages are available
+ * and > 0 if there are free pages
+ *
+ */
+static long
+InternalGetFreeHugepagesCount(const char *name)
+{
+	int			fd;
+	char		buff[1024];
+	size_t		len;
+	long		result;
+	char	   *ptr;
+
+	len = snprintf(buff, 1024, "%s/%s/free_hugepages", HUGE_PAGE_INFO_DIR, name);
+	if (len == 1024) /* I don't think that this will happen ever */
+	{
+		ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
+				(errmsg("Filename %s/%s/free_hugepages is too long", HUGE_PAGE_INFO_DIR, name),
+				 errcontext("while checking hugepage size")));
+		return -1;
+	}
+
+	fd = open(buff, O_RDONLY);
+	if (fd <= 0)
+	{
+		ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
+				(errmsg("Could not open file %s: %s", buff, strerror(errno)),
+				 errcontext("while checking hugepage size")));
+		return -1;
+	}
+
+	len = read(fd, buff, 1024);
+	if (len <= 0)
+	{
+		ereport(huge_tlb_pages == HUGE_TLB_TRY ? DEBUG1 : WARNING,
+				(errmsg("Error reading from file %s: %s", buff, strerror(errno)),
+				 errcontext("while checking hugepage size")));
+		close(fd);
+		return -1;
+	}
+
+	/*
+	 * If the content of free_hugepages is longer than or equal to 1024 bytes
+	 * the rest is irrelevant; we simply want to know if there are any
+	 * hugepages left
+	 */
+	

[HACKERS] [PATCH] PL/Python: Add spidata to all spiexceptions

2012-10-30 Thread Oskari Saarenmaa
PL/Python maps Python SPIError exceptions with 'spidata' attribute into SQL
errors.  PL/Python also creates classes in plpy.spiexceptions for all known
errors but does not initialize their spidata, so when a PL/Python function
raises such an exception it is not recognized properly and is always
reported as an internal error.

This allows PL/Python code to raise exceptions that PL/pgSQL can catch and
which are correctly reported in logs instead of always showing up as XX000.
---
 src/pl/plpython/expected/plpython_error.out   | 12 
 src/pl/plpython/expected/plpython_error_0.out | 12 
 src/pl/plpython/plpy_plpymodule.c |  9 +
 src/pl/plpython/sql/plpython_error.sql| 14 ++
 4 files changed, 47 insertions(+)

diff --git a/src/pl/plpython/expected/plpython_error.out 
b/src/pl/plpython/expected/plpython_error.out
index e1ec9c2..c1c36d9 100644
--- a/src/pl/plpython/expected/plpython_error.out
+++ b/src/pl/plpython/expected/plpython_error.out
@@ -400,3 +400,15 @@ CONTEXT:  Traceback (most recent call last):
   PL/Python function manual_subxact_prepared, line 4, in <module>
 plpy.execute(save)
 PL/Python function manual_subxact_prepared
+/* raising plpy.spiexception.* from python code should preserve sqlstate
+ */
+CREATE FUNCTION plpy_raise_spiexception() RETURNS void AS $$
+raise plpy.spiexceptions.DivisionByZero()
+$$ LANGUAGE plpythonu;
+DO $$
+BEGIN
+   SELECT plpy_raise_spiexception();
+EXCEPTION WHEN division_by_zero THEN
+   -- NOOP
+END
+$$ LANGUAGE plpgsql;
diff --git a/src/pl/plpython/expected/plpython_error_0.out 
b/src/pl/plpython/expected/plpython_error_0.out
index 6f74a50..61d95e3 100644
--- a/src/pl/plpython/expected/plpython_error_0.out
+++ b/src/pl/plpython/expected/plpython_error_0.out
@@ -400,3 +400,15 @@ CONTEXT:  Traceback (most recent call last):
   PL/Python function manual_subxact_prepared, line 4, in <module>
 plpy.execute(save)
 PL/Python function manual_subxact_prepared
+/* raising plpy.spiexception.* from python code should preserve sqlstate
+ */
+CREATE FUNCTION plpy_raise_spiexception() RETURNS void AS $$
+raise plpy.spiexceptions.DivisionByZero()
+$$ LANGUAGE plpythonu;
+DO $$
+BEGIN
+   SELECT plpy_raise_spiexception();
+EXCEPTION WHEN division_by_zero THEN
+   -- NOOP
+END
+$$ LANGUAGE plpgsql;
diff --git a/src/pl/plpython/plpy_plpymodule.c 
b/src/pl/plpython/plpy_plpymodule.c
index 37ea2a4..4213e83 100644
--- a/src/pl/plpython/plpy_plpymodule.c
+++ b/src/pl/plpython/plpy_plpymodule.c
@@ -247,6 +247,7 @@ PLy_generate_spi_exceptions(PyObject *mod, PyObject *base)
PyObject   *exc;
PLyExceptionEntry *entry;
PyObject   *sqlstate;
+   PyObject   *spidata;
PyObject   *dict = PyDict_New();
 
if (dict == NULL)
@@ -258,6 +259,14 @@ PLy_generate_spi_exceptions(PyObject *mod, PyObject *base)
 
PyDict_SetItemString(dict, "sqlstate", sqlstate);
Py_DECREF(sqlstate);
+
+	spidata = Py_BuildValue("izzzi", exception_map[i].sqlstate,
+							NULL, NULL, NULL, 0);
+	if (spidata == NULL)
+		PLy_elog(ERROR, "could not generate SPI exceptions");
+	PyDict_SetItemString(dict, "spidata", spidata);
+	Py_DECREF(spidata);
+
exc = PyErr_NewException(exception_map[i].name, base, dict);
PyModule_AddObject(mod, exception_map[i].classname, exc);
entry = hash_search(PLy_spi_exceptions, 
exception_map[i].sqlstate,
diff --git a/src/pl/plpython/sql/plpython_error.sql 
b/src/pl/plpython/sql/plpython_error.sql
index 502bbec..ec93144 100644
--- a/src/pl/plpython/sql/plpython_error.sql
+++ b/src/pl/plpython/sql/plpython_error.sql
@@ -298,3 +298,17 @@ plpy.execute(rollback)
 $$ LANGUAGE plpythonu;
 
 SELECT manual_subxact_prepared();
+
+/* raising plpy.spiexception.* from python code should preserve sqlstate
+ */
+CREATE FUNCTION plpy_raise_spiexception() RETURNS void AS $$
+raise plpy.spiexceptions.DivisionByZero()
+$$ LANGUAGE plpythonu;
+
+DO $$
+BEGIN
+   SELECT plpy_raise_spiexception();
+EXCEPTION WHEN division_by_zero THEN
+   -- NOOP
+END
+$$ LANGUAGE plpgsql;
-- 
1.7.12.1



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions Documentation

2012-10-30 Thread Peter Eisentraut
On Fri, 2012-10-26 at 09:09 -0700, David E. Wheeler wrote:
 On Oct 26, 2012, at 5:27 AM, Peter Eisentraut pete...@gmx.net wrote: 
  The advantage that these programming language ecosystems have is that
  they can implement the processors for the documentation format in the
  language itself, so it's easy to recommend or enforce a particular
  system.  I don't think we're going to implement any documentation
  processing in SQL, so we'd end up adding some kind of external
  dependency, and that's usually tricky.
 
 True, which is why I was thinking of something relatively light-weight
 and self-contained like sundown.

That's a markdown library, which transforms markdown to HTML, right?  So
what would we do with the HTML?

  Also, there are wildly diverging paradigms in use.  For example, in
  Perl, the convention is one man page per module.  In Python, for most
  modules you don't get any locally installed documentation by default.
  Instead, you're encouraged to upload your stuff to readthedocs.org.  All
  of these have their advantages, but I think it's too early to tell what
  the best convention for a PostgreSQL extension would be.
 
 Well, only because nothing much exists yet. The convention for what
 gets turned into proper documentation (e.g., man pages and/or HTML
 output) will be whatever someone decides to implement and get
 committed. Because otherwise, there is no convention at all, aside
 from the current convention of a plain text file stuck in the docs
 folder, which isn't really documentation, IMHO.

I don't agree with this approach.  There is no need to be prescriptive.
Enforcing a documentation format won't make better, or any,
documentation anyway.

That said, if there are things we could put in, e.g., pgxs, to make
building documentation simpler, we can discuss that.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Preserve intermediate .c files in coverage mode

2012-10-30 Thread Peter Eisentraut
On Sun, 2012-10-28 at 11:10 -0400, Tom Lane wrote:
 Peter Eisentraut pete...@gmx.net writes:
  Preserve intermediate .c files in coverage mode
 
  The introduction of the .y -> .c pattern rule causes some .c files such
  as bootparse.c to be considered intermediate files in the .y -> .c -> .o
  rule chain, which make would automatically delete.  But in coverage
  mode, the processing tools such as genhtml need those files, so mark
  them as precious so that make preserves them.
 
 [ blink... ]  I'd vote for making them precious all the time.  No such
 behavioral change was discussed or agreed to,

This is standard, default make behavior.  It only showed up here because
the coverage processing doesn't list all the files it needs in make
rules.

 and I'm concerned about
 possible side effects, for instance losing files that you don't have the
 tools to rebuild when working from a distributed tarball.

I don't think this is a problem because distprep explicitly lists the
files it builds.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions Documentation

2012-10-30 Thread David E. Wheeler
On Oct 30, 2012, at 2:08 PM, Peter Eisentraut pete...@gmx.net wrote:

 True, which is why I was thinking of something relatively light-weight
 and self-contained like sundown.
 
 That's a markdown library, which transforms markdown to HTML, right?  So
 what would we do with the HTML?

Put it into the HTML directory (share/docs/html/extensions/$extension.html) and 
inject its name into the TOC.

I'd also be in favor of adding hooks to generate man pages.

 I don't agree with this approach.  There is no need to be prescriptive.
 Enforcing a documentation format won't make better, or any,
 documentation anyway.

My point isn't to be prescriptive, but that actual code tends to win out in the 
end, as opposed to discussion. (Which, yes, I have started.)

 That said, if there are things we could put in, e.g., pgxs, to make
 building documentation simpler, we can discuss that.

Yeah, that would be ideal. But if no one has really thought about how to go 
about it yet…

David



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-10-30 Thread Josh Berkus
On 10/29/12 6:40 AM, Chris Corbyn wrote:
 What's the use case of this? It sounds like it will just create a maintenance 
 nightmare where some stuff you expect to look up in postgresql.conf is 
 actually hiding in the .auto file. Assuming only super users/sysadmins would 
 have the ability to change things in the config file, wouldn't they be more 
 likely to just do it on the server and edit the .conf (which among other 
 things, keeps it tidy and orderly).

The use is the ability to manage dozens, or hundreds, of PostgreSQL
servers via Port 5432.  It would also make writing an auto-configurator
easier.

I agree that there's not much benefit if you're only managing a single
PostgreSQL server.  There's a lot of benefit for those of us who have to
manage a lot of them though.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Preserve intermediate .c files in coverage mode

2012-10-30 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On Sun, 2012-10-28 at 11:10 -0400, Tom Lane wrote:
 [ blink... ]  I'd vote for making them precious all the time.  No such
 behavioral change was discussed or agreed to,

 This is standard, default make behavior.  It only showed up here because
 the coverage processing doesn't list all the files it needs in make
 rules.

Yeah, I know it's default, but it's not how our make files ever
treated these files before.  At the risk of repeating myself: a change
of this sort was not discussed nor agreed to.  And I'm not agreeing to
it.  I don't care for the idea that the build tree after a regular make
might not include all files that would be in a distribution tarball.

A concrete usage case that this breaks is doing something like
find . -name '*.c' | xargs grep whatever
Up to now I've always been able to assume that that would catch
occurrences of whatever coming from *.y and *.l files.  No more
though.  Maybe the derived *.c files are there, or maybe they're
not --- it'll be really history-dependent.

Please just make them precious.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-10-30 Thread Christopher Browne
On Tue, Oct 30, 2012 at 5:25 PM, Josh Berkus j...@agliodbs.com wrote:
 On 10/29/12 6:40 AM, Chris Corbyn wrote:
 What's the use case of this? It sounds like it will just create a 
 maintenance nightmare where some stuff you expect to look up in 
 postgresql.conf is actually hiding in the .auto file. Assuming only super 
 users/sysadmins would have the ability to change things in the config file, 
 wouldn't they be more likely to just do it on the server and edit the .conf 
 (which among other things, keeps it tidy and orderly).

 The use is the ability to manage dozens, or hundreds, of PostgreSQL
 servers via Port 5432.  It would also make writing an auto-configurator
 easier.

 I agree that there's not much benefit if you're only managing a single
 PostgreSQL server.  There's a lot of benefit for those of us who have to
 manage a lot of them though.

I rather think that the fact that postgresql.conf has supported an
include directive since at least as far back as 8.2 (likely further;
I'll not bother spelunking further into the docs) makes this extremely
troublesome.

We have long supported the notion that this configuration does not
have a Unique Place to be (e.g. - if you use INCLUDE, then there are
at least two possible places).

I should think that doing this requires heading back towards there
being a single unique configuration stream, and over the course of
several versions, deprecating the INCLUDE directive.

I imagine it means we'd want to come up with a representation that has
suitable semantics for being written to, make sure it is reasonably
expressive *without* INCLUDE, and establish a migration path between
the old and new forms.  At some point, the old form can be treated as
vestigial, and be dropped.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-10-30 Thread Josh Berkus

 I should think that doing this requires heading back towards there
 being a single unique configuration stream, and over the course of
 several versions, deprecating the INCLUDE directive.

Oh, maybe I should take a closer look at Amit's proposal then.  I
thought we planned to make use of the INCLUDE facility for SET
PERSISTENT, including supporting include-if-exists.  Possibly what he's
proposing and what I thought our last consensus was are highly divergent.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Limiting the number of parameterized indexpaths created

2012-10-30 Thread Tom Lane
I looked into the complaint of unreasonable planner runtime in bug #7626,
http://archives.postgresql.org/pgsql-bugs/2012-10/msg00232.php

In the given example, the indexed relation foo has join clauses with
30 other relations.  The code that I added in commit
3b8968f25232ad09001bf35ab4cc59f5a501193e will try all 2^30 combinations
of those rels as possible outer relations for a parameterized indexscan
:-(.  So clearly, the idea that we can just try everything and not have
some kind of heuristic restriction isn't going to work.

This particular example, and probably a lot of similar examples, does
have a principled non-heuristic fix, which comes from the fact that we
recognize all the join clauses as belonging to the same equivalence
class.  Therefore, using more than one of them in an indexscan is
useless.  (The code already does know that and discards the extra
clauses, but that happens too far downstream to prevent the exponential
search time here.)  The extra function eclass_already_used() added in
the attached patch attacks the problem this way, and is sufficient to
resolve the bug for the case presented.
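
A miniature of the query shape (a sketch only; the actual report joins
foo to roughly 30 relations, all on the same indexed column, so every
join clause falls into a single equivalence class):

CREATE TABLE foo (x int PRIMARY KEY);
CREATE TABLE r1 (x int);
CREATE TABLE r2 (x int);
CREATE TABLE r3 (x int);

SELECT count(*)
  FROM foo, r1, r2, r3
 WHERE foo.x = r1.x
   AND foo.x = r2.x
   AND foo.x = r3.x;
-- with 30 such rels the old code could examine up to 2^30 outer-rel sets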

However, we still have an issue for cases where the join clauses aren't
equivalence clauses.  (For instance, if you just change all the =
signs to < in Brian's example, it still takes forever.)  So we also
need some heuristic rule to limit the number of cases considered.

I spent a good deal of time thinking about how the indexscan code might
make use of the joinlist data structure (which prevents exponential
planning time growth overall by limiting which joins we'll consider)
to fix this; the idea being to not consider outer-relation sets that
couldn't actually be presented for use because of joinlist restrictions.
But this looked messy, and probably expensive in itself.  Moreover it
seemed like practical implementations of the idea would carry the
restriction that we could never generate parameterized paths for
joinlist subproblems, which would be a real shame.  And we'd be doing a
lot of work for what is probably a very unusual corner case, given that
I think eclass_already_used() will fix the problem in most real cases.

So what I'm proposing instead, which is implemented in the other half of
the attached patch, is that we simply put an arbitrary limit on how many
outer-relation sets we'll consider while generating parameterized
indexscans.  As written, the patch limits the number of relid sets to 10
times the number of join clauses it's working with, which is small
enough to keep the runtime of this test case to something reasonable.
I think that in practice this will still allow useful
combination-outer-rel cases to be found.  I wonder though if anyone has
a better idea?

regards, tom lane

diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 444ec2a40e4ac2b0ed8435017149c7664869e34c..78aaca4f279d6ba580643c75d82d82bfc88fba38 100644
*** a/src/backend/optimizer/path/indxpath.c
--- b/src/backend/optimizer/path/indxpath.c
*** static void consider_index_join_outer_re
*** 92,97 
--- 92,98 
  			   IndexClauseSet *eclauseset,
  			   List **bitindexpaths,
  			   List *indexjoinclauses,
+ 			   int considered_clauses,
  			   List **considered_relids);
  static void get_join_index_paths(PlannerInfo *root, RelOptInfo *rel,
  	 IndexOptInfo *index,
*** static void get_join_index_paths(Planner
*** 101,106 
--- 102,109 
  	 List **bitindexpaths,
  	 Relids relids,
  	 List **considered_relids);
+ static bool eclass_already_used(EquivalenceClass *parent_ec, Relids oldrelids,
+ 	List *indexjoinclauses);
  static bool bms_equal_any(Relids relids, List *relids_list);
  static void get_index_paths(PlannerInfo *root, RelOptInfo *rel,
  IndexOptInfo *index, IndexClauseSet *clauses,
*** consider_index_join_clauses(PlannerInfo 
*** 447,452 
--- 450,456 
  			IndexClauseSet *eclauseset,
  			List **bitindexpaths)
  {
+ 	int			considered_clauses = 0;
  	List	   *considered_relids = NIL;
  	int			indexcol;
  
*** consider_index_join_clauses(PlannerInfo 
*** 460,467 
  	 * filter (qpqual); which is where an available clause would end up being
  	 * applied if we omit it from the indexquals.
  	 *
! 	 * This looks expensive, but in practical cases there won't be very many
! 	 * distinct sets of outer rels to consider.
  	 *
  	 * For simplicity in selecting relevant clauses, we represent each set of
  	 * outer rels as a maximum set of clause_relids --- that is, the indexed
--- 464,474 
  	 * filter (qpqual); which is where an available clause would end up being
  	 * applied if we omit it from the indexquals.
  	 *
! 	 * This looks expensive, but in most practical cases there won't be very
! 	 * many distinct sets of outer rels to consider.  As a safety valve when
! 	 * that's not true, we use a heuristic: limit the number 

Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-10-30 Thread Hannu Krosing

On 10/29/2012 03:14 PM, Amit Kapila wrote:


On Monday, October 29, 2012 7:11 PM Chris Corbyn

 What's the use case of this? It sounds like it will just create a 
maintenance nightmare where some stuff you expect to look up in 
postgresql.conf is actually hiding in the .auto file. Assuming only 
super users/sysadmins would have the ability to change things in the 
config file, wouldn't they be more likely to just do it on the server 
and edit the .conf (which among other things, keeps it tidy and orderly).


Basically, after this the user will have 2 options for changing 
postgresql.conf parameters.


One is by directly editing the postgresql.conf file and

Other is by using SQL commands.

There will be nothing hidden in the .auto file; it's just that a 
separate file will be created for parameters set by SQL commands, to 
avoid the hassle of parsing postgresql.conf during the processing of the 
SQL command.


If interested, I have somewhere PL/PythonU functions for both looking at 
and changing parameters in the postgresql.conf file.

They even keep the old value and add comments to both the old and the new 
one about who changed it and when.
They could also be extended to, for example, rotate the last 10 
postgresql.conf files and/or skip rewriting the file in case the effective 
value of the GUC did not change.


Cheers,
Hannu



Re: [HACKERS] Re: [HACKERS] Patch für MAP_HUGETLB for mmap() shared memory

2012-10-30 Thread Christian Kruse
Hey,

On Tuesday 30 October 2012 20:33:18 Andres Freund wrote:
 +#ifdef MAP_HUGETLB
 +#  ifdef __ia64__
 +#define PG_HUGETLB_BASE_ADDR (void *)(0x8000UL)
 +#define PG_MAP_HUGETLB (MAP_HUGETLB|MAP_FIXED)
 +#  else
 
 Not your fault, but that looks rather strange to me. The level of
 documentation around the requirement of using MAP_FIXED is subpar...

Ok, after further reading I think that with MAP_FIXED one has to "generate" 
SHM segment addresses himself. So 0x8000UL would just be one possible 
location for a hugepage.

Since this feature is not used in PG for now, the code should work. Could 
someone with access to an ia64 machine verify it?

Greetings,
 CK



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-10-30 Thread Andres Freund
On Tuesday, October 30, 2012 11:24:20 PM Tom Lane wrote:
 The fact that this isn't being done by a large number of
 people (is anybody at all actually doing it?) suggests to me that maybe
 the demand isn't all that great.

It might also be that the idea of implementing that yourself is quite scary.

Also you would need to parse the file to reset values which isn't exactly 
easy... I think it only really becomes viable with the introduction of 
directory includes where you can use one file per value.

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-10-30 Thread Josh Berkus
Tom,

 I'm not convinced we ever *had* a consensus on this.  There were
 proposals, but I'm not sure a majority ever bought into any one of 'em.
 The whole problem of intermixing manual editing and programmatic editing
 is just a big can of worms, and not everybody is prepared to give up the
 former to have the latter.

Well, I think we have consensus that intermixing is impractical, which
is why every further proposal is around having a separate file for the
SQL-modified values.  And yes, we have a certain amount of "You'll get
my carefully edited postgresql.conf when you pry it out of my cold, dead
hands" going on.

The real consensus problem, AFAICT, is that while we have consensus that
we would like something like SET PERSISTENT as an *option*, there's a
Hurricane Sandy-sized Bikeshedding Windstorm about how, exactly, people
would like it to work.  Personally, I would prefer the implementation
which actually gets committed. ;-)

 You can, if you are so inclined, implement something functionally
 equivalent to Amit's proposal today using contrib/adminpack's
 pg_file_write --- okay, it's much less convenient than a built-in
 implementation would be, but you can drop some variable assignments into
 a file and then put a suitable INCLUDE into the otherwise-static main
 config file.  The fact that this isn't being done by a large number of
 people (is anybody at all actually doing it?) suggests to me that maybe
 the demand isn't all that great.

It suggests nothing of the sort:

1. a tiny minority of our users even know about adminpack

2. implementing it the way you suggest would require a hacker's
understanding of Postgres, which is an even smaller minority.
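
For reference, the workaround Tom describes would look roughly like this 
(a sketch only, assuming adminpack is installed and postgresql.conf 
already contains an include 'postgresql.conf.extra' line -- which rather 
proves the point about needing hacker-level knowledge):

  -- pg_file_write() paths are relative to the data directory.
  SELECT pg_catalog.pg_file_write('postgresql.conf.extra',
                                  'work_mem = ''64MB''' || E'\n',
                                  false);   -- false = overwrite, not append
  SELECT pg_reload_conf();                  -- re-read configuration files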

On the other hand, the success of tools like Puppet has made having SET
PERSISTENT a lot less urgent for many large-scale installation managers.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Move postgresql_fdw_validator into dblink

2012-10-30 Thread Shigeru Hanada
Sorry for the long absence.

On Sat, Oct 20, 2012 at 4:24 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 IIRC, the reason for postgresql_fdw instead of pgsql_fdw was
 that no other FDW module has a shortened name, such as ora_fdw
 for Oracle.
 However, I doubt whether that is a strong enough reason to force
 solving the technical difficulty: the naming conflict with an
 existing user-visible feature.
 Isn't it worth considering going back to the pgsql_fdw_validator
 naming again?

AFAIR, in the discussion about naming of the new FDW, another
name, postgres_fdw, was suggested as well as postgresql_fdw, and
I chose the longer one at that time.  Perhaps only a few people
feel that "postgres" is a shortened name for "postgresql".  How
about using postgres_fdw for PG-FDW?

Once we choose a different name, postgresql_fdw_validator can
live alongside postgres_fdw, though their names may seem a little
confusing.

In addition, it would be worth mentioning that using
postgresql_fdw_validator as the validator of a third-party FDW is
not recommended, to avoid the dependency.

Regards,
-- 
Shigeru HANADA


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multiple --table options for other commands

2012-10-30 Thread Josh Kupershmidt
On Sun, Oct 28, 2012 at 4:30 PM, Josh Kupershmidt schmi...@gmail.com wrote:

 I see there's already a TODO for allowing pg_restore to accept
 multiple --table arguments[1], but would folks support adding this
 capability to various other commands which currently accept only a
 single --table argument, such as clusterdb, vacuumdb, and reindexdb?

I went ahead and cooked up a patch to allow pg_restore, clusterdb,
vacuumdb, and reindexdb to accept multiple --table switches. Hope I
didn't overlook a similar tool, but these were the only other commands
I found taking a --table argument.

If you run, say:
  pg_dump -Fc --table=tab1 --table=tab2 ... |
  pg_restore --table=tab1 --table=tab2 ...

without the patch, only tab2 will be restored and pg_restore will
make no mention of this omission to the user. With the patch, both
tables should be restored.

I also added support for multiple --index options for reindexdb, since
that use-case seemed symmetrical to --table. As suggested previously,
I moved the SimpleStringList types and functions out of pg_dump.h and
common.c into dumputils.{h,c}, so that the code in ./src/bin/scripts/
could use it.

If there are no objections, I can add the patch to the next CF.

Josh


multiple_tables.v1.diff
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-10-30 Thread Amit Kapila
On Wednesday, October 31, 2012 4:14 AM Josh Berkus wrote:
 Tom,
 
  I'm not convinced we ever *had* a consensus on this.  There were
  proposals, but I'm not sure a majority ever bought into any one of 'em.
  The whole problem of intermixing manual editing and programmatic editing
  is just a big can of worms, and not everybody is prepared to give up
  the former to have the latter.
 
 Well, I think we have consensus that intermixing is impractical, which
 is why every further proposal is around having a separate file for the
 SQL-modified values.  And yes, we have a certain amount of "You'll get
 my carefully edited postgresql.conf when you pry it out of my cold, dead
 hands" going on.

  I think for that part it was discussed that postgresql.conf values 
will always override the values in .auto.
  
 
 The real consensus problem, AFAICT, is that while we have consensus that
 we would like something like SET PERSISTENT as an *option*, there's a
 Hurricane Sandy-sized Bikeshedding Windstorm about how, exactly, people
 would like it to work.  Personally, I would prefer the implementation
 which actually gets committed. ;-)

I think the original syntax was proposed by Robert Haas, referring to Oracle's 
syntax, in the mail below:
http://archives.postgresql.org/pgsql-hackers/2010-10/msg00953.php

and then the syntax which I have used in my proposal was finally suggested by 
Tom in the mail below:
http://archives.postgresql.org/pgsql-hackers/2010-10/msg00977.php


Do you see any discrepancy between the proposal I have sent and what was 
concluded in previous discussions?

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-10-30 Thread Amit Kapila
On Wednesday, October 31, 2012 3:58 AM Andres Freund wrote:
 On Tuesday, October 30, 2012 11:24:20 PM Tom Lane wrote:
  The fact that this isn't being done by a large number of
  people (is anybody at all actually doing it?) suggests to me that maybe
  the demand isn't all that great.

 It might also be that the idea of implementing that yourself is quite
 scary.

 Also you would need to parse the file to reset values, which isn't
 exactly easy...

As this new file (postgresql.conf.auto) will not be edited by users, it
should not be difficult to handle, as the code will know the exact format
of the file. The problem would arise if we needed to parse postgresql.conf
to set/reset values, as its format is not fixed. However, that is taken
care of by having two files; a sketch of the .auto format is below.
Please correct me if I have misunderstood the difficulty you raised.
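
A sketch of what such a machine-written file might contain (purely 
illustrative; the format is not settled):

  # postgresql.conf.auto -- written only by the SQL command, do not edit.
  # One plain "name = value" line per GUC, so the server can rewrite
  # entries mechanically without a general-purpose parser.
  shared_buffers = '2GB'
  work_mem = '64MB'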


With Regards,
Amit Kapila.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-10-30 Thread Amit Kapila
On Wednesday, October 31, 2012 3:32 AM Hannu Krosing wrote:
On 10/29/2012 03:14 PM, Amit Kapila wrote:

 On Monday, October 29, 2012 7:11 PM Chris Corbyn

  What's the use case of this? It sounds like it will just create a
 maintenance nightmare where some stuff you expect to look up in
 postgresql.conf is actually hiding in the .auto file. Assuming only super
 users/sysadmins would have the ability to change things in the config file,
 wouldn't they be more likely to just do it on the server and edit the .conf
 (which among other things, keeps it tidy and orderly).

 Basically, after this, users will have two options for changing the
 postgresql.conf parameters.

 One is by directly editing the postgresql.conf file, and

 the other is by using SQL commands.

 There will be nothing hidden in the .auto file; it's just that a separate
 file will be created for parameters set by SQL commands, to avoid the
 hassles of parsing postgresql.conf during the processing of the SQL
 command.

If interested I have somewhere pl/pythonu functions for both looking at
and changing parameters in the postgresql.conf file,

 

In the previous discussion about this feature, it was mentioned by many
people that, as postgresql.conf can be edited by users in many ways, it
would be difficult to come up with a reliable function that handles all
possible cases. That is why I have taken the approach of having two
separate files: one user-editable, and the other editable only by SQL
commands.

In any case, if you have those functions readily available, please send
them; they could be useful for me.

 

With Regards,
Amit Kapila.