Re: [HACKERS] Fix pgstattuple/pgstatindex to use regclass-type as the argument

2013-07-16 Thread Satoshi Nagayasu
(2013/07/04 3:58), Fujii Masao wrote:
 On Wed, Jun 26, 2013 at 12:39 AM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Jun 20, 2013 at 2:32 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Since pg_relpages(oid) doesn't exist, pg_relpages() is in the same
 situation as pgstatindex(), i.e., we cannot just replace pg_relpages(text)
 with pg_relpages(regclass) for the backward-compatibility. How do you
 think we should solve the pg_relpages() problem? Rename? Just
 add pg_relpages(regclass)?

 Adding a function with a new name seems likely to be smoother, since
 that way you don't have to worry about problems with function calls
 being thought ambiguous.
 
 Could you let me know the example that this problem happens?
 
 For the test, I just implemented the regclass-version of pg_relpages()
 (patch attached) and tested some cases. But I could not get that problem.
 
  SELECT pg_relpages('hoge');                                         -- OK
  SELECT pg_relpages(oid) FROM pg_class WHERE relname = 'hoge';       -- OK
  SELECT pg_relpages(relname) FROM pg_class WHERE relname = 'hoge';   -- OK

In the attached patch, I cleaned up the three functions so that each
one accepts two argument types, text and regclass.

  pgstattuple(text)
  pgstattuple(regclass)
  pgstatindex(text)
  pgstatindex(regclass)
  pg_relpages(text)
  pg_relpages(regclass)

I still think a regclass argument is more appropriate than text for
passing a relation/index name to a function, but offering both argument
types for each function seems to be a good choice at this moment,
in terms of backward compatibility.
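
For reference, a minimal sketch of what the two overloads could look
like in the extension script (the column list follows the output shown
in the regression diff below; the C entry point name 'pgstattuplebyid'
is hypothetical, chosen by analogy with pgstatindexbyid):

  CREATE FUNCTION pgstattuple(IN relname text,
      OUT table_len BIGINT, OUT tuple_count BIGINT, OUT tuple_len BIGINT,
      OUT tuple_percent FLOAT8, OUT dead_tuple_count BIGINT,
      OUT dead_tuple_len BIGINT, OUT dead_tuple_percent FLOAT8,
      OUT free_space BIGINT, OUT free_percent FLOAT8)
  AS 'MODULE_PATHNAME', 'pgstattuple'
  LANGUAGE C STRICT;

  CREATE FUNCTION pgstattuple(IN reloid regclass,
      OUT table_len BIGINT, OUT tuple_count BIGINT, OUT tuple_len BIGINT,
      OUT tuple_percent FLOAT8, OUT dead_tuple_count BIGINT,
      OUT dead_tuple_len BIGINT, OUT dead_tuple_percent FLOAT8,
      OUT free_space BIGINT, OUT free_percent FLOAT8)
  AS 'MODULE_PATHNAME', 'pgstattuplebyid'
  LANGUAGE C STRICT;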

The docs need to be updated if this change is going to be applied.

Any comments?
-- 
Satoshi Nagayasu sn...@uptime.jp
Uptime Technologies, LLC. http://www.uptime.jp
diff --git a/contrib/pgstattuple/Makefile b/contrib/pgstattuple/Makefile
index fc893d8..957742a 100644
--- a/contrib/pgstattuple/Makefile
+++ b/contrib/pgstattuple/Makefile
@@ -4,7 +4,7 @@ MODULE_big  = pgstattuple
 OBJS   = pgstattuple.o pgstatindex.o
 
 EXTENSION = pgstattuple
-DATA = pgstattuple--1.1.sql pgstattuple--1.0--1.1.sql pgstattuple--unpackaged--1.0.sql
+DATA = pgstattuple--1.2.sql pgstattuple--1.0--1.1.sql pgstattuple--unpackaged--1.0.sql
 
 REGRESS = pgstattuple
 
diff --git a/contrib/pgstattuple/expected/pgstattuple.out b/contrib/pgstattuple/expected/pgstattuple.out
index ab28f50..eaba306 100644
--- a/contrib/pgstattuple/expected/pgstattuple.out
+++ b/contrib/pgstattuple/expected/pgstattuple.out
@@ -11,12 +11,24 @@ select * from pgstattuple('test'::text);
  0 |   0 | 0 | 0 |0 |  
0 |  0 |  0 |0
 (1 row)
 
+select * from pgstattuple('test'::name);
+ table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | 
dead_tuple_len | dead_tuple_percent | free_space | free_percent 
+---+-+---+---+--++++--
+ 0 |   0 | 0 | 0 |0 |  
0 |  0 |  0 |0
+(1 row)
+
 select * from pgstattuple('test'::regclass);
  table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | 
dead_tuple_len | dead_tuple_percent | free_space | free_percent 
 
---+-+---+---+--++++--
  0 |   0 | 0 | 0 |0 |  
0 |  0 |  0 |0
 (1 row)
 
+select * from pgstattuple('test'::regclass::oid);
+ table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | 
dead_tuple_len | dead_tuple_percent | free_space | free_percent 
+---+-+---+---+--++++--
+ 0 |   0 | 0 | 0 |0 |  
0 |  0 |  0 |0
+(1 row)
+
 select * from pgstatindex('test_pkey');
  version | tree_level | index_size | root_block_no | internal_pages | 
leaf_pages | empty_pages | deleted_pages | avg_leaf_density | 
leaf_fragmentation 
 
-+++---+++-+---+--+
diff --git a/contrib/pgstattuple/pgstatindex.c b/contrib/pgstattuple/pgstatindex.c
index 97f897e..9ec74e7 100644
--- a/contrib/pgstattuple/pgstatindex.c
+++ b/contrib/pgstattuple/pgstatindex.c
@@ -40,11 +40,15 @@
 
 
 extern Datum pgstatindex(PG_FUNCTION_ARGS);
+extern Datum pgstatindexbyid(PG_FUNCTION_ARGS);
 extern Datum pg_relpages(PG_FUNCTION_ARGS);
+extern Datum pg_relpagesbyid(PG_FUNCTION_ARGS);
 extern Datum pgstatginindex(PG_FUNCTION_ARGS);
 
 PG_FUNCTION_INFO_V1(pgstatindex);
+PG_FUNCTION_INFO_V1(pgstatindexbyid);
 

Re: [HACKERS] Fix pgstattuple/pgstatindex to use regclass-type as the argument

2013-07-16 Thread Satoshi Nagayasu

Hi Rushabh,

(2013/07/16 14:58), Rushabh Lathia wrote:

Hello Satoshi,

I assigned myself for the reviewer of this patch. Issue status is waiting on
author.


Thank you for picking it up.


Now looking at the discussion under the thread it seems like we are waiting
for the suggestion for the new function name, right ?


Yes.


I am wondering why we actually need a new name. Can't we just overload
the same function and provide two versions of it?


I think the major reason is to avoid confusion between the old and new
function arguments.

My thought here is that having both argument types (text and regclass)
for each function is a good way to clean up the interface while keeping
backward compatibility.


In the last thread Fujii just did the same for pg_relpages, and it seems
like a good-to-go approach, doesn't it? Am I missing anything here?


I just posted a revised patch to handle the issue in three functions
of the pgstattuple module. Please take a look.

Regards,
--
Satoshi Nagayasu sn...@uptime.jp
Uptime Technologies, LLC. http://www.uptime.jp


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] make dist error

2013-07-16 Thread Tatsuo Ishii
I'm getting errors while executing make dist on git master head.

$ make dist
make dist
rm -rf postgresql-9.4devel* =install=
for x in `cd . && find . \( -name CVS -prune \) -o \( -name .git -prune \) -o 
-print`; do \
  file=`expr X$x : 'X\./\(.*\)'`; \
  if test -d ./$file ; then \
mkdir postgresql-9.4devel/$file && chmod 777 
postgresql-9.4devel/$file; \
  else \
ln ./$file postgresql-9.4devel/$file >/dev/null 2>&1 \
  || cp ./$file postgresql-9.4devel/$file; \
  fi || exit; \
done
make -C postgresql-9.4devel distprep
[snip]
/bin/tar chf postgresql-9.4devel.tar postgresql-9.4devel
/bin/tar: postgresql-9.4devel/src/bin/pg_controldata/pg_crc.c: File removed 
before we read it
/bin/tar: postgresql-9.4devel/src/bin/pg_resetxlog/pg_crc.c: File removed 
before we read it
/bin/tar: postgresql-9.4devel/src/backend/tcop/.#postgres.c: File removed 
before we read it
make: *** [postgresql-9.4devel.tar] Error 1
make: *** Deleting file `postgresql-9.4devel.tar'
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add visibility map information to pg_freespace.

2013-07-16 Thread Satoshi Nagayasu



(2013/07/09 19:55), Kyotaro HORIGUCHI wrote:

Hello, I've brought the visibility map extensions for pg_freespacemap
and pgstattuple.

At Mon, 08 Jul 2013 16:59:05 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
horiguchi.kyot...@lab.ntt.co.jp wrote in 
20130708.165905.118860769.horiguchi.kyot...@lab.ntt.co.jp

I'll come again with the first implementation of it. And as for
pg_freespacemap, I'll keep the current direction - adding column
to present output records format of pg_freespace(). And
documentation, if possible.


pg_freespace_vm_v2.patch:

   The interface has been changed from the first patch. The version of
   pg_freespace() that provides VM information is named
   pg_freespace_with_vminfo() and shows output like the following.

| postgres=# select * from pg_freespace_with_vminfo('t'::regclass) limit 10;
|  blkno | avail | is_all_visible
| ---+---+
|  0 |64 | t
|  1 |32 | t
|  2 |96 | t
|  3 |64 | t
|  4 |96 | t
|  5 |96 | t
|  6 |   128 | t
|  7 |32 | t
|  8 |96 | t


I think we can simply add the is_all_visible column to the existing
pg_freespace(), because adding a column would not break
backward-compatibility in general. Any other thoughts?
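
For illustration, a hypothetical sketch of that merged interface (the
same data as the pg_freespace_with_vminfo() output above, just exposed
under the existing function name):

| postgres=# select * from pg_freespace('t'::regclass) limit 3;
|  blkno | avail | is_all_visible
| -------+-------+----------------
|      0 |    64 | t
|      1 |    32 | t
|      2 |    96 | t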


pgstattuple_vm_v1.patch:

   The first version of VM extension for pgstattuple. According to
   the previous discussion, the added column is named
   'all_visible_percent'.

| postgres=# select * from pgstattuple('t');
| -[ RECORD 1 ]---+-
| table_len   | 71770112
| tuple_count | 989859
| tuple_len   | 31675488
| tuple_percent   | 44.13
| dead_tuple_count| 99
| dead_tuple_len  | 3168
| dead_tuple_percent  | 0
| free_space  | 31886052
| free_percent| 44.43
| all_visible_percent | 99.98


It seems to be working fine.

And I added a regression test for pg_freespacemap and additional
test cases for pgstattuple. Please take a look.

Regards,
--
Satoshi Nagayasu sn...@uptime.jp
Uptime Technologies, LLC. http://www.uptime.jp
diff --git a/contrib/pg_freespacemap/Makefile b/contrib/pg_freespacemap/Makefile
index d794df2..09d6ff8 100644
--- a/contrib/pg_freespacemap/Makefile
+++ b/contrib/pg_freespacemap/Makefile
@@ -6,6 +6,8 @@ OBJS = pg_freespacemap.o
 EXTENSION = pg_freespacemap
 DATA = pg_freespacemap--1.1.sql pg_freespacemap--1.0--1.1.sql 
pg_freespacemap--unpackaged--1.0.sql
 
+REGRESS = pg_freespacemap
+
 ifdef USE_PGXS
 PG_CONFIG = pg_config
 PGXS := $(shell $(PG_CONFIG) --pgxs)
diff --git a/contrib/pg_freespacemap/expected/pg_freespacemap.out 
b/contrib/pg_freespacemap/expected/pg_freespacemap.out
new file mode 100644
index 000..cde954d
--- /dev/null
+++ b/contrib/pg_freespacemap/expected/pg_freespacemap.out
@@ -0,0 +1,100 @@
+create extension pg_freespacemap;
+create table t1 ( uid integer primary key, uname text not null );
+select * from pg_freespace('t1');
+ blkno | avail 
+---+---
+(0 rows)
+
+select * from pg_freespace('t1'::regclass);
+ blkno | avail 
+---+---
+(0 rows)
+
+select * from pg_freespace('t1', 1);
+ pg_freespace 
+--
+0
+(1 row)
+
+select * from pg_freespace_with_vminfo('t1');
+ blkno | avail | is_all_visible 
+---+---+
+(0 rows)
+
+select * from pg_freespace_with_vminfo('t1'::regclass);
+ blkno | avail | is_all_visible 
+---+---+
+(0 rows)
+
+insert into t1 values ( 100, 'postgresql' );
+select * from pg_freespace('t1');
+ blkno | avail 
+---+---
+ 0 | 0
+(1 row)
+
+select * from pg_freespace('t1', 1);
+ pg_freespace 
+--
+0
+(1 row)
+
+select * from pg_freespace_with_vminfo('t1');
+ blkno | avail | is_all_visible 
+---+---+
+ 0 | 0 | f
+(1 row)
+
+select * from pg_freespace('t1_pkey');
+ blkno | avail 
+---+---
+ 0 | 0
+ 1 | 0
+(2 rows)
+
+select * from pg_freespace('t1_pkey', 1);
+ pg_freespace 
+--
+0
+(1 row)
+
+select * from pg_freespace('t1_pkey', 2);
+ pg_freespace 
+--
+0
+(1 row)
+
+select * from pg_freespace_with_vminfo('t1_pkey');
+ blkno | avail | is_all_visible 
+---+---+
+ 0 | 0 | f
+ 1 | 0 | f
+(2 rows)
+
+vacuum t1;
+select * from pg_freespace('t1');
+ blkno | avail 
+---+---
+ 0 |  8096
+(1 row)
+
+select * from pg_freespace_with_vminfo('t1');
+ blkno | avail | is_all_visible 
+---+---+
+ 0 |  8096 | t
+(1 row)
+
+select * from pg_freespace('t1_pkey');
+ blkno | avail 
+---+---
+ 0 | 0
+ 1 | 0
+(2 rows)
+
+select * from pg_freespace_with_vminfo('t1_pkey');
+ blkno | avail | is_all_visible 
+---+---+
+ 0 | 0 | f
+ 1 | 0 | f
+(2 rows)
+
diff --git a/contrib/pg_freespacemap/sql/pg_freespacemap.sql 
b/contrib/pg_freespacemap/sql/pg_freespacemap.sql
new file mode 100644
index 000..79a458d
--- /dev/null
+++ 

Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-07-16 Thread Tatsuo Ishii
 To clarify what state this is all in: Fabien's latest
 pgbench-throttle-v15.patch is the ready-for-a-committer version.  The
 last two revisions are just tweaking the comments at this point, and
 his version is more correct than my last one.

Got it. I will take care of this.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix pgstattuple/pgstatindex to use regclass-type as the argument

2013-07-16 Thread Rushabh Lathia
Hi Satoshi,

I spent some time on the revised version of the patch
(pgstattuple_regclass_v2.diff), and here are my comments.

.) The patch applies cleanly on the PG master branch
.) Build and database creation succeed
.) Basic test coverage is included in the patch
.) make check runs cleanly

Basically, the goal of the patch is to allow specifying a relation/index
with several expressions, 'relname', 'schemaname.relname' and oid, in all
pgstattuple functions. To achieve that, the patch introduces another
version of each pgstattuple function which takes regclass as the input
argument. To keep it backward compatible, we kept the pgstattuple
functions with the TEXT input argument.
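
For example, all three forms below are accepted by the patched
functions (assuming a table named test in schema public):

  SELECT * FROM pgstattuple('test');                 -- relname
  SELECT * FROM pgstattuple('public.test');          -- schemaname.relname
  SELECT * FROM pgstattuple('test'::regclass::oid);  -- oid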

In the mail thread we decided that pgstattuple(text) will be deprecated
in a future release, and we need to document that. This is missing from
the patch.

Apart from that, a few comments in the C code explaining why there are
multiple versions of each pgstattuple function would be really helpful
for future readers.

Thanks,



On Tue, Jul 16, 2013 at 11:42 AM, Satoshi Nagayasu sn...@uptime.jp wrote:

 Hi Rushabh,


 (2013/07/16 14:58), Rushabh Lathia wrote:

 Hello Satoshi,

 I assigned myself for the reviewer of this patch. Issue status is waiting
 on
 author.


 Thank you for picking it up.


  Now looking at the discussion under the thread it seems like we are
 waiting
 for the suggestion for the new function name, right ?


 Yes.


  I am wondering why actually we need new name ? Can't we just overload the
 same function and provide two version of the functions ?


 I think the major reason is to avoid some confusion with old and new
 function arguments.

 My thought here is that having both arguments (text and regclass)
 for each function is a good choice to clean up interfaces with keeping
 the backward-compatibility.


  In the last thread Fujii just did the same for pg_relpages and it seems
 like an
 good to go approach, isn't it ? Am I missing anything here ?


 I just posted a revised patch to handle the issue in three functions
 of the pgstattuple module. Please take a look.


 Regards,
 --
 Satoshi Nagayasu sn...@uptime.jp
 Uptime Technologies, LLC. http://www.uptime.jp




-- 
Rushabh Lathia


[HACKERS] A general Q about index

2013-07-16 Thread Soroosh Sardari
Hi

I want to know how an index is created and used.
Actually, if you could show me a simple starting point, it would be great.

Regards,
Soroosh Sardari


Re: [HACKERS] changeset generation v5-01 - Patches git tree

2013-07-16 Thread Robert Haas
On Sun, Jul 7, 2013 at 4:34 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-07-07 15:43:17 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  3b) Add catcache 'filter' that ensures the cache stays unique and use
  that for the mapping

  I slightly prefer 3b) because it's smaller, what's your opinions?

 This is just another variation on the theme of kluging the catcache to
 do something it shouldn't.  You're still building a catcache on a
 non-unique index, and that is going to lead to trouble.

 I don't think the lurking dangers really are present. The index
 essentially *is* unique since we filter away anything non-unique. The
 catcache code hardly can be confused by tuples it never sees. That would
 even work if we started preloading catcaches by doing scans of the
 entire underlying relation or by caching all of a page when reading one
 of its tuples.

 I can definitely see that there are aesthetical reasons against doing
 3b), that's why I've also done 3a). So I'll chalk you up to voting for
 that...

I also vote for (3a).  I did a quick once over of 1, 2, and 3a and
they look reasonable.  Barring strenuous objections, I'd like to go
ahead and commit these, or perhaps an updated version of them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] findDependentObjects() mutual exclusion vs. MVCC catalog scans

2013-07-16 Thread Noah Misch
Consider this sequence of commands:

create type rowtype as (c int, d int);
create temp table t of rowtype;
\c -
drop type rowtype cascade;

Since the switch to MVCC catalog scans, it exhibits the following error about
10% of the time on my system:

CREATE TYPE
CREATE TABLE
You are now connected to database test as user nm.
ERROR:  XX000: cache lookup failed for relation 17009
LOCATION:  getRelationDescription, objectaddress.c:2186

With \c, in general, you may end up executing commands under the new session
before the old backend has finished exiting.  For this test case specifically,
the two backends' attempts to drop table t regularly overlap.  The old
backend would drop it within RemoveTempRelationsCallback(), and the new
backend would cascade from rowtype to drop it.  findDependentObjects() deals
with concurrent deletion attempts by acquiring a lock on each object it will
delete, then calling systable_recheck_tuple() to determine whether another
deleter was successful while the current backend was waiting for the lock.
systable_recheck_tuple() uses the scan snapshot, which really only works if
that snapshot is SnapshotNow or some other that changes its decision in
response to concurrent transaction commits.  The switch to MVCC snapshots left
this mutual exclusion protocol ineffective.

Let's fix this by having systable_recheck_tuple() acquire a fresh catalog MVCC
snapshot and recheck against that.  I believe it would also be fully safe to
use SnapshotNow here; however, I'm assuming we would otherwise manage to
remove SnapshotNow entirely.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 2bfe78a..2fecc88 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -362,7 +362,8 @@ systable_getnext(SysScanDesc sysscan)
  * systable_recheck_tuple --- recheck visibility of most-recently-fetched tuple
  *
  * This is useful to test whether an object was deleted while we waited to
- * acquire lock on it.
+ * acquire lock on it.  We recheck visibility in the style of SnapshotNow by
+ * checking against a fresh catalog snapshot.
  *
  * Note: we don't actually *need* the tuple to be passed in, but it's a
  * good crosscheck that the caller is interested in the right tuple.
@@ -370,30 +371,37 @@ systable_getnext(SysScanDesc sysscan)
 bool
 systable_recheck_tuple(SysScanDesc sysscan, HeapTuple tup)
 {
+   Snapshot    freshsnap;
    bool        result;
 
+   /*
+    * For a scan using a non-MVCC snapshot like SnapshotSelf, we would simply
+    * reuse the old snapshot.  So far, the only caller uses MVCC snapshots.
+    */
+   freshsnap = GetCatalogSnapshot(RelationGetRelid(sysscan->heap_rel));
+
    if (sysscan->irel)
    {
        IndexScanDesc scan = sysscan->iscan;
 
+       Assert(IsMVCCSnapshot(scan->xs_snapshot));
        Assert(tup == &scan->xs_ctup);
        Assert(BufferIsValid(scan->xs_cbuf));
        /* must hold a buffer lock to call HeapTupleSatisfiesVisibility */
        LockBuffer(scan->xs_cbuf, BUFFER_LOCK_SHARE);
-       result = HeapTupleSatisfiesVisibility(tup, scan->xs_snapshot,
-                                             scan->xs_cbuf);
+       result = HeapTupleSatisfiesVisibility(tup, freshsnap, scan->xs_cbuf);
        LockBuffer(scan->xs_cbuf, BUFFER_LOCK_UNLOCK);
    }
    else
    {
        HeapScanDesc scan = sysscan->scan;
 
+       Assert(IsMVCCSnapshot(scan->rs_snapshot));
        Assert(tup == &scan->rs_ctup);
        Assert(BufferIsValid(scan->rs_cbuf));
        /* must hold a buffer lock to call HeapTupleSatisfiesVisibility */
        LockBuffer(scan->rs_cbuf, BUFFER_LOCK_SHARE);
-       result = HeapTupleSatisfiesVisibility(tup, scan->rs_snapshot,
-                                             scan->rs_cbuf);
+       result = HeapTupleSatisfiesVisibility(tup, freshsnap, scan->rs_cbuf);
        LockBuffer(scan->rs_cbuf, BUFFER_LOCK_UNLOCK);
    }
    return result;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make dist error

2013-07-16 Thread Andrew Dunstan


On 07/16/2013 02:53 AM, Tatsuo Ishii wrote:

I'm getting errors while executing make dist on git master head.

$ make dist
make dist
rm -rf postgresql-9.4devel* =install=
for x in `cd . && find . \( -name CVS -prune \) -o \( -name .git -prune \) -o 
-print`; do \
  file=`expr X$x : 'X\./\(.*\)'`; \
  if test -d ./$file ; then \
mkdir postgresql-9.4devel/$file && chmod 777 
postgresql-9.4devel/$file; \
  else \
ln ./$file postgresql-9.4devel/$file >/dev/null 2>&1 \
  || cp ./$file postgresql-9.4devel/$file; \
  fi || exit; \
done
make -C postgresql-9.4devel distprep
[snip]
/bin/tar chf postgresql-9.4devel.tar postgresql-9.4devel
/bin/tar: postgresql-9.4devel/src/bin/pg_controldata/pg_crc.c: File removed 
before we read it
/bin/tar: postgresql-9.4devel/src/bin/pg_resetxlog/pg_crc.c: File removed 
before we read it
/bin/tar: postgresql-9.4devel/src/backend/tcop/.#postgres.c: File removed 
before we read it
make: *** [postgresql-9.4devel.tar] Error 1
make: *** Deleting file `postgresql-9.4devel.tar'


It's working for me:

   ...
   /bin/tar chf postgresql-9.4devel.tar postgresql-9.4devel
   gzip --best -c postgresql-9.4devel.tar >postgresql-9.4devel.tar.gz
   bzip2 -c postgresql-9.4devel.tar >postgresql-9.4devel.tar.bz2
   rm -rf postgresql-9.4devel

It looks like your source directory isn't completely clean. Before I did 
this I did:


   git clean -dfx


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make dist error

2013-07-16 Thread Tatsuo Ishii
 It's working for me:
 
...
/bin/tar chf postgresql-9.4devel.tar postgresql-9.4devel
    gzip --best -c postgresql-9.4devel.tar >postgresql-9.4devel.tar.gz
    bzip2 -c postgresql-9.4devel.tar >postgresql-9.4devel.tar.bz2
rm -rf postgresql-9.4devel
 
 It looks like your source directory isn't completely clean. Before I
 did this I did:
 
git clean -dfx

Oh, I didn't know that make dist requires git clean. Thanks.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make dist error

2013-07-16 Thread Andrew Dunstan


On 07/16/2013 09:53 AM, Andrew Dunstan wrote:


On 07/16/2013 02:53 AM, Tatsuo Ishii wrote:

I'm getting errors while executing make dist on git master head.

$ make dist
make dist
rm -rf postgresql-9.4devel* =install=
for x in `cd . && find . \( -name CVS -prune \) -o \( -name .git 
-prune \) -o -print`; do \

  file=`expr X$x : 'X\./\(.*\)'`; \
  if test -d ./$file ; then \
mkdir postgresql-9.4devel/$file && chmod 777 
postgresql-9.4devel/$file;\

  else \
ln ./$file postgresql-9.4devel/$file >/dev/null 2>&1 \
  || cp ./$file postgresql-9.4devel/$file; \
  fi || exit; \
done
make -C postgresql-9.4devel distprep
[snip]
/bin/tar chf postgresql-9.4devel.tar postgresql-9.4devel
/bin/tar: postgresql-9.4devel/src/bin/pg_controldata/pg_crc.c: File 
removed before we read it
/bin/tar: postgresql-9.4devel/src/bin/pg_resetxlog/pg_crc.c: File 
removed before we read it
/bin/tar: postgresql-9.4devel/src/backend/tcop/.#postgres.c: File 
removed before we read it

make: *** [postgresql-9.4devel.tar] Error 1
make: *** Deleting file `postgresql-9.4devel.tar'


It's working for me:

   ...
   /bin/tar chf postgresql-9.4devel.tar postgresql-9.4devel
   gzip --best -c postgresql-9.4devel.tar >postgresql-9.4devel.tar.gz
   bzip2 -c postgresql-9.4devel.tar >postgresql-9.4devel.tar.bz2
   rm -rf postgresql-9.4devel

It looks like your source directory isn't completely clean. Before I 
did this I did:


   git clean -dfx



Incidentally, the buildfarm animal guaibasaurus builds make dist every 
build, via a bespoke buildfarm module. See for example 
http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=guaibasaurus&dt=2013-07-16%2004%3A17%3A01&stg=make-dist, 
so we should get early notice if anything breaks it.



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A general Q about index

2013-07-16 Thread David Johnston
soroosh sardari wrote
 Hi
 
 I want to know how an index is created and used.
 actually if you can show to me a simple start point, it would be great.
 
 Regards,
 Soroosh Sardari

In the documentation there is a table of contents and listed on that table
is a section named Indexes.

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/A-general-Q-about-index-tp5763912p5763926.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A general Q about index

2013-07-16 Thread David Johnston
David Johnston wrote
 
 soroosh sardari wrote
 Hi
 
 I want to know how an index is created and used.
 actually if you can show to me a simple start point, it would be great.
 
 Regards,
 Soroosh Sardari
 In the documentation there is a table of contents and listed on that table
 is a section named Indexes.
 
 David J.

Since you posted this to -hackers, you may mean that you wish to know how
to program them, as opposed to using them in SQL.  I have no clue in that
regard.  If you did mean use in SQL, then the documentation is your friend,
and you should have posted this question to -general instead of -hackers;
you probably should have posted there anyway to begin with, and been
clearer about what you mean by created and used.

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/A-general-Q-about-index-tp5763912p5763928.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] findDependentObjects() mutual exclusion vs. MVCC catalog scans

2013-07-16 Thread Robert Haas
On Tue, Jul 16, 2013 at 9:50 AM, Noah Misch n...@leadboat.com wrote:
 Consider this sequence of commands:

 create type rowtype as (c int, d int);
 create temp table t of rowtype;
 \c -
 drop type rowtype cascade;

 Since the switch to MVCC catalog scans, it exhibits the following error about
 10% of the time on my system:

 CREATE TYPE
 CREATE TABLE
 You are now connected to database test as user nm.
 ERROR:  XX000: cache lookup failed for relation 17009
 LOCATION:  getRelationDescription, objectaddress.c:2186

 With \c, in general, you may end up executing commands under the new session
 before the old backend has finished exiting.  For this test case specifically,
 the two backends' attempts to drop table t regularly overlap.  The old
 backend would drop it within RemoveTempRelationsCallback(), and the new
 backend would cascade from rowtype to drop it.  findDependentObjects() deals
 with concurrent deletion attempts by acquiring a lock on each object it will
 delete, then calling systable_recheck_tuple() to determine whether another
 deleter was successful while the current backend was waiting for the lock.
 systable_recheck_tuple() uses the scan snapshot, which really only works if
 that snapshot is SnapshotNow or some other that changes its decision in
 response to concurrent transaction commits.  The switch to MVCC snapshots left
 this mutual exclusion protocol ineffective.

 Let's fix this by having systable_recheck_tuple() acquire a fresh catalog MVCC
 snapshot and recheck against that.  I believe it would also be fully safe to
 use SnapshotNow here; however, I'm assuming we would otherwise manage to
 remove SnapshotNow entirely.

I recommend reworking the header comment to avoid mention of
SnapshotNow, since if we get rid of SnapshotNow, the reference might
not be too clear to far-future hackers.

+   /*
+    * For a scan using a non-MVCC snapshot like SnapshotSelf, we would simply
+    * reuse the old snapshot.  So far, the only caller uses MVCC snapshots.
+    */
+   freshsnap = GetCatalogSnapshot(RelationGetRelid(sysscan->heap_rel));

This comment is not very clear, because it doesn't describe what the
code actually does, but rather speculates about what the code could do
if the intention of some future caller were different.  I recommend
adding Assert(IsMVCCSnapshot(scan->xs_snapshot)) and changing the
comment to something like this: For now, we don't handle the case of
a non-MVCC scan snapshot.  This is adequate for existing uses of this
function, but might need to be changed in the future.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] findDependentObjects() mutual exclusion vs. MVCC catalog scans

2013-07-16 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jul 16, 2013 at 9:50 AM, Noah Misch n...@leadboat.com wrote:
 Let's fix this by having systable_recheck_tuple() acquire a fresh catalog 
 MVCC
 snapshot and recheck against that.  I believe it would also be fully safe to
 use SnapshotNow here; however, I'm assuming we would otherwise manage to
 remove SnapshotNow entirely.

I agree with Robert's comments, and in addition suggest that this code
needs a comment about why it's safe to use the snapshot without doing
RegisterSnapshot or equivalent.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make dist error

2013-07-16 Thread Peter Eisentraut
On 7/16/13 10:03 AM, Andrew Dunstan wrote:
 Incidentally, the buildfarm animal guaibasaurus builds make dist every
 build, via a bespoke buildfarm module. See for example
 http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=guaibasaurus&dt=2013-07-16%2004%3A17%3A01&stg=make-dist,
 so we should get early notice if anything breaks it.

Ditto for my Jenkins:
http://pgci.eisentraut.org/jenkins/job/postgresql_master_dist/

So this is well covered.

The requirement to clean before make dist is perhaps underdocumented.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Differences in WHERE clause of SELECT

2013-07-16 Thread Robert Haas
On Tue, Jul 9, 2013 at 7:12 PM, Josh Berkus j...@agliodbs.com wrote:
 Like uses the operator class text_pattern_ops which doesn't include an
 implicit cast.

This is wrong.  Casts are associated with data types, not operator classes.

 For one thing, the implicit cast is from text --> integer, not the other
 way around, and there is no LIKE operator for integers.

This is also wrong.  There's no implicit cast between text and integer
in either direction - only assignment casts.

I think the reason why the first few examples work and the last one
fails is that, in the first few cases, there are integer literals and
unknown literals, and the operations in question are defined for
integers (and for text), so we pick the integer interpretation and
call it good.  But in the non-working case, the operation is defined
to work only on text, and an integer argument is supplied.  So we have
to cast, and there's no implicit cast, ergo we fail.

The point is that we use a different procedure to decide what to do
with a quoted literal ('1', or '3.14159', or 'foo') than we do to
decide whether it's OK to cast a value of a type we already know.  For
example this fails:

rhaas=# create table foo (a int, b varchar);
CREATE TABLE
rhaas=# insert into foo values ('1', '1');
INSERT 0 1
rhaas=# select * from foo where a = b;
ERROR:  operator does not exist: integer = character varying
LINE 1: select * from foo where a = b;
  ^

This is just like the OP's first example (which worked) except that
here there's no unknown literal, so we actually need to cast, and we
refuse to do so and fail.

I continue to be of the opinion that our behavior in this area is
bone-headed.  It's reasonable to reject integer = character varying on
the basis that we don't know whether integer or character varying
comparison semantics are wanted, and the two might give different
answers (think: leading zeroes), so we'd better ask the user to
clarify.  But  '1' LIKE 1 is not ambiguous; there is only one
plausible meaning for that, and we ought to adopt it, per the patch I
proposed previously.  Rejecting SQL that other systems happily accept
is unhelpful and unfriendly and it is costing us users and mind-share.
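
To spell out the contrast on a stock server (error wording paraphrased,
not verbatim):

  SELECT 1 = '1';            -- works: the unknown literal is read as an integer
  SELECT '1' LIKE 1;         -- fails: LIKE wants text, and integer has no implicit cast to text
  SELECT '1' LIKE 1::text;   -- works once the cast is written explicitly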

/rant

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] findDependentObjects() mutual exclusion vs. MVCC catalog scans

2013-07-16 Thread Andres Freund
On 2013-07-16 09:50:07 -0400, Noah Misch wrote:
 With \c, in general, you may end up executing commands under the new session
 before the old backend has finished exiting.  For this test case specifically,
 the two backends' attempts to drop table t regularly overlap.  The old
 backend would drop it within RemoveTempRelationsCallback(), and the new
 backend would cascade from rowtype to drop it.  findDependentObjects() deals
 with concurrent deletion attempts by acquiring a lock on each object it will
 delete, then calling systable_recheck_tuple() to determine whether another
 deleter was successful while the current backend was waiting for the lock.
 systable_recheck_tuple() uses the scan snapshot, which really only works if
 that snapshot is SnapshotNow or some other that changes its decision in
 response to concurrent transaction commits.  The switch to MVCC snapshots left
 this mutual exclusion protocol ineffective.

Nice catch.

I wonder though, isn't that code unsafe in other ways as well? What if
the pg_depend entry was rewritten inbetween? Consider somebody doing
something like REASSIGN OWNED concurrently with a DROP. The DROP
possibly will cascade to an entry which changed the owner already. And
the recheck will then report that the object doesn't exist anymore and
abort since it does a simple HeapTupleSatisfiesVisibility() and doesn't
follow the ctid chain if the tuple has been updated...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL renegotiation

2013-07-16 Thread Robert Haas
On Fri, Jul 12, 2013 at 8:51 PM, Noah Misch n...@leadboat.com wrote:
 On Fri, Jul 12, 2013 at 04:32:52PM -0400, Alvaro Herrera wrote:
 Now, should we support the 0.9.6-and-earlier mechanism?  My inclination
 is no; even RHEL 3, the oldest supported Linux distribution, uses 0.9.7
 (Heck, even Red Hat Linux 9, released on 2003).  To see OpenSSL 0.9.6
 you need to go back to Red Hat Linux 7.2, released on 2001 using a Linux
 kernel 2.4.  Surely no one in their right mind would use a current
 Postgres release on such an ancient animal.

 Agreed.  The OpenSSL Project last applied a security fix to 0.9.6 over eight
 years ago.  Compatibility with 0.9.6 has zero or negative value.

+1 from me as well, if any more are needed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Differences in WHERE clause of SELECT

2013-07-16 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I continue to be of the opinion that our behavior in this area is
 bone-headed.  It's reasonable to reject integer = character varying on
 the basis that we don't know whether integer or character varying
 comparison semantics are wanted, and the two might give different
 answers (think: leading zeroes), so we'd better ask the user to
 clarify.  But  '1' LIKE 1 is not ambiguous; there is only one
 plausible meaning for that, and we ought to adopt it, per the patch I
 proposed previously.  Rejecting SQL that other systems happily accept
 is unhelpful and unfriendly and it is costing us users and mind-share.

I don't agree with this opinion.  '1' + '2' might be claimed to have
only one plausible meaning as well, but that doesn't mean that we should
accept it.  There is a very fundamental distinction between numbers and
strings, and we'd do our users no service by pretending there isn't.
What's more, your argument for this essentially rests on the assumption
that LIKE represents only one possible operator choice, which is already
false today (see \do ~~) and might be even more false in future.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] checking variadic any argument in parser - should be array

2013-07-16 Thread Pavel Stehule
2013/7/15 Andrew Dunstan and...@dunslane.net:

 On 07/14/2013 12:28 AM, Pavel Stehule wrote:

 Hello

 2013/7/14 Andrew Dunstan and...@dunslane.net:

 On 06/29/2013 03:29 PM, Pavel Stehule wrote:



 5. This patch has user visibility, i.e. now we are throwing an error
 when
 user only says VARIADIC NULL like:

   select concat(variadic NULL) is NULL;

 Previously it was working but now we are throwing an error. Well we
 are
 now
 more stricter than earlier with using VARIADIC + ANY, so I have no
 issue
 as
 such. But I guess we need to document this user visibility change. I
 don't
 know exactly where though. I searched for VARIADIC and all related
 documentation says it needs an array, so nothing harmful as such, so
 you
 can
 ignore this review comment but I thought it worth mentioning it.

 yes, it is point for possible issues in RELEASE NOTES, I am thinking
 ???

 Well, writer of release notes should be aware of this. And I hope he
 will
 be. So no issue.



 Is the behaviour change really unavoidable? Is it really what we want?
 Nobody seems to have picked up on this except the author and the
 reviewer.
 I'd hate us to do this and then surprise people. I'm not sure how many
 people are using VARIADIC any, but I have started doing so and expect
 to
 do so more, and I suspect I'm not alone.

 It doesn't disallow NULL - it disallows non-array types in this
 position, because only array-type values can appear there; restricting
 it this way keeps the behavior unambiguous.

 so SELECT varfx(VARIADIC NULL) -- is disallowed
 but SELECT varfx(VARIADIC NULL::text[]) -- is allowed



 Quite so, I understand exactly what the defined behaviour will be.




 for example, I can write SELECT varfx(10,20,30), but I cannot write
 SELECT varfx(VARIADIC 10,20,30) - because that behavior would be
 undefined.

 Can you send me your use case where this check is unwanted, please?



 The only question I raised was for the NULL case. If you're not saying
 VARIADIC NULL then I have no issue.

NULL is allowed - but it should be typed.
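
For example, with the built-in concat(), which takes VARIADIC "any"
(behavior under the patch, as discussed earlier in this thread):

  SELECT concat(variadic NULL);             -- previously returned NULL, now rejected
  SELECT concat(variadic NULL::text[]);     -- allowed: the NULL is typed as an array
  SELECT concat(variadic ARRAY['a','b']);   -- allowed: an ordinary array value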


 Anyway, nobody else seem to care much (and I suspect very few people are
 writing VARIADIC any functions anyway, apart from you and me). So I'll see
 about getting this committed shortly.


exactly

Regards

Pavel

 cheers

 andrew



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-16 Thread Ants Aasma
On Jul 14, 2013 9:46 PM, Greg Smith g...@2ndquadrant.com wrote:
 I updated and re-reviewed that in 2011: 
 http://www.postgresql.org/message-id/4d31ae64.3000...@2ndquadrant.com and 
 commented on why I think the improvement was difficult to reproduce back 
 then.  The improvement didn't follow for me either.  It would take a really 
 amazing bit of data to get me to believe write sorting code is worthwhile 
 after that.  On large systems capable of dirtying enough blocks to cause a 
 problem, the operating system and RAID controllers are already sorting block. 
  And *that* sorting is also considering concurrent read requests, which are a 
 lot more important to an efficient schedule than anything the checkpoint 
 process knows about.  The database doesn't have nearly enough information yet 
 to compete against OS level sorting.

That reasoning makes no sense. OS level sorting can only see the
writes in the time window between PostgreSQL write, and being forced
to disk. Spread checkpoints sprinkles the writes out over a long
period and the general tuning advice is to heavily bound the amount of
memory the OS willing to keep dirty. This makes probability of
scheduling adjacent writes together quite low, the merging window
being limited either by dirty_bytes or dirty_expire_centisecs. The
checkpointer has the best long term overview of the situation here, OS
scheduling only has the short term view of outstanding read and write
requests. By sorting checkpoint writes it is much more likely that
adjacent blocks are visible to OS writeback at the same time and will
be issued together.
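
(For readers following along, a rough sketch of the sorting idea under
discussion - not the linked patch itself, just an illustration of the
technique - might look like this:)

  #include "postgres.h"
  #include "storage/block.h"

  /* Collect the dirty buffers the checkpoint must write, sort them by
   * physical location, and write them in that order so that adjacent
   * blocks reach OS writeback together and can be merged. */
  typedef struct CkptSortItem
  {
      Oid         tsId;       /* tablespace */
      Oid         relNode;    /* relation file node */
      BlockNumber blockNum;   /* block number within the relation */
      int         buf_id;     /* shared buffer holding the page */
  } CkptSortItem;

  static int
  ckpt_buforder_comparator(const void *a, const void *b)
  {
      const CkptSortItem *lhs = (const CkptSortItem *) a;
      const CkptSortItem *rhs = (const CkptSortItem *) b;

      if (lhs->tsId != rhs->tsId)
          return (lhs->tsId < rhs->tsId) ? -1 : 1;
      if (lhs->relNode != rhs->relNode)
          return (lhs->relNode < rhs->relNode) ? -1 : 1;
      if (lhs->blockNum != rhs->blockNum)
          return (lhs->blockNum < rhs->blockNum) ? -1 : 1;
      return 0;
  }

  /* ... in BufferSync(), after collecting the to-be-written buffers:
   *     qsort(items, num_items, sizeof(CkptSortItem), ckpt_buforder_comparator);
   * then issue the writes in sorted order. */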

I gave the linked patch a shot. I tried it with pgbench scale 100
concurrency 32, postgresql shared_buffers=3GB,
checkpoint_timeout=5min, checkpoint_segments=100,
checkpoint_completion_target=0.5, pgdata was on a 7200RPM HDD, xlog on
Intel 320 SSD, kernel settings: dirty_background_bytes = 32M,
dirty_bytes = 128M.

first checkpoint on master: wrote 209496 buffers (53.7%); 0
transaction log file(s) added, 0 removed, 26 recycled; write=314.444
s, sync=9.614 s, total=324.166 s; sync files=16, longest=9.208 s,
average=0.600 s
IO while checkpointing: about 500 write iops at 5MB/s, 100% utilisation.

first checkpoint with checkpoint sorting applied: wrote 205269 buffers
(52.6%); 0 transaction log file(s) added, 0 removed, 0 recycled;
write=149.049 s, sync=0.386 s, total=149.559 s; sync files=39,
longest=0.255 s, average=0.009 s
IO while checkpointing: about 23 write iops at 12MB/s, 10% utilisation.

Transaction processing rate for a 20min run went from 5200 to 7000.

Looks to me that in this admittedly best case workload the sorting is
working exactly as designed, converting mostly random IO into
sequential. I have seen many real world workloads where this kind of
sorting would have benefited greatly.

I also did a I/O bound test with scalefactor 100 and
checkpoint_timeout 30min. 2hour average tps went from 121 to 135, but
I'm not yet sure if it's repeatable or just noise.

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Differences in WHERE clause of SELECT

2013-07-16 Thread Greg Stark
On Tue, Jul 16, 2013 at 4:40 PM, Robert Haas robertmh...@gmail.com wrote:
 But  '1' LIKE 1 is not ambiguous


What about '1' LIKE 01 ?

What do people who write this actually expect LIKE to do? I gather
they're actually treating it as a synonym for =?


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynamic background workers

2013-07-16 Thread Robert Haas
On Wed, Jul 3, 2013 at 11:15 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Andres Freund escribió:
 Just as a datapoint, if you benchmark the numbers of forks that can be
 performed by a single process (i.e. postmaster) the number is easily in
 the 10s of thousands. Now forking that much has some scalability
 implications inside the kernel, but still.
 I'd be surprised if the actual fork is more than 5-10% of the current
 cost of starting a new backend.

 I played at having some thousands of registered bgworkers on my laptop,
 and there wasn't even that much load.  So yeah, you can have lots of
 forks.

Since no one seems to be objecting to this patch beyond the lack of
documentation, I've added documentation and committed it, with
appropriate rebasing and a few minor cleanups.  One loose end is
around the bgw_sighup and bgw_sigterm structure members.  If you're
registering a background worker for a library that is not loaded in
the postmaster, you can't (safely) use these for anything, because
it's possible (though maybe not likely) for the worker process to map
the shared library at a different address than where they are mapped
in the backend that requests the new process to be started.  However,
that doesn't really matter; AFAICS, you can just as well call pqsignal
to set the handlers to anything you want from the main entrypoint
before unblocking signals.  So I'm inclined to say we should just
remove bgw_sighup and bgw_sigterm altogether and tell people to do it
that way.
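
(A minimal sketch of that alternative, with hypothetical handler and
entry point names - install the handlers from the worker's own entry
point before unblocking signals:)

  #include "postgres.h"
  #include <signal.h>
  #include "postmaster/bgworker.h"

  static volatile sig_atomic_t got_sigterm = false;

  static void
  my_sigterm_handler(SIGNAL_ARGS)
  {
      got_sigterm = true;
  }

  static void
  my_worker_main(Datum main_arg)
  {
      /* install our own handlers; bgw_sighup/bgw_sigterm not needed */
      pqsignal(SIGHUP, SIG_IGN);
      pqsignal(SIGTERM, my_sigterm_handler);

      BackgroundWorkerUnblockSignals();

      /* ... worker main loop ... */
  }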

Alternatively, we could give them the same treatment that I gave
bgw_main: let the user specify a function name and we'll search the
appropriate DSO for it.  But that's probably less convenient for
anyone using this facility than just calling pqsignal() before
unblocking signals, so I don't see any real reason to go that route.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] findDependentObjects() mutual exclusion vs. MVCC catalog scans

2013-07-16 Thread Noah Misch
On Tue, Jul 16, 2013 at 05:56:10PM +0200, Andres Freund wrote:
 On 2013-07-16 09:50:07 -0400, Noah Misch wrote:
  With \c, in general, you may end up executing commands under the new 
  session
  before the old backend has finished exiting.  For this test case 
  specifically,
  the two backends' attempts to drop table t regularly overlap.  The old
  backend would drop it within RemoveTempRelationsCallback(), and the new
  backend would cascade from rowtype to drop it.  findDependentObjects() 
  deals
  with concurrent deletion attempts by acquiring a lock on each object it will
  delete, then calling systable_recheck_tuple() to determine whether another
  deleter was successful while the current backend was waiting for the lock.
  systable_recheck_tuple() uses the scan snapshot, which really only works if
  that snapshot is SnapshotNow or some other that changes its decision in
  response to concurrent transaction commits.  The switch to MVCC snapshots 
  left
  this mutual exclusion protocol ineffective.
 
 Nice catch.
 
 I wonder though, isn't that code unsafe in other ways as well? What if
 the pg_depend entry was rewritten inbetween? Consider somebody doing
 something like REASSIGN OWNED concurrently with a DROP. The DROP
 possibly will cascade to an entry which changed the owner already. And
 the recheck will then report that the object doesn't exist anymore and
 abort since it does a simple HeapTupleSatisfiesVisibility() and doesn't
 follow the ctid chain if the tuple has been updated...

I'm not seeing a problem with that particular route.  Say we're examining a
pg_depend tuple where the referencing object is a table and the referenced
object is a role, the table's owner.  We're dropping the role and cascade to
the table.  If the REASSIGNED OWNED assigns the table to a different role,
then we are correct to treat the dependency as gone.  If it's the same role
(REASSIGNED OWNED BY alice TO alice, pointless but permitted), several of
the rename implementations short-circuit that case and don't change catalog
entries.  But even if that optimization were omitted, shdepChangeDep() will
have blocked against our previously-acquired deletion lock on the role.  Code
that adds or updates a dependency without locking both objects of the new
pg_depend tuple is buggy independently.

That being said, there may well be a related mechanism that can slip past the
locking here.  My brain turns to mush when I ponder findDependentObjects() too
thoroughly.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL renegotiation

2013-07-16 Thread David Fetter
On Fri, Jul 12, 2013 at 08:51:52PM -0400, Noah Misch wrote:
 On Fri, Jul 12, 2013 at 04:32:52PM -0400, Alvaro Herrera wrote:
  Now, should we support the 0.9.6-and-earlier mechanism?  My
  inclination is no; even RHEL 3, the oldest supported Linux
  distribution, uses 0.9.7 (Heck, even Red Hat Linux 9, released on
  2003).  To see OpenSSL 0.9.6 you need to go back to Red Hat Linux
  7.2, released on 2001 using a Linux kernel 2.4.  Surely no one in
  their right mind would use a current Postgres release on such an
  ancient animal.
 
 Agreed.  The OpenSSL Project last applied a security fix to 0.9.6
 over eight years ago.  Compatibility with 0.9.6 has zero or negative
 value.

You've made a persuasive case that we should actively break backward
compatibility here.  Would that be complicated to do?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Differences in WHERE clause of SELECT

2013-07-16 Thread Robert Haas
On Tue, Jul 16, 2013 at 12:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I don't agree with this opinion.

I know.

 '1' + '2' might be claimed to have
 only one plausible meaning as well, but that doesn't mean that we should
 accept it.

That example clearly has multiple sensible interpretations, because
surely we have no way of knowing whether the user wants +(int2,int2),
+(int4,int4), +(int8,int8), +(real,real), +(money,money), etc., etc.
There are 42 binary + operators, of which at least 10 or so are
plausible interpretations of that call.  I have never proposed
changing the behavior of this case - though, to be fair, if we found a
way to just decide on int4, say, I bet we'd get very few complaints.

 There is a very fundamental distinction between numbers and
 strings, and we'd do our users no service by pretending there isn't.
 What's more, your argument for this essentially rests on the assumption
 that LIKE represents only one possible operator choice, which is already
 false today (see \do ~~) and might be even more false in future.

This is a better argument, but I'm still not buying it.  Sure, there
are four candidate functions there, but they all perform the same
logical operation. It looks to me like bpcharlike and textlike both
call the same underlying function, and on a quick glance namelike
appears only trivially different.  Only bytealike is significantly
different, in that it operates on the underlying bytes rather than the
hex-escaped (or backslash-escaped) version of those bytes.  It's a
pretty good bet that the user did not want an implicit cast to bytea.

PostgreSQL requires more casts than any other mainstream database
system on the planet, and it's one of the biggest barriers to
migrating from other database systems to PostgreSQL.  I understand
that there is a consensus in this forum to pretend that our
intolerance is a virtue, but I don't agree with that consensus. A
request for =(text,int) is ambiguous; we really do not know what is
meant.  A request for ~~(int,text) is not ambiguous; we reject it out
of pedantry, not necessity.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-16 Thread Greg Smith

On 7/16/13 12:46 PM, Ants Aasma wrote:


Spread checkpoints sprinkles the writes out over a long
period and the general tuning advice is to heavily bound the amount of
memory the OS willing to keep dirty.


That's arguing that you can make this feature be useful if you tune in a 
particular way.  That's interesting, but the goal here isn't to prove 
the existence of some workload that a change is useful for.  You can 
usually find a test case that validates any performance patch as helpful 
if you search for one.  Everyone who has submitted a sorted checkpoint 
patch for example has found some setup where it shows significant gains. 
 We're trying to keep performance stable across a much wider set of 
possibilities though.


Let's talk about default parameters instead, which quickly demonstrates 
where your assumptions fail.  The server I happen to be running pgbench 
tests on today has 72GB of RAM running SL6 with RedHat derived kernel 
2.6.32-358.11.1.  This is a very popular middle grade server 
configuration nowadays.  There dirty_background_ratio and 
dirty_background_ratio are 10 (percent).  That means that roughly 7GB of 
RAM can be used for write caching.  Note that this is a fairly low write 
cache tuning compared to a survey of systems in the field--lots of 
people have servers with earlier kernels where these numbers can be as 
high as 20 or even 40% instead.


The current feasible tuning for shared_buffers suggests a value of 8GB 
is near the upper limit, beyond which cache related overhead makes 
increases counterproductive.  Your examples are showing 53% of 
shared_buffers dirty at checkpoint time; that's typical.  The 
checkpointer is then writing out just over 4GB of data.


With that background what process here has more data to make decisions with?

-The operating system has 7GB of writes it's trying to optimize.  That 
potentially includes backend, background writer, checkpoint, temp table, 
statistics, log, and WAL data.  The scheduler is also considering read 
operations.


-The checkpointer process has 4GB of writes from rarely written shared 
memory it's trying to optimize.


This is why if you take the opposite approach of yours today--go 
searching for workloads where sorting is counterproductive--those are 
equally easy to find.  Any test of write speed I do starts with about 50 
different scale/client combinations.  Why do I suggest pgbench-tools as 
a way to do performance tests?  It's because an automated sweep of 
client setups like it does is the minimum necessary to create enough 
variation in workload for changing the database's write path.  It's 
really amazing how often doing that shows a proposed change is just 
shuffling the good and bad cases around.  That's been the case for every 
sorting and fsync delay change submitted so far.  I'm not even 
interested in testing today's submission because I tried that particular 
approach for a few months, twice so far, and it fell apart on just as 
many workloads as it helped.



The checkpointer has the best long term overview of the situation here, OS
scheduling only has the short term view of outstanding read and write
requests.


True only if shared_buffers is large compared to the OS write cache, 
which was not the case on the example I generated with all of a minute's 
work.  I regularly see servers where Linux's Dirty area becomes a 
multiple of the dirty buffers written by a checkpoint.  I can usually 
make that happen at will with CLUSTER and VACUUM on big tables.  The 
idea that the checkpointer has a long-term view while the OS has a short 
one, that presumes a setup that I would say is possible but not common.



kernel settings: dirty_background_bytes = 32M,
dirty_bytes = 128M.


You disclaimed this as a best case scenario.  It is a low throughput / 
low latency tuning.  That's fine, but if Postgres optimizes itself 
toward those cases it runs the risk of high throughput servers with 
large caches being detuned.  I've posted examples before showing very 
low write caches like this leading to VACUUM running at 1/2 its normal 
speed or worse, as a simple example of where a positive change in one 
area can backfire badly on another workload.  That particular problem 
was so common I updated pgbench-tools recently to track table 
maintenance time between tests, because that demonstrated an issue even 
when the TPS numbers all looked fine.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Differences in WHERE clause of SELECT

2013-07-16 Thread Merlin Moncure
On Tue, Jul 16, 2013 at 12:58 PM, Robert Haas robertmh...@gmail.com wrote:
 PostgreSQL requires more casts than any other mainstream database
 system on the planet, and it's one of the biggest barriers to
 migrating from other database systems to PostgreSQL.  I understand
 that there is a consensus in this forum to pretend that our
 intolerance is a virtue, but I don't agree with that consensus. A
 request for =(text,int) is ambiguous; we really do not know what is
 meant.  A request for ~~(int,text) is not ambiguous; we reject it out
 of pedantry, not necessity.

You make some good points but I still wonder (I'm recalling your lpad
discussion from a while back) if we're better off providing specific
overloads and casts for portability pain points instead of changing
casting mechanics.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: template-ify (binary) extensions

2013-07-16 Thread Markus Wanner
On 07/16/2013 01:27 AM, Robert Haas wrote:
 Andres points out that you can install adminpack to obtain
 local filesystem access, and that is true.  But the system
 administrator can also refuse to allow adminpack, and/or use selinux
 or other mechanisms to prevent the postgres binary from writing a file
 with execute permissions.

I think execute permissions (on the FS) are irrelevant. It's about
loading a shared library. The noexec mount option can prevent that, though.

But okay, you're saying we *have* and *want* a guarantee that even a
superuser cannot execute arbitrary native code via libpq (at least in
default installs w/o extensions).

Andres made two contrib-free suggestions: with COPY TO BINARY, you get a
header prepended, which I think is sufficient to prevent a dlopen() or
LoadLibrary(). Text and CSV formats of COPY escape their output, so it's
hard to write \000 or other control bytes. ESCAPE and DELIMITER also
have pretty restrictive requirements. So COPY doesn't seem quite good
enough to write a valid DSO.

His second suggestion was tuplesort tapes. tuplesort.c says: "We require
the first unsigned int of a stored tuple to be the total size on-tape
of the tuple."  That's kind of a header as well. Writing a proper DSO
certainly does not sound trivial, either.

From a security perspective, I wouldn't want to rely on that guarantee.
Postgres writes too many files to be sure none of those can be abused to
write a loadable DSO, IMO.

Mounting $PGDATA 'noexec' and allowing the postgres user to write only
to such noexec mounts sounds like a good layer. It's independent, though
- it can be used whether or not the above guarantee holds.

 Things aren't quite so bad if we write the bits to a file first and
 then dynamically load the file.  That way at least noexec or similar
 can provide protection.  But it still seems like a pretty dangerous
 direction.

I agree now. Thanks for elaborating.

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Differences in WHERE clause of SELECT

2013-07-16 Thread Robert Haas
On Tue, Jul 16, 2013 at 2:58 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Jul 16, 2013 at 12:58 PM, Robert Haas robertmh...@gmail.com wrote:
 PostgreSQL requires more casts than any other mainstream database
 system on the planet, and it's one of the biggest barriers to
 migrating from other database systems to PostgreSQL.  I understand
 that there is a consensus in this forum to pretend that our
 intolerance is a virtue, but I don't agree with that consensus. A
 request for =(text,int) is ambiguous; we really do not know what is
 meant.  A request for ~~(int,text) is not ambiguous; we reject it out
 of pedantry, not necessity.

 You make some good points but I still wonder (I'm recalling your lpad
 discussion from a while back) if we're better off providing specific
 overloads and casts for portability pain points instead of changing
 casting mechanics.

TBH, I think we've gone too far in that direction already.  We've got
numerous cases where there are multiple function or operator
declarations that exist only to work around shortcomings of the
typecasting system.  For example, we've got pg_size_pretty(bigint) and
pg_size_pretty(numeric); if we adopted the patch that I proposed
previously, one of those would be enough, and then it would work for
integers, too.  We could instead add a third version for integers, but
where's the joy in that?

We've currently got 10 different functions for || that include
||(anynonarray,text), ||(text,anynonarray), and ||(text,text).  The
first two of those are workarounds for the fact that the third one
does not do what you want.   And then there's concat(variadic any),
which is yet a third way of hacking around the same general category
of problem.  The patch I proposed previously would allow concat to be
declared as variadic text if we so desired, but it wouldn't help with
the || case.  I think that's OK; I never pretended that patch would
fix everything, but I still think it's an improvement over what we
have now.

We can certainly continue to play whack-a-mole and dream up a new
solution every time a really intolerable variant of this problem comes
up.  But that doesn't seem good to me.  It means that every case
behaves a little different from every other case, and the whole thing
is kinda arcane and hard to understand, even for hackers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: template-ify (binary) extensions

2013-07-16 Thread Robert Haas
On Tue, Jul 16, 2013 at 3:14 PM, Markus Wanner mar...@bluegap.ch wrote:
 But okay, you're saying we *have* and *want* a guarantee that even a
 superuser cannot execute arbitrary native code via libpq (at least in
 default installs w/o extensions).

Yes, that's a good way of summarizing my position.  I think I'd
support having an extension that allows that, although I don't think
I'd want such an extension installed on any machine I administer.  But
I oppose having it be something the server allows by default.  YMMV.
:-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] review: Non-recursive processing of AND/OR lists

2013-07-16 Thread Pavel Stehule
Hello

2013/7/15 Gurjeet Singh gurj...@singh.im:
 On Sun, Jul 14, 2013 at 8:27 PM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Jul 10, 2013 at 9:02 PM, Josh Berkus j...@agliodbs.com wrote:
  I think it's a waste of code to try to handle bushy trees.  A list is
  not a particularly efficient representation of the pending list; this
  will probably be slower than recursing in the common case.  I'd suggest
  keeping the logic to handle left-deep trees, which I find rather
  elegant, but ditching the pending list.


 Somehow I find it hard to believe that recursing would be more efficient
 than processing the items right there. The recursion is not direct either;
 transformExprRecurse() is going to call this function again, but after a few
 more switch-case comparisons.

 Agreed that there's overhead in allocating list items, but is it more
 overhead than pushing functions on the call stack? Not sure, so I leave it
 to others who understand such things better than I do.

 If by common-case you mean a list of just one logical AND/OR operator, then
 I agree that creating and destroying a list may incur overhead that is
 relatively very expensive. To that end, I have altered the patch, attached,
 to not build a pending list until we encounter a node with root_expr_kind in
 a right branch.

 We're getting bushy-tree processing with very little extra code, but if you
 deem it not worthwhile or adding complexity, please feel free to rip it out.


 
  Is there going to be further discussion of this patch, or do I return
  it?

 Considering it's not been updated, nor my comments responded to, in
 almost two weeks, I think we return it at this point.


 Sorry, I didn't notice that this patch was put back in  'Waiting on Author'
 state.


I did some performance tests of the v5 and v6 versions; v5 is a
little bit faster than v6, and v6 has significantly higher stddev.

But I am not sure if my test is correct - I tested the speed of an EXPLAIN
statement - the result was forwarded to /dev/null.

The result of this test is probably related to the tested pattern of
expressions - in this case expr or expr or expr or expr or expr ...

10 000 exprs (ms)

 v |   avg   | stddev
---+-+
 5 | 1839.14 |  13.68
 6 | 1871.77 |  48.02

==v5 profile==
209064   43.5354  postgres equal
207849   43.2824  postgres process_equivalence
37453 7.7992  postgres datumIsEqual
3178  0.6618  postgres SearchCatCache
2350  0.4894  postgres AllocSetAlloc

==v6 profile==
193251   45.3998  postgres process_equivalence
178183   41.8599  postgres equal
30430 7.1488  postgres datumIsEqual
2819  0.6623  postgres SearchCatCache
1951  0.4583  postgres AllocSetAlloc


I found that the 9.4 planner is about 1% slower than the 9.2 planner (for
the test sent by Gurjeet), but it is not related to this patch.

v6 is clean and all regression tests passed.

Regards

Pavel

 Best regards,

 --
 Gurjeet Singh

 http://gurjeet.singh.im/

 EnterpriseDB Inc.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Differences in WHERE clause of SELECT

2013-07-16 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:

 We can certainly continue to play whack-a-mole and dream up a new
 solution every time a really intolerable variant of this problem comes
 up.  But that doesn't seem good to me.  It means that every case
 behaves a little different from every other case, and the whole thing
 is kinda arcane and hard to understand, even for hackers.

If you're building up a list of things that generate errors in
PostgreSQL but not other DBMS products, make sure you have this:

test=# create table t(d date);
CREATE TABLE
test=# insert into t values (NULL);
INSERT 0 1
test=# insert into t values (COALESCE(NULL, NULL));
ERROR:  column d is of type date but expression is of type text
LINE 1: insert into t values (COALESCE(NULL, NULL));
  ^
HINT:  You will need to rewrite or cast the expression.

From a user perspective, it's hard to explain why COALESCE(NULL,
NULL) fails in a location where a bare NULL works.  From the
perspective of those working on the code, and looking at the
problem from the inside out, it seems sane; but that's the only
perspective from which it does.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_memory_barrier() doesn't compile, let alone work, for me

2013-07-16 Thread Martijn van Oosterhout
On Sun, Jul 14, 2013 at 09:26:38PM -0400, Robert Haas wrote:
 I'm pretty sure we've got latent memory-ordering risks in our existing
 code which we just haven't detected and fixed yet.  Consider, for
 example, this exciting code from GetNewTransactionId:
 
 myproc->subxids.xids[nxids] = xid;
 mypgxact->nxids = nxids + 1;
 
 I don't believe that's technically safe even on an architecture like
 x86, because the compiler could decide to reorder those assignments.
 Of course there is probably no reason to do so, and even if it does
 you'd have to get really unlucky to see a user-visible failure, and if
 you did you'd probably misguess the cause.

You're probably right. Note that it's not even just the compiler that
might reorder them; the CPU/cache subsystem/memory bus all play their
part in memory reordering.  x86 is pretty forgiving, which is why it
works.

I found this to be a really good explanation of all the things that can
go wrong with memory ordering.  It also explains why, in the long run,
memory barriers are not optimal.

http://herbsutter.com/2013/02/11/atomic-weapons-the-c-memory-model-and-modern-hardware/

That talk discusses how the hardware world is converging on SC [1] as
the memory model to use.  And C11/C++11 atomics will implement this for
the programmer.  With these you can actually make guarantees.  For
example, by marking mypgxact->nxids as an atomic type the compiler will
emit all the necessary markings to let the CPU know what you want, so
everything works the way you expect it to.  Even on arcane
architectures.  No explicit barriers needed.
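
To make that concrete, here is a minimal C11 sketch (the struct, array
size, and function are invented stand-ins, not the actual PGPROC/PGXACT
code) of how an atomic counter ties the two stores from the snippet above
together without an explicit barrier:

#include <stdatomic.h>

typedef unsigned int TransactionId;     /* stand-in for the real typedef */

/* Invented example struct; not the actual PGPROC/PGXACT layout. */
typedef struct
{
    TransactionId xids[64];
    _Atomic int   nxids;
} ExamplePgXact;

static void
add_subxid(ExamplePgXact *x, TransactionId xid)
{
    int nxids = atomic_load_explicit(&x->nxids, memory_order_relaxed);

    x->xids[nxids] = xid;

    /*
     * Release store: a reader that loads nxids with acquire semantics is
     * guaranteed to also see the xids[nxids] entry written above, so
     * neither the compiler nor the CPU may reorder the two stores.
     */
    atomic_store_explicit(&x->nxids, nxids + 1, memory_order_release);
}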

Unfortunately, it won't help on compilers that don't support it.

[1] http://en.wikipedia.org/wiki/Sequential_consistency

There are places where you can put code in and verify it does what you
want.  With this one you can put test programs in and it can tell you
all possible results due to memory reordering.

http://svr-pes20-cppmem.cl.cam.ac.uk/cppmem/help.html

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] Proposal - Support for National Characters functionality

2013-07-16 Thread Martijn van Oosterhout
On Mon, Jul 15, 2013 at 05:11:40PM +0900, Tatsuo Ishii wrote:
  Does support for alternative multi-byte encodings have something to do
  with the Han unification controversy? I don't know terribly much about
  this, so apologies if that's just wrong.
 
 There's a famous problem regarding conversion between Unicode and other
 encodings, such as Shift Jis.
 
 There are lots of discussion on this. Here is the one from Microsoft:
 
 http://support.microsoft.com/kb/170559/EN-US

Apart from Shift-JIS not being well defined (it's more a family of
encodings), it has the unusual feature of providing multiple ways to
encode the same character.  This is not even a Han unification issue;
those have largely been addressed.  For example, the square-root symbol
exists twice (0x8795 and 0x81E3), and so do many other mathematical
symbols.

Here's the code page which you can browse online:

http://msdn.microsoft.com/en-us/goglobal/cc305152

Which means that, to be round-trippable, Unicode would have to double those
characters, but this would make it hard/impossible to round-trip with
any other character set that had those characters.  No easy solution
here.

Something that has been done before [1] is to map the doubles to the
custom area of the unicode space (0xe000-0x).  It gives you
round-trip support at the expense of having to handle those characters
yourself.  But since postgres doesn't do anything meaningful with
unicode characters this might be acceptable.

[1] Python does a similar trick to handle filenames coming from disk in
an unknown encoding:
http://docs.python.org/3/howto/unicode.html#files-in-an-unknown-encoding
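
Here is a tiny C sketch of that trick (the table covers only the
square-root pair, and which of the two keeps the ordinary mapping is an
arbitrary choice here; a real conversion would have to cover every
duplicated code point):

#include <stddef.h>
#include <stdint.h>

typedef struct
{
    uint16_t sjis;          /* Shift-JIS code point */
    uint32_t codepoint;     /* Unicode code point we map it to */
} SjisDupEntry;

/*
 * Illustrative table only: one of the two square-root encodings keeps the
 * ordinary mapping to U+221A SQUARE ROOT, the other is parked in the
 * private use area so that converting back to Shift-JIS can recover the
 * original byte sequence.
 */
static const SjisDupEntry duplicate_map[] = {
    {0x81E3, 0x221A},       /* ordinary mapping */
    {0x8795, 0xE000},       /* the duplicate, mapped into the PUA */
};

static uint32_t
sjis_duplicate_to_unicode(uint16_t sjis)
{
    for (size_t i = 0; i < sizeof(duplicate_map) / sizeof(duplicate_map[0]); i++)
    {
        if (duplicate_map[i].sjis == sjis)
            return duplicate_map[i].codepoint;
    }
    return 0;               /* not a known duplicate; use the regular table */
}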

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] pg_filedump 9.3: checksums (and a few other fixes)

2013-07-16 Thread Josh Berkus
On 07/08/2013 04:59 PM, Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Well, Tom opined in
 http://www.postgresql.org/message-id/23249.1370878...@sss.pgh.pa.us that
 the current patch is okay.  I have a mild opinion that it should instead
 print only SHR_LOCK when both bits are set, and one of the others when
 only one of them is set.  But I don't have a strong opinion about this,
 and since Tom disagrees with me, feel free to exercise your own (Jeff's)
 judgement.
 
 FWIW, I think that's exactly what I did in the preliminary 9.3 patch
 that I committed to pg_filedump a few weeks ago.  Could you take a look
 at what's there now and see if that's what you meant?

So, is this getting committed today, or do we bounce it?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap [Review]

2013-07-16 Thread Josh Berkus
On 07/11/2013 09:28 AM, Greg Stark wrote:
 Neither of these addresses Tom's concerns about API changes and future
 flexibility. I was assigned this patch in the reviewers list and my
 inclination would be to take it but I wasn't about to
 overrule Tom. If he says he's ok with it then I'm fine going ahead and
 reviewing the code. If I still have commit bits I could even commit
 it.

API changes?  I can't find that issue in the discussion.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_filedump 9.3: checksums (and a few other fixes)

2013-07-16 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 On 07/08/2013 04:59 PM, Tom Lane wrote:
 FWIW, I think that's exactly what I did in the preliminary 9.3 patch
 that I committed to pg_filedump a few weeks ago.  Could you take a look
 at what's there now and see if that's what you meant?

 So, is this getting committed today, or do we bounce it?

I was hoping for a comment from Alvaro, but wouldn't have gotten to
committing it today in any case.  IMO this patch doesn't really belong
in the commitfest queue, since pg_filedump isn't part of the community
distribution.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Review] Re: [HACKERS] minor patch submission: CREATE CAST ... AS EXPLICIT

2013-07-16 Thread Josh Berkus
On 07/09/2013 01:10 AM, Fabien COELHO wrote:
 Where are we with this patch?  Fabien, are you going to submit an
 updated version which addresses the objections, or should I mark it
 Returned With Feedback?
 
 There is no need for an updated patch. I addressed the objections with
 words, not code:-)

So, Tom, Robert, Cedric: can we have a verdict?  Commit or no?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Review] Re: [HACKERS] minor patch submission: CREATE CAST ... AS EXPLICIT

2013-07-16 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 On 07/09/2013 01:10 AM, Fabien COELHO wrote:
 Where are we with this patch?  Fabien, are you going to submit an
 updated version which addresses the objections, or should I mark it
 Returned With Feedback?

 There is no need for an updated patch. I addressed the objections with
 words, not code:-)

 So, Tom, Robert, Cedric: can we have a verdict?  Commit or no?

My vote is still no, because of (1) the keyword-creep issue, and
(2) the fact that this is proposing to invent non-standard syntax
for functionality that's in the SQL standard.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [9.4 CF 1] SDO: Patches for Review

2013-07-16 Thread Josh Berkus
Hackers,

As the Commitfest is past deadline, here is the disposition of 15
patches which were still marked Needs Review:

Needs Review


The following patches have not had even one substantial review.  Please
someone pick them up and review them ASAP:

- Row-Level Security - KaiGai Kohei
- Performance Improvement by reducing WAL for Update Operation -  Amit
Kapila/Hari Babu
- Fix pgstattuple/pgstatindex to use regclass-type as the argument -
Satoshi Nagayasu
- Revive Line Type - Peter Eisentraut

In Limbo


These two patches are still under active review as of today, so we are
waiting on the outcome of that before marking them one way or the other:

- Lag & Lead Window Functions Can Ignore Nulls  -  Nicholas White
- SQL Command to edit postgresql.conf (ALTER SYSTEM patch) - Amit Kapila

I couldn't determine what status the following patch had:

- CREATE CAST ... AS EXPLICIT - Fabien Coelho

Returned, Please Resubmit
=

Finally, during the commitfest we promise each submitter one serious
review.  In most cases, we try to do as much review as is required to get
the patches into committable state.  However, it is now the end date of
the commitfest, and the following patches did get one serious review,
even if they were updated after the review.

The authors of these patches are strongly encouraged to resubmit them
for the September Commitfest, or to contact me and I will transfer the
patches.

Add support to IF NOT EXISTS to others CREATE statements  - Fabrizio
de Royes Mello
TransactionIdIsInProgress() procno cache - Simon Riggs
Patch to compute Max LSN of Data Pages (from 2012-11) - Amit Kapila/Hari
Babu
fail-back without fresh backup - Samrat Revagade, Sawada Masahiko
updated emacs configuration - Peter Eisentraut
Add more regression tests for ROLE (USER) - Robins Tharakan
Add more regression tests for SCHEMA - Robins Tharakan
Add more regression tests for SEQUENCE - Robins Tharakan



-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Review] Re: [HACKERS] minor patch submission: CREATE CAST ... AS EXPLICIT

2013-07-16 Thread Josh Berkus
On 07/16/2013 03:12 PM, Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
 On 07/09/2013 01:10 AM, Fabien COELHO wrote:
 Where are we with this patch?  Fabien, are you going to submit an
 updated version which addresses the objections, or should I mark it
 Returned With Feedback?
 
 There is no need for an updated patch. I addressed the objections with
 words, not code:-)
 
 So, Tom, Robert, Cedric: can we have a verdict?  Commit or no?
 
 My vote is still no, because of (1) the keyword-creep issue, and
 (2) the fact that this is proposing to invent non-standard syntax
 for functionality that's in the SQL standard.

Ok, marking Returned with Feedback.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [9.4 CF 1]SDO: Patches awaiting Author

2013-07-16 Thread Josh Berkus
Hackers,

The following patches were Waiting on Author as of today.  As such,
most have been marked returned with feedback.  Authors are encouraged
to update their patches and submit to the next Commitfest.

There are a few exceptions to Returned with feedback, though:

Could Not Determine Status:

logical changeset generation v5 Andres Freund
Pluggable toast compression Andres Freund

Didn't Belong in CF:

pg_filedump 9.3: checksums (and a few other fixes)

The rest, all Returned with Feedback:

Extension templates Dimitri Fontaine
COPY tuning: allow batched inserts with SERIAL/default nextval()  Simon Riggs
Non-recursive processing of AND/OR lists  Gurjeet Singh
Add basic regression tests for SET xxx  Robins Tharakan
Add basic regression tests for DISCARD  Robins Tharakan
Remove useless USE_PGXS support in contrib  Peter Eisentraut
Add visibility map information to pg_freespace  Koytaro Horiguchi
Reduce maximum error in tuples estimation after vacuum  Koytaro Horiguchi
Make recovery.conf parameters into GUCs Simon Riggs, et al
transforms  Peter Eisentraut

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] XLogInsert scaling, revisited

2013-07-16 Thread Michael Paquier
On Tue, Jul 16, 2013 at 2:24 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, Jul 8, 2013 at 6:16 PM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
 Ok, I've committed this patch now. Finally, phew!

 I found that this patch causes the assertion failure. When I set up simple
 replication environment and promoted the standby before executing any
 transaction on the master, I got the following assertion failure.

 2013-07-16 02:22:06 JST sby1 LOG:  received promote request
 2013-07-16 02:22:06 JST sby1 FATAL:  terminating walreceiver process
 due to administrator command
 2013-07-16 02:22:06 JST sby1 LOG:  redo done at 0/2F0
 2013-07-16 02:22:06 JST sby1 LOG:  selected new timeline ID: 2
 hrk:head-pgsql postgres$ 2013-07-16 02:22:06 JST sby1 LOG:  archive
 recovery complete
 TRAP: FailedAssertion(!(readOff == (XLogCtl->xlblocks[firstIdx] -
 8192) % ((uint32) (16 * 1024 * 1024))), File: xlog.c, Line: 7048)
 2013-07-16 02:22:12 JST sby1 LOG:  startup process (PID 37115) was
 terminated by signal 6: Abort trap
 2013-07-16 02:22:12 JST sby1 LOG:  terminating any other active server 
 processes
Note that this is also reproducible even when trying to recover only
from archives without strrep.
Regards,
--
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL renegotiation

2013-07-16 Thread Noah Misch
On Tue, Jul 16, 2013 at 10:41:44AM -0700, David Fetter wrote:
 On Fri, Jul 12, 2013 at 08:51:52PM -0400, Noah Misch wrote:
  Agreed.  The OpenSSL Project last applied a security fix to 0.9.6
  over eight years ago.  Compatibility with 0.9.6 has zero or negative
  value.
 
 You've made a persuasive case that we should actively break backward
 compatibility here.  Would that be complicated to do?

Nope.  If Alvaro's code change builds under 0.9.6, malfunctioning only at
runtime, I suspect we would add a configure-time version check and possibly
a runtime one as well.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] findDependentObjects() mutual exclusion vs. MVCC catalog scans

2013-07-16 Thread Noah Misch
On Tue, Jul 16, 2013 at 11:27:02AM -0400, Robert Haas wrote:
 I recommend reworking the header comment to avoid mention of
 SnapshotNow, since if we get rid of SnapshotNow, the reference might
 not be too clear to far-future hackers.
 
 + /*
 +  * For a scan using a non-MVCC snapshot like SnapshotSelf, we would simply
 +  * reuse the old snapshot.  So far, the only caller uses MVCC snapshots.
 +  */
 + freshsnap = GetCatalogSnapshot(RelationGetRelid(sysscan->heap_rel));
 
 This comment is not very clear, because it doesn't describe what the
 code actually does, but rather speculates about what the code could do
 if the intention of some future caller were different.  I recommend
 adding Assert(IsMVCCSnapshot(scan->xs_snapshot)) and changing the
 comment to something like this: "For now, we don't handle the case of
 a non-MVCC scan snapshot.  This is adequate for existing uses of this
 function, but might need to be changed in the future."

On Tue, Jul 16, 2013 at 11:35:48AM -0400, Tom Lane wrote:
 I agree with Robert's comments, and in addition suggest that this code
 needs a comment about why it's safe to use the snapshot without doing
 RegisterSnapshot or equivalent.

Committed with hopefully-better comments.  Thanks.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: FILTER for aggregates [was Re: Department of Redundancy Department: makeNode(FuncCall) division]

2013-07-16 Thread Noah Misch
On Mon, Jul 15, 2013 at 11:43:04AM -0700, David Fetter wrote:
 On Sun, Jul 14, 2013 at 10:15:12PM -0400, Noah Misch wrote:
  See attached patch revisions.  The first patch edits 
  find_minmax_aggs_walker()
  per my comments just now.  The second is an update of your FILTER patch with
  the changes to which I alluded above; it applies atop the first patch.  
  Would
  you verify that I didn't ruin anything?  Barring problems, will commit.

 Tested your changes.  They pass regression, etc. :)

Committed.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Adding optionally commit number in PG_VERSION_STR

2013-07-16 Thread Michael Paquier
Hi all,

It happens that I work occasionally on multiple builds based on
different stable branches at the same time to check fixes that need to
be backpatched, and I tend to easily lose track on which version the
build I created is based on (Duh!). There is of course the version
number up to the 3rd digit available (for example 9.2.4, 9.3beta2,
etc.), but as a developer I think that it would be helpful to include
the commit ID in PG_VERSION_STR to get a better reference on exactly
what the development build is based on. This could be controlled by an
additional flag in ./configure.in called something like
--enable-version-commit, of course disabled by default. If enabled,
PG_VERSION_STR would be generated with the new information. configure
would also return an error when this flag is enabled if git is either
not found, or if the directory where configure is run is not a git
repository.
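
To illustrate (the commit hash and the rest of the string below are
invented; only the PG_VERSION_STR macro name is real), the generated
definition could end up looking like this:

#include <stdio.h>

/*
 * Invented example: PG_VERSION_STR is normally generated by configure;
 * the commit hash and platform details here are made up.
 */
#define PG_VERSION_STR \
    "PostgreSQL 9.4devel (commit 1a2b3c4) on x86_64-unknown-linux-gnu, " \
    "compiled by gcc 4.7.2, 64-bit"

int
main(void)
{
    puts(PG_VERSION_STR);   /* roughly what SELECT version() would report */
    return 0;
}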

Thoughts?
--
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Listen/notify across clusters

2013-07-16 Thread Andreas Karlsson

On 07/15/2013 04:10 PM, Greg Jaskiewicz wrote:

In terms of features, apart from separating LISTEN so that it can be actually 
used on Standbys, wouldn't it be a matter of including the notifications in the 
WAL stream, as simple packets ?
This would guarantee same behaviour as on the master.


I guess one problem is to implement writing to the WAL with the smallest 
possible performance hit.  As far as I can see there are two possible 
approaches: either write to WAL when NOTIFY is run or write to WAL on 
commit. The former seems more in line with how commands in PostgreSQL 
usually work.


There shouldn't be any major problems with implementing LISTEN on the 
slaves since LISTEN is done in memory.


I feel like I, as a beginner to the codebase, am missing something here, 
because while this is a fair bit of work it does not seem too hard to implement.


Andreas

--
Andreas Karlsson


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding optionally commit number in PG_VERSION_STR

2013-07-16 Thread Tom Lane
Michael Paquier michael.paqu...@gmail.com writes:
 It happens that I work occasionally on multiple builds based on
 different stable branches at the same time to check fixes that need to
 be backpatched, and I tend to easily lose track on which version the
 build I created is based on (Duh!). There is of course the version
 number up to the 3rd digit available (for example 9.2.4, 9.3beta2,
 etc.), but as a developer I think that it would be helpful to include
 the commit ID in PG_VERSION_STR to get a better reference on exactly
 what the development build is based on. This could be controlled by an
 additional flag in ./configure.in called something like
 --enable-version-commit, of course disabled by default. If enabled,
 PG_VERSION_STR would be generated with the new information. configure
 would also return an error when this flag is enabled if git is either
 not found, or if the directory where configure is run is not a git
 repository.

Personally, I'd find that pretty useless, because the build I'm running
has typically got uncommitted changes in it.  Also, there isn't any good
way to set PG_VERSION_STR except at configure time, which means the
common practice of doing (at most) make clean before rebuilding would
not result in an updated string even if you had committed.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-16 Thread Ants Aasma
On Tue, Jul 16, 2013 at 9:17 PM, Greg Smith g...@2ndquadrant.com wrote:
 On 7/16/13 12:46 PM, Ants Aasma wrote:

 Spread checkpoints sprinkles the writes out over a long
 period and the general tuning advice is to heavily bound the amount of
 memory the OS willing to keep dirty.


 That's arguing that you can make this feature be useful if you tune in a
 particular way.  That's interesting, but the goal here isn't to prove the
 existence of some workload that a change is useful for.  You can usually
 find a test case that validates any performance patch as helpful if you
 search for one.  Everyone who has submitted a sorted checkpoint patch for
 example has found some setup where it shows significant gains.  We're trying
 to keep performance stable across a much wider set of possibilities though.

 Let's talk about default parameters instead, which quickly demonstrates
 where your assumptions fail.  The server I happen to be running pgbench
 tests on today has 72GB of RAM running SL6 with RedHat derived kernel
 2.6.32-358.11.1.  This is a very popular middle grade server configuration
 nowadays.  There dirty_ratio and dirty_background_ratio are 10
 (percent).  That means that roughly 7GB of RAM can be used for write
 caching.  Note that this is a fairly low write cache tuning compared to a
 survey of systems in the field--lots of people have servers with earlier
 kernels where these numbers can be as high as 20 or even 40% instead.

 The current feasible tuning for shared_buffers suggests a value of 8GB is
 near the upper limit, beyond which cache related overhead makes increases
 counterproductive.  Your examples are showing 53% of shared_buffers dirty at
 checkpoint time; that's typical.  The checkpointer is then writing out just
 over 4GB of data.

 With that background what process here has more data to make decisions with?

 -The operating system has 7GB of writes it's trying to optimize.  That
 potentially includes backend, background writer, checkpoint, temp table,
 statistics, log, and WAL data.  The scheduler is also considering read
 operations.

 -The checkpointer process has 4GB of writes from rarely written shared
 memory it's trying to optimize.

Actually I was arguing that the reasoning that the OS will take care of
the sorting does not apply in reasonably common cases. My point is
that the OS isn't able to optimize the writes because spread
checkpoints trickle the writes out to the OS in random order over a
long time. If OS writeback behavior is left in the default
configuration, it will start writing out data before the checkpoint write
phase ends (due to dirty_expire_centisecs); this will miss write-combining
opportunities that would arise if we sorted the data before
dumping it to the OS dirty buffers. I'm not arguing that we try to
bypass OS I/O scheduling decisions; I'm arguing that by arranging
checkpoint writes in logical order we will make pages visible to the
I/O scheduler in a way that leads to more efficient writes.
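
As a rough sketch of the ordering I have in mind (the struct and function
names below are placeholders, not the code from any submitted patch), the
checkpointer would sort the buffers it is about to write by file and block
once, before trickling them out:

#include <stdlib.h>

/* Invented key for a dirty buffer; a stand-in for the real BufferTag. */
typedef struct
{
    unsigned int relnode;       /* which relation file */
    unsigned int forknum;       /* which fork of that relation */
    unsigned int blocknum;      /* block number within the file */
} DirtyTag;

/* Order dirty buffers by file, then fork, then block number. */
static int
dirty_tag_cmp(const void *a, const void *b)
{
    const DirtyTag *x = a;
    const DirtyTag *y = b;

    if (x->relnode != y->relnode)
        return (x->relnode < y->relnode) ? -1 : 1;
    if (x->forknum != y->forknum)
        return (x->forknum < y->forknum) ? -1 : 1;
    if (x->blocknum != y->blocknum)
        return (x->blocknum < y->blocknum) ? -1 : 1;
    return 0;
}

/*
 * Sort the set of buffers to be written once at the start of the
 * checkpoint, then trickle them out in that order, so adjacent blocks of
 * the same file reach the kernel close together and can be write-combined.
 */
static void
sort_checkpoint_writes(DirtyTag *tags, size_t ntags)
{
    qsort(tags, ntags, sizeof(DirtyTag), dirty_tag_cmp);
}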

Also I think that you are overestimating the capabilities of the OS IO
scheduler. At least for Linux, the IO scheduler does not see pages in
the dirty list - only pages for which writeback has been initiated. In
default configuration this means up to 128 read and 128 write I/Os are
considered. The writes are picked by basically doing round robin on
files with dirty pages and doing a clocksweep scan for a chunk of
pages from each. So in reality there is practically no benefit in
having the OS do the reordering, while there is the issue that
flushing a large amount of dirty pages at once does very nasty things
to query latency by overloading all of the I/O queues.

 This is why if you take the opposite approach of yours today--go searching
 for workloads where sorting is counterproductive--those are equally easy to
 find.  Any test of write speed I do starts with about 50 different
 scale/client combinations.  Why do I suggest pgbench-tools as a way to do
 performance tests?  It's because an automated sweep of client setups like it
 does is the minimum necessary to create enough variation in workload for
 changing the database's write path.  It's really amazing how often doing
 that shows a proposed change is just shuffling the good and bad cases
 around.  That's been the case for every sorting and fsync delay change
 submitted so far.  I'm not even interested in testing today's submission
 because I tried that particular approach for a few months, twice so far, and
 it fell apart on just as many workloads as it helped.

As you know, running a full suite of write benchmarks takes a very long
time, with results often being inconclusive (the noise is greater than the
effect we are trying to measure). This is why I'm interested in which
workloads you suspect might fall apart with this patch - because I
can't think of any. The worst case would be that the OS fully absorbs all
checkpoint writes before writing anything out, so the sorting is a
useless waste of CPU and memory. The CPU 

Re: [HACKERS] A general Q about index

2013-07-16 Thread soroosh sardari
On Tue, Jul 16, 2013 at 7:00 PM, David Johnston pol...@yahoo.com wrote:

 David Johnston wrote
 
  soroosh sardari wrote
  Hi
 
  I want to know how an index is created and used.
  actually if you can show to me a simple start point, it would be great.
 
  Regards,
  Soroosh Sardari
  In the documentation there is a table of contents and listed on that
 table
  is a section named Indexes.
 
  David J.

 Since you posted this to -hackers you may mean you wish to know how to
 program them as opposed to using them in SQL.  I have no clue in this
 regard.  If you did mean use in SQL then the documentation is your friend,
 and you also should not have posted this question to -hackers but to
 -general instead; you probably should have posted there anyway to begin
 with and been clearer as to what you mean by created and used.

 David J.




 --
 View this message in context:
 http://postgresql.1045698.n5.nabble.com/A-general-Q-about-index-tp5763912p5763928.html
 Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Actually I mean the index in the source code; I want to know how an index is
used in the backend for the execution of a query.
Sorry, my question was a bit ambiguous.
I started reading src/backend/access/index/genam.c, and I think maybe this
file is not a good place to start.
Let me write a few questions to illustrate my point:
- How is an index created for a table on a specified field?
- How is an index stored in a page?
- Which module is responsible for fetching and swapping out index pages?
- Is there a kind of abstract class for all variants of index?


Re: [HACKERS] A general Q about index

2013-07-16 Thread Craig Ringer
On 07/16/2013 11:09 PM, soroosh sardari wrote:
 
 Actually I mean the index in the source code; I want to know how an index
 is used in the backend for the execution of a query.
 Sorry, my question was a bit ambiguous.
 I started reading src/backend/access/index/genam.c, and I think maybe
 this file is not a good place to start.
 Let me write a few questions to illustrate my point:
 - How is an index created for a table on a specified field?
 - How is an index stored in a page?
 - Which module is responsible for fetching and swapping out index pages?
 - Is there a kind of abstract class for all variants of index?

Start with the documentation:

http://www.postgresql.org/docs/current/static/internals.html

particularly:

http://www.postgresql.org/docs/current/static/indexam.html
http://www.postgresql.org/docs/current/static/gist.html
http://www.postgresql.org/docs/current/static/catalog-pg-index.html
http://www.postgresql.org/docs/current/static/catalog-pg-am.html

That should give you the ground work to understand the source code a bit
better. Then feel free to return with specific questions you don't feel
are explained by the documentation or code comments and aren't obvious
from the code.
-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers