[HACKERS] Question regarding Sync message and unnamed portal

2012-09-30 Thread Tatsuo Ishii
>From the manual:

"An unnamed portal is destroyed at the end of the transaction"

"At completion of each series of extended-query messages, the frontend
should issue a Sync message. This parameterless message causes the
backend to close the current transaction if it's not inside a
BEGIN/COMMIT transaction block"

>From these statements, I would think #4 will fail in the following
sequence of commands because #3 closes transaction and it destroys
unnamed portal: 1)Parse/Bind creates unnamed portal,
2)Parse/Bind/Execute creates named portal and executes, 3)Send Sync
message (because it is required in extended protocol), 4)Execute
unnamed portal created in #1.

If this is true, that means unnamed portal execution and named portal
execution cannot be mixed unless they are inside an explicit
transaction. IMO this should be described in the document.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 64-bit API for large object

2012-09-30 Thread Kohei KaiGai
2012/9/30 Tatsuo Ishii :
>> * At inv_seek(), it seems to me it checks offset correctness with wrong way,
>>   as follows:
>> |  case SEEK_SET:
>> |  if (offset < 0)
>> |  elog(ERROR, "invalid seek offset: " INT64_FORMAT, offset);
>> |  obj_desc->offset = offset;
>> |  break;
>>   It is a right assumption, if large object size would be restricted to 2GB.
>>   But the largest positive int64 is larger than expected limitation.
>>   So, it seems to me it should be compared with (INT_MAX * PAGE_SIZE)
>>   instead.
>
> Point taken. However, checking offset < 0 seems to be still valid
> because it is possible to pass minus offset to inv_seek(), no?  Also I
> think upper limit for seek position should be defined as (INT_MAX *
> LOBLKSIZE), rather than (INT_MAX * PAGE_SIZE). Probably (INT_MAX *
> LOBLKSIZE) should be defined in pg_largeobject.h as:
>
> /*
>  * Maximum byte length for each large object
> */
> #define MAX_LARGE_OBJECT_SIZE   INT64CONST(INT_MAX * LOBLKSIZE)
>
> Then the checking offset in inv_seek() will be:
>
> case SEEK_SET:
> if (offset < 0 || offset >= MAX_LARGE_OBJECT_SIZE)
> elog(ERROR, "invalid seek offset: " 
> INT64_FORMAT, offset);
> obj_desc->offset = offset;
> break;
> case SEEK_CUR:
> if ((offset + obj_desc->offset) < 0 ||
>(offset + obj_desc->offset) >= 
> MAX_LARGE_OBJECT_SIZE)
> elog(ERROR, "invalid seek offset: " 
> INT64_FORMAT, offset);
> obj_desc->offset += offset;
> break;
> case SEEK_END:
> {
> int64   pos = inv_getsize(obj_desc) + 
> offset;
>
> if (pos < 0 || pos >= MAX_LARGE_OBJECT_SIZE)
> elog(ERROR, "invalid seek offset: " 
> INT64_FORMAT, offset);
> obj_desc->offset = pos;
> }
>
> What do you think?
>
Yes, it is exactly what I expected. Indeed, it is still need a check to prevent
negative offset here.

Thanks,
-- 
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_regress running for ~10 hours using 100% CPU

2012-09-30 Thread Thomas Munro
Hi

Buildfarm machine lyrebird, clang 2.9, amd64, Debian 6, 'HEAD'.  I
have been unsuccessful in finding out what it's doing using gdb:

pg_animal@asterix:~/work/HEAD/pgsql$ gdb --pid=10681
GNU gdb (GDB) 7.0.1-debian
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
.
Attaching to process 10681
Reading symbols from
/home/pg_animal/work/HEAD/pgsql.2325/src/test/regress/pg_regress...done.
Reading symbols from /lib/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging
symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Die: DW_TAG_restrict_type (abbrev 48, offset 0x4adf)
  parent at offset: 0x3ebe
  has children: FALSE
  attributes:
DW_AT_type (DW_FORM_ref4) constant ref: 0x4115 (adjusted)
Dwarf Error: Cannot find type of die [in module
/home/pg_animal/work/HEAD/pgsql.2325/src/test/regress/pg_regress]

Asking GDB anything generates similar messages.  It seems like this
might be related to a known clang/dwarf/gdb bug (apparently
experienced by many MacOS users and the Chromium project, based on
cursory googling), and possibly not related to whatever difficulty
pg_regress is having.  So I don't actually have anything useful to
report here unfortunately... but thought I should write a note since
I'll suppose when I kill it it'll show up as a failure.  Unfortunately
I don't have time to dig further currently.

Thomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Question regarding Sync message and unnamed portal

2012-09-30 Thread Tom Lane
Tatsuo Ishii  writes:
> From the manual:
> "An unnamed portal is destroyed at the end of the transaction"

Actually, all portals are destroyed at end of transaction (unless
they're from holdable cursors).  Named or not doesn't enter into it.

> From these statements, I would think #4 will fail in the following
> sequence of commands because #3 closes transaction and it destroys
> unnamed portal: 1)Parse/Bind creates unnamed portal,
> 2)Parse/Bind/Execute creates named portal and executes, 3)Send Sync
> message (because it is required in extended protocol), 4)Execute
> unnamed portal created in #1.

The right thing to use if you're trying to interleave portal executions
like that is Flush, not Sync.  Sync mainly adds a protocol
resynchronization point --- it's needed in case portal execution fails
partway through.  (In which case you'll have lost both portals in the
transaction abort anyway.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] embedded list v3

2012-09-30 Thread Tom Lane
Andres Freund  writes:
> Patch 0001 contains a assert_compatible_types(a, b) and a 
> assert_compatible_types_bool(a, b) macro which I found very useful to make it
> harder to misuse the api. I think its generally useful and possibly should be
> used in more places.

This seems like basically a good idea, but the macro names are very
unfortunately chosen: they don't comport with our other names for
assertion macros, and they imply that the test is symmetric which it
isn't.  It's also unclear what the point of the _bool version is
(namely, to be used in expression contexts in macros).

I suggest instead

AssertVariableIsOfType(varname, typename)

AssertVariableIsOfTypeMacro(varname, typename)

Or possibly we should leave off the "Assert" prefix, since this will be
a compile-time-constant check and thus not really all that much like
the existing run-time Assert mechanism.  Or write "Check" instead of
"Assert", or some other verb.

Anybody got another color for this bikeshed?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] embedded list v3

2012-09-30 Thread Tom Lane
Andres Freund  writes:
> Current version is available at branch ilist in:
> git://git.postgresql.org/git/users/andresfreund/postgres.git
> ssh://g...@git.postgresql.org/users/andresfreund/postgres.git

I'm still pretty desperately unhappy with your insistence on circularly
linked dlists.  Not only does that make initialization problematic,
but now it's not even consistent with slists.

A possible compromise that would fix the initialization issue is to
allow an empty dlist to be represented as *either* two NULL pointers
or a pair of self-pointers.  Conversion from one case to the other
could happen like this:

  INLINE_IF_POSSIBLE void
  dlist_push_head(dlist_head *head, dlist_node *node)
  {
+   if (head->head.next == NULL)
+   dlist_init(head);
+ 
node->next = head->head.next;
node->prev = &head->head;
node->next->prev = node;
head->head.next = node;
  
dlist_check(head);
  }

It appears to me that of the inline'able functions, only dlist_push_head
and dlist_push_tail would need to do this; the others presume nonempty
lists and so wouldn't have to deal with the NULL/NULL case.
dlist_is_empty would need one extra test too.  dlist_foreach could be
something like

#define dlist_foreach(iter, ptr)
for (AssertVariableIsOfTypeMacro(iter, dlist_iter),
 AssertVariableIsOfTypeMacro(ptr, dlist_head *),
 iter.end = &(ptr)->head,
 iter.cur = iter.end->next ? iter.end->next : iter.end;
 iter.cur != iter.end;
 iter.cur = iter.cur->next)

I remain unimpressed with the micro-efficiency of this looping code,
but since you're set on pessimizing list iteration to speed up list
alteration, maybe it's the best we can do.

BTW, the "fast path" in dlist_move_head can't be right can it?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] embedded list v3

2012-09-30 Thread Andres Freund
On Sunday, September 30, 2012 06:57:32 PM Tom Lane wrote:
> Andres Freund  writes:
> > Patch 0001 contains a assert_compatible_types(a, b) and a
> > assert_compatible_types_bool(a, b) macro which I found very useful to
> > make it harder to misuse the api. I think its generally useful and
> > possibly should be used in more places.
> 
> This seems like basically a good idea, but the macro names are very
> unfortunately chosen: they don't comport with our other names for
> assertion macros, and they imply that the test is symmetric which it
> isn't.  It's also unclear what the point of the _bool version is
> (namely, to be used in expression contexts in macros).
> 
> I suggest instead
> 
>   AssertVariableIsOfType(varname, typename)
> 
>   AssertVariableIsOfTypeMacro(varname, typename)
> 
> Or possibly we should leave off the "Assert" prefix, since this will be
> a compile-time-constant check and thus not really all that much like
> the existing run-time Assert mechanism.  Or write "Check" instead of
> "Assert", or some other verb.
> 
> Anybody got another color for this bikeshed?
No, happy with the new name.

Thanks for committing! Wondered for a minute what the point of autoconfiscation 
is/was but I see that e.g. clang already works... Nice.

The bizarre syntactic placement requirements directly come from the standard 
btw. No idea why they thought that would be a good idea... (check 6.7.1, 
6.7.2.1, 6.7.10).

Perhaps we need to decouple _Static_assert support from compound statement 
support at some point, but we will see.

Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] embedded list v3

2012-09-30 Thread Tom Lane
Andres Freund  writes:
> Perhaps we need to decouple _Static_assert support from compound statement 
> support at some point, but we will see.

Yeah, possibly, but until we have an example of a non-gcc-compatible
compiler that can do something equivalent, it's hard to guess how we
might need to alter the autoconf tests.  Anyway the important thing
for now is the external specification of the macros, and I think we're
good on that (modulo any better naming suggestions).

I'm fairly sure there are already a few cases of Asserts on
compile-time-constant expressions, so I made sure that there was a layer
allowing access to _Static_assert without the type-compatibility code.
I didn't go looking for anything to convert, but I think there's some
in the shared memory initialization code.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] embedded list v3

2012-09-30 Thread Andres Freund
Hi,

On Sunday, September 30, 2012 10:33:28 PM Tom Lane wrote:
> Andres Freund  writes:
> > Current version is available at branch ilist in:
> > git://git.postgresql.org/git/users/andresfreund/postgres.git
> > ssh://g...@git.postgresql.org/users/andresfreund/postgres.git
> 
> I'm still pretty desperately unhappy with your insistence on circularly
> linked dlists. Not only does that make initialization problematic,
> but now it's not even consistent with slists.
The slist code originally grew out of the dlist code and thus was pretty 
similar, but at some point (after your dislike of the circular lists?, not 
sure) I thought about it again and found no efficiency differences for any of 
the common manipulations in single linked lists because you don't need to deal 
with prev and tail pointers, so I saw no point in insisting there. No external 
user should care.

> A possible compromise that would fix the initialization issue is to
> allow an empty dlist to be represented as *either* two NULL pointers
> or a pair of self-pointers.  Conversion from one case to the other
> could happen like this:

> It appears to me that of the inline'able functions, only dlist_push_head
> and dlist_push_tail would need to do this; the others presume nonempty
> lists and so wouldn't have to deal with the NULL/NULL case.
> dlist_is_empty would need one extra test too.
The problem is that dlist_push_head/tail and and dlist_is_empty are prety hot 
code paths.

In transaction reassembly/wal decoding for every wal record that we need to 
look at in the context of a transaction the code very roughly does something 
like:

/* get change */
Change *change;

if (dlist_is_empty(&apply_cache->cached_changes))
 change = dlist_container(..., dlist_pop_head_node(&apply_cache-
>cached_changes))
else
 change = malloc(...);

/* get data from wal */
fill_change_change(current_wal_record, change);

/* remember change, till TX is complete */
dlist_push_tail(tx->changes, change->node);

(In reality more of those happen, but anyway)

We literally have tens of thousands list manipulation a second if the server is 
busy. Iteration only happens once a XLOG_COMMIT/ABORT is found (in combination 
with merging the subtransaction's changes).


In the slab allocator I originally built this for it was exactly the same. The 
lists are manipulated for every palloc/pfree but only iterated at 
MemoryContextReset.

I am really sorry for being stubborn here, but I changed to circular lists 
after profiling and finding that pipeline stalls & misprediced branches where 
the major thing I could change. Not sure how we can resolv this :(

> BTW, the "fast path" in dlist_move_head can't be right can it?
Yea, its crap, thanks for noticing. Shouldn't cause any issues except being 
slower, thats why I probably didn't notice I broke it at some point. ->next is 
missing.

Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] embedded list v3

2012-09-30 Thread Andres Freund
On Sunday, September 30, 2012 10:48:01 PM Tom Lane wrote:
> Andres Freund  writes:
> > Perhaps we need to decouple _Static_assert support from compound
> > statement support at some point, but we will see.
> 
> Yeah, possibly, but until we have an example of a non-gcc-compatible
> compiler that can do something equivalent, it's hard to guess how we
> might need to alter the autoconf tests.  Anyway the important thing
> for now is the external specification of the macros, and I think we're
> good on that (modulo any better naming suggestions).
I thought msvc supported _Static_assert as well, but after a short search it 
seems I misremembered and it only supports static_assert from C++11 (which is 
plausible, because I've worked on a C++11 project which was ported to windows 
). I don't know how C and C++ compilation modes are different in msvc.

I really don't understand why C and C++ standard development isn't coordinated 
more... They seem to come up with annoying incompatibilities all the time.

> I'm fairly sure there are already a few cases of Asserts on
> compile-time-constant expressions, so I made sure that there was a layer
> allowing access to _Static_assert without the type-compatibility code.
> I didn't go looking for anything to convert, but I think there's some
> in the shared memory initialization code.
Definitely a sensible goal. I wasn't really sure how well the idea would be 
received after I asked before and only heard echoes of my excitement and didn't 
want to spend too much time on it...

Greetings,

Andres
-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Statistics and selectivity estimation for ranges

2012-09-30 Thread Jeff Davis
On Tue, 2012-09-04 at 17:27 +0400, Alexander Korotkov wrote:
> Addon patch is attached. Actually, I don't get your intention of
> introducing STATISTIC_KIND_RANGE_EMPTY_FRAC stakind. Did you plan to
> leave it as empty frac in distinct stakind or replace this stakind
> with STATISTIC_KIND_LENGTH_HISTOGRAM? In the attached
> patch STATISTIC_KIND_RANGE_EMPTY_FRAC is replaced
> with STATISTIC_KIND_LENGTH_HISTOGRAM.

Review comments:

1. In compute_range_stats, you need to guard against the case where
there is no subdiff function. Perhaps default to 1.0 or something?

2. I think it would be helpful to add comments regarding what happens
when lengths are identical, right now it's a little confusing. For
instance, the comment: "Generate a length histogram slot entry if there
are at least two length values" doesn't seem right, because the
condition still matches even if there is only one distinct value.

3. It looks like get_distance also needs to guard against a missing
subdiff.

4. There are 3 binary search functions, which seems a little excessive:
  * rbound_bsearch: greatest i such that hist[i] < v; or -1
  * rbound_bsearch_equal: greatest i such that:
  hist[i] <= v and (i=0 or hist[i-1] != hist[i]); or -1
  * length_hist_bsearch: least i such that hist[i] >= v;
  or length of hist
(let me know if I misunderstood the definitions)
At a minimum, we need more consistent and informative names. Also, the
definition of rbound_bsearch_equal is a little confusing because it's
looking for the highest index among distinct values, but the lowest
index among identical values. Do you see a way to refactor these to be a
little easier to understand?

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gistchoose vs. bloat

2012-09-30 Thread Jeff Davis
On Tue, 2012-09-04 at 19:21 +0400, Alexander Korotkov wrote:

> New version of patch is attached. Parameter "randomization" was
> introduced. It controls whether to randomize choose. Choose algorithm
> was rewritten.
> 
Review comments:

1. Comment above while loop in gistRelocateBuildBuffersOnSplit needs to
be updated.

2. Typo in two places: "if randomization id required".

3. In gistRelocateBuildBuffersOnSplit, shouldn't that be:
 splitPageInfo = &relocationBuffersInfos[bufferIndex];
   not:
 splitPageInfo = &relocationBuffersInfos[i];

4. It looks like the randomization is happening while trying to compare
the penalties. I think it may be more readable to separate those two
steps; e.g.

  /* create a mapping whether randomization is on or not */
  for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
  offsets[i - FirstOffsetNumber] = i;

  if (randomization)
  /* randomize offsets array */

  for (i = 0; i < maxoff; i++)
  {
 offset = offsets[i];
 ...
  }

That's just an idea; if you think it's more readable as-is (or if I am
misunderstanding) then let me know.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP checksums patch

2012-09-30 Thread Jeff Davis
On Fri, 2012-09-14 at 17:58 -0700, Jeff Davis wrote:
> This is just a rebased version of the patch by Simon here:

I just noticed the following note in the docs for this patch:

  The default is off for backwards compatibility and
  to allow upgrade. The recommended setting is on though
  this should not be enabled until upgrade is successfully complete
  with full set of new backups.

I don't understand what that means -- if they have the page_checksums
GUC available, then surely upgrade is complete, right? And what is the
backwards-compatibility issue?

Also, it looks out of date, because the default in guc.c is set to true.
I think we should probably default to true, because it's safer and it
can always be disabled at runtime, but I don't have a strong opinion
about that.

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP checksums patch

2012-09-30 Thread Jeff Davis
On Fri, 2012-09-14 at 17:58 -0700, Jeff Davis wrote:
> * we might want to make it slightly easier for external utilities, like
> for backup/replication, to verify the pages

Ideally, PageVerificationInfoOK should be available to external
utilities, so that someone might script a background job to verify
pages. Or, perhaps we should just include a new utility,
pg_verify_checksums?

Either way, where is a good place to put the function so that it's
shared between the server and the utility? In src/port somewhere?

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Switching timeline over streaming replication

2012-09-30 Thread Amit kapila
> On Friday, September 28, 2012 6:38 PM Amit Kapila wrote:
> On Tuesday, September 25, 2012 6:29 PM Heikki Linnakangas wrote:
> On 25.09.2012 10:08, Heikki Linnakangas wrote:
> > On 24.09.2012 16:33, Amit Kapila wrote:
> >> In any case, it will be better if you can split it into multiple
> patches:
> >> 1. Having new functionality of "Switching timeline over streaming
> >> replication"
> >> 2. Refactoring related changes.


> Ok, here you go. xlog-c-split-1.patch contains the refactoring of existing
code, with no user-visible changes.
> streaming-tli-switch-2.patch applies over xlog-c-split-1.patch, and
contains the new functionality.


> Please find the initial review of the patch. Still more review is pending,
> but I thought whatever is done I shall post

Some more review:

11. In function readTimeLineHistory()
ereport(DEBUG3,
(errmsg_internal("history of timeline %u is %s",
targetTLI, nodeToString(result;


Don't nodeToString(result) needs to be changed as it contain list of 
structure TimeLineHistoryEntry


12. In function @@ -3768,6 +3773,8 @@ rescanLatestTimeLine(void)
+ * The current timeline must be found in the history file, and the
+ * next timeline must've forked off from it *after* the current
+ * recovery location.
  */
- if (!list_member_int(newExpectedTLIs,
- (int) recoveryTargetTLI))
- ereport(LOG,
- (errmsg("new timeline %u is not a child of database system timeline %u",
- newtarget,
- ThisTimeLineID)));


is there any logic in the current patch which ensures that above check is 
not require now?


13. In function @@ -3768,6 +3773,8 @@ rescanLatestTimeLine(void)
+ found = false;
+ foreach (cell, newExpectedTLIs)
..
..
- list_free(expectedTLIs);
+ list_free_deep(expectedTLIs);
whats the use of the found variable and freeing expectedTLIs in loop might 
cause problem.


14. In function @@ -3768,6 +3773,8 @@ rescanLatestTimeLine(void)
newExpectedTLIs = readTimeLineHistory(newtarget);
Shouldn't this variable be declared as newExpectedTLEs as the list returned by 
readTimeLineHistory contains target list entry


15. StartupXLOG
/* Now we can determine the list of expected TLIs */
expectedTLIs = readTimeLineHistory(recoveryTargetTLI);


Should expectedTLIs be changed to expectedTLEs as the list returned by 
readTimeLineHistory contains target list entry


16.@@ -5254,8 +5252,8 @@ StartupXLOG(void)
writeTimeLineHistory(ThisTimeLineID, recoveryTargetTLI,
- curFileTLI, endLogSegNo, reason);
+ curFileTLI, EndRecPtr, reason);
if endLogSegNo is not used here, it needs to be removd from function 
declaration as well.


17.@@ -5254,8 +5252,8 @@ StartupXLOG(void)
if (InArchiveRecovery)
  ..
  ..
+
+ /* The history file can be archived immediately. */
+ TLHistoryFileName(histfname, ThisTimeLineID);
+ XLogArchiveNotify(histfname);


 Shouldn't this be done archive_mode is on. Right now InArchiveRecovery is 
true even when we do recovery for standby


18. +static bool
+WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, bool 
fetching_ckpt)
{
..
+ if (XLByteLT(RecPtr, receivedUpto))
+ havedata = true;
+ else
+ {
+ XLogRecPtr latestChunkStart;
+
+ receivedUpto = GetWalRcvWriteRecPtr(&latestChunkStart, &receiveTLI);
+ if (XLByteLT(RecPtr, receivedUpto) && receiveTLI == curFileTLI)
+ {
+ havedata = true;
+ if (!XLByteLT(RecPtr, latestChunkStart))
+ {
+ XLogReceiptTime = GetCurrentTimestamp();
+ SetCurrentChunkStartTime(XLogReceiptTime);
+ }
+ }
+ else
+ havedata = false;
+ }


In the above logic, it seems there is inconsistency in setting havedata = true;
In the above code in else loop, let us say cond. receiveTLI == curFileTLI is 
false but XLByteLT(RecPtr, receivedUpto) is true,
then in next round in for loop, the check if (XLByteLT(RecPtr, receivedUpto)) 
will get true and will set havedata = true;
which seems to be contradictory.


19.


+static bool
+WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess, bool 
fetching_ckpt)
{
..
+ if (PrimaryConnInfo)
+ {
+ XLogRecPtr ptr = fetching_ckpt ? RedoStartLSN : RecPtr;
+ TimeLineID tli = timelineOfPointInHistory(ptr, expectedTLIs);
+
+ if (tli < curFileTLI)


I think in some cases if (tli < curFileTLI) might not make sense, as for case 
where curFileTLI =0 for randAccess. 




20. Function name WaitForWALToBecomeAvailable suggests that it waits for WAL, 
but it also returns true when trigger file is present,
which can be little misleading.




21. @@ -2411,27 +2411,6 @@ reaper(SIGNAL_ARGS)

a. won't it impact stop of online basebackup functionality because earlier 
on promotion
   I think this code will stop walsenders and basebackup stop will also 
give error in such cases.






22. @@ -63,10 +66,17 @@ void
 _PG_init(void)
 {
  /* Tell walreceiver how to reach us */
- if (walrcv_connect != NULL || walrcv_receive != NULL ||
- walrcv_send != NULL || walrcv_disconnect != NULL)
+ if (walrcv_connect != NULL || walrcv_identify_system ||
+ walrcv_readtimelinehistoryfile != NULL ||


   check for walrcv

Re: [HACKERS] Extending range of to_tsvector et al

2012-09-30 Thread Dan Scott
On Sun, Sep 30, 2012 at 1:56 PM, johnkn63  wrote:
> When using to_tsvector  a number of newer unicode characters and pua
> characters are not included. How do I add the characters which I desire to
> be found?

I've just started digging into this code a bit, but from what I've
found src/backend/tsearch/wparser_def.c defines much of the parser
functionality, and in the area of Unicode includes a number of
comments like:

* with multibyte encoding and C-locale isw* function may fail or give
wrong result.
* multibyte encoding and C-locale often are used for Asian languages.
* any non-ascii symbol with multibyte encoding with C-locale is an
alpha character

... in concert with ifdefs around WIDE_UPPER_LOWER (in effect if
WCSTOMBS and TOWLOWER are available) to complicate testing scenarios
:)

Also note that src/test/regress/sql/tsearch.sql and
regress/sql/tsdicts.sql currently focus on English, ASCII-only data.

Perhaps this is a good opportunity for you to describe what your
environment looks like (OS, PostgreSQL version, encoding and locale
settings for the database) and show some sample to_tsquery() @@
to_tsvector() queries that don't behave the way you think they should
behave - and we could start building some test cases as a first step?

-- 
Dan Scott
Laurentian University


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Unportable use of uname in pg_upgrade test script

2012-09-30 Thread Peter Eisentraut
On Sat, 2012-09-29 at 13:33 -0400, Andrew Dunstan wrote:
> On 09/29/2012 01:06 PM, Tom Lane wrote:
> > Andrew Dunstan  writes:
> >> The trouble with uname -s is that its output is a bit variable. I think
> >> this will work:
> >>   testhost=`uname -a | sed 's/.* //'`
> > What do you mean by "a bit variable"?
> 
> On one of my machines uname -s return MINGW32_NT5.1
> 
> On another it says MINGW32_NT6.1

You could do

case $testhost in
MINGW*) something;;

Or call config.guess, which exists more or less for this purpose.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Doc patch, normalize search_path in index

2012-09-30 Thread Dan Scott
On Fri, Sep 28, 2012 at 1:40 PM, Karl O. Pinc  wrote:
> Hi,
>
> The attached patch (against git head)
> normalizes "search_path" as the thing indexed
> and uses a secondary index term to distinguish
> the configuration parameter from the run-time
> setting.

Makes sense to me, although I suspect the conceptual material is
better served by the "search path"-the-concept index entry and the
reference material by the "search_path configuration parameter" entry
(so, from that perspective, perhaps the patch should just be to remove
the "search_path" index entry from the DDL schemas conceptual
section).

> "search path" the concept remains distinguished
> in the index from "search_path" the setting/config param.
> It's hard to say whether it's useful to make this
> distinction.

I think that indexing "search path"-the-concept is useful for
translations, and the Japanese translation includes an index (I
couldn't find the index for the French translation).

-- 
Dan Scott
Laurentian University


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extending range of to_tsvector et al

2012-09-30 Thread john knightley
Dear Dan,

thank you for your reply.

The OS I am using is Ubuntu 12.04, with PostgreSQL 9.1.5 installed on
a utf8 local

A short 5 line dictionary file  is sufficient to test:-

raeuz
我们
𦘭𥎵
𪽖𫖂
󶒘󴮬

line 1 "raeuz" Zhuang word written using English letters and show up
under ts_vector ok
line 2 "我们" uses everyday Chinese word and show up under ts_vector ok
line 3 "𦘭𥎵" Zhuang word written using rather old Chinese charcters
found in Unicode 3.1 which came in about the year 2000  and show up
under ts_vector ok
line 4 "𪽖𫖂" Zhuang word written using rather old Chinese charcters
found in Unicode 5.2 which came in about the year 2009 but do not show
up under ts_vector ok
line 5 "󶒘󴮬" Zhuang word written using rather old Chinese charcters
found in PUA area of the font Sawndip.ttf but do not show up under
ts_vector ok (Font can be downloaded from
http://gdzhdb.l10n-support.com/sawndip-fonts/Sawndip.ttf)

The last two words even though included in a dictionary do not get
accepted by ts_vector.

Regards
John

On Mon, Oct 1, 2012 at 11:04 AM, Dan Scott  wrote:
> On Sun, Sep 30, 2012 at 1:56 PM, johnkn63  wrote:
>> When using to_tsvector  a number of newer unicode characters and pua
>> characters are not included. How do I add the characters which I desire to
>> be found?
>
> I've just started digging into this code a bit, but from what I've
> found src/backend/tsearch/wparser_def.c defines much of the parser
> functionality, and in the area of Unicode includes a number of
> comments like:
>
> * with multibyte encoding and C-locale isw* function may fail or give
> wrong result.
> * multibyte encoding and C-locale often are used for Asian languages.
> * any non-ascii symbol with multibyte encoding with C-locale is an
> alpha character
>
> ... in concert with ifdefs around WIDE_UPPER_LOWER (in effect if
> WCSTOMBS and TOWLOWER are available) to complicate testing scenarios
> :)
>
> Also note that src/test/regress/sql/tsearch.sql and
> regress/sql/tsdicts.sql currently focus on English, ASCII-only data.
>
> Perhaps this is a good opportunity for you to describe what your
> environment looks like (OS, PostgreSQL version, encoding and locale
> settings for the database) and show some sample to_tsquery() @@
> to_tsvector() queries that don't behave the way you think they should
> behave - and we could start building some test cases as a first step?
>
> --
> Dan Scott
> Laurentian University


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_upgrade tests vs alter generic changes

2012-09-30 Thread Peter Eisentraut
On Sat, 2012-09-29 at 17:52 -0400, Andrew Dunstan wrote:
> It turns out that the reason is that we support collations on MSVC but
> not on Mingw. 

The cause for that is that on Windows locale_t is called _locale_t, and
there is a workaround for that in src/include/port/win32.h for the MSVC
build, but with MinGW, the configure test result still says that
locale_t does not exist, so the code is omitted.

This could be fixed either by making the configure test a bit smarter,
or just by sticking a

#define HAVE_LOCALE_T 1

into the aforementioned win32.h.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extending range of to_tsvector et al

2012-09-30 Thread Dan Scott
Hi John:

On Sun, Sep 30, 2012 at 11:45 PM, john knightley
 wrote:
> Dear Dan,
>
> thank you for your reply.
>
> The OS I am using is Ubuntu 12.04, with PostgreSQL 9.1.5 installed on
> a utf8 local
>
> A short 5 line dictionary file  is sufficient to test:-
>
> raeuz
> 我们
> 𦘭𥎵
> 𪽖𫖂
> 󶒘󴮬
>
> line 1 "raeuz" Zhuang word written using English letters and show up
> under ts_vector ok
> line 2 "我们" uses everyday Chinese word and show up under ts_vector ok
> line 3 "𦘭𥎵" Zhuang word written using rather old Chinese charcters
> found in Unicode 3.1 which came in about the year 2000  and show up
> under ts_vector ok
> line 4 "𪽖𫖂" Zhuang word written using rather old Chinese charcters
> found in Unicode 5.2 which came in about the year 2009 but do not show
> up under ts_vector ok
> line 5 "󶒘󴮬" Zhuang word written using rather old Chinese charcters
> found in PUA area of the font Sawndip.ttf but do not show up under
> ts_vector ok (Font can be downloaded from
> http://gdzhdb.l10n-support.com/sawndip-fonts/Sawndip.ttf)
>
> The last two words even though included in a dictionary do not get
> accepted by ts_vector.

Hmm. Fedora 17 x86-64 w/ PostgreSQL 9.1.5 here, the latter seems to
work using the default text search configuration (albeit with one
crucial note: I created the database with the "lc_ctype=C
lc_collate=C" options):

WORKING:

createdb --template=template0 --lc-ctype=C --lc-collate=C foobar
foobar=# select ts_debug('󶒘󴮬');
ts_debug

 (word,"Word, all letters",󶒘󴮬,{english_stem},english_stem,{󶒘󴮬})
(1 row)

NOT WORKING AS EXPECTED:

foobaz=# SHOW LC_CTYPE;
  lc_ctype
-
 en_US.UTF-8
(1 row)

foobaz=# select ts_debug('󶒘󴮬');
ts_debug
-
 (blank,"Space symbols",󶒘󴮬,{},,)
(1 row)

So... perhaps LC_CTYPE=C is a possible workaround for you?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extending range of to_tsvector et al

2012-09-30 Thread Tom Lane
john knightley  writes:
> The OS I am using is Ubuntu 12.04, with PostgreSQL 9.1.5 installed on
> a utf8 local

> A short 5 line dictionary file  is sufficient to test:-

> raeuz
> 我们
> 𦘭𥎵
> 𪽖𫖂
> 󶒘󴮬

> line 1 "raeuz" Zhuang word written using English letters and show up
> under ts_vector ok
> line 2 "我们" uses everyday Chinese word and show up under ts_vector ok
> line 3 "𦘭𥎵" Zhuang word written using rather old Chinese charcters
> found in Unicode 3.1 which came in about the year 2000  and show up
> under ts_vector ok
> line 4 "𪽖𫖂" Zhuang word written using rather old Chinese charcters
> found in Unicode 5.2 which came in about the year 2009 but do not show
> up under ts_vector ok
> line 5 "󶒘󴮬" Zhuang word written using rather old Chinese charcters
> found in PUA area of the font Sawndip.ttf but do not show up under
> ts_vector ok (Font can be downloaded from
> http://gdzhdb.l10n-support.com/sawndip-fonts/Sawndip.ttf)

AFAIK there is nothing in Postgres itself that would distinguish, say,
𦘭 from 𪽖.  I think this must be down to
your platform's locale definition: it probably thinks that the former is
a letter and the latter is not.  You'd have to gripe to the locale
maintainers to get that fixed.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extending range of to_tsvector et al

2012-09-30 Thread john knightley
On Mon, Oct 1, 2012 at 12:11 PM, Tom Lane  wrote:
> john knightley  writes:
>> The OS I am using is Ubuntu 12.04, with PostgreSQL 9.1.5 installed on
>> a utf8 local
>
>> A short 5 line dictionary file  is sufficient to test:-
>
>> raeuz
>> 我们
>> 𦘭𥎵
>> 𪽖𫖂
>> 󶒘󴮬
>
>> line 1 "raeuz" Zhuang word written using English letters and show up
>> under ts_vector ok
>> line 2 "我们" uses everyday Chinese word and show up under ts_vector ok
>> line 3 "𦘭𥎵" Zhuang word written using rather old Chinese charcters
>> found in Unicode 3.1 which came in about the year 2000  and show up
>> under ts_vector ok
>> line 4 "𪽖𫖂" Zhuang word written using rather old Chinese charcters
>> found in Unicode 5.2 which came in about the year 2009 but do not show
>> up under ts_vector ok
>> line 5 "󶒘󴮬" Zhuang word written using rather old Chinese charcters
>> found in PUA area of the font Sawndip.ttf but do not show up under
>> ts_vector ok (Font can be downloaded from
>> http://gdzhdb.l10n-support.com/sawndip-fonts/Sawndip.ttf)
>
> AFAIK there is nothing in Postgres itself that would distinguish, say,
> 𦘭 from 𪽖.  I think this must be down to
> your platform's locale definition: it probably thinks that the former is
> a letter and the latter is not.  You'd have to gripe to the locale
> maintainers to get that fixed.
>
> regards, tom lane

PostgreSQL in general does not usually distinguish but full text search does:-

 select ts_debug('𦘭 from 𪽖');

gives the result:-

 ts_debug
---
 (word,"Word, all letters",𦘭,{english_stem},english_stem,{𦘭})
 (blank,"Space symbols"," ",{},,)
 (asciiword,"Word, all ASCII",from,{english_stem},english_stem,{})
 (blank,"Space symbols"," 𪽖",{},,)
(4 rows)

Somewhere there is dictionary, or library that is based on @ Unicode
4.0 which includes "𦘭","U+2662d" but not  "𫖂","U+2b582" which is
Unicode 5.1.

Also PUA characters are dropped in the same way by the full text
search, which is what google does but which I do not wish to do.

Regards
John


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extending range of to_tsvector et al

2012-09-30 Thread john knightley
On Mon, Oct 1, 2012 at 11:58 AM, Dan Scott  wrote:
> Hi John:
>
> On Sun, Sep 30, 2012 at 11:45 PM, john knightley
>  wrote:
>> Dear Dan,
>>
>> thank you for your reply.
>>
>> The OS I am using is Ubuntu 12.04, with PostgreSQL 9.1.5 installed on
>> a utf8 local
>>
>> A short 5 line dictionary file  is sufficient to test:-
>>
>> raeuz
>> 我们
>> 𦘭𥎵
>> 𪽖𫖂
>> 󶒘󴮬
>>
>> line 1 "raeuz" Zhuang word written using English letters and show up
>> under ts_vector ok
>> line 2 "我们" uses everyday Chinese word and show up under ts_vector ok
>> line 3 "𦘭𥎵" Zhuang word written using rather old Chinese charcters
>> found in Unicode 3.1 which came in about the year 2000  and show up
>> under ts_vector ok
>> line 4 "𪽖𫖂" Zhuang word written using rather old Chinese charcters
>> found in Unicode 5.2 which came in about the year 2009 but do not show
>> up under ts_vector ok
>> line 5 "󶒘󴮬" Zhuang word written using rather old Chinese charcters
>> found in PUA area of the font Sawndip.ttf but do not show up under
>> ts_vector ok (Font can be downloaded from
>> http://gdzhdb.l10n-support.com/sawndip-fonts/Sawndip.ttf)
>>
>> The last two words even though included in a dictionary do not get
>> accepted by ts_vector.
>
> Hmm. Fedora 17 x86-64 w/ PostgreSQL 9.1.5 here, the latter seems to
> work using the default text search configuration (albeit with one
> crucial note: I created the database with the "lc_ctype=C
> lc_collate=C" options):
>
> WORKING:
>
> createdb --template=template0 --lc-ctype=C --lc-collate=C foobar
> foobar=# select ts_debug('󶒘󴮬');
> ts_debug
> 
>  (word,"Word, all letters",󶒘󴮬,{english_stem},english_stem,{󶒘󴮬})
> (1 row)
>
> NOT WORKING AS EXPECTED:
>


>
> foobaz=# SHOW LC_CTYPE;
>   lc_ctype
> -
>  en_US.UTF-8
> (1 row)
>
> foobaz=# select ts_debug('󶒘󴮬');
> ts_debug
> -
>  (blank,"Space symbols",󶒘󴮬,{},,)
> (1 row)
>
> So... perhaps LC_CTYPE=C is a possible workaround for you?

LC_CTYPE would not be a work around - this database needs to be in
utf8 , the full text search is to be used for a mediawiki. Is this a
bug that is being worked on?

Regards
John


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Question regarding Sync message and unnamed portal

2012-09-30 Thread Tatsuo Ishii
> Tatsuo Ishii  writes:
>> From the manual:
>> "An unnamed portal is destroyed at the end of the transaction"
> 
> Actually, all portals are destroyed at end of transaction (unless
> they're from holdable cursors).  Named or not doesn't enter into it.

We need to fix the document then.

>> From these statements, I would think #4 will fail in the following
>> sequence of commands because #3 closes transaction and it destroys
>> unnamed portal: 1)Parse/Bind creates unnamed portal,
>> 2)Parse/Bind/Execute creates named portal and executes, 3)Send Sync
>> message (because it is required in extended protocol), 4)Execute
>> unnamed portal created in #1.
> 
> The right thing to use if you're trying to interleave portal executions
> like that is Flush, not Sync.  Sync mainly adds a protocol
> resynchronization point --- it's needed in case portal execution fails
> partway through.  (In which case you'll have lost both portals in the
> transaction abort anyway.)

Thanks for the suggestion. However, problem with using Flush is,
backend never sends "Ready for Query" until Sync is sent. For frontend
program "Ready for query" is important because 1) client knows session
state, 2) "Ready for query" is a command boundary as stated in
document.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] doc patch for increase in shared_buffers

2012-09-30 Thread Heikki Linnakangas

On 29.09.2012 22:13, Jeff Janes wrote:

The default value for shared_buffers was recently increased from 32MB
to 128MB, but the docs were not updated.


Thanks, applied.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers