Re: [HACKERS] logical changeset generation v6.7

2013-12-09 Thread Kyotaro HORIGUCHI

Hello, sorry for annoying you with meaningless questions.
Your explanation made it far clearer to me.

This will be the last message I mention on this patch..


On 2013-12-05 22:03:51 +0900, Kyotaro HORIGUCHI wrote:
- You assined HeapTupleGetOid(tuple) into relid to read in
  several points but no modification. Nevertheless, you left
  HeapTupleGetOid not replaced there. I think 'relid' just for
  repeated reading has far small merit compared to demerit of
  lowering readability. You'd be better to make them uniform in
  either way.
 
  It's primarily to get the line lengths halfway under control.

 Mm. I'm afraid I couldn't catch your words, do you mean that
 IsSystemClass() or isTempNamespace() could change the NULL bitmap
 in the tuple?

Huh? No. I just meant that the source code lines are longer if you  use
HeapTupleGetOid(tuple) instead of just relid. Anway, that patch  has
since been committed...


Really? Sorry for annoying you. Thank you, I understand that.


   = 0002:
  
- You are identifying the wal_level with the expr 

evel =

  WAL_LEVEL_LOGICAL' but it seems somewhat improper.
 
  Hm. Why?

 It actually does no harm and somewhat trifling so I don't 
 you should fix it.


 The reason for the comment is the greater values for vel
 are undefined at the moment, so strictly saying, such values
 should be handled as invalid ones.

Note that other checks for wal_level are written the same
way. Consider how much bigger this patch would be if every
usage of wal_level would need to get changed because a new
level had been added.


I know the objective. But it is not obvious that the future value
will need the process. Anyway we never know that until it
actually comes so I said it trifling.



- RelationIsAccessibleInLogicalDecoding and
  RelationIsLogicallyLogged are identical in code. Together with
  the name ambiguity, this is quite confising and cause of
  future misuse between these macros, I suppose. Plus the names
  seem too long.
 
  Hm, don't think they are equivalent, rather the contrary. Note one
  returns false if IsCatalogRelation(), the other true.

 Oops, I'm sorry. I understand they are not same. Then I have
 other questions. The name for the first one
 'RelationIsAccessibleInLogicalDecoding' doesn't seem representing
 what its comment reads.

 |  /* True if we need to log enough information to have access via
 |  decoding snapshot. */

 Making the macro name for this comment directly, I suppose it
 would be something like 'NeedsAdditionalInfoInLogicalDecoding' or
 more directly 'LogicalDeodingNeedsCids' or so..

The comment talks about logging enough information that it is accessible
- just as the name.


Though I'm not convinced for that. But since it also seems not so
wrong and you say so, I pretend to be convinced:-p


- In heap_insert, the information conveyed on rdata[3] seems to
  be better to be in rdata[2] because of the scarecity of the
  additional information. XLOG_HEAP_CONTAINS_NEW_TUPLE only
  seems to be needed. Also is in heap_multi_insert and
  heap_upate.
 
  Could you explain a bit more what you mean by that? The reason it's a
  separate rdata entry is that otherwise a full page write will remove the
  information.

 Sorry, I missed the comment 'so that an eventual FPW doesn't
 remove the tuple's data'. Although given the necessity of removal
 prevention, rewriting rdata[].buffer which is required by design
 (correct?)  with InvalidBuffer seems a bit outrageous for me and
 obfuscating the objective of it.

Well, it's added in a separate rdata element. Just as in dozens of other
places.


Mmmm. Was there any rdata entriy which has substantial content
but .buffer is set to InvalidBuffer just for avoiding removal by
fpw? Although for the objection I made, I became to be accostomed
to see there and I became to think it is not so bad.. I put an
end by this comment.


- In heap_delete, at the point adding replica identity, same to
  the aboves, rdata[3] could be moved into rdata[2] making new
  type like 'xl_heap_replident'.
 
  Hm. I don't think that'd be a good idea, because we'd then need special
  case decoding code for deletes because the wal format would be different
  for inserts/updates and deletes.

 Hmm. Although one common xl_heap_replident is impractical,
 splitting a logcally single entity into two or more XLogRecDatas
 also seems not to be a good idea.

That's done everywhere. There's basically two reasons to use separate
rdata elements:
* the buffers are different
* the data pointer is different

The rdata chain elements don't survive in the WAL.


Well, I came to see rdata's as simple containers holding
fragments to be written into WAL stream. Thanks for patiently
answering for such silly questions.


 Considering above, looking heap_xlog_insert(), you marked on
 xlrec.flags with XLOG_HEAP_CONTAINS_NEW_TUPLE to signal decoder
 that the record should have tuple data 

Re: [HACKERS] logical changeset generation v6.7

2013-12-05 Thread Kyotaro HORIGUCHI
Hello,

 Will send the rebased version as soon as I've addressed your comments.

Thank you.

  = 0001:
  
   - You assined HeapTupleGetOid(tuple) into relid to read in
 several points but no modification. Nevertheless, you left
 HeapTupleGetOid not replaced there. I think 'relid' just for
 repeated reading has far small merit compared to demerit of
 lowering readability. You'd be better to make them uniform in
 either way.
 
 It's primarily to get the line lengths halfway under control.

Mm. I'm afraid I couldn't catch your words, do you mean that
IsSystemClass() or isTempNamespace() could change the NULL bitmap
in the tuple?

  = 0002:
  
   - You are identifying the wal_level with the expr 'wal_level =
 WAL_LEVEL_LOGICAL' but it seems somewhat improper.
 
 Hm. Why?

It actually does no harm and somewhat trifling so I don't assert
you should fix it.

The reason for the comment is the greater values for wal_level
are undefined at the moment, so strictly saying, such values
should be handled as invalid ones. Although there is a practice
to avoid loop overruns by comparing counters with the expression
like (i  CEILING).

 For instance, I found a macro for which comment reads as follows
and I feel a bit uneasy with it :-) It's nothing more than that.

| /* Do we need to WAL-log information required only for Hot Standby? */

| #define XLogStandbyInfoActive() (wal_level = WAL_LEVEL_HOT_STANDBY)

   - RelationIsAccessibleInLogicalDecoding and
 RelationIsLogicallyLogged are identical in code. Together with
 the name ambiguity, this is quite confising and cause of
 future misuse between these macros, I suppose. Plus the names
 seem too long.
 
 Hm, don't think they are equivalent, rather the contrary. Note one
 returns false if IsCatalogRelation(), the other true.

Oops, I'm sorry. I understand they are not same. Then I have
other questions. The name for the first one
'RelationIsAccessibleInLogicalDecoding' doesn't seem representing
what its comment reads.

|  /* True if we need to log enough information to have access via
|  decoding snapshot. */

Making the macro name for this comment directly, I suppose it
would be something like 'NeedsAdditionalInfoInLogicalDecoding' or
more directly 'LogicalDeodingNeedsCids' or so..

   - In heap_insert, the information conveyed on rdata[3] seems to
 be better to be in rdata[2] because of the scarecity of the
 additional information. XLOG_HEAP_CONTAINS_NEW_TUPLE only
 seems to be needed. Also is in heap_multi_insert and
 heap_upate.
 
 Could you explain a bit more what you mean by that? The reason it's a
 separate rdata entry is that otherwise a full page write will remove the
 information.

Sorry, I missed the comment 'so that an eventual FPW doesn't
remove the tuple's data'. Although given the necessity of removal
prevention, rewriting rdata[].buffer which is required by design
(correct?)  with InvalidBuffer seems a bit outrageous for me and
obfuscating the objective of it.  Other mechanism should be
preferable, I suppose. The most straight way to do that should be
new flag bit for XLogRecData, say, survive_fpw or something.

   - In heap_multi_insert, need_cids referred only once so might be
 better removed.
 
 It's accessed in a loop over potentially quite some items, that's why I
 moved it into an extra variable.

Sorry bothering you with comments biside the point.. But the
scope of needs_cids is narrower than it is. I think the
definition should be moved into the block for 'if (needwal)'.

   - In heap_delete, at the point adding replica identity, same to
 the aboves, rdata[3] could be moved into rdata[2] making new
 type like 'xl_heap_replident'.
 
 Hm. I don't think that'd be a good idea, because we'd then need special
 case decoding code for deletes because the wal format would be different
 for inserts/updates and deletes.

Hmm. Although one common xl_heap_replident is impractical,
splitting a logcally single entity into two or more XLogRecDatas
also seems not to be a good idea.

   - In heapam_xlog.h, the new type xl_heap_header_len is
 inadequate in both of naming which is confising and
 construction on which the header in xl_heap_header is no
 longer be a header and indecisive member name 't_len'..
 
 The header bit in the name refers to the fact that it's containing
 information about the a HeapTuple's header, not that it's a header
 itself. Do you have a better suggestion than xl_heap_header_len?

Sorry, I'm confused during writing the comment, The order of
members in xl_heap_header_len doesn't matter.  I got the reason
for the xl_header_len and whole xlog record image after
re-reading the relevant code. The update record became to contain
two variable length data by this patch. So the length of the
tuple body cannot be calculated only with whole record length and
header lengths.

Considering above, looking 

Re: [HACKERS] logical changeset generation v6.7

2013-12-05 Thread Andres Freund
Hi,

On 2013-12-05 22:03:51 +0900, Kyotaro HORIGUCHI wrote:
- You assined HeapTupleGetOid(tuple) into relid to read in
  several points but no modification. Nevertheless, you left
  HeapTupleGetOid not replaced there. I think 'relid' just for
  repeated reading has far small merit compared to demerit of
  lowering readability. You'd be better to make them uniform in
  either way.
 
  It's primarily to get the line lengths halfway under control.

 Mm. I'm afraid I couldn't catch your words, do you mean that
 IsSystemClass() or isTempNamespace() could change the NULL bitmap
 in the tuple?

Huh? No. I just meant that the source code lines are longer if you use
HeapTupleGetOid(tuple) instead of just relid. Anway, that patch has
since been committed...

   = 0002:
  
- You are identifying the wal_level with the expr 'wal_level =
  WAL_LEVEL_LOGICAL' but it seems somewhat improper.
 
  Hm. Why?

 It actually does no harm and somewhat trifling so I don't assert
 you should fix it.

 The reason for the comment is the greater values for wal_level
 are undefined at the moment, so strictly saying, such values
 should be handled as invalid ones.

Note that other checks for wal_level are written the same way. Consider
how much bigger this patch would be if every usage of wal_level would
need to get changed because a new level had been added.

- RelationIsAccessibleInLogicalDecoding and
  RelationIsLogicallyLogged are identical in code. Together with
  the name ambiguity, this is quite confising and cause of
  future misuse between these macros, I suppose. Plus the names
  seem too long.
 
  Hm, don't think they are equivalent, rather the contrary. Note one
  returns false if IsCatalogRelation(), the other true.

 Oops, I'm sorry. I understand they are not same. Then I have
 other questions. The name for the first one
 'RelationIsAccessibleInLogicalDecoding' doesn't seem representing
 what its comment reads.

 |  /* True if we need to log enough information to have access via
 |  decoding snapshot. */

 Making the macro name for this comment directly, I suppose it
 would be something like 'NeedsAdditionalInfoInLogicalDecoding' or
 more directly 'LogicalDeodingNeedsCids' or so..

The comment talks about logging enough information that it is accessible
- just as the name.

- In heap_insert, the information conveyed on rdata[3] seems to
  be better to be in rdata[2] because of the scarecity of the
  additional information. XLOG_HEAP_CONTAINS_NEW_TUPLE only
  seems to be needed. Also is in heap_multi_insert and
  heap_upate.
 
  Could you explain a bit more what you mean by that? The reason it's a
  separate rdata entry is that otherwise a full page write will remove the
  information.

 Sorry, I missed the comment 'so that an eventual FPW doesn't
 remove the tuple's data'. Although given the necessity of removal
 prevention, rewriting rdata[].buffer which is required by design
 (correct?)  with InvalidBuffer seems a bit outrageous for me and
 obfuscating the objective of it.

Well, it's added in a separate rdata element. Just as in dozens of other
places.

- In heap_delete, at the point adding replica identity, same to
  the aboves, rdata[3] could be moved into rdata[2] making new
  type like 'xl_heap_replident'.
 
  Hm. I don't think that'd be a good idea, because we'd then need special
  case decoding code for deletes because the wal format would be different
  for inserts/updates and deletes.

 Hmm. Although one common xl_heap_replident is impractical,
 splitting a logcally single entity into two or more XLogRecDatas
 also seems not to be a good idea.

That's done everywhere. There's basically two reasons to use separate
rdata elements:
* the buffers are different
* the data pointer is different

The rdata chain elements don't survive in the WAL.

 Considering above, looking heap_xlog_insert(), you marked on
 xlrec.flags with XLOG_HEAP_CONTAINS_NEW_TUPLE to signal decoder
 that the record should have tuple data not being removed by fpw.
 This is the same for the update record. So the redoer(?) also can
 distinguish whether the update record contains extra tuple data
 or not.

But it doesn't know the length of the individual records, so knowing
there are two doesn't help.

 On the other hand, the update record after patched is longer by
 sizeof(uint16) regardless of whether 'tuple data' is attached or
 not. I don't know the value of the size in WAL stream, but the
 smaller would be better maybe.

I think that'd make things too complicated without too much gain in
comparison to the savings.

 # Of course, it doesn't matter if the object for OLD can
 # naturally be missing as English.

Well, I think the context makes it clear enough.

 I'm not a native English speaker as you know:-)

Neither am I ;). 

 undefining XLOG_HEAP_CONTAINS_OLD and use separte macros type 1
 |  if (xlrec-flags  XLOG_HEAP_CONTAINS_OLD_KEY ||
 |  

Re: [HACKERS] logical changeset generation v6.7

2013-12-04 Thread Kyotaro HORIGUCHI
Hello, this is cont'd comments.

 0008 and after to come later..

I had nothing to comment for patch 0008.

= 0009: 

 - In repl_scanner.l, you omitted double-doublequote handling for
   replication but it should be implemented. Zero-length
   identifier check might be needed depending on the upper-layer.

 - In walsender.c, the log messages Initiating logical rep..
   and Starting logical replication.. should be INFO or LOG in
   loglevel, not WARNING. And 'rep' in the former message would
   be better not abbreviated since not done so in the latter.

 - In walsender.c, StartLogicalReplication seems trying to abort
   itself for timeline change. But timeline changes in 9.3+ don't
   need such an aid. You'd better consult StartReplication in
   current master for detail. There might be other defferences.

 - In walsender.c, the typedef name WalSndSendData doesn't seem
   to be a function pointer. I suppose passing bare function
   pointer to WanSndLoop and WalSndDone is not a good deed. It'd
   be better to wrap it in any struct for callback, say,
   LogicalDecodingContext. It'd be even better if it could be a
   common struct with 'physycal' replication.

 - In walsender.c, I wonder if the differences are necessary
   between logical and physical replication in fetching latest
   WALs, construction of WAL sending loop and so on .. Logical
   walsender seems to be implimentated in somewhat ad-hoc way on
   the whole. I belive it could be more commonize in the base
   structure.

 - In procarray.c, the added two includes which is not
   accompanied by any other modification are needless. make emits
   no error or warning without them.

...Time's up. It'll be continued for later from 0010..

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.7

2013-12-04 Thread Andres Freund
On 2013-12-04 17:31:50 +0900, Kyotaro HORIGUCHI wrote:
 = 0009: 
 
  - In repl_scanner.l, you omitted double-doublequote handling for
replication but it should be implemented. Zero-length
identifier check might be needed depending on the upper-layer.

I am not sure what you mean here. IDENT can be double quoted, and so can
the option names?

  - In walsender.c, the log messages Initiating logical rep..
and Starting logical replication.. should be INFO or LOG in
loglevel, not WARNING. And 'rep' in the former message would
be better not abbreviated since not done so in the latter.

Agreed.

  - In walsender.c, StartLogicalReplication seems trying to abort
itself for timeline change. But timeline changes in 9.3+ don't
need such an aid. You'd better consult StartReplication in
current master for detail. There might be other defferences.

Timeline increases currently need work, yes, that error messgage is the
smallest part...

  - In walsender.c, the typedef name WalSndSendData doesn't seem
to be a function pointer. I suppose passing bare function
pointer to WanSndLoop and WalSndDone is not a good deed. It'd
be better to wrap it in any struct for callback, say,
LogicalDecodingContext. It'd be even better if it could be a
common struct with 'physycal' replication.

I don't see that as being realistic/advantageous. Wrapping a function
pointer in a struct doesn't improve anything in itself.

I was thinking we might want to just decouple the entire event loop and
not reuse that code, but that's ugly as well.

  - In walsender.c, I wonder if the differences are necessary
between logical and physical replication in fetching latest
WALs, construction of WAL sending loop and so on .. Logical
walsender seems to be implimentated in somewhat ad-hoc way on
the whole. I belive it could be more commonize in the base
structure.

That's because the xlogreader.h interface - over my loud protests -
doesn't support chunk-wise reading of the WAL stream and neccessicates
blocking inside the reader callback. So the event loop needs to be in
several individual functions (WalSndLoop, WalSndWaitForWal,
WalSndWriteData) instead of once in WalSndLoop…

  - In procarray.c, the added two includes which is not
accompanied by any other modification are needless. make emits
no error or warning without them.

Right. Will remove.

Thanks,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.7

2013-12-03 Thread Kyotaro HORIGUCHI
Hello, This is rather trivial and superficial comments as not
fully gripping functions of this patchset.

- Some patches have line offset to master. Rebase needed.

Other random comments follows,

= 0001:

 - You assined HeapTupleGetOid(tuple) into relid to read in
   several points but no modification. Nevertheless, you left
   HeapTupleGetOid not replaced there. I think 'relid' just for
   repeated reading has far small merit compared to demerit of
   lowering readability. You'd be better to make them uniform in
   either way.


   
= 0002:

 - You are identifying the wal_level with the expr 'wal_level =
   WAL_LEVEL_LOGICAL' but it seems somewhat improper.

 - In heap_insert, you added following comment and code,

* Also, if this is a catalog, we need to transmit combocids to
* properly decode, so log that as well.
*/
   need_tuple_data = RelationIsLogicallyLogged(relation);
   if (RelationIsAccessibleInLogicalDecoding(relation))
   log_heap_new_cid(relation, heaptup);

   Actually 'is a catalog' is checkied in
   RelationIsAcc...Decodeing() but this either of naming or
   commnet should be changed for consistent look. (I this the
   name of the macro is rather long but gives a vague
   illustration of the function..)


 - RelationIsAccessibleInLogicalDecoding and
   RelationIsLogicallyLogged are identical in code. Together with
   the name ambiguity, this is quite confising and cause of
   future misuse between these macros, I suppose. Plus the names
   seem too long.

 - In heap_insert, the information conveyed on rdata[3] seems to
   be better to be in rdata[2] because of the scarecity of the
   additional information. XLOG_HEAP_CONTAINS_NEW_TUPLE only
   seems to be needed. Also is in heap_multi_insert and
   heap_upate.

 - In heap_multi_insert, need_cids referred only once so might be
   better removed.

 - In heap_delete, at the point adding replica identity, same to
   the aboves, rdata[3] could be moved into rdata[2] making new
   type like 'xl_heap_replident'.

 - In heapam_xlog.h, the new type xl_heap_header_len is
   inadequate in both of naming which is confising and
   construction on which the header in xl_heap_header is no
   longer be a header and indecisive member name 't_len'..

 - In heapam_xlog.h, XLOG_HEAP_CONTAINS_OLD looks incomplete. And
   it seems to be used in nowhere in this patchset. It should be
   removed.

 - log_heap_new_cid() is called at several part just before other
   xlogs is being inserted. I suppose this should be built in the
   target xlog structures.

 - in RecovoerPreparedTransactions(), any commend needed for the
   reason calling XLogLogicalInfoActive()..

 - In xact.c, the comment for the member 'didLogXid' in
   TransactionStateData seems differ from it's meaning. It
   becomes true when any WAL record for the current transaction
   id just has been written to WAL buffer. So the comment,

/* has xid been included in WAL record? */

   would be better be something like (Should need corrected as
   I'm not native speaker.)

/* Any WAL record for this transaction has been emitted ? */

   and also the member name should be something like
   XidIsLogged. (Not so chaned?)

 - The name of the function MarkCurrentTransactionIdLoggedIfAny,
   although irregular abbreviations are discouraged, seems too
   long. Isn't MarkCur(r/rent)XidLoggedIfAny sufficient?  Anyway,
   the work involving this function seems would be better to be
   done in some other way..

 - The comment for RelationGetIndexAttrBitmap() should be edited
   for attrKind.

 - The macro name INDEX_ATTR_BITMAP_KEY should be
   INDEX_ATTR_BITMAP_FKEY. And INDEX_ATTR_BITMAP_IDENTITY_KEY
   should be INDEX_ATTR_BITMAP_REPLID_KEY or something.

 - In rel.h the member name 'rd_idattr' would be better being
   'rd_replidattr' or something like that.

= 0004:

 - Could the macro name 'RelationIsUsedAsCatalogTable' be as
   simple as IsUserCatalogRelation or something? It's from the
   viewpoint of not only simplicity but also similarity to other
   macro and/or functions having closer functionality. You
   already call the table 'user_catalog_table' in rel.h.

To be continued to next mail.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.7

2013-12-03 Thread Kyotaro HORIGUCHI
Hello, this is continued comments.

 = 0004:

 To be continued to next mail.


= 0005:

 - In heapam.c, it seems to be better replacing t_self only
   during logical decoding.

 - In GetOldestXmin(), the parameter name 'alreadyLocked' would
   be better being simplly 'nolock' since alreadyLocked seems to
   me suggesting that it will unlock the lock acquired beforehand.

 - Before that, In LogicalDecodingAcquireFreeSlot, lock window
   for procarray is extended after GetOldestXmin, but procarray
   does not seem to be accessed during the additional period. If
   you are holding this lock for the purpose other than accessing
   procArray, it'd be better to provide its own lock object.

 LWLockAcquire(ProcArrayLock, LW_SHARED);
 slot-effective_xmin = GetOldestXmin(true, true, true);
 slot-xmin = slot-effective_xmin;
 
 if (!TransactionIdIsValid(LogicalDecodingCtl-xmin) ||
   NormalTransactionIdPrecedes(slot-effective_xmin, 
LogicalDecodingCtl-xmin))
   LogicalDecodingCtl-xmin = slot-effective_xmin;
 LWLockRelease(ProcArrayLock);

 - In dropdb in dbcommands.c, ereport is invoked referring the
   result of LogicalDecodingCountDBSlots. But it seems better to
   me issueing this exception within LogicalDecodingCountDBSlots
   even if goto is required.

 - In LogStandbySnapshot in standby.c, two complementary
   conditions are imposed on two same unlocks. It might be
   somewhat paranoic but it is safer being like follows,

| XLogRecPtr  recptr = InvalidXLogRecPtr;
| 
|
| /* LogCurrentRunningXacts shoud be done before unlock when logical 
decoding*/ 
| if (wal_level = WAL_LEVEL_LOGICAL)
|recptr = LogCurrentRunningXacts(running);
| 
| LWLockRelease(ProcArrayLock);
| 
| if (recptr == InvalidXLogRecPtr)
|recptr = LogCurrentRunningXacts(running);
  
 - In tqual.c, in Setup/RevertFrom DecodingSnapshots, the name
   CatalogSnapshotData looks lacking unity with other
   Snapshot*Data's.

= 0007:

 - In heapam.c, the new global variable 'RecentGlobalDataXmin' is
   quite similar to 'RecentGlobalXmin' and does not represents
   what it is. The name should be
   changed. RecentGlobalNonCatalogXmin would be more preferable..

 - Althgough simplly from my teste, the following part in
   heapam.c

 if (IsSystemRelation(scan-rs_rd)
   || RelationIsAccessibleInLogicalDecoding(scan-rs_rd))
   heap_page_prune_opt(scan-rs_rd, buffer, RecentGlobalXmin);
 else
   heap_page_prune_opt(scan-rs_rd, buffer, RecentGlobalDataXmin);

   would be readable to be like,

 if (IsSystemRelation(scan-rs_rd)
   || RelationIsAccessibleInLogicalDecoding(scan-rs_rd))
   xmin = RecentGlobalXmin
 else
   xmin = RecentGlobalDataXmin
   heap_page_prune_opt(scan-rs_rd, buffer, xmin);

index_fetch_heap in indexam.c has similar part to this, and
you coded in latter style in IndexBuildHeapScan in index.c.

 - In procarray.c, you should add documentation for new parameter
   'systable' for GetOldestXmin. And the name 'systable' seems
   somewhat confusing, since its full-splled meaning is
   'including systables'. This name should be changed to
   'include_systable' or 'only_usertable' with inverting or
   something..

0008 and after to come later..

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.7

2013-12-03 Thread Andres Freund
On 2013-11-29 01:16:39 -0500, Robert Haas wrote:
 On Thu, Nov 14, 2013 at 8:46 AM, Andres Freund and...@2ndquadrant.com wrote:
  [ new patches ]
 
 Here's an updated version of patch #2.  I didn't really like the
 approach you took in the documentation, so I revised it.

Fair enough.

 Apart from that, I spent a lot of time looking at
 HeapSatisfiesHOTandKeyUpdate.  I'm not very happy with your changes.
 The idea seems to be that we'll iterate through all of the HOT columns
 regardless, but that might be very inefficient.  Suppose there are 100
 HOT columns, the last one is the only key column, and only the first
 one has been modified.  Once we look at #1 and determine that it's not
 HOT, we should zoom forward and skip over the next 98, and only look
 at the last one; your version does not behave like that.

Well, the hot bitmap will only contains indexed columns, so for that's
only going to happen if there's actually indexes over all those
columns. And in that case it seems unlikely that the performance
of that routine matters.
That said, keeping the old performance characteristics seems like a good
idea to me. Not sure anymore why I changed it that way.

  I've taken a crack at rewriting
 this logic, and the result looks cleaner and simpler to me, but I
 haven't tested it beyond the fact that it passes make check.  See what
 you think.

Hm. I think it actually will not abort early in call cases either, but
that looks fixable. Imagine what happens if id_attrs or key_attrs is
empty, ISTM that we'll check all hot columns in that case.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.7

2013-12-03 Thread Andres Freund
Hi,

On 2013-11-28 21:15:18 -0500, Robert Haas wrote:
 OK, I've committed the patch to adjust the definition of
 IsSystemRelation()/IsSystemClass() and add
 IsCatalogRelation()/IsCatalogClass().

Thanks for taking care of this!

  I kibitzed your decision about
 which function to use in a few places - specifically, I made all of
 the places that cared about allow_system_table_mods uses the IsSystem
 functions, and all the places that cared about invalidation messages
 use the IsCatalog functions.  I don't think any of these changes are
 more cosmetic, but I think it may reduce the chance of errors or
 inconsistencies in the face of future changes.

Agreed.

Do you think we need to do anything about the
ERROR:  cannot remove dependency on schema pg_catalog because it is a system 
object
thingy? Imo the current state is much more consistent than the earlier
one, but that's still a quite surprising leftover...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.7

2013-12-03 Thread Robert Haas
On Tue, Dec 3, 2013 at 8:24 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-11-28 21:15:18 -0500, Robert Haas wrote:
 OK, I've committed the patch to adjust the definition of
 IsSystemRelation()/IsSystemClass() and add
 IsCatalogRelation()/IsCatalogClass().

 Thanks for taking care of this!

  I kibitzed your decision about
 which function to use in a few places - specifically, I made all of
 the places that cared about allow_system_table_mods uses the IsSystem
 functions, and all the places that cared about invalidation messages
 use the IsCatalog functions.  I don't think any of these changes are
 more cosmetic, but I think it may reduce the chance of errors or
 inconsistencies in the face of future changes.

 Agreed.

 Do you think we need to do anything about the
 ERROR:  cannot remove dependency on schema pg_catalog because it is a system 
 object
 thingy? Imo the current state is much more consistent than the earlier
 one, but that's still a quite surprising leftover...

I don't feel obliged to change it, but I also don't see a reason not
to clean it up.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.7

2013-12-03 Thread Robert Haas
On Tue, Dec 3, 2013 at 8:18 AM, Andres Freund and...@2ndquadrant.com wrote:
  I've taken a crack at rewriting
 this logic, and the result looks cleaner and simpler to me, but I
 haven't tested it beyond the fact that it passes make check.  See what
 you think.

 Hm. I think it actually will not abort early in call cases either, but
 that looks fixable. Imagine what happens if id_attrs or key_attrs is
 empty, ISTM that we'll check all hot columns in that case.

Yeah, you're right.  I think the current logic will terminate when all
flags are set to false or all attribute numbers have been checked, but
it doesn't know that if HOT's been disproven then we needn't consider
further HOT columns.  I think the way to fix that is to tweak this
part:

+   if (next_hot_attnum  FirstLowInvalidHeapAttributeNumber)
check_now = next_hot_attnum;
+   else if (next_key_attnum  FirstLowInvalidHeapAttributeNumber)
+   check_now = next_key_attnum;
+   else if (next_id_attnum  FirstLowInvalidHeapAttributeNumber)
+   check_now = next_id_attnum;
else
+   break;

What I think we ought to do there is change each of those criteria to
say if (hot_result  next_hot_attnum 
FirstLowInvalidHeapAttributeNumber) and similarly for the other two.
That way we consider each set a valid source of attribute numbers only
until the result flag for that set flips false.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.7

2013-12-03 Thread Andres Freund
Hi,

On 2013-12-03 17:13:05 +0900, Kyotaro HORIGUCHI wrote:
 - Some patches have line offset to master. Rebase needed.

Will send the rebased version as soon as I've addressed your comments.

 = 0001:
 
  - You assined HeapTupleGetOid(tuple) into relid to read in
several points but no modification. Nevertheless, you left
HeapTupleGetOid not replaced there. I think 'relid' just for
repeated reading has far small merit compared to demerit of
lowering readability. You'd be better to make them uniform in
either way.

It's primarily to get the line lengths halfway under control.

 = 0002:
 
  - You are identifying the wal_level with the expr 'wal_level =
WAL_LEVEL_LOGICAL' but it seems somewhat improper.

Hm. Why?

  - RelationIsAccessibleInLogicalDecoding and
RelationIsLogicallyLogged are identical in code. Together with
the name ambiguity, this is quite confising and cause of
future misuse between these macros, I suppose. Plus the names
seem too long.

Hm, don't think they are equivalent, rather the contrary. Note one
returns false if IsCatalogRelation(), the other true.

  - In heap_insert, the information conveyed on rdata[3] seems to
be better to be in rdata[2] because of the scarecity of the
additional information. XLOG_HEAP_CONTAINS_NEW_TUPLE only
seems to be needed. Also is in heap_multi_insert and
heap_upate.

Could you explain a bit more what you mean by that? The reason it's a
separate rdata entry is that otherwise a full page write will remove the
information.

  - In heap_multi_insert, need_cids referred only once so might be
better removed.

It's accessed in a loop over potentially quite some items, that's why I
moved it into an extra variable.

  - In heap_delete, at the point adding replica identity, same to
the aboves, rdata[3] could be moved into rdata[2] making new
type like 'xl_heap_replident'.

Hm. I don't think that'd be a good idea, because we'd then need special
case decoding code for deletes because the wal format would be different
for inserts/updates and deletes.

  - In heapam_xlog.h, the new type xl_heap_header_len is
inadequate in both of naming which is confising and
construction on which the header in xl_heap_header is no
longer be a header and indecisive member name 't_len'..

The header bit in the name refers to the fact that it's containing
information about the a HeapTuple's header, not that it's a header
itself. Do you have a better suggestion than xl_heap_header_len?

  - In heapam_xlog.h, XLOG_HEAP_CONTAINS_OLD looks incomplete. And
it seems to be used in nowhere in this patchset. It should be
removed.

Not sure what you mean with incomplete? It contains the both possible
variants for an old contained tuple. The macro is used in the decoding,
but I don't think things get clearer if we revise the macros in that
later patch.

  - log_heap_new_cid() is called at several part just before other
xlogs is being inserted. I suppose this should be built in the
target xlog structures.

Proportionally it will only be logged in a absolute minority of the
cases (since normally the catalog will only seldomly be updated in
comparison to a user's tables), so it doesn't seem like a good idea to
complicate the already *horribly* complicated wal format for heap_*.

  - in RecovoerPreparedTransactions(), any commend needed for the
reason calling XLogLogicalInfoActive()..

It's pretty much the Test here must match one used in
AssignTransactionId() comment. We only want to allow overwriting if
AssignTransactionId() might already have done the SubTransSetParent()
calls.

  - In xact.c, the comment for the member 'didLogXid' in
TransactionStateData seems differ from it's meaning. It
becomes true when any WAL record for the current transaction
id just has been written to WAL buffer. So the comment,
 
 /* has xid been included in WAL record? */
 
would be better be something like (Should need corrected as
I'm not native speaker.)

 /* Any WAL record for this transaction has been emitted ? */

I don't think that'd be an improvement, transaction is a bit ambigiuous
there because it might be the toplevel or subtransaction.

and also the member name should be something like
XidIsLogged. (Not so chaned?)

Hm.

  - The name of the function MarkCurrentTransactionIdLoggedIfAny,
although irregular abbreviations are discouraged, seems too
long. Isn't MarkCur(r/rent)XidLoggedIfAny sufficient?

If you look at the other names in xact.h that doesn't seem to fit too
well in the naming pattern.

 Anyway,
the work involving this function seems would be better to be
done in some other way..

Why? How?

  - The comment for RelationGetIndexAttrBitmap() should be edited
for attrKind.

Good point.

  - The macro name INDEX_ATTR_BITMAP_KEY should be
INDEX_ATTR_BITMAP_FKEY. And INDEX_ATTR_BITMAP_IDENTITY_KEY
should be INDEX_ATTR_BITMAP_REPLID_KEY or 

Re: [HACKERS] logical changeset generation v6.7

2013-12-03 Thread Andres Freund
On 2013-12-03 19:15:53 +0900, Kyotaro HORIGUCHI wrote:
  - In heapam.c, it seems to be better replacing t_self only
during logical decoding.

I don't see what'd be gained by that except make the test matrix bigger
at no gain.

  - Before that, In LogicalDecodingAcquireFreeSlot, lock window
for procarray is extended after GetOldestXmin, but procarray
does not seem to be accessed during the additional period. If
you are holding this lock for the purpose other than accessing
procArray, it'd be better to provide its own lock object.

The comment above the part explains the reason:

/*
 * Acquire the current global xmin value and directly set the logical 
xmin
 * before releasing the lock if necessary. We do this so wal decoding is
 * guaranteed to have all catalog rows produced by xacts with an xid 
 * walsnd-xmin available.
 *
 * We can't use ComputeLogicalXmin here as that acquires ProcArrayLock
 * separately which would open a short window for the global xmin to
 * advance above walsnd-xmin.
 */

  - In dropdb in dbcommands.c, ereport is invoked referring the
result of LogicalDecodingCountDBSlots. But it seems better to
me issueing this exception within LogicalDecodingCountDBSlots
even if goto is required.

What if LogicalDecodingCountDBSlots() is needed in other places? That
seems like a layering violation to me.

  - In LogStandbySnapshot in standby.c, two complementary
conditions are imposed on two same unlocks. It might be
somewhat paranoic but it is safer being like follows,

I don't see an advantage in that.

  - In tqual.c, in Setup/RevertFrom DecodingSnapshots, the name
CatalogSnapshotData looks lacking unity with other
Snapshot*Data's.

That part needs a bit of work, agreed.

 = 0007:
 
  - In heapam.c, the new global variable 'RecentGlobalDataXmin' is
quite similar to 'RecentGlobalXmin' and does not represents
what it is. The name should be
changed. RecentGlobalNonCatalogXmin would be more preferable..

Hm. It's a mighty long name... but it indeed is clearer.

  - Althgough simplly from my teste, the following part in
heapam.c
 
  if (IsSystemRelation(scan-rs_rd)
  || RelationIsAccessibleInLogicalDecoding(scan-rs_rd))
  heap_page_prune_opt(scan-rs_rd, buffer, RecentGlobalXmin);
  else
  heap_page_prune_opt(scan-rs_rd, buffer, RecentGlobalDataXmin);
 
would be readable to be like,
 
  if (IsSystemRelation(scan-rs_rd)
  || RelationIsAccessibleInLogicalDecoding(scan-rs_rd))
xmin = RecentGlobalXmin
  else
xmin = RecentGlobalDataXmin
  heap_page_prune_opt(scan-rs_rd, buffer, xmin);

Well, it requires introducing a new variable (which better not be named
xmin, but OldestXmin or similar). But I don't really care.

 index_fetch_heap in indexam.c has similar part to this, and
 you coded in latter style in IndexBuildHeapScan in index.c.

It's different there, because we do an explicit GetOldestXmin() call
there which we surely don't want to do twice.

 0008 and after to come later..

Thanks for your review!

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.7

2013-11-28 Thread Robert Haas
On Thu, Nov 14, 2013 at 8:46 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-11-12 18:50:33 +0100, Andres Freund wrote:
  You've actually changed the meaning of this section (and not in a good 
  way):
 
   be set at server start. varnamewal_level/ must be set
  -to literalarchive/ or literalhot_standby/ to allow
  -connections from standby servers.
  +to literalarchive/, literalhot_standby/ or 
  literallogical/
  +to allow connections from standby servers.
 
  I think that the previous text meant that you needed archive - or, if
  you want to allow connections, hot_standby.  The new text loses that
  nuance.

 Yea, that's because it was lost on me in the first place...

 I think that's because the nuance isn't actually in the text - note that
 it is talking about max_wal_senders and talking about connections
 *from*, not *to* standby servers.
 I've reformulated the wal_level paragraph and used or higher in
 several places now.

 Ok, so here's a rebased version of this. I tried to fix all the issues
 you mentioned, and it's based on the split off IsSystemRelation() patch,
 I've sent yesterday (included here).

OK, I've committed the patch to adjust the definition of
IsSystemRelation()/IsSystemClass() and add
IsCatalogRelation()/IsCatalogClass().  I kibitzed your decision about
which function to use in a few places - specifically, I made all of
the places that cared about allow_system_table_mods uses the IsSystem
functions, and all the places that cared about invalidation messages
use the IsCatalog functions.  I don't think any of these changes are
more cosmetic, but I think it may reduce the chance of errors or
inconsistencies in the face of future changes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] logical changeset generation v6.7

2013-11-27 Thread Fabrízio de Royes Mello
On Thu, Nov 14, 2013 at 11:46 AM, Andres Freund and...@2ndquadrant.com
wrote:

 On 2013-11-12 18:50:33 +0100, Andres Freund wrote:
   You've actually changed the meaning of this section (and not in a
good way):
  
be set at server start. varnamewal_level/ must be set
   -to literalarchive/ or literalhot_standby/ to allow
   -connections from standby servers.
   +to literalarchive/, literalhot_standby/ or
literallogical/
   +to allow connections from standby servers.
  
   I think that the previous text meant that you needed archive - or, if
   you want to allow connections, hot_standby.  The new text loses that
   nuance.
 
  Yea, that's because it was lost on me in the first place...

 I think that's because the nuance isn't actually in the text - note that
 it is talking about max_wal_senders and talking about connections
 *from*, not *to* standby servers.
 I've reformulated the wal_level paragraph and used or higher in
 several places now.

 Ok, so here's a rebased version of this. I tried to fix all the issues
 you mentioned, and it's based on the split off IsSystemRelation() patch,
 I've sent yesterday (included here).


Hello,

I'm trying to apply the patches but show some warnings/errors:

$ gunzip -c
/home/fabrizio/Downloads/0002-wal_decoding-Add-wal_level-logical-and-log-data-requ.patch.gz
| git apply -
warning: src/backend/access/transam/xlog.c has type 100755, expected 100644

$ gunzip -c
/home/fabrizio/Downloads/0005-wal_decoding-Introduce-wal-decoding-via-catalog-time.patch.gz
| git apply -
warning: src/backend/access/transam/xlog.c has type 100755, expected 100644

$ gunzip -c
/home/fabrizio/Downloads/0006-wal_decoding-Implement-VACUUM-FULL-CLUSTER-support-v.patch.gz
| git apply -
warning: src/backend/access/transam/xlog.c has type 100755, expected 100644

$ gunzip -c
/home/fabrizio/Downloads/0007-wal_decoding-Only-peg-the-xmin-horizon-for-catalog-t.patch.gz
| git apply -
warning: src/backend/access/transam/xlog.c has type 100755, expected 100644

$ gunzip -c
/home/fabrizio/Downloads/0011-wal_decoding-pg_recvlogical-Introduce-pg_receivexlog.patch.gz
| git apply -
error: patch failed: src/bin/pg_basebackup/streamutil.c:210
error: src/bin/pg_basebackup/streamutil.c: patch does not apply

The others are applied correctly. The permission warning must be fixed and
0011 bust be rebased.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello