Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Andres Freund
On 2013-04-17 18:16:36 -0700, Daniel Farina wrote:
 The original paper is often shorthanded Castagnoli 93, but it exists
 in the IEEE's sphere of influence and is hard to find a copy of.
 Luckily, a pretty interesting survey paper discussing some of the
 issues was written by Koopman in 2002 and is available:
 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.11.8323 As a
 pedagogical note, it's a pretty interesting and accessible piece of
 writing (for me, as someone who knows little of error
 detection/correction) and explains some of the engineering reasons
 that provoke such exercises.

http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=231911&userType=inst

There's also a Koopman paper from 2004 that's interesting.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Andres Freund
On 2013-04-18 00:44:02 +0300, Ants Aasma wrote:
 I went ahead and coded up both the parallel FNV-1a and parallel FNV-1a
 + srl1-xor variants and ran performance tests and detection rate tests
 on both.
 
 Performance results:
 Mul-add checksums: 12.9 bytes/s
 FNV-1a checksums: 13.5 bytes/s
 FNV-1a + srl-1: 7.4 bytes/s
 
 Detection rates:
 False positive rates:
                    Add-mul        FNV-1a     FNV-1a + srl-1
 Single bit flip:   1:inf          1:129590   1:64795
 Double bit flip:   1:148          1:511      1:53083
 Triple bit flip:   1:673          1:5060     1:61511
   Quad bit flip:   1:1872         1:19349    1:68320
 Write 0x00 byte:   1:774538137    1:118776   1:68952
 Write 0xFF byte:   1:165399500    1:137489   1:68958
   Partial write:   1:59949        1:71939    1:89923
   Write garbage:   1:64866        1:64980    1:67732
 Write run of 00:   1:57077        1:61140    1:59723
 Write run of FF:   1:63085        1:59609    1:62977
 
 Test descriptions:
 N bit flip: picks N random non-overlapping bits and flips their value.
 Write X byte: overwrites a single byte with X.
 Partial write: picks a random cut point, overwrites everything from
 there to end with 0x00.
 Write garbage/run of X: picks two random cut points and fills
 everything in between with random values/X bytes.
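
The corruption modes listed in the test descriptions can be sketched roughly as below. This is only an illustration in Python, not the harness that produced the numbers; the function names, the 8 KiB `PAGE_SIZE`, and the use of `random.Random` are all assumptions.

```python
import random
from typing import Optional

PAGE_SIZE = 8192  # assumed page size, for illustration only


def flip_bits(page: bytes, n: int, rng: random.Random) -> bytes:
    """Flip n distinct, randomly chosen bits of the page."""
    buf = bytearray(page)
    for pos in rng.sample(range(len(buf) * 8), n):
        buf[pos // 8] ^= 1 << (pos % 8)
    return bytes(buf)


def write_byte(page: bytes, value: int, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte with the given value."""
    buf = bytearray(page)
    buf[rng.randrange(len(buf))] = value
    return bytes(buf)


def partial_write(page: bytes, rng: random.Random) -> bytes:
    """Overwrite everything from a random cut point to the end with 0x00."""
    cut = rng.randrange(len(page))
    return page[:cut] + bytes(len(page) - cut)


def write_run(page: bytes, rng: random.Random,
              fill: Optional[int] = None) -> bytes:
    """Fill the span between two random cut points with random bytes
    (fill=None) or with a constant byte value."""
    a, b = sorted(rng.sample(range(len(page) + 1), 2))
    if fill is None:
        run = bytes(rng.randrange(256) for _ in range(b - a))
    else:
        run = bytes([fill]) * (b - a)
    return page[:a] + run + page[b:]
```

A detection-rate test would apply one of these to a page, recompute the checksum, and count how often it still matches the original.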

I don't think this table is complete without competing numbers for
truncated crc-32. Any chance to get that?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Simon Riggs
On 17 April 2013 22:36, Bruce Momjian br...@momjian.us wrote:

  I would like to know the answer of how an upgrade from checksum to
  no-checksum would behave so I can modify pg_upgrade to allow it.

 Why? 9.3 pg_upgrade certainly doesn't need it. When we get to 9.4, if
 someone has checksums enabled and wants to disable it, why is pg_upgrade
 the right time to do that? Wouldn't it make more sense to allow them to
 do that at any time?

 Well, right now, pg_upgrade is the only way you could potentially turn
 off checksums.  You are right that we might eventually want a command,
 but my point is that we currently have a limitation in pg_upgrade that
 might not be necessary.

We don't currently have checksums, so pg_upgrade doesn't need to cope
with turning them off in 9.3.

For 9.4, it might, but likely we'll have a tool to turn them off
before then anyway.

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] event trigger API documentation?

2013-04-18 Thread Dimitri Fontaine
Peter Eisentraut pete...@gmx.net writes:
 Offhand, that seems about enough, but I'm just beginning to explore.

I'm interested in hearing about any such use case…

 Chances are, event triggers will end up somewhere near the top of the
 release announcements, so we should have a consistent message about what
 to do with them and how to use them.  If for now, we say, we only
 support writing them in PL/pgSQL, and here is how to do that, and here
 are some examples, that's fine.  But currently, it's not quite clear.

I would prefer that we stay silent about them for another release (or a
couple?), because we haven't yet reached the feature set that I consider
the bare minimum. In my view 9.3 only has the code infrastructure to
prepare for the ability to implement Event Triggers later. That this
infrastructure already allows you to do some things with it is like a
proof of concept.

 Surely you had some use cases in mind when you set out to implement
 this.  What were they, and where are we now in relation to them?

I have mainly 4 use cases for them, and none of them are possible to
implement in 9.3:

  - audit (separate log or audit tables for committed-only actions)
  - ddl support for replications (trigger based, logical rep.)
  - ddl extensibility
  - apt-get for extensions without dynamically loaded module

The extension item is an example of the more general ddl
extensibility item. Don't worry about ever seeing a patch to core for
implementing it; the whole Event Trigger and Extension Templates
exercise is meant to allow for coding those kinds of crazy ideas out of
core.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Daniel Farina
On Wed, Apr 17, 2013 at 11:08 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-04-17 18:16:36 -0700, Daniel Farina wrote:
 The original paper is often shorthanded Castagnoli 93, but it exists
 in the IEEE's sphere of influence and is hard to find a copy of.
 Luckily, a pretty interesting survey paper discussing some of the
 issues was written by Koopman in 2002 and is available:
 http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.11.8323 As a
 pedagogical note, it's a pretty interesting and accessible piece of
 writing (for me, as someone who knows little of error
 detection/correction) and explains some of the engineering reasons
 that provoke such exercises.

 http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=231911&userType=inst

 There's also a Koopman paper from 2004 that's interesting.

Having read the 2002 paper more, it seems that the current CRC32
doesn't have a whole lot going for it: CRC32C pretty much cleans its
clock across the board (I don't understand detected Hamming Distance
that seem greater than the information content of the message, e.g. HD
14 with 8 bit messages as seen in CRC32C: that's where CRC32 can win).

CRC32C looks, all in all, the most flexible, because detection of
Hamming Distance 4 spans from 5244-131072 bits (the upper range of
which is a full 16KiB!) and there is superior Hamming Distance
detection on shorter messages up until the point where it seems like
the Hamming Distance able to be detected is larger than the message
size itself (e.g. HD 13 on an 8 bit message).  I'm not sure if this is
an error in my understanding, or what.

Also, larger runs (16KB) are better served by CRC32C: even the
probably-best contender I can see (0xD419CC15) drops to Hamming
Distance 2-detection right after 65505 bits.  CRC32C has the biggest
range at HD4, although Koopman 0xBA0DC66B comes close, gaining superior
Hamming distance detection for 178-16360 bits (the upper end of this
range is short of 2KiB by 3 bytes).

All in all, there is no reason I can see to keep CRC32 at all, vs
CRC32C on the basis of error detection alone, so putting aside all the
business about instruction set architecture, I think a software CRC32C
in a vacuum can be seen as a robustness improvement.

There may be polynomials that are not CRC32 or CRC32C that one might
view as having slightly better tradeoffs as seen in Table 1 of Koopman
2002, but it's kind of a stretch: being able to handle 8KB and 16KB at
HD4, as seen in CRC32C, is awfully compelling to me.  Koopman
0xBA0DC66B can admirably reach HD6 on a much larger range, up to 16360
bits, which is ever so shy of 2KiB.  Castagnoli 0xD419CC15 can detect
HD 5 up to 31 bits short of 8KB.

Corrections welcome on my interpretations of Tbl 1.




Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Ants Aasma
On Thu, Apr 18, 2013 at 5:08 AM, Greg Smith g...@2ndquadrant.com wrote:
 On 4/17/13 8:56 PM, Ants Aasma wrote:

 Nothing from the two points, but the CRC calculation algorithm can be
 switched out for slice-by-4 or slice-by-8 variant. Speed up was around
 factor of 4 if I remember correctly...I can provide you

 with a patch of the generic version of any of the discussed algorithms
 within an hour, leaving plenty of time in beta or in 9.4 to
 accommodate the optimized versions.

 Can you nail down a solid, potential for commit slice-by-4 or slice-by-8
 patch then?  You dropped into things like per-byte overhead to reach this
 conclusion, which was fine to let the methods battle each other. Maybe I
 missed it, but I didn't remember seeing an obvious full patch for this
 implementation then come back up from that.  With the schedule pressure this
 needs to return to more database-level tests.  Your concerns about the
 committed feature being much slower than the original Fletcher one are
 troubling, and we might as well do that showdown again now with the best of
 the CRC implementations you've found.

I meant that any of the fast ones is easy to nail down. The sped-up
slice-by-8 is slightly trickier to clean up. Especially if anyone
expects it to accelerate WAL calculation, then it brings up a whole
bunch of design questions on how to handle alignment issues. For
performance testing what is attached should work fine, though it would
still need some cleanup.

 It's fair that you're very concerned about (1), but I wouldn't give it 100%
 odds of happening either.  The user demand that's motivated me to work on
 this will be happy with any of (1) through (3), and in two of them
 optimizing the 16 bit checksums now turns out to be premature.

Fair enough, although I'd like to point out the optimization is
premature only in the sense that the effort might go to waste. The
checksum function is a self-contained, easy to test and very low
maintenance piece of code - not the usual premature optimization risk.

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


crc32c-sb8-checksum.v0.patch
Description: Binary data



Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Bruce Momjian
On Thu, Apr 18, 2013 at 09:17:39AM +0100, Simon Riggs wrote:
 On 17 April 2013 22:36, Bruce Momjian br...@momjian.us wrote:
 
   I would like to know the answer of how an upgrade from checksum to
   no-checksum would behave so I can modify pg_upgrade to allow it.
 
  Why? 9.3 pg_upgrade certainly doesn't need it. When we get to 9.4, if
  someone has checksums enabled and wants to disable it, why is pg_upgrade
  the right time to do that? Wouldn't it make more sense to allow them to
  do that at any time?
 
  Well, right now, pg_upgrade is the only way you could potentially turn
  off checksums.  You are right that we might eventually want a command,
  but my point is that we currently have a limitation in pg_upgrade that
  might not be necessary.
 
 We don't currently have checksums, so pg_upgrade doesn't need to cope
 with turning them off in 9.3

True, 9.2 doesn't have checksums, while 9.3 will.  One point is that
pg_upgrade could actually be used to turn off checksums for 9.3 to 9.3
upgrades if no tablespaces are used.

 For 9.4, it might, but likely we'll have a tool to turn them off
 before then anyway.

True.   Would we want pg_upgrade to still enforce matching checksum
modes for old and new servers at that point?  Eventually we will have to
decide that.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] event trigger API documentation?

2013-04-18 Thread Alvaro Herrera
Dimitri Fontaine wrote:
 Peter Eisentraut pete...@gmx.net writes:
  Offhand, that seems about enough, but I'm just beginning to explore.
 
 I'm interested in hearing about any such use case…
 
  Chances are, event triggers will end up somewhere near the top of the
  release announcements, so we should have a consistent message about what
  to do with them and how to use them.  If for now, we say, we only
  support writing them in PL/pgSQL, and here is how to do that, and here
  are some examples, that's fine.  But currently, it's not quite clear.
 
 I would prefer that we're silent about them for another (couple of?)
 release,

You can be as silent as you want in marketing materials (though
maybe Berkus will disagree with you about being silent there), but it is
not admissible to be silent in the documentation or pretend the feature
is not there.  Whatever got committed, however small, needs to be
properly documented.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] event trigger API documentation?

2013-04-18 Thread Peter Eisentraut
On 4/18/13 5:05 AM, Dimitri Fontaine wrote:
 Peter Eisentraut pete...@gmx.net writes:
  Offhand, that seems about enough, but I'm just beginning to explore.
 I'm interested in hearing about any such use case…

Without going into too many details (because I don't have them yet), I
was thinking about triggering an external test suite whenever there is a
schema change in the database.






Re: [HACKERS] event trigger API documentation?

2013-04-18 Thread Dimitri Fontaine
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 You can be as silent as you want in marketing materials (though
 maybe Berkus will disagree with you about being silent there), but it is
 not admissible to be silent in the documentation or pretend the feature
 is not there.  Whatever got committed, however small, needs to be
 properly documented.

Definitely, yes.

The only questions in this thread are:

  - only docs or docs + contrib example?

Tom said it's too late for the contrib example.

  - what about support for PLs other than C and PLpgSQL?

It used to be part of the patch, and I don't understand the development
calendar well enough to guess whether I'm supposed to extract that from
the earlier patch or whether that's too late for 9.3. I'm not sure
what Peter's ideas are wrt the calendar here.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] event trigger API documentation?

2013-04-18 Thread Dimitri Fontaine
Peter Eisentraut pete...@gmx.net writes:
 Without going into too many details (because I don't have them yet), I
 was thinking about triggering an external test suite whenever there is a
 schema change in the database.

So if all you want to know is that something did change in the
schema so as to trigger your action, yes you can do it. I would go as
far as to propose that you consider registering an event in a PGQ queue
at the time when the ddl event occurs, so that you can have your test
suite run be triggered from outside the database at its leisure.

If you want to stay within the PostgreSQL offering proper, a NOTIFY
would do, and you can do that in PLpgSQL too.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] event trigger API documentation?

2013-04-18 Thread Alvaro Herrera
Dimitri Fontaine wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  You can be as silent as you want in marketing materials (though
  maybe Berkus will disagree with you about being silent there), but it is
  not admissible to be silent in the documentation or pretend the feature
  is not there.  Whatever got committed, however small, needs to be
  properly documented.
 
 Definitely, yes.
 
 The only questions in this thread are:
 
   - only docs or docs + contrib example?
 
 Tom said it's too late for the contrib example.

So there's already an answer to this question, isn't there.

   - what about support for PLs other than C and PLpgSQL?
 
 It used to be part of the patch, and I don't understand the development
 calendar well enough to guess whether I'm supposed to extract that from
 the earlier patch or whether that's too late for 9.3. I'm not sure
 what Peter's ideas are wrt the calendar here.

It seems far too late for more code at this stage.  IMO anyway.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] (auto)vacuum truncate exclusive lock

2013-04-18 Thread Jan Wieck

On 4/12/2013 1:57 PM, Tom Lane wrote:
 Kevin Grittner kgri...@ymail.com writes:
  Tom Lane t...@sss.pgh.pa.us wrote:
   I think that the minimum appropriate fix here is to revert the hunk
   I quoted, ie take out the suppression of stats reporting and analysis.

  I'm not sure I understand -- are you proposing that is all we do
  for both the VACUUM command and autovacuum?

 No, I said that was the minimum fix.

 Looking again at the patch, I note this comment:

 /*
 +* We failed to establish the lock in the specified number of
 +* retries. This means we give up truncating. Suppress the
 +* ANALYZE step. Doing an ANALYZE at this point will reset the
 +* dead_tuple_count in the stats collector, so we will not get
 +* called by the autovacuum launcher again to do the truncate.
 +*/

 and I suppose the rationale for suppressing the stats report was this
 same idea of lying to the stats collector in order to encourage a new
 vacuum attempt to happen right away.  Now I'm not sure that that's a
 good idea at all --- what's the reasoning for thinking the table will be
 any less hot in thirty seconds?  But if it is reasonable, we need a
 redesign of the reporting messages, not just a hack to not tell the
 stats collector what we did.


Yes, that was the rationale behind it combined with don't change 
function call sequences and more all over the place.


The proper solution would eventually be to add a block number to the
stats held by the stats collector, which preserves the information about
the last filled block of the table. Then decouple the truncate from
vacuum/autovacuum. vacuum/autovacuum will set that block number when
they detect the trailing free space. The analyze step can happen just as
usual and reset stats, which doesn't reset that block number. The
autovacuum launcher goes through its normal logic for launching autovac
or autoanalyze. If it doesn't find any of those to do but the
last-used-block is set, it launches the separate lazy truncate step.


This explicitly moves the truncate, which inherently requires the 
exclusive lock and therefore is undesirable even in a manual vacuum, 
into the background.
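
The proposed flow could look roughly like the Python sketch below. It is purely illustrative: `TableStats`, its fields, the thresholds, and the action names are hypothetical and do not correspond to existing stats collector or autovacuum code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TableStats:
    # Hypothetical stand-ins for per-table stats collector data.
    dead_tuples: int = 0
    changed_tuples: int = 0
    last_used_block: Optional[int] = None  # set when trailing free space is seen


def record_vacuum(stats: TableStats, last_filled_block: Optional[int]) -> None:
    """Vacuum resets dead tuples and notes where trailing free space begins."""
    stats.dead_tuples = 0
    stats.last_used_block = last_filled_block


def record_analyze(stats: TableStats) -> None:
    """Analyze resets its own counter but leaves last_used_block untouched."""
    stats.changed_tuples = 0


def next_action(stats: TableStats, vac_threshold: int = 50,
                ana_threshold: int = 50) -> Optional[str]:
    """Pick the launcher's next job; truncate only when nothing else is due."""
    if stats.dead_tuples > vac_threshold:
        return "vacuum"
    if stats.changed_tuples > ana_threshold:
        return "analyze"
    if stats.last_used_block is not None:
        return "lazy_truncate"
    return None
```

The point of the sketch is only the ordering: the truncate step is chosen after vacuum and analyze, and an analyze never clears the remembered block number.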



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin




Re: [HACKERS] (auto)vacuum truncate exclusive lock

2013-04-18 Thread Jan Wieck

On 4/12/2013 2:08 PM, Alvaro Herrera wrote:
 Tom Lane wrote:
  Are you saying you intend to revert that whole concept?  That'd be
  okay with me, I think.  Otherwise we need some thought about how to
  inform the stats collector what's really happening.

 Maybe what we need is to consider table truncation as a separate
 activity from vacuuming.  Then autovacuum could invoke it without having
 to do a full-blown vacuum.  For this to work, I guess we would like to
 separately store the status of the back-scan in pgstat somehow (I think
 a boolean flag suffices: were we able to truncate all pages that
 appeared to be empty?)


Should have read the entire thread before responding :)


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin




Re: [HACKERS] (auto)vacuum truncate exclusive lock

2013-04-18 Thread Jan Wieck

On 4/18/2013 11:44 AM, Jan Wieck wrote:
 Yes, that was the rationale behind it combined with don't change
 function call sequences and more all over the place.

function call signatures

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin




Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Ants Aasma
On Thu, Apr 18, 2013 at 5:57 PM, Ants Aasma a...@cybertec.at wrote:
 I'll generate an avalanche diagram for CRC32C too, but it will take a
 while even if I use a smaller dataset.

Well that was useless... In a CRC, flipping each bit in the input flips
a preset pattern of bits in the output regardless of the actual data on
the page. Some stats for CRC32C - input bits affect 28344 different
bit combinations. Count of bits by number of duplicated bit patterns:
[(1, 8868),
 (2, 17722),
 (3, 17775),
 (4, 12048),
 (5, 5725),
 (6, 2268),
 (7, 875),
 (8, 184),
 (9, 45),
 (10, 10),
 (16, 16)]

Count of bit positions by number of bit-positions affected:
[(0, 16),
 (1, 25),
 (3, 1185),
 (5, 8487),
 (7, 22970),
 (9, 22913),
 (11, 8790),
 (13, 1119),
 (15, 31)]

A map of the number of bit positions affected, with 8 being black and
0 or 16 being red, is attached.

I'm not sure if the issues with partial writes are somehow related to this.
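
The data-independence of a bit flip's effect can be checked with a few lines of Python. This is a sketch under assumptions: it uses a plain bitwise CRC-32C (reflected polynomial 0x82F63B78, the usual init and final XOR), not the slice-by-8 code from the patch, and `flip_effect` is a hypothetical helper name.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def flip_effect(page: bytes, bit: int) -> int:
    """XOR difference in the checksum caused by flipping one input bit."""
    flipped = bytearray(page)
    flipped[bit // 8] ^= 1 << (bit % 8)
    return crc32c(page) ^ crc32c(bytes(flipped))


page_a = bytes(64)            # all zeros
page_b = bytes(range(64))     # different content, same length
# The same bit position flips the same output pattern on both pages.
assert flip_effect(page_a, 100) == flip_effect(page_b, 100)
```

For inputs of equal length the pattern depends only on the position of the flipped bit, which is why an avalanche diagram over real page data shows nothing new.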

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
attachment: effect-random-crc.png


Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Jeff Davis
On Wed, 2013-04-17 at 20:21 -0400, Greg Smith wrote:
 -Original checksum feature used Fletcher checksums.  Its main problems, 
 to quote wikipedia, include that it cannot distinguish between blocks 
 of all 0 bits and blocks of all 1 bits.

That is fairly easy to fix by using a different modulus: 251 vs 255.
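
A small sketch of the effect of the modulus, using the textbook byte-wise Fletcher-16 rather than the exact variant in the patch (an assumption; the `modulus` parameter is added here only for illustration):

```python
def fletcher16(data: bytes, modulus: int = 255) -> int:
    """Textbook Fletcher-16 with a configurable modulus."""
    sum1 = sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % modulus
        sum2 = (sum2 + sum1) % modulus
    return (sum2 << 8) | sum1


zeros = bytes(64)
ones = b"\xff" * 64

# With modulus 255, a 0xFF byte is congruent to 0, so the blocks collide.
assert fletcher16(zeros) == fletcher16(ones)
# With the prime modulus 251 they no longer collide.
assert fletcher16(zeros, 251) != fletcher16(ones, 251)
```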

Regards,
Jeff Davis





Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Florian Weimer
* Greg Smith:

 The TCP/IP checksum spec is at https://tools.ietf.org/html/rfc793 ;
 its error detection limitations are described at
 http://www.noahdavids.org/self_published/CRC_and_checksum.html ; and a
 good article about optimizing its code is at
 http://www.locklessinc.com/articles/tcp_checksum/  I'll take a longer
 look at whether it's an improvement on the Fletcher-16 used by the
 current patch.

The TCP checksum is too weak to be practical.  Every now and then, I
see data transfers where the checksum is valid, but the content
contains bit flips.  Anything that flips bits randomly at intervals
which are multiples of 16 bits is quite likely to pass through
checksum detection.
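
To illustrate the weakness, here is a sketch of the 16-bit ones' complement checksum in the style of RFC 1071 (the function name and sample data are made up for the example). Two opposite bit flips exactly 16 bits apart, or two swapped 16-bit words, leave the checksum unchanged:

```python
def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = bytes.fromhex("00010100")   # words 0x0001, 0x0100
corrupted = bytes.fromhex("00000101")  # lowest bit of each word flipped
assert internet_checksum(original) == internet_checksum(corrupted)

# Reordering whole 16-bit words is not detected either.
assert internet_checksum(b"\x12\x34\x56\x78") == internet_checksum(b"\x56\x78\x12\x34")
```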

In practice, TCP relies on checksumming on the sub-IP layers.




Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Florian Pflug
On Apr18, 2013, at 19:04 , Jeff Davis pg...@j-davis.com wrote:
 On Wed, 2013-04-17 at 20:21 -0400, Greg Smith wrote:
 -Original checksum feature used Fletcher checksums.  Its main problems, 
 to quote wikipedia, include that it cannot distinguish between blocks 
 of all 0 bits and blocks of all 1 bits.
 
 That is fairly easy to fix by using a different modulus: 251 vs 255.

At the expense of a drastic performance hit though, no? Modulus operations
aren't exactly cheap.

best regards,
Florian Pflug





Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Ants Aasma
On Thu, Apr 18, 2013 at 8:05 PM, Florian Pflug f...@phlo.org wrote:
 On Apr18, 2013, at 19:04 , Jeff Davis pg...@j-davis.com wrote:
 On Wed, 2013-04-17 at 20:21 -0400, Greg Smith wrote:
 -Original checksum feature used Fletcher checksums.  Its main problems,
 to quote wikipedia, include that it cannot distinguish between blocks
 of all 0 bits and blocks of all 1 bits.

 That is fairly easy to fix by using a different modulus: 251 vs 255.

 At the expense of a drastic performance hit though, no? Modulus operations
 aren't exactly cheap.

The modulus can be done in the end. By using a modulus of 65521 the
resulting checksum is called Adler-32. [1] However the quality of
Fletcher-32/Adler-32 is strictly worse than even the first iteration
of multiply-add based checksums proposed.

[1] http://en.wikipedia.org/wiki/Adler-32
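
A quick sketch showing that the modulus really can be applied only at the end. `adler32_deferred` is a hypothetical name; Python integers do not overflow, so the sums are exact here, whereas a C version would need accumulators wide enough for the input size.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2^16


def adler32_deferred(data: bytes) -> int:
    """Adler-32 with both reductions postponed until the end."""
    a, b = 1, 0
    for byte in data:
        a += byte   # running sum of bytes
        b += a      # running sum of the running sums
    return ((b % MOD_ADLER) << 16) | (a % MOD_ADLER)


page = bytes(range(256)) * 32  # 8 KiB of sample data
assert adler32_deferred(page) == zlib.adler32(page)
```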

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de




Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Florian Pflug
On Apr18, 2013, at 18:48 , Ants Aasma a...@cybertec.at wrote:
 On Thu, Apr 18, 2013 at 5:57 PM, Ants Aasma a...@cybertec.at wrote:
 I'll generate an avalanche diagram for CRC32C too, but it will take a
 while even if I use a smaller dataset.
 
 Well that was useless... In a CRC, flipping each bit in the input flips
 a preset pattern of bits in the output regardless of the actual data on
 the page. Some stats for CRC32C - input bits affect 28344 different
 bit combinations. Count of bits by number of duplicated bit patterns:

Yup, CRC is linear too. CRC is essentially long division for polynomials,
i.e. you interpret the N input bits as the coefficients of a (large)
polynomial of degree (N-1), and divide by the CRC polynomial. The remainder
is the checksum, and consists of B bits where B is the degree of the
CRC polynomial. (Polynomial here means polynomial over GF(2), i.e. over
a field with only two values 0 and 1)

I'm currently trying to see if one can easily explain the partial-write
behaviour from that. Having lots of zeros at the end corresponds
to an input polynomial of the form

  p(x) * x^l

where l is the number of zero bits. The CRC (q(x) is the CRC polynomial) is

  p(x) * x^l mod q(x) = (p(x) mod q(x)) * (x^l mod q(x)) mod q(x)

That still doesn't explain it, though - the result *should* simply
be the checksum of p(x), scrambled a bit by the multiplication with
(x^l mod q(x)). But if q(x) is irreducible, that scrambling is invertible
(as multiplication modulo some irreducible element always is), and thus
shouldn't matter much.
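
The polynomial view and the identity above can be checked numerically. The sketch below represents a polynomial over GF(2) as a Python integer (bit i is the coefficient of x^i); `gf2_mod` and `gf2_mul` are hypothetical helper names, and the divisor is the CRC-32C polynomial in its full 33-bit form. It models the bare remainder only, without the bit reflection, initial value, or final XOR that practical CRC implementations add.

```python
def gf2_mod(dividend: int, divisor: int) -> int:
    """Remainder of polynomial long division over GF(2)."""
    if divisor == 0:
        raise ValueError("division by the zero polynomial")
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        # Cancel the leading term; subtraction over GF(2) is XOR.
        dividend ^= divisor << (dividend.bit_length() - dlen)
    return dividend


def gf2_mul(a: int, b: int) -> int:
    """Carry-less (polynomial) multiplication over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


Q = 0x11EDC6F41             # CRC-32C polynomial including the x^32 term
p = 0xDEADBEEFCAFEBABE12345  # arbitrary sample input polynomial
l = 100                      # number of trailing zero bits

lhs = gf2_mod(p << l, Q)     # p(x) * x^l mod q(x)
rhs = gf2_mod(gf2_mul(gf2_mod(p, Q), gf2_mod(1 << l, Q)), Q)
assert lhs == rhs
```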

So either the CRC32-C polynomial isn't irreducible, or there's something
fishy going on. Could there be a bug in your CRC implementation? Maybe
a mixup between big and little endian, or something like that?

The third possibility is that I'm overlooking something, of course ;-)
Will think more about this tomorrow if time permits.

best regards,
Florian Pflug





Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Ants Aasma
On Thu, Apr 18, 2013 at 8:15 PM, Florian Pflug f...@phlo.org wrote:
 So either the CRC32-C polynomial isn't irreducible, or there's something
 fishy going on. Could there be a bug in your CRC implementation? Maybe
 a mixup between big and little endian, or something like that?

I'm suspecting an implementation bug myself. I already checked the
test harness and that was all sane, compiler hadn't taken any
unforgivable liberties there. I will crosscheck the output with other
implementations to verify that the checksum is implemented correctly.

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de




Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Jeff Davis
On Thu, 2013-04-18 at 19:05 +0200, Florian Pflug wrote:
 On Apr18, 2013, at 19:04 , Jeff Davis pg...@j-davis.com wrote:
  On Wed, 2013-04-17 at 20:21 -0400, Greg Smith wrote:
  -Original checksum feature used Fletcher checksums.  Its main problems, 
  to quote wikipedia, include that it cannot distinguish between blocks 
  of all 0 bits and blocks of all 1 bits.
  
  That is fairly easy to fix by using a different modulus: 251 vs 255.
 
 At the expense of a drastic performance hit though, no? Modulus operations
 aren't exactly cheap.

Modulo is only necessary when there's a possibility of overflow, or at
the very end of the calculation. If we accumulate 32-bit integers into
64-bit sums, then it turns out that it can't overflow given the largest
input we support (32K page).

32K page = 8192 32-bit integers

1*(2^32-1) + 2*(2^32-1) + 3*(2^32-1) ... 8192*(2^32-1)
= (2^32-1) * (8192^2 + 8192)/2
= 144132780228341760 ( < 2^64-1 )

So, we only need to do the modulo at the end.
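
The bound is easy to double-check. This sketch assumes position weights 1 through 8192, as in the series written out above; the names are made up for the example.

```python
WORDS_PER_PAGE = 32 * 1024 // 4   # a 32K page viewed as 32-bit integers
MAX_WORD = 2**32 - 1              # largest value of one 32-bit word


def worst_case_weighted_sum(n_words: int = WORDS_PER_PAGE) -> int:
    """Largest possible value of sum(i * word_i) for i = 1..n_words."""
    return MAX_WORD * n_words * (n_words + 1) // 2


# Even with every word at its maximum, the weighted sum fits in 64 bits.
assert worst_case_weighted_sum() < 2**64
```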

Regards,
Jeff Davis





Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Ants Aasma
On Thu, Apr 18, 2013 at 8:24 PM, Ants Aasma a...@cybertec.at wrote:
 On Thu, Apr 18, 2013 at 8:15 PM, Florian Pflug f...@phlo.org wrote:
 So either the CRC32-C polynomial isn't irreducible, or there's something
 fishy going on. Could there be a bug in your CRC implementation? Maybe
 a mixup between big and little endian, or something like that?

 I'm suspecting an implementation bug myself. I already checked the
 test harness and that was all sane, compiler hadn't taken any
 unforgivable liberties there. I will crosscheck the output with other
 implementations to verify that the checksum is implemented correctly.

Looks like the implementation is correct. I cross-referenced it
against a bitwise algorithm for CRC32 with the Castagnoli polynomial.
This also rules out any endianness issues, as the bitwise variant
consumes the input one byte at a time.
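
A cross-check of this kind might look like the Python sketch below. The implementation actually under test is not shown in this thread, so the table-driven routine here is only a hypothetical stand-in for it. The bitwise routine is the reference: it handles one byte at a time and one bit at a time, so the byte order of wider loads cannot affect it. Both use the reflected form of the Castagnoli polynomial (0x82F63B78) with the usual all-ones initial value and final inversion.

```python
CASTAGNOLI_REFLECTED = 0x82F63B78


def crc32c_bitwise(data):
    """Reference CRC-32C: one byte at a time, one bit at a time."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ CASTAGNOLI_REFLECTED
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


def _make_table():
    # Precompute the CRC of every possible byte value.
    table = []
    for i in range(256):
        c = i
        for _ in range(8):
            c = (c >> 1) ^ CASTAGNOLI_REFLECTED if c & 1 else c >> 1
        table.append(c)
    return table


_TABLE = _make_table()


def crc32c_table(data):
    """Table-driven CRC-32C; stands in for the implementation being verified."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF
```

Both routines should return 0xE3069283 for the ASCII string "123456789", the standard check value for CRC-32C, and should agree on any other input.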

Whatever it is, it is something specific to the PostgreSQL page layout.
If I use /dev/urandom as the source, the issue disappears. So much for
CRC32 being proven good.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de







[HACKERS] Recovery target 'immediate'

2013-04-18 Thread Heikki Linnakangas
I just found out that if you use continuous archiving and online
backups, it's surprisingly difficult to restore a backup without
replaying any more WAL than necessary.


If you don't set a recovery target, PostgreSQL will recover all the WAL
it finds. You can set recovery_target_time to a point immediately after
the end-of-backup record, but that's tricky. You have to somehow find
out the exact time when the backup ended, and set it to that. But if you
set it even slightly too early, recovery will abort with a "requested
recovery stop point is before consistent recovery point" error. And
that's not quite precise anyway; not all record types carry timestamps,
so you will always replay a few extra records until the first
timestamped record comes along. Setting recovery_target_xid is similarly
difficult. If you were well prepared, you created a named recovery point
with pg_create_restore_point() immediately after the backup ended, and
you can use that, but that requires forethought.


It seems that we're missing a setting, something like recovery_target =
'immediate', which would mean "stop as soon as consistency is reached".
Or am I missing some trick?


- Heikki




Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Florian Pflug
On 18.04.2013, at 20:02, Ants Aasma a...@cybertec.at wrote:
 On Thu, Apr 18, 2013 at 8:24 PM, Ants Aasma a...@cybertec.at wrote:
 On Thu, Apr 18, 2013 at 8:15 PM, Florian Pflug f...@phlo.org wrote:
 So either the CRC32-C polynomial isn't irreducible, or there's something
 fishy going on. Could there be a bug in your CRC implementation? Maybe
 a mixup between big and little endian, or something like that?
 
 I'm suspecting an implementation bug myself. I already checked the
 test harness and that was all sane, compiler hadn't taken any
 unforgivable liberties there. I will crosscheck the output with other
 implementations to verify that the checksum is implemented correctly.
 
 Looks like the implementation is correct. I cross-referenced it
 against a bitwise algorithm for CRC32 with the Castagnoli polynomial.
 This also rules out any endianness issues, as the bitwise variant
 consumes the input one byte at a time.
 
 Whatever it is, it is something specific to the PostgreSQL page layout.
 If I use /dev/urandom as the source, the issue disappears. So much for
 CRC32 being proven good.

Weird. Is the code of your test harness available publicly, or could you post 
it? I'd like to look into this...

best regards,
Florian Pflug





Re: [HACKERS] Enabling Checksums

2013-04-18 Thread Greg Stark
On Thu, Apr 18, 2013 at 6:04 PM, Florian Weimer f...@deneb.enyo.de wrote:
 The TCP checksum is too weak to be practical.  Every now and then, I
 see data transfers where the checksum is valid, but the content
 contains bit flips.

Well, of course: it's only a 16-bit checksum. 64k packets isn't very
many, so if you're not counting checksum failures it won't take very
long before a bad one gets through. The purpose of the checksum is to
notify you that you have a problem, not to block bad packets from
getting through.

 Anything that flips bits randomly at intervals
 which are multiples of 16 bits is quite likely to pass through
 checksum detection.

I'm not sure about this
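
One concrete case can be worked through. The Python sketch below is only an illustration, with hypothetical helper names and arbitrary sample bytes. It computes the 16-bit ones'-complement Internet checksum used by TCP, then flips the same bit position in two different 16-bit words. When one flip raises a word and the other lowers a word by the same amount, the sum is unchanged and the corruption passes. When both flips go in the same direction, the checksum changes. So flips spaced at multiples of 16 bits can cancel, but they do not always do so.

```python
def internet_checksum(data):
    """16-bit ones'-complement sum of big-endian words, then inverted."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold the carries back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def flip_bits(data, word_index, mask):
    """Return a copy of data with the bits in mask flipped in one 16-bit word."""
    out = bytearray(data)
    out[2 * word_index] ^= mask >> 8
    out[2 * word_index + 1] ^= mask & 0xFF
    return bytes(out)


# Arbitrary sample payload: the words 0x0001, 0xF203, 0xF4F5, 0xF6F7.
ORIGINAL = bytes.fromhex("0001f203f4f5f6f7")

# Bit 0x0002 is clear in word 0 and set in word 1: the flips are +2 and -2.
CANCELLING = flip_bits(flip_bits(ORIGINAL, 0, 0x0002), 1, 0x0002)

# Bit 0x0002 is clear in both word 0 and word 2: the flips are +2 and +2.
SAME_DIRECTION = flip_bits(flip_bits(ORIGINAL, 0, 0x0002), 2, 0x0002)
```

Here internet_checksum(ORIGINAL) and internet_checksum(CANCELLING) are equal even though two bits differ, while internet_checksum(SAME_DIRECTION) is different.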


-- 
greg

