On Wed, Feb 12, 2014 at 10:02:32AM +0530, Amit Kapila wrote:
> By issue, I assume you mean to say, which compression algorithm is
> best for this patch.
> For this patch, currently we have 2 algorithms for which results have been
> posted. As far as I understand Heikki is pretty sure that the latest algorithm
> (compression using prefix-suffix match in old and new tuple) used for this
> patch is better than the other algorithm in terms of CPU gain or overhead.
> The performance data taken by me for the worst case for this algorithm
> shows there is a CPU overhead for this algorithm as well.
> OTOH the other algorithm (compression using old tuple as history) can be
> a bigger win in terms of I/O reduction in a larger number of cases.
> In short, it is still not decided which algorithm to choose and whether
> it can be enabled by default or it is better to have table level switch
> to enable/disable it.
> So I think the decision to be taken here is about below points:
> 1. Are we okay with I/O reduction at the expense of CPU for *worst* cases
> and I/O reduction without impacting CPU (better overall tps) for
> *favourable* cases?
> 2. If we are not okay with worst case behaviour, then can we provide
> a table-level switch, so that it can be decided by user?
> 3. If none of above, then is there any other way to mitigate the worst
> case behaviour or shall we just reject this patch and move on.
> Given a choice to me, I would like to go with option-2, because I think
> for most cases UPDATE statement will have same data for old and
> new tuples except for some part of tuple (generally columns having large
> text data are not modified), so we will end up mostly in favourable cases
> and surely for worst cases we don't want user to suffer from CPU overhead,
> so a table-level switch is also required.
I think 99.9% of users are never going to adjust this so we had better
choose something we are happy to enable for effectively everyone. In my
reading, prefix/suffix seemed safe for everyone. We can always revisit
this if we think of something better later, as WAL format changes are not
a problem for pg_upgrade.
I also think that if we make it user-tunable, it will be so hard for users
to know when to adjust it that it is almost not worth the user interface
complexity it adds.
I suggest we go with always-on prefix/suffix mode, then add some check
so the worst case is avoided by just giving up on compression.
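
To make the suggestion above concrete, here is a minimal sketch of
prefix/suffix matching with a give-up check. It is only an illustration,
not the patch's code: the function names, the tuple-as-bytes
representation, and the 25% savings threshold are all assumptions made
for this example.

```python
def encode_update(old: bytes, new: bytes, min_saving_pct: int = 25):
    """Return (prefix_len, suffix_len, middle), or None to give up.

    Hypothetical helper: min_saving_pct is an assumed threshold, not a
    value taken from the patch under discussion.
    """
    limit = min(len(old), len(new))

    # Count bytes shared at the start of the old and new tuple.
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1

    # Count bytes shared at the end, without overlapping the prefix.
    suffix = 0
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1

    # Give up when the match saves too little; the caller would then
    # log the new tuple in full, as it does today.
    if (prefix + suffix) * 100 < len(new) * min_saving_pct:
        return None

    return prefix, suffix, new[prefix:len(new) - suffix]


def decode_update(old: bytes, delta) -> bytes:
    """Rebuild the new tuple from the old tuple and the encoded delta."""
    prefix, suffix, middle = delta
    tail = old[len(old) - suffix:] if suffix else b""
    return old[:prefix] + middle + tail
```

Both scans are single linear passes that stop at the first mismatch, so
a tuple that changed completely costs only a couple of byte comparisons
before the check gives up.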
As I said previously, I think compressing the page images is the next
big win in this area.
> I think here one might argue that for some users it is not feasible to
> decide whether their tuples data for UPDATE is going to be similar
> or completely different and they are not at all ready for any risk for
> CPU overhead, but they would be happy to see I/O reduction in which
> case it is difficult to decide what should be the value of table-level
> switch. Here I think the only answer is "nothing is free" in this world,
> so either make sure about the application's behaviour for UPDATE
> statement before going to production or just don't enable this switch and
> be happy with the current behaviour.
Again, can't we do a minimal attempt at prefix/suffix compression so
there is no measurable overhead?
> On the other side there will be users who will be pretty certain about their
> usage of UPDATE statement or at least are ready to evaluate their
> application if they can get such a huge gain, so it would be quite useful
> feature for such users.
> > can we move forward with the full-page compression patch?
> In my opinion, it is not certain that whatever compression algorithm got
> decided for this patch (if any) can be directly used for full-page
> compression; some ideas could be used or maybe the algorithm could be
> tweaked a bit to make it usable for full-page compression.
Thanks, I understand that now.
Bruce Momjian <br...@momjian.us> http://momjian.us
+ Everyone has their own god. +