2013-02-24 03:23 keltezéssel, Stephen Frost írta:
Zoltán,

* Boszormenyi Zoltan (z...@cybertec.at) wrote:
2013-02-23 02:55 keltezéssel, Stephen Frost írta:
First off, it's not in context diff format, which is the PG standard for
patch submission.
Since moving to GIT, this expectation is obsolete.
No, it isn't.  Patches posted to the list should be in context diff
format (and uncompressed unless it's too large) for easier reading.
That avoids having to download it, apply it to a git tree and then
create the diff ourselves.  Indeed, the move to git had impact on this
at all.

I remembered this mail from The Master(tm):
http://www.postgresql.org/message-id/21555.1339866...@sss.pgh.pa.us


All PG hackers
became comfortable with the unified diff format, see references
from the list. A lot of hackers post "git diff" patches which cannot
produce context diff, either.
It's quite possible to create a context diff from git- indeed, it's
documented on the http://wiki.postgresql.org/wiki/Working_with_Git
portion of the wiki, specifically to help people understand how to take
the changes they've made in their git forks and create context diffs to
post to the mailing lists.  The preference for context diffs is also
still documented at http://wiki.postgresql.org/wiki/Submitting_a_Patch.

And, to be honest, I'm not sure how familiar you are with the history of
PG in this area, but I've been through all of it from the pre-git CVS
days, through the migration to git, to the exclusive use of git.

I started with Postgres95, though only using the source tarballs.
I started following CVS at around 2003 or so.

   I feel
quite confident that the preference for context diffs hasn't changed.

With stats for the February, 2013:
Name                 ctx  ctx/gitunif/git unif
Dimitry Fontaine 40       1      0
Ian Lawrence Barwick  0     1       0      0
Alvaro Herrera       10     0       5     0
Pavel Stehule         9     0       0      0
Heikki Linnakangas    0     0       7     0
Michael Paquier       0     0       6     0
Andres Freund         0     0       5     0
Alexander Law         0     0       1      0
Etsuro Fujita         0     0       4     0
Amit Kapila           4     0       0      0
Kevin Grittner        2     0       3      0
Gurjeet Singh         0     0       2    0
Erik Rijkers          0     0       0      1
Zoltan Boszormenyi    0     0       2      0
Tomas Vondra          0     0       2      0
Bruce Momjian         0     4       0      0
Fujii Masao           1     0       0      0
David Fetter          5     0       0      0
Jonathan Rogers       0     0       1      0
Andrew Dunstan        2     0       0      0
Manlio Perillo        0     0       1      0
Dean Rasheed          0     1       2      0
Jeff Janes            0     2       1      0
Phil Sorber           0     3       2      0
Mark Wong             0     1       0      0
Ivan Lezhnjov IV      0     0       1      0
Kohei Gagai           0     0       2      0
MauMau                1     0       0      0
Tom Lane              0     1       0      0
Sean Chittenden       0     0       2      0
Albe Laurenz          0     1       0      0
Daniel Farina         1     0       0      0
Simon Riggs           0     0       3      0
Peter Eisentraut      0     0       4      0

Plain context diffs: 39
Context diffs generated from git: 14
"git diff" patches: 57
Plain unified diff: 1

Some mails contained more than one patch, these were
counted as-is. One patch is one patch.

So, more than halfof the recently posted patches come
directly from "git diff". The preference has changed.
But indeed, plain "diff -u" is cleanly in the minority.
I can repost my patch from "git diff", too, to be in
the majority. :-P

That's not strictly true but it's not widely used either:

$ find . -type f | xargs grep weight | grep heavy
./monitoring.sgml:     <entry>Probe that fires when a request for a
heavyweight lock (lmgr lock)
./monitoring.sgml:     <entry>Probe that fires when a request for a
heavyweight lock (lmgr lock)
Interesting.  That didn't show up in the searches that I was doing
through the web interface, though it does now.  If we're going to use
that term, we should define it in the lock documentation.  If not, then
we should avoid it everywhere.  I'm fine with either.

Me too, the documentation should be consistent. I will remove the
"heavyweight" term.


- I don't particularly like "lock_timeout_option", for a couple of
   reasons.  First is simply the name is terrible, but beyond that, it
   strikes me that wanting to set both a 'per-lock timeout' and a
   'overall waiting-for-locks timeout' at the same time would be a
   reasonable use-case.  If we're going to have 2 GUCs and we're going to
   support each of those options, why not just let the user specify
   values for each?
OK, suggest names for the 2 GUCS, please.
lock_timeout_wait and lock_timeout_stmt_wait ?

Hm. How about without the "_wait" suffix?
Or lock_timeout vs statement_lock_timeout?

   Though I've been really
wondering to myself as to if we need both of these options as well as
statement_timeout.

   Perhaps you can explain the use case for each
option and how they're distinct from each other?

statement_timeout is the legacy behaviour, it should be kept.
It's behaviour is well understood: the statement finishes under
the specified time or it throws an error. The problem from the
application point of view is that the error can happen while
the tuples are being transferred to the client, so the recordset
can be cut in half.

"lock_timeout_stmt" (or lock_timeout_option = per_statement)
is somewhat extending the statement_timeout as only the
time waiting on locks are counted, so SELECT FOR UPDATE/etc.
may throw an error but if all rows are collected already, or
plain SELECT is run, the application gets them all.
This seems to follow the Microsoft SQL Server semantics:
http://msdn.microsoft.com/en-us/library/ms189470.aspx

The per-lock lock_timeout was the result of a big Informix
project ported to PostgreSQL, this follows Informix semantics:
http://publib.boulder.ibm.com/infocenter/idshelp/v10/index.jsp?topic=/com.ibm.sqls.doc/sqls788.htm

   Indeed, the use-case
that I'm envisioning is focused around "don't wait more than 'X' for the
relation-level locks" which would allow you to distinguish queries that
are, most likely anyway, making progress, from those which have been
caught behind a lock.  That would match your 'per-statement' lock
timeout concept, iiuc, though I think it might be more simply
implemented.

- Not a big fan of this:

+    * See notes in PGSemaphoreLock.
Why? Copy&paste the *long* comment (a different one in each *_sema.c)
or omitting the comments is better? Suggest a better comment, and
it will be included.
How about, for starters, there's more than one PGSemaphoreLock?  Second,
as you'll note in posix_sema.c, it's useful to say more than just "look
over there".  I'm not suggesting a mass copy/paste, but more than 4
words would be valuable.

OK.


- Do we really need TimestampTzPlusMicroseconds..?
Well, my code is the only user for this macro but it's cleaner
than explicitly doing
To be honest, I was really wondering why TimestampTzPlusMilliseconds
isn't sufficient, not suggesting that we litter the code with #ifdef's.

Because Timestamp[Tz] is microsecond resolution, and the timeout
GUCs are in milliseconds. Adding up time differences (and rounding
or truncating them to millisecond in the process) would make the
per-statement lock timeout lose precision...


        Thanks,

                Stephen


--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to