Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-22 Thread Paolo Bonzini


On 18/03/2016 22:53, Laszlo Ersek wrote:
> > The correct character to use in that situation is the emdash, If you
> > *absolutely* must, then rewrite the whole sentence to avoid using it.
> > Do *not* replace it with hyphens.
> 
> Okay. I've googled the use of emdash in the English language, and it
> seems to be more or less interchangeable with parens. Is that okay?

Actually there is a good reason to replace the emdash with hyphens, and
the reason is that the emdash looks like crap (to me at least) in
monospaced fonts.

Paolo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-19 Thread Laszlo Ersek
On 03/19/16 01:19, David Woodhouse wrote:

> Isn't that i18n.commitencoding=latin2 config on your part somewhat
> gratuitous? The idea is that the tools are converting everything on the
> way in and out, and that *only* affects the commit objects that are
> actually encoded in the repository, right? So letting that be UTF-8 to
> match the default shouldn't even be *visible* to you, should it? Except
> that you probably wouldn't have suffered that git-send-email bug you
> mentioned?

With i18n.commitencoding=UTF-8, git expects me to write the commit
messages in UTF-8. My main editor doesn't do UTF-8.

Most of the time neither UTF-8 nor latin[12] are necessary; I write my
own commit messages only in ASCII. So that is certainly accommodated by
sticking with the default (UTF-8).

However, occasionally I have to make an excursion into latin2 (which
partially overlaps latin1 too), for people's names. This can be covered
by setting i18n.commitencoding to latin2 -- it doesn't break ASCII, and
allows me to paste most (latin script) names transparently, using my
main editor.

If I switched i18n.commitencoding back to UTF-8, then I'd have to edit
commit messages with a UTF-8 capable editor whenever ASCII didn't
suffice, even for names in simple latin script that would otherwise fit
in latin2. That's a huge chore for me.

I utterly hate most wide-spread editors that happen to have good UTF-8
support (emacs, vi*, gedit). "joe" I don't hate, but it still doesn't
have all the macros and commands that I use all the time in my main
editor, even while editing commit messages. (Sometimes I draw ASCII
diagrams in commit messages. I have shortcuts for pasting frequent Cc:'s
quickly. I have a dedicated wrapping mode for commit messages. And so on.)

People customize the heck out of their programmable editors (mostly
emacs and vim); I do the same with mine, it just happens to lack UTF-8.

Hm Wait a second. I just realized there is a git hook called
"commit-msg":

   commit-msg
   This hook is invoked by git commit, and can be bypassed with
   --no-verify option. It takes a single parameter, the name of
   the file that holds the proposed commit log message. Exiting
   with non-zero status causes the git commit to abort.

   The hook is allowed to edit the message file in place, and can
   be used to normalize the message into some project standard
   format (if the project has one). It can also be used to refuse
   the commit after inspecting the message file.

   The default commit-msg hook, when enabled, detects duplicate
   "Signed-off-by" lines, and aborts the commit if one is found.

Okay, here's what I'll do. I will switch i18n.commitencoding back to
UTF-8. And, I will add a commit-msg hook that converts the commit
message in-place from latin2 to UTF-8, with "iconv". That should keep us
both happy. Deal?

Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


[edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-19 Thread David Woodhouse
Commit 7b8fe6356 ("OvmfPkg: PlatformPei: enable PCIEXBAR (aka MMCONFIG /
ECAM) on Q35") broke the VS2008 build, causing it to complain of
"potentially uninitialized local variable 'PciExBarBase' used" at
line 275.

On this occasion it isn't actually wrong — the mHostBridgeDevId variable
is global, so theoretically it *could* change between the two if()
statements.

Of course VS2008 is also a steaming pile of poo, and even if we use
a local boolean variable which definitely *can't* change in the middle,
it still can't work it out and still emits the warning.

So just initialise PciExBarBase to zero to shut the compiler up.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: David Woodhouse 
---
Oops, my build test on Windows still had the temporary 'PciExBarBase=0'
to shut the compiler up, so wasn't really exercising the fix I was
submitting. Sorry for the noise.

It turns out that VS2008 really is just a stupid pile of crap.

OvmfPkg/PlatformPei/Platform.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/OvmfPkg/PlatformPei/Platform.c b/OvmfPkg/PlatformPei/Platform.c
index 0fc2278..20008d0 100644
--- a/OvmfPkg/PlatformPei/Platform.c
+++ b/OvmfPkg/PlatformPei/Platform.c
@@ -212,7 +212,7 @@ MemMapInitialization (
 
   if (!mXen) {
 UINT32  TopOfLowRam;
-UINT64  PciExBarBase;
+UINT64  PciExBarBase = 0;
 UINT32  PciBase;
 UINT32  PciSize;
 
-- 
2.5.5

-- 
David WoodhouseOpen Source Technology Centre
david.woodho...@intel.com  Intel Corporation



smime.p7s
Description: S/MIME cryptographic signature
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-19 Thread David Woodhouse
On Fri, 2016-03-18 at 17:53 +0100, Laszlo Ersek wrote:
> 
> (1) The commit message uses at least one non-ASCII character, the EM
> DASH (U+2014). Can you please replace it with a "--"? Yes, I know you
> hate me for asking this. Please just write me off as stupid and replace
> the character.

No. This isn't 1994. If you wish to change it, go ahead. Don't ask me
to do stupid things for no good reason.

> (2) In the edk2 coding style, initialization of local variables is not
> permitted. Can you please add an assignment instead?

Er, really? What insanity is this? Where did these guidelines come from
and can they be fixed? Why would initialisation of local variables
*ever* be frowned upon? How do they make this stuff up?

> (3) I tried to apply your patch nonetheless. It doesn't apply. I looked
> at the patch email that I saved from Thunderbird. The code section of
> the email contains =C2=A0 quoted-printable sequences. Since the charset
> is UTF-8 (according to the Content-Type header), 0xC2 0xA0 translates to
> U+00A0, "NO-BREAK SPACE".

Hm, that would appear to be an Evolution bug. I think it triggers when
lines have trailing whitespace. I'll investigate. Thanks for pointing
it out.

I can put it in a PR if you prefer... :)

-- 
dwmw2



smime.p7s
Description: S/MIME cryptographic signature
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-19 Thread Laszlo Ersek
On 03/18/16 18:45, David Woodhouse wrote:
> On Fri, 2016-03-18 at 17:53 +0100, Laszlo Ersek wrote:
>>
>> (1) The commit message uses at least one non-ASCII character, the EM
>> DASH (U+2014). Can you please replace it with a "--"? Yes, I know you
>> hate me for asking this. Please just write me off as stupid and replace
>> the character.
> 
> No. This isn't 1994.

Really? I wouldn't know. If I look at what ISO C features we're allowed
to use, we certainly seem to be stuck in 1995. Or, well, sometime before
1989, because we can't even use structure assignment.

> If you wish to change it, go ahead.

Thank you, I will.

> Don't ask me
> to do stupid things for no good reason.

There is a good reason. You know it and you don't care about it. Not a
problem; I'll fix up the commit message.

>> (2) In the edk2 coding style, initialization of local variables is not
>> permitted. Can you please add an assignment instead?
> 
> Er, really? What insanity is this?

Please ask your colleagues at Intel (but please also make sure that you
don't take offense on their behalf, when you call their guidelines
insane ;) ).

> Where did these guidelines come from
> and can they be fixed? Why would initialisation of local variables
> *ever* be frowned upon? How do they make this stuff up?

Please consult "EDK II C Coding Standards Specification.pdf", section
6.8 "C Function Layout":

  *Initializing a variable as part of its declaration is illegal.*

(Emphasis theirs.)

> 
>> (3) I tried to apply your patch nonetheless. It doesn't apply. I looked
>> at the patch email that I saved from Thunderbird. The code section of
>> the email contains =C2=A0 quoted-printable sequences. Since the charset
>> is UTF-8 (according to the Content-Type header), 0xC2 0xA0 translates to
>> U+00A0, "NO-BREAK SPACE".
> 
> Hm, that would appear to be an Evolution bug.

\o/

Finally something I'm not held responsible for!

(BTW, why did you mail out the patch with Evolution, rather than
git-send-email? Are you sure you are using the tools as they were
intended? :))

> I think it triggers when
> lines have trailing whitespace. I'll investigate. Thanks for pointing
> it out.
> 
> I can put it in a PR if you prefer... :)

Works for me, but please send the pull request to the mailing list. I'll
rebase the commit for the commit message fixup, and for adding my R-b,
but I'll keep the base commit / history intact.

Thanks!
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-19 Thread David Woodhouse
On Fri, 2016-03-18 at 20:02 +0100, Laszlo Ersek wrote:
> On 03/18/16 19:40, David Woodhouse wrote:
> 
> > 
> > But this is different. This is the commit messages. And what would you
> > know... the last commit message in the log which isn't ASCII *isn't*
> > that other one I pointed out; it's one from you (7daf2401) in which you
> > commit the heinous crime of slepping Michał Zegan's name correctly :)
> I went to extreme lengths to commit his name correctly. It wasn't at all
> natural for me. But names are names.
> 
> > 
> > No, I genuinely don't know of any reason to eschew non-ASCII characters
> > in commit messages.

> Because they are a PITA for people who don't use UTF-8 generally? (I
> don't.)

This doesn't really make sense at any level. You must have to jump
through hoops to still operate a legacy 20th-century character set on a
modern Linux distribution, surely? And it must confer *no* benefit to
you — unless you count it as a benefit that you have to go to 'extreme
lengths' just to get someone's *name* right — something that normally
ought to Just Work™ when you use 'git am'.

And once it's been committed, either your system just doesn't display
the commit logs correctly at all (and gets names like Michał's wrong),
or you shouldn't even *notice* the emdash. Which is it?

I strongly suspect the latter, and that you only noticed because you
were looking closely at the encoding because of that Evolution bug?

> People still write RFC's today, and I think those are all pure ASCII 
> as well.

∃ lots of things which can fit in pure ASCII.
∃ lots of things which don't.

> Anyway, we won't convince each other; we've been through this. Feel free
> to post non-ASCII in the commit messages; should I come across them,
> I'll try to fix them up (except in names).

No. Please don't. If this was another of the bizarre but enforced-by-
the-project things then I was prepared to tolerate it until it could be
fixed, but if you are just mangling my commit messages to make them
less grammatically correct for an unsupportable personal preference,
then that's not OK.

The correct character to use in that situation is the emdash, If you
*absolutely* must, then rewrite the whole sentence to avoid using it.
Do *not* replace it with hyphens. And when you do tweak a commit
message, it is best practice to *note* that you have done so, and why.

Rewriting it for this reason is *not* acceptable.

-- 
dwmw2



smime.p7s
Description: S/MIME cryptographic signature
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-19 Thread David Woodhouse
On Fri, 2016-03-18 at 19:05 +0100, Laszlo Ersek wrote:
> On 03/18/16 18:45, David Woodhouse wrote:
> > 
> > On Fri, 2016-03-18 at 17:53 +0100, Laszlo Ersek wrote:
> > > 
> > > 
> > > (1) The commit message uses at least one non-ASCII character, the EM
> > > DASH (U+2014). Can you please replace it with a "--"? Yes, I know you
> > > hate me for asking this. Please just write me off as stupid and replace
> > > the character.
> > No. This isn't 1994.
> Really? I wouldn't know. If I look at what ISO C features we're allowed
> to use, we certainly seem to be stuck in 1995. Or, well, sometime before
> 1989, because we can't even use structure assignment.

That's a separate issue. Yes, we still insist on supporting the
Microsoft toolchains despite the fact that they are so badly maintained
that they don't even support the last C standard of the *previous*
century, let alone anything from the 2000s.

But this is different. This is the commit messages. And what would you
know... the last commit message in the log which isn't ASCII *isn't*
that other one I pointed out; it's one from you (7daf2401) in which you
commit the heinous crime of slepping Michał Zegan's name correctly :)

> There is a good reason. You know it and you don't care about it. Not a
> problem; I'll fix up the commit message.

No, I genuinely don't know of any reason to eschew non-ASCII characters
in commit messages.

> > Er, really? What insanity is this?
>
> Please ask your colleagues at Intel (but please also make sure that you
> don't take offense on their behalf, when you call their guidelines
> insane ;) ).

Hehehe. Touché.

> > 
> > Where did these guidelines come from
> > and can they be fixed? Why would initialisation of local variables
> > *ever* be frowned upon? How do they make this stuff up?
> Please consult "EDK II C Coding Standards Specification.pdf", section
> 6.8 "C Function Layout":
> 
>   *Initializing a variable as part of its declaration is illegal.*
> 
> (Emphasis theirs.)

Insanity theirs :)

Why in $DEITY's name would anyone write this? Was it done for a bet?

> \o/
> 
> Finally something I'm not held responsible for!
> 
> (BTW, why did you mail out the patch with Evolution, rather than
> git-send-email? Are you sure you are using the tools as they were
> intended? :))

:)

I like to have copies in my sent-mail folder of everything I sent. It
also gets S/MIME signatures, and it's easier to hit 'reply' and have a
properly-threaded message, rather than having to extract the Message-Id 
of the message I want to reply to and feed it to git-format-patch.

On the whole (modulo evo bugs) it's easier just to insert the plain
text file into the email I'm sending, for a single patch at least.

And I seem to have got away without triggering that Evo bug for a long
time, by working on code which cares more about messy trailing
whitespace on lines, than it does about other things :)

-- 
dwmw2


smime.p7s
Description: S/MIME cryptographic signature
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-19 Thread Laszlo Ersek
On 03/19/16 08:52, David Woodhouse wrote:

> If you're providing a message with -m on the comment line, then sure,
> LC_CTYPE is correct. But when it comes from a file? Every file can come
> from a different place, and can be in its *own* character set. I bet
> you have *plenty* of text files on your system which aren't in latin2.

Probably. (I didn't create them though! ;))

> And if you're applying a patch with 'git-am' then the temporary file
> storing the extracted message is almost *certainly* in UTF-8 or the
> original charset from the email, not your LC_CTYPE. Right?

git-am seems to work surprisingly well in this regard; I think. It seems
to match git-format-patch. When you format a patch whose git-internal
encoding is not UTF-8 (let's say, it's latin2), and the commit message
actually digresses outside of ASCII, then the formatted patch email will
receive:

MIME-Version: 1.0
Content-Type: text/plain; charset=latin2
Content-Transfer-Encoding: 8bit

This will make the message body display correctly in all MUAs.

(The only snag, as I mentioned, is that the Cc: mail header that git
constructs from the (otherwise correct) body-cc incorrectly becomes:

  =?UTF-8?q?QUOTED_PRINTABLE_LATIN2_BYTE_STRING?= 

I.e., the embedded encoding identifier doesn't actually match the byte
string.
)

In turn, when git-am processes such a patch email, it picks up the
charset from Content-Type, and passes it to git-commit as the
commit-encoding. If I remember correctly.

> Few people fully comprehend the true horror of the "simple" system they
> try to cling to...

My primary motivation is not the "simplicity" of single-byte encodings
-- I agree with the quotes here. My primary motivation is this killer
app called NEdit that I can't exist without.

Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-19 Thread Laszlo Ersek
On 03/18/16 22:13, David Woodhouse wrote:
> On Fri, 2016-03-18 at 20:02 +0100, Laszlo Ersek wrote:
>> On 03/18/16 19:40, David Woodhouse wrote:
>>
>>>
>>> But this is different. This is the commit messages. And what would you
>>> know... the last commit message in the log which isn't ASCII *isn't*
>>> that other one I pointed out; it's one from you (7daf2401) in which you
>>> commit the heinous crime of slepping Michał Zegan's name correctly :)
>> I went to extreme lengths to commit his name correctly. It wasn't at all
>> natural for me. But names are names.
>>
>>>
>>> No, I genuinely don't know of any reason to eschew non-ASCII characters
>>> in commit messages.
> 
>> Because they are a PITA for people who don't use UTF-8 generally? (I
>> don't.)
> 
> This doesn't really make sense at any level. You must have to jump
> through hoops to still operate a legacy 20th-century character set on a
> modern Linux distribution, surely?

Not too many. Good locale support and good internationalization means
that my ISO8859-2 preference is well accommodated.

> And it must confer *no* benefit to
> you

It certainly does. It allows me to use my text editor of choice. I've
been using that editor for ~15 years, and in the last five years, it's
been running 10 hours per day or more.

> — unless you count it as a benefit that you have to go to 'extreme
> lengths' just to get someone's *name* right — something that normally
> ought to Just Work™ when you use 'git am'.
> 
> And once it's been committed, either your system just doesn't display
> the commit logs correctly at all (and gets names like Michał's wrong),
> or you shouldn't even *notice* the emdash. Which is it?

It happens to display Michał's name correctly, because it fits in latin2.

[i18n]
logOutputEncoding = latin2
commitencoding = latin2

The extreme lengths that I had to go to were necessary to convince
git-send-email not to mess up Michał's CC in the email headers, picking
it up from the commit message. The commit message was all right, but the
email header got mis-encoded (it's a git bug; it thought the CC line was
UTF-8, despite knowing that the full commit message was latin2).

emdash doesn't display correctly for me indeed, in the output of "git
log". It's not a big annoyance, but if I am to apply a patch, I try to
prevent it.

> I strongly suspect the latter, and that you only noticed because you
> were looking closely at the encoding because of that Evolution bug?

No. I tend to notice glyphs by the naked eye that are not ascii / latin1
/ latin2. Here's another example:

http://thread.gmane.org/gmane.comp.bios.edk2.devel/8563/focus=8566

>> Anyway, we won't convince each other; we've been through this. Feel free
>> to post non-ASCII in the commit messages; should I come across them,
>> I'll try to fix them up (except in names).
> 
> No. Please don't. If this was another of the bizarre but enforced-by-
> the-project things then I was prepared to tolerate it until it could be
> fixed, but if you are just mangling my commit messages to make them
> less grammatically correct for an unsupportable personal preference,
> then that's not OK.
> 
> The correct character to use in that situation is the emdash, If you
> *absolutely* must, then rewrite the whole sentence to avoid using it.
> Do *not* replace it with hyphens.

Okay. I've googled the use of emdash in the English language, and it
seems to be more or less interchangeable with parens. Is that okay?

> And when you do tweak a commit
> message, it is best practice to *note* that you have done so, and why.

Absolutely.

> Rewriting it for this reason is *not* acceptable.

Fine. Please ask Jordan to apply the patch then.

Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-19 Thread David Woodhouse
On Sat, 2016-03-19 at 02:23 +0100, Laszlo Ersek wrote:
> On 03/19/16 02:15, David Woodhouse wrote:
> 
> > So we treat it as an opaque sequence of bytes on the way *in*, then
> > make assumptions on the way *out* about what it was?
> 
> On the way in, it is assumed to be UTF-8, unless the user says
> otherwise. If the user says otherwise (in i18n.commitencoding), that
> statement is captured in the commit object. Either way, the commit
> message is not converted.

Right. That is exactly the problem.

In the legacy world of mixed character sets, *every* string needs to be
explicitly labelled with its character set, and everything handling
strings needs to *either* convert its input to its desired output
charset, or at the very least pass that label along intact.

To *assume* that your input matches your output format (as git-commit
does), and to wilfully ignore the explicit labelling on it (LC_CTYPE,
as you mention), is wrong. You are deliberately *dropping* the label on
the input bytestream, and slapping an incorrect label on it instead.

But it happens. All the time. Even in software written by people who
ought to know better.

Because consistently labelling and converting character sets is hard. 

In fact, there's some room for debate about whether LC_CTYPE *is* the
correct label for the git-commit input. It's possible that there *is*
no right answer here, and no way for git to avoid being buggy.

If you're providing a message with -m on the comment line, then sure,
LC_CTYPE is correct. But when it comes from a file? Every file can come
from a different place, and can be in its *own* character set. I bet
you have *plenty* of text files on your system which aren't in latin2.
And if you're applying a patch with 'git-am' then the temporary file
storing the extracted message is almost *certainly* in UTF-8 or the
original charset from the email, not your LC_CTYPE. Right?

In the legacy charset world we'd need a way to label every text file
with its character set — LC_CTYPE is only an approximation.

But hey, thank $DEITY we solved that in the 21st century by just
standardising on UTF-8 and letting the labelling issue be a thing of
the distant past. :)

Few people fully comprehend the true horror of the "simple" system they
try to cling to...

-- 
dwmw2



smime.p7s
Description: S/MIME cryptographic signature
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-19 Thread David Woodhouse
On Fri, 2016-03-18 at 17:53 +0100, Laszlo Ersek wrote:
> 
> (1) The commit message uses at least one non-ASCII character, the EM
> DASH (U+2014). Can you please replace it with a "--"? Yes, I know you
> hate me for asking this. Please just write me off as stupid and
> replace the character.

There's another one in commit 5121a7646 btw :)

Or should I say ☺

-- 
dwmw2



smime.p7s
Description: S/MIME cryptographic signature
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-19 Thread Laszlo Ersek
On 03/19/16 02:15, David Woodhouse wrote:

> So we treat it as an opaque sequence of bytes on the way *in*, then
> make assumptions on the way *out* about what it was?

On the way in, it is assumed to be UTF-8, unless the user says
otherwise. If the user says otherwise (in i18n.commitencoding), that
statement is captured in the commit object. Either way, the commit
message is not converted.

On the way out, the commit message is always converted. From: what the
commit object says; default is UTF-8, or else what the user stuck in
there. To: UTF-8 by default, or what the user specified in
i18n.logOutputEncoding. So normally there is a transparent UTF-8 -->
UTF-8 conversion on output.

> Which is just one of the set of classic "oops, I dropped the label"
> bugs which rendered the legacy charsets unworkable, but this time it
> almost seems to be *deliberate*.
> 
> The logic behind not re-coding is silly. Because throwing away the
> charset label on the input text and assuming it's already in
> i18n.commitencoding definitely *isn't* a reversible operation :)

Right; if git had considered my LC_CTYPE locale category setting, and
had automatically converted between it and its internal UTF-8
representation, I would have never looked at "i18n.*". :(

Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-18 Thread David Woodhouse
On Fri, 2016-03-18 at 22:53 +0100, Laszlo Ersek wrote:
> It happens to display Michał's name correctly, because it fits in latin2.

Ah, OK. You got lucky on that one. Lots of names *don't* fit in
ISO8859-2.

> The extreme lengths that I had to go to were necessary to convince
> git-send-email not to mess up Michał's CC in the email headers, picking
> it up from the commit message. The commit message was all right, but the
> email header got mis-encoded (it's a git bug; it thought the CC line was
> UTF-8, despite knowing that the full commit message was latin2).

I'd be interested in a reference to the upstream bug report for that
one.

It's going to be caused by the use of 'i18n.commitencoding=latin2'.

Hm... did you realise that when you use that setup and you commit and
push directly, that means you are putting latin2-encoded objects into
the upstream EDK2 git repository?

I'm surprised that even works without corrupting things for everyone —
it looks like your commits actually carry an explicit label marking
them as latin2.

However, I really do think that's best avoided.

The lesson we learned in the 20th century was that labelling character
sets just doesn't work very well. The labels fall off, assumptions get
made and things interpreted in the "default" character set. So your
non-standard latin2 commits *are* at some point going to be wrongly
interpreted as UTF-8. In fact, wait a minute... you just *gave* me an
example of that happening :)

The labelling problem hasn't gone away entirely in the 21st century —
everyone using a modern setup still ought to be explicitly labelling
their output as UTF-8, and correctly converting from legacy on the way
in.

But in a system where everything is UTF-8 all of the time, mislabelling
errors are completely harmless. Errors like the one above — while they
might happen — don't actually cause a problem. Something was UTF-8 and
you mistakenly interpreted it as UTF-8 because you misplaced its label?
Oh well, nobody noticed :)

So yes, I'm interested in the bug because it should be fixed. But
basically, you brought it upon yourself by operating in a mode that is
*known* to invite such errors, and was abandoned by most other people a
*long* time ago. And you're pushing that choice into the EDK2 history
so that others are exposed to the same class of bugs. :(

> emdash doesn't display correctly for me indeed, in the output of "git
> log". It's not a big annoyance, but if I am to apply a patch, I try to
> prevent it.
> 
> > 
> > I strongly suspect the latter, and that you only noticed because you
> > were looking closely at the encoding because of that Evolution bug?
> No. I tend to notice glyphs by the naked eye that are not ascii / latin1
> / latin2. Here's another example:
> 
> http://thread.gmane.org/gmane.comp.bios.edk2.devel/8563/focus=8566

I'll see that one and raise you a 'There is nothing you can put in the
man page source to make it output the correct dash on all devices' :)

https://bugzilla.redhat.com/show_bug.cgi?id=1173619


> Okay. I've googled the use of emdash in the English language, and it
> seems to be more or less interchangeable with parens. Is that okay?

In some usages, yes — but probably not this example.

Two dashes is often considered an acceptable rendering of the emdash,
if you *really* must... but in this case since it isn't actually
necessary, I'd very much prefer that you leave it as it is.

-- 
dwmw2

smime.p7s
Description: S/MIME cryptographic signature
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-18 Thread David Woodhouse
On Sat, 2016-03-19 at 01:55 +0100, Laszlo Ersek wrote:
> 
> Okay, here's what I'll do. I will switch i18n.commitencoding back to
> UTF-8. And, I will add a commit-msg hook that converts the commit
> message in-place from latin2 to UTF-8, with "iconv". That should keep
> us both happy. Deal?

That sounds like a good solution; thanks.

I think it's arguably a git bug that it's necessary. Git ought to
honour the locale settings when dealing with user input. If it can
convert on *reading* from repository files, why wouldn't it convert on
*writing* them?

The 'git commit' man page says both that

       ·   The commit log messages are uninterpreted sequences of non-NUL
   bytes.

and also that

        2. git log, git show, git blame and friends look at the encoding
   header of a commit object, and try to re-code the log message into
   UTF-8 unless otherwise specified. You can specify the desired
   output encoding with i18n.logoutputencoding ...

... which seems self-contradictory. It goes on to say:

       Note that we deliberately chose not to re-code the commit log message
   when a commit is made to force UTF-8 at the commit object level,
   because re-coding to UTF-8 is not necessarily a reversible operation.

So we treat it as an opaque sequence of bytes on the way *in*, then
make assumptions on the way *out* about what it was?

Which is just one of the set of classic "oops, I dropped the label"
bugs which rendered the legacy charsets unworkable, but this time it
almost seems to be *deliberate*.

The logic behind not re-coding is silly. Because throwing away the
charset label on the input text and assuming it's already in
i18n.commitencoding definitely *isn't* a reversible operation :)

-- 
dwmw2



smime.p7s
Description: S/MIME cryptographic signature
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-18 Thread Laszlo Ersek
On 03/18/16 22:13, David Woodhouse wrote:

> Rewriting it for this reason is *not* acceptable.

BTW I have no clue why not, if I keep your S-o-b in the first place, and
document the changes. In other projects maintainers rewrite
contributors' patches even, and (at best) credit the contributor with a
tag. I think it's entirely in a maintainer's jurisdiction to adapt the
commit message as he sees fit.

It may be idiosyncratic, and you can yell at me for it; but all that
will do is convince me to avert my eyes when you post an OvmfPkg patch.

Whenever you contribute to a project, do you always start with making a
huge noise, calling everyone around (or their rules) insane, "makes no
sense at all", and so on? The stuff we've been operating by thus far
have been mostly okay. I have updated a few commit messages as well, and
haven't received complaints. The operational changes you are
force-feeding the project (or the rate thereof) seem to surpass all the
changes it has undergone since I started participating.

CRLF has been working out for most people. Rebasing patches has been
working out. Staying away from non-ASCII in commit messages ditto.
Keeping full context when quoting patches ditto. You want edk2 to change
its ways in all aspects at once. And since I seem to be the guy who
talks to you the most, I'm the one you yell at the most. I've never
experienced this before. I'm stunned.

Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-18 Thread Laszlo Ersek
On 03/18/16 23:52, David Woodhouse wrote:
> On Fri, 2016-03-18 at 22:53 +0100, Laszlo Ersek wrote:

>> The extreme lengths that I had to go to were necessary to convince
>> git-send-email not to mess up Michał's CC in the email headers, picking
>> it up from the commit message. The commit message was all right, but the
>> email header got mis-encoded (it's a git bug; it thought the CC line was
>> UTF-8, despite knowing that the full commit message was latin2).
> 
> I'd be interested in a reference to the upstream bug report for that
> one.

I didn't report it. First, I expected to be beat down with "just use
UTF-8". Second, I'm using a fairly old git release
(git-1.8.3.1-6.el7.x86_64, on RHEL-7.2), and the most recent git release
might not have the bug.

> It's going to be caused by the use of 'i18n.commitencoding=latin2'.
> 
> Hm... did you realise that when you use that setup and you commit and
> push directly, that means you are putting latin2-encoded objects into
> the upstream EDK2 git repository?

Yes, I realized it.

Latin2 text can be cleanly transcoded to UTF-8 (which is how most people
will read it), i.e., from i18n.commitEncoding to i18n.logOutputEncoding.

> I'm surprised that even works without corrupting things for everyone —
> it looks like your commits actually carry an explicit label marking
> them as latin2.

Yes. I had read the discussion in the git-config and git-log manuals,
and saw it was safe.

> So yes, I'm interested in the bug because it should be fixed. But
> basically, you brought it upon yourself by operating in a mode that is
> *known* to invite such errors, and was abandoned by most other people a
> *long* time ago.

(Hm, didn't I just predict exactly this argument above?...)

> And you're pushing that choice into the EDK2 history
> so that others are exposed to the same class of bugs. :(

The facility I'm using is a documented feature of git; git will
transcode Michał's name to UTF-8 for others (or whatever they have in
logOutputEncoding).

Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-18 Thread David Woodhouse
On Fri, 2016-03-18 at 23:26 +0100, Laszlo Ersek wrote:
> 
> Whenever you contribute to a project, do you always start with making a
> huge noise, calling everyone around (or their rules) insane, "makes no
> sense at all", and so on?

Not at all. Some weeks I'll work on as many as a dozen or so projects.
This week it's been EDK2, OpenSSL, Linux, MIT krb5, GSS-NTLMSSP, git,
NSS, and two or three GNOME projects. And possibly others; I distinctly
remember there being some python but I tend to blot out those memories
as quickly as I can.

Most projects are relatively easy to work with. If they have coding
styles that they adhere to, then they're obvious in the code you're
working on — and in the ideal case, encoded in the source files in a
form which emacs can automatically understand.

I can cope with 'this project suffers from supporting Microsoft tools
and thus we can't even use C99' — I've seen that enough times.

But what we have here is a particularly special combination — the
insistence on supporting crappy Microsoft tools with false warnings
about uninitialised variables, in conjunction with a rule that "we
don't initialise variables on declaration". That *combination* just
blows the mind, surely?

I have *never* dealt with projects that seem to be trying *so* hard to
do the wrong thing and make life hard, in so many ways. To the point
where one has to go and look up the special project-specific
instructions just to do something as trivial as applying a patch from
the mailing list.

And sure, you do that every day so you don't find it problematic. You
even *wrote* those instructions. But that's doesn't mean it's OK.
That's just Stockholm syndrome.

What's really frustrating is that that was a known issue, and it's
something that *could* have been handled before the final switch to
git. I had even worked on doing the Subversion → git conversion such
that the CRLF thing was fixed in all commits as they came across.

But it didn't get done, and now we still have the same issues for no
good reason — and a longer period of transition(s) before we finally
get to where we need to be.

The same goes for the half-measure of using git as if it was
subversion. Again it means *another* change before we get there.

So yes. I'm sorry. I get frustrated.

> CRLF has been working out for most people. Rebasing patches has been
> working out. Staying away from non-ASCII in commit messages ditto.
> Keeping full context when quoting patches ditto. You want edk2 to
> change its ways in all aspects at once. And since I seem to be the
> guy who talks to you the most, I'm the one you yell at the most. I've
> never experienced this before. I'm stunned.

CRLF works for *you* because you're doing it all the time, and can
remember all the magic you have to do to work around it. Throwing away
history works until you actually find you need to refer back to it. You
*aren't* staying away from non-ASCII in commit messages; your
configuration has led to pushing esoteric legacy charsets into the EDK2
git objects themselves, inviting all the labelling problems that we
used to have in the 20th century before we all started using UTF-8. And
you've shown an example of that happening already. The patch citation
you're picking on was clearly showing the i'f(x) {set} … if(x) {use}'
pattern which was triggering the warning, in a way that wouldn't have
been readable (I might as well have elided it completely) if I'd left
the full thing. Although you claim it made it hard to find the code in
question, the bit that said 'line 275' ought to have given it away.

And sure, I'm a whole lot better at dealing with computers than I am
with people, and I tend not to be very good at hiding my frustration.
I'm sorry if you interpret that as 'yelling', especially 'at you'. I
respect you greatly, and I keep telling you how much I appreciate the
help that you offer to me and to others.

It's not that I "want edk2 to change its way in all aspects at once". 

All I wanted to do was due diligence on the OpenSSL 1.1 support that
I've been working on.

I wanted to pull in the HTTPS patches and actually get that working,
but it's just too painful. Sure, I probably *could* have done it, and
thanks again for giving me yet another magic incantation to work around
yet another aspect of the brokenness I had tried to avoid by pointing
it out before we set it in stone. But the moment has passed — the
additional time it would require has pushed it to "Someone Else's
Problem" status. I've thrown the 'hey, we can build the ssl/ directory
too' patch into the PR I submitted this morning, and that'll suffice.

But I at *least* wanted to test what I was doing on Windows. So I went
to the trouble of firing up a Windows VM and dealing with it (which
probably hasn't helped my frustration level right from the start, to be
honest), and found that the build had apparently been broken for 2
weeks without anyone noticing. Which makes me wonder why I'm
bothering... but OK, I'll not only 

Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-18 Thread Laszlo Ersek
On 03/18/16 19:40, David Woodhouse wrote:

> But this is different. This is the commit messages. And what would you
> know... the last commit message in the log which isn't ASCII *isn't*
> that other one I pointed out; it's one from you (7daf2401) in which you
> commit the heinous crime of slepping Michał Zegan's name correctly :)

I went to extreme lengths to commit his name correctly. It wasn't at all
natural for me. But names are names.

> No, I genuinely don't know of any reason to eschew non-ASCII characters
> in commit messages.

Because they are a PITA for people who don't use UTF-8 generally? (I don't.)

People still write RFC's today, and I think those are all pure ASCII as
well.

Anyway, we won't convince each other; we've been through this. Feel free
to post non-ASCII in the commit messages; should I come across them,
I'll try to fix them up (except in names).

Thanks
Laszlo
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel


Re: [edk2] [PATCH v2] OvmfPkg/PlatformPei: Fix VS2008 build breakage

2016-03-18 Thread David Woodhouse
On Sat, 2016-03-19 at 01:03 +0100, Laszlo Ersek wrote:
> > So yes, I'm interested in the bug because it should be fixed. But
> > basically, you brought it upon yourself by operating in a mode that is
> > *known* to invite such errors, and was abandoned by most other people a
> > *long* time ago.
> 
> (Hm, didn't I just predict exactly this argument above?...)

Yes, and for good reason.

> > And you're pushing that choice into the EDK2 history
> > so that others are exposed to the same class of bugs. :(
> 
> The facility I'm using is a documented feature of git; git will
> transcode Michał's name to UTF-8 for others (or whatever they have in
> logOutputEncoding).

Except when it predictably breaks, as you demonstrated.

As I said, we knew *years* ago that charset labelling doesn't work
reliably, and the only good answer is to be consistent so that it
doesn't matter (which basically means UTF-8. Don't even get me started
on UTF-16.)

And you've intentionally pushed an inconsistent set of character sets
to the EDK2 repository, you've *seen* the git tools fail to deal with
those labels properly, you've *predicted* that you'd be told "that's
what you get for mixing charsets"... and yet you're still absolutely
happy that you're pushing this to the EDK2 repository for everyone to
enjoy the potential bugs?

I'm still not yelling. But I'm not smiling either.

I really do think that if we can have a policy which eschews merge
commits, we should *definitely* have a policy which eschews non-UTF8
objects in the tree.

Isn't that i18n.commitencoding=latin2 config on your part somewhat
gratuitous? The idea is that the tools are converting everything on the
way in and out, and that *only* affects the commit objects that are
actually encoded in the repository, right? So letting that be UTF-8 to
match the default shouldn't even be *visible* to you, should it? Except
that you probably wouldn't have suffered that git-send-email bug you
mentioned?

-- 
dwmw2



smime.p7s
Description: S/MIME cryptographic signature
___
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel