Re: [Python-Dev] hgeol extension (Was: Mercurial migration: help needed)

2009-09-06 Thread Stephen J. Turnbull
Martin v. Löwis writes:

  This is what I refer to as YAGNI. Subversion has LF as the internal
  storage, and, IIRC, so does CVS. I don't think there is any precedent
  for wanting something else - and frankly, I can't see how repository
  storage would matter.

Well, internally you could use U+2028 LINE SEPARATOR, which would
screw up *everybody* if they don't use the converter, since there are
probably very few editors that understand U+2028.  I've heard that
this is what Samba did when converting to Unicode: instead of using
UTF-8 they used UTF-16 so that English would be at least as buggy as
any other language.

Maybe there's somebody who was participating in Samba at that time who
knows?
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] hgeol extension (Was: Mercurial migration: help needed)

2009-09-05 Thread Martin v. Löwis
 Can anyone (re-) post the specification of the proposed extension, to
 the level that it is currently defined?

For reference, here are the original specifications, mine and Martin
Geisler's:

http://mail.python.org/pipermail/python-dev/2009-August/090984.html
http://mail.python.org/pipermail/python-dev/2009-August/091453.html

Here is my attempt at summarizing it:

- name of versioned configuration file (in root of tree): .hgeol
- names of conversion modes: native, LF, CRLF
In the configuration file, there is a section [patterns] which
maps file name patterns to conversion modes, e.g.

[patterns]
**.txt = native
**.py = native
**.dsp = CRLF
**.bat = CRLF
Tools/bgen/README = native
Lib/email/test/data/msg_26.txt = CRLF

- Martin Geisler also proposes that there is a section
[repository]
native = conversionmode
I personally feel YAGNI; it should only support LF (adding such
a feature later may be considered)
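
To make the summary above concrete, here is a minimal Python sketch of
how a .hgeol file of this shape could be read and queried. It is not the
extension's implementation: the names load_patterns and mode_for are
hypothetical, fnmatch is only an approximation of Mercurial's glob
syntax, and letting the last matching pattern win is an assumption (the
proposal does not say how overlapping patterns are resolved).

```python
import configparser
from fnmatch import fnmatchcase

# The example [patterns] section from the summary above.
SAMPLE = """\
[patterns]
**.txt = native
**.py = native
**.dsp = CRLF
**.bat = CRLF
Tools/bgen/README = native
Lib/email/test/data/msg_26.txt = CRLF
"""


def load_patterns(text):
    """Return (pattern, mode) pairs from [patterns], in file order."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep file names case-sensitive
    parser.read_string(text)
    return list(parser.items("patterns"))


def mode_for(path, patterns):
    """Return the conversion mode for path, or None if nothing matches.

    Assumption: the last matching pattern wins, so a specific file name
    listed after a general glob overrides it.
    """
    mode = None
    for pattern, candidate in patterns:
        # fnmatch's "*" also matches "/", which roughly mimics "**".
        if fnmatchcase(path, pattern):
            mode = candidate
    return mode


patterns = load_patterns(SAMPLE)
print(mode_for("Lib/email/test/data/msg_26.txt", patterns))
print(mode_for("Doc/README.txt", patterns))
```

With the last-match rule, msg_26.txt resolves to CRLF even though it
also matches **.txt; a first-match rule would require listing the
specific file first.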

Open issues:
- name of extension
- what should happen if the file on disk doesn't have the expected
line endings, or mixed line endings? E.g. a file declared as native
should have CRLF on Windows - what if it doesn't, on commit?
My proposal: do what svn does (whatever that is).
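
Whatever policy is chosen, the extension first has to notice the
problem. A minimal sketch of such a check in Python follows; the helper
names eol_styles and check_eol are hypothetical, and the policy shown
(report anything other than the expected style) is only one of the
options under discussion, not a description of what svn does.

```python
def eol_styles(data):
    """Return the set of line-ending styles found in a bytes object."""
    crlf = data.count(b"\r\n")
    lone_lf = data.count(b"\n") - crlf   # LF not preceded by CR
    lone_cr = data.count(b"\r") - crlf   # CR not followed by LF
    found = set()
    if crlf:
        found.add("CRLF")
    if lone_lf:
        found.add("LF")
    if lone_cr:
        found.add("CR")
    return found


def check_eol(data, expected):
    """Return None if data uses only `expected` endings, else a message."""
    found = eol_styles(data)
    if found <= {expected}:
        return None
    return "expected %s line endings, found %s" % (
        expected, ", ".join(sorted(found)))


print(check_eol(b"one\r\ntwo\n", "CRLF"))
```

A file with no line endings at all passes for any expected style, since
there is nothing to convert.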

That's it, AFAICT. Martin Geisler also discussed something that
I read as an implementation strategy: mapping the patterns
into the (apparently existing) encode/decode configuration settings.

HTH,
Martin

P.S. If you decide that you will or will not work on it, please let
us know.


Re: [Python-Dev] hgeol extension (Was: Mercurial migration: help needed)

2009-09-05 Thread Brett Cannon
On Sat, Sep 5, 2009 at 07:18, Martin v. Löwis <mar...@v.loewis.de> wrote:
 Can anyone (re-) post the specification of the proposed extension, to
 the level that it is currently defined?

 For reference, here are the original specification, mine and Martin
 Geisler's:

 http://mail.python.org/pipermail/python-dev/2009-August/090984.html
 http://mail.python.org/pipermail/python-dev/2009-August/091453.html

 Here is my attempt at summarizing it:

 - name of versioned configuration file (in root of tree): .hgeol
 - names of conversion modes: native, LF, CRLF
 In the configuration file, there is a section [patterns] which
 maps file name patterns to conversion modes, e.g.

 [patterns]
 **.txt = native
 **.py = native
 **.dsp = CRLF
 **.bat = CRLF
 Tools/bgen/README = native
 Lib/email/test/data/msg_26.txt = CRLF

 - Martin Geisler also proposes that there is a section
 [repository]
 native = conversionmode
 I personally feel YAGNI; it should only support LF (adding such
 a feature later may be considered)

Do you mean what native is in the repo or what it should be considered
on the user's machine? If it's the former then I actually like it as
it means a clone doesn't need to do anything special when 'native'
matches what is expected in the repo while a commit still does its EOL
validation. I still think we need to have a server-side block which
rejects commits that mess up the line endings so people can fix
them. It shouldn't mess up 'blame', as the messed-up line endings should
only be from their edits. Otherwise it's just like when Tim used to
run reindent.py over the entire repo on occasion.

And as mentioned in another email by Paul, it would be nice to let the
user specify what they want 'native' to be on their machine if they
happen to be a Windows user who prefers LF.


 Open issues:
 - name of extension

StupidLineEndings =)

 - what should happen if the file on disk doesn't have the expected
 line endings, or mixed line endings? E.g. a file declared as native
 should have CRLF on Windows - what if it doesn't, on commit?
 My proposal: do what svn does (whatever that is).

Or refuse the commit with a message and tell the user to fix it (if
svn doesn't happen to do that).

-Brett


Re: [Python-Dev] hgeol extension (Was: Mercurial migration: help needed)

2009-09-05 Thread Martin v. Löwis
 - Martin Geisler also proposes that there is a section
 [repository]
 native = conversionmode
 I personally feel YAGNI; it should only support LF (adding such
 a feature later may be considered)
 
 Do you mean what native is in the repo or what it should be considered
 on the user's machine? 

The former.

 If it's the former then I actually like it as
 it means a clone doesn't need to do anything special when 'native'
 matches what is expected in the repo while a commit still does its EOL
 validation.

But the same would be true if the repo format would be always LF:
when native matches (which would then be on Unix), the extension
would *still* have to do nothing but validation.
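
A small sketch of that argument in Python, assuming the repository
always stores files declared native with LF. The helper names
checkout_native and commit_native are hypothetical, and this is not the
extension's code; it only shows that the conversion degenerates to the
identity when the platform's native ending is LF.

```python
import os

# Platform default for 'native'; an assumption about how it is chosen.
OS_NATIVE = "CRLF" if os.linesep == "\r\n" else "LF"


def commit_native(data):
    """Working copy -> repository: store LF only."""
    return data.replace(b"\r\n", b"\n")


def checkout_native(data, native=OS_NATIVE):
    """Repository (LF) -> working copy in the given native style."""
    if native == "LF":
        return data  # nothing to convert; only validation is left
    return data.replace(b"\n", b"\r\n")


repo_bytes = b"line one\nline two\n"
print(checkout_native(repo_bytes, "LF") == repo_bytes)
print(checkout_native(repo_bytes, "CRLF"))
```

On a CRLF platform the round trip commit_native(checkout_native(x,
"CRLF")) gives back x for clean LF input, so the repository contents do
not depend on which platform did the commit.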

 I still think we need to have a server-side block which
 rejects commits that messes up the line-endings so people can fix
 them.

Certainly.

 Shouldn't mess up 'blame' as the messed up line-endings should
 only be from their edits.

It could be that they had a number of commits that eventually led
to the version that they push; this will also push the intermediate
versions. So when you then do a blame, it will tell you that the
revision was logged as "fix whitespace" rather than "resolve issue
#9743".

You are mostly right that the committer name would be the same
(except when the committer was pushing some changes pulled from
the actual contributor), however, I still see these whitespace-only
changes as a complication.

Regards,
Martin


Re: [Python-Dev] hgeol extension (Was: Mercurial migration: help needed)

2009-09-05 Thread Brett Cannon
On Sat, Sep 5, 2009 at 15:06, Martin v. Löwis <mar...@v.loewis.de> wrote:
 - Martin Geisler also proposes that there is a section
 [repository]
 native = conversionmode
 I personally feel YAGNI; it should only support LF (adding such
 a feature later may be considered)

 Do you mean what native is in the repo or what it should be considered
 on the user's machine?

 The former.

 If it's the former then I actually like it as
 it means a clone doesn't need to do anything special when 'native'
 matches what is expected in the repo while a commit still does its EOL
 validation.

 But the same would be true if the repo format would be always LF:
 when native matches (which would then be on Unix), the extension
 would *still* have to do nothing but validation.

Right, but I am just thinking about how we specify in .hgeol what the
repository is expected to be, as this extension might work out nicely
for other projects that prefer CRLF as their repo-native line ending.


 I still think we need to have a server-side block which
 rejects commits that messes up the line-endings so people can fix
 them.

 Certainly.

 Shouldn't mess up 'blame' as the messed up line-endings should
 only be from their edits.

 It could be that they had a number of commits that eventually lead
 to the version that they push; this will also push the intermediate
 versions. So when you then do a blame, it will tell you that the
 revision was logged as fix whitespace, rather than resolve issue
 #9743.


Yep.

 You are mostly right that the committer name would be the same
 (except when the committer was pushing some changes pulled from
 the actual contributor), however, I still see these whitespace-only
 changes as a complication.

It's unfortunate, but I see it as a rare occurrence as it would only
happen if someone got sloppy. And it should typically get caught
client-side before the commit ever occurs, minimizing the
whitespace-only commits even more.

-Brett


Re: [Python-Dev] hgeol extension

2009-09-05 Thread Martin Geisler
Martin v. Löwis <mar...@v.loewis.de> writes:

 Can anyone (re-) post the specification of the proposed extension, to
 the level that it is currently defined?

 For reference, here are the original specification, mine and Martin
 Geisler's:

 http://mail.python.org/pipermail/python-dev/2009-August/090984.html
 http://mail.python.org/pipermail/python-dev/2009-August/091453.html

 Here is my attempt at summarizing it:

 - name of versioned configuration file (in root of tree): .hgeol
 - names of conversion modes: native, LF, CRLF
 In the configuration file, there is a section [patterns] which
 maps file name patterns to conversion modes, e.g.

 [patterns]
 **.txt = native
 **.py = native
 **.dsp = CRLF
 **.bat = CRLF
 Tools/bgen/README = native
 Lib/email/test/data/msg_26.txt = CRLF

 - Martin Geisler also proposes that there is a section
 [repository]
 native = conversionmode
 I personally feel YAGNI; it should only support LF (adding such
 a feature later may be considered)

I don't think it's a good idea to store everything in LF in the
repository. Unlike with Subversion, you cannot expect all interactions to
take place through the eol-filter we're implementing. Letting people
check out a useful unfiltered clone would be possible if we know the
repository native format and convert back to that.

Anyway, it's a minor detail. More importantly, I've posted a simple,
rough extension that does this here:

  http://markmail.org/message/yj4so736t4cfdulv

I figured it would be better to discuss the design and implementation on
mercurial-devel since there are more Mercurial hackers there. I've CC'ed
a bunch of people from this thread to seed the discussion -- the rest
of you on python-dev are hereby invited to join :-)

  http://selenic.com/mailman/listinfo/mercurial-devel

-- 
Martin Geisler

VIFF (Virtual Ideal Functionality Framework) brings easy and efficient
SMPC (Secure Multiparty Computation) to Python. See: http://viff.dk/.

