Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
Jeff King writes:

> One thing I almost did in the example I gave above was to literally call
> the encoding name by a "real" one. I.e.:
>
>   echo '*.txt working-tree-encoding=iso-8859-1' >.gitattributes
>   git config encoding.iso-8859-1.replace latin1
>
> or something. But I wondered if it was a little crazy as a practice,
> since mapping "iso-8859-1" to "utf-8" is probably going to lead to
> headaches.
>
> But your example above of semantically equivalent variants with
> different spellings would be a good use of that trick.

Yeah, I think the above looks quite sensible.
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
On Mon, Jul 02, 2018 at 04:09:32PM +0200, Lars Schneider wrote:

> Brian had a good argument [1] for an even more flexible system
> proposed by Peff:
>
> 1) We allow users to define custom encoding mappings in their Git config.
>    Example:
>
>      git config --global core.encoding.myenc UTF-16

I think this should be encoding.myenc.something. In Git's config format,
only the subsection names (the middle of a three-dot name) are
unconstrained.

So even if encoding.myenc only ever has one key ("replace" or
"useInstead" or whatever you want to call it), there's value in
organizing the namespace that way. And as a bonus, it leaves room for
extending the feature later if we do need more keys.

-Peff
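A minimal sketch of the namespace point above, assuming a `git` binary is on the PATH. The `encoding.<name>.replace` key is only the hypothetical name floated in this thread, not an existing Git setting; the sketch just shows that Git's config parser accepts an iconv-style name as a subsection but rejects it as the final key component.

```shell
# Scratch config file so that no real repository or user config is touched.
cfg=$(mktemp)

# Subsection form: the middle component may contain "_" and ":".
# "replace" is the hypothetical key name discussed in this thread.
git config -f "$cfg" 'encoding.ISO_8859-1:1987.replace' latin1
mapped=$(git config -f "$cfg" --get 'encoding.ISO_8859-1:1987.replace')

# Key form: the last component may only contain alphanumerics and "-",
# so the same name is refused with an "invalid key" error.
if git config -f "$cfg" 'core.encoding.ISO_8859-1:1987' latin1 2>/dev/null; then
  key_form=accepted
else
  key_form=rejected
fi

echo "subsection form maps to: $mapped"
echo "key form: $key_form"
rm -f "$cfg"
```

A simple alias such as "myenc" would pass either way; the difference only shows up once the alias looks like a real charset name.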
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
On Sun, Jul 01, 2018 at 05:56:58PM +, brian m. carlson wrote:

> On Thu, Jun 28, 2018 at 01:27:07PM -0400, Jeff King wrote:
> > Yeah, that was along the lines that I was thinking. I wonder if anybody
> > would ever need two such auto-encodings, though. Probably not. But
> > another way to think about it would be to allow something like:
> >
> >   working-tree-encoding=foo
> >
> > and then in your config "foo" to map to some encoding.
> >
> > But that may be over-engineering, I dunno. utf8 has always been enough
> > for me. :)
>
> I had a thought the other day about why this solution might be valuable.
> Different platforms encode different values for iconv character sets.
> So, for example, one may have platforms supporting some disjoint sets of
> the following:
>
>   * LATIN-1
>   * LATIN1
>   * ISO8859-1
>   * ISO-8859-1
>   * ISO_8859-1
>   * ISO_8859-1:1987
>   * some lowercase variants of these
>
> Therefore, specifying a working-tree-encoding value that works across a
> wide variety of systems may be non-trivial. This is less of a problem
> with UTF-8, but having the ability to pick an encoding and remap it to a
> supported value may be useful nevertheless.

One thing I almost did in the example I gave above was to literally call
the encoding name by a "real" one. I.e.:

  echo '*.txt working-tree-encoding=iso-8859-1' >.gitattributes
  git config encoding.iso-8859-1.replace latin1

or something. But I wondered if it was a little crazy as a practice,
since mapping "iso-8859-1" to "utf-8" is probably going to lead to
headaches.

But your example above of semantically equivalent variants with
different spellings would be a good use of that trick.

It also makes me wonder if there's another layer of indirection
somewhere in the iconv machinery we could be taking advantage of to
accomplish the same thing. Probably not conveniently or portably, I
guess.

-Peff
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
> -Lars Schneider wrote: -
> To: Jeff King
> From: Lars Schneider
> Date: 06/28/2018 18:21
> Cc: "brian m. carlson", Steve Groeger, git@vger.kernel.org
> Subject: Re: Use of new .gitattributes working-tree-encoding attribute across
> different platform types
>
>> On Jun 28, 2018, at 4:34 PM, Jeff King wrote:
>>
>> On Thu, Jun 28, 2018 at 02:44:47AM +, brian m. carlson wrote:
>>
>>> On Wed, Jun 27, 2018 at 07:54:52AM +, Steve Groeger wrote:
>>>> We have common code that is supposed to be usable across different
>>>> platforms and hence different file encodings. With the full support of the
>>>> working-tree-encoding in the latest version of git on all platforms, how
>>>> do we have files converted to different encodings on different platforms?
>>>> I could not find anything that would allow us to say 'if platform = z/OS
>>>> then encoding=EBCDIC else encoding=ASCII'. Is there a way this can be
>>>> done?
>>>
>>> I don't believe there is such functionality. Git doesn't have
>>> attributes that are conditional on the platform in that sort of way.
>>> You could use a smudge/clean filter and adjust the filter for the
>>> platform you're on, which might meet your needs.
>>
>> We do have prior art in the line-ending code, though. There the
>> attributes say either that a file needs a specific line-ending type
>> (which is relatively rare), or that it should follow the system type,
>> which is then set separately in the config.
>>
>> I have the impression that the working-tree-encoding stuff was made to
>> handle the first case, but not the second. It doesn't seem like an
>> outrageous thing to eventually add.
>>
>> (Though I agree that clean/smudge filters would work, and can even
>> implement the existing working-tree-encoding feature, albeit less
>> efficiently and conveniently).
>
> Thanks for the suggestion Peff!
> How about this:
>
> 1) We allow users to set the encoding "auto". Example:
>
>      *.txt working-tree-encoding=auto
>
> 2) We define a new variable `core.autoencoding`. By default the value is
>    UTF-8 (== no re-encoding) but user can set to any value in their Git config.
>    Example:
>
>      git config --global core.autoencoding UTF-16
>
> All files marked with the value "auto" will use the encoding defined in
> `core.autoencoding`.
>
> Would that work?
>
> @steve: Would that fix your problem?

On Jul 2, 2018, at 2:13 PM, Steve Groeger wrote:
>
> I think this proposed solution may resolve my issue.

Thanks for the confirmation!

Brian had a good argument [1] for an even more flexible system
proposed by Peff:

1) We allow users to define custom encoding mappings in their Git config.
   Example:

     git config --global core.encoding.myenc UTF-16

2) Users can reuse these mappings in their .gitattributes files:

     *.txt working-tree-encoding=myenc

Does this idea look good to everyone?

Thanks,
Lars

[1] https://public-inbox.org/git/20180701175657.gc7...@genre.crustytoothpaste.net/
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
Lars,

I think this proposed solution may resolve my issue.

Thanks
Steve Groeger
Java Runtimes Development
IBM Hursley
IBM United Kingdom Ltd
Tel: (44) 1962 816911 Mobex: 279990 Mobile: 07718 517 129
Fax (44) 1962 816800
Lotus Notes: Steve Groeger/UK/IBM
Internet: groe...@uk.ibm.com

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU

-Lars Schneider wrote: -
To: Jeff King
From: Lars Schneider
Date: 06/28/2018 18:21
Cc: "brian m. carlson", Steve Groeger, git@vger.kernel.org
Subject: Re: Use of new .gitattributes working-tree-encoding attribute across different platform types

> On Jun 28, 2018, at 4:34 PM, Jeff King wrote:
>
> On Thu, Jun 28, 2018 at 02:44:47AM +, brian m. carlson wrote:
>
>> On Wed, Jun 27, 2018 at 07:54:52AM +, Steve Groeger wrote:
>>> We have common code that is supposed to be usable across different
>>> platforms and hence different file encodings. With the full support of the
>>> working-tree-encoding in the latest version of git on all platforms, how do
>>> we have files converted to different encodings on different platforms?
>>> I could not find anything that would allow us to say 'if platform = z/OS
>>> then encoding=EBCDIC else encoding=ASCII'. Is there a way this can be
>>> done?
>>
>> I don't believe there is such functionality. Git doesn't have
>> attributes that are conditional on the platform in that sort of way.
>> You could use a smudge/clean filter and adjust the filter for the
>> platform you're on, which might meet your needs.
>
> We do have prior art in the line-ending code, though. There the
> attributes say either that a file needs a specific line-ending type
> (which is relatively rare), or that it should follow the system type,
> which is then set separately in the config.
>
> I have the impression that the working-tree-encoding stuff was made to
> handle the first case, but not the second. It doesn't seem like an
> outrageous thing to eventually add.
>
> (Though I agree that clean/smudge filters would work, and can even
> implement the existing working-tree-encoding feature, albeit less
> efficiently and conveniently).

Thanks for the suggestion Peff!
How about this:

1) We allow users to set the encoding "auto". Example:

     *.txt working-tree-encoding=auto

2) We define a new variable `core.autoencoding`. By default the value is
   UTF-8 (== no re-encoding) but user can set to any value in their Git config.
   Example:

     git config --global core.autoencoding UTF-16

All files marked with the value "auto" will use the encoding defined in
`core.autoencoding`.

Would that work?

@steve: Would that fix your problem?

- Lars
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
On Thu, Jun 28, 2018 at 01:27:07PM -0400, Jeff King wrote:

> Yeah, that was along the lines that I was thinking. I wonder if anybody
> would ever need two such auto-encodings, though. Probably not. But
> another way to think about it would be to allow something like:
>
>   working-tree-encoding=foo
>
> and then in your config "foo" to map to some encoding.
>
> But that may be over-engineering, I dunno. utf8 has always been enough
> for me. :)

I had a thought the other day about why this solution might be valuable.
Different platforms encode different values for iconv character sets.
So, for example, one may have platforms supporting some disjoint sets of
the following:

  * LATIN-1
  * LATIN1
  * ISO8859-1
  * ISO-8859-1
  * ISO_8859-1
  * ISO_8859-1:1987
  * some lowercase variants of these

Therefore, specifying a working-tree-encoding value that works across a
wide variety of systems may be non-trivial. This is less of a problem
with UTF-8, but having the ability to pick an encoding and remap it to a
supported value may be useful nevertheless.

--
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
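To make the spelling problem concrete, here is a rough sketch of how one could probe which of the listed names a given machine's iconv accepts. The helper name `first_supported_encoding` is made up for illustration, and the sketch assumes an `iconv` command-line tool is installed; it is not something Git does today.

```shell
# Return the first spelling that the local iconv accepts as a target.
# Probing with an empty conversion avoids parsing "iconv -l", whose
# output format differs between implementations.
first_supported_encoding() {
  for name in "$@"; do
    if printf '' | iconv -f UTF-8 -t "$name" >/dev/null 2>&1; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  return 1
}

# Candidate spellings taken from the list above.
latin1=$(first_supported_encoding LATIN-1 LATIN1 ISO8859-1 ISO-8859-1 \
  ISO_8859-1 ISO_8859-1:1987) || latin1=unsupported
echo "usable Latin-1 spelling: $latin1"
```

The result could then be fed into whatever per-machine remapping ends up being chosen, so that the shared .gitattributes only has to carry one canonical name.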
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
On Thu, Jun 28, 2018 at 07:21:18PM +0200, Lars Schneider wrote:

> How about this:
>
> 1) We allow users to set the encoding "auto". Example:
>
>      *.txt working-tree-encoding=auto
>
> 2) We define a new variable `core.autoencoding`. By default the value is
>    UTF-8 (== no re-encoding) but user can set to any value in their Git config.
>    Example:
>
>      git config --global core.autoencoding UTF-16
>
> All files marked with the value "auto" will use the encoding defined in
> `core.autoencoding`.
>
> Would that work?

Yeah, that was along the lines that I was thinking. I wonder if anybody
would ever need two such auto-encodings, though. Probably not. But
another way to think about it would be to allow something like:

  working-tree-encoding=foo

and then in your config "foo" to map to some encoding.

But that may be over-engineering, I dunno. utf8 has always been enough
for me. :)

-Peff
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
> On Jun 28, 2018, at 4:34 PM, Jeff King wrote:
>
> On Thu, Jun 28, 2018 at 02:44:47AM +, brian m. carlson wrote:
>
>> On Wed, Jun 27, 2018 at 07:54:52AM +, Steve Groeger wrote:
>>> We have common code that is supposed to be usable across different
>>> platforms and hence different file encodings. With the full support of the
>>> working-tree-encoding in the latest version of git on all platforms, how do
>>> we have files converted to different encodings on different platforms?
>>> I could not find anything that would allow us to say 'if platform = z/OS
>>> then encoding=EBCDIC else encoding=ASCII'. Is there a way this can be
>>> done?
>>
>> I don't believe there is such functionality. Git doesn't have
>> attributes that are conditional on the platform in that sort of way.
>> You could use a smudge/clean filter and adjust the filter for the
>> platform you're on, which might meet your needs.
>
> We do have prior art in the line-ending code, though. There the
> attributes say either that a file needs a specific line-ending type
> (which is relatively rare), or that it should follow the system type,
> which is then set separately in the config.
>
> I have the impression that the working-tree-encoding stuff was made to
> handle the first case, but not the second. It doesn't seem like an
> outrageous thing to eventually add.
>
> (Though I agree that clean/smudge filters would work, and can even
> implement the existing working-tree-encoding feature, albeit less
> efficiently and conveniently).

Thanks for the suggestion Peff!
How about this:

1) We allow users to set the encoding "auto". Example:

     *.txt working-tree-encoding=auto

2) We define a new variable `core.autoencoding`. By default the value is
   UTF-8 (== no re-encoding) but user can set to any value in their Git config.
   Example:

     git config --global core.autoencoding UTF-16

All files marked with the value "auto" will use the encoding defined in
`core.autoencoding`.

Would that work?

@steve: Would that fix your problem?

- Lars
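The lookup rule in the proposal above can be sketched as a small shell function. Both the "auto" value and `core.autoencoding` are proposals from this thread rather than existing Git features, and `resolve_encoding` is a made-up helper name; the function only illustrates the intended fallback order.

```shell
# resolve_encoding ATTR_VALUE [AUTOENCODING]
# Hypothetical semantics of the proposal: "auto" defers to the value of
# core.autoencoding, which defaults to UTF-8 (no re-encoding); any other
# attribute value is used as given.
resolve_encoding() {
  attr_value=$1
  autoencoding=${2:-UTF-8}
  if [ "$attr_value" = "auto" ]; then
    printf '%s\n' "$autoencoding"
  else
    printf '%s\n' "$attr_value"
  fi
}

resolve_encoding auto                # core.autoencoding unset
resolve_encoding auto UTF-16         # core.autoencoding set to UTF-16
resolve_encoding ISO-8859-1 UTF-16   # explicit attribute value wins
```

Under these assumptions a z/OS machine would set the config value to its EBCDIC code page while every other platform leaves it unset, and the shared .gitattributes would stay identical everywhere.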
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
On Thu, Jun 28, 2018 at 02:44:47AM +, brian m. carlson wrote:

> On Wed, Jun 27, 2018 at 07:54:52AM +, Steve Groeger wrote:
> > We have common code that is supposed to be usable across different
> > platforms and hence different file encodings. With the full support of the
> > working-tree-encoding in the latest version of git on all platforms, how do
> > we have files converted to different encodings on different platforms?
> > I could not find anything that would allow us to say 'if platform = z/OS
> > then encoding=EBCDIC else encoding=ASCII'. Is there a way this can be
> > done?
>
> I don't believe there is such functionality. Git doesn't have
> attributes that are conditional on the platform in that sort of way.
> You could use a smudge/clean filter and adjust the filter for the
> platform you're on, which might meet your needs.

We do have prior art in the line-ending code, though. There the
attributes say either that a file needs a specific line-ending type
(which is relatively rare), or that it should follow the system type,
which is then set separately in the config.

I have the impression that the working-tree-encoding stuff was made to
handle the first case, but not the second. It doesn't seem like an
outrageous thing to eventually add.

(Though I agree that clean/smudge filters would work, and can even
implement the existing working-tree-encoding feature, albeit less
efficiently and conveniently).

-Peff
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
On Wed, Jun 27, 2018 at 07:54:52AM +, Steve Groeger wrote:

> We have common code that is supposed to be usable across different platforms
> and hence different file encodings. With the full support of the
> working-tree-encoding in the latest version of git on all platforms, how do
> we have files converted to different encodings on different platforms?
> I could not find anything that would allow us to say 'if platform = z/OS then
> encoding=EBCDIC else encoding=ASCII'. Is there a way this can be done?

I don't believe there is such functionality. Git doesn't have
attributes that are conditional on the platform in that sort of way.
You could use a smudge/clean filter and adjust the filter for the
platform you're on, which might meet your needs.

--
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
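One possible shape for the per-platform filter suggested above, as a dry-run sketch that only prints the setup commands. The filter name "platenc", the EBCDIC code page "IBM-1047", and the assumption that `uname -s` reports "OS/390" on z/OS are all illustrative guesses that would need checking on the real systems; `filter.<name>.smudge`, `filter.<name>.clean`, and the `filter` attribute are existing Git features.

```shell
# Choose the working-tree encoding from the platform name.
# "OS/390" and "IBM-1047" are assumptions about the z/OS environment.
platform_encoding() {
  case "$1" in
    OS/390) echo IBM-1047 ;;
    *)      echo UTF-8 ;;
  esac
}

enc=$(platform_encoding "$(uname -s)")

# smudge runs on checkout (repository -> working tree),
# clean runs on check-in (working tree -> repository).
smudge_cmd="iconv -f UTF-8 -t $enc"
clean_cmd="iconv -f $enc -t UTF-8"

# Print the one-time setup for this clone instead of running it.
echo "git config filter.platenc.smudge '$smudge_cmd'"
echo "git config filter.platenc.clean '$clean_cmd'"
echo "echo '*.txt filter=platenc' >>.gitattributes"
```

The .gitattributes line is the same on every platform; only the locally configured filter commands differ, which is what makes the conversion effectively platform-conditional.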
Re: Use of new .gitattributes working-tree-encoding attribute across different platform types
On 27.06.18 09:54, Steve Groeger wrote:

> Hi,
>
> Sorry for incomplete post earlier. Here is the full post:
>
> In the latest version of git a new attribute has been added,
> working-tree-encoding. The release notes state:
>
> 'The new "working-tree-encoding" attribute can ask Git to convert the
>  contents to the specified encoding when checking out to the working
>  tree (and the other way around when checking in).'
>
> We have been using this attribute on our z/OS systems using a version of git
> from Rocket software to convert files to EBCDIC for quite a while now. On
> other platforms (Linux, AIX etc) git ignored this attribute and therefore
> left the files in ASCII.
>
> We have common code that is supposed to be usable across different platforms
> and hence different file encodings. With the full support of the
> working-tree-encoding in the latest version of git on all platforms, how do
> we have files converted to different encodings on different platforms?
> I could not find anything that would allow us to say 'if platform = z/OS then
> encoding=EBCDIC else encoding=ASCII'. Is there a way this can be done?
>
> Thanks
> Steve Groeger

[]

Did you consider putting a gitattributes file at machine level?

https://git-scm.com/docs/gitattributes

[snipped the other places where to put gitattributes]

...
Attributes for all users on a system should be placed in the
$(prefix)/etc/gitattributes file.

> Java Runtimes Development
> IBM Hursley
> IBM United Kingdom Ltd
> Tel: (44) 1962 816911 Mobex: 279990 Mobile: 07718 517 129
> Fax (44) 1962 816800
> Lotus Notes: Steve Groeger/UK/IBM
> Internet: groe...@uk.ibm.com
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
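A rough sketch of the machine-level approach, writing to a scratch file rather than the real system location. The helper name `system_attr_line`, the "OS/390" platform string, and the "IBM-1047" encoding name are assumptions for illustration. Note that a `.gitattributes` entry inside the repository takes precedence over the system-wide file, so this only helps if the repository itself leaves working-tree-encoding unset for those paths.

```shell
# Print the attribute line a given platform should carry in its
# machine-level file, or nothing when no conversion is wanted.
system_attr_line() {
  case "$1" in
    OS/390) printf '%s\n' '*.txt working-tree-encoding=IBM-1047' ;;
    *)      : ;;
  esac
}

# Write to a scratch file here; on a real machine the target would be
# $(prefix)/etc/gitattributes and would need administrator rights.
attr_file=$(mktemp)
system_attr_line "$(uname -s)" >>"$attr_file"
echo "attributes for this machine:"
cat "$attr_file"
rm -f "$attr_file"
```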
Use of new .gitattributes working-tree-encoding attribute across different platform types
Hi,

Sorry for incomplete post earlier. Here is the full post:

In the latest version of git a new attribute has been added,
working-tree-encoding. The release notes state:

'The new "working-tree-encoding" attribute can ask Git to convert the
 contents to the specified encoding when checking out to the working
 tree (and the other way around when checking in).'

We have been using this attribute on our z/OS systems using a version of git
from Rocket software to convert files to EBCDIC for quite a while now. On
other platforms (Linux, AIX etc) git ignored this attribute and therefore
left the files in ASCII.

We have common code that is supposed to be usable across different platforms
and hence different file encodings. With the full support of the
working-tree-encoding in the latest version of git on all platforms, how do
we have files converted to different encodings on different platforms?
I could not find anything that would allow us to say 'if platform = z/OS then
encoding=EBCDIC else encoding=ASCII'. Is there a way this can be done?

Thanks
Steve Groeger
Java Runtimes Development
IBM Hursley
IBM United Kingdom Ltd
Tel: (44) 1962 816911 Mobex: 279990 Mobile: 07718 517 129
Fax (44) 1962 816800
Lotus Notes: Steve Groeger/UK/IBM
Internet: groe...@uk.ibm.com

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
Use of new .gitattributes working-tree-encoding attribute across different platform types
I could not find anything that would allow us to say 'if platform = z/OS then
encoding=EBCDIC else encoding=ASCII'. Is there a way this can be done?

Thanks
Steve Groeger
Java Runtimes Development
IBM Hursley
IBM United Kingdom Ltd
Tel: (44) 1962 816911 Mobex: 279990 Mobile: 07718 517 129
Fax (44) 1962 816800
Lotus Notes: Steve Groeger/UK/IBM
Internet: groe...@uk.ibm.com

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU