On 29.09.2016 at 08:33, Torsten Bögershausen wrote:
> On 15.09.16 22:04, Junio C Hamano wrote:
>> Lars Schneider <[email protected]> writes:
>>
>>> Wouldn't that complicate the pathname parsing on the filter side?
>>> Can't we just define in our filter protocol documentation that our
>>> "pathname" packet _always_ has a trailing "\n"? That would mean the
>>> receiver would know a packet "pathname=ABC\n\n" encodes the path
>>> "ABC\n" [1].
>>
>> That's fine, too. If you declare that pathname over the protocol is
>> a binary thing, you can also define that the packet does not have
>> the terminating \n, i.e. the example encodes the path "ABC\n\n",
>> which is also OK ;-)
>>
>> As long as the rule is clearly documented, easy for filter
>> implementors to follow it, and hard for them to get it wrong, I'd be
>> perfectly happy.
>>
>
> (Sorry for the late reply)
>
> In V8 the additional "\n" is clearly documented.
>
> On the long run,
> I would suggest to be more clear what BINARY is:
>
> --- a/Documentation/technical/protocol-common.txt
> +++ b/Documentation/technical/protocol-common.txt
> @@ -61,6 +61,9 @@ the length's hexadecimal representation.
> A pkt-line MAY contain binary data, so implementors MUST ensure
> pkt-line parsing/formatting routines are 8-bit clean.
>
> +Each pkt-line that may contain ASCII control characters should
> +be treated as binary.
> +
Well, it is not as clear-cut with pathnames. Sane pathnames should
not contain control characters, even when they go beyond US-ASCII,
assuming a sane filesystem pathname charset (such as UTF-8).
The one thing a pathname cannot include is the NUL ("\0") character.
So in most cases pathnames are ASCII, but they might not be. Not that
pkt-line text packets are binary-unsafe... I think the trailing
"\n" is there for easier debugging.
http://www.dwheeler.com/essays/filenames-in-shell.html
http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html
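
To make the "always a trailing \n" rule discussed above concrete, here is
a minimal sketch in Python. It is not git code: the helper names
(pkt_line, encode_pathname, decode_pathname) are made up for illustration,
and it assumes the rule as Lars worded it, i.e. the receiver strips exactly
one trailing "\n" and treats everything else as raw bytes.

```python
# Hypothetical helpers, not part of git; they only illustrate the rule.
MAX_PAYLOAD = 65516  # pkt-line limit: 65520 bytes total, minus 4 length bytes


def pkt_line(payload: bytes) -> bytes:
    """Frame payload as a pkt-line: 4 hex digits of total length, then data."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for a single pkt-line")
    return b"%04x" % (len(payload) + 4) + payload


def encode_pathname(path: bytes) -> bytes:
    """Sender side: always append exactly one "\\n" after the raw path bytes."""
    return pkt_line(b"pathname=" + path + b"\n")


def decode_pathname(packet: bytes) -> bytes:
    """Receiver side: strip exactly one trailing "\\n", keep the rest as-is."""
    length = int(packet[:4], 16)   # total length, including the 4 hex digits
    payload = packet[4:length]
    prefix = b"pathname="
    if not payload.startswith(prefix) or not payload.endswith(b"\n"):
        raise ValueError("malformed pathname packet")
    return payload[len(prefix):-1]


# A path that itself ends in a newline survives the round trip.
packet = encode_pathname(b"ABC\n")
assert decode_pathname(packet) == b"ABC\n"
```

Working on bytes rather than str keeps the sketch 8-bit clean, so a
pathname with control characters or non-UTF-8 bytes goes through unchanged.
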
--
Jakub Narębski