Re: "svn patch" and the TAB character

Doug Robinson Tue, 17 Dec 2019 05:33:49 -0800

On Mon, Dec 16, 2019 at 8:55 PM Daniel Shahaf <[email protected]> wrote:
> Doug Robinson wrote on Mon, Dec 16, 2019 at 11:13:25 -0500:
> > So the two file names, differing only by a TAB in the "right place" will
> > currently have completely different behaviors:
> >
> >   My File NameSPACE(Revision 12)
> >   My File NameTAB(Revision 12)
> >
> > I see no point in maintaining that the "TAB" is critically significant and
> > different from the "SPACE".  The difference is easier parsing vs. 1 visually
> > indistinguishable use case (depending on what tool you are using to view the
> > diff file).
> >
> > And the "sequence of SPACE" instead of just a single "SPACE" falls into the
> > same visually indistinguishable category.  So keep the parsing simple and
> > just declare file names with "SP/TAB(Revision #)" at the end to not be
> > supported.
>
> You can't assume the string after the tab will be "(revision %ld)".
>
> First of all, as I already pointed out, that string is translatable.


Then we're screwed.  Some languages will have the revision number before
the word.  Some may actually require characters on both sides of the
revision number.

There's a reason that languages like C, C++, Java, etc. have "keywords" -
so they can properly parse the code.  In this case the "code" is in the
form of a "diff" file and the language-variant "SVN diff" needs a
fixed-text keyword with a specific operational sequence, e.g. "(KEYWORD
OPERAND)", i.e. "(revision 1234)".  Short of that, parsing is purely
accidental.
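
To make the idea concrete, here is a minimal sketch in Python.  It assumes a
hypothetical rule that "svn diff" always ends a header line with the fixed,
untranslated label "(revision N)", separated from the file name by TAB or
spaces.  The pattern and the function name parse_header are illustrative
only; this is not how "svn patch" parses headers today.

```python
import re

# Assumed grammar (hypothetical): marker, one space, path, a run of
# SPACE/TAB, then the fixed keyword label "(revision N)" at end of line.
_HEADER = re.compile(
    r"^(?P<marker>---|\+\+\+) (?P<path>.+?)[ \t]+\(revision (?P<rev>\d+)\)$"
)

def parse_header(line):
    """Return (marker, path, revision), or None if the line does not match."""
    m = _HEADER.match(line.rstrip("\r\n"))
    if m is None:
        return None
    return m.group("marker"), m.group("path"), int(m.group("rev"))
```

With this rule a SPACE and a TAB before the label parse identically, a
translated label is rejected instead of being guessed at, and a file whose
own name ends in "SP/TAB(revision N)" is the one declared-unsupported case.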

> Second of
> all, we also have to support patches generated by third-party tools, which may
> contain arbitrary text after the tab character.

Define the language.  Require the tooling to comply.  Trying to make
everything work (everybody happy) is the path to failure.

>  And of course, both the
> filename and the label (= the part after the tab character) may contain an
> arbitrary number of spaces.

The problem is parsing the line into proper tokens when every character out
there can be part of the file name.  There must be a structure to the field
or parsing is not really parsing anymore.
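
A small Python sketch of the ambiguity, assuming only that the file name and
the label are separated by a run of spaces and that either may itself contain
spaces.  The helper name candidate_splits is made up for illustration.

```python
import re

def candidate_splits(rest):
    """Return every (filename, label) reading of `rest`, the header text
    that follows '--- ' or '+++ ', when no TAB marks the boundary."""
    # Reading 0: the whole text is the file name and there is no label.
    readings = [(rest, "")]
    # Every run of spaces is a possible boundary between name and label.
    for m in re.finditer(r" +", rest):
        name, label = rest[:m.start()], rest[m.end():]
        if name and label:
            readings.append((name, label))
    return readings

for name, label in candidate_splits("My File Name (revision 12)"):
    print(repr(name), repr(label))
```

For "My File Name (revision 12)" this lists five readings, and nothing in the
line itself says which one the author of the diff meant.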

> Please propose an algorithm for parsing a filename out of a diff header line
> (a '---' line or a '+++' line) that doesn't contain tabs, under these conditions.
> (We don't have to fix _all_ cases, but fixing the bug just for English speakers
> isn't going to fly.)

For anything I propose without keywords/structure, I can trivially construct
a counter-example.  Something will break, as it does in the current situation.

-- 
*DOUGLAS B ROBINSON* SENIOR PRODUCT MANAGER

T +1 925 396 1125
*E* [email protected]
