On Mon, Dec 16, 2019 at 8:55 PM Daniel Shahaf <d...@daniel.shahaf.name> wrote:

> Doug Robinson wrote on Mon, Dec 16, 2019 at 11:13:25 -0500:
> > So the two file names, differing only by a TAB in the "right place" will
> > currently have completely different behaviors:
> >
> >     My File NameSPACE(Revision 12)
> >     My File NameTAB(Revision 12)
> >
> > I see no point in maintaining that the "TAB" is critically significant and
> > different from the "SPACE". The difference is easier parsing vs. 1 visually
> > indistinguishable use case (depending on what tool you are using to view the
> > diff file).
> >
> > And the "sequence of SPACE" instead of just a single "SPACE" falls into the
> > same visually indistinguishable category. So keep the parsing simple and
> > just declare file names with "SP/TAB(Revision #)" at the end to not be
> > supported.
>
> You can't assume the string after the tab will be "(revision %ld)".
>
> First of all, as I already pointed out, that string is translatable.
Then we're screwed. Some languages will have the revision number before
the word. Some may actually require characters on both sides of the
revision number.

There's a reason that languages like C, C++, Java, etc. have "keywords":
so they can be parsed properly. In this case the "code" is in the form of
a "diff" file, and the language-variant "SVN diff" needs a fixed-text
keyword with a specific operational sequence, e.g. "(KEYWORD OPERAND)",
i.e. "(revision 1234)". Short of that, parsing is purely accidental.

> Second of
> all, we also have to support patches generated by third-party tools, which may
> contain arbitrary text after the tab character.

Define the language. Require the tooling to comply. Trying to make
everything work (everybody happy) is the path to failure.

> And of course, both the
> filename and the label (= the part after the tab character) may contain an
> arbitrary number of spaces.

The problem is parsing the line into proper tokens when every character
out there can be part of the file name. There must be a structure to the
field or parsing is not really parsing anymore.

> Please propose an algorithm for parsing a filename out of a diff header line
> (a '---' line or a '+++' line) that doesn't contain tabs, under these conditions.
> (We don't have to fix _all_ cases, but fixing the bug just for English speakers
> isn't going to fly.)

For anything I propose without keywords/structure, I can trivially find a
counterexample. Something will break, as it does in the current situation.

--
DOUGLAS B ROBINSON
SENIOR PRODUCT MANAGER
T +1 925 396 1125
E doug.robin...@wandisco.com
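To make the ambiguity concrete, here is a minimal Python sketch. It is not Subversion's actual parser; the function names `split_header` and `candidate_splits` are hypothetical, and it assumes only that a TAB, when present, separates the filename from the label. Without a TAB, every run of spaces is an equally plausible boundary.

```python
import re


def split_header(line):
    """Split a '---' or '+++' header line into (filename, label).

    Assumption for this sketch: the first TAB separates the two fields.
    Returns (filename, None) when the line contains no TAB.
    """
    for prefix in ("--- ", "+++ "):
        if line.startswith(prefix):
            rest = line[len(prefix):].rstrip("\n")
            break
    else:
        raise ValueError("not a diff header line")
    name, tab, label = rest.partition("\t")
    return (name, label) if tab else (name, None)


def candidate_splits(rest):
    """List every (filename, label) reading of a TAB-less header body.

    Each interior run of spaces is a possible boundary, and nothing in
    the line itself says which one the writer intended.
    """
    candidates = [(rest, None)]  # the whole text may be the filename
    for match in re.finditer(r" +", rest):
        if match.start() > 0 and match.end() < len(rest):
            candidates.append((rest[:match.start()], rest[match.end():]))
    return candidates
```

With a TAB, `split_header("--- My File Name\t(revision 12)")` yields one unambiguous pair. With a SPACE in the same position, `candidate_splits("My File Name (revision 12)")` returns five readings, and choosing among them requires knowing the label's structure in advance.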