Mike, Perhaps a default set of file extensions that can be overridden?
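Something along these lines, as a rough sketch only. The --ext option name, build_parser and DEFAULT_EXTENSIONS are made up for illustration and are not in the posted patch; the extension list is the one from Liming's commit message. The -v, -q and --debug definitions are my assumption of how the standard flags might look and may differ from what BinToPcd.py actually does.

```python
import argparse

# Extensions that Liming's commit message lists as DOS format.
DEFAULT_EXTENSIONS = [
    '.h', '.c', '.nasm', '.nasmb', '.asm', '.S', '.inf', '.dec', '.dsc',
    '.fdf', '.uni', '.asl', '.aslc', '.vfr', '.idf', '.txt', '.bat', '.py',
]


def build_parser():
    # argparse supplies -h/--help automatically.
    parser = argparse.ArgumentParser(
        prog='FormatDosFiles',
        description='Convert source files to DOS (CRLF) format.',
    )
    parser.add_argument('path', help='The path for files to be converted.')
    # Hypothetical option: falls back to the documented list when omitted.
    parser.add_argument('--ext', dest='extensions', nargs='+', metavar='EXT',
                        default=DEFAULT_EXTENSIONS,
                        help='File extensions filter. (Example: .txt .c .h)')
    # Assumed shape of the standard flags; not copied from any existing tool.
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='Increase output messages.')
    parser.add_argument('-q', '--quiet', action='store_true',
                        help='Reduce output messages.')
    parser.add_argument('--debug', type=int, default=0, metavar='LEVEL',
                        help='Set debug level.')
    return parser
```

With this, "FormatDosFiles.py MdePkg" would use the default list and "FormatDosFiles.py MdePkg --ext .c .h" would override it. The path has to come before --ext, otherwise the option swallows it as one more extension.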
-Jaben

> On May 21, 2018, at 3:41 PM, Kinney, Michael D <michael.d.kin...@intel.com> wrote:
>
> Liming,
>
> We have a set of standard flags for tools that
> should always be present.
>
>   --help
>   -v
>   -q
>   --debug
>
> We should also always have the program name,
> description, version, and copyright.
>
> Please see BaseTools/Scripts/BinToPcd.py as
> an example.
>
> It might be useful to have a way to run this tool
> on a single file when BaseTools/Scripts/PatchCheck.py
> reports an issue.
>
> Do you think it would be good to have one option to
> scan path for file extensions that are documented as
> DOS line endings so the extensions do not have to be
> entered?
>
> Mike
>
>> -----Original Message-----
>> From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Carsey, Jaben
>> Sent: Monday, May 21, 2018 7:50 AM
>> To: Gao, Liming <liming....@intel.com>; edk2-de...@lists.01.org
>> Subject: Re: [edk2] [RFC] Formalize source files to follow DOS format
>>
>> Liming,
>>
>> One Pep8 thing.
>> Can you change to use the with statement for the file read/write?
>>
>> Other small thoughts.
>> I think that FileList should be changed to a set as order is not important.
>> Maybe wrapper the re.sub function with your own so all the .encode() are in
>> one location? As we move to python 3 we will have fewer changes to make.
>>
>>> -----Original Message-----
>>> From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Liming Gao
>>> Sent: Sunday, May 20, 2018 9:52 PM
>>> To: edk2-devel@lists.01.org
>>> Subject: [edk2] [RFC] Formalize source files to follow DOS format
>>>
>>> FormatDosFiles.py is added to clean up dos source files. It bases on
>>> the rules defined in EDKII C Coding Standards Specification.
>>> 5.1.2 Do not use tab characters
>>> 5.1.6 Only use CRLF (Carriage Return Line Feed) line endings.
>>> 5.1.7 All files must end with CRLF
>>> No trailing white space in one line. (To be added in spec)
>>>
>>> The source files in edk2 project with the below postfix are dos format.
>>> .h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni .asl .aslc .vfr .idf
>>> .txt .bat .py
>>>
>>> The package maintainer can use this script to clean up all files in his
>>> package. The prefer way is to create one patch per one package.
>>>
>>> Contributed-under: TianoCore Contribution Agreement 1.1
>>> Signed-off-by: Liming Gao <liming....@intel.com>
>>> ---
>>>  BaseTools/Scripts/FormatDosFiles.py | 93 +++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 93 insertions(+)
>>>  create mode 100644 BaseTools/Scripts/FormatDosFiles.py
>>>
>>> diff --git a/BaseTools/Scripts/FormatDosFiles.py b/BaseTools/Scripts/FormatDosFiles.py
>>> new file mode 100644
>>> index 0000000..c3a5476
>>> --- /dev/null
>>> +++ b/BaseTools/Scripts/FormatDosFiles.py
>>> @@ -0,0 +1,93 @@
>>> +# @file FormatDosFiles.py
>>> +# This script format the source files to follow dos style.
>>> +# It supports Python2.x and Python3.x both.
>>> +#
>>> +#  Copyright (c) 2018, Intel Corporation. All rights reserved.<BR>
>>> +#
>>> +#  This program and the accompanying materials
>>> +#  are licensed and made available under the terms and conditions of the BSD License
>>> +#  which accompanies this distribution.  The full text of the license may be found at
>>> +#  http://opensource.org/licenses/bsd-license.php
>>> +#
>>> +#  THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
>>> +#  WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
>>> +#
>>> +
>>> +#
>>> +# Import Modules
>>> +#
>>> +import argparse
>>> +import os
>>> +import os.path
>>> +import re
>>> +import sys
>>> +
>>> +"""
>>> +difference of string between python2 and python3:
>>> +
>>> +there is a large difference of string in python2 and python3.
>>> +
>>> +in python2,there are two type string,unicode string (unicode type) and 8-bit string (str type).
>>> +  us = u"abcd",
>>> +  unicode string,which is internally stored as unicode code point.
>>> +  s = "abcd",s = b"abcd",s = r"abcd",
>>> +  all of them are 8-bit string,which is internally stored as bytes.
>>> +
>>> +in python3,a new type called bytes replace 8-bit string,and str type is regarded as unicode string.
>>> +  s = "abcd", s = u"abcd", s = r"abcd",
>>> +  all of them are str type,which is internally stored unicode code point.
>>> +  bs = b"abcd",
>>> +  bytes type,which is interally stored as bytes
>>> +
>>> +in python2 ,the both type string can be mixed use,but in python3 it could not,
>>> +which means the pattern and content in re match should be the same type in python3.
>>> +in function FormatFile,it read file in binary mode so that the content is bytes type,so the pattern should also be bytes type.
>>> +As a result,I add encode() to make it compitable among python2 and python3.
>>> +
>>> +difference of encode,decode in python2 and python3:
>>> +the builtin function str.encode(encoding) and str.decode(encoding) are used for convert between 8-bit string and unicode string.
>>> +
>>> +in python2
>>> +  encode convert unicode type to str type.decode vice versa.default encoding is ascii.
>>> +  for example: s = us.encode()
>>> +  but if the us is str type,the code will also work.it will be firstly convert to unicode type,
>>> +  in this situation,the call equals s = us.decode().encode().
>>> +
>>> +in python3
>>> +  encode convert str type to bytes type,decode vice versa.default encoding is utf8.
>>> +  fpr example:
>>> +  bs = s.encode(),only str type has encode method,so that won't be used wrongly.decode is the same.
>>> +
>>> +in conclusion:
>>> +  this code could work the same in python27 and python36 environment as far as the re pattern satisfy ascii character set.
>>> +
>>> +"""
>>> +def FormatFiles():
>>> +    parser = argparse.ArgumentParser()
>>> +    parser.add_argument('path', nargs=1, help='The path for files to be converted.')
>>> +    parser.add_argument('extensions', nargs='+', help='File extensions filter. (Example: .txt .c .h)')
>>> +    args = parser.parse_args()
>>> +    filelist = []
>>> +    for dirpath, dirnames, filenames in os.walk(args.path[0]):
>>> +        for filename in [f for f in filenames if any(f.endswith(ext) for ext in args.extensions)]:
>>> +            filelist.append(os.path.join(dirpath, filename))
>>> +    for file in filelist:
>>> +        fd = open(file, 'rb')
>>> +        content = fd.read()
>>> +        fd.close()
>>> +        # Convert the line endings to CRLF
>>> +        content = re.sub(r'([^\r])\n'.encode(), r'\1\r\n'.encode(), content)
>>> +        content = re.sub(r'^\n'.encode(), r'\r\n'.encode(), content, flags = re.MULTILINE)
>>> +        # Add a new empty line if the file is not end with one
>>> +        content = re.sub(r'([^\r\n])$'.encode(), r'\1\r\n'.encode(), content)
>>> +        # Remove trailing white spaces
>>> +        content = re.sub(r'[ \t]+(\r\n)'.encode(), r'\1'.encode(), content, flags = re.MULTILINE)
>>> +        # Replace '\t' with two spaces
>>> +        content = re.sub('\t'.encode(), '  '.encode(), content)
>>> +        fd = open(file, 'wb')
>>> +        fd.write(content)
>>> +        fd.close()
>>> +        print(file)
>>> +
>>> +if __name__ == "__main__":
>>> +    sys.exit(FormatFiles())
>>> \ No newline at end of file
>>> --
>>> 2.8.0.windows.1
>>>
>>> _______________________________________________
>>> edk2-devel mailing list
>>> edk2-devel@lists.01.org
>>> https://lists.01.org/mailman/listinfo/edk2-devel
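
For reference, the three review comments quoted above (with statement, a set instead of a list, one wrapper around re.sub) could look roughly like the sketch below. The helper names sub_bytes, to_dos_format, collect_files and format_file are mine and do not exist in the patch. The five substitutions are the ones from the patch, and I am assuming the patterns stay ASCII-only as the docstring in the patch says.

```python
import os
import re


def sub_bytes(pattern, repl, content, flags=0):
    """Run re.sub on bytes content; the only place that calls .encode()."""
    return re.sub(pattern.encode(), repl.encode(), content, flags=flags)


def to_dos_format(content):
    """Apply the substitutions from the patch to a bytes buffer."""
    # Convert the line endings to CRLF
    content = sub_bytes(r'([^\r])\n', r'\1\r\n', content)
    content = sub_bytes(r'^\n', r'\r\n', content, flags=re.MULTILINE)
    # Add a final CRLF if the file does not end with one
    content = sub_bytes(r'([^\r\n])$', r'\1\r\n', content)
    # Remove trailing white space
    content = sub_bytes(r'[ \t]+(\r\n)', r'\1', content, flags=re.MULTILINE)
    # Replace each tab with two spaces
    content = sub_bytes('\t', '  ', content)
    return content


def collect_files(root, extensions):
    """Return a set of paths under root; order is not important."""
    return {
        os.path.join(dirpath, name)
        for dirpath, _dirnames, filenames in os.walk(root)
        for name in filenames
        if name.endswith(tuple(extensions))
    }


def format_file(path):
    """Rewrite one file in place, closing it even if an error occurs."""
    with open(path, 'rb') as fd:
        content = fd.read()
    with open(path, 'wb') as fd:
        fd.write(to_dos_format(content))
```

Keeping to_dos_format separate from the file I/O also makes it easy to run the conversion on a single file, which covers the PatchCheck.py case Mike mentions.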