I get your point. We will update this script and send version 2.
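
One possible shape for the option handling discussed in the thread below, offered only as a sketch and not as the actual version 2. It assumes the extensions listed in the RFC become the default set and that extensions given on the command line are appended to it. The names DEFAULT_EXTENSIONS, build_parser and effective_extensions, the version string, and the exact meaning of -v, -q and --debug are all hypothetical.

```python
import argparse

# Extensions the RFC lists as DOS-format files in edk2 (used here as defaults).
DEFAULT_EXTENSIONS = frozenset([
    '.h', '.c', '.nasm', '.nasmb', '.asm', '.S', '.inf', '.dec', '.dsc',
    '.fdf', '.uni', '.asl', '.aslc', '.vfr', '.idf', '.txt', '.bat', '.py',
])


def build_parser():
    # prog, description and version text are placeholders, not the real values.
    parser = argparse.ArgumentParser(
        prog='FormatDosFiles',
        description='Convert source files to DOS (CRLF) format.')
    parser.add_argument('--version', action='version',
                        version='%(prog)s 0.1 (placeholder)')
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='Print each converted file.')
    parser.add_argument('-q', '--quiet', action='store_true',
                        help='Print errors only.')
    parser.add_argument('--debug', type=int, default=0,
                        help='Debug level (assumed to be an integer).')
    parser.add_argument('path', help='File or directory to convert.')
    parser.add_argument('extensions', nargs='*',
                        help='Extra extensions, appended to the default set.')
    return parser


def effective_extensions(extra):
    # Append semantics: user-supplied extensions extend the defaults.
    return set(DEFAULT_EXTENSIONS).union(extra)
```

With this layout, `FormatDosFiles MdePkg` would use only the defaults, while `FormatDosFiles MdePkg .md` would process the defaults plus `.md`. An override syntax, as done for compiler flags, would need an additional convention that is not sketched here.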
> -----Original Message-----
> From: Carsey, Jaben
> Sent: Thursday, May 24, 2018 10:14 PM
> To: Gao, Liming <liming....@intel.com>; Kinney, Michael D <michael.d.kin...@intel.com>
> Cc: edk2-devel@lists.01.org
> Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
>
> We could do something like we do for compiler flags... append or overwrite
> depending on syntax.
>
> > -----Original Message-----
> > From: Gao, Liming
> > Sent: Thursday, May 24, 2018 1:35 AM
> > To: Kinney, Michael D <michael.d.kin...@intel.com>; Carsey, Jaben <jaben.car...@intel.com>
> > Cc: edk2-devel@lists.01.org
> > Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
> > Importance: High
> >
> > Mike:
> > I agree with your comments. On the default file set, this script can have
> > default extensions. Users can specify additional extensions, which are
> > appended to the default ones instead of overriding them.
> >
> > Thanks
> > Liming
> > >-----Original Message-----
> > >From: Kinney, Michael D
> > >Sent: Tuesday, May 22, 2018 6:59 AM
> > >To: Carsey, Jaben <jaben.car...@intel.com>; Kinney, Michael D <michael.d.kin...@intel.com>
> > >Cc: Gao, Liming <liming....@intel.com>; edk2-devel@lists.01.org
> > >Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
> > >
> > >Jaben,
> > >
> > >Yes. The default behavior is to use the default set, and
> > >specifying one or more extensions overrides the
> > >default set.
> > >
> > >Mike
> > >
> > >> -----Original Message-----
> > >> From: Carsey, Jaben
> > >> Sent: Monday, May 21, 2018 3:43 PM
> > >> To: Kinney, Michael D <michael.d.kin...@intel.com>
> > >> Cc: Gao, Liming <liming....@intel.com>; edk2-de...@lists.01.org
> > >> Subject: Re: [edk2] [RFC] Formalize source files to follow DOS format
> > >>
> > >> Mike,
> > >>
> > >> Perhaps a default set of file extensions that can be overridden?
> > >>
> > >> -Jaben
> > >>
> > >>
> > >> > On May 21, 2018, at 3:41 PM, Kinney, Michael D <michael.d.kin...@intel.com> wrote:
> > >> >
> > >> > Liming,
> > >> >
> > >> > We have a set of standard flags for tools that
> > >> > should always be present.
> > >> >
> > >> > --help
> > >> > -v
> > >> > -q
> > >> > --debug
> > >> >
> > >> > We should also always have the program name,
> > >> > description, version, and copyright.
> > >> >
> > >> > Please see BaseTools/Scripts/BinToPcd.py as
> > >> > an example.
> > >> >
> > >> > It might be useful to have a way to run this tool
> > >> > on a single file when BaseTools/Scripts/PatchCheck.py
> > >> > reports an issue.
> > >> >
> > >> > Do you think it would be good to have one option to
> > >> > scan a path for the file extensions that are documented as
> > >> > using DOS line endings, so the extensions do not have to be
> > >> > entered?
> > >> >
> > >> > Mike
> > >> >
> > >> >
> > >> >> -----Original Message-----
> > >> >> From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Carsey, Jaben
> > >> >> Sent: Monday, May 21, 2018 7:50 AM
> > >> >> To: Gao, Liming <liming....@intel.com>; edk2-de...@lists.01.org
> > >> >> Subject: Re: [edk2] [RFC] Formalize source files to follow DOS format
> > >> >>
> > >> >> Liming,
> > >> >>
> > >> >> One Pep8 thing.
> > >> >> Can you change to use the with statement for the file read/write?
> > >> >>
> > >> >> Other small thoughts.
> > >> >> I think that FileList should be changed to a set as order is not important.
> > >> >> Maybe wrap the re.sub function with your own so all the .encode() calls
> > >> >> are in one location? As we move to python 3 we will have fewer changes to make.
> > >> >>
> > >> >>
> > >> >>> -----Original Message-----
> > >> >>> From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of Liming Gao
> > >> >>> Sent: Sunday, May 20, 2018 9:52 PM
> > >> >>> To: edk2-devel@lists.01.org
> > >> >>> Subject: [edk2] [RFC] Formalize source files to follow DOS format
> > >> >>>
> > >> >>> FormatDosFiles.py is added to clean up DOS source files. It is based on
> > >> >>> the rules defined in the EDKII C Coding Standards Specification.
> > >> >>> 5.1.2 Do not use tab characters
> > >> >>> 5.1.6 Only use CRLF (Carriage Return Line Feed) line endings.
> > >> >>> 5.1.7 All files must end with CRLF
> > >> >>> No trailing white space in one line. (To be added in spec)
> > >> >>>
> > >> >>> The source files in the edk2 project with the postfixes below are in DOS format.
> > >> >>> .h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni .asl .aslc .vfr .idf
> > >> >>> .txt .bat .py
> > >> >>>
> > >> >>> The package maintainer can use this script to clean up all files in his
> > >> >>> package. The preferred way is to create one patch per package.
> > >> >>>
> > >> >>> Contributed-under: TianoCore Contribution Agreement 1.1
> > >> >>> Signed-off-by: Liming Gao <liming....@intel.com>
> > >> >>> ---
> > >> >>>  BaseTools/Scripts/FormatDosFiles.py | 93 +++++++++++++++++++++++++++++++++++++
> > >> >>>  1 file changed, 93 insertions(+)
> > >> >>>  create mode 100644 BaseTools/Scripts/FormatDosFiles.py
> > >> >>>
> > >> >>> diff --git a/BaseTools/Scripts/FormatDosFiles.py b/BaseTools/Scripts/FormatDosFiles.py
> > >> >>> new file mode 100644
> > >> >>> index 0000000..c3a5476
> > >> >>> --- /dev/null
> > >> >>> +++ b/BaseTools/Scripts/FormatDosFiles.py
> > >> >>> @@ -0,0 +1,93 @@
> > >> >>> +# @file FormatDosFiles.py
> > >> >>> +# This script format the source files to follow dos style.
> > >> >>> +# It supports Python2.x and Python3.x both.
> > >> >>> +#
> > >> >>> +# Copyright (c) 2018, Intel Corporation. All rights reserved.<BR>
> > >> >>> +#
> > >> >>> +# This program and the accompanying materials
> > >> >>> +# are licensed and made available under the terms and conditions of the BSD License
> > >> >>> +# which accompanies this distribution. The full text of the license may be found at
> > >> >>> +# http://opensource.org/licenses/bsd-license.php
> > >> >>> +#
> > >> >>> +# THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
> > >> >>> +# WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
> > >> >>> +#
> > >> >>> +
> > >> >>> +#
> > >> >>> +# Import Modules
> > >> >>> +#
> > >> >>> +import argparse
> > >> >>> +import os
> > >> >>> +import os.path
> > >> >>> +import re
> > >> >>> +import sys
> > >> >>> +
> > >> >>> +"""
> > >> >>> +difference of string between python2 and python3:
> > >> >>> +
> > >> >>> +there is a large difference of string in python2 and python3.
> > >> >>> +
> > >> >>> +in python2,there are two type string,unicode string (unicode type) and 8-bit string (str type).
> > >> >>> +    us = u"abcd",
> > >> >>> +    unicode string,which is internally stored as unicode code point.
> > >> >>> +    s = "abcd",s = b"abcd",s = r"abcd",
> > >> >>> +    all of them are 8-bit string,which is internally stored as bytes.
> > >> >>> +
> > >> >>> +in python3,a new type called bytes replace 8-bit string,and str type is regarded as unicode string.
> > >> >>> +    s = "abcd", s = u"abcd", s = r"abcd",
> > >> >>> +    all of them are str type,which is internally stored unicode code point.
> > >> >>> +    bs = b"abcd",
> > >> >>> +    bytes type,which is internally stored as bytes
> > >> >>> +
> > >> >>> +in python2 ,the both type string can be mixed use,but in python3 it could not,
> > >> >>> +which means the pattern and content in re match should be the same type in python3.
> > >> >>> +in function FormatFile,it read file in binary mode so that the content is bytes type,so the pattern should also be bytes type.
> > >> >>> +As a result,I add encode() to make it compatible among python2 and python3.
> > >> >>> +
> > >> >>> +difference of encode,decode in python2 and python3:
> > >> >>> +the builtin function str.encode(encoding) and str.decode(encoding) are used for convert between 8-bit string and unicode string.
> > >> >>> +
> > >> >>> +in python2
> > >> >>> +    encode convert unicode type to str type.decode vice versa.default encoding is ascii.
> > >> >>> +    for example: s = us.encode()
> > >> >>> +    but if the us is str type,the code will also work.it will be firstly convert to unicode type,
> > >> >>> +    in this situation,the call equals s = us.decode().encode().
> > >> >>> +
> > >> >>> +in python3
> > >> >>> +    encode convert str type to bytes type,decode vice versa.default encoding is utf8.
> > >> >>> +    for example:
> > >> >>> +    bs = s.encode(),only str type has encode method,so that won't be used wrongly.decode is the same.
> > >> >>> +
> > >> >>> +in conclusion:
> > >> >>> +    this code could work the same in python27 and python36 environment as far as the re pattern satisfy ascii character set.
> > >> >>> +
> > >> >>> +"""
> > >> >>> +def FormatFiles():
> > >> >>> +    parser = argparse.ArgumentParser()
> > >> >>> +    parser.add_argument('path', nargs=1, help='The path for files to be converted.')
> > >> >>> +    parser.add_argument('extensions', nargs='+', help='File extensions filter. (Example: .txt .c .h)')
> > >> >>> +    args = parser.parse_args()
> > >> >>> +    filelist = []
> > >> >>> +    for dirpath, dirnames, filenames in os.walk(args.path[0]):
> > >> >>> +        for filename in [f for f in filenames if any(f.endswith(ext) for ext in args.extensions)]:
> > >> >>> +            filelist.append(os.path.join(dirpath, filename))
> > >> >>> +    for file in filelist:
> > >> >>> +        fd = open(file, 'rb')
> > >> >>> +        content = fd.read()
> > >> >>> +        fd.close()
> > >> >>> +        # Convert the line endings to CRLF
> > >> >>> +        content = re.sub(r'([^\r])\n'.encode(), r'\1\r\n'.encode(), content)
> > >> >>> +        content = re.sub(r'^\n'.encode(), r'\r\n'.encode(), content, flags = re.MULTILINE)
> > >> >>> +        # Add a new empty line if the file is not end with one
> > >> >>> +        content = re.sub(r'([^\r\n])$'.encode(), r'\1\r\n'.encode(), content)
> > >> >>> +        # Remove trailing white spaces
> > >> >>> +        content = re.sub(r'[ \t]+(\r\n)'.encode(), r'\1'.encode(), content, flags = re.MULTILINE)
> > >> >>> +        # Replace '\t' with two spaces
> > >> >>> +        content = re.sub('\t'.encode(), '  '.encode(), content)
> > >> >>> +        fd = open(file, 'wb')
> > >> >>> +        fd.write(content)
> > >> >>> +        fd.close()
> > >> >>> +        print(file)
> > >> >>> +
> > >> >>> +if __name__ == "__main__":
> > >> >>> +    sys.exit(FormatFiles())
> > >> >>> \ No newline at end of file
> > >> >>> --
> > >> >>> 2.8.0.windows.1
> > >> >>>
> > >> >>> _______________________________________________
> > >> >>> edk2-devel mailing list
> > >> >>> edk2-devel@lists.01.org
> > >> >>> https://lists.01.org/mailman/listinfo/edk2-devel
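
For reference, the review comments on the quoted patch (use the with statement, collect files in a set, keep every .encode() in one wrapper) could be combined roughly as follows. This is a minimal sketch, not the submitted script: the helper names _sub, to_dos, collect_files and format_file are hypothetical, the lookbehind pattern for bare LF is a substitution for the two line-ending passes in the patch, and skipping the write when nothing changed is an added assumption.

```python
import os
import re


def _sub(pattern, repl, content, flags=0):
    # Single place where str patterns are encoded to match bytes content.
    return re.sub(pattern.encode(), repl.encode(), content, flags=flags)


def to_dos(content):
    """Apply the four formatting rules from the RFC to a bytes buffer."""
    # Convert every LF that is not already preceded by CR into CRLF.
    content = _sub(r'(?<!\r)\n', r'\r\n', content)
    # Add a final CRLF if the file does not end with a line ending.
    content = _sub(r'([^\r\n])$', r'\1\r\n', content)
    # Remove trailing spaces and tabs before each CRLF.
    content = _sub(r'[ \t]+(\r\n)', r'\1', content)
    # Replace each remaining tab with two spaces.
    content = _sub('\t', '  ', content)
    return content


def collect_files(root, extensions):
    # A set is enough because processing order does not matter.
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(tuple(extensions)):
                found.add(os.path.join(dirpath, name))
    return found


def format_file(path):
    # The with statement closes the file even if read or write raises.
    with open(path, 'rb') as fd:
        content = fd.read()
    converted = to_dos(content)
    if converted != content:
        with open(path, 'wb') as fd:
            fd.write(converted)
    return converted != content
```

Keeping the conversion in to_dos, separate from file handling, also makes it easy to run on a single file when PatchCheck.py reports an issue.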