Re: [edk2] [RFC] Formalize source files to follow DOS format

Carsey, Jaben Thu, 24 May 2018 07:13:33 -0700

Using the with statement follows PEP 8 coding style.

The technical benefit is that the file is still closed if an exception occurs.
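
A minimal sketch of the point about exceptions, not taken from the patch. The helper names (read_bytes, write_bytes, closed_after_error, demo) are hypothetical and only there for illustration:

```python
import os
import tempfile


def read_bytes(path):
    # The with block closes fd on exit, even if read() raises.
    with open(path, 'rb') as fd:
        return fd.read()


def write_bytes(path, content):
    with open(path, 'wb') as fd:
        fd.write(content)


def closed_after_error(path):
    # Simulate a failure inside the block, then report whether fd was closed.
    try:
        with open(path, 'rb') as fd:
            raise ValueError('simulated failure')
    except ValueError:
        return fd.closed


def demo():
    # Round-trip a small CRLF file through the helpers above.
    path = os.path.join(tempfile.mkdtemp(), 'sample.c')
    write_bytes(path, b'int a;\r\n')
    return read_bytes(path), closed_after_error(path)
```

With the explicit open()/close() pattern, an exception raised between the two calls skips the close() unless a try/finally is added by hand.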

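For the other two thoughts from my earlier mail (quoted below), here is a rough sketch of what I had in mind. The names sub_bytes and collect_files are made up for this example and are not in the patch; it assumes the patterns stay within ASCII, as the patch's own notes require:

```python
import os
import re


def sub_bytes(pattern, repl, content, flags=0):
    # Hypothetical wrapper: the only place that calls .encode(), so the
    # call sites can pass plain string literals.
    return re.sub(pattern.encode(), repl.encode(), content, flags=flags)


def collect_files(root, extensions):
    # Gather matching paths into a set, since processing order does not matter.
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(tuple(extensions)):
                found.add(os.path.join(dirpath, name))
    return found
```

A call site would then look like content = sub_bytes(r'[ \t]+(\r\n)', r'\1', content), with no .encode() at the call site.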

> -----Original Message-----
> From: Gao, Liming
> Sent: Thursday, May 24, 2018 1:31 AM
> To: Carsey, Jaben <jaben.car...@intel.com>; edk2-devel@lists.01.org
> Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
> Importance: High
> 
> Jaben:
>   What difference does the with statement make for file read/write?
> 
>   Besides, we use .encode() here to support Python 3. After we move to
> Python 3, this script will not need to change.
> 
> Thanks
> Liming
> >-----Original Message-----
> >From: Carsey, Jaben
> >Sent: Monday, May 21, 2018 10:50 PM
> >To: Gao, Liming <liming....@intel.com>; edk2-devel@lists.01.org
> >Subject: RE: [edk2] [RFC] Formalize source files to follow DOS format
> >
> >Liming,
> >
> >One PEP 8 thing.
> >Can you change the file read/write to use the with statement?
> >
> >Other small thoughts.
> >I think that FileList should be changed to a set, as order is not important.
> >Maybe wrap the re.sub function with your own so all the .encode() calls are
> >in one location?  As we move to Python 3 we will have fewer changes to make.
> >
> >
> >> -----Original Message-----
> >> From: edk2-devel [mailto:edk2-devel-boun...@lists.01.org] On Behalf Of
> >> Liming Gao
> >> Sent: Sunday, May 20, 2018 9:52 PM
> >> To: edk2-devel@lists.01.org
> >> Subject: [edk2] [RFC] Formalize source files to follow DOS format
> >>
> >> FormatDosFiles.py is added to clean up DOS-format source files. It is based
> >> on the rules defined in the EDKII C Coding Standards Specification.
> >> 5.1.2 Do not use tab characters
> >> 5.1.6 Only use CRLF (Carriage Return Line Feed) line endings.
> >> 5.1.7 All files must end with CRLF
> >> No trailing white space on any line. (To be added to the spec)
> >>
> >> The source files in the edk2 project with the extensions below are in DOS format.
> >> .h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni .asl .aslc .vfr .idf
> >> .txt .bat .py
> >>
> >> The package maintainer can use this script to clean up all files in his
> >> package. The preferred way is to create one patch per package.
> >>
> >> Contributed-under: TianoCore Contribution Agreement 1.1
> >> Signed-off-by: Liming Gao <liming....@intel.com>
> >> ---
> >>  BaseTools/Scripts/FormatDosFiles.py | 93 +++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 93 insertions(+)
> >>  create mode 100644 BaseTools/Scripts/FormatDosFiles.py
> >>
> >> diff --git a/BaseTools/Scripts/FormatDosFiles.py b/BaseTools/Scripts/FormatDosFiles.py
> >> new file mode 100644
> >> index 0000000..c3a5476
> >> --- /dev/null
> >> +++ b/BaseTools/Scripts/FormatDosFiles.py
> >> @@ -0,0 +1,93 @@
> >> +# @file FormatDosFiles.py
> >> +# This script formats the source files to follow DOS style.
> >> +# It supports both Python 2.x and Python 3.x.
> >> +#
> >> +#  Copyright (c) 2018, Intel Corporation. All rights reserved.<BR>
> >> +#
> >> +#  This program and the accompanying materials
> >> +#  are licensed and made available under the terms and conditions of the BSD License
> >> +#  which accompanies this distribution.  The full text of the license may be found at
> >> +#  http://opensource.org/licenses/bsd-license.php
> >> +#
> >> +#  THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE ON AN "AS IS" BASIS,
> >> +#  WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED.
> >> +#
> >> +
> >> +#
> >> +# Import Modules
> >> +#
> >> +import argparse
> >> +import os
> >> +import os.path
> >> +import re
> >> +import sys
> >> +
> >> +"""
> >> +Difference of strings between Python 2 and Python 3:
> >> +
> >> +There is a large difference between strings in Python 2 and Python 3.
> >> +
> >> +In Python 2, there are two string types: unicode string (unicode type) and 8-bit string (str type).
> >> +  us = u"abcd",
> >> +  unicode string, which is internally stored as unicode code points.
> >> +  s = "abcd", s = b"abcd", s = r"abcd",
> >> +  all of them are 8-bit strings, which are internally stored as bytes.
> >> +
> >> +In Python 3, a new type called bytes replaces the 8-bit string, and the str type is regarded as a unicode string.
> >> +  s = "abcd", s = u"abcd", s = r"abcd",
> >> +  all of them are str type, which is internally stored as unicode code points.
> >> +  bs = b"abcd",
> >> +  bytes type, which is internally stored as bytes.
> >> +
> >> +In Python 2, the two string types can be mixed, but in Python 3 they cannot,
> >> +which means the pattern and content in a re match must be the same type in Python 3.
> >> +The function FormatFiles reads the file in binary mode, so the content is bytes type and the pattern must also be bytes type.
> >> +As a result, I add encode() to make it compatible with both Python 2 and Python 3.
> >> +
> >> +Difference of encode and decode in Python 2 and Python 3:
> >> +The built-in methods str.encode(encoding) and str.decode(encoding) are used to convert between 8-bit strings and unicode strings.
> >> +
> >> +In Python 2:
> >> +  encode converts unicode type to str type; decode does the reverse. The default encoding is ascii.
> >> +  For example: s = us.encode()
> >> +  But if us is str type, the code will also work. It is first converted to unicode type;
> >> +  in this situation, the call equals s = us.decode().encode().
> >> +
> >> +In Python 3:
> >> +  encode converts str type to bytes type; decode does the reverse. The default encoding is utf8.
> >> +  For example:
> >> +  bs = s.encode(). Only the str type has an encode method, so it won't be used wrongly. decode is the same.
> >> +
> >> +In conclusion:
> >> +  this code works the same in Python 2.7 and Python 3.6 environments as long as the re pattern stays within the ASCII character set.
> >> +
> >> +"""
> >> +def FormatFiles():
> >> +    parser = argparse.ArgumentParser()
> >> +    parser.add_argument('path', nargs=1, help='The path for files to be converted.')
> >> +    parser.add_argument('extensions', nargs='+', help='File extensions filter. (Example: .txt .c .h)')
> >> +    args = parser.parse_args()
> >> +    filelist = []
> >> +    for dirpath, dirnames, filenames in os.walk(args.path[0]):
> >> +        for filename in [f for f in filenames if any(f.endswith(ext) for ext in args.extensions)]:
> >> +            filelist.append(os.path.join(dirpath, filename))
> >> +    for file in filelist:
> >> +        fd = open(file, 'rb')
> >> +        content = fd.read()
> >> +        fd.close()
> >> +        # Convert the line endings to CRLF
> >> +        content = re.sub(r'([^\r])\n'.encode(), r'\1\r\n'.encode(), content)
> >> +        content = re.sub(r'^\n'.encode(), r'\r\n'.encode(), content, flags = re.MULTILINE)
> >> +        # Add a new empty line if the file is not end with one
> >> +        content = re.sub(r'([^\r\n])$'.encode(), r'\1\r\n'.encode(), content)
> >> +        # Remove trailing white spaces
> >> +        content = re.sub(r'[ \t]+(\r\n)'.encode(), r'\1'.encode(), content, flags = re.MULTILINE)
> >> +        # Replace '\t' with two spaces
> >> +        content = re.sub('\t'.encode(), '  '.encode(), content)
> >> +        fd = open(file, 'wb')
> >> +        fd.write(content)
> >> +        fd.close()
> >> +        print(file)
> >> +
> >> +if __name__ == "__main__":
> >> +    sys.exit(FormatFiles())
> >> \ No newline at end of file
> >> --
> >> 2.8.0.windows.1
> >>
> >> _______________________________________________
> >> edk2-devel mailing list
> >> edk2-devel@lists.01.org
> >> https://lists.01.org/mailman/listinfo/edk2-devel
