Mike,

Perhaps a default set of file extensions that can be overridden?
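
A minimal sketch of what I mean, assuming the extension list from the RFC commit message becomes the default and any extensions given on the command line replace it. The names DEFAULT_EXTENSIONS and parse_args are hypothetical, not part of the posted patch:

```python
import argparse

# Extensions the RFC commit message lists as DOS-format files in edk2.
DEFAULT_EXTENSIONS = (
    '.h', '.c', '.nasm', '.nasmb', '.asm', '.S', '.inf', '.dec', '.dsc',
    '.fdf', '.uni', '.asl', '.aslc', '.vfr', '.idf', '.txt', '.bat', '.py',
)


def parse_args(argv=None):
    """Parse the command line; 'extensions' is optional in this sketch."""
    parser = argparse.ArgumentParser(
        description='Convert source files to DOS format.')
    parser.add_argument('path', help='The path for files to be converted.')
    # nargs='*' (instead of '+') lets the caller omit the extensions.
    parser.add_argument('extensions', nargs='*',
                        help='File extensions filter. (Example: .txt .c .h)')
    args = parser.parse_args(argv)
    # Fall back to the documented defaults when nothing was entered.
    if not args.extensions:
        args.extensions = list(DEFAULT_EXTENSIONS)
    return args
```

With this, "FormatDosFiles.py MdePkg" would scan all documented extensions, while "FormatDosFiles.py MdePkg .c .h" would restrict the scan to those two.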

-Jaben


> On May 21, 2018, at 3:41 PM, Kinney, Michael D <michael.d.kin...@intel.com> 
> wrote:
> 
> Liming,
> 
> We have a set of standard flags for tools that 
> should always be present.
> 
> --help
> -v
> -q
> --debug
> 
> We should also always have the program name,
> description, version, and copyright.
> 
> Please see BaseTools/Scripts/BinToPcd.py as 
> an example.
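
A rough sketch of the standard flags with argparse follows. The program name, version string, and description are placeholders, the --version option and the 0-9 range for --debug are assumptions, and none of this is copied from BinToPcd.py:

```python
import argparse

# Placeholder metadata for illustration only.
PROG = 'FormatDosFiles'
VERSION = PROG + ' Version 0.1'
COPYRIGHT = 'Copyright (c) 2018, Intel Corporation. All rights reserved.'
DESCRIPTION = 'Convert source files to DOS format.'


def build_parser():
    """Build a parser carrying the standard tool flags."""
    parser = argparse.ArgumentParser(
        prog=PROG,
        description=DESCRIPTION + ' ' + COPYRIGHT,
    )
    # --help is added by argparse automatically.
    parser.add_argument('--version', action='version', version=VERSION)
    parser.add_argument('-v', '--verbose', action='store_true',
                        help='Increase output messages')
    parser.add_argument('-q', '--quiet', action='store_true',
                        help='Reduce output messages')
    # Assumed here: debug level is an integer from 0 to 9.
    parser.add_argument('--debug', type=int, metavar='[0-9]',
                        choices=range(0, 10), default=0,
                        help='Set debug level')
    return parser
```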
> 
> It might be useful to have a way to run this tool
> on a single file when BaseTools/Scripts/PatchCheck.py
> reports an issue.
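
One possible way to support a single file, sketched under the assumption that an explicitly named file should bypass the extension filter. The helper name iter_targets is hypothetical:

```python
import os


def iter_targets(path, extensions):
    """Yield the files to convert for a file path or a directory path."""
    if os.path.isfile(path):
        # A file named directly is converted regardless of its extension.
        yield path
        return
    # str.endswith accepts a tuple of suffixes.
    suffixes = tuple(extensions)
    for dirpath, _dirnames, filenames in os.walk(path):
        for filename in filenames:
            if filename.endswith(suffixes):
                yield os.path.join(dirpath, filename)
```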
> 
> Do you think it would be good to have one option that
> scans a path for all file extensions documented as
> using DOS line endings, so the extensions do not have
> to be entered?
> 
> Mike
> 
> 
>> -----Original Message-----
>> From: edk2-devel [mailto:edk2-devel-
>> boun...@lists.01.org] On Behalf Of Carsey, Jaben
>> Sent: Monday, May 21, 2018 7:50 AM
>> To: Gao, Liming <liming....@intel.com>; edk2-
>> de...@lists.01.org
>> Subject: Re: [edk2] [RFC] Formalize source files to
>> follow DOS format
>> 
>> Liming,
>> 
>> One PEP 8 thing.
>> Can you change to use the with statement for the file
>> read/write?
>> 
>> Other small thoughts.
>> I think that FileList should be changed to a set as
>> order is not important.
>> Maybe wrap the re.sub function with your own so all
>> the .encode() calls are in one location?  As we move to Python
>> 3 we will have fewer changes to make.
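
To illustrate the three suggestions together, here is a hedged sketch rather than a drop-in replacement for the patch. The helper names sub_bytes, collect_files, and format_file are made up, the lookbehind pattern is my own variation on the two line-ending substitutions in the patch, and the remaining substitutions from the patch are left out for brevity:

```python
import os
import re


def sub_bytes(pattern, repl, content, flags=0):
    """re.sub on bytes content; the only place where patterns are encoded."""
    return re.sub(pattern.encode(), repl.encode(), content, flags=flags)


def collect_files(root, extensions):
    """Return a set of matching paths; order does not matter here."""
    suffixes = tuple(extensions)
    return {os.path.join(dirpath, filename)
            for dirpath, _dirnames, filenames in os.walk(root)
            for filename in filenames
            if filename.endswith(suffixes)}


def format_file(path):
    """Rewrite one file in place, using 'with' so the file is always closed."""
    with open(path, 'rb') as fd:
        content = fd.read()
    # Convert every LF that is not already preceded by CR to CRLF.
    content = sub_bytes(r'(?<!\r)\n', '\r\n', content)
    # Remove trailing spaces and tabs before each CRLF.
    content = sub_bytes(r'[ \t]+(\r\n)', r'\1', content)
    with open(path, 'wb') as fd:
        fd.write(content)
```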
>> 
>> 
>>> -----Original Message-----
>>> From: edk2-devel [mailto:edk2-devel-
>> boun...@lists.01.org] On Behalf Of
>>> Liming Gao
>>> Sent: Sunday, May 20, 2018 9:52 PM
>>> To: edk2-devel@lists.01.org
>>> Subject: [edk2] [RFC] Formalize source files to follow
>> DOS format
>>> 
>>> FormatDosFiles.py is added to clean up dos source
>> files. It is based on
>>> the rules defined in EDKII C Coding Standards
>> Specification.
>>> 5.1.2 Do not use tab characters
>>> 5.1.6 Only use CRLF (Carriage Return Line Feed) line
>> endings.
>>> 5.1.7 All files must end with CRLF
>>> No trailing white space in one line. (To be added in
>> spec)
>>> 
>>> The source files in edk2 project with the below
>> postfix are dos format.
>>> .h .c .nasm .nasmb .asm .S .inf .dec .dsc .fdf .uni
>> .asl .aslc .vfr .idf
>>> .txt .bat .py
>>> 
>>> The package maintainer can use this script to clean up
>> all files in his
>>> package. The preferred way is to create one patch per
>> package.
>>> 
>>> Contributed-under: TianoCore Contribution Agreement
>> 1.1
>>> Signed-off-by: Liming Gao <liming....@intel.com>
>>> ---
>>> BaseTools/Scripts/FormatDosFiles.py | 93
>>> +++++++++++++++++++++++++++++++++++++
>>> 1 file changed, 93 insertions(+)
>>> create mode 100644
>> BaseTools/Scripts/FormatDosFiles.py
>>> 
>>> diff --git a/BaseTools/Scripts/FormatDosFiles.py
>>> b/BaseTools/Scripts/FormatDosFiles.py
>>> new file mode 100644
>>> index 0000000..c3a5476
>>> --- /dev/null
>>> +++ b/BaseTools/Scripts/FormatDosFiles.py
>>> @@ -0,0 +1,93 @@
>>> +# @file FormatDosFiles.py
>>> +# This script formats the source files to follow dos
>> style.
>>> +# It supports Python2.x and Python3.x both.
>>> +#
>>> +#  Copyright (c) 2018, Intel Corporation. All rights
>> reserved.<BR>
>>> +#
>>> +#  This program and the accompanying materials
>>> +#  are licensed and made available under the terms
>> and conditions of the
>>> BSD License
>>> +#  which accompanies this distribution.  The full
>> text of the license may be
>>> found at
>>> +#  http://opensource.org/licenses/bsd-license.php
>>> +#
>>> +#  THE PROGRAM IS DISTRIBUTED UNDER THE BSD LICENSE
>> ON AN "AS IS"
>>> BASIS,
>>> +#  WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND,
>> EITHER
>>> EXPRESS OR IMPLIED.
>>> +#
>>> +
>>> +#
>>> +# Import Modules
>>> +#
>>> +import argparse
>>> +import os
>>> +import os.path
>>> +import re
>>> +import sys
>>> +
>>> +"""
>>> +difference of string between python2 and python3:
>>> +
>>> +there is a large difference of string in python2 and
>> python3.
>>> +
>>> +in python2,there are two type string,unicode string
>> (unicode type) and 8-bit
>>> string (str type).
>>> +    us = u"abcd",
>>> +    unicode string,which is internally stored as unicode
>> code point.
>>> +    s = "abcd",s = b"abcd",s = r"abcd",
>>> +    all of them are 8-bit string,which is internally
>> stored as bytes.
>>> +
>>> +in python3,a new type called bytes replace 8-bit
>> string,and str type is
>>> regarded as unicode string.
>>> +    s = "abcd", s = u"abcd", s = r"abcd",
>>> +    all of them are str type,which is internally stored
>> unicode code point.
>>> +    bs = b"abcd",
>>> +    bytes type,which is internally stored as bytes
>>> +
>>> +in python2 ,the both type string can be mixed use,but
>> in python3 it could
>>> not,
>>> +which means the pattern and content in re match
>> should be the same type
>>> in python3.
>>> +in function FormatFile,it read file in binary mode so
>> that the content is bytes
>>> type,so the pattern should also be bytes type.
>>> +As a result,I add encode() to make it compatible
>> among python2 and
>>> python3.
>>> +
>>> +difference of encode,decode in python2 and python3:
>>> +the builtin function str.encode(encoding) and
>> str.decode(encoding) are
>>> used for convert between 8-bit string and unicode
>> string.
>>> +
>>> +in python2
>>> +    encode convert unicode type to str type.decode vice
>> versa.default
>>> encoding is ascii.
>>> +    for example: s = us.encode()
>>> +    but if the us is str type,the code will also work.it
>> will be firstly convert
>>> to unicode type,
>>> +    in this situation,the call equals s =
>> us.decode().encode().
>>> +
>>> +in python3
>>> +    encode convert str type to bytes type,decode vice
>> versa.default
>>> encoding is utf8.
>>> +    for example:
>>> +    bs = s.encode(),only str type has encode method,so
>> that won't be
>>> used wrongly.decode is the same.
>>> +
>>> +in conclusion:
>>> +    this code could work the same in python27 and
>> python36
>>> environment as far as the re pattern satisfy ascii
>> character set.
>>> +
>>> +"""
>>> +def FormatFiles():
>>> +    parser = argparse.ArgumentParser()
>>> +    parser.add_argument('path', nargs=1, help='The
>> path for files to be
>>> converted.')
>>> +    parser.add_argument('extensions', nargs='+',
>> help='File extensions filter.
>>> (Example: .txt .c .h)')
>>> +    args = parser.parse_args()
>>> +    filelist = []
>>> +    for dirpath, dirnames, filenames in
>> os.walk(args.path[0]):
>>> +        for filename in [f for f in filenames if
>> any(f.endswith(ext) for ext in
>>> args.extensions)]:
>>> +            filelist.append(os.path.join(dirpath,
>> filename))
>>> +    for file in filelist:
>>> +        fd = open(file, 'rb')
>>> +        content = fd.read()
>>> +        fd.close()
>>> +        # Convert the line endings to CRLF
>>> +        content = re.sub(r'([^\r])\n'.encode(),
>> r'\1\r\n'.encode(), content)
>>> +        content = re.sub(r'^\n'.encode(),
>> r'\r\n'.encode(), content, flags =
>>> re.MULTILINE)
>>> +        # Add a new empty line if the file is not end
>> with one
>>> +        content = re.sub(r'([^\r\n])$'.encode(),
>> r'\1\r\n'.encode(), content)
>>> +        # Remove trailing white spaces
>>> +        content = re.sub(r'[ \t]+(\r\n)'.encode(),
>> r'\1'.encode(), content, flags =
>>> re.MULTILINE)
>>> +        # Replace '\t' with two spaces
>>> +        content = re.sub('\t'.encode(), '
>> '.encode(), content)
>>> +        fd = open(file, 'wb')
>>> +        fd.write(content)
>>> +        fd.close()
>>> +        print(file)
>>> +
>>> +if __name__ == "__main__":
>>> +    sys.exit(FormatFiles())
>>> \ No newline at end of file
>>> --
>>> 2.8.0.windows.1
>>> 
>>> _______________________________________________
>>> edk2-devel mailing list
>>> edk2-devel@lists.01.org
>>> https://lists.01.org/mailman/listinfo/edk2-devel