This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
     new d7225d6  [CodeFormat] Add clang-format script (#4934)
d7225d6 is described below

commit d7225d61ef33a4dc3356417e32589c680e267b3d
Author: sduzh <[email protected]>
AuthorDate: Sat Nov 28 18:40:06 2020 +0800

    [CodeFormat] Add clang-format script (#4934)
    
    run build-support/check-format.sh to check cpp styles;
    run build-support/clang-format.sh to fix cpp style issues;
---
 build-support/check-format.sh             |  34 +++++++
 build-support/clang-format.sh             |  35 ++++++++
 build-support/lintutils.py                | 111 +++++++++++++++++++++++
 build-support/run_clang_format.py         | 144 ++++++++++++++++++++++++++++++
 docs/en/developer-guide/format-code.md    |  27 ++----
 docs/zh-CN/developer-guide/format-code.md |  24 ++---
 6 files changed, 335 insertions(+), 40 deletions(-)

diff --git a/build-support/check-format.sh b/build-support/check-format.sh
new file mode 100755
index 0000000..633c617
--- /dev/null
+++ b/build-support/check-format.sh
@@ -0,0 +1,34 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+##############################################################
+# This script will run the clang-format to check but without
+# updating cpp files.
+##############################################################
+
+set -eo pipefail
+
+ROOT=`dirname "$0"`
+ROOT=`cd "$ROOT"; pwd`
+
+export DORIS_HOME=`cd "${ROOT}/.."; pwd`
+
+CLANG_FORMAT=${CLANG_FORMAT_BINARY:=$(which clang-format)}
+
+python3 ${DORIS_HOME}/build-support/run_clang_format.py --clang_format_binary="${CLANG_FORMAT}" --source_dirs="${DORIS_HOME}/be/src,${DORIS_HOME}/be/test" --quiet
+
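For readers less familiar with the `${CLANG_FORMAT_BINARY:=$(which clang-format)}` expansion used in the script above, here is a rough Python equivalent of the lookup. It is only a sketch and not part of the commit: `resolve_clang_format` is a hypothetical helper name, and unlike the shell form it does not assign the default back to the environment variable.

```python
import os
import shutil


def resolve_clang_format(env=None):
    """Return the clang-format path: env override first, then PATH lookup."""
    env = os.environ if env is None else env
    # An unset or empty CLANG_FORMAT_BINARY falls through to the PATH search,
    # mirroring how ${VAR:=default} treats unset and null values alike.
    return env.get("CLANG_FORMAT_BINARY") or shutil.which("clang-format")


if __name__ == "__main__":
    # Prints None when clang-format is neither configured nor on PATH.
    print(resolve_clang_format())
```

Passing a plain dict as `env` makes the override easy to try without touching the real environment.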
diff --git a/build-support/clang-format.sh b/build-support/clang-format.sh
new file mode 100755
index 0000000..8682b10
--- /dev/null
+++ b/build-support/clang-format.sh
@@ -0,0 +1,35 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+##############################################################
+# This script runs clang-format to check and fix
+# C++ source files.
+##############################################################
+
+set -eo pipefail
+
+ROOT=`dirname "$0"`
+ROOT=`cd "$ROOT"; pwd`
+
+export DORIS_HOME=`cd "${ROOT}/.."; pwd`
+
+CLANG_FORMAT=${CLANG_FORMAT_BINARY:=$(which clang-format)}
+
+python3 ${DORIS_HOME}/build-support/run_clang_format.py --clang_format_binary="${CLANG_FORMAT}" --fix --source_dirs="${DORIS_HOME}/be/src","${DORIS_HOME}/be/test"
+
+
diff --git a/build-support/lintutils.py b/build-support/lintutils.py
new file mode 100644
index 0000000..e651a3a
--- /dev/null
+++ b/build-support/lintutils.py
@@ -0,0 +1,111 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+# Modified from Apache Arrow project.
+
+import multiprocessing as mp
+import os
+from fnmatch import fnmatch
+from subprocess import Popen
+
+
+def chunk(seq, n):
+    """
+    divide a sequence into equal sized chunks
+    (the last chunk may be smaller, but won't be empty)
+    """
+    chunks = []
+    some = []
+    for element in seq:
+        if len(some) == n:
+            chunks.append(some)
+            some = []
+        some.append(element)
+    if len(some) > 0:
+        chunks.append(some)
+    return chunks
+
+
+def dechunk(chunks):
+    "flatten chunks into a single list"
+    seq = []
+    for chunk in chunks:
+        seq.extend(chunk)
+    return seq
+
+
+def run_parallel(cmds, **kwargs):
+    """
+    Run each of cmds (with shared **kwargs) using subprocess.Popen
+    then wait for all of them to complete.
+    Runs batches of multiprocessing.cpu_count() * 2 from cmds
+    returns a list of tuples containing each process'
+    returncode, stdout, stderr
+    """
+    complete = []
+    for cmds_batch in chunk(cmds, mp.cpu_count() * 2):
+        procs_batch = [Popen(cmd, **kwargs) for cmd in cmds_batch]
+        for proc in procs_batch:
+            stdout, stderr = proc.communicate()
+            complete.append((proc.returncode, stdout, stderr))
+    return complete
+
+
+_source_extensions = '''
+.h
+.cc
+.cpp
+'''.split()
+
+
+def get_sources(source_dir, exclude_globs=[]):
+    sources = []
+    for directory, subdirs, basenames in os.walk(source_dir):
+        for path in [os.path.join(directory, basename)
+                     for basename in basenames]:
+            # filter out non-source files
+            if os.path.splitext(path)[1] not in _source_extensions:
+                continue
+
+            path = os.path.abspath(path)
+
+            # filter out files that match the globs in the globs file
+            if any([fnmatch(path, glob) for glob in exclude_globs]):
+                continue
+
+            sources.append(path)
+    return sources
+
+
+def stdout_pathcolonline(completed_process, filenames):
+    """
+    given a completed process which may have reported some files as problematic
+    by printing the path name followed by ':' then a line number, examine
+    stdout and return the set of actually reported file names
+    """
+    returncode, stdout, stderr = completed_process
+    bfilenames = set()
+    for filename in filenames:
+        bfilenames.add(filename.encode('utf-8') + b':')
+    problem_files = set()
+    for line in stdout.splitlines():
+        for filename in bfilenames:
+            if line.startswith(filename):
+                problem_files.add(filename.decode('utf-8'))
+                bfilenames.remove(filename)
+                break
+    return problem_files, stdout
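The batching done by `chunk` and `run_parallel` above can be tried in isolation. The following is a minimal sketch, not the committed code: `chunk` is rewritten with slicing, `run_batches` is a hypothetical name that takes the batch size as a parameter instead of using `mp.cpu_count() * 2`, and the child commands are stand-ins for clang-format invocations.

```python
import sys
from subprocess import PIPE, Popen


def chunk(seq, n):
    """Split seq into lists of at most n items (the last may be shorter)."""
    seq = list(seq)
    return [seq[i:i + n] for i in range(0, len(seq), n)]


def run_batches(cmds, batch_size, **kwargs):
    """Start up to batch_size processes at a time and collect their results."""
    complete = []
    for batch in chunk(cmds, batch_size):
        # Launch the whole batch first so its processes run concurrently.
        procs = [Popen(cmd, **kwargs) for cmd in batch]
        for proc in procs:
            stdout, stderr = proc.communicate()
            complete.append((proc.returncode, stdout, stderr))
    return complete


if __name__ == "__main__":
    # Hypothetical stand-in commands: each child just prints its index.
    cmds = [[sys.executable, "-c", "print({})".format(i)] for i in range(5)]
    for returncode, stdout, _ in run_batches(cmds, 2, stdout=PIPE, stderr=PIPE):
        print(returncode, stdout.decode().strip())
```

Results come back in the same order as the input commands, which is what lets the caller pair them with filenames using `zip`.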
diff --git a/build-support/run_clang_format.py b/build-support/run_clang_format.py
new file mode 100644
index 0000000..f19c10c
--- /dev/null
+++ b/build-support/run_clang_format.py
@@ -0,0 +1,144 @@
+#!/usr/bin/env python
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+# Modified from Apache Arrow project.
+
+from __future__ import print_function
+import lintutils
+from subprocess import PIPE
+import argparse
+import difflib
+import multiprocessing as mp
+import sys
+from functools import partial
+
+
+# examine the output of clang-format and if changes are
+# present assemble a (unified)patch of the difference
+def _check_one_file(filename, formatted):
+    with open(filename, "rb") as reader:
+        original = reader.read()
+
+    if formatted != original:
+        # Run the equivalent of diff -u
+        diff = list(difflib.unified_diff(
+            original.decode('utf8').splitlines(True),
+            formatted.decode('utf8').splitlines(True),
+            fromfile=filename,
+            tofile="{} (after clang format)".format(
+                filename)))
+    else:
+        diff = None
+
+    return filename, diff
+
+def _check_dir(arguments, source_dir, exclude_globs):
+    formatted_filenames = []
+    for path in lintutils.get_sources(source_dir, exclude_globs):
+        formatted_filenames.append(str(path))
+
+    if arguments.fix:
+        if not arguments.quiet:
+            print("\n".join(map(lambda x: "Formatting {}".format(x),
+                                formatted_filenames)))
+
+        # Break clang-format invocations into chunks: each invocation formats
+        # 16 files. Wait for all processes to complete
+        results = lintutils.run_parallel([
+            [arguments.clang_format_binary, "-style=file", "-i"] + some
+            for some in lintutils.chunk(formatted_filenames, 16)
+        ])
+        for returncode, stdout, stderr in results:
+            # if any clang-format reported a parse error, bubble it
+            if returncode != 0:
+                sys.exit(returncode)
+
+    else:
+        # run an instance of clang-format for each source file in parallel,
+        # then wait for all processes to complete
+        results = lintutils.run_parallel([
+            [arguments.clang_format_binary, "-style=file", filename]
+            for filename in formatted_filenames
+        ], stdout=PIPE, stderr=PIPE)
+
+        checker_args = []
+        for filename, res in zip(formatted_filenames, results):
+            # if any clang-format reported a parse error, bubble it
+            returncode, stdout, stderr = res
+            if returncode != 0:
+                print(stderr)
+                sys.exit(returncode)
+            checker_args.append((filename, stdout))
+
+        error = False
+        pool = mp.Pool()
+        try:
+            # check the output from each invocation of clang-format in parallel
+            for filename, diff in pool.starmap(_check_one_file, checker_args):
+                if not arguments.quiet:
+                    print("Checking {}".format(filename))
+                if diff:
+                    print("{} had clang-format style issues".format(filename))
+                    # Print out the diff to stderr
+                    error = True
+                    # pad with a newline
+                    print(file=sys.stderr)
+                    sys.stderr.writelines(diff)
+        except Exception:
+            error = True
+            raise
+        finally:
+            pool.terminate()
+            pool.join()
+        sys.exit(1 if error else 0)
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(
+        description="Runs clang-format on all of the source "
+        "files. If --fix is specified enforce format by "
+        "modifying in place, otherwise compare the output "
+        "with the existing file and output any necessary "
+        "changes as a patch in unified diff format")
+    parser.add_argument("--clang_format_binary",
+                        required=True,
+                        help="Path to the clang-format binary")
+    parser.add_argument("--exclude_globs",
+                        help="Filename containing globs for files "
+                        "that should be excluded from the checks")
+    parser.add_argument("--source_dirs",
+                        required=True,
+                        help="Comma-separated root directories of the source code")
+    parser.add_argument("--fix", default=False,
+                        action="store_true",
+                        help="If specified, will re-format the source "
+                        "code instead of comparing the re-formatted "
+                        "output, defaults to %(default)s")
+    parser.add_argument("--quiet", default=False,
+                        action="store_true",
+                        help="If specified, only print errors")
+    arguments = parser.parse_args()
+
+    exclude_globs = []
+    if arguments.exclude_globs:
+        with open(arguments.exclude_globs) as f:
+            exclude_globs.extend(line.strip() for line in f)
+
+    for source_dir in arguments.source_dirs.split(','):
+        if len(source_dir) > 0:
+            _check_dir(arguments, source_dir, exclude_globs)
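The check mode above comes down to comparing clang-format's output with the file on disk and building a unified diff when they differ. A minimal sketch of that comparison follows; it is not the committed function. `diff_against_formatted` is a hypothetical helper that takes strings directly rather than reading a file, and the sample C++ text is made up for illustration.

```python
import difflib
import sys


def diff_against_formatted(name, original, formatted):
    """Return unified-diff lines if the two texts differ, else None."""
    if formatted == original:
        return None
    # splitlines(True) keeps line endings, so the diff can be written verbatim.
    return list(difflib.unified_diff(
        original.splitlines(True),
        formatted.splitlines(True),
        fromfile=name,
        tofile="{} (after clang format)".format(name)))


if __name__ == "__main__":
    before = "int main(){return 0;}\n"
    after = "int main() {\n    return 0;\n}\n"
    diff = diff_against_formatted("example.cpp", before, after)
    sys.stdout.writelines(diff)
```

Returning `None` for an already-formatted file lets the caller treat the result as a simple truth value, as the committed loop does with `if diff:`.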
diff --git a/docs/en/developer-guide/format-code.md b/docs/en/developer-guide/format-code.md
index a473f6a..32fde71 100644
--- a/docs/en/developer-guide/format-code.md
+++ b/docs/en/developer-guide/format-code.md
@@ -25,7 +25,7 @@ under the License.
 -->
 
 # Format Code
-To automatically format the code, clang-format is a good choice.
+Doris uses `clang-format` to automatically check the format of your source code.
 
 ## Code Style
 Doris Code Style is based on Google's, makes a few changes. The customized .clang-format
@@ -58,31 +58,16 @@ the version is lower than clang-format-9.0.
 ## Usage
 
 ### CMD
-`clang-format --style=file -i $File$` 
+Change directory to the root directory of Doris sources and run the following command:
+`build-support/clang-format.sh`
 
-`-style=file` Clang-format will try to find the .clang-format file located in the closest parent directory of the input file. When the standard input is used, the search is started from the current directory.
-
-`--lines = m:n` Format a range of lines. Multiple ranges can be formatted by specifying several -lines arguments.
-
-`-i`input file
-
-Note: filter out the files which should not be formatted, when batch clang-formatting files. 
- 
- A example of how to filter \*.h/\*.cpp and exclude some dirs:
- 
- Centos
-
-`find . -type f -not \( -wholename ./env/* \) -regextype posix-egrep -regex
- ".*\.(cpp|h)" | xargs clang-format -i -style=file`
- 
- Mac
- 
- `find -E . -type f -not \( -wholename ./env/* \) -regex ".*\.(cpp|h)" | xargs clang-format -i --style=file`
+NOTE: Python3 is required to run the `clang-format.sh` script.
 
 ### Using clang-format in IDEs or Editors
 #### Clion
 If using the plugin 'ClangFormat' in Clion, choose `Reformat Code` or press the keyboard 
 shortcut.
+
 #### VS Code
 VS Code needs install the extension 'Clang-Format', and specify the executable path of 
 clang-format in settings.
@@ -93,4 +78,4 @@ Open the vs code configuration page and search `clang_format`, fill the box as f
 "clang_format_path":  "$clang-format path$",
 "clang_format_style": "file"
 ```
-Then, right click the file and choose `Format Document`.
\ No newline at end of file
+Then, right click the file and choose `Format Document`.
diff --git a/docs/zh-CN/developer-guide/format-code.md b/docs/zh-CN/developer-guide/format-code.md
index 129d58f..3d4e803 100644
--- a/docs/zh-CN/developer-guide/format-code.md
+++ b/docs/zh-CN/developer-guide/format-code.md
@@ -25,7 +25,7 @@ under the License.
 -->
 
 # 代码格式化
-为了自动格式化代码,推荐使用clang-format进行代码格式化。
+Doris使用clang-format进行代码格式化,并在build-support目录下提供了封装脚本`clang-format.sh`.
 
 ## 代码风格定制
 Doris的代码风格在Google Style的基础上稍有改动,定制为.clang-format文件,位于Doris根目录。
@@ -53,24 +53,10 @@ clang-format程序的版本匹配,从支持的StyleOption上看,应该是低
 ## 使用方式
 
 ### 命令行运行
-`clang-format --style=file -i $File$` 
+cd到Doris根目录下,然后执行如下命令:
+`build-support/clang-format.sh`
 
-`--sytle=file`就会自动找到.clang-format文件,根据文件Option配置来格式化代码。
-
-`--lines=m:n`通过指定起始行和结束行修改文件的指定范围
-
-`-i`指定被格式化文件
-
-批量文件clang-format时,需注意过滤不应该格式化的文件。例如,只格式化*.h/*.cpp,并排除某些文件夹:
-
- Centos
-
-`find . -type f -not \( -wholename ./env/* \) -regextype posix-egrep -regex
- ".*\.(cpp|h)" | xargs clang-format -i -style=file`
- 
- Mac
- 
- `find -E . -type f -not \( -wholename ./env/* \) -regex ".*\.(cpp|h)" | xargs clang-format -i --style=file`
+注:`clang-format.sh`脚本要求您的机器上安装了python 3
 
 ### 在IDE或Editor中使用clang-format
 #### Clion
@@ -85,4 +71,4 @@ VS Code需安装扩展程序Clang-Format,但需要自行提供clang-format执
 "clang_format_style": "file"
 ```
 
-然后,右键点击`Format Document`即可。
\ No newline at end of file
+然后,右键点击`Format Document`即可。
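
One detail of the check mode in `run_clang_format.py` is worth a note: the non-fix branch of `_check_dir` ends with `sys.exit(...)`, so as written the loop over `--source_dirs` appears to stop after the first directory. One possible way to cover every directory is to have the per-directory check return a flag and exit once at the end. The sketch below only illustrates that idea; `check_all` and the `check_dir` callback are hypothetical names, not functions from the commit.

```python
def check_all(source_dirs, check_dir):
    """Run check_dir on every non-empty directory; return 1 if any failed."""
    failed = False
    for source_dir in source_dirs.split(","):
        if not source_dir:
            continue
        # Call the checker unconditionally so every directory is visited,
        # even after an earlier one has already reported problems.
        if check_dir(source_dir):
            failed = True
    return 1 if failed else 0


if __name__ == "__main__":
    # Hypothetical checker: pretend only be/test has style issues.
    status = check_all("be/src,be/test", lambda d: d.endswith("test"))
    print("exit status:", status)
```

A real caller would pass the resulting status to `sys.exit` after the loop has finished.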

