Copilot commented on code in PR #47:
URL: https://github.com/apache/solr-orbit/pull/47#discussion_r3329380946

##########
scripts/release-checks.sh:
##########
@@ -0,0 +1,36 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+set -euo pipefail
+cd "$(dirname "$0")/.."
+
+release_version=$1
+next_version=$2
+
+if [ -z "$release_version" ] || [ -z "$next_version" ]; then
+  echo "Usage: $0 <release_version> <next_version>"
+  echo " e.g. $0 0.9.2 0.9.3"
+  exit 1
+fi

Review Comment:
   With `set -euo pipefail` enabled on line 19, the assignments `release_version=$1` and `next_version=$2` will cause the script to abort with an "unbound variable" error when invoked without arguments, before reaching the `-z` check on line 25. The usage message is unreachable. Either guard the assignments with defaults (e.g. `release_version=${1:-}`) or check `$#` before assigning.

##########
scripts/download-rat.py:
##########
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Download and verify the Apache RAT JAR.
+
+Usage: python3 scripts/download-rat.py <version> <target_jar_path>
+
+Downloads apache-rat-<version>-bin.tar.gz from the Apache mirror,
+verifies its SHA-512 checksum, extracts the JAR, and writes it to
+<target_jar_path>.
+"""
+
+import hashlib
+import io
+import sys
+import tarfile
+import urllib.request
+from pathlib import Path
+
+BASE_URL = "https://downloads.apache.org/creadur"

Review Comment:
   `downloads.apache.org` only keeps the current release of each Apache project; older releases are moved to `archive.apache.org`. Pinning RAT to a fixed `RAT_VERSION` (currently `0.18`) means this download will start 404'ing as soon as a newer RAT release is published. Consider either pulling from `https://archive.apache.org/dist/creadur/...` (which retains all versions) or falling back to it when the primary URL returns 404.

##########
scripts/download-rat.py:
##########
@@ -0,0 +1,75 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""Download and verify the Apache RAT JAR.
+
+Usage: python3 scripts/download-rat.py <version> <target_jar_path>
+
+Downloads apache-rat-<version>-bin.tar.gz from the Apache mirror,
+verifies its SHA-512 checksum, extracts the JAR, and writes it to
+<target_jar_path>.
+"""
+
+import hashlib
+import io
+import sys
+import tarfile
+import urllib.request
+from pathlib import Path
+
+BASE_URL = "https://downloads.apache.org/creadur"
+
+
+def download(url: str) -> bytes:
+    print(f"Downloading {url}", flush=True)
+    with urllib.request.urlopen(url) as resp:
+        return resp.read()
+
+
+def main() -> None:
+    if len(sys.argv) != 3:
+        print(f"Usage: {sys.argv[0]} <version> <target_jar_path>", file=sys.stderr)
+        sys.exit(1)
+
+    version, target = sys.argv[1], Path(sys.argv[2])
+    tarball_name = f"apache-rat-{version}-bin.tar.gz"
+    tar_url = f"{BASE_URL}/apache-rat-{version}/{tarball_name}"
+    sha_url = f"{tar_url}.sha512"
+
+    tarball = download(tar_url)
+    expected_hex = download(sha_url).decode().strip()
+
+    actual_hex = hashlib.sha512(tarball).hexdigest()
+    if actual_hex != expected_hex:
+        print(f"SHA-512 mismatch for {tarball_name}!", file=sys.stderr)
+        print(f" expected: {expected_hex}", file=sys.stderr)
+        print(f" actual: {actual_hex}", file=sys.stderr)
+        sys.exit(1)

Review Comment:
   The Apache `.sha512` files published under `downloads.apache.org` are typically not bare hex digests; they use either the GNU coreutils format (`<hash> <filename>`) or BSD format (`SHA512 (<filename>) = <hash>`).
   After `.decode().strip()`, `expected_hex` will still contain the filename (or the `SHA512 (...) =` prefix), so the direct equality check on line 57 will always fail and `make rat` will never succeed on a fresh machine. Parse the hex value out of the checksum file (e.g. take the first whitespace-delimited token or extract the value after `=`) before comparing.

##########
THIRD_PARTY.md:
##########
@@ -0,0 +1,119 @@
+| Name | Version | License | Author | URL |
+|------|---------|---------|--------|-----|
+| Faker | 40.19.1 | MIT License | joke2k | https://github.com/joke2k/faker |
+| Jinja2 | 3.1.6 | BSD License | UNKNOWN | https://github.com/pallets/jinja/ |
+| MarkupSafe | 3.0.3 | BSD-3-Clause | UNKNOWN | https://github.com/pallets/markupsafe/ |
+| PyJWT | 2.13.0 | MIT | Jose Padilla <[email protected]> | https://github.com/jpadilla/pyjwt |
+| PyYAML | 6.0.3 | MIT License | Kirill Simonov | https://pyyaml.org/ |
+| Pygments | 2.20.0 | BSD-2-Clause | Georg Brandl <[email protected]> | https://pygments.org |
+| annotated-types | 0.7.0 | MIT License | Adrian Garcia Badaracco <[email protected]>, Samuel Colvin <[email protected]>, Zac Hatfield-Dodds <[email protected]> | https://github.com/annotated-types/annotated-types |
+| astroid | 4.0.4 | LGPL-2.1-or-later | UNKNOWN | https://github.com/pylint-dev/astroid/issues |
+| attrs | 26.1.0 | MIT | Hynek Schlawack <[email protected]> | https://www.attrs.org/en/stable/changelog.html |
+| bokeh | 3.9.0 | BSD-3-Clause | Bokeh Team | https://bokeh.org |
+| boto3 | 1.43.14 | Apache-2.0 | Amazon Web Services | https://github.com/boto/boto3 |
+| botocore | 1.43.14 | Apache-2.0 | Amazon Web Services | https://github.com/boto/botocore |
+| cachetools | 7.1.4 | MIT | Thomas Kemmer <[email protected]> | https://github.com/tkem/cachetools/ |
+| certifi | 2026.5.20 | Mozilla Public License 2.0 (MPL 2.0) | Kenneth Reitz | https://github.com/certifi/python-certifi |
+| cffi | 2.0.0 | MIT | Armin Rigo, Maciej Fijalkowski | https://cffi.readthedocs.io/en/latest/whatsnew.html |
+| charset-normalizer | 3.4.7 | MIT | "Ahmed R. TAHRI" <[email protected]> | https://github.com/jawah/charset_normalizer/blob/master/CHANGELOG.md |
+| click | 8.4.1 | BSD-3-Clause | UNKNOWN | https://github.com/pallets/click/ |
+| cloudpickle | 3.1.2 | BSD License | The cloudpickle developer team | https://github.com/cloudpipe/cloudpickle |
+| colorama | 0.4.6 | BSD License | Jonathan Hartley <[email protected]> | https://github.com/tartley/colorama |
+| contourpy | 1.3.3 | BSD License | Ian Thomas <[email protected]> | https://github.com/contourpy/contourpy |
+| coverage | 7.14.1 | Apache-2.0 | Ned Batchelder and 255 others | https://github.com/coveragepy/coveragepy |
+| cryptography | 48.0.0 | Apache-2.0 OR BSD-3-Clause | The Python Cryptographic Authority and individual contributors <[email protected]> | https://github.com/pyca/cryptography |
+| dask | 2026.3.0 | BSD-3-Clause | UNKNOWN | https://github.com/dask/dask/ |
+| dill | 0.4.1 | BSD License | Mike McKerns | https://github.com/uqfoundation/dill |
+| distlib | 0.4.0 | Python Software Foundation License | Vinay Sajip | https://github.com/pypa/distlib |
+| distributed | 2026.3.0 | BSD-3-Clause | UNKNOWN | https://distributed.dask.org |
+| docutils | 0.22.4 | BSD License; GNU General Public License (GPL); Public Domain | David Goodger <[email protected]> | https://docutils.sourceforge.io |
+| filelock | 3.29.0 | MIT | UNKNOWN | https://github.com/tox-dev/py-filelock |
+| fsspec | 2026.4.0 | BSD-3-Clause | UNKNOWN | https://github.com/fsspec/filesystem_spec |
+| github3.py | 4.0.1 | BSD-3-Clause | Ian Stapleton Cordasco <[email protected]> | https://github.com/sigmavirus24/github3.py |
+| google-auth | 2.53.0 | Apache Software License | Google Cloud Platform | https://github.com/googleapis/google-auth-library-python |
+| google-crc32c | 1.8.0 | UNKNOWN | Google LLC | https://github.com/googleapis/python-crc32c |
+| google-resumable-media | 2.9.0 | Apache Software License | Google Cloud Platform | https://github.com/googleapis/google-resumable-media-python |
+| h5py | 3.16.0 | BSD-3-Clause | Andrew Collette <[email protected]> | https://www.h5py.org/ |
+| id | 1.6.1 | Apache Software License | UNKNOWN | https://pypi.org/project/id/ |
+| idna | 3.16 | BSD-3-Clause | Kim Davies <[email protected]> | https://github.com/kjd/idna |
+| ijson | 3.5.0 | BSD-3-Clause AND ISC | Rodrigo Tobar <[email protected]>, Ivan Sagalaev <[email protected]> | https://github.com/ICRAR/ijson |
+| iniconfig | 2.3.0 | MIT | Ronny Pfannschmidt <[email protected]>, Holger Krekel <[email protected]> | https://github.com/pytest-dev/iniconfig |
+| isort | 5.13.2 | MIT License | Timothy Crosley | https://pycqa.github.io/isort/ |
+| jaraco.classes | 3.4.0 | MIT License | Jason R. Coombs | https://github.com/jaraco/jaraco.classes |
+| jaraco.context | 6.1.2 | MIT | "Jason R. Coombs" <[email protected]> | https://github.com/jaraco/jaraco.context |
+| jaraco.functools | 4.5.0 | MIT | "Jason R. Coombs" <[email protected]> | https://github.com/jaraco/jaraco.functools |
+| jmespath | 1.1.0 | MIT License | James Saryerwinnie | https://github.com/jmespath/jmespath.py |
+| jsonschema | 4.26.0 | MIT | Julian Berman <[email protected]> | https://github.com/python-jsonschema/jsonschema |
+| jsonschema-specifications | 2025.9.1 | MIT | Julian Berman <[email protected]> | https://github.com/python-jsonschema/jsonschema-specifications |
+| jwcrypto | 1.5.7 | LGPL-3.0-or-later | UNKNOWN | https://github.com/latchset/jwcrypto |
+| keyring | 25.7.0 | MIT | Kang Zhang <[email protected]> | https://github.com/jaraco/keyring |
+| lazy-object-proxy | 1.12.0 | BSD-2-Clause | Ionel Cristian Mărieș <[email protected]> | https://python-lazy-object-proxy.readthedocs.io/en/latest/changelog.html |
+| locket | 1.0.0 | BSD License | Michael Williamson | http://github.com/mwilliamson/locket.py |
+| markdown-it-py | 4.2.0 | MIT License | Chris Sewell <[email protected]> | https://github.com/executablebooks/markdown-it-py |
+| mccabe | 0.6.1 | MIT License | Ian Cordasco | https://github.com/pycqa/mccabe |
+| mdurl | 0.1.2 | MIT License | Taneli Hukkinen <[email protected]> | https://github.com/executablebooks/mdurl |
+| mimesis | 19.1.0 | MIT | Isaak Uchakaev <[email protected]> | https://github.com/lk-geimfari/mimesis |
+| more-itertools | 11.1.0 | MIT | Erik Rose <[email protected]> | https://github.com/more-itertools/more-itertools |
+| msgpack | 1.1.2 | Apache-2.0 | Inada Naoki <[email protected]> | https://msgpack.org/ |
+| narwhals | 2.21.2 | MIT License | Marco Gorelli <[email protected]> | https://github.com/narwhals-dev/narwhals |
+| nh3 | 0.3.5 | MIT | messense <[email protected]> | UNKNOWN |
+| numpy | 1.26.4 | BSD License | Travis E. Oliphant et al. | https://numpy.org |
+| packaging | 26.2 | Apache-2.0 OR BSD-2-Clause | Donald Stufft <[email protected]> | https://github.com/pypa/packaging |
+| pandas | 3.0.3 | BSD License | The Pandas Development Team <[email protected]> | https://pandas.pydata.org |
+| partd | 1.4.2 | BSD | UNKNOWN | http://github.com/dask/partd/ |
+| pillow | 12.2.0 | MIT-CMU | Jeffrey 'Alex' Clark <[email protected]> | https://python-pillow.github.io |
+| pkginfo | 1.12.1.2 | MIT License | Tres Seaver, Agendaless Consulting | https://code.launchpad.net/~tseaver/pkginfo/trunk |
+| platformdirs | 4.9.6 | MIT | UNKNOWN | https://github.com/tox-dev/platformdirs |
+| pluggy | 1.6.0 | MIT License | Holger Krekel <[email protected]> | UNKNOWN |
+| psutil | 7.2.2 | BSD-3-Clause | Giampaolo Rodola | https://github.com/giampaolo/psutil |
+| py | 1.11.0 | MIT License | holger krekel, Ronny Pfannschmidt, Benjamin Peterson and others | https://py.readthedocs.io/ |
+| py-cpuinfo | 9.0.0 | MIT License | Matthew Brennan Jones | https://github.com/workhorsy/py-cpuinfo |
+| pyasn1 | 0.6.3 | BSD-2-Clause | Ilya Etingof <[email protected]> | https://github.com/pyasn1/pyasn1 |
+| pyasn1_modules | 0.4.2 | BSD License | Ilya Etingof | https://github.com/pyasn1/pyasn1-modules |
+| pycparser | 3.0 | BSD-3-Clause | Eli Bendersky <[email protected]> | https://github.com/eliben/pycparser |
+| pydantic | 2.13.4 | MIT | Samuel Colvin <[email protected]>, Eric Jolibois <[email protected]>, Hasan Ramezani <[email protected]>, Adrian Garcia Badaracco <[email protected]>, Terrence Dorsey <[email protected]>, David Montague <[email protected]>, Serge Matveenko <[email protected]>, Marcelo Trylesinski <[email protected]>, Sydney Runkle <[email protected]>, David Hewitt <[email protected]>, Alex Hall <[email protected]>, Victorien Plot <[email protected]> | https://github.com/pydantic/pydantic |
+| pydantic_core | 2.46.4 | MIT | Samuel Colvin <[email protected]>, Adrian Garcia Badaracco <[email protected]>, David Montague <[email protected]>, David Hewitt <[email protected]>, Sydney Runkle <[email protected]>, Victorien Plot <[email protected]> | https://github.com/pydantic |
+| pylint | 4.0.5 | GPL-2.0-or-later | Python Code Quality Authority <[email protected]> | https://github.com/pylint-dev/pylint |
+| pylint-quotes | 0.2.3 | MIT License | Erick Daniszewski | https://github.com/edaniszewski/pylint-quotes |
+| pyproject-api | 1.10.0 | MIT | Bernát Gábor <[email protected]> | https://pyproject-api.readthedocs.io |
+| pysolr | 3.11.0 | BSD License | Daniel Lindsley | https://github.com/django-haystack/pysolr/ |
+| pytest | 9.0.3 | MIT | Holger Krekel, Bruno Oliveira, Ronny Pfannschmidt, Floris Bruynooghe, Brianna Laugher, Freya Bruhin, Others (See AUTHORS) | https://docs.pytest.org/en/latest/ |
+| pytest-asyncio | 1.4.0 | Apache-2.0 | Tin Tvrtković <[email protected]> | https://github.com/pytest-dev/pytest-asyncio |
+| pytest-benchmark | 5.0.1 | BSD License | Ionel Cristian Mărieș | https://github.com/ionelmc/pytest-benchmark |
+| python-dateutil | 2.9.0.post0 | Apache Software License; BSD License | Gustavo Niemeyer | https://github.com/dateutil/dateutil |
+| python-discovery | 1.3.1 | MIT License | UNKNOWN | https://github.com/tox-dev/python-discovery |
+| readme_renderer | 44.0 | Apache Software License | The Python Packaging Authority <[email protected]> | UNKNOWN |
+| referencing | 0.37.0 | MIT | Julian Berman <[email protected]> | https://github.com/python-jsonschema/referencing |
+| requests | 2.34.2 | Apache Software License | Kenneth Reitz <[email protected]> | https://github.com/psf/requests |
+| requests-toolbelt | 1.0.0 | Apache Software License | Ian Cordasco, Cory Benfield | https://toolbelt.readthedocs.io/ |
+| rfc3986 | 2.0.0 | Apache Software License | Ian Stapleton Cordasco | http://rfc3986.readthedocs.io |
+| rich | 15.0.0 | MIT License | Will McGugan | https://github.com/Textualize/rich |
+| rpds-py | 0.30.0 | MIT | Julian Berman <[email protected]> | https://github.com/crate-py/rpds |
+| ruff | 0.15.15 | MIT | "Astral Software Inc." <[email protected]> | https://docs.astral.sh/ruff |
+| s3transfer | 0.17.0 | Apache Software License | Amazon Web Services | https://github.com/boto/s3transfer |
+| six | 1.17.0 | MIT License | Benjamin Peterson | https://github.com/benjaminp/six |
+| solr-orbit | 0.1.0 | Apache Software License | UNKNOWN | https://github.com/apache/solr-orbit |
+| solr-orbit | 0.1.0 | Apache Software License | UNKNOWN | https://github.com/apache/solr-orbit |

Review Comment:
   `solr-orbit` is listed twice as an identical row. One of these duplicate entries should be removed.

##########
.rat-excludes:
##########
@@ -0,0 +1,102 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# ---------------------------------------------------------------------------
+# Apache RAT exclusion list
+# Patterns are Ant-style globs; one per line.
+# Files listed here legitimately carry no Apache license header.
+# ---------------------------------------------------------------------------
+
+# Python bytecode and tool caches
+**/__pycache__/**
+**/*.pyc
+**/*.pyo
+.pytest_cache/**
+.ruff_cache/**
+*.egg-info/**
+
+# Empty __init__.py namespace markers (no content to annotate)
+**/__init__.py

Review Comment:
   The comment "Empty __init__.py namespace markers (no content to annotate)" is inaccurate: per the codebase context, `solrorbit/__init__.py` contains imports, constants, banner text and functions (and does carry a full header). Excluding all `**/__init__.py` from RAT therefore hides genuine source files from the audit and risks future non-empty `__init__.py` files being committed without headers. Consider either narrowing the pattern to truly empty markers, or removing it and adding headers to the ones that have content.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
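
For the two comments on `scripts/download-rat.py` above, a minimal Python sketch of one possible fix follows. It assumes the checksum file carries the digest as a single unbroken run of 128 hex characters (bare, GNU coreutils, or BSD layout), and that `archive.apache.org/dist/creadur` uses the same directory layout the script already relies on. The names `extract_sha512`, `candidate_urls`, `PRIMARY_BASE`, and `ARCHIVE_BASE` are hypothetical and not part of the PR.

```python
import re

# PRIMARY_BASE matches BASE_URL in the reviewed script; ARCHIVE_BASE is the
# archive host named in the review comment (directory layout is an assumption).
PRIMARY_BASE = "https://downloads.apache.org/creadur"
ARCHIVE_BASE = "https://archive.apache.org/dist/creadur"

# A SHA-512 digest is 128 hexadecimal characters.
_SHA512_HEX = re.compile(r"\b[0-9a-fA-F]{128}\b")


def extract_sha512(checksum_text: str) -> str:
    """Return the lowercase SHA-512 digest found in a checksum file's text.

    Works for a bare digest, "<hash> <filename>", and
    "SHA512 (<filename>) = <hash>", because it looks for the digest itself
    instead of relying on its position in the line.
    """
    match = _SHA512_HEX.search(checksum_text)
    if match is None:
        raise ValueError("no SHA-512 digest found in checksum text")
    return match.group(0).lower()


def candidate_urls(version: str) -> list[str]:
    """Return tarball URLs to try in order: primary mirror, then archive."""
    name = f"apache-rat-{version}-bin.tar.gz"
    return [
        f"{base}/apache-rat-{version}/{name}"
        for base in (PRIMARY_BASE, ARCHIVE_BASE)
    ]
```

With helpers like these, `main()` could compare `hashlib.sha512(tarball).hexdigest()` against `extract_sha512(download(sha_url).decode())` and try each URL from `candidate_urls(version)` in turn, moving on when the primary host returns 404. Checksum files that split the digest into space-separated groups would need extra normalization first.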

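
For the `.rat-excludes` comment above, one way to see which `__init__.py` files are genuinely empty markers is a small audit script. The sketch below is an illustration under stated assumptions, not part of the PR: the function name `init_files_missing_header` is hypothetical, and the marker string is taken from the license headers quoted in this review.

```python
from pathlib import Path

# Assumed marker: the opening words of the ASF header quoted in this review.
HEADER_MARKER = "Licensed to the Apache Software Foundation (ASF)"


def init_files_missing_header(root: Path) -> list[Path]:
    """Return non-empty __init__.py files under root that lack the marker."""
    missing = []
    for path in sorted(root.rglob("__init__.py")):
        text = path.read_text(encoding="utf-8")
        if not text.strip():
            # Truly empty namespace marker: nothing to annotate.
            continue
        if HEADER_MARKER not in text:
            missing.append(path)
    return missing
```

Run against the repository root, this would list the files that need a header before `**/__init__.py` could be narrowed or dropped from the exclusion list.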