This is an automated email from the ASF dual-hosted git repository.

raulcd pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/main by this push:
     new a0fcb50ede GH-46669: [CI][Archery] Automate Zulip and email notifications for Extra CI (#47546)
a0fcb50ede is described below

commit a0fcb50ede92f426e3fe4c3336fe5d4eba5c08de
Author: Raúl Cumplido <[email protected]>
AuthorDate: Mon Sep 15 10:42:10 2025 +0200

    GH-46669: [CI][Archery] Automate Zulip and email notifications for Extra CI (#47546)
    
    ### Rationale for this change
    
    We plan to move more Crossbow CI jobs to the main Arrow repository. We have some reporting tools like the Zulip message notifications, the email report notifications and the nightlies report page: http://crossbow.voltrondata.com/ that we should replicate for the CI.
    
    ### What changes are included in this PR?
    
    Add GH action that will send the chat and email reports. Also it adds infrastructure to archery to work with Arrow CI.
    
    There's a lot of baggage around crossbow so I've decided to implement a new module for Arrow's `ci` where we can use the GitHub API and we don't require cloning and a lot of things that were required to work with crossbow. In the future we might be able to get rid of all the related `crossbow` CLI part.
    
    ### Are these changes tested?
    
    Via CI
    
    ### Are there any user-facing changes?
    
    No
    
    * GitHub Issue: #46669
    
    Authored-by: Raúl Cumplido <[email protected]>
    Signed-off-by: Raúl Cumplido <[email protected]>
---
 .github/workflows/cpp_extra.yml                    |  50 ++++++++
 dev/archery/archery/ci/cli.py                      | 126 +++++++++++++++++++++
 dev/archery/archery/ci/core.py                     |  98 ++++++++++++++++
 dev/archery/archery/cli.py                         |   2 +
 dev/archery/archery/crossbow/reports.py            |  20 +++-
 .../templates/chat_nightly_workflow_report.txt.j2  |  38 +++++++
 .../archery/templates/email_workflow_report.txt.j2 |  45 ++++++++
 7 files changed, 373 insertions(+), 6 deletions(-)

diff --git a/.github/workflows/cpp_extra.yml b/.github/workflows/cpp_extra.yml
index 52560ed692..001db5e78f 100644
--- a/.github/workflows/cpp_extra.yml
+++ b/.github/workflows/cpp_extra.yml
@@ -278,3 +278,53 @@ jobs:
           cmake --build cpp/examples/minimal_build.build
           cd cpp/examples/minimal_build
           ../minimal_build.build/arrow-example
+  
+  report-extra-cpp:
+    runs-on: ubuntu-latest
+    needs:
+      - docker
+      - jni-macos
+    # We don't have the job id as part of the context neither the job name.
+    # The GitHub API exposes numeric id or job name but not the github.job (report-extra-cpp).
+    # We match github.job to the name so we can pass it via context in order to be ignored on the report.
+    # The job is still running.
+    name: ${{ github.job }}
+    if: github.event_name == 'schedule' && always()
+    steps:
+      - name: Checkout Arrow
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
+        with:
+          fetch-depth: 0
+      - name: Setup Python
+        uses: actions/setup-python@e797f83bcb11b83ae66e0230d6156d7c80228e7c # v6.0.0
+        with:
+          python-version: 3
+      - name: Setup Archery
+        run: python3 -m pip install -e dev/archery[crossbow]
+      - name: Send email
+        env:
+          GH_TOKEN: ${{ github.token }}
+          SMTP_PASSWORD: ${{ secrets.ARROW_SMTP_PASSWORD }}
+        run: |
+          archery ci report-email \
+            --ignore ${{ github.job }} \
+            --recipient-email '[email protected]' \
+            --repository ${{ github.repository }} \
+            --send \
+            --sender-email '[email protected]' \
+            --sender-name Arrow \
+            --smtp-port 587 \
+            --smtp-server 'commit-email.info' \
+            --smtp-user arrow \
+            ${{ github.run_id }}
+      - name: Send chat message
+        if: always()
+        env:
+          GH_TOKEN: ${{ github.token }}
+          CHAT_WEBHOOK: ${{ secrets.ARROW_ZULIP_WEBHOOK }}
+        run: |
+          archery ci report-chat \
+            --ignore ${{ github.job }} \
+            --repository ${{ github.repository }} \
+            --send \
+            ${{ github.run_id }}
diff --git a/dev/archery/archery/ci/cli.py b/dev/archery/archery/ci/cli.py
new file mode 100644
index 0000000000..bf7b68d532
--- /dev/null
+++ b/dev/archery/archery/ci/cli.py
@@ -0,0 +1,126 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import click
+
+from .core import Workflow
+from ..crossbow.reports import ChatReport, EmailReport, ReportUtils
+
+
[email protected]()
[email protected]('--github-token', '-t', default=None,
+              envvar=['GH_TOKEN'],
+              help='OAuth token for GitHub authentication')
[email protected]('--output-file', metavar='<output>',
+              type=click.File('w', encoding='utf8'), default='-',
+              help='Capture output result into file.')
[email protected]_context
+def ci(ctx, github_token, output_file):
+    """
+    Tools for CI Extra jobs on GitHub actions.
+    """
+    ctx.ensure_object(dict)
+    ctx.obj['github_token'] = github_token
+    ctx.obj['output'] = output_file
+
+
[email protected]()
[email protected]('workflow_id', required=True)
[email protected]('--send/--dry-run', default=False,
+              help='Just display the report, don\'t send it.')
[email protected]('--repository', '-r', default='apache/arrow',
+              help='The repository where the workflow is located.')
[email protected]('--ignore', '-i', default="",
+              help='Job name to ignore from the list of jobs.')
[email protected]('--webhook', '-w', envvar=['CHAT_WEBHOOK'],
+              help='Zulip/Slack Webhook address to send the report to.')
[email protected]('--extra-message-success', '-s', default=None,
+              help='Extra message, will be appended if no failures.')
[email protected]('--extra-message-failure', '-f', default=None,
+              help='Extra message, will be appended if there are failures.')
[email protected]_obj
+def report_chat(obj, workflow_id, send, repository, ignore, webhook,
+                extra_message_success, extra_message_failure):
+    """
+    Send a chat report to a webhook showing success/failure
+    of jobs in a workflow run.
+    """
+    output = obj['output']
+
+    report_chat = ChatReport(
+        report=Workflow(workflow_id, repository,
+                        ignore_job=ignore, gh_token=obj['github_token']),
+        extra_message_success=extra_message_success,
+        extra_message_failure=extra_message_failure
+    )
+    if send:
+        ReportUtils.send_message(webhook, report_chat.render("workflow_report"))
+    else:
+        output.write(report_chat.render("workflow_report"))
+
+
[email protected]()
[email protected]('workflow_id', required=True)
[email protected]('--sender-name', '-n',
+              help='Name to use for report e-mail.')
[email protected]('--sender-email', '-e',
+              help='E-mail to use for report e-mail.')
[email protected]('--recipient-email', '-t',
+              help='Where to send the e-mail report')
[email protected]('--smtp-user', '-u',
+              help='E-mail address to use for SMTP login')
[email protected]('--smtp-password', '-P', envvar=['SMTP_PASSWORD'],
+              help='SMTP password to use for report e-mail.')
[email protected]('--smtp-server', '-s', default='smtp.gmail.com',
+              help='SMTP server to use for report e-mail.')
[email protected]('--smtp-port', '-p', default=465,
+              help='SMTP port to use for report e-mail.')
[email protected]('--send/--dry-run', default=False,
+              help='Just display the report, don\'t send it.')
[email protected]('--repository', '-r', default='apache/arrow',
+              help='The repository where the workflow is located.')
[email protected]('--ignore', '-i', default="",
+              help='Job name to ignore from the list of jobs.')
[email protected]_obj
+def report_email(obj, workflow_id, sender_name, sender_email, recipient_email,
+                 smtp_user, smtp_password, smtp_server, smtp_port, send,
+                 repository, ignore):
+    """
+    Send an email report showing success/failure of jobs in
+    a Workflow run
+    """
+    output = obj['output']
+
+    email_report = EmailReport(
+        report=Workflow(workflow_id, repository,
+                        ignore_job=ignore, gh_token=obj['github_token']),
+        sender_name=sender_name,
+        sender_email=sender_email,
+        recipient_email=recipient_email
+    )
+
+    if send:
+        ReportUtils.send_email(
+            smtp_user=smtp_user,
+            smtp_password=smtp_password,
+            smtp_server=smtp_server,
+            smtp_port=smtp_port,
+            recipient_email=recipient_email,
+            message=email_report.render("workflow_report")
+        )
+    else:
+        output.write(email_report.render("workflow_report"))
diff --git a/dev/archery/archery/ci/core.py b/dev/archery/archery/ci/core.py
new file mode 100644
index 0000000000..6cb4fa847e
--- /dev/null
+++ b/dev/archery/archery/ci/core.py
@@ -0,0 +1,98 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from functools import cached_property
+
+import requests
+
+
+class Workflow:
+    def __init__(self, workflow_id, repository, ignore_job, gh_token=None):
+        self.workflow_id = workflow_id
+        self.gh_token = gh_token
+        self.repository = repository
+        self.ignore_job = ignore_job
+        self.headers = {
+            'Accept': 'application/vnd.github.v3+json',
+        }
+        if self.gh_token:
+            self.headers["Authorization"] = f"Bearer {self.gh_token}"
+        workflow_resp = requests.get(
+            f'https://api.github.com/repos/{repository}/actions/runs/{workflow_id}',
+            headers=self.headers
+        )
+        if workflow_resp.status_code == 200:
+            self.workflow_data = workflow_resp.json()
+
+        else:
+            # TODO: We could send an error report instead
+            raise Exception(
+                f'Failed to fetch workflow data: {workflow_resp.status_code}')
+
+    @property
+    def conclusion(self):
+        return self.workflow_data.get('conclusion')
+
+    @property
+    def jobs_url(self):
+        return self.workflow_data.get('jobs_url')
+
+    @property
+    def name(self):
+        return self.workflow_data.get('name')
+
+    @property
+    def url(self):
+        return self.workflow_data.get('html_url')
+
+    @cached_property
+    def jobs(self):
+        jobs = []
+        jobs_resp = requests.get(self.jobs_url, headers=self.headers)
+        if jobs_resp.status_code == 200:
+            jobs_data = jobs_resp.json()
+            for job_data in jobs_data.get('jobs', []):
+                if job_data.get('name') != self.ignore_job:
+                    job = Job(job_data)
+                    jobs.append(job)
+        return jobs
+
+    def failed_jobs(self):
+        return [job for job in self.jobs if not job.is_successful()]
+
+    def successful_jobs(self):
+        return [job for job in self.jobs if job.is_successful()]
+
+
+class Job:
+    def __init__(self, job_data):
+        self.job_data = job_data
+
+    @property
+    def conclusion(self):
+        return self.job_data.get('conclusion')
+
+    @property
+    def name(self):
+        return self.job_data.get('name')
+
+    @property
+    def url(self):
+        return self.job_data.get('html_url')
+
+    def is_successful(self):
+        return self.conclusion == 'success'
diff --git a/dev/archery/archery/cli.py b/dev/archery/archery/cli.py
index 927dcb730a..48ad466977 100644
--- a/dev/archery/archery/cli.py
+++ b/dev/archery/archery/cli.py
@@ -874,6 +874,8 @@ add_optional_command("release", module=".release.cli", function="release",
                      parent=archery)
 add_optional_command("crossbow", module=".crossbow.cli", function="crossbow",
                      parent=archery)
+add_optional_command("ci", module=".ci.cli", function="ci",
+                     parent=archery)
 
 
 if __name__ == "__main__":
diff --git a/dev/archery/archery/crossbow/reports.py b/dev/archery/archery/crossbow/reports.py
index 9f73845b9a..32962410d6 100644
--- a/dev/archery/archery/crossbow/reports.py
+++ b/dev/archery/archery/crossbow/reports.py
@@ -217,6 +217,7 @@ class ConsoleReport(Report):
 class ChatReport(JinjaReport):
     templates = {
         'text': 'chat_nightly_report.txt.j2',
+        'workflow_report': 'chat_nightly_workflow_report.txt.j2',
     }
     fields = [
         'report',
@@ -246,13 +247,19 @@ class ReportUtils:
     @classmethod
     def send_email(cls, smtp_user, smtp_password, smtp_server, smtp_port,
                    recipient_email, message):
-        import smtplib
+        from smtplib import SMTP, SMTP_SSL
 
-        server = smtplib.SMTP_SSL(smtp_server, smtp_port)
-        server.ehlo()
-        server.login(smtp_user, smtp_password)
-        server.sendmail(smtp_user, recipient_email, message)
-        server.close()
+        if smtp_port == 465:
+            smtp_cls = SMTP_SSL
+        else:
+            smtp_cls = SMTP
+        with smtp_cls(smtp_server, smtp_port) as smtp:
+            if smtp_port == 465:
+                smtp.ehlo()
+            else:
+                smtp.starttls()
+            smtp.login(smtp_user, smtp_password)
+            smtp.sendmail(smtp_user, recipient_email, message)
 
     @classmethod
     def write_csv(cls, report, add_headers=True):
@@ -267,6 +274,7 @@ class EmailReport(JinjaReport):
     templates = {
         'nightly_report': 'email_nightly_report.txt.j2',
         'token_expiration': 'email_token_expiration.txt.j2',
+        'workflow_report': 'email_workflow_report.txt.j2',
     }
     fields = [
         'report',
diff --git a/dev/archery/archery/templates/chat_nightly_workflow_report.txt.j2 b/dev/archery/archery/templates/chat_nightly_workflow_report.txt.j2
new file mode 100644
index 0000000000..794b2f9060
--- /dev/null
+++ b/dev/archery/archery/templates/chat_nightly_workflow_report.txt.j2
@@ -0,0 +1,38 @@
+{#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#}
+*Extra CI GitHub report for <{{ report.url }}|{{ report.name }}>*
+{% if report.failed_jobs() %}
+:x: *{{ report.failed_jobs() | length }} failed jobs*
+{% for job in report.failed_jobs() -%}
+- <{{ job.url }}|{{ job.name }}>
+{% endfor %}
+{%- endif -%}
+{% if report.successful_jobs() %}
+
+:tada: *{{ report.successful_jobs() | length }} successful jobs*
+{%- endif -%}
+
+{% if extra_message_success and not report.failed_jobs() %}
+
+{{ extra_message_success }}
+{%- endif -%}
+{% if extra_message_failure and report.failed_jobs() %}
+
+{{ extra_message_failure }}
+{% endif %}
\ No newline at end of file
diff --git a/dev/archery/archery/templates/email_workflow_report.txt.j2 b/dev/archery/archery/templates/email_workflow_report.txt.j2
new file mode 100644
index 0000000000..b08b4d3ffc
--- /dev/null
+++ b/dev/archery/archery/templates/email_workflow_report.txt.j2
@@ -0,0 +1,45 @@
+{#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#}
+{%- if True -%}
+{%- endif -%}
+From: {{ sender_name }} <{{ sender_email }}>
+To: {{ recipient_email }}
+Subject: [NIGHTLY] Arrow Build Report for {{ report.name }}: {{ report.failed_jobs() | length }} failed
+
+Arrow Build Report for {{ report.name }}
+
+Workflow URL: {{ report.url }}
+
+{% if report.failed_jobs() %}
+Failed Jobs:
+
+{% for job in report.failed_jobs() -%}
+- {{ job.name }}
+  {{ job.url }}
+{% endfor %}
+{% endif %}
+
+{% if report.successful_jobs() %}
+Succeeded Jobs:
+
+{% for job in report.successful_jobs() -%}
+- {{ job.name }}
+  {{ job.url }}
+{% endfor %}
+{%- endif -%}
\ No newline at end of file
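
Note on the job filtering in `dev/archery/archery/ci/core.py`: the sketch below is an offline illustration, not part of the commit. It mirrors how `Workflow.jobs` drops the job named by `--ignore` and how `failed_jobs()` / `successful_jobs()` split the rest on `conclusion == 'success'`. The helper name `partition_jobs` and the sample payload are hypothetical; only the `jobs`, `name` and `conclusion` keys are taken from the diff.

```python
# Hypothetical helper (not in the commit) that mirrors the filtering in
# Workflow.jobs, Workflow.failed_jobs() and Workflow.successful_jobs().
def partition_jobs(jobs_payload, ignore_job=""):
    """Split a jobs payload into (failed, successful) lists of job names,
    skipping the job whose name equals ignore_job."""
    failed, successful = [], []
    for job in jobs_payload.get("jobs", []):
        name = job.get("name")
        if name == ignore_job:
            # The reporting job is still running when the report is built,
            # so it is left out of both lists.
            continue
        if job.get("conclusion") == "success":
            successful.append(name)
        else:
            # Any conclusion other than "success" (including None for an
            # unfinished job) is treated as not successful.
            failed.append(name)
    return failed, successful


# Made-up sample payload for illustration; not real CI output.
sample = {
    "jobs": [
        {"name": "docker", "conclusion": "success"},
        {"name": "jni-macos", "conclusion": "failure"},
        {"name": "report-extra-cpp", "conclusion": None},
    ]
}
failed, successful = partition_jobs(sample, ignore_job="report-extra-cpp")
print("failed:", failed)
print("successful:", successful)
```

Because the check is strict equality with `'success'`, a cancelled or skipped job would land in the failed list under this logic, which is why the workflow names the reporting job after `github.job` and passes it through `--ignore`.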
