Re: System Test Dashboards - Phase Two??

Ferruzzi, Dennis Mon, 12 Aug 2024 11:33:33 -0700

Sorry for the delay, my turn on vacation!

Freddy - I agree, let's go with the JSON unless someone objects.  I think your 
example there looks good.  I found the same when I was looking into the XML; it 
appears to be a very loose standard with varying degrees of support.  I don't 
know that we gain anything by attempting to follow it, but I used it as a great 
general guide for making our own format.


The key to me is exactly what Freddy mentioned, we intend to define the 
minimum.  I'd like to ensure that a provider can add whatever they want in 
there, as long as that "minimum contract" is met.  For example, if a provider 
wants to report the last 10 results for each test, or a bool flag noting if the 
state changed since the previous run, they can add that in the test's 
"properties" section and it should not break anything for the user.  And who 
knows, maybe one provider comes up with some neat datapoint that others will 
adopt and we can all grow.

I guess that also implies that there needs to be both a page documenting the 
universal/minimum format plus each provider who chooses to add their own info 
would need to document that somewhere.  I'm not too familiar with the inner 
workings of the docs generation system.  Would it be possible to make this a 
"top-level" docs page that gets listed on https://airflow.apache.org/docs/ 
instead of a subsection in the Ecosystem page?  I was initially thinking about 
a main doc page which has the minimum structure documented, and "for each 
provider package, if the package contains a file called dashboard_schema.rst 
(or some standardized name we agree on), then generate that page and link to 
it", but Astronomer hosts the LLM dashboard and does not have a provider 
package, so that plan wouldn't work.  Any suggestions on the best way to build 
out the doc structure for this?

Once we sort out where the docs can live, I can re-post my proposed format from 
earlier in this thread into a new LC email and we can go from there.  Freddy 
and I ended up co-presenting on a System Test Dashboards talk (along with Rahul 
from Astronomer) at the upcoming Airflow Summit; we intend to mention this as a 
"what's next" so the timing should work out nice for that if we get lazy 
consensus before the Summit.

  - ferruzzi


________________________________
From: Jarek Potiuk <ja...@potiuk.com>
Sent: Monday, August 5, 2024 2:11 AM
To: dev@airflow.apache.org
Subject: RE: [EXT] System Test Dashboards - Phase Two??

CAUTION: This email originated from outside of the organization. Do not click 
links or open attachments unless you can confirm the sender and know the 
content is safe.



AVERTISSEMENT: Ce courrier électronique provient d’un expéditeur externe. Ne 
cliquez sur aucun lien et n’ouvrez aucune pièce jointe si vous ne pouvez pas 
confirmer l’identité de l’expéditeur et si vous n’êtes pas certain que le 
contenu ne présente aucun risque.



+1. Let's do it :). I think if others will agree, a LAZY CONSENSUS thread
will do + documenting the format as PR + getting each team to implement it
is a good way to progress.

Once we have it exposed by at least two dashboards, we can add a "canary"
test that will pull them and fail in case of error (soft - non-blocking
fail).

On Mon, Aug 5, 2024 at 11:07 AM Freddy Demiane <fdemi...@google.com.invalid>
wrote:

> Hello all,
>
> I took a long time to reply to this thread, I was on a long vacation. +1 to
> Dennis' suggestion, I believe the provided JSON format contains the minimal
> information required to render a dashboard. I experimented with generating
> a JSON output from a development CI (from Google's side), I added a sample
> output of how it could look like.
> As of the Junit XML format, while I was searching online, I found multiple
> different standards, and for tools to visualize results, each tool has
> their own standard (even though the formats look similar). Again, I never
> used the JUnit XML format nor tooling for visualization, so I might be
> wrong in this regard.
> I suggest sticking with a defined JSON schema, and later if we want to
> generate some aggregated report, and decide to go with a Junit XML
> visualizer, we can transform the data.
> Nevertheless, I played a bit with generating a Junit XML result, which I
> also attached in this email.
> Let me know what you think.
>
> Best,
> Freddy
>
> JSON Sample Output:
>
> {
>   "provider": "Google",
>   "tests": 159,
>   "errors": 0,
>   "failures": 21,
>   "skipped": 68,
>   "timestamp": "2024-07-30 12:00:00.460429+00:00",
>   "duration": 8263.065936,
>   "properties": {
>     "airflow_commit_id": "41508f23ad",
>     "providers_commit_id": "41508f23ad"
>   },
>   "testcases": [
>     {
>       "name": "example_gcp_translate_speech",
>       "file": "
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/translate_speech/example_translate_speech.py
> ",
>       "duration": 0,
>       "result": {
>         "state": "SKIPPED",
>         "message": "Skipped",
>         "type": ""
>       },
>       "properties": {
>         "logs_folder_link": "",
>         "failed_tasks": []
>       },
>       "skipped": true
>     },
>     {
>       "name": "example_gcp_translate",
>       "file": "
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/translate/example_translate.py
> ",
>       "duration": "0:00:02.443829",
>       "result": {
>         "state": "SUCCESS",
>         "message": "",
>         "type": ""
>       },
>       "properties": {
>         "logs_folder_link":
> "
> https://console.cloud.google.com/storage/browser/dashboard-system-tests-public-logs-dev/2024-07-30
> 12:00:00.460429+00:00-41508f23ad-41508f23ad/dag_id=example_gcp_translate
> <https://console.cloud.google.com/storage/browser/dashboard-system-tests-public-logs-dev/2024-07-3012:00:00.460429+00:00-41508f23ad-41508f23ad/dag_id=example_gcp_translate>
> ",
>         "failed_tasks": []
>       },
>       "skipped": false
>     },
>     {
>       "name": "vertex_ai_pipeline_job_operations",
>       "file": "
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_pipeline_job.py
> ",
>       "duration": "0:24:35.451846",
>       "result": {
>         "state": "FAILURE",
>         "message": "Check the logs at:
>
> https://console.cloud.google.com/storage/browser/dashboard-system-tests-public-logs-dev/2024-07-30
>
> 12:00:00.460429+00:00-41508f23ad-41508f23ad/dag_id=vertex_ai_pipeline_job_operations
> <https://console.cloud.google.com/storage/browser/dashboard-system-tests-public-logs-dev/2024-07-3012:00:00.460429+00:00-41508f23ad-41508f23ad/dag_id=vertex_ai_pipeline_job_operations>
> ",
>         "type": "Test Failure"
>       },
>       "properties": {
>         "logs_folder_link":
> "
> https://console.cloud.google.com/storage/browser/dashboard-system-tests-public-logs-dev/2024-07-30
>
> 12:00:00.460429+00:00-41508f23ad-41508f23ad/dag_id=vertex_ai_pipeline_job_operations
> <https://console.cloud.google.com/storage/browser/dashboard-system-tests-public-logs-dev/2024-07-3012:00:00.460429+00:00-41508f23ad-41508f23ad/dag_id=vertex_ai_pipeline_job_operations>
> ",
>         "failed_tasks": [
>           "watcher",
>           "run_pipeline_job"
>         ]
>       },
>       "skipped": false
>     },
>
> XML Sample Output:
>
> <testsuite name="Google" tests="159" failures="21" time="8263.065936"
> timestamp="2024-07-30 12:00:00.460429+00:00">
> <testcase name="example_gcp_translate_speech"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/translate_speech/example_translate_speech.py
> "
> time="0">
> <skipped>Ignored</skipped>
> </testcase>
> <testcase name="example_gcp_translate"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/translate/example_translate.py
> "
> time="0:00:02.443829"/>
> <testcase name="vertex_ai_pipeline_job_operations"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_pipeline_job.py
> "
> time="0:24:35.451846">
> <failure message="Check the logs at:
>
> https://console.cloud.google.com/storage/browser/dashboard-system-tests-public-logs-dev/2024-07-30
>
> 12:00:00.460429+00:00-41508f23ad-41508f23ad/dag_id=vertex_ai_pipeline_job_operations
> <https://console.cloud.google.com/storage/browser/dashboard-system-tests-public-logs-dev/2024-07-3012:00:00.460429+00:00-41508f23ad-41508f23ad/dag_id=vertex_ai_pipeline_job_operations>
> "
> type="ERROR"/>
> </testcase>
> <testcase name="vertex_ai_custom_job_operations_list_custom_job"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_list_custom_jobs.py
> "
> time="0:00:19.742450"/>
> <testcase name="vertex_ai_auto_ml_operations_video_training_job"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_auto_ml_video_training.py
> "
> time="-1">
> <failure message="" type="ERROR"/>
> </testcase>
> <testcase name="vertex_ai_batch_prediction_operations"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_batch_prediction_job.py
> "
> time="2:00:16.013429">
> <failure message="Check the logs at:
>
> https://console.cloud.google.com/storage/browser/dashboard-system-tests-public-logs-dev/2024-07-30
>
> 12:00:00.460429+00:00-41508f23ad-41508f23ad/dag_id=vertex_ai_batch_prediction_operations
> <https://console.cloud.google.com/storage/browser/dashboard-system-tests-public-logs-dev/2024-07-3012:00:00.460429+00:00-41508f23ad-41508f23ad/dag_id=vertex_ai_batch_prediction_operations>
> "
> type="ERROR"/>
> </testcase>
> <testcase name="vertex_ai_generative_model_dag"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_generative_model.py
> "
> time="0:00:18.350772"/>
> <testcase name="vertex_ai_custom_job_operations_custom"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_custom_job.py
> "
> time="0:21:58.908473"/>
> <testcase name="vertex_ai_auto_ml_operations_list_training_job"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_auto_ml_list_training.py
> "
> time="0:00:24.284901"/>
> <testcase name="vertex_ai_auto_ml_operations_forecasting_training_job"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_auto_ml_forecasting_training.py
> "
> time="1:42:17.086311"/>
> <testcase name="vertex_ai_auto_ml_operations_text_training_job"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_auto_ml_text_training.py
> "
> time="-1">
> <failure message="" type="ERROR"/>
> </testcase>
> <testcase name="vertex_ai_hyperparameter_tuning_job_operations"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_hyperparameter_tuning_job.py
> "
> time="0:31:34.276985">
> <failure message="Check the logs at:
>
> https://console.cloud.google.com/storage/browser/dashboard-system-tests-public-logs-dev/2024-07-30
>
> 12:00:00.460429+00:00-41508f23ad-41508f23ad/dag_id=vertex_ai_hyperparameter_tuning_job_operations
> <https://console.cloud.google.com/storage/browser/dashboard-system-tests-public-logs-dev/2024-07-3012:00:00.460429+00:00-41508f23ad-41508f23ad/dag_id=vertex_ai_hyperparameter_tuning_job_operations>
> "
> type="ERROR"/>
> </testcase>
> <testcase name="vertex_ai_model_service_operations"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_model_service.py
> "
> time="0:19:51.716990"/>
> <testcase name="vertex_ai_auto_ml_operations_tabular_training_job"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_auto_ml_tabular_training.py
> "
> time="1:57:02.637592"/>
> <testcase name="vertex_ai_endpoint_service_operations"
> file="
> https://github.com/apache/airflow/blob/main/tests/system/providers/google/cloud/vertex_ai/example_vertex_ai_endpoint.py
> "
> time="-1">
> <failure message="" type="ERROR"/>
> </testcase>
>
>
> On Wed, Jun 26, 2024 at 12:47 AM Ferruzzi, Dennis
> <ferru...@amazon.com.invalid> wrote:
>
> > I was unaware of the Teradata dashboard!  Outstanding to see.
> >
> > I can take point on design discussion and documentation for this, but in
> > the end it'll be up to each provider to update their own infra, so there
> is
> > only so much I can do.
> >
> > I didn't really think this would catch on so enthusiastically.   One
> other
> > thing I was thinking about but dropped from the initial idea would be
> > adding an optional field to the provider.yaml with a dashboard url.
> > Currently it is up to the provider to manually add a link to the list on
> > the ecosystem page.  If we make it part of the yaml file, new providers
> > might see it when looking for a template and jump onboard.  It would also
> > make the dashboards more programmatically-discoverable, maybe even
> > something that can be used in generating a docs page and skip the manual
> > step of adding it to the ecosystem page if someone wants to do that at
> some
> > point.  Given the way this discussion caught on, maybe it should be two
> > fields: 'dashboard-html' and 'dashboard-json' (or -xml or whatever we
> > decide is the vended format).
> >
> >
> > As at least half of the existing dashboards already export some form of
> > json, I'd propose we stick to that unless someone has a compelling reason
> > to convert to XML?  I looked into junit-xml and I like the way they break
> > down their schema, so maybe json-ify that with some tweaks?
> >
> > Proposed formatting:
> >
> > {
> >   "testsuite": {
> >     "provider": string,    [REQUIRED]
> >     "tests": int,          [REQUIRED]  // Could drop this as it's just
> > len(testcases) but by that same logic it's easy enough to add it...
> >     "errors": int,         [REQUIRED]
> >     "failures": int,       [REQUIRED]
> >     "skipped": int,        [REQUIRED]
> >     "timestamp": string,   [REQUIRED]  // Standardize on UTC?
> >     "duration": float,     [OPTIONAL]  // Seconds?
> >     "properties": {},      [OPTIONAL]  // Let's make all "properties"
> > blocks free-form and optional; a provider may add whatever extra values
> in
> > this block that they want.
> >     "testcases": [
> >       {
> >         "name": string,        [OPTIONAL]
> >         "file": string,        [REQUIRED]
> >         "duration": float,     [OPTIONAL] // Seconds?
> >         "result": {
> >             "state": "SUCCESS" | "SKIPPED" | "FAILURE",   [REQUIRED]
> >             "message": string, [OPTIONAL]
> >             "type": string,    [OPTIONAL]  // Exception type in case of a
> > failure.
> >         },
> >         "properties": {},      [OPTIONAL]  // Let's make all "properties"
> > blocks free-form and optional; a provider may add whatever extra values
> in
> > this block that they want.
> >       },
> >     ]
> >   }
> > }
> >
> > Sample:
> >
> >
> > {
> >   "testsuite": {
> >     "provider": "AWS",
> >     "tests": 3,
> >     "errors": 0,
> >     "failures": 1,
> >     "skipped": 1,
> >     "timestamp": "2020-01-26T13:45:02",
> >     "duration": 139.89,
> >     "properties": {
> >       "commit": "ef7bebf",
> >       "executor": "celery",
> >     },
> >     "testcases": [
> >       // Example successful test
> >       {
> >         "name": "example_appflow",
> >         "file": "tests/system/providers/amazon/aws/example_appflow.py",
> >         "duration": 45.87,
> >         "result": {
> >             "state": "SUCCESS"
> >         },
> >         "properties": {
> >           "source": "
> >
> https://github.com/apache/airflow/blob/main/tests/system/providers/amazon/aws/example_appflow.py
> > ",
> >           "operators": [
> >             "AppflowRunOperator",
> >             "S3CreateBucketOperator",
> >             "S3CreateObjectOperator",
> >             "S3DeleteBucketOperator"
> >           ]
> >         }
> >       },
> >       // Example of a test case that was skipped.
> >       {
> >         "name": "example_athena",
> >         "file": "tests/system/providers/amazon/aws/example_athena.py",
> >         "duration": 0.01,
> >         "result": {
> >             "state": "SKIPPED",
> >             "message": "Message explaining why."
> >         },
> >         "properties": {
> >           "source": "
> >
> https://github.com/apache/airflow/blob/main/tests/system/providers/amazon/aws/example_athena.py
> > ",
> >           "operators": [
> >             "AthenaOperator",
> >             "S3CreateBucketOperator",
> >             "S3CreateObjectOperator",
> >             "S3DeleteBucketOperator"
> >           ]
> >         }
> >       },
> >       // Example of a test case that failed.
> >       {
> >         "name": "example_batch",
> >         "file": "tests/system/providers/amazon/aws/example_batch.py",
> >         "duration": 94.01,
> >         "result": {
> >             "state": "FAILURE",
> >             "message": "Some failure message, maybe a link to logs or a
> > stack trace?",
> >             "type": "AssertionError",
> >         },
> >         "properties": {
> >           "source": "
> >
> https://github.com/apache/airflow/blob/main/tests/system/providers/amazon/aws/example_batch.py
> > ",
> >           "operators": [
> >             "BatchCreateComputeEnvironmentOperator",
> >             "BatchComputeEnvironmentSensor",
> >             "BatchJobQueueSensor",
> >             "BatchOperator",
> >             "BatchSensor",
> >           ]
> >         }
> >       },
> >     ]
> >   }
> > }
> >
> >
> >  - ferruzzi
> >
> >
> > ________________________________
> > From: Eugen Kosteev <eu...@kosteev.com>
> > Sent: Tuesday, June 25, 2024 6:04 AM
> > To: dev@airflow.apache.org
> > Subject: RE: [EXT] System Test Dashboards - Phase Two??
> >
> > CAUTION: This email originated from outside of the organization. Do not
> > click links or open attachments unless you can confirm the sender and
> know
> > the content is safe.
> >
> >
> >
> > AVERTISSEMENT: Ce courrier électronique provient d’un expéditeur externe.
> > Ne cliquez sur aucun lien et n’ouvrez aucune pièce jointe si vous ne
> pouvez
> > pas confirmer l’identité de l’expéditeur et si vous n’êtes pas certain
> que
> > le contenu ne présente aucun risque.
> >
> >
> >
> > +1 to the idea of standardizing the format of the system test results
> > output
> >
> > On Tue, Jun 25, 2024 at 10:40 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> >
> > > So we have almost everyone on board!
> > >
> > > Now we also need the Teradata team to add whatever JSON/XML we come up
> > with
> > > :). In case people have not noticed, among our dashboards [1] we also
> > have
> > > Teradata dashboard [2]
> > >
> > > [1]
> > >
> > >
> >
> https://airflow.apache.org/ecosystem/#airflow-provider-system-test-dashboards
> > > [2] https://teradata.github.io/airflow/
> > >
> > > Anyone would like to take a lead on it? I am personally fine with
> either
> > > approach - Junit xml (or json version of it), or custom json is fine
> for
> > > me.
> > >
> > > J.
> > >
> > >
> > > On Tue, Jun 25, 2024 at 10:24 AM Pankaj Koti
> > > <pankaj.k...@astronomer.io.invalid> wrote:
> > >
> > > > For context, the Astronomer LLM providers dashboard operates as
> > follows:
> > > >
> > > > 1. Fetch the latest source code for providers and system
> tests/example
> > > DAGs
> > > > from the Airflow repository, deploy them to an Airflow instance, and
> > > > execute the
> > > > DAGs.
> > > > 2. Use the Airflow API to retrieve the DAG run statuses and produce a
> > > JSON
> > > > output of these statuses.
> > > > 3. The dashboard, hosted on GitHub Pages, consumes the JSON data
> > > > generated in step 2.
> > > >
> > > > We are willing to adopt and adhere to a JSON or XML specification
> and a
> > > > model HTML view if one is established.
> > > >
> > > > Best regards,
> > > >
> > > > *Pankaj Koti*
> > > > Senior Software Engineer (Airflow OSS Engineering team)
> > > > Location: Pune, Maharashtra, India
> > > > Timezone: Indian Standard Time (IST)
> > > >
> > > >
> > > > On Mon, Jun 24, 2024 at 11:40 PM Ferruzzi, Dennis
> > > > <ferru...@amazon.com.invalid> wrote:
> > > >
> > > > > >  The information in our database is similar to the structure of
> the
> > > AWS
> > > > > providers json file
> > > > > >
> > https://aws-mwaa.github.io/open-source/system-tests/dashboard.json
> > > +
> > > > a
> > > > > field for logs.
> > > > > >  We also have an extra field that specifies the commit-id against
> > > which
> > > > > the CI was run,
> > > > > >  which I believe is helpful in case users want to know whether
> > their
> > > PR
> > > > > was merged before
> > > > > >  or after a failure.
> > > > >
> > > > > The commit ID is a handy addition for sure, I may look into adding
> > that
> > > > to
> > > > > the AWS dashboard.  I haven't had a chance to look into junit-xml
> > yet,
> > > > but
> > > > > I think what we could do is agree on a minimum structure and allow
> > for
> > > > > extras.   For example, logs are great, but if Google provides them
> > and
> > > > AWS
> > > > > doesn't, that shouldn't break anything for the user trying to fetch
> > > logs.
> > > > > But the test name, timestamp, and success/fail state are definitely
> > > among
> > > > > the required minimum fields.
> > > > >
> > > > > > we could consider enforcing the presence of *some* dashboard that
> > > shows
> > > > > results of regular system tests executions for any new provider.
> > > > >
> > > > > The issue there is that smaller providers come and go, and are
> often
> > > > added
> > > > > by community members, not even necessarily with the provider's
> > > knowledge.
> > > > >  We can't force them to provide any support.  If Random Contributor
> > > adds
> > > > > support for a new provider, neither the contributor nor the
> provider
> > > can
> > > > be
> > > > > required to provide hosting for a dashboard and infrastructure to
> run
> > > the
> > > > > tests.  So (for the foreseeable future) the dashboards need to be
> an
> > > > opt-in
> > > > > project by/for the providers.   Maybe some day the project might be
> > > able
> > > > to
> > > > > provide hosting for the smaller dashboards or something, but I
> think
> > > the
> > > > > infrastructure to run the tests will always be optional and at the
> > > > expense
> > > > > (and effort) of some other interested party (almost certainly the
> > > > provider
> > > > > themselves, but who knows... ).
> > > > >
> > > > >
> > > > >  - ferruzzi
> > > > >
> > > > >
> > > > > ________________________________
> > > > > From: Michał Modras <michalmod...@google.com.INVALID>
> > > > > Sent: Monday, June 24, 2024 5:20 AM
> > > > > To: dev@airflow.apache.org
> > > > > Subject: RE: [EXT] System Test Dashboards - Phase Two??
> > > > >
> > > > > CAUTION: This email originated from outside of the organization. Do
> > not
> > > > > click links or open attachments unless you can confirm the sender
> and
> > > > know
> > > > > the content is safe.
> > > > >
> > > > >
> > > > >
> > > > > AVERTISSEMENT: Ce courrier électronique provient d’un expéditeur
> > > externe.
> > > > > Ne cliquez sur aucun lien et n’ouvrez aucune pièce jointe si vous
> ne
> > > > pouvez
> > > > > pas confirmer l’identité de l’expéditeur et si vous n’êtes pas
> > certain
> > > > que
> > > > > le contenu ne présente aucun risque.
> > > > >
> > > > >
> > > > >
> > > > > Hi,
> > > > >
> > > > > +1 to this idea. I think standardizing the format of the presented
> > test
> > > > run
> > > > > results makes sense. I also agree that we don't necessarily need to
> > > > enforce
> > > > > it in any hard way. However, given that we have dashboards of these
> > > three
> > > > > major providers, we could consider enforcing the presence of *some*
> > > > > dashboard
> > > > > that shows results of regular system tests executions for any new
> > > > provider.
> > > > > WDYT?
> > > > >
> > > > > Best,
> > > > > Michal
> > > > >
> > > > > On Sun, Jun 23, 2024 at 10:09 PM Freddy Demiane
> > > > > <fdemi...@google.com.invalid>
> > > > > wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > Thank you for the comments! Indeed, +1 to the idea, I believe
> this
> > > > would
> > > > > be
> > > > > > a good step to increase the quality of providers. From our
> (Google)
> > > > side,
> > > > > > the dashboard's CI outputs the results to a database, which are
> > then
> > > > used
> > > > > > to generate an HTML page. Yet, generating and publishing a JSON
> or
> > a
> > > > > JUnit
> > > > > > XML style file would be a simple task for us. The information in
> > our
> > > > > > database is similar to the structure of the AWS providers json
> file
> > > > > >
> https://aws-mwaa.github.io/open-source/system-tests/dashboard.json
> > +
> > > > > > a field for logs. We also have an extra field that specifies the
> > > > > commit-id
> > > > > > against which the CI was run, which I believe is helpful in case
> > > users
> > > > > want
> > > > > > to know whether their PR was merged before or after a failure.
> > > > > > If we want to go with the junit-xml style format (I checked this
> > > > > reference
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://www.ibm.com/docs/en/developer-for-zos/16.0?topic=formats-junit-xml-format
> > > > > > >),
> > > > > > one thing I could think of is to make each "Dashboard CI run"
> > > generate
> > > > an
> > > > > > xml file where each test is represented by a testcase, which as
> > Jarek
> > > > > > mentioned, could be used in some way in the canary builds.
> > > > > > Let me know what you think.
> > > > > >
> > > > > > Best,
> > > > > > Freddy
> > > > > >
> > > > > >
> > > > > > On Fri, Jun 21, 2024 at 11:12 AM Jarek Potiuk <ja...@potiuk.com>
> > > > wrote:
> > > > > >
> > > > > > > This is a fantastic idea! I love it !
> > > > > > >
> > > > > > > It also has some very far reaching possible spin-offs in the
> > > future -
> > > > > > > literally few days ago, when I discussed some of the future
> > > security
> > > > > > > related work that we might want to do, there was a concept of
> > > having
> > > > a
> > > > > > sort
> > > > > > > of CI of all CIs where we (and by we I mean wider Python
> > ecosystem)
> > > > > could
> > > > > > > gather a status of pre-release versions of dependencies before
> > they
> > > > hit
> > > > > > > release stage, and some kind of interchange between those CI
> > > systems
> > > > > that
> > > > > > > will be machine-parseable is pretty much prerequisite for that.
> > So
> > > we
> > > > > > could
> > > > > > > generally try it out and sort out some issues, see how it works
> > in
> > > > our
> > > > > > > small "airflow" world, but in the future we might be able to
> use
> > > > > similar
> > > > > > > mechanisms to get alerts for a number of our dependencies - and
> > > even
> > > > > > > further than that, we could make such approach much more
> > > wide-spread
> > > > (I
> > > > > > am
> > > > > > > discussing it with people from Python Software
> > Foundation/Packaging
> > > > > team
> > > > > > /
> > > > > > > Python security, so there is a chance this might actually
> > > materialize
> > > > > in
> > > > > > a
> > > > > > > long term). This would be the first step.
> > > > > > >
> > > > > > > I think the first step for it could be rather simple and we do
> > not
> > > > have
> > > > > > to
> > > > > > > invent our own standard - we could easily start with junit-xml
> > > style
> > > > > > output
> > > > > > > produced by each dashboard and available under some URL that we
> > > could
> > > > > > pull
> > > > > > > in our canary builds and have a step in our canary builds that
> > > could
> > > > > > > aggregate multiple xmlunit files coming from various
> dashboards,
> > > > > display
> > > > > > > them as the output, and fail the job in case some tests are
> > failing
> > > > > (with
> > > > > > > maybe some thresholds). Pytest and a number of tools natively
> > > > supports
> > > > > > the
> > > > > > > junit-xml format, it's pretty established as machine-readable
> > test
> > > > > > results,
> > > > > > > and I think it has all we need to start with
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.pytest.org/en/latest/how-to/usage.html#creating-junitxml-format-files
> > > > > > > .
> > > > > > > There is a lot of tooling around this format - including easy
> > ways
> > > > how
> > > > > we
> > > > > > > could possibly integrate it with Github Actions output (think
> > links
> > > > to
> > > > > > the
> > > > > > > tests that failed directly in GitHub UI), showing logs of
> failed
> > > > tests
> > > > > > etc.
> > > > > > > etc.
> > > > > > >
> > > > > > > If we can get the Astronomer, Amazon and Google team on board
> > with
> > > > it,
> > > > > we
> > > > > > > could likely implement a simple version quickly and iterate
> over
> > > it -
> > > > > > later
> > > > > > > we could think about possibly evolving that into a more
> > extensible
> > > > > > > approach.
> > > > > > >
> > > > > > > J.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Jun 20, 2024 at 11:27 PM Ferruzzi, Dennis
> > > > > > > <ferru...@amazon.com.invalid> wrote:
> > > > > > >
> > > > > > > > Congrats to the Google team for getting their dashboard live,
> > it
> > > > > looks
> > > > > > > > great!  I've been thinking of something for a while and
> thought
> > > I'd
> > > > > > > mention
> > > > > > > > it here.  I'm wearing a few different hats here so I'll try
> to
> > > > > clarify
> > > > > > > > context on my plural pronouns the best I can.
> > > > > > > >
> > > > > > > > Now that we [Providers] have a couple of big dashboards up,
> I'm
> > > > > curious
> > > > > > > if
> > > > > > > > we [Airflow dev community] might collaborate on a community
> > > > "optional
> > > > > > > > guideline" for a json (or yaml or whatever) format output on
> > the
> > > > > > > dashboards
> > > > > > > > for any providers interested in participating.  I'm not
> > > interested
> > > > in
> > > > > > (or
> > > > > > > > trying to) impose any kind of hard-line policy or standard
> > here,
> > > > but
> > > > > I
> > > > > > > > wonder if we [owners of the existing dashboards] might set
> some
> > > > > > > non-binding
> > > > > > > > precedent for future providers to join.  If others don't
> follow
> > > > suit,
> > > > > > > then
> > > > > > > > they wouldn't benefit from whatever uses folks come up with
> for
> > > the
> > > > > > data,
> > > > > > > > but I personally don't think we [Airflow] can or should try
> to
> > > > impose
> > > > > > > this
> > > > > > > > on providers.
> > > > > > > >
> > > > > > > > To my knowledge there are three provider-owned system test
> > > > dashboards
> > > > > > > > currently live, and I look forward to seeing more in time:
> > > > > > > >
> > > > > > > > Astronomer (found this LLM-specific one, not sure if there is
> > > > another
> > > > > > > > one): https://astronomer.github.io/llm-dags-dashboard/
> > > > > > > > AWS:
> > > > > >
> https://aws-mwaa.github.io/open-source/system-tests/dashboard.html
> > > > > > > > and
> > > > > >
> https://aws-mwaa.github.io/open-source/system-tests/dashboard.json
> > > > > > > > Google:
> > > > > > > >
> > > > >
> > https://storage.googleapis.com/providers-dashboard-html/dashboard.html
> > > > > > > >
> > > > > > > > Each was developed independently, and the path/name of the
> > Google
> > > > one
> > > > > > may
> > > > > > > > hint that there is already an alternative to the html view
> that
> > > I'm
> > > > > > just
> > > > > > > > not familiar with, so maybe we [the three providers] could
> > > > > collaborate
> > > > > > on
> > > > > > > > some precedent that others could follow?  We [AWS] already
> have
> > > > ours
> > > > > > > > exporting in json so discussion might start there and see
> where
> > > it
> > > > > > goes?
> > > > > > > > Either way... Even if we [Airflow] don't do anything with the
> > > > json, I
> > > > > > > bet a
> > > > > > > > user could find interesting things to build if we give them
> the
> > > > > tools.
> > > > > > > >  Maybe aggregating a dashboard which monitors (and alerts?)
> the
> > > > > status
> > > > > > of
> > > > > > > > the system tests which cover the operators their workflow
> > depends
> > > > on,
> > > > > > > > maybe?  Who knows what someone may come up with once they
> have
> > > the
> > > > > > tools
> > > > > > > to
> > > > > > > > mix and match the data from various providers.
> > > > > > > >
> > > > > > > > Is there any interest in the idea of a "standard json schema"
> > for
> > > > > these
> > > > > > > > and any future system test dashboards?
> > > > > > > >
> > > > > > > >
> > > > > > > >  - ferruzzi
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> > Eugene
> >
>

Re: System Test Dashboards - Phase Two??

Reply via email to