petercheon commented on PR #66065:
URL: https://github.com/apache/airflow/pull/66065#issuecomment-4426181446
> This fixes the ES9 to ES8 regression. Before I can approve please validate
that the elasticsearch es_compat_with option only accepts a major version
string for example 7 8 or 9 and raise AirflowConfigException on invalid values.
Do not construct the Elasticsearch client twice. Create a single instance and
pass it to apply_compat_with. Make the transport wrapper accept args and kwargs
and decorate it with functools.wraps so it is tolerant of different
elastic_transport perform_request signatures. Add a short docs entry for
es_compat_with with an airflow.cfg example and note that the transport level
rewrite overrides per API header negotiation. Confirm whether a newsfragment is
required. Address these and I will re check and approve.
Thanks for the review @Subham-KRLX — addressed below.
1. Major-version-only validation with AirflowConfigException.
Already enforced. _compat.py validates the option against ^\d+$ after
.strip()-ing whitespace and raises AirflowConfigException on any non-numeric
value at client construction time, so a misconfiguration like "v8" or "8.0"
fails fast in the worker startup log instead of producing a per-request 400
storm. Covered by
test_apply_compat_with_rejects_non_numeric_major (parametrized over "v8",
"8.0", "abc", "8;9") and test_apply_compat_with_strips_whitespace_in_config.
2. Single Elasticsearch(...) construction.
I might be misreading your concern here — each call site already constructs
the client exactly once and hands that instance to apply_compat_with, which
mutates client.transport.perform_request in place and returns the same client.
The line you quoted (self.es = apply_compat_with(Elasticsearch(self.url,
basic_auth=(user,
password), **kwargs))) is the current code, not a suggested change. If you
spotted a site where two clients are actually being built, could you point me
at the file:line? I want to make sure I'm not missing one.
3. *args, **kwargs + functools.wraps on the transport wrapper. ✅
Done in the latest push. The inner perform_request is now
@functools.wraps(original_perform_request) decorated and accepts *args,
**kwargs, with headers pulled via kwargs.get("headers") and put back in kwargs
after the rewrite. This way the wrapper is tolerant of future
elastic_transport.Transport.perform_request signature
changes (new keyword-only params, reordered positionals) and inspect /
debuggers / __wrapped__-aware tooling see the original signature. The existing
wire-level tests continue to pass because elasticsearch-py always passes
headers as a kwarg.
4. Docs entry with airflow.cfg example and transport-level override note. ✅
Extended providers/elasticsearch/docs/logging/index.rst (the existing
Pinning the compatible-with content-negotiation level subsection) with:
- explicit valid-value rules (positive integer major; non-numeric →
AirflowConfigException)
- a .. note:: calling out that the fix is installed at the transport layer
and therefore overrides elasticsearch-py's per-API-method Accept / Content-Type
negotiation — and that constructor headers= on Elasticsearch.__init__ and the
elasticsearch_configs section do not work for this purpose, since
elasticsearch-py re-applies
its own compatible-with=<client_major> right before the request goes out.
The airflow.cfg example was already there ([elasticsearch] es_compat_with =
8).
5. Newsfragment.
Not required for providers/. Per the repo's CLAUDE.md and
dev/README_RELEASE_PROVIDERS.md, provider release managers regenerate the
changelog directly from git log, so per-PR newsfragments under newsfragments/
are not consumed. The user-visible note already lives at the top of
providers/elasticsearch/docs/changelog.rst (above
the 6.5.3 section), which is the documented path for provider release notes.
Ready for another look when you have a moment.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]