Hi. In RHEL-8, we ship a wget version that suffers from bug fixed by [1]. The fix resolved issue with matching subdomains when no_proxy domain definition was prefixed with dot, e.q. "no_prefix=.mit.edu". As part of backporting the fix to RHEL, I wanted to create an upstream test for no_prefix functionality. However I found that there is still one corner case, which is not handled by the current upstream code and honestly I'm not sure what is the intended domain matching behavior in that case. Man page is also not very specific in this regard.
The corner case is as follows: - no_proxy=.mit.edu - download URL is e.g. "http://mit.edu/file1" In this case the proxy settings are used, because domains don't match due to the leftmost dot in no_proxy domain definition. This is either intended or corner case that was not considered. One could argue, that if the no_proxy is set to ".mit.edu", then leftmost dot means that the proxy settings should not apply only to subdomains of "mit.edu", but proxy settings should still apply to "mit.edu" domain itself. From my point of view, after reading wget man page, I don't think that the leftmost dost in no_proxy definition has any special meaning. I think that this corner case should be either fixed, or alternatively wget manpage should be made more specific about the intended behavior. Anyway, I'm attaching patches fixing the corner case and adding test case for no_proxy behavior. And one small fix for the test framework - HttpTest begin() function was not returning a result value, but always None. Please let me know if the corner case is really an intended behavior and I'll change the test case and can fix the man page instead of the code. [1] http://git.savannah.gnu.org/cgit/wget.git/commit/?id=fd85ac9cc623847e9d94d9f9241ab34e2c146cbf Thank you. Regards, Tomas -- Tomas Hozza Associate Manager, Software Engineering - EMEA ENG Core Services PGP: 1D9F3C2D UTC+2 (CEST) Red Hat Inc. http://cz.redhat.com
>From b5201fd72945c3a9619dde82ad35fa7fa6c24ce2 Mon Sep 17 00:00:00 2001 From: Tomas Hozza <[email protected]> Date: Thu, 7 Nov 2019 12:46:15 +0100 Subject: [PATCH 1/5] testenv: HTTPTest.begin() should return exit value Previously the HTTPTest.begin() method always returned None. However this is not consistent with the begin() implementation of the parent class (BaseTest). This change ensures that HTTPTest.begin() returns a value. Signed-off-by: Tomas Hozza <[email protected]> --- testenv/test/http_test.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/testenv/test/http_test.py b/testenv/test/http_test.py index fef0c2ef..462ac6e7 100644 --- a/testenv/test/http_test.py +++ b/testenv/test/http_test.py @@ -42,7 +42,7 @@ class HTTPTest(BaseTest): print_green("Test Passed.") else: self.tests_passed = False - super(HTTPTest, self).begin() + return super(HTTPTest, self).begin() def instantiate_server_by(self, protocol): server = {HTTP: HTTPd, -- 2.21.0
>From f556f41cced4c7b9faa20a8d17c10e2c34c36bfb Mon Sep 17 00:00:00 2001 From: Tomas Hozza <[email protected]> Date: Thu, 7 Nov 2019 13:01:44 +0100 Subject: [PATCH 2/5] testenv: Allow definition of environment variables for wget execuion Added new test hook called EnvironmentVariables, for defining environment variables when wget is executed in tests. This is handy for testing environment variables, which are accepted by wget. Signed-off-by: Tomas Hozza <[email protected]> --- testenv/README | 3 +++ testenv/conf/environment_variables.py | 14 ++++++++++++++ testenv/test/base_test.py | 6 +++++- 3 files changed, 22 insertions(+), 1 deletion(-) create mode 100644 testenv/conf/environment_variables.py diff --git a/testenv/README b/testenv/README index 6580bc99..e734b55f 100644 --- a/testenv/README +++ b/testenv/README @@ -220,6 +220,9 @@ executed. The currently supported options are: file. While all Download URL's are passed to Urls, a notable exception is when in-url authentication is used. In such a case, the URL is specified in the WgetCommands string. + * EnvironmentVariables: A dictionary with key-value items, which will be + defined as environment variables during the execution of wget command in + test. Post-Test Hooks: ================================================================================ diff --git a/testenv/conf/environment_variables.py b/testenv/conf/environment_variables.py new file mode 100644 index 00000000..323c051c --- /dev/null +++ b/testenv/conf/environment_variables.py @@ -0,0 +1,14 @@ +from conf import hook + +""" Test Option: EnvironmentVariables +This hook is used to define environment variables used for execution of wget +command in test.""" + + +@hook(alias='EnvironmentVariables') +class URLs: + def __init__(self, envs): + self.envs = envs + + def __call__(self, test_obj): + test_obj.envs.update(**self.envs) diff --git a/testenv/test/base_test.py b/testenv/test/base_test.py index 1e1eb181..db099137 100644 --- a/testenv/test/base_test.py +++ b/testenv/test/base_test.py @@ -51,6 +51,7 @@ class BaseTest: self.wget_options = '' self.urls = [] + self.envs = dict() self.tests_passed = True self.ready = False @@ -97,12 +98,15 @@ class BaseTest: cmd_line = self.gen_cmd_line() params = shlex.split(cmd_line) print(params) + envs = {"HOME": os.getcwd()} + envs.update(**self.envs) + print(envs) if os.getenv("SERVER_WAIT"): time.sleep(float(os.getenv("SERVER_WAIT"))) try: - ret_code = call(params, env={"HOME": os.getcwd()}) + ret_code = call(params, env=envs) except FileNotFoundError: raise TestFailed("The Wget Executable does not exist at the " "expected path.") -- 2.21.0
>From afc3fcb7f2e31dcd4f6c62386d3d168240d8c0d4 Mon Sep 17 00:00:00 2001 From: Tomas Hozza <[email protected]> Date: Thu, 7 Nov 2019 13:11:30 +0100 Subject: [PATCH 3/5] testenv: Add test for handling of no_proxy environment variable Added new test with 5 cases, which are testing various combinations of no_proxy environment variable definition and requested URLs Signed-off-by: Tomas Hozza <[email protected]> --- testenv/Test-no_proxy-env.py | 131 +++++++++++++++++++++++++++++++++++ 1 file changed, 131 insertions(+) create mode 100755 testenv/Test-no_proxy-env.py diff --git a/testenv/Test-no_proxy-env.py b/testenv/Test-no_proxy-env.py new file mode 100755 index 00000000..5afce505 --- /dev/null +++ b/testenv/Test-no_proxy-env.py @@ -0,0 +1,131 @@ +#!/usr/bin/env python3 +from sys import exit +from test.http_test import HTTPTest +from test.base_test import HTTP +from misc.wget_file import WgetFile + +""" + This test ensures, that domains with and without leftmost dot defined in + no_proxy environment variable are accepted by wget. The idea is to use + non-existing proxy server address and detect whether files are downloaded + when proxy settings are omitted based on no_proxy environment variable + value. +""" +# File Definitions +File1 = "Would you like some Tea?" +File2 = "With lemon or cream?" + +A_File = WgetFile ("File1", File1) +B_File = WgetFile ("File2", File2) + +WGET_URLS = [["File1", "File2"]] +WGET_ENVS = { + "http_proxy": "nonexisting.localhost:8080", + "no_proxy": "working1.localhost,.working2.localhost" +} + +Servers = [HTTP] +Files = [[A_File, B_File]] + +ExpectedReturnCodeWorking = 0 +ExpectedReturnCodeNotWorking = 4 # network error (non-existing proxy address) + +ExpectedDownloadedFilesWorking = [A_File, B_File] + +# Pre and Post Test Hooks +test_options = { + "Urls" : WGET_URLS, + "EnvironmentVariables": WGET_ENVS +} +post_test_working = { + "ExpectedFiles" : ExpectedDownloadedFilesWorking, + "ExpectedRetcode" : ExpectedReturnCodeWorking +} +post_test_not_working = { + "ExpectedRetcode" : ExpectedReturnCodeNotWorking +} + +# Case #1: +# - Requested domain matches exactly the domain definition in no_proxy. +# - Domain definition in no_proxy is NOT dot-prefixed +# Expected result: proxy settings don't apply and files are downloaded. +pre_test_working1 = { + "ServerFiles" : Files, + "Domains" : ["working1.localhost"] +} + +err_case_1 = HTTPTest ( + pre_hook=pre_test_working1, + test_params=test_options, + post_hook=post_test_working, + protocols=Servers +).begin () + +# Case #2: +# - Requested domain is sub-domain of a domain definition in no_proxy. +# - Domain definition in no_proxy is NOT dot-prefixed +# Expected result: proxy settings don't apply and files are downloaded. +pre_test_working2 = { + "ServerFiles" : Files, + "Domains" : ["www.working1.localhost"] +} + +err_case_2 = HTTPTest ( + pre_hook=pre_test_working2, + test_params=test_options, + post_hook=post_test_working, + protocols=Servers +).begin () + +# Case #3: +# - Requested domain matches exactly the domain definition in no_proxy, +# except for the leftmost dot (".") in no_proxy domain definition. +# - Domain definition in no_proxy IS dot-prefixed +# Expected result: proxy settings don't apply and files are downloaded. +pre_test_working3 = { + "ServerFiles" : Files, + "Domains" : ["working2.localhost"] +} + +err_case_3 = HTTPTest ( + pre_hook=pre_test_working3, + test_params=test_options, + post_hook=post_test_working, + protocols=Servers +).begin () + +# Case #4: +# - Requested domain is sub-domain of a domain definition in no_proxy. +# - Domain definition in no_proxy IS dot-prefixed +# Expected result: proxy settings don't apply and files are downloaded. +pre_test_working4 = { + "ServerFiles" : Files, + "Domains" : ["www.working2.localhost"] +} + +err_case_4 = HTTPTest ( + pre_hook=pre_test_working4, + test_params=test_options, + post_hook=post_test_working, + protocols=Servers +).begin () + +# Case #5 +# - Requested domain does not match a domain definition in no_proxy. +# - Requested domain is NOT sub-domain of a domain definition in no_proxy. +# Expected result: proxy settings apply and files are NOT downloaded due to +# network error when using proxy with non-existing URL. +pre_test_not_working1 = { + "ServerFiles" : Files, + "Domains" : ["www.example.localhost"] +} + +err_case_5 = HTTPTest ( + pre_hook=pre_test_not_working1, + test_params=test_options, + post_hook=post_test_not_working, + protocols=Servers +).begin () + +# Combine error codes from all test cases +exit (max(err_case_1, err_case_2, err_case_3, err_case_4, err_case_5)) -- 2.21.0
>From 6c850d3361988191d5505907d340ea5a24b70133 Mon Sep 17 00:00:00 2001 From: Tomas Hozza <[email protected]> Date: Thu, 7 Nov 2019 13:15:11 +0100 Subject: [PATCH 4/5] src/host.c: Handle a corner case in suffix matching Handle case when domains match, except for leftmost dot in domain definition. Example when 'what' is 'mit.edu' and 'list[i]' is '.mit.edu'. Signed-off-by: Tomas Hozza <[email protected]> --- src/host.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/src/host.c b/src/host.c index e4f0ffb2..64347637 100644 --- a/src/host.c +++ b/src/host.c @@ -1024,7 +1024,10 @@ sufmatch (const char **list, const char *what) for (i = 0; list[i]; i++) { j = strlen (list[i]); - if (lw < j) + /* Ignore leftmost dot from length in case of dot-prefixed subdomain + * in list[i] and lenght mismatch with what of size 1 (the leftmost dot) + * */ + if (!(lw == j-1 && list[i][0] == '.') && lw < j) continue; /* what is no (sub)domain of list[i] */ for (k = lw; j >= 0 && k >= 0; j--, k--) @@ -1038,6 +1041,12 @@ sufmatch (const char **list, const char *what) */ if (j == -1 && (k == -1 || what[k] == '.' || list[i][0] == '.')) return true; + + /* Exact domain match, except for leftmost dot in list[i] + * example: what = "mit.edu" and list[i] = ".mit.edu" + */ + if (j == 0 && k == -1 && list[i][0] == '.') + return true; } return false; -- 2.21.0
