Re: [Bug-wget] [PATCH 3/3] Add command line option to disable use of .netrc

2017-05-12 Thread Giuseppe Scrivano
Hi Tomas,

Tomas Hozza  writes:

> Although the code internally uses an option to control whether .netrc
> is read for credentials, it was not possible to turn this behavior off
> on the command line. Note that it was already possible to turn it off
> using wgetrc.
>
> The idea for this change came from Bruce Jerrick (bmj...@gmail.com).
> Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1425097

Wouldn't "-e netrc=off" be sufficient?

Regards,
Giuseppe



[Bug-wget] [PATCH 1/3] Added tests for HTTP authentication using credentials from .netrc

2017-05-12 Thread Tomas Hozza
Getting credentials from .netrc has been broken from time to time, so
this adds test coverage to prevent regressions.

Also set the "HOME" environment variable when executing wget, to make
sure LocalFiles like .netrc, which are created just for the test, are
actually used.

Signed-off-by: Tomas Hozza 
---
 testenv/Makefile.am                         |  3 ++
 testenv/Test-auth-basic-netrc-pass-given.py | 68 +
 testenv/Test-auth-basic-netrc-user-given.py | 68 +
 testenv/Test-auth-basic-netrc.py            | 66
 testenv/test/base_test.py                   |  2 +-
 5 files changed, 206 insertions(+), 1 deletion(-)
 create mode 100755 testenv/Test-auth-basic-netrc-pass-given.py
 create mode 100755 testenv/Test-auth-basic-netrc-user-given.py
 create mode 100755 testenv/Test-auth-basic-netrc.py

diff --git a/testenv/Makefile.am b/testenv/Makefile.am
index 3febec7..7104314 100644
--- a/testenv/Makefile.am
+++ b/testenv/Makefile.am
@@ -75,6 +75,9 @@ if HAVE_PYTHON3
   TESTS = Test-504.py   \
 Test-auth-basic-fail.py \
 Test-auth-basic.py  \
+Test-auth-basic-netrc.py\
+Test-auth-basic-netrc-user-given.py \
+Test-auth-basic-netrc-pass-given.py \
 Test-auth-both.py   \
 Test-auth-digest.py \
 Test-auth-no-challenge.py   \
diff --git a/testenv/Test-auth-basic-netrc-pass-given.py b/testenv/Test-auth-basic-netrc-pass-given.py
new file mode 100755
index 000..43dfe34
--- /dev/null
+++ b/testenv/Test-auth-basic-netrc-pass-given.py
@@ -0,0 +1,68 @@
+#!/usr/bin/env python3
+from sys import exit
+from test.http_test import HTTPTest
+from misc.wget_file import WgetFile
+
+"""
+This test ensures Wget uses credentials from .netrc for Basic Authorization Negotiation.
+In this case we test that .netrc credentials are used in case only
+password is given on the command line.
+Also, we ensure that Wget saves the host after a successful auth and
+doesn't wait for a challenge the second time.
+"""
+# File Definitions ###
+File1 = "I am an invisible man."
+File2 = "I too am an invisible man."
+
+User = "Sauron"
+Password = "TheEye"
+
+File1_rules = {
+    "Authentication"    : {
+        "Type"  : "Basic",
+        "User"  : User,
+        "Pass"  : Password
+    }
+}
+File2_rules = {
+    "ExpectHeader"      : {
+        "Authorization" : "Basic U2F1cm9uOlRoZUV5ZQ=="
+    }
+}
+
+Netrc = "machine 127.0.0.1\n\tlogin {0}".format(User)
+
+A_File = WgetFile ("File1", File1, rules=File1_rules)
+B_File = WgetFile ("File2", File2, rules=File2_rules)
+Netrc_File = WgetFile (".netrc", Netrc)
+
+WGET_OPTIONS = "--password={0}".format(Password)
+WGET_URLS = [["File1", "File2"]]
+
+Files = [[A_File, B_File]]
+LocalFiles = [Netrc_File]
+
+ExpectedReturnCode = 0
+ExpectedDownloadedFiles = [A_File, B_File, Netrc_File]
+
+# Pre and Post Test Hooks #
+pre_test = {
+    "ServerFiles"   : Files,
+    "LocalFiles"    : LocalFiles
+}
+test_options = {
+    "WgetCommands"  : WGET_OPTIONS,
+    "Urls"          : WGET_URLS
+}
+post_test = {
+    "ExpectedFiles"     : ExpectedDownloadedFiles,
+    "ExpectedRetcode"   : ExpectedReturnCode
+}
+
+err = HTTPTest (
+    pre_hook=pre_test,
+    test_params=test_options,
+    post_hook=post_test
+).begin ()
+
+exit (err)
diff --git a/testenv/Test-auth-basic-netrc-user-given.py b/testenv/Test-auth-basic-netrc-user-given.py
new file mode 100755
index 000..57b6148
--- /dev/null
+++ b/testenv/Test-auth-basic-netrc-user-given.py
@@ -0,0 +1,68 @@
+#!/usr/bin/env python3
+from sys import exit
+from test.http_test import HTTPTest
+from misc.wget_file import WgetFile
+
+"""
+This test ensures Wget uses credentials from .netrc for Basic Authorization Negotiation.
+In this case we test that .netrc credentials are used in case only
+user login is given on the command line.
+Also, we ensure that Wget saves the host after a successful auth and
+doesn't wait for a challenge the second time.
+"""
+# File Definitions ###
+File1 = "I am an invisible man."
+File2 = "I too am an invisible man."
+
+User = "Sauron"
+Password = "TheEye"
+
+File1_rules = {
+    "Authentication"    : {
+        "Type"  : "Basic",
+        "User"  : User,
+        "Pass"  : Password
+    }
+}
+File2_rules = {
+    "ExpectHeader"      : {
+        "Authorization" : "Basic U2F1cm9uOlRoZUV5ZQ=="
+    }
+}
+
+Netrc = "machine 127.0.0.1\n\tlogin {0}\n\tpassword {1}".format(User, Password)
+
+A_File = WgetFile ("File1", File1, rules=
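
For readers wondering where the "ExpectHeader" value in the tests above
comes from: it is the standard HTTP Basic encoding of "User:Password".
The following is only a minimal sketch to check that value by hand; the
helper name basic_auth_header is made up for illustration and is not
part of the wget test suite.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP Basic "Authorization" header.

    Hypothetical helper, not taken from testenv/: it base64-encodes
    "user:password" and prefixes the result with "Basic ".
    """
    raw = "{0}:{1}".format(user, password).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return "Basic {0}".format(token)


if __name__ == "__main__":
    # Same credentials as the User/Password variables in the tests.
    print(basic_auth_header("Sauron", "TheEye"))
```

With the credentials used in the tests this gives the same string that
File2_rules expects in the Authorization header.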

[Bug-wget] [PATCH 0/3] Changes related to use of .netrc

2017-05-12 Thread Tomas Hozza
I'm sending a couple of changes, including tests, which are the result of
Fedora bug https://bugzilla.redhat.com/show_bug.cgi?id=1425097. For details,
please see the commit message of each patch.

Tomas Hozza (3):
  Added tests for HTTP authentication using credentials from .netrc
  Fixed getting of credentials from .netrc
  Add command line option to disable use of .netrc

 doc/wget.texi                               |  6 +++
 src/http.c                                  |  2 +-
 src/main.c                                  |  3 ++
 testenv/Makefile.am                         |  4 ++
 testenv/Test-auth-basic-netrc-pass-given.py | 68 +
 testenv/Test-auth-basic-netrc-user-given.py | 68 +
 testenv/Test-auth-basic-netrc.py            | 66
 testenv/Test-auth-basic-no-netrc-fail.py    | 59 +
 testenv/test/base_test.py                   |  2 +-
 9 files changed, 276 insertions(+), 2 deletions(-)
 create mode 100755 testenv/Test-auth-basic-netrc-pass-given.py
 create mode 100755 testenv/Test-auth-basic-netrc-user-given.py
 create mode 100755 testenv/Test-auth-basic-netrc.py
 create mode 100755 testenv/Test-auth-basic-no-netrc-fail.py

-- 
2.7.4




[Bug-wget] [PATCH 3/3] Add command line option to disable use of .netrc

2017-05-12 Thread Tomas Hozza
Although the code internally uses an option to control whether .netrc
is read for credentials, it was not possible to turn this behavior off
on the command line. Note that it was already possible to turn it off
using wgetrc.

The idea for this change came from Bruce Jerrick (bmj...@gmail.com).
Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1425097

Signed-off-by: Tomas Hozza 
---
 doc/wget.texi                            |  6
 src/main.c                               |  3 ++
 testenv/Makefile.am                      |  1 +
 testenv/Test-auth-basic-no-netrc-fail.py | 59
 4 files changed, 69 insertions(+)
 create mode 100755 testenv/Test-auth-basic-no-netrc-fail.py

diff --git a/doc/wget.texi b/doc/wget.texi
index a2bf9dc..e4e0bf6 100644
--- a/doc/wget.texi
+++ b/doc/wget.texi
@@ -703,6 +703,12 @@ Before (over)writing a file, back up an existing file by adding a
 files are rotated to @samp{.2}, @samp{.3}, and so on, up to
 @var{backups} (and lost beyond that).
 
+@cindex authentication credentials
+@item --no-netrc
+Do not try to obtain credentials from the @file{.netrc} file. By default,
+the @file{.netrc} file is searched for credentials when none have been
+passed on the command line and authentication is required.
+
 @cindex continue retrieval
 @cindex incomplete downloads
 @cindex resume download
diff --git a/src/main.c b/src/main.c
index 8e9d6e9..297499e 100644
--- a/src/main.c
+++ b/src/main.c
@@ -359,6 +359,7 @@ static struct cmdline_option option_data[] =
 #endif
 { "method", 0, OPT_VALUE, "method", -1 },
 { "mirror", 'm', OPT_BOOLEAN, "mirror", -1 },
+{ "netrc", 0, OPT_BOOLEAN, "netrc", -1 },
 { "no", 'n', OPT__NO, NULL, required_argument },
 { "no-clobber", 0, OPT_BOOLEAN, "noclobber", -1 },
 { "no-config", 0, OPT_BOOLEAN, "noconfig", -1},
@@ -629,6 +630,8 @@ Download:\n"),
   -nc, --no-clobber                skip downloads that would download to\n\
                                      existing files (overwriting them)\n"),
     N_("\
+       --no-netrc                  don't try to obtain credentials from .netrc\n"),
+    N_("\
   -c,  --continue                  resume getting a partially-downloaded file\n"),
     N_("\
        --start-pos=OFFSET          start downloading from zero-based position OFFSET\n"),
diff --git a/testenv/Makefile.am b/testenv/Makefile.am
index 7104314..ef4158a 100644
--- a/testenv/Makefile.am
+++ b/testenv/Makefile.am
@@ -78,6 +78,7 @@ if HAVE_PYTHON3
 Test-auth-basic-netrc.py\
 Test-auth-basic-netrc-user-given.py \
 Test-auth-basic-netrc-pass-given.py \
+Test-auth-basic-no-netrc-fail.py\
 Test-auth-both.py   \
 Test-auth-digest.py \
 Test-auth-no-challenge.py   \
diff --git a/testenv/Test-auth-basic-no-netrc-fail.py b/testenv/Test-auth-basic-no-netrc-fail.py
new file mode 100755
index 000..fad15e9
--- /dev/null
+++ b/testenv/Test-auth-basic-no-netrc-fail.py
@@ -0,0 +1,59 @@
+#!/usr/bin/env python3
+from sys import exit
+from test.http_test import HTTPTest
+from misc.wget_file import WgetFile
+
+"""
+This test ensures that Wget will not use credentials from .netrc
+when --no-netrc option is specified and Basic authentication is required
+and fails.
+"""
+# File Definitions ###
+File1 = "I am an invisible man."
+
+User = "Sauron"
+Password = "TheEye"
+
+File1_rules = {
+    "Authentication"    : {
+        "Type"  : "Basic",
+        "User"  : User,
+        "Pass"  : Password
+    }
+}
+
+Netrc = "machine 127.0.0.1\n\tlogin {0}\n\tpassword {1}".format(User, Password)
+
+A_File = WgetFile ("File1", File1, rules=File1_rules)
+Netrc_File = WgetFile (".netrc", Netrc)
+
+WGET_OPTIONS = "--no-netrc"
+WGET_URLS = [["File1"]]
+
+Files = [[A_File]]
+LocalFiles = [Netrc_File]
+
+ExpectedReturnCode = 6
+ExpectedDownloadedFiles = [Netrc_File]
+
+# Pre and Post Test Hooks #
+pre_test = {
+    "ServerFiles"   : Files,
+    "LocalFiles"    : LocalFiles
+}
+test_options = {
+    "WgetCommands"  : WGET_OPTIONS,
+    "Urls"          : WGET_URLS
+}
+post_test = {
+    "ExpectedFiles"     : ExpectedDownloadedFiles,
+    "ExpectedRetcode"   : ExpectedReturnCode
+}
+
+err = HTTPTest (
+    pre_hook=pre_test,
+    test_params=test_options,
+    post_hook=post_test
+).begin ()
+
+exit (err)
-- 
2.7.4
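
As a rough illustration of the behavior the new test checks, here is a
minimal sketch in Python of a credential lookup that can be switched
off. It uses Python's standard netrc module rather than Wget's own
parser, and the names lookup_credentials, write_test_netrc and
use_netrc are assumptions made up for this example (use_netrc stands in
for --no-netrc / netrc=off); none of this is code from the patch.

```python
import netrc
import os
import tempfile
from typing import Optional, Tuple


def lookup_credentials(netrc_path: str, host: str,
                       use_netrc: bool = True) -> Optional[Tuple[str, str]]:
    """Return (login, password) for host from netrc_path, or None.

    use_netrc=False is a hypothetical stand-in for --no-netrc: the file
    is not even opened, so no credentials can come from it.
    """
    if not use_netrc:
        return None
    entry = netrc.netrc(netrc_path).authenticators(host)
    if entry is None:
        # No "machine" entry for this host and no "default" entry.
        return None
    login, _account, password = entry
    return login, password


def write_test_netrc(directory: str) -> str:
    """Write the same .netrc content the test uses and return its path."""
    path = os.path.join(directory, ".netrc")
    with open(path, "w") as handle:
        handle.write("machine 127.0.0.1\n\tlogin Sauron\n\tpassword TheEye\n")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = write_test_netrc(tmp)
        print(lookup_credentials(path, "127.0.0.1"))
        print(lookup_credentials(path, "127.0.0.1", use_netrc=False))
```

In the test, the second situation is the one exercised: the server
demands Basic authentication, no credentials are available, and Wget is
expected to exit with code 6 (authentication failure).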




[Bug-wget] [PATCH 2/3] Fixed getting of credentials from .netrc

2017-05-12 Thread Tomas Hozza
There seems to be a copy&paste error in the http.c code that decides
whether to get credentials from .netrc. In ftp.c the "user" and "pass"
variables are char*, while in http.c they are char**. For this reason
they should be dereferenced when determining whether the password and
user login are set to some value.

Also, since both variables are dereferenced on the lines above the
changed code, it does not really make sense to check whether they are NULL.

This patch is based on a fix from Bruce Jerrick.
Fedora bug: https://bugzilla.redhat.com/show_bug.cgi?id=1425097

Signed-off-by: Tomas Hozza 
---
 src/http.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/http.c b/src/http.c
index 8b77a10..323f559 100644
--- a/src/http.c
+++ b/src/http.c
@@ -1900,7 +1900,7 @@ initialize_request (const struct url *u, struct http_stat *hs, int *dt, struct u
 *passwd = NULL;
 
   /* Check for ~/.netrc if none of the above match */
-  if (opt.netrc && (!user || (!passwd || !*passwd)))
+  if (opt.netrc && (!*user || !*passwd))
 search_netrc (u->host, (const char **) user, (const char **) passwd, 0);
 
   /* We only do "site-wide" authentication with "global" user/password
-- 
2.7.4
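
To make the effect of the one-line change easier to see, the two
conditions can be modelled outside of C. This is only a sketch under
stated assumptions: a one-element Python list stands in for a C
"char **" (the list is the outer pointer, element 0 is the "char *" it
points to, and None plays the role of NULL). The function names are
invented for illustration.

```python
from typing import List, Optional

# Stand-in for "char **": the list is the outer pointer, slot[0] is the
# inner "char *", and None models NULL.
Slot = List[Optional[str]]


def old_condition(use_netrc: bool, user: Optional[Slot],
                  passwd: Optional[Slot]) -> bool:
    """Model of: opt.netrc && (!user || (!passwd || !*passwd))"""
    return use_netrc and (user is None
                          or (passwd is None or passwd[0] is None))


def new_condition(use_netrc: bool, user: Slot, passwd: Slot) -> bool:
    """Model of: opt.netrc && (!*user || !*passwd)"""
    return use_netrc and (user[0] is None or passwd[0] is None)


if __name__ == "__main__":
    cases = {
        "nothing given": ([None], [None]),
        "only user given": (["Sauron"], [None]),
        "only password given": ([None], ["TheEye"]),
        "both given": (["Sauron"], ["TheEye"]),
    }
    for name, (user, passwd) in cases.items():
        print(name, old_condition(True, user, passwd),
              new_condition(True, user, passwd))
```

Under this model the two conditions differ only when the password is
given but the user is not: the outer "user" pointer is never NULL, so
the old check never looks at whether a login is actually set.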




[Bug-wget] [bug #50935] TEXTHTML not properly set if page is already downloaded

2017-05-12 Thread anonymous
Follow-up Comment #4, bug #50935 (project wget):

The problem with the replacement commands you recommended (replacing -pH with
-pHE in the second) is that it redownloads the files that it adjusts the
extension of.

As for making a head request, how expensive is that?  It would be ideal for my
use case if it didn't have to make any network requests at all for already
downloaded files.  Is using a heuristic like if it begins with "http://savannah.gnu.org/bugs/?50935>

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




[Bug-wget] [bug #50935] TEXTHTML not properly set if page is already downloaded

2017-05-12 Thread Tim Ruehsen
Update of bug #50935 (project wget):

  Status:   Need Info => Confirmed  

___

Follow-up Comment #3:

Sorry, my stupidity :-)
I was stuck with the first command and everything was fine, so I didn't really
check the next command :-(

You are right, if the file exists the -p -nc combination says 'File ...
already there; not retrieving.' and does nothing.

Instead it should read and parse that file (after checking that it really is
an HTML or CSS file). Wget currently has no heuristic for this, so it should
make a HEAD request to check the content type. What Wget actually does is
look at the file name extension.

So you can do the trick with


wget -xHE -nc 'https://news.ycombinator.com/item?id=14245538'
wget -pH -nc 'https://news.ycombinator.com/item?id=14245538'


I will add this issue as a reference in Wget2 development, where we will do it
correctly (using HEAD request).

Thanks for your report!

