[Bug-wget] [PATCH] Allow openSSL compiled without SSLv2

2011-04-11 Thread Cristian Rodríguez
Hi:

The attached patch adds support for an OpenSSL library compiled without
SSLv2, in which case wget will behave as if it were using
the GnuTLS backend, that is, doing SSLv3 only.
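
The fallthrough that the patch relies on can be sketched in isolation. The
following is only an illustration, not wget's code: the enum values and the
pick_method helper are hypothetical stand-ins, and the OpenSSL method
constructors are represented by their names as strings so that the sketch
compiles without OpenSSL headers. Building with -DOPENSSL_NO_SSL2 models a
library compiled without SSLv2.

```c
/* Hypothetical stand-ins for wget's secure_protocol_* values. */
enum secure_protocol { PROTO_AUTO, PROTO_SSLV2, PROTO_SSLV3 };

/* Returns the name of the client method the switch would select.
   Strings replace the real OpenSSL calls (an assumption of this sketch). */
static const char *
pick_method (enum secure_protocol proto)
{
  const char *meth = "none";

  switch (proto)
    {
    case PROTO_AUTO:
      meth = "SSLv23_client_method";
      break;
    case PROTO_SSLV2:
#ifndef OPENSSL_NO_SSL2
      meth = "SSLv2_client_method";
      break;
#endif
      /* fall through: with OPENSSL_NO_SSL2 defined there is no
         assignment or break above, so the SSLv3 case handles it. */
    case PROTO_SSLV3:
      meth = "SSLv3_client_method";
      break;
    }

  return meth;
}
```

Compiled normally, pick_method (PROTO_SSLV2) selects the SSLv2 method; compiled
with -DOPENSSL_NO_SSL2, the same request ends up in the SSLv3 case.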

# Bazaar merge directive format 2 (Bazaar 0.90)
# revision_id: cristian@linux-us4g-20110411021140-k71ctv0bcygv05mj
# target_branch: bzr://bzr.savannah.gnu.org/wget/trunk/
# testament_sha1: 0b8aab4ce061b99614d52e9fa063e5f604cd0124
# timestamp: 2011-04-10 23:25:17 -0300
# base_revision_id: gscriv...@gnu.org-20110407105651-ofq3ntt3w0h6zkq9
# 
# Begin patch
=== modified file 'src/openssl.c'
--- src/openssl.c   2011-04-04 14:56:51 +
+++ src/openssl.c   2011-04-11 02:11:40 +
@@ -187,8 +187,10 @@
   meth = SSLv23_client_method ();
   break;
 case secure_protocol_sslv2:
+#ifndef OPENSSL_NO_SSL2
   meth = SSLv2_client_method ();
   break;
+#endif
 case secure_protocol_sslv3:
   meth = SSLv3_client_method ();
   break;

# Begin bundle
IyBCYXphYXIgcmV2aXNpb24gYnVuZGxlIHY0CiMKQlpoOTFBWSZTWTL7pGAAAXhfgAAQWGf/91Kl
zgCwUANa5u9avddMdo4aIk8qeTyJMw00mU9CZM1Hqep6j1DCAZRDTGppTeEp+qaaaAaA00aA
DQEkhoQJpplJ6eqemphD1DQyGmQNAj1JT0R6INGhoZAAElEwqeKekeSeUeFNDTRtINABo09T
DebO3wy7zPm0CmOnswQYUnz1fe8Kyy7YsMx4fQPzKYzAYzmk0qwlXzef7myR73QL0ZRdEkMy1SVa
p9NXxxJBkHC3NNLAdUE+ksVJCKypafCYeTue0SpBPkjoX3wWGRCFrWkaxiGMYGFzEYKgjnDD4g1G
k5UeUVhceQcTP1WfSf1bAky2PgHS6fucUBhFq+W86/U+YrBFCVG5i181Uw2jYgLCm5LTc0alGySm
16BMGMq9Vj+HfQpLEqOVw4dhgraqguKSqiUcxPKIzW4E8DgGXNv/T0mIYDADSFlZNMKi1Mpij+IB
gxHYx8ch2kzzHgB9ejROs4c5QoXzFF5Yq8ImjQlkysbdcclNy1ysIRIFsM4Swta12Ly7GAdHeo0W
hUNRfboKBL5qqNAtxndeDZcFYtz7FW7bGFbZBCd8CcE02kVClGTPxGuSHhqSYrx5eoCiJUJgzPJH
RmzAdTL8XzV17XLIN7nwN7YRhRWwNmyQsGhl590SOsNs0zQJNoeUqMlw98d+3eLOBFp44P4bELIY
lqiDgxgjF+DBRSdBnJcR8hcDXu4wHA1BzJS6qKZCIdFx1zQ4Z7tryobFDG2iuQaL/Cvwx2pR3zZ0
TYb/GWXeaIlANz0DGh9MchtfC7KSI9HVAfP2QXkKSMmOjtVVrqD4QFkmBv9bTO4TPGwP+aQhg3V7
U2IGqxMrDpS60fUitWQQC5bhZzjTOcL3wmbSDmk7AOxtO73TH6VqxtXg85tYZsopYVOzhHEKQOTt
aD5ZBQsKhZtPQ6sMVMrQYE7QgpsEtFNj46FfUo1wM+Qut3OacZJksYYxjIOEHjODhICTLsWAREUa
A2yIp1ATNUd95poeB7ANLzSVcu5zKDMkCAgt8EJdifXVaKIfDa6K8cCYclZ1WgpLdHF5XLuUFHXJ
Ks+IPs4MXVBpqi6LfTareKvhaBcjOV3VS9PD+DtY+hQGtrhUDjeXxC8JMKmgoBFbd+tNEu++dUsw
Yg+ogBmuMwOcymNMjQlhysOQMimA1m//F3JFOFCQMvukYA==


Re: [Bug-wget] How do I tell wget not to follow links in a file?

2011-04-11 Thread David Skalinder
Okay, I have filed bug #33044 for this issue at
https://savannah.gnu.org/bugs/index.php?33044.  I've also moved the demo
to http://davidskalinder.com/wgettest/ and added a bunch of directories to
the unwanted link page to make the problem clearer.

It strikes me that this issue must come up fairly frequently, especially
for sites with fairly flat directory hierarchies.  For example, any site
which keeps a recent updates page that includes a link to a previous
updates page, both of which contain links to many root-level directories,
would be affected.  A user who wanted to maintain an up-to-date mirror of
such a site would have no option but to download the entire site every
week.

HTH

DS


 On 04/07/2011 05:26 AM, Giuseppe Scrivano wrote:
 David Skalinder da...@skalinder.net writes:

 I want to mirror part of a website that contains two links pages, each
 of which contains links to many root-level directories and also to the
 other links page.  I want to download recursively all the links from
 one links page, but not from the other: that is, I want to tell wget to
 download links1 and follow all of its links, but not to download or
 follow links from links2.

 I've put a demo of this problem up at http://fangjaw.com/wgettest --
 there is a diagram there that might state the problem more clearly.

 This functionality seems so basic that I assume I must be overlooking
 something.  Clearly wget has been designed to give users control over
 which files they download; but all I can find is that -X controls both
 saving and link-following at the directory level, while -R controls
 saving at the file level but still follows links from unsaved files.

 why doesn't -X work in the scenario you have described?  If all links
 from `links2' are under /B, you can exclude them using something like:

 That scenario seems rather unlikely, unless we're talking about
 autogenerated folder index files...

 This issue would be resolved if wget had a way to avoid its current
 behavior of always unconditionally downloading HTML files regardless of
 what rejection rules say. Then you could just reject that single file
 (and, if need be, download it as part of a separate session).

 --
 Micah J. Cowan
 http://micah.cowan.name/


 I think that's right.  As I mention on the demo page, links2 could easily
 contain links to hundreds of different directories, in which case you're
 out of luck.

 As Micah notes, if -R did not download the files at all (or even just
 downloaded them but did not queue their links), that should fix the
 problem.  Also, if a user could alter the robots.txt file, I think she
 could make wget act correctly by including something like

 User-agent: *
 Disallow: /wgettest/links2.html

 But obviously, most wget users won't have access to the server side.
 Since (I assume) wget knows how to follow that robots instruction, it
 seems like it should be able to follow a similar instruction from the
 client side.
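
 A client-side equivalent of that robots rule could look roughly like the
 sketch below. It is not wget's implementation: the disallowed table and the
 path_is_disallowed helper are hypothetical names, and the sketch assumes
 plain prefix matching on the URL path, with the path written with a leading
 slash as is usual in robots.txt.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical client-side rule list, mirroring the Disallow lines
   one would otherwise put in the server's robots.txt. */
static const char *const disallowed[] = { "/wgettest/links2.html" };

/* Returns true when PATH begins with any disallowed prefix.
   Prefix matching is an assumption of this sketch. */
static bool
path_is_disallowed (const char *path)
{
  size_t i;

  for (i = 0; i < sizeof disallowed / sizeof disallowed[0]; i++)
    if (strncmp (path, disallowed[i], strlen (disallowed[i])) == 0)
      return true;

  return false;
}
```

 A crawler using such a check would skip /wgettest/links2.html before fetching
 it, so its links would never be queued, while /wgettest/links1.html would
 still be downloaded and followed.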

 David








Re: [Bug-wget] How do I tell wget not to follow links in a file?

2011-04-11 Thread David Skalinder
It just occurred to me that since wget will perform this task properly if
it gets the rule from robots.txt, maybe this issue could be worked around
by proxying or spoofing the remote site's robots.txt file locally?  That
is, I write

User-agent: *
Disallow: /wgettest/links2.html

into a file, save it in my home directory, and then somehow tell wget that
davidskalinder.com/robots.txt is actually located at
/home/user/robots.txt?

Does anybody know a convenient way of doing this?  Or is there an easier
workaround I'm overlooking?




Re: [Bug-wget] [PATCH] Allow openSSL compiled without SSLv2

2011-04-11 Thread Giuseppe Scrivano
Thanks for the patch.  Committed and pushed.

Cheers,
Giuseppe



Cristian Rodríguez crrodrig...@opensuse.org writes:

 Hi:

 The attached patch adds support for an OpenSSL library compiled without
 SSLv2, in which case wget will behave as if it were using
 the GnuTLS backend, that is, doing SSLv3 only.


 # Bazaar merge directive format 2 (Bazaar 0.90)
 # revision_id: cristian@linux-us4g-20110411021140-k71ctv0bcygv05mj
 # target_branch: bzr://bzr.savannah.gnu.org/wget/trunk/
 # testament_sha1: 0b8aab4ce061b99614d52e9fa063e5f604cd0124
 # timestamp: 2011-04-10 23:25:17 -0300
 # base_revision_id: gscriv...@gnu.org-20110407105651-ofq3ntt3w0h6zkq9
 # 
 # Begin patch
 === modified file 'src/openssl.c'
 --- src/openssl.c 2011-04-04 14:56:51 +
 +++ src/openssl.c 2011-04-11 02:11:40 +
 @@ -187,8 +187,10 @@
    meth = SSLv23_client_method ();
    break;
  case secure_protocol_sslv2:
 +#ifndef OPENSSL_NO_SSL2
    meth = SSLv2_client_method ();
    break;
 +#endif
  case secure_protocol_sslv3:
    meth = SSLv3_client_method ();
    break;
