wget 1.6 doesn't realise solaris has non-GNU msgfmt/gettext

2001-01-19 Thread Göran Uddeborg

The configure script from wget appears to unconditionally choose the GNU
format of message files.  CATOBJEXT is unconditionally set to ".gmo"
in aclocal.m4 and thus in configure.  As a result, the precompiled
.gmo files included in the distribution get installed.

On a standard Solaris system, without GNU gettext installed, this
fails.  Solaris has gettext and msgfmt, but the binary format is
(apparently) not compatible.

I believe the definition of WGET_WITH_NLS in aclocal.m4 has to be
modified so that a "clever" choice of CATOBJEXT is made.  That is, it gets
set to ".mo" on Solaris systems where GNU gettext isn't installed, and the
rest follows from that.  On systems with GNU gettext, it is still
".gmo" of course.  As an example of a package where this seems to
work, I looked at gettext 0.10.35.
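
The decision described above can be sketched outside of m4 as follows. This is only an illustration of the logic, written in Python rather than as a configure test. The helper names (choose_catobjext, probe_msgfmt) are hypothetical, and it assumes that a GNU msgfmt identifies itself with the string "GNU" when run with --version. It is not necessarily how gettext 0.10.35 makes the choice.

```python
import subprocess


def choose_catobjext(msgfmt_version_output: str) -> str:
    """Pick the catalog extension from the output of `msgfmt --version`.

    Assumption: GNU msgfmt mentions "GNU" in its version banner, while a
    non-GNU msgfmt (for example the Solaris one) does not.
    """
    return ".gmo" if "GNU" in msgfmt_version_output else ".mo"


def probe_msgfmt(command: str = "msgfmt") -> str:
    """Run `<command> --version` and return whatever it printed.

    Returns an empty string when the command cannot be started at all.
    """
    try:
        result = subprocess.run(
            [command, "--version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        return ""
    return result.stdout + result.stderr


if __name__ == "__main__":
    print("CATOBJEXT =", choose_catobjext(probe_msgfmt()))
```

With such a check, the precompiled .gmo files would only be installed when the probe reports a GNU msgfmt.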



Re: Alloca, windows (again)

2001-01-19 Thread Hrvoje Niksic

Herold Heiko [EMAIL PROTECTED] writes:

 In 1.7-dev I get
 
 LINK : warning LNK4005: no objects used from library
 c:\programmi\devstudio\vc\lib\libcpmt.lib
 alloca.obj : error LNK2001: unresolved external symbol _xmalloc

Try removing alloca.obj from the linking line.



Wget and secure servers

2001-01-19 Thread Randy Sweeten

I just tried wget for Windows.  It looks like it would do everything I need,
except that next month the website I need to access will be on a secure
server, https.  It looks like wget 1.5.3 does not support https.  Any chance
for such an enhancement soon?

Randy Sweeten                Technical Support
Central House Internet       www.centralhouse.net
Central House Technologies   www.centralhouse.com
(209)245-5900 x3             [EMAIL PROTECTED]

Central House University
Online Training / Classes for End-Users, Business, Technical Certification
Over 400 Classes Available, as low as $50.00 per year
http://www.centralhouse.net/training/






Re: passworded sites

2001-01-19 Thread Dmitri Loguinov

Hi Hack,

Thanks for getting back to me. I didn't realize that the new version 1.6
existed; however, it has some of the same problems. I tried it "as is"
and it failed on problem #1 that I identified below. It doesn't really
matter whether the password has an @ in it: any HTTP redirect seems to
throw the password off, even in 1.6.

I also tried to use version 1.6 to rip a free website, but purposely
specified my username (with the %40 = '@' in it) on the command line. It
also failed to get anything from that website beyond the very first HTML
file. Removing the password of course fixed the problem.

It seems to me that wget could use some work, but I am sure that 1.7-dev
is much better and that you've taken care of these problems. Submitting my
patch is probably not a very good idea, since I hacked the 1.5.3 code to
work under Windows 2000 and couldn't do a very good job in 3 hours. I
don't think you want it. But the basic idea is that whenever wget is
redirected with a 301, or follows *any* link, I make sure that the new
link gets the password from the cur_url link before we even try to
follow the new link. For example, suppose page A is passworded. Page A
has a link to page B (no password there). However, page B references D,
which does require the password. Whenever it follows links, my code keeps
the same password across all the transitions A-B-D and succeeds in coming
back into the protected area cleanly. Furthermore, site A might have a
different DNS name, say X, and wget will drop the password in that case
as well (i.e., A-B-X, or A-X).
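
The idea of carrying the password from cur_url over to every followed link can be sketched as below. This is a minimal illustration in Python, not the actual patch against the wget 1.5.3 C code. The function name inherit_credentials and the example URLs are hypothetical.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def inherit_credentials(cur_url: str, link: str) -> str:
    """Resolve `link` against `cur_url` and, when the result carries no
    user:password part of its own, copy the one from `cur_url` onto it.
    """
    target = urlsplit(urljoin(cur_url, link))
    source = urlsplit(cur_url)
    # Nothing to do if the target already has credentials, or if the
    # current URL has none to pass on.
    if "@" in target.netloc or "@" not in source.netloc:
        return urlunsplit(target)
    # Everything before the LAST '@' is the user:password part.
    userinfo = source.netloc.rpartition("@")[0]
    return urlunsplit(target._replace(netloc=f"{userinfo}@{target.netloc}"))


if __name__ == "__main__":
    a = "http://user:secret@a.example/index.html"
    b = inherit_credentials(a, "http://b.example/page.html")
    d = inherit_credentials(b, "http://a.example/private/d.html")
    print(b)
    print(d)
```

Note that this sketch, like the approach described above, also sends the password to hosts other than the original one, which is what makes the A-B-D chain work.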

The hack around @ is not as clean, but it works in my case (it may not
work in general). I suggest that you decouple the password from the URL.
In wget, both are *always* kept together in the field called url or
something similar. This creates confusion upon calling parse_url() and
similar functions. My suggestion -- take the password out of the URL at
the very beginning of a session, and keep it separate.
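
A minimal sketch of that decoupling, in Python for illustration only: the credentials are split off once, at the last '@' of the host part, and re-encoded only when a URL with credentials has to be written out again. The names Credentials, split_credentials and attach_credentials, and the example user joe@example.com, are hypothetical and not taken from the wget sources.

```python
from typing import NamedTuple, Optional, Tuple
from urllib.parse import quote, unquote, urlsplit, urlunsplit


class Credentials(NamedTuple):
    user: str
    password: Optional[str]


def split_credentials(url: str) -> Tuple[str, Optional[Credentials]]:
    """Return (url_without_credentials, credentials_or_None)."""
    parts = urlsplit(url)
    # Split at the LAST '@' so that a literal '@' inside the user name
    # (encoded as %40 or already expanded) cannot be taken for the separator.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    if not at:
        return url, None
    user, colon, password = userinfo.partition(":")
    creds = Credentials(unquote(user), unquote(password) if colon else None)
    return urlunsplit(parts._replace(netloc=hostport)), creds


def attach_credentials(bare_url: str, creds: Credentials) -> str:
    """Rebuild a URL with credentials, percent-encoding '@' as %40 again."""
    parts = urlsplit(bare_url)
    userinfo = quote(creds.user, safe="")
    if creds.password is not None:
        userinfo += ":" + quote(creds.password, safe="")
    return urlunsplit(parts._replace(netloc=f"{userinfo}@{parts.netloc}"))


if __name__ == "__main__":
    bare, creds = split_credentials(
        "http://joe%40example.com:secret@www.site.com/main"
    )
    print(bare, creds)
    print(attach_credentials(bare, creds))
```

Keeping the bare URL and the credentials apart means that host comparisons and link following never have to look at the '@' at all.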

Thanks
Dmitri

Hack Kampbjørn wrote:
 
 Please try the latest wget version 1.6 or, even better, try the CVS
 development version (1.7-dev). Take a look at http://sunsite.dk/wget
 for instructions on how to get it.
 
 Some work has been done on improving wget's handling of passwords,
 specifically the handling of '@' in passwords. But if not all of your
 cases have been addressed, consider submitting your patch. The web site
 also says how the wget development team prefers to receive such patches
 (diff -u against the CVS source).
 
 Dmitri Loguinov wrote:
 
  Hi
 
  I am sure you're aware of the fact that wget 1.5.3 does not properly
  handle passworded HTTP sites (even with Basic authentication). There are
  several areas where the username/password are silently "dropped" in the
  code, and wget tries to access the same site with no password.
  Furthermore, matters were complicated because my username contained
  the character '@'. Handling of the character was OK in retrieving the
  first page (because it was marked as %40), but upon redirection and other
  stuff described below, the password was dropped because the code is
  written sloppily.
 
  1. HTTP code 301 -- page permanently moved. The site I worked with
  always redirected every page to http://site:80 and would not accept
  http://site. Therefore, upon redirection, it's important to keep the
  password in the code, which does not happen in wget.
 
  2. The same site referenced itself with fully qualified URLs. For example,
  instead of saying href = "main.html" it would say href =
  "http://site/directory/main.html". Wget would lose the password in that
  case as well. Furthermore, wget would think that the URL belongs to a
  *different* site and would not take the link if the -L (i.e., local
  files only) option is specified. This was apparently because the cur_url
  contained the password, but the href did not (again, some patching was
  needed to bypass the first @ as part of my username).
 
  3. If the username contains @ (such as an email address), then after a few
  iterations of the main code, the %40 would eventually get replaced by @
  and, upon future searches for the site name, the code would get stuck on
  the first @ symbol instead of the second one, which separates the
  password from the website. Consider this URL:
  '[EMAIL PROTECTED]@www.site.com/main' -- once the %40 is expanded to the
  first @, the code would NOT convert it back to %40 as required by one of
  the RFCs.
 
  It took me about 3 hours to patch the code, but I am not sure what other
  functionality I might have disabled or affected. To tell the truth, it
  is quite annoying that simple things like these were not thought of by
  whoever wrote the code. Anyhow, thanks for writing it. :)
 
  Dmitri
 
 --
 Med venlig hilsen / Kind regards
 
 Hack Kampbjørn   [EMAIL PROTECTED]
 HackLine +45 2031 7799



Re: Wget and secure servers

2001-01-19 Thread Jan Prikryl

Quoting Randy Sweeten ([EMAIL PROTECTED]):

 I just tried wget for Windows.  It looks like it would do everything I need,
 except that next month the website I need to access will be on a secure
 server, https.  It looks like wget 1.5.3 does not support https.  Any chance
 for such an enhancement soon?

The current development version, 1.7-dev, supports secure HTTP; however,
the current release (1.6) does not. 1.7-dev is available as C source
via CVS; Windows binaries are kindly produced from time to time by
Heiko Herold (see http://sunsite.dk/wget for details). I do not know,
however, whether Heiko's binaries already contain support for HTTP over
SSL.

-- jan

+--
 Jan Prikryl| vr|vis center for virtual reality and visualisation
 [EMAIL PROTECTED] | http://www.vrvis.at
+--