Re: --mirror and --cut-dirs=2 bug?

2008-11-03 Thread Brock Murch

Micah,

Many thanks for all your very timely help. I have had no issues since
following your instructions to upgrade to 1.11.4 and installing it in the /opt
directory. I used:

$ ./configure --prefix=/opt/wget

And point to it specifically:

/opt/wget/bin/wget  --tries=10 -r -N -l inf --wait=1 \
-nH --cut-dirs=2 ftp://oceans.gsfc.nasa.gov/MODISA/ATTEPH/ \
-o /home1/software/modis/atteph/mirror_a.log \
--directory-prefix=/home1/software/modis/atteph

Thanks again.

Brock


On Monday 27 October 2008 3:06 pm, Micah Cowan wrote:
 Brock Murch wrote:
  Sorry, 1 quick question? Do you know of anyone providing rpm's of 1.11.4
  for CentOS?

 Not offhand. It may not yet be available; it was only packaged for
 Fedora Core a couple months ago, I think. RPMfind.net just lists 1.11.4
 sources for fc9 and fc10.

  If not, would you recommend uninstalling the current one? Before
  installing from your src? Many thanks.

 I'd advise against that: I believe various important components of Red
 Hat/CentOS rely on wget to fetch things. Sometimes minor changes in the
 output/interface of wget cause problems for automated scripts that form
 an integral part of an operating system. Though really, I think most of
 the changes that would pose such a danger are actually already in the
 Red Hat modified 1.10.2 sources (taken from the development sources
 for what was later released as 1.11).

 What I tend to do on my systems, is to configure the sources like:

   $ ./configure --prefix=$HOME/opt/wget

 and then either add $HOME/opt/wget/bin to my $PATH, or invoke it directly as
 $HOME/opt/wget/bin/wget.
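
A minimal sketch of the side-by-side approach described above: a small helper that prefers a privately built wget and falls back to the system one, so the distribution's own copy stays untouched. The function name pick_wget and the WGET_PREFIX variable are hypothetical, chosen for illustration; they are not part of wget.

```shell
# Hypothetical helper: print the path of a privately installed wget if one
# exists, otherwise the first "wget" found on PATH.
# WGET_PREFIX is an assumed variable naming the --prefix given to ./configure.
pick_wget() {
    prefix=${WGET_PREFIX:-$HOME/opt/wget}
    if [ -x "$prefix/bin/wget" ]; then
        printf '%s\n' "$prefix/bin/wget"
    else
        # Returns non-zero if no wget is installed at all.
        command -v wget
    fi
}
```

A cron script could then run "$(pick_wget)" --version to confirm which build it is about to use.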

 Note that if you want to build wget with support for HTTPS, you'll need
 to have the development package for openssl installed.



Re: --mirror and --cut-dirs=2 bug?

2008-10-27 Thread Micah Cowan

Brock Murch wrote:
 I try to keep a mirror of NASA atteph ancillary data for modis processing. I
 know that means little, but I have a cron script that runs 2 times a day. 
 Sometimes it works, and others, not so much. The sh script is listed at the 
 end of this email below. As is the contents of the remote ftp server's root 
 and portions of the log.
 
 I don't need all the data on the remote server, only some, thus I use
 --cut-dirs. To make matters stranger, the software (also from NASA) that uses
 these files, looks for them in a single place on the client machine where the 
 software runs, but needs data from 2 different directories on the remote ftp 
 server. If the data is not on the client machine, the software kindly ftp's 
 the files to the local directory. However, I don't allow write access to that 
 directory as many people use the software and when it is d/l'ed it has the 
 wrong perms for others to use it, thus I mirror the data I need from the ftp 
 site locally. In the script below, there are 2 wget commands, but they are to 
 slightly different directories (MODISA and MODIST).

I wouldn't recommend that. Using the same output directory for two
different source directories seems likely to lead to problems. You'd
most likely be better off by pulling to two locations, and then
combining them afterwards.

I don't know for sure that it _will_ cause problems (except if they
happen to have same-named files), as long as .listing files are being
properly removed (there were some recently-fixed bugs related to that, I
think? ...just appending new listings on top of existing files).
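
One way the "pull to two locations, then combine" suggestion might look is sketched below. It assumes each source has already been mirrored into its own staging directory; the function name merge_mirror is hypothetical, and the policy of refusing to overwrite differing files is an illustrative choice, not something wget does itself.

```shell
# Hypothetical helper: copy files from one staging mirror into a combined
# tree, skipping wget's .listing files and reporting (not overwriting) any
# file that already exists in the destination with different content.
# Usage: merge_mirror SRC_DIR DEST_DIR   (DEST_DIR must be an absolute path)
# Assumes file names do not contain newlines.
merge_mirror() {
    src=$1
    dest=$2
    [ -d "$src" ] || return 2
    conflicts=$(
        cd "$src" || exit 1
        find . -type f ! -name .listing | while IFS= read -r rel; do
            if [ -e "$dest/$rel" ]; then
                # Same name in both sources: only acceptable if identical.
                cmp -s "$rel" "$dest/$rel" || printf '%s\n' "$rel"
            else
                mkdir -p "$dest/${rel%/*}" && cp -p "$rel" "$dest/$rel"
            fi
        done
    )
    if [ -n "$conflicts" ]; then
        printf '%s\n' "$conflicts" | sed 's/^/conflict: /' >&2
        return 1
    fi
}
```

Running it once per staging directory makes same-named files visible as explicit conflicts instead of silent overwrites.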

 It appears to me that the problem occurs if there is a ftp server error, and 
 wget starts a retry. wget goes to the server root, gets the .listing from 
 there for some reason (as opposed to the directory it should go to on the 
 server), and then goes to the dir it needs to mirror and can't find the files 
 (that are listed in the root dir) and creates dirs, and then I get No such 
 file errors and recursive directories created. Any advice would be 
 appreciated.

This snippet seems to be the source of the problem:

 Error in server response, closing control connection.
 Retrying.
 
 --14:53:53--  ftp://oceans.gsfc.nasa.gov/MODIST/ATTEPH/2002/110/
   (try: 2) => `/home1/software/modis/atteph/2002/110/.listing'
 Connecting to oceans.gsfc.nasa.gov|169.154.128.45|:21... connected.
 Logging in as anonymous ... Logged in!
 ==> SYST ... done.    ==> PWD ... done.
 ==> TYPE I ... done.  ==> CWD not required.
 ==> PASV ... done.    ==> LIST ... done.

That "CWD not required" bit is erroneous. I'm 90% sure we fixed this
issue recently (though I'm not 100% sure that it went to release: I
believe so).
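
To see how often a log contains this pattern, something like the following sketch could be used. It only looks for a "Retrying." line followed by "CWD not required" before any successful CWD; the function name count_suspect_retries is hypothetical, and a match is a hint worth inspecting, not proof of the bug.

```shell
# Hypothetical log check: count retries that were followed by a listing
# issued without a fresh CWD, as in the excerpt above.
# Usage: count_suspect_retries LOGFILE
count_suspect_retries() {
    awk '
        /^ *Retrying\./    { retrying = 1; next }
        /CWD not required/ { if (retrying) n++; retrying = 0; next }
        /CWD .*done\./     { retrying = 0 }
        END                { print n + 0 }
    ' "$1"
}
```

For example, count_suspect_retries /home1/software/modis/atteph/mirror_a.log prints the number of such retries in that log.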

I believe we made some related fixes more recently. You provided a great
amount of useful information, but one thing that seems to be missing (or
I missed it) is the Wget version number. Judging from the log, I'd say
it's 1.10.2 or older; the most recent version of Wget is 1.11.4; could
you please try to verify whether Wget continues to exhibit this problem
in the latest release version?
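
A quick way to check an installed version against 1.11.4 is sketched below. The function name wget_at_least is hypothetical, and the comparison relies on sort -V being available, which is an assumption about the host (it is provided by reasonably recent GNU coreutils).

```shell
# Hypothetical helper: succeed if a "GNU Wget X.Y.Z" banner reports at least
# the given minimum version.
# Usage: wget_at_least MIN_VERSION "BANNER_TEXT"
wget_at_least() {
    min=$1
    banner=$2
    # Take the version number from the first line of the banner.
    ver=$(printf '%s\n' "$banner" | sed -n '1s/^GNU Wget \([0-9][0-9.]*\).*/\1/p')
    [ -n "$ver" ] || return 2
    # With version sort, the minimum must come first (or be equal).
    lowest=$(printf '%s\n%s\n' "$min" "$ver" | sort -V | head -n 1)
    [ "$lowest" = "$min" ]
}
```

Typical use would be: wget_at_least 1.11.4 "$(wget --version)" || echo "wget is older than 1.11.4".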

I'll also try to look into this as I have time (but it might be awhile
before I can give it some serious attention; it'd be very helpful if you
could do a little more legwork).

--
Thanks very much,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
GNU Maintainer: wget, screen, teseq
http://micah.cowan.name/


Re: --mirror and --cut-dirs=2 bug?

2008-10-27 Thread Micah Cowan

Micah Cowan wrote:
 I believe we made some related fixes more recently. You provided a great
 amount of useful information, but one thing that seems to be missing (or
 I missed it) is the Wget version number. Judging from the log, I'd say
 it's 1.10.2 or older; the most recent version of Wget is 1.11.4; could
 you please try to verify whether Wget continues to exhibit this problem
 in the latest release version?

This problem looks like the one that Mike Grant fixed in October of
2006: http://hg.addictivecode.org/wget/1.11/rev/161aa64e7e8f, so it
should definitely be fixed in 1.11.4. Please let me know if it isn't.

--
Regards,
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
GNU Maintainer: wget, screen, teseq
http://micah.cowan.name/


Re: --mirror and --cut-dirs=2 bug?

2008-10-27 Thread Brock Murch
Micah,

Thanks for your quick attention to this. Yes, I probably forgot to include
the version number:

[EMAIL PROTECTED] atteph]# wget --version
GNU Wget 1.10.2 (Red Hat modified)

Copyright (C) 2005 Free Software Foundation, Inc.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

Originally written by Hrvoje Niksic [EMAIL PROTECTED].

I will see if I can get the newest version for:
[EMAIL PROTECTED] atteph]# cat /etc/redhat-release
CentOS release 4.2 (Final)

I'll let you know how that goes.

Brock

On Monday 27 October 2008 2:19 pm, Micah Cowan wrote:
 Micah Cowan wrote:
  I believe we made some related fixes more recently. You provided a great
  amount of useful information, but one thing that seems to be missing (or
  I missed it) is the Wget version number. Judging from the log, I'd say
  it's 1.10.2 or older; the most recent version of Wget is 1.11.4; could
  you please try to verify whether Wget continues to exhibit this problem
  in the latest release version?

 This problem looks like the one that Mike Grant fixed in October of
 2006: http://hg.addictivecode.org/wget/1.11/rev/161aa64e7e8f, so it
 should definitely be fixed in 1.11.4. Please let me know if it isn't.



--mirror and --cut-dirs=2 bug?

2008-10-24 Thread Brock Murch

I try to keep a mirror of NASA atteph ancillary data for modis processing. I
know that means little, but I have a cron script that runs 2 times a day. 
Sometimes it works, and others, not so much. The sh script is listed at the 
end of this email below. As is the contents of the remote ftp server's root 
and portions of the log.

I don't need all the data on the remote server, only some, thus I use
--cut-dirs. To make matters stranger, the software (also from NASA) that uses
these files, looks for them in a single place on the client machine where the 
software runs, but needs data from 2 different directories on the remote ftp 
server. If the data is not on the client machine, the software kindly ftp's 
the files to the local directory. However, I don't allow write access to that 
directory as many people use the software and when it is d/l'ed it has the 
wrong perms for others to use it, thus I mirror the data I need from the ftp 
site locally. In the script below, there are 2 wget commands, but they are to 
slightly different directories (MODISA and MODIST).
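
The effect of -nH --cut-dirs=2 on these two sources can be illustrated with a small sketch. The function cut_dirs is a hypothetical stand-in that approximates the documented behaviour (drop the first N directory components, keep the file name); it is not wget's own code, and example.file is a made-up file name.

```shell
# Hypothetical helper that mimics how -nH --cut-dirs=N shortens a remote path.
# Usage: cut_dirs N remote/path/to/file
cut_dirs() {
    n=$1
    path=${2#/}                 # ignore a leading slash
    # Strip one leading directory per iteration, never the file name itself.
    while [ "$n" -gt 0 ] && [ "${path#*/}" != "$path" ]; do
        path=${path#*/}
        n=$((n - 1))
    done
    printf '%s\n' "$path"
}

cut_dirs 2 MODISA/ATTEPH/2002/110/example.file
cut_dirs 2 MODIST/ATTEPH/2002/110/example.file
```

Both calls print the same relative path, which is why the two wget commands end up writing into one shared local tree.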

It appears to me that the problem occurs if there is a ftp server error, and 
wget starts a retry. wget goes to the server root, gets the .listing from 
there for some reason (as opposed to the directory it should go to on the 
server), and then goes to the dir it needs to mirror and can't find the files 
(that are listed in the root dir) and creates dirs, and then I get No such 
file errors and recursive directories created. Any advice would be 
appreciated.

Brock Murch

Here is an example of the bad type of dir structure I end up with (there 
should be no EO1 and below):

[EMAIL PROTECTED] atteph]# find . -type d -name '*' | grep EO1
./2002/110/EO1
./2002/110/EO1/CZCS
./2002/110/EO1/CZCS/CZCS
./2002/110/EO1/CZCS/CZCS/CZCS
./2002/110/EO1/CZCS/CZCS/CZCS/CZCS
./2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON
./2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS
./2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS/CZCS
./2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS/CZCS/CZCS
./2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS/CZCS/CZCS/CZCS
./2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS/CZCS/CZCS/CZCS/CZCS
./2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS
./2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS
./2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS
./2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS
./2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS/CZCS
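
A listing like the one above can be detected automatically. The sketch below flags any directory whose path repeats the same component twice in a row (such as CZCS/CZCS); the function name find_repeated_dirs is hypothetical, and legitimate trees that really do nest identically named directories would also be reported.

```shell
# Hypothetical check: list directories whose path contains the same
# component twice in a row, a sign of the runaway recursion shown above.
# Uses POSIX basic-regex back-references. Exits non-zero if nothing matches.
# Usage: find_repeated_dirs MIRROR_DIR
find_repeated_dirs() {
    find "$1" -type d | grep \
        -e '/\([^/][^/]*\)/\1/' \
        -e '/\([^/][^/]*\)/\1$'
}
```

A cron job could run this after each mirror pass and send mail if it prints anything.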

Or:
[EMAIL PROTECTED] atteph]# ls /home1/software/modis/atteph/2002/110/EO1/
CZCS  README
[EMAIL PROTECTED] atteph]# ls /home1/software/modis/atteph/2002/110/EO1/CZCS/
CZCS  README
[EMAIL PROTECTED] atteph]# ls /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/
CZCS  README
[EMAIL PROTECTED] atteph]# ls /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/CZCS/
CZCS  README
[EMAIL PROTECTED] atteph]# ls /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/CZCS/CZCS/
COMMON
[EMAIL PROTECTED] atteph]# ls /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/
CZCS  README
[EMAIL PROTECTED] atteph]# ls /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS/
CZCS  README
[EMAIL PROTECTED] atteph]# ls /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS/CZCS/
CZCS  README
[EMAIL PROTECTED] atteph]# ls /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS/CZCS/CZCS/
CZCS  README
[EMAIL PROTECTED] atteph]# ll /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/CZCS/CZCS/CZCS/

And

[EMAIL PROTECTED] atteph]# ll /home1/software/modis/atteph/2002/110/EO1/README
-rw-r--r--  1 root root 9499 Aug 20 10:12 /home1/software/modis/atteph/2002/110/EO1/README
[EMAIL PROTECTED] atteph]# ll /home1/software/modis/atteph/2002/110/EO1/CZCS/README
-rw-r--r--  1 root root 9499 Aug 20 10:12 /home1/software/modis/atteph/2002/110/EO1/CZCS/README
[EMAIL PROTECTED] atteph]# ll /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/README
-rw-r--r--  1 root root 9499 Aug 20 10:12 /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/README
[EMAIL PROTECTED] atteph]# ll /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/CZCS/README
-rw-r--r--  1 root root 9499 Aug 20 10:12 /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/CZCS/README
[EMAIL PROTECTED] atteph]# ll /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/CZCS/CZCS/README
ls: /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/CZCS/CZCS/README: No such file or directory
[EMAIL PROTECTED] atteph]# ll /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/README
-rw-r--r--  1 root root 9499 Aug 20 10:12 /home1/software/modis/atteph/2002/110/EO1/CZCS/CZCS/CZCS/CZCS/COMMON/README


The README files are all identical, and the same as the one on the ftp
server.
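
The claim that the README copies are identical can be checked mechanically. The sketch below counts how many distinct contents exist among all files named README under a directory, using cksum; the function name count_distinct_readmes is hypothetical. A result of 1 means every copy has the same checksum and size.

```shell
# Hypothetical check: print the number of distinct contents among all files
# named README below a directory (compared by cksum CRC and byte count).
# Usage: count_distinct_readmes MIRROR_DIR
count_distinct_readmes() {
    find "$1" -type f -name README -exec cksum {} + |
        awk '{ print $1, $2 }' |
        sort -u |
        wc -l |
        tr -d ' '
}
```

For example, count_distinct_readmes /home1/software/modis/atteph/2002/110/EO1 would be expected to print 1 for the tree shown above.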