Re: -X regex syntax? (repost)
On Fri, 18 Feb 2005, "Jens Rösner" wrote:

} Hi Vince!
}
} > I did give -X"*backup" a try, and
} > it too didn't work for me. :(
}
} Does the -X"dir" work for you at all?
} If not, there might be a problem with MacOS.
} I hope one of the more knowledgeable people here
} can help you!

I've tried both on my Mac OS X box and Linux box, and the problem is still there. It appears that -X will work if you know the exact name and path of the directory, but that is not entirely the case with my problem.

I need to be able to not download any and all directories named .backup. There are many of them in different paths, so I figured I need some kind of regex as -X".backup" is not enough.

Does anyone here know what syntax wget's regex engine uses? I have not been able to find any documentation about it.

Thanks,
/vjl/
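Since -X is reported above to work when the exact path is given, one possible workaround is to list every known .backup path explicitly. The wget manual describes -X as taking a comma-separated list of directories whose elements may contain wildcards, that is, shell-style patterns rather than regular expressions. A minimal sketch, assuming the parent directories are known in advance; the paths and the build_exclude_list helper are hypothetical, and the script only prints the command instead of running it:

```shell
#!/bin/sh
# Build a comma-separated --exclude-directories value from known parent paths.
# build_exclude_list and the paths passed to it are illustrative assumptions.
build_exclude_list() {
    list=""
    for parent in "$@"; do
        # Drop one trailing slash, then append the .backup component.
        entry="${parent%/}/.backup"
        if [ -z "$list" ]; then
            list="$entry"
        else
            list="$list,$entry"
        fi
    done
    printf '%s\n' "$list"
}

exclude=$(build_exclude_list /dir/stuff /dir/stuff/a /dir/stuff/b)

# Print the resulting command for inspection; nothing is downloaded here.
echo "wget -r --no-parent --exclude-directories=\"$exclude\" http://example.com/dir/stuff/"
```

This does not scale to sites where the directory layout is unknown, which is the situation described in the message above, so it is only a stopgap.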
Re: -X regex syntax? (repost)
Hi Vince!

> I did give -X"*backup" a try, and
> it too didn't work for me. :(

Does the -X"dir" work for you at all?
If not, there might be a problem with MacOS.
I hope one of the more knowledgeable people here
can help you!

> However, I would like to confirm something dumb - will wget fetch these
> directories, regardless of what I put in --exclude-directories, but when
> it is done fetching the URL, will it then discard those directories?

As far as I can tell from a log file I just created, wget does not follow links into these directories, so no files are downloaded from them.

CU
Jens
Re: -X regex syntax? (repost)
On Thu, 17 Feb 2005, "Jens Rösner" wrote:

Hi Jens,

} Would -X"*backup" be OK for you?

It depends on how the leading wildcard is used - the actual name of the directories is ".backup", but they are in each directory [and yes, there is html in each page which refers to them, which is why I'm trying to avoid grabbing them in the first place].

I did give -X"*backup" a try, and it too didn't work for me. :(

} If yes, give it a try.
} If not, I think you'd need the correct escaping for the ".",
} but I have no idea how to do that, but
} http://mrpip.orcon.net.nz/href/asciichar.html
} lists
} %2E
} as the code. Does this work?

I gave that a try too [thanks!], but it still fetches the .backup directory:

--exclude-directories="%2Ebackup"

However, I would like to confirm something dumb - will wget fetch these directories, regardless of what I put in --exclude-directories, but when it is done fetching the URL, will it then discard those directories? The reason I ask is that each time I've tried doing this, I've interrupted the process with a ^C when I saw it fetching files from a .backup directory.

One of the goals, besides saving disk space, is to save bandwidth, so I'd ideally like wget never to fetch those directories to begin with.

Thanks for the tips, Jens!

/vjl/
Re: -X regex syntax? (repost)
Hi Vince!

> So, so far these don't work for me:
>
> --exclude-directories='*.backup*'
> --exclude-directories="*.backup*"
> --exclude-directories="*\.backup*"

Would -X"*backup" be OK for you?
If yes, give it a try.
If not, I think you'd need the correct escaping for the ".". I have no idea how to do that, but
http://mrpip.orcon.net.nz/href/asciichar.html
lists
%2E
as the code. Does this work?

CU
Jens

> I've also tried this on my linux box running v1.9.1 as well. Same results.
> Any other ideas?
>
> Thanks a lot for your tips, and quick reply!
>
> /vjl/
Re: -X regex syntax? (repost)
On Thu, 17 Feb 2005, "Jens Rösner" wrote:

Hi Jens!

} > tip or two with regards to using -X?
} I'll try!

Thanks - I do appreciate it!

} > wget -r --exclude-directories='*.backup*' --no-parent \
} > http://example.com/dir/stuff/
} Well, I am using wget under Windows and there, you
} have to use "exp", not 'exp', to make it work. The *x* works as expected.
} I could not test whether the . in your dir name causes any problem.

I tried it with double quotes, and I'm still seeing wget download files in the .backup directories. I've also tried escaping the "." with a "\" but that doesn't seem to work either. :(

So, so far these don't work for me:

--exclude-directories='*.backup*'
--exclude-directories="*.backup*"
--exclude-directories="*\.backup*"

I've also tried this on my linux box running v1.9.1 as well. Same results. Any other ideas?

Thanks a lot for your tips, and quick reply!

/vjl/
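A side note on the three attempts listed above: under a POSIX shell, the single-quoted and double-quoted forms hand wget the identical argument, and the backslash in the third form is passed through literally. A small sketch for checking what a program actually receives; show_arg is a hypothetical helper, and this says nothing about Windows cmd.exe quoting or about how wget then applies the pattern:

```shell
#!/bin/sh
# Print an argument exactly as the invoked program would receive it.
# show_arg is an illustrative helper, not part of wget.
show_arg() {
    printf '%s\n' "$1"
}

# Single and double quotes both suppress shell globbing of the asterisks.
show_arg --exclude-directories='*.backup*'
show_arg --exclude-directories="*.backup*"

# Inside double quotes, a backslash before "." is kept as a literal backslash.
show_arg --exclude-directories="*\.backup*"
```

The first two lines of output are the same string, so on Mac OS X or Linux switching between quote styles would not be expected to change wget's behaviour.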
Re: -X regex syntax? (repost)
Hi Vince!

> tip or two with regards to using -X?

I'll try!

> wget -r --exclude-directories='*.backup*' --no-parent \
> http://example.com/dir/stuff/

Well, I am using wget under Windows and there, you have to use "exp", not 'exp', to make it work. The *x* works as expected.
I could not test whether the . in your dir name causes any problem.

Good luck!
Jens (just another user)
-X regex syntax? (repost)
I hate to do this, but I am still stumped by this. Can anyone pass along a tip or two with regards to using -X?

Thanks,
/vjl/

[repost follows]:

Hi all,

I'm using GNU Wget 1.9.1 under Mac OS X, and I'm trying to confirm that I have the correct syntax for using the -X [or --exclude-directories] argument.

For example, I have a URL which I would like to wget with a -r. The URL contains many directories that are named ".backup". I do not wish to download those directories. The way I've been attempting to do that is as follows:

wget -r --exclude-directories='*.backup*' --no-parent \
  http://example.com/dir/stuff/

This does not appear to work. What is the proper syntax for wget's regex engine?

Thanks for any tips you can provide...

/vjl/