Voytek Eymont wrote:
> Micah,
>
> thanks !!!!!!
> I'm loging in OK.
>
> on next step I do like:
>
> wget --load-cookies=my-cookies.txt --save-cookies=my-cookies.txt
> --keep-session-cookies
> http://www.domain.tld/main.htm?_template=advanced&_module=active_list
>
> that fails until I put "" around the http string like so:
>
> wget --load-cookies=my-cookies.txt --save-cookies=my-cookies.txt
> --keep-session-cookies
> "http://www.domain.tld/main.htm?_template=advanced&_module=active_list"
>
> or should I use some '%' characters ? for & ? or just " " around https
> string ?
>

Just surround it with double (" ") or single (' ') quotes. There is no need to percent-encode the &; in fact, writing it as %26 would stop it from separating the two query parameters.

If the & is not quoted, your shell treats it as a command separator. It runs everything before the & (wget with the URL cut off after _template=advanced) in the background, and then treats _module=active_list as a second command, which assigns active_list to a shell variable called _module. (If there weren't an =, it would try to run a program called _module, which would give you an error message you could notice.)

> next question: the resulting file has lots and lots of bumpf like
> space.gif galore, etc,
>
> how do I make into text as much as possible, is there a wget function, or ?
>

Remove anything between < and >, then unescape the entities. That should give you quite clean text with minimal effort.
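
If it helps, here is a rough sketch of that two-step cleanup in Python. The language is just my choice for illustration (none of this is a wget feature), and the function name html_to_text is made up. It assumes the page has no <script> or <style> blocks and no literal > inside attribute values; those would need extra handling.

```python
import html
import re

# Matches anything from "<" up to the next ">", including across newlines.
TAG_RE = re.compile(r"<[^>]*>")


def html_to_text(page: str) -> str:
    """Strip tags, then unescape entities (hypothetical helper, not part of wget)."""
    # Step 1: remove the tags. A space is substituted so that words on
    # either side of a removed tag do not run together.
    without_tags = TAG_RE.sub(" ", page)
    # Step 2: turn entities such as &amp; or &lt; back into characters.
    # This must come after step 1, otherwise an escaped "&lt;b&gt;" would
    # become a real tag and be stripped.
    text = html.unescape(without_tags)
    # Collapse the runs of whitespace left behind by the removed tags.
    return re.sub(r"\s+", " ", text).strip()
```

You would read the file wget saved into a string and pass it to html_to_text(); all the space.gif image tags disappear in step 1 because they sit entirely between < and >.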
