Re: Wget scriptability
Micah Cowan wrote:
> Okay, so there's been a lot of thought in the past, regarding better extensibility features for Wget. Things like hooks for adding support for traversal of new Content-Types besides text/html, or adding some form of JavaScript support, or support for MetaLink. Also, support for being able to filter results pre- and post-processing by Wget: for example, being able to do some filtering on the HTML to change how Wget sees it before parsing for links, but without affecting the actual downloaded version; or filtering the links themselves to alter what Wget fetches.
>
> However, another thing that's been vaguely itching at me lately, is the fact that Wget's design is not particularly unix-y. Instead of doing one thing, and doing it well, it does a lot of things, some well, some not.

It does what various people needed. It wasn't an exercise in writing a unixy utility. It was a program that solved real problems for real people.

> But the thing everyone loves about Unix and GNU (and certainly the thing that drew me to them), is the bunch-of-tools-on-a-crazy-pipeline paradigm,

I have always hated that. With a passion.

> - The tools themselves, as much as possible, should be written in an easily-hackable scripting language. Python makes a good candidate. Where we want efficiency, we can implement modules in C to do the work.

At the time Wget was conceived, that was Tcl's mantra. It failed miserably. :-)

How about concentrating on the problems listed in your first paragraph (which is why I quoted it)? Could you show us how a bunch of shell tools would solve them? Or how a librarized Wget would solve them? Or how any other paradigm or architecture or whatever would solve them?

--
 .-.   .-.    Yes, I am an agent of Satan, but my duties are largely
(_  \ /  _)   ceremonial.
     |
     |        [EMAIL PROTECTED]
Re: More portability stuff [Re: gettext configuration]
Micah Cowan wrote:
> AFAIK, _no_ system supports POSIX 100%,

AIX and Solaris have certified POSIX support. That's for the latest, IEEE Std 1003.1-2001. More systems have certified POSIX support for the older POSIX release.

OTOH, all POSIX releases have optional parts which don't have to be implemented.
Re: Requests are always HTTP/1.0 ?!
Graham Leggett wrote:
> On Wed, May 2, 2007 9:16 am, Daniel Stenberg wrote:
>>> Host: kpic1 is a HTTP/1.1 feature. So this is non-sensical.
>> Some pre-1.1 servers have required this header, I don't see how the 1.0 spec forbids it and by using it you can utilize name-based virtual hosting so I disagree with your conclusion.
> HTTP/1.0 doesn't support name based virtual hosting. If wget works now, it's only working by accident.

It is not an either-or proposition. When a client sends HTTP/1.1 in a request, it is telling the server that it can correctly process any valid HTTP/1.1 response. Sending HTTP/1.0 doesn't mean that the client can't use HTTP/1.1 features (like the Host header). The client is merely asking the server to return a valid HTTP/1.0 response.

And the servers are doing exactly that. They also process the Host header as the HTTP/1.1 spec says they should, because they happen to support that feature. They don't have to be HTTP/1.1 compliant for that feature to work.
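To make the point above concrete, here is a minimal sketch of a request that announces HTTP/1.0 in the request line and still sends a Host header. The function name build_http10_request is made up for illustration and is not taken from the Wget source; the rejection of CR/LF in the arguments is an extra precaution of this sketch, not something discussed in the thread.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not Wget code): format an HTTP/1.0 GET request
   that still carries a Host header.
   Returns the request length, or -1 if the arguments are unusable or
   the request does not fit in buf. */
int build_http10_request(char *buf, size_t size,
                         const char *host, const char *path)
{
    int n;

    /* Refuse values that would break the request line or inject headers. */
    if (strpbrk(host, "\r\n") != NULL || strpbrk(path, "\r\n ") != NULL)
        return -1;

    n = snprintf(buf, size,
                 "GET %s HTTP/1.0\r\n"   /* ask for an HTTP/1.0 response */
                 "Host: %s\r\n"          /* lets the server pick a virtual host */
                 "\r\n",
                 path, host);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

Called with host "kpic1" and path "/", this yields the request line "GET / HTTP/1.0" followed by "Host: kpic1" and an empty line. A server that understands Host can use it for name-based virtual hosting while still answering with an HTTP/1.0 response.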
Re: No more Libtool (long)
Hrvoje Niksic wrote:
> Google doesn't show even nearly enough hits when you search for libtool sucks.

Because it's an understatement. :-)
Re: php site handling problem: PHPSESSIONID
Marcel Partap wrote:
> http://www.rwth-aachen.de/gruenderkolleg/spinoffs/firma.php?id=22PHPSESSID=7f0dd9f1ff08d9ce3acc2039577c60b1
> That PHPSESSID stuff sux. If I download it again, it will not overwrite old with new but make a new one. So I dema.. eeh WE need a command switch to kill those PHPSESSIONID's
> probably other scripts languages have similar behaviour...

PHPSESSID is just the default name for the variable which holds the session cookie. As far as PHP session support goes, it can be in the HTTP cookie header or embedded in the URLs.

I agree that removing session ids and various security related thingies from URLs would be a nice thing. But the variable name is per site, so the user interface is going to be clunky.
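For illustration only, stripping a session variable from a URL could look roughly like the sketch below. strip_query_param is a hypothetical helper, not an existing Wget function or option, and the caller has to supply the variable name precisely because it differs from site to site. The sketch assumes the usual "?a=b&c=d" query syntax and leaves any "#fragment" alone.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not Wget code): remove every "name" or
   "name=value" pair from the query string of url, in place.
   Returns the number of pairs removed. */
int strip_query_param(char *url, const char *name)
{
    size_t name_len = strlen(name);
    char *query = strchr(url, '?');
    char *frag = strchr(url, '#');
    char *out;          /* where the next kept pair is written */
    const char *in;     /* start of the pair being examined */
    int kept = 0, removed = 0;

    /* No query string, or the '?' only appears inside the fragment. */
    if (query == NULL || name_len == 0 || (frag != NULL && frag < query))
        return 0;

    out = query + 1;
    in = query + 1;
    while (*in != '\0' && *in != '#') {
        size_t len = strcspn(in, "&#");     /* length of this pair */
        int match = len >= name_len
                    && strncmp(in, name, name_len) == 0
                    && (len == name_len || in[name_len] == '=');
        if (match) {
            removed++;
        } else {
            if (kept++ > 0)
                *out++ = '&';
            memmove(out, in, len);
            out += len;
        }
        in += len;
        if (*in == '&')
            in++;
    }

    if (kept == 0)
        out = query;                        /* drop the dangling '?' */
    memmove(out, in, strlen(in) + 1);       /* keep "#fragment" and the NUL */
    return removed;
}
```

With the name "PHPSESSID", a URL such as http://example.com/firma.php?id=22&PHPSESSID=7f0dd9f1 (example.com is a placeholder) comes out as http://example.com/firma.php?id=22, so a repeated download maps to the same local file name.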
Re: string_t breaks compilation under Solaris
Hrvoje Niksic wrote:
> 2) Server messages printed by Wget in normal operation, such as the 200 Ok message. That one is printed just for the fun factor anyway, we could as well print just the response code. However, I don't see a problem with simply filtering out the non-ASCII's from the response code. People who put non-ASCII messages in server response lines won't be able to see them properly in Wget's output, but I honestly couldn't care less.

Those messages are using Latin 1. HTTP/1.0 worked that way and it was not changed in HTTP/1.1 for compatibility reasons. I don't expect that non-ASCII characters are used much, though. I've only seen the status messages translated to French, once.

The Warning header carries text intended for the user and it's the only such header, AFAIK. Again, text defaults to Latin 1 and the usual MIME-ish complications can be used for other character sets.
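As an alternative to dropping the non-ASCII bytes, a status message that is known to be Latin 1 can be re-encoded for a UTF-8 terminal, since every Latin 1 byte maps directly to the Unicode code point of the same value. The following is only a sketch of that idea. latin1_to_utf8 is a made-up name, not something in Wget, and it does not filter control characters.

```c
#include <stddef.h>

/* Hypothetical helper (not Wget code): convert a NUL-terminated Latin 1
   string to UTF-8. Bytes below 0x80 are copied as-is; bytes 0x80..0xFF
   become a two-byte UTF-8 sequence.
   Returns the number of bytes written (excluding the NUL), or
   (size_t)-1 if out is too small. */
size_t latin1_to_utf8(const char *in, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return (size_t)-1;

    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;
        size_t need = c < 0x80 ? 1 : 2;

        if (n + need >= out_size)           /* keep room for the NUL */
            return (size_t)-1;
        if (c < 0x80) {
            out[n++] = (char)c;
        } else {
            out[n++] = (char)(0xC0 | (c >> 6));     /* lead byte */
            out[n++] = (char)(0x80 | (c & 0x3F));   /* continuation byte */
        }
    }
    out[n] = '\0';
    return n;
}
```

For a French-looking message such as "Non trouv\xE9" (an invented example, not one quoted in the thread), the byte 0xE9 becomes the pair 0xC3 0xA9 and the rest is copied unchanged.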
Re: new string module
Jan Minar wrote:
> What's wrong with mbrtowc(3) and friends? The mysterious solution is probably to use wprintf(3) instead of printf(3). Couple of questions on #c on freenode would give you that answer.

Historically, wget source was written in a way which allowed one to compile it on really old systems. That would rule out C95 functions.

(I'm not advocating this approach, just answering the question.)
Re: Large Files Support for Wget
Hrvoje Niksic wrote:
> David Fritz [EMAIL PROTECTED] writes:
>> IIUC, GNU coreutils uses uintmax_t to store large numbers relating to the file system and prints them with something like this:
>>
>>   char buf[INT_BUFSIZE_BOUND (uintmax_t)];
>>   printf (_("The file is %s octets long.\n"), umaxtostr (size, buf));
>
> That's probably the most portable way to do it.

For the time being. However, in C99 %ju is the correct format for printing uintmax_t. There are systems which have uintmax_t, but don't have the j modifier, so the whole thing is a problem if you want to write a failsafe configure check. And there might be run-time problems, as well.

> * Change most (all?) occurrences of `long' in the code to `off_t'. Or should we go the next logical step and just use uintmax_t right away?

Just use off_t.
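The reason the coreutils-style approach sidesteps the %ju problem is that the number is turned into a string by hand and then printed with plain %s. A rough sketch of that technique follows. umax_to_str and UMAX_BUFSIZE are illustrative names and are not the gnulib umaxtostr / INT_BUFSIZE_BOUND definitions; the sketch also assumes <stdint.h> is available.

```c
#include <limits.h>
#include <stdint.h>

/* Room for every decimal digit of a uintmax_t plus the NUL. A decimal
   digit carries more than 3 bits, so (bits + 2) / 3 digits is a safe
   upper bound. */
#define UMAX_BUFSIZE ((sizeof(uintmax_t) * CHAR_BIT + 2) / 3 + 1)

/* Hypothetical helper: write value in decimal at the END of buf, which
   must hold UMAX_BUFSIZE bytes, and return a pointer to the first
   digit. No printf length modifier is involved. */
char *umax_to_str(uintmax_t value, char *buf)
{
    char *p = buf + UMAX_BUFSIZE - 1;

    *p = '\0';
    do {
        *--p = (char)('0' + value % 10);    /* emit the lowest digit */
        value /= 10;
    } while (value != 0);
    return p;
}
```

A non-negative off_t can then be printed along the lines of printf ("%s\n", umax_to_str ((uintmax_t) size, buf)) with char buf[UMAX_BUFSIZE], which keeps off_t as the storage type and confines the formatting question to one function.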
Re: bug in use index.html
Hrvoje Niksic wrote:
> The whole matter of conversion of / to /index.html on the file system is a hack. But I really don't know how to better represent empty trailing file name on the file system.

Another, for now rather limited, hack: on file systems which support some sort of file attributes, you can mark index.html as an unwanted child of an empty trailing file name. AFAIK, that should work at least on Solaris and Linux. Others will join the club one day, I hope.
Re: connect.c compilation problems on irix
Hrvoje Niksic wrote:
> access to structure elements *looks* like it's accessing an sa_len element, whereas in fact it's accessing a union. This is all very nice until you try to name a variable sa_len.

That's why dear standards reserve large chunks of the namespace. Something or other posixoid in nature probably reserves all identifiers starting with sa_ in case you include a certain header file.

And in case you defined identifiers ending with _t while including sys/types.h, you're violating the scriptures. :-)
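The kind of header trick being described can be sketched as follows. All names here (struct my_addr, ma_len, get_len) are invented for the example, and the layout is not copied from the IRIX headers; the point is only to show how a macro makes a nested union member look like a plain structure element.

```c
/* Hypothetical header fragment: the length lives inside a union, and a
   macro makes it look like a direct member of the structure. */
struct my_addr {
    union {
        struct {
            unsigned char len;
            unsigned char family;
        } generic;
        unsigned short family_only;
    } u;
};
#define ma_len u.generic.len

/* Reads like a plain member access, but the preprocessor rewrites it
   to a->u.generic.len before the compiler sees it. */
unsigned char get_len(const struct my_addr *a)
{
    return a->ma_len;
}

/* The catch: after the #define, a declaration such as
       int ma_len = 0;
   is rewritten to "int u.generic.len = 0;" and no longer compiles. */
```

This is why a local variable with an innocent-looking name can suddenly fail to compile on one platform only, and why a whole identifier prefix is set aside for the headers that play such games.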