Re: Wget scriptability

2008-08-03 Thread Dražen Kačar
Micah Cowan wrote:

 Okay, so there's been a lot of thought in the past, regarding better
 extensibility features for Wget. Things like hooks for adding support
 for traversal of new Content-Types besides text/html, or adding some
 form of JavaScript support, or support for MetaLink. Also, support for
 being able to filter results pre- and post-processing by Wget: for
 example, being able to do some filtering on the HTML to change how Wget
 sees it before parsing for links, but without affecting the actual
 downloaded version; or filtering the links themselves to alter what Wget
 fetches.

 However, another thing that's been vaguely itching at me lately, is the
 fact that Wget's design is not particularly unix-y. Instead of doing one
 thing, and doing it well, it does a lot of things, some well, some not.

It does what various people needed. It wasn't an exercise in writing a
unixy utility. It was a program that solved real problems for real
people.

 But the thing everyone loves about Unix and GNU (and certainly the thing
 that drew me to them), is the bunch-of-tools-on-a-crazy-pipeline
 paradigm,

I have always hated that. With a passion.

  - The tools themselves, as much as possible, should be written in an
 easily-hackable scripting language. Python makes a good candidate. Where
 we want efficiency, we can implement modules in C to do the work.

At the time Wget was conceived, that was Tcl's mantra. It failed
miserably. :-)

How about concentrating on the problems listed in your first paragraph
(which is why I quoted it)? Could you show us how a bunch of shell
tools would solve them? Or how a librarized Wget would solve them? Or how
any other paradigm or architecture or whatever would solve them?

-- 
 .-.   .-.Yes, I am an agent of Satan, but my duties are largely
(_  \ /  _)   ceremonial.
 |
 |[EMAIL PROTECTED]


Re: More portability stuff [Re: gettext configuration]

2007-10-29 Thread Dražen Kačar
Micah Cowan wrote:

 AFAIK, _no_ system supports POSIX 100%,

AIX and Solaris have certified POSIX support. That's for the latest,
IEEE Std 1003.1-2001. More systems have certified POSIX support for the
older POSIX release.

OTOH, all POSIX releases have optional parts which don't have to be
implemented.



Re: Requests are always HTTP/1.0 ?!

2007-05-04 Thread Dražen Kačar
Graham Leggett wrote:
 On Wed, May 2, 2007 9:16 am, Daniel Stenberg wrote:
 
  Host: kpic1 is a HTTP/1.1 feature. So this is non-sensical.
 
  Some pre-1.1 servers have required this header, I don't see how the 1.0
  spec
  forbids it and by using it you can utilize name-based virtual hosting so I
  disagree with your conclusion.
 
 HTTP/1.0 doesn't support name based virtual hosting. If wget works now,
 it's only working by accident.

It is not an either-or proposition.

When a client sends HTTP/1.1 in a request it is telling the server that
it can correctly process any valid HTTP/1.1 response. Sending HTTP/1.0
doesn't mean that the client can't use HTTP/1.1 features (like the Host
header). The client is merely asking the server to return a valid
HTTP/1.0 response.

And the servers are doing exactly that. They also process the Host
header as the HTTP/1.1 spec says they should because they happen to support
that feature. They don't have to be HTTP/1.1 compliant for that feature
to work.
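
To make the point concrete, here is a minimal sketch of a request that
announces HTTP/1.0 and still carries a Host header. The function name
build_http10_request and the host and path values are made up for
illustration; this is not how Wget itself assembles requests.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not taken from Wget's source: formats a minimal
   request that announces HTTP/1.0 yet still sends a Host header, so a
   server doing name-based virtual hosting can pick the right site.
   Returns the request length, or -1 if buf is too small. */
int
build_http10_request (char *buf, size_t size,
                      const char *host, const char *path)
{
  assert (buf != NULL && host != NULL && path != NULL);

  int n = snprintf (buf, size,
                    "GET %s HTTP/1.0\r\n"   /* ask for an HTTP/1.0 response */
                    "Host: %s\r\n"          /* header defined by HTTP/1.1 */
                    "\r\n",                 /* blank line ends the headers */
                    path, host);

  /* snprintf reports the length it wanted; treat truncation as failure. */
  if (n < 0 || (size_t) n >= size)
    return -1;
  return n;
}
```

Calling it with a 128-byte buffer, "example.org" and "/index.html" should
yield a request line, a Host line and the terminating blank line.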



Re: No more Libtool (long)

2005-06-27 Thread Dražen Kačar
Hrvoje Niksic wrote:

 Google doesn't show even nearly enough hits when you search for "libtool
 sucks".

Because it's an understatement. :-)



Re: php site handling problem: PHPSESSIONID

2005-03-11 Thread Dražen Kačar
Marcel Partap wrote:

 http://www.rwth-aachen.de/gruenderkolleg/spinoffs/firma.php?id=22&PHPSESSID=7f0dd9f1ff08d9ce3acc2039577c60b1
 That PHPSESSID stuff sux. If I download it again, it will not overwrite 
 old with new but make a new one. So I dema.. eeh WE need a command 
 switch to kill those PHPSESSIONID's; probably other script languages have 
 similar behaviour...

PHPSESSID is just the default name for the variable which holds the session
cookie. As far as PHP session support goes, it can be in the HTTP cookie
header or embedded in the URLs. I agree that removing session ids and
various security related thingies from URLs would be a nice thing.

But the variable name is per site, so the user interface is going to be
clunky.
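
For what it's worth, the stripping itself is the easy part. A rough sketch,
assuming the caller supplies the parameter name (since it differs per site)
and that the URL has no #fragment part; strip_query_param is a hypothetical
helper, not an existing Wget function or option:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: removes every "name=value" pair called `name`
   from the query part of `url`, editing the string in place.
   Assumes the URL carries no #fragment after the query. */
void
strip_query_param (char *url, const char *name)
{
  assert (url != NULL && name != NULL);

  size_t name_len = strlen (name);
  char *query = strchr (url, '?');
  if (query == NULL || name_len == 0)
    return;                               /* nothing to filter */

  char *p = query + 1;
  while (*p != '\0')
    {
      size_t len = strcspn (p, "&");      /* length of the current pair */
      if (len > name_len
          && strncmp (p, name, name_len) == 0
          && p[name_len] == '=')
        {
          /* Drop the pair together with the '&' that follows it, if any. */
          char *next = p + len;
          if (*next == '&')
            next++;
          memmove (p, next, strlen (next) + 1);
        }
      else
        {
          /* Keep this pair and step over the separator. */
          p += len;
          if (*p == '&')
            p++;
        }
    }

  /* Tidy up a separator left dangling at the end of the URL. */
  size_t end = strlen (url);
  if (end > 0 && url[end - 1] == '&')
    url[--end] = '\0';
  if (end > 0 && url[end - 1] == '?')
    url[--end] = '\0';
}
```

The clunky part remains the interface: somebody still has to tell the
program which name to pass for each site.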



Re: string_t breaks compilation under Solaris

2005-02-21 Thread Dražen Kačar
Hrvoje Niksic wrote:

 2) Server messages printed by Wget in normal operation, such as the
    "200 Ok" message.  That one is printed just for the fun factor
    anyway, we could as well print just the response code.  However, I
    don't see a problem with simply filtering out the non-ASCII's from
    the response code.  People who put non-ASCII messages in server
    response lines won't be able to see them properly in Wget's output,
    but I honestly couldn't care less.

Those messages are using Latin 1. HTTP/1.0 worked that way and it was not
changed in HTTP/1.1 for compatibility reasons. I don't expect that
non-ASCII characters are used much, though. I've only seen the status
messages translated to French, once.

The Warning header carries text intended for the user and it's the only
such header, AFAIK. Again, text defaults to Latin 1 and the usual MIME-ish
complications can be used for other character sets.
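
The filtering proposed in the quoted text could look roughly like the
following. This is only a sketch: sanitize_reason_phrase is a made-up name,
and replacing each offending byte with '?' (rather than dropping it or
transcoding from Latin 1) is an assumption, not something anyone in this
thread specified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: overwrites every byte outside printable ASCII
   (0x20..0x7e) with '?', so a Latin 1 status message cannot garble the
   terminal.  Returns the number of bytes that were replaced. */
size_t
sanitize_reason_phrase (char *text)
{
  assert (text != NULL);

  size_t replaced = 0;
  for (unsigned char *p = (unsigned char *) text; *p != '\0'; p++)
    if (*p < 0x20 || *p > 0x7e)
      {
        *p = '?';
        replaced++;
      }
  return replaced;
}
```

A Latin 1 phrase with one accented letter would come back with a single
'?' in its place and a return value of 1.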



Re: new string module

2005-01-05 Thread Dražen Kačar
Jan Minar wrote:

 What's wrong with mbrtowc(3) and friends?  The mysterious solution is
 probably to use wprintf(3) instead of printf(3).  Couple of questions on #c
 on freenode would give you that answer.

Historically, wget source was written in a way which allowed one to
compile it on really old systems. That would rule out C95 functions.

(I'm not advocating this approach, just answering the question.)



Re: Large Files Support for Wget

2004-05-10 Thread Dražen Kačar
Hrvoje Niksic wrote:
 David Fritz [EMAIL PROTECTED] writes:
 
  IIUC, GNU coreutils uses uintmax_t to store large numbers relating to
  the file system and prints them with something like this:
 
 char buf[INT_BUFSIZE_BOUND (uintmax_t)];
 printf (_("The file is %s octets long.\n"), umaxtostr (size, buf));
 
 That's probably the most portable way to do it.

For the time being. However, in C99 %ju is the correct format for printing
uintmax_t. There are systems which have uintmax_t, but don't have the j
modifier, so the whole thing is a problem if you want to write a failsafe
configure check. And there might be run-time problems, as well.
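
A sketch of the do-it-yourself conversion, which sidesteps the j modifier
altogether. It is in the spirit of the umaxtostr() call quoted above but is
not the coreutils code; the names umax_to_str and UMAX_BUFSIZE are invented
here, and it assumes <stdint.h> is available to provide uintmax_t.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Room for every decimal digit of a uintmax_t plus the terminating NUL.
   Three bits never encode more than one decimal digit, so bits/3 + 1
   digits is a safe (slightly generous) upper bound. */
#define UMAX_BUFSIZE (sizeof (uintmax_t) * CHAR_BIT / 3 + 2)

/* Hypothetical stand-in for umaxtostr(): writes the decimal form of
   `value` at the END of buf and returns a pointer to the first digit.
   No printf format is involved, so a missing %ju does not matter. */
char *
umax_to_str (uintmax_t value, char *buf, size_t size)
{
  assert (buf != NULL && size >= UMAX_BUFSIZE);

  char *p = buf + size - 1;
  *p = '\0';
  do
    {
      *--p = (char) ('0' + value % 10);   /* least significant digit first */
      value /= 10;
    }
  while (value != 0);
  return p;
}
```

The result is then printed with plain %s, as in the quoted snippet; note
that the returned pointer is generally not the start of buf.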

 * Change most (all?) occurrences of `long' in the code to `off_t'.  Or
   should we go the next logical step and just use uintmax_t right
   away?

Just use off_t.



Re: bug in use index.html

2004-03-04 Thread Dražen Kačar
Hrvoje Niksic wrote:
 The whole matter of conversion of / to /index.html on the file
 system is a hack.  But I really don't know how to better represent
 empty trailing file name on the file system.

Another, for now rather limited, hack: on file systems which support some
sort of file attributes you can mark index.html as an unwanted child of an
empty trailing file name. AFAIK, that should work at least on Solaris and
Linux. Others will join the club one day, I hope.



Re: connect.c compilation problems on irix

2003-12-01 Thread Dražen Kačar
Hrvoje Niksic wrote:

 access to structure elements *looks* like it's accessing an sa_len
 element, whereas in fact it's accessing a union.  This is all
 very nice until you try to name a variable sa_len.

That's why dear standards reserve large chunks of the namespace. Something
or other posixoid in nature probably reserves all identifiers starting
with sa_ in case you include a certain header file.

And if you define identifiers ending with _t while including
sys/types.h, you're violating the scriptures. :-)
