Norman Ramsey sent a note to David and me today, in which he basically fumed about how complicated the parser options are, and suggested we (1) define a language which would implement a predicate calculus for describing which pages to pluck and which to leave, and (2) use a visualization system which would show us (somehow) a picture of what had been plucked (presumably embedded in a larger space of some sort).
Both interesting ideas, though perhaps not for Plucker. Incidentally, I deleted his note, or I'd forward it along.

However, it got me thinking about the plucker-build command line. (Those of you who don't use command lines can stop reading right now :-) Would it be easier to use Plucker, cognitively or otherwise, if we could, in a simple situation, just say

    plucker-build foo.txt >foo.pdb

or

    plucker-build http://www.iana.org/assignments/character-sets >csets.pdb

Sure seems like it to me. "Make the simple easy to do", as they say.

So I tried modifying the parser to allow this along with the current behavior, which took just a small amount of work. I like it! I'd like to check this in, but thought I'd see what people thought before doing so.

A couple of notes:

- If no 'doc_file' is given, the DB is written to stdout, and verbosity is set to zero. (Bit of an issue there; we send a lot of status stuff to stdout right now that should go to stderr.)

- Only one argument is allowed.

- If the argument begins with http: or file:, that URL is used as the home_url. If it doesn't, the parser checks for a file with the same name as the argument, and if that file exists, a file: is prepended to create the home_url. If the file doesn't exist, an error is reported.

- If we write the document to stdout AND no doc_name has been specified or found, the home URL is used as the doc name.

Bill
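
For readers who want the argument rules from the notes spelled out, here is a rough Python sketch. It is not the actual parser change; the function names resolve_home_url and default_doc_name are hypothetical, and the exact error type and message are assumptions.

```python
import os


def resolve_home_url(arg):
    """Map the single command-line argument to a home URL.

    Hypothetical helper: a sketch of the rules described in the notes,
    not the code in the Plucker parser.
    """
    # Arguments that already start with http: or file: are used unchanged.
    if arg.startswith("http:") or arg.startswith("file:"):
        return arg
    # Otherwise, look for a local file with that name and prepend "file:".
    if os.path.isfile(arg):
        return "file:" + arg
    # Assumption: the real parser may report this error differently.
    raise ValueError("no such file: %s" % arg)


def default_doc_name(doc_name, home_url):
    """When writing to stdout with no doc name, fall back to the home URL."""
    return doc_name if doc_name else home_url
```

Under these assumptions, resolve_home_url("foo.txt") yields "file:foo.txt" when foo.txt exists in the current directory, and an http: URL passes through untouched.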
