On 2015-02-01, at 22:06, Jörg Weger <joerg73....@googlemail.com> wrote:
> Is the character count “wc --char <textfile>” returns with or without
> blank spaces? (Which is important for me.) “man wc” doesn’t talk about that.
>
> I had hoped there was a better way than to edit the result of
> “pdftotext” in my text editor or in libreoffice writer (deleting
> unnecessary carriage returns and spaces by searching for regular
> expressions) which are able to do the count I need. In fact I had hoped
> that ConTeXt was able to count the characters and spaces it renders to
> PDF (is that theoretically possible?) …

I am pretty sure that you can make sed filter out blank characters.  So
then you can just chain pdftotext, sed and wc.

OTOH, here's a relevant question (and a simple answer) on SO.  (It seems
to count newlines, though.)

JFF, I've just coded this in Emacs Lisp:

--8<---------------cut here---------------start------------->8---
;; Count non-blank characters in a buffer
(defun how-many-visible-chars ()
  "Count visible (i.e., other than spaces, tabs and newlines) characters in the buffer."
  (interactive)
  (let ((count 0))
    (save-excursion
      (goto-char (point-min))
      (while (not (eobp))
        (unless (looking-at-p "[ \t\n]")
          (setq count (1+ count)))
        (forward-char)))
    (message "%d visible characters" count)))
--8<---------------cut here---------------end--------------->8---

It's terribly unoptimized, but I ran it on a 300+ kB file on my low-end
netbook and it ran in something like 2 seconds, so it's not that bad in
practice.  Also, it's not well-coded: it should e.g. return the number
instead of displaying the message when called non-interactively, it
might take active region into account etc. - but as a proof-of-concept,
it works surprisingly well (i.e., fast).
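
Coming back to the pipeline idea above, here is a minimal sketch under a
few assumptions.  It uses tr rather than sed, because sed works line by
line and would leave the newlines in for wc to count.  The function name
count_visible and the file name input.pdf are made up for illustration.
Note that "wc -m" on its own counts every character, blanks and newlines
included.

```shell
# Sketch only: count_visible is a hypothetical helper name, not an existing tool.
# It reads text on standard input and prints the number of non-whitespace
# characters.
#   tr -d '[:space:]'  deletes spaces, tabs, newlines, carriage returns
#                      and form feeds
#   wc -m              counts the characters that remain, according to
#                      the current locale
count_visible() {
    tr -d '[:space:]' | wc -m
}

# Assumed usage (input.pdf is a placeholder; "-" makes pdftotext write
# to standard output):
#   pdftotext input.pdf - | count_visible
```

Non-breaking spaces and other non-ASCII blanks may survive the tr step,
depending on the tr implementation and the locale, so the result is best
treated as approximate for such texts.  Some wc implementations also pad
the number with leading spaces.
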
> Greetings Jörg

Best,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University