Re: [O] inter-word space in org -> latex

2015-09-22 Thread Rasmus
Marcin Borkowski  writes:

> ;; Convert single spaces after periods etc. to "\ " when exporting to LaTeX
>
> (defun my-latex-filter-nonfrenchspacing (text backend info)
>   "Convert single spaces after dots to \"\\ \"."
>   (when (and (org-export-derived-backend-p backend 'latex)
>              sentence-end-double-space)
>     (replace-regexp-in-string
>      (concat "\\(" sentence-end-base "\\)"
>              "[ \u00a0]\\([^ \t\u00a0\n]\\)")
>      "\\1\\\\ \\2" text)))
>
> (add-to-list 'org-export-filter-plain-text-functions
>              'my-latex-filter-nonfrenchspacing)
>
> It is a bit simplistic (after all, I wrote it just now in 15 minutes),
> but it seems to work fine.  It makes a few assumptions, though.  One of
> them is that you don't mess with sentence-end-base too much: I assumed
> that there are no non-shy groups there.  (By default there are not, and
> I don't see any reason for them to be there, but what do I know.)  Also,
> I assume that for the period to /not/ end the sentence, it should be
> followed by one space and something non-spacey.

Here's an alternative implementation with other limitations, e.g. it
only looks at [A-Z] for capitals:

   http://permalink.gmane.org/gmane.emacs.orgmode/101176

> Also, note that while Emacs' way of differentiating between
> a sentence-ending period and a non-sentence-ending period is fairly
> simple, (La)TeX's rules are a bit more complicated (look up "space
> factor" in The TeXbook).  For instance, LaTeX assumes that a period
> after a capital letter /never/ ends the sentence, and you have to use \@
> before such a period to change that.  The algorithm TeX uses is really
> clever, and can be (ab)used in funny ways to do funny stuff in
> low-level, hackish TeX ways (been there, done that - for instance, when
> I once reimplemented the theorem-like environments, I used space factor
> to make sure that if a theorem begins with an enumeration, it looks
> fine.  The "standard" LaTeX implementation of theorem-like environments
> is kind of crazy, even if it works in typical cases.  Try typesetting
> a theorem with a long optional argument in a narrow column and see what
> happens, for instance.).

But isn't a lot of the cruft from TeX "fixed" in LaTeX?  E.g. I believe
the correct space is automatically used after emphasis.


> TL;DR: just use \frenchspacing.  Everyone will be happier.

Or not.

-- 
The Kids call him Billy the Saint







Re: [O] inter-word space in org -> latex

2015-09-22 Thread Marcin Borkowski

On 2015-09-22, at 22:17, Marcin Borkowski  wrote:

> On 2015-09-14, at 16:42, Dan Griswold  wrote:
>
>> How should I mark in org mode that I want a space following a period
>> concluding an abbreviation to be seen by LaTeX as an interword space?
>
> #+LATEX_HEADER: \frenchspacing
>
> and never worry again.
>
> OTOH, it would be relatively easy to write a filter which converts Emacs
> rules wrt. spaces (single/double space) into LaTeX ones.

And here it is:

--8<---cut here---start->8---
;; Convert single spaces after periods etc. to "\ " when exporting to LaTeX

(defun my-latex-filter-nonfrenchspacing (text backend info)
  "Convert single spaces after dots to \"\\ \"."
  (when (and (org-export-derived-backend-p backend 'latex)
             sentence-end-double-space)
    (replace-regexp-in-string
     (concat "\\(" sentence-end-base "\\)"
             "[ \u00a0]\\([^ \t\u00a0\n]\\)")
     ;; "\\\\" in the replacement yields one literal backslash.
     "\\1\\\\ \\2" text)))

(add-to-list 'org-export-filter-plain-text-functions
             'my-latex-filter-nonfrenchspacing)
--8<---cut here---end--->8---

It is a bit simplistic (after all, I wrote it just now in 15 minutes),
but it seems to work fine.  It makes a few assumptions, though.  One of
them is that you don't mess with sentence-end-base too much: I assumed
that there are no non-shy groups there.  (By default there are not, and
I don't see any reason for them to be there, but what do I know.)  Also,
I assume that for the period to /not/ end the sentence, it should be
followed by one space and something non-spacey.
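
The substitution described above can be tried outside of Org's export
machinery.  What follows is only a sketch: the helper name is
hypothetical, and a simplified, ASCII-only stand-in for
sentence-end-base is hard-coded (an assumption) so that the result does
not depend on local settings.

```elisp
;; Hypothetical helper for illustration; not part of Org.
(defun my-sketch-single-space-to-control-space (text)
  "Turn a single space after sentence-ending punctuation in TEXT into \"\\ \"."
  ;; Assumption: a simplified, ASCII-only stand-in for `sentence-end-base'.
  (let ((end-base "[.?!][]\"')}]*"))
    (replace-regexp-in-string
     (concat "\\(" end-base "\\)[ \u00a0]\\([^ \t\u00a0\n]\\)")
     "\\1\\\\ \\2"   ; group 1, one literal backslash, a space, group 2
     text)))

(my-sketch-single-space-to-control-space
 "Mr. Higgins met Dr. Pickering.  They talked.")
;; => "Mr.\\ Higgins met Dr.\\ Pickering.  They talked."
```

The double space after "Pickering." is left alone, so LaTeX still sees
an ordinary sentence end there.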

Probably the biggest drawback is that non-breaking spaces get converted
to breaking ones.  However, if you care about those, you can always
use the filter given in the example for filters in the Org manual, and
make sure it runs before the above one.  (TeX considers tildes as
normal-sized spaces.)

Also, note that while Emacs' way of differentiating between
a sentence-ending period and a non-sentence-ending period is fairly
simple, (La)TeX's rules are a bit more complicated (look up "space
factor" in The TeXbook).  For instance, LaTeX assumes that a period
after a capital letter /never/ ends the sentence, and you have to use \@
before such a period to change that.  The algorithm TeX uses is really
clever, and can be (ab)used in funny ways to do funny stuff in
low-level, hackish TeX ways (been there, done that - for instance, when
I once reimplemented the theorem-like environments, I used space factor
to make sure that if a theorem begins with an enumeration, it looks
fine.  The "standard" LaTeX implementation of theorem-like environments
is kind of crazy, even if it works in typical cases.  Try typesetting
a theorem with a long optional argument in a narrow column and see what
happens, for instance.).

TL;DR: just use \frenchspacing.  Everyone will be happier.

Hth,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University



Re: [O] inter-word space in org -> latex

2015-09-22 Thread Marcin Borkowski

On 2015-09-14, at 16:42, Dan Griswold  wrote:

> Dear org-mode community,
>
> In LaTeX, a space after a period is treated as an inter-sentence space,
> which is wider than an inter-word space. This can lead to overly wide
> spacing after a period that ends an abbreviation rather than a sentence. The
> way to cover this in LaTeX is to use a backslash prior to the space, as in:
>
> Mr.\ Henry Higgins.
>
> I have some documents in org that have the same issue: periods concluding
> abbreviations, with the result that LaTeX puts more space than I want after
> the abbreviation. Yet the use of "\ " does not work, as the backslash is
> exported to LaTeX as a literal backslash.
>
> How should I mark in org mode that I want a space following a period
> concluding an abbreviation to be seen by LaTeX as an interword space?

#+LATEX_HEADER: \frenchspacing

and never worry again.

OTOH, it would be relatively easy to write a filter which converts Emacs
rules wrt. spaces (single/double space) into LaTeX ones.

BTW: Bringhurst claims that using larger spaces at the end of the
sentence is Bad Style™.

> Thanks,
>
> Dan


-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University



Re: [O] inter-word space in org -> latex

2015-09-16 Thread Dan Griswold
On Mon, Sep 14, 2015 at 12:54 PM, Suvayu Ali 
wrote:

> You can use entities.  Maybe something like this:
>
> (setq org-entities-user ; can also use "\ "
>   '(("space" "~" nil " " " " " " " ")))
>
> Then the following:
>
>   Mr.\space{}Henry Higgins.
>


That's also a nice solution. Thanks.


Re: [O] inter-word space in org -> latex

2015-09-16 Thread Dan Griswold
On Mon, Sep 14, 2015 at 12:50 PM, Rasmus  wrote:

> > How should I mark in org mode that I want a space following a period
> > concluding an abbreviation to be seen by LaTeX as an interword space?
>
> .@@latex:\ @@


Pretty neat. Thanks.


Re: [O] inter-word space in org -> latex

2015-09-14 Thread Suvayu Ali
On Mon, Sep 14, 2015 at 10:42:25AM -0400, Dan Griswold wrote:
> Dear org-mode community,
> 
> In LaTeX, a space after a period is treated as an inter-sentence space,
> which is wider than an inter-word space. This can lead to overly wide
> spacing after a period that ends an abbreviation rather than a sentence. The
> way to cover this in LaTeX is to use a backslash prior to the space, as in:
> 
> Mr.\ Henry Higgins.
> 
> I have some documents in org that have the same issue: periods concluding
> abbreviations, with the result that LaTeX puts more space than I want after
> the abbreviation. Yet the use of "\ " does not work, as the backslash is
> exported to LaTeX as a literal backslash.
> 
> How should I mark in org mode that I want a space following a period
> concluding an abbreviation to be seen by LaTeX as an interword space?

You can use entities.  Maybe something like this:

(setq org-entities-user ; can also use "\ "
  '(("space" "~" nil " " " " " " " ")))

Then the following:

  Mr.\space{}Henry Higgins.

exports as:

  Mr.~Henry Higgins.

Hope this helps,

-- 
Suvayu

Open source is the future. It sets us free.



Re: [O] inter-word space in org -> latex

2015-09-14 Thread Rasmus
Hi Dan,

Dan Griswold  writes:

> Dear org-mode community,
>
> In LaTeX, a space after a period is treated as an inter-sentence space,
> which is wider than an inter-word space. This can lead to overly wide
> spacing after a period that ends an abbreviation rather than a sentence. The
> way to cover this in LaTeX is to use a backslash prior to the space, as in:
>
> Mr.\ Henry Higgins.
>
> I have some documents in org that have the same issue: periods concluding
> abbreviations, with the result that LaTeX puts more space than I want after
> the abbreviation. Yet the use of "\ " does not work, as the backslash is
> exported to LaTeX as a literal backslash.
>
> How should I mark in org mode that I want a space following a period
> concluding an abbreviation to be seen by LaTeX as an interword space?

.@@latex:\ @@

Though in practice I type a double space after full sentences, and use
the filter below for a single space followed by a lowercase letter.

(defun rasmus/org-latex-filter-nobreaks-double-space (text backend info)
  "Tries to export \"S1. S2\" as \"S1.\\ S2\",
   while letting \"S1.  S2\" be exported without tilde"
  ;; TODO: error with this output:
  ;; [[file:nasty dir/Screenshot. from 2015-03-05 19:05:00.png]]
  (when (and text (org-export-derived-backend-p backend 'latex))
    (let ((preamble (or (string-match-p "begin{document}" text) 0))
          (case-fold-search nil))
      (concat (substring text 0 preamble)
              (replace-regexp-in-string "\\. \\([^ A-Z\n]\\)"
                                        ".\\\\ \\1"
                                        (substring text preamble))))))

(add-to-list 'org-export-filter-final-output-functions
 'rasmus/org-latex-filter-nobreaks-double-space)
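
To see what this kind of final-output filter does, and where it stops,
here is a standalone sketch.  The helper name is hypothetical and it
takes a plain string instead of Org's (text backend info) arguments; the
sample text is made up.

```elisp
;; Hypothetical helper for illustration; not part of Org.
(defun my-sketch-escape-after-preamble (text)
  "In TEXT, turn \". x\" into \".\\ x\" after \"begin{document}\" only."
  (let ((body-start (or (string-match-p "begin{document}" text) 0))
        (case-fold-search nil))   ; keep [^ A-Z\n] case-sensitive
    (concat (substring text 0 body-start)
            (replace-regexp-in-string "\\. \\([^ A-Z\n]\\)"
                                      ".\\\\ \\1"
                                      (substring text body-start)))))

(my-sketch-escape-after-preamble
 "% e.g. a comment\n\\begin{document}\nSee e.g. this.  Ask Mr. Smith.\n")
;; => "% e.g. a comment\n\\begin{document}\nSee e.g.\\ this.  Ask Mr. Smith.\n"
```

The preamble is untouched and "e.g. this" gets its "\ ", but "Mr. Smith"
does not, because the character after the space is a capital.  That is
the [A-Z] limitation.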

Or,

(defcustom rasmus/org-latex-unicode-to-tex  '((" " "~")
(" " "\\,")
("​" ""))
  "list of re rep pairs which are replaced during latex export")

(require 'cl-lib)  ; for `cl-loop'

(defun rasmus/org-latex-unicode-to-tex (text backend info)
  "Replace unicode strings with their TeX equivalents.

  Currently:  ' ' (no break space) to '~'
  ' ' (thin space) to '\,'
  '​'  (zero width space) to ''."
  (when (org-export-derived-backend-p backend 'latex)
    (cl-loop for (re rep) in rasmus/org-latex-unicode-to-tex do
             (setq text (replace-regexp-in-string re rep text t t)))
    text))

(add-to-list 'org-export-filter-final-output-functions
 'rasmus/org-latex-unicode-to-tex)
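
The replacement loop can be checked on a plain string.  This is a
sketch with a hypothetical helper name; the three characters are
written as \u escapes on the assumption that they are the no-break
space, thin space and zero width space named in the docstring.

```elisp
(require 'cl-lib)  ; for `cl-loop'

;; Hypothetical helper for illustration; not part of Org.
(defun my-sketch-unicode-to-tex (text)
  "Replace no-break, thin and zero-width spaces in TEXT with TeX equivalents."
  ;; Assumption: these escapes are the characters named in the docstring.
  (let ((pairs '(("\u00a0" "~")      ; NO-BREAK SPACE
                 ("\u2009" "\\,")    ; THIN SPACE
                 ("\u200b" ""))))    ; ZERO WIDTH SPACE
    (cl-loop for (re rep) in pairs do
             ;; FIXEDCASE and LITERAL are both t, so "\\," goes in verbatim.
             (setq text (replace-regexp-in-string re rep text t t)))
    text))

(my-sketch-unicode-to-tex "Mr.\u00a0Higgins walked 10\u2009km non\u200bstop")
;; => "Mr.~Higgins walked 10\\,km nonstop"
```

Passing t for LITERAL matters here: without it, the backslash in "\\,"
would be interpreted as a replacement escape.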


Rasmus

-- 
I hear there's rumors on the, uh, Internets. . .