[NTG-context] Sanitize in XML/XHTML documents

2016-02-02 Thread Andreas Schneider
Hi,

I hope I don't just overlook some plain obvious solution, but a longer search 
in the Wiki and the mailing list (and the context source) didn't come up with 
anything too useful 

Anyway: is there any mechanism that I can use to "fix" quotations while 
typesetting XML documents?
Pandoc produces &quot;some text&quot;. In TeX I would usually use 
\quotation{some text} to have proper, language-dependent quotes.

I tried to intercept &quot; using \xmltexentity{quot}{...} but apparently the 
internal replacement takes precedence here. Otherwise I would have tried to 
build a small state machine that remembers whether it is currently inside a 
quotation or not.

My next try would be to somehow intercept the XML stream (or flush) using Lua 
and replace " ... " inline. However, that all seems quite hacky.

So, does ConTeXt currently offer anything I can piggyback to get quotations set 
properly? :-)

Best regards,
Andreas
___
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki : http://contextgarden.net
___

Re: [NTG-context] Sanitize in XML/XHTML documents

2016-02-02 Thread massifr
> Pandoc produces &quot;some text&quot;. In TeX I would usually use 
> \quotation{some text} to have proper, language-dependent quotes.
Are you sure that Pandoc can produce &quot; only? Have you tried the 
--html-q-tags option?

Greetings,
Massi

Re: [NTG-context] Sanitize in XML/XHTML documents

2016-02-02 Thread Hans Hagen

On 2/2/2016 11:32 AM, Andreas Schneider wrote:

> Hi,
>
> I hope I don't just overlook some plain obvious solution, but a longer search 
> in the Wiki and the mailing list (and the context source) didn't come up with 
> anything too useful
>
> Anyway: is there any mechanism that I can use to "fix" quotations while 
> typesetting XML documents?
> Pandoc produces &quot;some text&quot;. In TeX I would usually use 
> \quotation{some text} to have proper, language-dependent quotes.


To me that looks like a pandoc bug: how is a backend supposed to deal 
with left/right quotation marks?



> I tried to intercept &quot; using \xmltexentity{quot}{...} but apparently the 
> internal replacement takes precedence here. Otherwise I would have tried to build 
> some small state machine which remembers if it is currently inside a quotation or 
> not.


quot, lt, gt and amp are kind of system entities, so not to be messed with

you could do a replacement: &quot; -> &myquot; and then 
\xmltexentity{myquot}{?} or \xmlsetentity{myquot}{?} depending on what 
gets done



> My next try would be to somehow intercept the XML stream (or flush) using lua and replace 
> " ... " inline. However that all seems quite hacky.


even then you can have issues: what if you have nested and/or unbalanced 
quotes ...



> So, does ConTeXt currently offer anything I can piggyback to get quotations set 
> properly? :-)


Hans

-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
  tel: 038 477 53 69 | www.pragma-ade.com | www.pragma-pod.nl
-

Re: [NTG-context] Sanitize in XML/XHTML documents

2016-02-02 Thread Andreas Schneider

On 2016-02-02 13:19, mass...@fastwebnet.it wrote:
> > Pandoc produces &quot;some text&quot;. In TeX I would usually use 
> > \quotation{some text} to have proper, language-dependent quotes.
>
> Are you sure that Pandoc can produce &quot; only? Have you tried the
> --html-q-tags option?
>
> Greetings,
> Massi


*face-palms*
Yes, that did the trick. I completely overlooked that option.
Thank you very much for that info! Now I can properly do quotes with an 
xmlsetup.


Best regards,
Andreas
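
A minimal sketch of such an xmlsetup, assuming Pandoc was run with --html-q-tags so that quotations arrive as q elements. The buffer contents and the setup names (xml:demo:base, xml:demo:p, xml:demo:q) are illustrative and not taken from the thread; a real XHTML file would need setups for its other elements as well.

```tex
% Sample input standing in for Pandoc's output (hypothetical content).
\startbuffer[demo]
<p>He said <q>some text</q> and left.</p>
\stopbuffer

% Map the elements p and q to the setups xml:demo:p and xml:demo:q.
\startxmlsetups xml:demo:base
    \xmlsetsetup{#1}{p|q}{xml:demo:*}
\stopxmlsetups

\xmlregistersetup{xml:demo:base}

% A paragraph: flush the content and end the paragraph.
\startxmlsetups xml:demo:p
    \xmlflush{#1}\par
\stopxmlsetups

% A quotation: let \quotation pick the language-dependent quote marks.
\startxmlsetups xml:demo:q
    \quotation{\xmlflush{#1}}
\stopxmlsetups

\starttext
    \xmlprocessbuffer{main}{demo}{}
\stoptext
```

Because the quote marks come from \quotation, switching the document language (for example with \mainlanguage) should change them without touching the XML.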

Re: [NTG-context] Sanitize in XML/XHTML documents

2016-02-02 Thread Andreas Schneider

On 2016-02-02 12:14, Hans Hagen wrote:

> On 2/2/2016 11:32 AM, Andreas Schneider wrote:
> > Anyway: is there any mechanism that I can use to "fix" quotations 
> > while typesetting XML documents?
> > Pandoc produces &quot;some text&quot;. In TeX I would usually use 
> > \quotation{some text} to have proper, language-dependent quotes.
>
> To me that looks like a pandoc bug: how is a backend supposed to deal
> with left/right quotation marks?
>
> > My next try would be to somehow intercept the XML stream (or flush) 
> > using lua and replace " ... " inline. However that all seems quite 
> > hacky.
>
> even then you can have issues: what if you have nested and/or
> unbalanced quotes ...


Pandoc can replace quotes with "smart punctuation", which turns "..." into 
“...”. That way I can at least safely determine start and end, replacing 
“ with \quotation\bgroup and ” with \egroup. I'm aware that there might 
still be cases where this goes wrong, but they should be much rarer 
then :-)


So I guess my Lua-based search and replace (combined with said "smart 
punctuation") could be my best bet at the moment. If any other ideas pop 
up, I'm happy to hear them, though :-)


Thanks for your quick response!

Best regards,
Andreas
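
The Lua-based search and replace could be sketched as follows. This variant does not emit \quotation\bgroup and \egroup directly; as an assumption, it rewrites the typographic quotes into q elements in a copy of the file, so that an ordinary xmlsetup can typeset them. The file names and the helper userdata.quotestoq are hypothetical, and nested or unbalanced quotes are not detected.

```tex
\startluacode
userdata = userdata or { }

-- Hypothetical helper: copy inname to outname, turning each opening
-- typographic quote into <q> and each closing one into </q>.
-- It works on the raw text, so quotes inside attribute values or
-- verbatim elements are rewritten too; balance is not checked.
function userdata.quotestoq(inname, outname)
    local f = assert(io.open(inname, "rb"))
    local data = f:read("*a")
    f:close()
    data = string.gsub(data, "“", "<q>")
    data = string.gsub(data, "”", "</q>")
    local g = assert(io.open(outname, "wb"))
    g:write(data)
    g:close()
end

-- Assumed file names, for illustration only.
userdata.quotestoq("input.xhtml", "input-q.xhtml")
\stopluacode

% Typeset the generated q elements with language-dependent quotes.
\startxmlsetups xml:quotes:base
    \xmlsetsetup{#1}{q}{xml:quotes:q}
\stopxmlsetups

\xmlregistersetup{xml:quotes:base}

\startxmlsetups xml:quotes:q
    \quotation{\xmlflush{#1}}
\stopxmlsetups

\starttext
    \xmlprocessfile{main}{input-q.xhtml}{}
\stoptext
```

If a closing quote has no matching opening quote, the rewritten file is no longer well-formed XML, which is the unbalanced case Hans warns about.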