On 07.12.2009 at 08:52, Taco Hoekwater wrote:
Is /ActualText supposed to be in PDFDoc Encoding?
No, you could also use Unicode Encoding.
Wolfgang
Taco Hoekwater wrote:
Wolfgang Schuster wrote:
Hi Hans,
you showed a while ago how the actualtext function of pdf works
and i have a module where i would use it but letters outside of
ascii appear wrong when i copy the text
\starttext
text \pdfliteral{/Span <</ActualText (Müller)
On 07.12.2009 at 09:46, Hans Hagen wrote:
\def\pdfactualtext#1#2%
{\pdfliteral direct{/Span <</ActualText
\ctxlua{tex.write(lpdf.tosixteen("#2"))}>> BDC}#1\pdfliteral direct{EMC}}
\starttext
text \pdfactualtext{Meier}{Müller} text
\stoptext
Perfect, will this end in the core?
Regards,
Wolfgang Schuster wrote:
On 07.12.2009 at 09:46, Hans Hagen wrote:
\def\pdfactualtext#1#2%
{\pdfliteral direct{/Span <</ActualText \ctxlua{tex.write(lpdf.tosixteen("#2"))}>>
BDC}#1\pdfliteral direct{EMC}}
\starttext
text \pdfactualtext{Meier}{Müller} text
\stoptext
Perfect, will this end
On 07.12.2009 at 10:11, Hans Hagen wrote:
hm, doesn't that kind of functionality demand a bit more 'thinking'? what
exactly is needed? how does it relate to linebreaks? other content? etc ..
actually, such a mechanism should be implemented a bit differently (maybe
attributes and delayed
Wolfgang Schuster wrote:
On 07.12.2009 at 10:11, Hans Hagen wrote:
hm, doesn't that kind of functionality demand a bit more 'thinking'? what
exactly is needed? how does it relate to linebreaks? other content? etc ..
actually, such a mechanism should be implemented a bit differently (maybe
On 07.12.2009 at 11:21, Hans Hagen wrote:
detail ...
\def\ruby#1#2%
{\dontleavehmode\bgroup
\setbox\scratchboxone\hbox{#1}%
\setbox\scratchboxtwo\hbox{#2}%
\scratchdimen\wd\ifdim\wd\scratchboxone>\wd\scratchboxtwo\scratchboxone\else\scratchboxtwo\fi
\setbox\scratchbox\vbox
Hi Hans,
you showed a while ago how the actualtext function of pdf works
and i have a module where i would use it but letters outside of
ascii appear wrong when i copy the text
\starttext
text \pdfliteral{/Span <</ActualText (Müller)>> BDC}Meier\pdfliteral{EMC} text
\stoptext
becomes
text Müller
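The garbled result above is what you would expect if the UTF-8 bytes of the name are read one byte at a time. A small Python sketch of that effect follows; it uses Latin-1 as a stand-in for PDFDocEncoding, which is an assumption that holds for the two bytes of the umlaut here but not for every byte value, and the function name misread_utf8_as_single_bytes is hypothetical.

```python
def misread_utf8_as_single_bytes(text: str) -> str:
    """Encode text as UTF-8, then decode each byte as its own character."""
    # Assumption: Latin-1 stands in for PDFDocEncoding; the two agree for
    # the bytes 0xC3 and 0xBC that UTF-8 uses for the u-umlaut.
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    # One character in the source becomes two characters in the copied text.
    print(misread_utf8_as_single_bytes("M\u00fcller"))
```

A literal string in parentheses without a byte order mark is taken as PDFDocEncoding, so the reader never gets the hint that the bytes were meant as Unicode.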
Wolfgang Schuster wrote:
Hi Hans,
you showed a while ago how the actualtext function of pdf works
and i have a module where i would use it but letters outside of
ascii appear wrong when i copy the text
\starttext
text \pdfliteral{/Span <</ActualText (Müller)>> BDC}Meier\pdfliteral{EMC}
Barry Schwartz wrote:
Please tell me this isn't in a FAQ. :) Is there support for ActualText
tags so that searching and extraction will work with OpenType fonts
and Unicode? If so, do discretionary hyphens get treated as 00AD
instead of 002D?
can you explain in more detail what you mean with
can you explain in more detail what you mean with 'actual text tags' ?
He means ActualText tags :-) See the PDF spec section 14.9.4, page 623.
It's a more generic way to support searching than ToUnicode vectors: you just
specify the actual string of underlying Unicode characters. The PDF
On 19.09.2009 at 19:10, Arthur Reutenauer wrote:
Anyway, this needs support at the engine level and I don't think
there is;
actually it would be nice to add that to LuaTeX.
Heiko Oberdiek wrote the accsupp package to use ActualText in LaTeX,
why shouldn't it be then possible to use it in
Arthur Reutenauer arthur.reutena...@normalesup.org wrote:
He means ActualText tags :-) See the PDF spec section 14.9.4, page 623.
It's a more generic way to support searching than ToUnicode vectors: you just
specify the actual string of underlying Unicode characters. The PDF spec uses
Heiko Oberdiek wrote the accsupp package to use ActualText in LaTeX,
why shouldn't it be then possible to use it in LuaTeX (and ConTeXt)?
Right, you don't need additional engine support, you can use \pdfliteral in
pdfTeX, and in LuaTeX as well. Heiko's package should be quite easy to port to
Arthur Reutenauer wrote:
can you explain in more detail what you mean with 'actual text tags' ?
He means ActualText tags :-) See the PDF spec section 14.9.4, page 623.
It's a more generic way to support searching than ToUnicode vectors: you just
specify the actual string of underlying
Barry Schwartz wrote:
Also, I noticed when playing around with the examples from the Th
ligature discussion that searching and extraction didn't work with
small caps, though it did work with the ligature. With ActualText tags
hm, mkiv has an analyser for names-unicode and afaik small caps
Hans Hagen pra...@wxs.nl wrote:
put an ActualText tag on anything that happens not to match what you
would get from the ToUnicode mapping.
hm, if one knows the character (say c) then why not adapt the tounicode
vector
The same glyph could correspond to different Unicode in the
source.
Barry Schwartz chemoelect...@chemoelectric.org wrote:
In practice what I see with my method is that discretionary hyphens
always get an ActualText, and if the font is older and has names like
Asmall or ffl (which I don't bother handling specially) then the
substituted stuff gets an
Please tell me this isn't in a FAQ. :) Is there support for ActualText
tags so that searching and extraction will work with OpenType fonts
and Unicode? If so, do discretionary hyphens get treated as 00AD
instead of 002D?
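As a rough illustration of why the choice between 00AD and 002D matters for searching, the Python sketch below shows how a program reading extracted text might treat the two. It is an assumption about consumer behaviour, not a description of any particular PDF viewer, and the function name normalize_for_search is hypothetical.

```python
SOFT_HYPHEN = "\u00ad"   # discretionary hyphen: only a line-break hint
HYPHEN_MINUS = "\u002d"  # ordinary hyphen: part of the text itself


def normalize_for_search(extracted: str) -> str:
    """Drop soft hyphens so a word broken across lines matches again."""
    # A hyphen-minus is kept, since it may belong to the word as written.
    return extracted.replace(SOFT_HYPHEN, "")


if __name__ == "__main__":
    broken_soft = "type" + SOFT_HYPHEN + "setting"
    broken_hard = "type" + HYPHEN_MINUS + "setting"
    print(normalize_for_search(broken_soft))  # soft hyphen removed
    print(normalize_for_search(broken_hard))  # hyphen-minus kept
```

With 00AD the original word can be recovered without guessing; with 002D the consumer cannot tell a line-break hyphen from a real one.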