Hi again, list. char-emj.lua is automatically generated, I guess from
https://www.unicode.org/Public/17.0.0/emoji/emoji-test.txt? I regenerated
the data with a custom script (attached) in order to test the newest emojis,
and I found something odd. Emojis whose names include curly double quotes
(U+201C and U+201D) can be accessed neither with those curly quotes nor with
ASCII double quotes (example attached). However, direct access by code point
works fine. So, my questions are:

1. How is the data in char-emj.lua generated? Shipping the script in the
distribution would make updates after Unicode releases easier and faster.
2. Is the double quote issue expected, or should it be fixed? I think
ASCII-only names would be easier to type, but that's just my opinion.
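
On question 2, this is roughly what I mean by ASCII-only names: a minimal sketch, assuming the generator normalizes each name before using it as a table key. The helper name `asciiname` is hypothetical and not part of ConTeXt; the sample name is the one from emoji-test.txt used in my test file.

```lua
-- Hypothetical helper, not part of ConTeXt: map the curly double quotes
-- found in emoji-test.txt names to ASCII double quotes, then lowercase.
local left  = "\226\128\156" -- U+201C LEFT DOUBLE QUOTATION MARK as UTF-8 bytes
local right = "\226\128\157" -- U+201D RIGHT DOUBLE QUOTATION MARK as UTF-8 bytes

local function asciiname(name)
	-- gsub returns two values; the assignment keeps only the new string
	name = name:gsub(left, '"')
	name = name:gsub(right, '"')
	return name:lower()
end

local sample = "Japanese " .. left .. "bargain" .. right .. " button"
print(asciiname(sample)) -- japanese "bargain" button
```

This only changes the key stored in the table. Whether \emoji then resolves a name containing ASCII double quotes is a separate matter that I have not checked.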

Regards,

Jairo

-- Regenerates the emoji name table from emoji-test.txt.
local lpeg = require("lpeg")
local Cg, Ct, utfR, P, R, V, match = lpeg.Cg, lpeg.Ct, lpeg.utfR, lpeg.P, lpeg.R, lpeg.V, lpeg.match

local emojis = {}

-- A data line of emoji-test.txt looks like:
--   <code points> ; <status> # <emoji> E<version> <name>
local grammar = P({
	"grammar",
	-- a hexadecimal code point, converted to a number
	hex = (R("09", "af", "AF") ^ 1) / function(c)
		return tonumber(c, 16)
	end,
	-- a run of non-space characters
	word = (utfR(0, 0x10FFFF) - P(" ")) ^ 1,
	words = V("word") * (P(" ") * V("word")) ^ 0,
	-- one or more code points collected into a table
	cps = Ct(V("hex") * (P(" ") * V("hex")) ^ 0),
	grammar = Ct(
		Cg(V("cps"), "codepoints")
			* P(" ") ^ 1
			* P(";")
			* P(" ") ^ 1
			* Cg(V("word"), "qualified") -- status field
			* P(" ") ^ 1
			* P("#")
			* P(" ") ^ 1
			* V("word") -- the emoji itself (skipped)
			* P(" ") ^ 1
			* V("word") -- the version, e.g. E1.0 (skipped)
			* P(" ") ^ 1
			* Cg(V("words"), "name")
			* P(-1)
	),
})

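-- Comment and blank lines do not match the grammar and are skipped.
for line in io.lines("./emoji-test.txt") do
	local matches = match(grammar, line)
	-- Drop only unqualified sequences; fully-qualified, minimally-qualified
	-- and component entries are kept. A later line with the same name
	-- overwrites an earlier one.
	if matches and matches.qualified ~= "unqualified" then
		emojis[matches.name:lower()] = matches.codepoints
	end
end

-- table.tofile comes from ConTeXt's Lua libraries. Is this call OK?
table.tofile("char-emj.lua", emojis, true, nil, false, true)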

Attachment: emojitest.pdf
Description: Adobe PDF document

\definefontfeature[colored][default][ccmp=yes,dist=yes,sbix=yes]
\definefont[emoj][file:NotoColorEmoji.ttf*colored]

\startTEXpage

\emoj\emoji{japanese “bargain” button} % This doesn't work
\zwj\Ux{1F250} % This does

\stopTEXpage
