Hi Folks,
I am creating an HTML entry form for inputting text that can extend beyond the
ASCII range, so the trick is standardizing the input of entities, and of course
what to do with the ampersand character. There are 2 parts to this challenge:
1. Creating the text entry UI and providing rules for inputting entities
as well as detecting and reporting invalid entries, and
2. Converting the inputted entities into their corresponding UTF-8 value
for storage in MarkLogic, especially so that the exported values can be
converted back into the appropriate entities for html display or for export
such as to a Microsoft Word document.
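To make the two parts concrete, here is a minimal sketch in Python. It assumes the accepted names are the standard HTML5 named character references; the function names find_invalid and to_unicode are hypothetical, and the real rules would depend on which entities the form is meant to accept.

```python
import html
import re
from html.entities import html5

# Matches a named (&name;), decimal (&#123;), or hex (&#x1F;) reference.
ENTITY_RE = re.compile(r"&(#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);")


def find_invalid(text):
    """Return (offset, snippet) for every '&' that does not start a valid entity."""
    problems = []
    for m in re.finditer("&", text):
        start = m.start()
        ent = ENTITY_RE.match(text, start)
        if ent is None:
            # Bare ampersand, or an entity missing its closing semicolon.
            problems.append((start, text[start:start + 10]))
            continue
        body = ent.group(1)
        if body.startswith("#"):
            code = int(body[2:], 16) if body[1] in "xX" else int(body[1:])
            # Any Unicode scalar value is accepted here, private use area included.
            valid = 0 < code <= 0x10FFFF and not 0xD800 <= code <= 0xDFFF
        else:
            # Keys in html5 carry the trailing semicolon, e.g. "ldquo;".
            valid = body + ";" in html5
        if not valid:
            problems.append((start, ent.group(0)))
    return problems


def to_unicode(text):
    """Decode entities to Unicode text for storage, rejecting invalid input."""
    problems = find_invalid(text)
    if problems:
        raise ValueError(f"invalid entity input: {problems}")
    return html.unescape(text)
```

The offsets returned by find_invalid could be used to report the invalid entries back to the user, and the decoded string can then be stored as UTF-8.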
It seems that I cannot have my cake and eat it too. For example, if I want to
allow the user to simply insert a title with an ampersand, they could enter:
Red & White
But if I want to allow them to enter other encoded values such as:
&ldquo;Red &amp; White&rdquo;
Then there needs to be the expectation that entering an ampersand by itself is
disallowed, and that the former must be supplied as
Red &amp; White
So how do folks tend to deal with this issue for each of the parts that I
describe above?
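One possible middle ground, sketched below in Python, is a lenient mode: any ampersand that does not begin something shaped like an entity is treated as literal and escaped to &amp;, while entity-shaped input is left alone. The name normalize_ampersands is hypothetical, and the check is purely syntactic, so an unknown name such as &bogus; would still need to be caught by a separate validation step.

```python
import re

# An '&' NOT followed by an entity-shaped token (named, decimal, or hex) and ';'.
BARE_AMP_RE = re.compile(
    r"&(?!(?:#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);)"
)


def normalize_ampersands(text):
    """Escape bare ampersands to &amp; and leave entity-shaped input untouched."""
    return BARE_AMP_RE.sub("&amp;", text)
```

The trade-off is that a user can no longer type literal text that looks like an entity (for example the five characters "&amp;") without escaping the ampersand first, so the entry rules would still need to say so.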
Thanks for any help with this. It seems like a simple issue, but it has a lot
of complexity, especially when folks allow proprietary named and numbered HTML
encodings with Private Use Area Unicode mappings. Is this the bane of UI entry
for XML UTF-8 mapping or what? :-)
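For the export direction, a rough Python sketch of turning stored Unicode text back into ASCII-safe HTML might look like the following. It assumes the HTML 4 entity names from Python's html.entities.codepoint2name are acceptable on output; the function name to_html_entities is hypothetical. Characters with no standard name, including private use area code points, fall back to hex numeric references, so any proprietary names would need their own lookup table.

```python
import html
from html.entities import codepoint2name


def to_html_entities(text):
    """Encode Unicode text as ASCII HTML using named or numeric references."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in "&<>":
            # Markup-significant characters are always escaped.
            out.append(html.escape(ch, quote=False))
        elif cp < 128:
            out.append(ch)
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            # No standard name (e.g. private use area): hex numeric reference.
            out.append(f"&#x{cp:X};")
    return "".join(out)
```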
Tim M.