Re: a character for an unknown character

2017-01-03 Thread Christoph Päper
Marcel Schneider : > On Sat, 31 Dec 2016 11:01:16 +0100, Christoph Päper wrote: >> >> It has indeed. That’s why two different technologies have to be used to get >> typographically harmonic hexadecimal numbers, e.g. in CSS: … > > Thank you for the code. I didnʼt know this,

Marking up hexadecimal numbers (was: Re: a character for an unknown character)

2017-01-02 Thread Marcel Schneider
Iʼve messed up my e-mail by not converting HTML to text. Please disregard. The webmail I use applies HTML tags and deletes all unknown ones. Sorry. On Sat, 31 Dec 2016 22:04:02 +0100 (CET), I wrote: > On Sat, 31 Dec 2016 11:01:16 +0100, Christoph Päper wrote: > > > > Richard Wordingham : > > > >

Marking up hexadecimal numbers (was: Re: a character for an unknown character)

2017-01-02 Thread Marcel Schneider
On Sat, 31 Dec 2016 22:04:02 +0100 (CET), I wrote: > On Sat, 31 Dec 2016 11:01:16 +0100, Christoph Päper wrote: > > > > Richard Wordingham : > > > > > >> Perhaps the letters for hexadecimal digits should have been encoded > > >> separately? > > > > > > The idea has been rejected several times.

Re: a character for an unknown character

2016-12-31 Thread Marcel Schneider
On Sat, 31 Dec 2016 11:01:16 +0100, Christoph Päper wrote: > > Richard Wordingham : > > > >> Perhaps the letters for hexadecimal digits should have been encoded > >> separately? > > > > The idea has been rejected several times. > > It has indeed. That’s why two different technologies have to

Re-use of Modifier Letters for Superscript Abbreviations (was: Re: a character for an unknown character)

2016-12-31 Thread Marcel Schneider
On Sat, 31 Dec 2016 09:20:30 +, Richard Wordingham wrote: […] > It's in a different universe, restricted to one book, namely Footfall. Thank you for the reference. […] > Did you look in the article about Klingon, namely > https://en.wikipedia.org/wiki/Klingon_language , or > in the

Re: a character for an unknown character

2016-12-31 Thread Christoph Päper
Richard Wordingham : > >> Perhaps the letters for hexadecimal digits should have been encoded >> separately? > > The idea has been rejected several times. It has indeed. That’s why two different technologies have to be used to get typographically harmonic

Re: a character for an unknown character

2016-12-31 Thread Richard Wordingham
On Sat, 31 Dec 2016 02:09:12 +0100 (CET) Marcel Schneider wrote: > On Fri, 30 Dec 2016 22:17:12 +, Richard Wordingham wrote: > > You obviously haven't read the story's discussion of whether the > > fithp would honour a peace treaty! > I havenʼt read nor watched Star Trek (nor Star Wars).

Re: a character for an unknown character

2016-12-30 Thread Marcel Schneider
On Fri, 30 Dec 2016 22:17:12 +, Richard Wordingham wrote: > > On Fri, 30 Dec 2016 20:13:41 +0100 (CET) > Marcel Schneider wrote: […] > > If the apostrophe and the > > single comma quote are disunified, then U+02BC is used to spell the > > word «fiʼ» (your first option). You might also wish

Re: a character for an unknown character

2016-12-30 Thread Richard Wordingham
On Fri, 30 Dec 2016 20:13:41 +0100 (CET) Marcel Schneider wrote: > > U+2E31 WORD SEPARATOR MIDDLE DOT > > U+30FB KATAKANA MIDDLE DOT > These seem to me identical to U+00B7 and U+2022 respectively. Perhaps > weʼre here faced with two examples of what Asmus referred to
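The comparison in this message (U+2E31 against U+00B7, U+30FB against U+2022) can be checked against the character database. The following is only a sketch using Python's standard unicodedata module; the pairing and the helper name compare are illustrative assumptions, not something taken from the thread. It shows that the characters have distinct names and that compatibility normalization (NFKC) does not fold either pair together, whatever their visual similarity in a given font.

```python
import unicodedata

# Pairs taken from the message: (character discussed, look-alike it was compared to).
PAIRS = [(0x2E31, 0x00B7), (0x30FB, 0x2022)]

def compare(a, b):
    """Return both character names and whether NFKC maps the two to the same string."""
    ca, cb = chr(a), chr(b)
    same_after_nfkc = (
        unicodedata.normalize("NFKC", ca) == unicodedata.normalize("NFKC", cb)
    )
    return unicodedata.name(ca), unicodedata.name(cb), same_after_nfkc

for a, b in PAIRS:
    name_a, name_b, same = compare(a, b)
    print(f"U+{a:04X} {name_a} vs U+{b:04X} {name_b}: same after NFKC = {same}")
```

Whether two such characters look identical is a font matter that this check cannot settle; it only confirms that the encoding treats them as separate characters.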

Re: a character for an unknown character

2016-12-30 Thread Asmus Freytag
On 12/30/2016 4:37 AM, Richard Wordingham wrote: On Fri, 30 Dec 2016 01:23:55 +0100 (CET) Marcel Schneider wrote: On Wed, 28 Dec 2016 19:05:17 -0800, Asmus Freytag wrote: On 12/28/2016 5:47 PM, Richard Wordingham wrote: U+02BC being shifted from a letter to a

Re: a character for an unknown character

2016-12-30 Thread Marcel Schneider
On Fri, 30 Dec 2016 12:37:27 +, Richard Wordingham wrote: > On Fri, 30 Dec 2016 01:23:55 +0100 (CET) Marcel Schneider wrote: > > On Wed, 28 Dec 2016 19:05:17 -0800, Asmus Freytag wrote: > > > On 12/28/2016 5:47 PM, Richard Wordingham wrote: > > U+02BC being shifted from a letter to a

Re: a character for an unknown character

2016-12-30 Thread Richard Wordingham
On Fri, 30 Dec 2016 01:23:55 +0100 (CET) Marcel Schneider wrote: > On Wed, 28 Dec 2016 19:05:17 -0800, Asmus Freytag wrote: > > On 12/28/2016 5:47 PM, Richard Wordingham wrote: > U+02BC being shifted from a letter to a punctuation must have been > anticipated at

Re: a character for an unknown character

2016-12-29 Thread Marcel Schneider
On Wed, 28 Dec 2016 19:05:17 -0800, Asmus Freytag wrote: > On 12/28/2016 5:47 PM, Richard Wordingham wrote: > > On Tue, 27 Dec 2016 21:33:32 -0800 > > Asmus Freytag wrote: > > > > > > When it comes to marks (or symbols) of less generic or more complex > > > shapes, the > > > presumption

Re: a character for an unknown character

2016-12-29 Thread William_J_G Overington
Martin Mueller wrote: > But for the purposes of my project, which involves folks here, there, and > everywhere working on editorial problems relating to digital transcriptions > of Early Modern texts, Whilst recognising that I am going somewhat off the specific topic of this thread, yet

Re: a character for an unknown character

2016-12-28 Thread Asmus Freytag
On 12/28/2016 5:47 PM, Richard Wordingham wrote: On Tue, 27 Dec 2016 21:33:32 -0800 Asmus Freytag wrote: When it comes to marks (or symbols) of less generic or more complex shapes, the presumption that the mark only has "one" shape may be more common, and examples of the

Re: a character for an unknown character

2016-12-28 Thread Richard Wordingham
On Tue, 27 Dec 2016 21:33:32 -0800 Asmus Freytag wrote: > When it comes to marks (or symbols) of less generic or more complex > shapes, the > presumption that the mark only has "one" shape may be more common, > and examples of the mark > being repurposed may be less

Re: a character for an unknown character

2016-12-27 Thread Asmus Freytag
On 12/27/2016 8:03 AM, Marcel Schneider wrote: On 27/12/16 01:11, Richard Wordingham wrote: On Sun, 25 Dec 2016 19:31:28 +0200 "Jukka K. Korpela" wrote: […] If some graphic symbol is by convention used to represent a lacuna, then the issue, as regards to Unicode, is simply whether that

Re: a character for an unknown character

2016-12-27 Thread Marcel Schneider
aracter. > > Of course, there is one character that is already widely used in this > rôle - U+003F QUESTION MARK. Some of its Unicode properties are not > suitable, and its informal 'unknown character' semantic conflicts with > its rôle as a punctuation mark. Effectively this use of QUESTIO

Re: a character for an unknown character

2016-12-27 Thread Janusz S. Bień
cognized in this special meaning without such a > higher-level convention. We had already defined a convention and can live with it :-) But why not improve it? > There’s a theoretical (?) problem with this. Let us assume that you > decide to use a particular character to represent “unknown c

Re: a character for an unknown character

2016-12-26 Thread William_J_G Overington
ld be reasonable. > There’s a theoretical (?) problem with this. Let us assume that you decide to > use a particular character to represent “unknown character” in your data, when working with some type of written texts. What happens when you encounter, in the study of those texts, a graphic symbol

Re: a character for an unknown character

2016-12-25 Thread Jukka K. Korpela
as needed. You should not expect the character to be recognized in this special meaning without such a higher-level convention. There’s a theoretical (?) problem with this. Let us assume that you decide to use a particular character to represent “unknown character” in your data, when working

Re: a character for an unknown character

2016-12-23 Thread Philippe Verdy
you only get a hint that some character has been hit because you see an additional bullet or asterisk. This is per design. Such input field should be limited to short input that is easy for you to type from your keyboard. But many input forms will also include a clickable icon/button that can be

Re: a character for an unknown character

2016-12-23 Thread Richard Wordingham
On Sat, 24 Dec 2016 00:44:00 +0100 Philippe Verdy wrote: > If IBus cannot hide the input (i.e. generate the entered characters > to the application without displaying it, and let the output hints > (bullets/asterisks) be generated only by the application, then iBus > has a

Re: a character for an unknown character

2016-12-23 Thread Philippe Verdy
> *To: *Martin Mueller <martinmuel...@northwestern.edu> > *Cc: *William_J_G Overington <wjgo_10...@btinternet.com>, " > unicode@unicode.org" <unicode@unicode.org> > *Subject: *Re: a character for an unknown character > > > > if you want something that

Re: a character for an unknown character

2016-12-23 Thread Philippe Verdy
This is still the standard and expected behavior for password input fields in HTML and many devices, to not display the actual characters but to **display** them as bullets or asterisks. If IBus cannot hide the input (i.e. generate the entered characters to the application without displaying it,

Re: a character for an unknown character

2016-12-23 Thread Richard Wordingham
On Fri, 23 Dec 2016 20:35:07 +0100 Philippe Verdy wrote: > But note that input fields for entering password or secret codes in > application forms/dialogs are typically using black bullets U+2022 > (•) or simply ASCII asterisks U+002A (*) to replace the entered > characters:

Re: a character for an unknown character

2016-12-23 Thread Martin Mueller
org" <unicode@unicode.org> Subject: Re: a character for an unknown character if you want something that is very unlikely to be present in original texts, it would be preferable to avoid the black dot or any other bullets which may be used as punctuation marks. Consider using some geo

Re: a character for an unknown character

2016-12-23 Thread Philippe Verdy
if you want something that is very unlikely to be present in original texts, it would be preferable to avoid the black dot or any other bullets which may be used as punctuation marks. Consider using some geometric shape, notably those inherited from DOS code pages, such as the filled square

Re: a character for an unknown character

2016-12-22 Thread Martin Mueller
. From: Leo Broukhis <leo...@gmail.com> Reply-To: "l...@mailcom.com" <l...@mailcom.com> Date: Thursday, December 22, 2016 at 6:31 PM To: Martin Mueller <martinmuel...@northwestern.edu> Cc: unicode Unicode Discussion <unicode@unicode.org> Subject: Re: a character for

Re: a character for an unknown character

2016-12-22 Thread Leo Broukhis
You may want to consider U+2370 APL FUNCTIONAL SYMBOL QUAD QUESTION. Leo On Dec 22, 2016 15:35, "Martin Mueller" wrote: These are very handsome and interesting. But for the purposes of my project, which involves folks here, there, and everywhere working on

Re: a character for an unknown character

2016-12-22 Thread Martin Mueller
These are very handsome and interesting. But for the purposes of my project, which involves folks here, there, and everywhere working on editorial problems relating to digital transcriptions of Early Modern texts, the cardinal requirement is that the character can be found on and deployed from

Re: a character for an unknown character

2016-12-22 Thread William_J_G Overington
Martin Mueller wrote: > Is there a Unicode character that says “I represent an alphanumerical > character, but I don’t know which”. This is a very common problem in the > transcription of historical texts where you have lacunas. I have been reading this thread with interest. I have produced

Re: a character for an unknown character

2016-12-21 Thread Garth Wallace
I think CYFI has characters in the PUA for "lost sign" and "damaged sign". Both are shaded squares using different patterns. On Wed, Dec 21, 2016 at 3:49 PM, Richard Wordingham < richard.wording...@ntlworld.com> wrote: > On Wed, 21 Dec 2016 02:29:59 + > Martin Mueller

Re: a character for an unknown character

2016-12-21 Thread Richard Wordingham
On Wed, 21 Dec 2016 02:29:59 + Martin Mueller wrote: > I’m new to this list. Please excuse my technical incompetence. > Is there a Unicode character that says “I represent an alphanumerical > character, but I don’t know which”. This is a very common problem

Re: Invisible letter (was Re: a character for an unknown character)

2016-12-21 Thread Janusz S. Bien
Quote/Cytat - David Corbett (Wed 21 Dec 2016 05:56:27 PM CET): Couldn’t you use U+1D52 MODIFIER LETTER SMALL O? In our corpus COMBINING LATIN SMALL LETTER O sometimes occurs in its combining function, so it seemed more elegant to use a uniform encoding. But you

Invisible letter (was Re: a character for an unknown character)

2016-12-21 Thread David Corbett
Couldn’t you use U+1D52 MODIFIER LETTER SMALL O? (I changed the subject line because the invisible letter proposal is not relevant to the question about a lacuna character.) > I strongly support this. In our historical corpus of Polish > > http://korpusy.klf.uw.edu.pl/en/IMPACT_GT_2/ > > we have

Re: a character for an unknown character

2016-12-21 Thread Janusz S. Bien
Quote/Cytat - Michael Everson (Wed 21 Dec 2016 05:25:30 PM CET): I still believe that we need INVISIBLE LETTER http://unicode.org/review/pr-41-invisible.pdf I think that, for the display of combining characters without a base character, the recommended NBSP

Re: a character for an unknown character

2016-12-21 Thread Michael Everson
I still believe that we need INVISIBLE LETTER http://unicode.org/review/pr-41-invisible.pdf I think that, for the display of combining characters without a base character, the recommended NBSP makes no sense. NBSP is supposed to glue the characters on either side of it to itself. It makes

Re: a character for an unknown character

2016-12-21 Thread Karl Williamson
On 12/21/2016 08:45 AM, David Corbett wrote: One Unicode character specifically for this purpose is U+3013 GETA MARK. It is a Japanese symbol used to replace characters that cannot be read during transcription of manuscripts (source: Japanese Wikipedia). It looks like a bold equals sign: 〓.

Re: a character for an unknown character

2016-12-21 Thread David Corbett
One Unicode character specifically for this purpose is U+3013 GETA MARK. It is a Japanese symbol used to replace characters that cannot be read during transcription of manuscripts (source: Japanese Wikipedia). It looks like a bold equals sign: 〓. Other people have suggested U+FFFD REPLACEMENT
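For readers comparing the candidates named in this thread, the sketch below looks each one up in the Unicode Character Database through Python's standard unicodedata module. The list CANDIDATES and the helper describe are illustrative names chosen here, and the selection of code points (U+3013, U+FFFD, U+2370, U+003F, U+2022) is simply gathered from the messages in this listing, not a recommendation.

```python
import unicodedata

# Code points suggested or mentioned in the thread as stand-ins for an unknown character.
CANDIDATES = [0x3013, 0xFFFD, 0x2370, 0x003F, 0x2022]

def describe(cp):
    """Return (U+XXXX label, character name, general category) for one code point."""
    ch = chr(cp)
    return f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch)

for cp in CANDIDATES:
    label, name, category = describe(cp)
    print(label, name, category)
```

The general category is one of the properties at issue in the discussion: a candidate classed as punctuation (Po) may behave differently in word-breaking or searching than one classed as a symbol (So), which is worth checking before adopting a project convention.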

Re: a character for an unknown character

2016-12-21 Thread Rebecca T
U+FFFD REPLACEMENT CHARACTER � On Wed, Dec 21, 2016 at 3:05 AM Philippe Verdy wrote: > There's a "replacement" control, whose rendering is undefined. It may > represent any missing part covering more than one character, such as parts > that have been burned, or overstruck.

Re: a character for an unknown character

2016-12-21 Thread Philippe Verdy
There's a "replacement" control, whose rendering is undefined. It may represent any missing part covering more than one character, such as parts that have been burned, or overstruck. This Unicode character can act as a substitute but its rendering is purposely undefined. An application may show

a character for an unknown character

2016-12-20 Thread Martin Mueller
I’m new to this list. Please excuse my technical incompetence. Is there a Unicode character that says “I represent an alphanumerical character, but I don’t know which”. This is a very common problem in the transcription of historical texts where you have lacunas. Often, the extent of the