(I know this is far too simplistic a response, but it is a bit like giving everyone an invisible cloak and an invisible dagger without telling them what a cloak and dagger are for [cutting butter & keeping warm].)
On Fri, Mar 27, 2015 at 3:57 PM, Michael Norton <[email protected]> wrote:

> Why wouldn't Unicode itself have it?
>
> On Fri, Mar 27, 2015 at 1:07 PM, Ken Whistler <[email protected]> wrote:
>
>> Search engine companies (and in particular, Google) have such
>> information squirreled away in their index databases, at least as
>> far as usage stats for Unicode characters on the web go -- but it
>> is proprietary information, and they generally don't publish
>> information about such statistics.
>>
>> Perhaps there are researchers out there who have set web crawlers
>> on a mission to generate such web statistics for publication, and maybe
>> somebody on this list knows of such research -- but it would be
>> virtually impossible to generate such information for the much
>> wider collection of documents and data that are not easily accessible
>> for web indexing. (Behind password walls, in pdf document archives,
>> in proprietary databases, ... ) As an example of why this is a problem,
>> consider the fact that there are *peta*bytes of information picked up
>> and stored in databases from scanners and other devices used at
>> tens of millions of retail points of sale. Such data, by its nature,
>> would tend to skew heavily towards use of ASCII a-z and digits 0-9
>> in its character data. How would you end up weighting such (mostly
>> publicly inaccessible) data in trying to count up for overall statistics
>> on character use?
>>
>> There are more traditional usage count studies that focus on
>> counts of character frequency within single language orthographies
>> in single scripts (e.g., letter frequencies for French text), but I don't
>> think that is what you were asking about.
>>
>> Here is some discussion of a similar question posted on stackoverflow:
>>
>> http://stackoverflow.com/questions/22184624/unicode-character-usage-statistics
>>
>> --Ken
>>
>> On 3/27/2015 9:31 AM, Michael Norton wrote:
>>
>>> Hello and thank you for an incredible service (just joining the list).
>>> Is there a list of usage statistics per character of the Unicode set
>>> available somewhere?
>>>
>> _______________________________________________
>> Unicode mailing list
>> [email protected]
>> http://unicode.org/mailman/listinfo/unicode
>>
>
> --
>
> Michael A. Norton, B.A. Cinema, M.P.A.
> My Cinema Home: http://www.NortonsNook.com
>
> "All great actors are mere mathematical masters of speech and the human
> body."

--
Michael A. Norton, B.A. Cinema, M.P.A.
My Cinema Home: http://www.NortonsNook.com

"All great actors are mere mathematical masters of speech and the human body."

