Re: The (Klingon) Empire Strikes Back

2016-11-10 Thread Mark Davis ☕️
The committee doesn't "tentatively approve, pending X".

But the good news is that I think it was the sense of the committee that
the evidence of use for Klingon is now sufficient, and the rest of the
proposal was in good shape (other than the lack of a date), so really only
the IP stands in the way.

I would suggest that the Klingon community work towards getting Paramount
to engage with us, so that any IP issues could be settled.

Mark

On Thu, Nov 10, 2016 at 10:33 AM, Shawn Steele wrote:

> More generally, does that mean that alphabets with perceived owners will
> only be considered for encoding with permission from those owner(s)?  What
> if the ownership is ambiguous or unclear?
>
>
>
> Getting permission may be a lot of work, or cost money, in some cases.
> Will applications be considered pending permission, perhaps being
> provisionally approved until such permission is received?
>
>
>
> Is there specific language that Unicode would require from owners to be
> comfortable in these cases?  It makes little sense for a submitter to go
> through a complex exercise to request permission if Unicode is not
> comfortable with the wording of the permission that is garnered.  Are there
> other such agreements that could perhaps be used as templates?
>
>
>
> Historically, the message pIqaD supporters have heard from Unicode has
> been that pIqaD is a toy script that does not have enough use.  The new
> proposal attempts to respond to those concerns, particularly since there is
> more interest in the script now.  Now, additional (valid) concerns are
> being raised.
>
>
>
> In Mark’s case it seems like it would be nice if Unicode could consider
> the rest of the proposal and either tentatively approve it pending
> Paramount’s approval, or provide feedback as to other defects in the
> proposal that would need to be addressed for consideration.  Meanwhile
> Mark can figure out how to get Paramount’s agreement.
>
>
>
> -Shawn
>
>
>
> *From:* Unicode [mailto:unicode-boun...@unicode.org] *On Behalf Of *Peter
> Constable
> *Sent:* Wednesday, November 9, 2016 8:49 PM
> *To:* Mark E. Shoulson ; David Faulks <
> davidj_fau...@yahoo.ca>
> *Cc:* Unicode Mailing List 
> *Subject:* RE: The (Klingon) Empire Strikes Back
>
>
>
> *From:* Unicode [mailto:unicode-boun...@unicode.org] *On Behalf Of *Mark E. Shoulson
> *Sent:* Friday, November 4, 2016 1:18 PM
>
> > At any rate, this isn't Unicode's problem…
>
>
>
> You saying that potential IP issues are not Unicode’s problem does not in
> fact make it not a problem. A statement in writing from authorized
> Paramount representatives stating it would not be a problem for either
> Unicode, its members or implementers of Unicode would make it not a problem
> for Unicode.
>
>
>
>
>
>
>
> Peter
>


Re: Dataset for all ISO639 code sorted by country/territory?

2016-11-10 Thread Andrew West
On 10 November 2016 at 17:56, Doug Ewell  wrote:
>
> Keep in mind that the CLDR table documents 675 of the world's best-known
> languages, counting variants such as three different orthographies of
> Uzbek.

Oddly, it seems that there are over 1.2 billion speakers of Cantonese
in China, but no speakers of Mandarin (the biggest language by number
of speakers in the world).

Andrew


RE: The (Klingon) Empire Strikes Back

2016-11-10 Thread Shawn Steele
More generally, does that mean that alphabets with perceived owners will only 
be considered for encoding with permission from those owner(s)?  What if the 
ownership is ambiguous or unclear?

Getting permission may be a lot of work, or cost money, in some cases.  Will 
applications be considered pending permission, perhaps being provisionally 
approved until such permission is received?

Is there specific language that Unicode would require from owners to be 
comfortable in these cases?  It makes little sense for a submitter to go 
through a complex exercise to request permission if Unicode is not comfortable 
with the wording of the permission that is garnered.  Are there other such 
agreements that could perhaps be used as templates?

Historically, the message pIqaD supporters have heard from Unicode has been 
that pIqaD is a toy script that does not have enough use.  The new proposal 
attempts to respond to those concerns, particularly since there is more 
interest in the script now.  Now, additional (valid) concerns are being raised.

In Mark’s case it seems like it would be nice if Unicode could consider the 
rest of the proposal and either tentatively approve it pending Paramount’s 
approval, or provide feedback as to other defects in the proposal that would 
need to be addressed for consideration.  Meanwhile Mark can figure out how to 
get Paramount’s agreement.

-Shawn

From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Peter Constable
Sent: Wednesday, November 9, 2016 8:49 PM
To: Mark E. Shoulson ; David Faulks 
Cc: Unicode Mailing List 
Subject: RE: The (Klingon) Empire Strikes Back

From: Unicode [mailto:unicode-boun...@unicode.org] On Behalf Of Mark E. Shoulson
Sent: Friday, November 4, 2016 1:18 PM
> At any rate, this isn't Unicode's problem…

You saying that potential IP issues are not Unicode’s problem does not in fact 
make it not a problem. A statement in writing from authorized Paramount 
representatives stating it would not be a problem for either Unicode, its 
members or implementers of Unicode would make it not a problem for Unicode.



Peter


RE: Dataset for all ISO639 code sorted by country/territory?

2016-11-10 Thread Doug Ewell
Mats Blakstad wrote:

> For myself, I was not actually considering the number of speakers in
> each country, but rather mapping languages to the countries/territories
> where the language originated or has traditionally been spoken.

And that is where I think you'll have disagreement on the details.

> So I guess what matters is which language people mostly expect to find
> under the country/territory.

Yep, that's the challenge.

> Would it be possible to extend this dataset to all languages and start
> building an open source dataset for language-territory mapping?
> http://www.unicode.org/cldr/charts/latest/supplemental/language_territory_information.html
>  

That's a good question for the CLDR folks, who have their own mailing
list.

Keep in mind that the CLDR table documents 675 of the world's best-known
languages, counting variants such as three different orthographies of
Uzbek. While anything is possible, extending this to "all languages,"
e.g. the other 6,300 lesser-known living languages, might require a bit
of time and money.

There is also a resource in the "UDHR in Unicode" project that might be
worth investigating, though it too is an imperfect match with what you
seem to be looking for.

--
Doug Ewell | Thornton, CO, US | ewellic.org




Re: Dataset for all ISO639 code sorted by country/territory?

2016-11-10 Thread Mats Blakstad
On 20 September 2016 at 18:34, Doug Ewell  wrote:

> > Is there any dataset that contains all languages in the world sorted
> > by country/territory?
>
> As others have pointed out, be careful about how slippery this slope can
> get. Everyone has his or her own opinion about how many speakers of
> Language X in country Y need to be identified, estimated, or conjectured
> in order to say that "language X is spoken in country Y."
>

For myself, I was not actually considering the number of speakers in each
country, but rather mapping languages to the countries/territories where the
language originated or has traditionally been spoken.
For instance, in Norway we have many immigrants from Pakistan, but I
doubt any of them would expect to see Urdu sorted under Norway, even though
many people in Norway speak Urdu.
They would expect to see it under Pakistan, which is their heritage
country. I guess this is largely an identity issue as well.

I do understand that it is not easy to get a perfect language-country
mapping, and I guess the mapping also depends on the use.
For myself, I want people to be able to sort languages by
country/territory to make it easier to build lists of translations. I
think it is better to be able to sort by territory than to provide
a looong list of languages.
So I guess what matters is which language people mostly expect to find
under the country/territory.


>
> > I managed to find a dataset on the website of Ethnologue, though it
> > doesn't look like open source; I need to check with them exactly how
> > I'm allowed to use it:
> > http://www.ethnologue.com/codes/download-code-tables
>
> The readme file included in the downloadable zip file makes SIL's terms
> very clear. Basically you need to credit SIL as the source of the data,
> not change it, and not make the data directly available for others to
> download. It's best not to get caught up in "open source" as if any
> other terms would make the data totally unusable.
>
>
I agree that a dataset is not unusable just because it is not open source,
but for myself I do in fact need a downloadable file!

I tried contacting SIL, but they will only sell the dataset for a fee and will
not grant an open source license.

Would it be possible to extend this dataset to all languages and start
building an open source dataset for language-territory mapping?
http://www.unicode.org/cldr/charts/latest/supplemental/language_territory_information.html
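
A minimal sketch of the kind of language-territory mapping described above,
assuming the data is available as simple (language code, territory code)
pairs. The pairs below are hand-picked for illustration only and are not
taken from CLDR or Ethnologue; the function name group_by_territory is
likewise hypothetical.

```python
from collections import defaultdict

# Illustrative (language code, territory code) pairs.
# Assumption: hand-picked for this example, not real CLDR or Ethnologue data.
PAIRS = [
    ("ur", "PK"),
    ("nb", "NO"),
    ("nn", "NO"),
    ("pa", "PK"),
]


def group_by_territory(pairs):
    """Group language codes under the territory they are listed for."""
    grouped = defaultdict(list)
    for language, territory in pairs:
        grouped[territory].append(language)
    # Sort territories and the languages within each one for stable output.
    return {
        territory: sorted(languages)
        for territory, languages in sorted(grouped.items())
    }


if __name__ == "__main__":
    for territory, languages in group_by_territory(PAIRS).items():
        print(territory, ", ".join(languages))
```

With a structure like this, a translation list could be presented per
territory instead of as one flat list of languages. Deciding which pairs
belong in the data is the hard part, as discussed above.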