Bug#910548: base-files - please consider adding /usr/share/common-licenses/Unicode-Data

2021-04-01 Thread Russ Allbery
Paul Hardy  writes:

> I recently formatted the Unicode Data license for the d/copyright file
> of a Debian package that I created.  I thought I would offer it to
> Debian if you are interested.  You probably do not want the Copyright
> stanza, and you might not want the Comment stanza, but I erred on the
> side of too much rather than too little.

> Unicode data files are used in a number of free software packages, such
> as linux-libc-dev and the Linux kernel itself.  Use of Unicode data in
> software is likely to continue growing over time.  Thus you might find
> this useful.

Hi Paul,

It looks like you included the entire statement from the web site, which I
think is intended to cover the whole web site.  As near as I can tell, the
files that Debian is packaging are the ones referenced by this stanza:

4. Further specifications of rights and restrictions pertaining to the
   use of the particular set of data files known as the "Unicode
   Character Database" can be found in the License.

and therefore appear to only be covered by this license instead:

https://www.unicode.org/license.html

The full license that you formatted includes a bunch of other clauses like
choice of venue and unilateral license changes that I don't think are
intended to cover the things that we're packaging.  Should we therefore
consider incorporating only the license text linked above instead?

Scott Kitterman  writes:

> According to my wrangling of codesearch.debian.net, unicode.org gets
> mentioned in over 1,000 packages and it's mostly about this data.  I
> think that's enough to merit inclusion in common-licenses.

Could you provide more detail on your search?  I searched for:

path:debian/copyright DOWNLOADING, INSTALLING, COPYING OR OTHERWISE USING UNICODE INC

and only found 26 packages.  I'm not sure that's enough to warrant
inclusion in common-licenses.

-- 
Russ Allbery (r...@debian.org)  



Bug#910548: base-files - please consider adding /usr/share/common-licenses/Unicode-Data

2019-01-30 Thread Scott Kitterman
On Sat, 26 Jan 2019 08:47:52 -0800 Paul Hardy  wrote:
> Unicode's new version for 2019 is attached, with data files in
> http://www.unicode.org/ivd/data/ explicitly mentioned as covered under
> the license.  The source text is at
> http://www.unicode.org/copyright.html.

According to my wrangling of codesearch.debian.net, unicode.org gets mentioned 
in over 1,000 packages and it's mostly about this data.  I think that's enough 
to merit inclusion in common-licenses.

Scott K



Bug#910548: base-files - please consider adding /usr/share/common-licenses/Unicode-Data

2019-01-26 Thread Paul Hardy
Unicode's new version for 2019 is attached, with data files in
http://www.unicode.org/ivd/data/ explicitly mentioned as covered under
the license.  The source text is at
http://www.unicode.org/copyright.html.

Thanks,


Paul Hardy


Unicode-Data
Description: Binary data




Bug#910548: base-files - please consider adding /usr/share/common-licenses/Unicode-Data

2018-10-19 Thread Paul Hardy
On Thu, Oct 18, 2018 at 11:22 PM Russ Allbery  wrote:
>
> ...It therefore probably isn't meaningfully under
> any license, although including the Unicode Data License in the package
> doesn't hurt.

That was my thinking.  Plus it would have been acutely embarrassing to
run afoul of the Unicode Consortium's license terms, even if
unintentionally, after they so kindly gave me a lifetime membership
gratis[1]. :-)

> Disclaimer: I'm not a lawyer, let alone a copyright lawyer.

Me neither, so I erred on the side of caution.

Thanks,


Paul Hardy

[1] http://unicode.org/consortium/members.html



Bug#910548: base-files - please consider adding /usr/share/common-licenses/Unicode-Data

2018-10-19 Thread Russ Allbery
Paul Hardy  writes:

> I understand your intention, but it's not that straightforward.  The
> data that I saw in Debian packages I looked through used various pieces
> of property data from various files from the Unicode Consortium within
> pre-built arrays also containing other data, though I didn't look
> through all packages that used Unicode data by any means.

> In my case, I used Unicode code point descriptions in the comment fields
> of lex patterns (flex on Debian) in my beta2uni program (part of my
> unibetacode package), which converts Beta Code to Unicode.  Here are a
> few such lines of code:

> \*\/[Aa] print_pattern (yytext, 0x0386);  /* GREEK CAPITAL LETTER ALPHA WITH TONOS */
> \*\/[Ee] print_pattern (yytext, 0x0388);  /* GREEK CAPITAL LETTER EPSILON WITH TONOS */
> \*\/[Hh] print_pattern (yytext, 0x0389);  /* GREEK CAPITAL LETTER ETA WITH TONOS */
> \*\/[Ii] print_pattern (yytext, 0x038A);  /* GREEK CAPITAL LETTER IOTA WITH TONOS */
> \*\/[Oo] print_pattern (yytext, 0x038C);  /* GREEK CAPITAL LETTER OMICRON WITH TONOS */
> \*\/[Uu] print_pattern (yytext, 0x038E);  /* GREEK CAPITAL LETTER UPSILON WITH TONOS */
> \*\/[Ww] print_pattern (yytext, 0x038F);  /* GREEK CAPITAL LETTER OMEGA WITH TONOS */
> etc.

It's not directly relevant to this bug report, since the data in question
in some of these packages may be of a different nature, but it might be
worth pointing out that data such as what you show above is probably not
copyrightable under at least US law (and I think that exception is fairly
common internationally).  It therefore probably isn't meaningfully under
any license, although including the Unicode Data License in the package
doesn't hurt.

Disclaimer: I'm not a lawyer, let alone a copyright lawyer.

-- 
Russ Allbery (r...@debian.org)   



Bug#910548: base-files - please consider adding /usr/share/common-licenses/Unicode-Data

2018-10-18 Thread Paul Hardy
Josh,

I understand your intention, but it's not that straightforward.  The
data that I saw in Debian packages I looked through used various
pieces of property data from various files from the Unicode Consortium
within pre-built arrays also containing other data, though I didn't
look through all packages that used Unicode data by any means.

In my case, I used Unicode code point descriptions in the comment
fields of lex patterns (flex on Debian) in my beta2uni program (part
of my unibetacode package), which converts Beta Code to Unicode.  Here
are a few such lines of code:

\*\/[Aa] print_pattern (yytext, 0x0386);  /* GREEK CAPITAL LETTER ALPHA WITH TONOS */
\*\/[Ee] print_pattern (yytext, 0x0388);  /* GREEK CAPITAL LETTER EPSILON WITH TONOS */
\*\/[Hh] print_pattern (yytext, 0x0389);  /* GREEK CAPITAL LETTER ETA WITH TONOS */
\*\/[Ii] print_pattern (yytext, 0x038A);  /* GREEK CAPITAL LETTER IOTA WITH TONOS */
\*\/[Oo] print_pattern (yytext, 0x038C);  /* GREEK CAPITAL LETTER OMICRON WITH TONOS */
\*\/[Uu] print_pattern (yytext, 0x038E);  /* GREEK CAPITAL LETTER UPSILON WITH TONOS */
\*\/[Ww] print_pattern (yytext, 0x038F);  /* GREEK CAPITAL LETTER OMEGA WITH TONOS */
etc.

I used the utf8gen program (another package that I wrote and then
debianized) to create those lines of code, typing in the regular
expressions myself by hand after utf8gen did the monotonous work of
printing everything to the right of those patterns on each line for me
from data that I had pre-extracted from a Unicode data file.
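
A minimal sketch of that generation step, for illustration only.  It is
not utf8gen's actual code or interface.  It assumes Python's standard
unicodedata module as a stand-in for the pre-extracted Unicode data file,
and the helper name action_line is hypothetical.  The regular expression
to the left of each action would still be typed by hand.

```python
import unicodedata


def action_line(code_point: int) -> str:
    """Return a lex action followed by a comment naming the code point.

    Hypothetical helper: the character name comes from Python's bundled
    copy of the Unicode Character Database, not from utf8gen.
    """
    # unicodedata.name() raises ValueError for code points without a name.
    name = unicodedata.name(chr(code_point))
    return f"print_pattern (yytext, 0x{code_point:04X});  /* {name} */"


if __name__ == "__main__":
    # The seven Greek capital letters with tonos shown in the message above.
    for cp in (0x0386, 0x0388, 0x0389, 0x038A, 0x038C, 0x038E, 0x038F):
        print(action_line(cp))
```

Each printed line is the right-hand part of one pattern rule, so the
Unicode name stays on the same source line as the hand-written regular
expression.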

I had to have the Unicode names in front of me to type the correct
regular expression for each code point.  The way I did that also will
help me or anyone else debug the program in the future.

Were I to attempt to pull such comment strings from another package on
the fly, I would have to write a program that knew which lines in my
source code needed those comment strings, fetch them from said
external package, and create a new source code file for lex/flex
before building the final program.  Apart from the most obvious
immediate inconveniences of doing that, two others come to mind:

1) I could not then produce the source file in final form without
running on a distro such as Debian that implemented a packaging
scheme, or providing complicated build instructions for an end user
(most likely a student of ancient Greek who would not have deep
knowledge of building software packages).  As implemented, my
unibetacode package builds and installs on many distros just the way
it is, including on non-GNU/Linux systems thanks to the modern miracle
of GNU Autotools.

2) I would have to perform such a partial build just to read the
comments that I intended for debugging (and I would have had to resort
to an external table while typing in the generating regular
expressions rather than having them conveniently on the same line of
code).

There would also be the impracticality of telling such groups as the
Linux kernel developers and other upstream teams that they must switch
to using the Unicode package that Debian provides for their future
builds.


OTOH, packaging the Unicode data files could be useful for other,
unrelated purposes.  Of course, such a package would be one more
instance of the need for the Unicode Consortium's license and
(lengthy) copyright information in yet one more package's
debian/copyright file. :-)

Yet that still doesn't answer the question of whether Debian would
find such a common file of Unicode license and copyright terms useful.
The text is there if Debian makes that decision.  If not, at least I
took the time to make it available.

Thanks,


Paul Hardy



Bug#910548: base-files - please consider adding /usr/share/common-licenses/Unicode-Data

2018-10-18 Thread Josh Triplett
On Sun, 7 Oct 2018 16:25:39 -0700 Paul Hardy  wrote:
> Package: base-files
> Severity: wishlist
> Tags: patch
> 
> Hello,
> 
> I recently formatted the Unicode Data license for the d/copyright file
> of a Debian package that I created.  I thought I would offer it to
> Debian if you are interested.  You probably do not want the Copyright
> stanza, and you might not want the Comment stanza, but I erred on the
> side of too much rather than too little.
> 
> Unicode data files are used in a number of free software packages,
> such as linux-libc-dev and the Linux kernel itself.  Use of Unicode
> data in software is likely to continue growing over time.  Thus you
> might find this useful.

Duplicating such data across multiple packages does not seem like a
feature, and there is certainly not enough duplication to justify a
common-licenses entry. I would hope that most such uses could pull in
these data files from a common package.



Bug#910548: base-files - please consider adding /usr/share/common-licenses/Unicode-Data

2018-10-18 Thread Santiago Vila
reassign 910548 debian-policy
thanks

On Sun, 7 Oct 2018, Paul Hardy wrote:

> Package: base-files
> Severity: wishlist
> Tags: patch
> 
> Hello,
> 
> I recently formatted the Unicode Data license for the d/copyright file
> of a Debian package that I created.  I thought I would offer it to
> Debian if you are interested. [...]

Hello. According to /usr/share/doc/base-files/README, the decision to
include a license or not is delegated to the Debian Policy Group, so
I'm reassigning this bug.

Thanks.



Bug#910548: base-files - please consider adding /usr/share/common-licenses/Unicode-Data

2018-10-07 Thread Paul Hardy
Package: base-files
Severity: wishlist
Tags: patch

Hello,

I recently formatted the Unicode Data license for the d/copyright file
of a Debian package that I created.  I thought I would offer it to
Debian if you are interested.  You probably do not want the Copyright
stanza, and you might not want the Comment stanza, but I erred on the
side of too much rather than too little.

Unicode data files are used in a number of free software packages,
such as linux-libc-dev and the Linux kernel itself.  Use of Unicode
data in software is likely to continue growing over time.  Thus you
might find this useful.

Thank you,


Paul Hardy


Unicode-Data
Description: Binary data