Re: Bundled copies of the Porter stemmer library

2013-05-10 Thread Florian Weimer

On 04/17/2013 02:48 PM, Florian Weimer wrote:

I found some packages which embed copies of the Porter stemmer library
(PostgreSQL, tracker, pl, etc.).  Should I file bugs once I have the
full list, or should I apply for a bundling exception?


FYI, I'm deferring dealing with this until I've got better tool support 
for finding and comparing the clones.


--
Florian Weimer / Red Hat Product Security Team
--
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel

Re: Bundled copies of the Porter stemmer library

2013-04-19 Thread Toshio Kuratomi
On Apr 18, 2013 7:33 AM, Petr Pisar ppi...@redhat.com wrote:

 On 2013-04-17, Florian Weimer fwei...@redhat.com wrote:
  Ugh, hit Send too soon.
 
  I found some packages which embed copies of the Porter stemmer library
  (PostgreSQL, tracker, pl, etc.).  Should I file bugs once I have the
  full list, or should I apply for a bundling exception?
 
  I don't know if the existing copies are patched in significant ways.

 SWI-Prolog (the pl package) modifies the code and is based on the old
 Release 1 (Porter's current release is
 2, http://tartarus.org/martin/PorterStemmer/c.txt). I worry unbundling
 will not be easy because upstream does not provide a library, only
 plain C source (no headers, no interface), and because pl changes some
 prototypes to fit pl better and of course adds the binding
 helpers. The unified diff is now 980 lines against 383 upstream lines.


Yeah, if upstream isn't shipping this as a proper library, I'd try to get
this in as either a copylib (guidelines for that already exist) or as a
code snippet (guidelines for that don't exist yet, but the FPC has recently
granted several exceptions based on the concept).

It still needs to go through the FPC to grant the exception, for two reasons:
1) to confirm that it is in fact a copylib or code snippet, and 2) to assign
it a virtual provide, so that packages that bundle it can be notified if a
severe bug is discovered in the upstream Porter stemmer they are based on.

-Toshio

 -- Petr


Re: Bundled copies of the Porter stemmer library

2013-04-19 Thread Florian Weimer

On 04/18/2013 04:32 PM, Petr Pisar wrote:

On 2013-04-17, Florian Weimer fwei...@redhat.com wrote:

Ugh, hit Send too soon.

I found some packages which embed copies of the Porter stemmer library
(PostgreSQL, tracker, pl, etc.).  Should I file bugs once I have the
full list, or should I apply for a bundling exception?

I don't know if the existing copies are patched in significant ways.


SWI-Prolog (the pl package) modifies the code and is based on the old
Release 1 (Porter's current release is
2, http://tartarus.org/martin/PorterStemmer/c.txt).


I think you're packaging the newer Snowball-generated stemmers.  Those 
are available from http://snowball.tartarus.org/, and there are actual 
release tarballs (albeit unversioned ones).  As you can see, upstream 
has added support for additional languages which you still have to pick 
up.  This upstream development activity seems to be a fairly strong 
argument against bundling.


--
Florian Weimer / Red Hat Product Security Team

Re: Bundled copies of the Porter stemmer library

2013-04-18 Thread Petr Pisar
On 2013-04-17, Florian Weimer fwei...@redhat.com wrote:
 Ugh, hit Send too soon.

 I found some packages which embed copies of the Porter stemmer library
 (PostgreSQL, tracker, pl, etc.).  Should I file bugs once I have the
 full list, or should I apply for a bundling exception?

 I don't know if the existing copies are patched in significant ways.

SWI-Prolog (the pl package) modifies the code and is based on the old
Release 1 (Porter's current release is
2, http://tartarus.org/martin/PorterStemmer/c.txt). I worry unbundling
will not be easy because upstream does not provide a library, only
plain C source (no headers, no interface), and because pl changes some
prototypes to fit pl better and of course adds the binding
helpers. The unified diff is now 980 lines against 383 upstream lines.
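
For a rough sense of how far a bundled copy has drifted from upstream,
the diff size can be measured along the following lines. This is only a
sketch: the function name diff_stats and the sample inputs are made up
for illustration, and it uses Python's difflib rather than whatever tool
produced the numbers above.

```python
import difflib


def diff_stats(upstream: str, bundled: str) -> dict:
    """Summarize how much a bundled copy differs from the upstream source."""
    a = upstream.splitlines()
    b = bundled.splitlines()
    diff = list(
        difflib.unified_diff(a, b, fromfile="upstream", tofile="bundled", lineterm="")
    )
    # The first two lines of a non-empty unified diff are the file headers
    # ("--- upstream" / "+++ bundled"); skip them when counting changes.
    body = diff[2:]
    return {
        "upstream_lines": len(a),
        "diff_lines": len(diff),
        "added": sum(1 for line in body if line.startswith("+")),
        "removed": sum(1 for line in body if line.startswith("-")),
    }


if __name__ == "__main__":
    # Hypothetical inputs standing in for the upstream and bundled sources.
    upstream_src = "int a;\nint b;\nint c;\n"
    bundled_src = "int a;\nlong b;\nint c;\nint d;\n"
    print(diff_stats(upstream_src, bundled_src))
```

A diff that is longer than the upstream file itself, as in the pl case,
suggests the copy has been reworked rather than lightly patched.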

-- Petr


Bundled copies of the Porter stemmer library

2013-04-17 Thread Florian Weimer

--
Florian Weimer / Red Hat Product Security Team

Bundled copies of the Porter stemmer library

2013-04-17 Thread Florian Weimer
Ugh, hit Send too soon.

I found some packages which embed copies of the Porter stemmer library
(PostgreSQL, tracker, pl, etc.).  Should I file bugs once I have the
full list, or should I apply for a bundling exception?

I don't know if the existing copies are patched in significant ways.

-- 
Florian Weimer / Red Hat Product Security Team

Re: Bundled copies of the Porter stemmer library

2013-04-17 Thread Tom Lane
Florian Weimer fwei...@redhat.com writes:
 I found some packages which embed copies of the Porter stemmer library
 (PostgreSQL, tracker, pl, etc.).  Should I file bugs once I have the
 full list, or should I apply for a bundling exception?

Well, as far as postgresql goes, you'll get zero interest from upstream
in unbundling, because the Porter code isn't widely available as a
prepackaged library.  (AFAIK anyway --- has that situation changed in
the last few years?)

regards, tom lane

Re: Bundled copies of the Porter stemmer library

2013-04-17 Thread Toshio Kuratomi
On Wed, Apr 17, 2013 at 10:05:12AM -0400, Tom Lane wrote:
 Florian Weimer fwei...@redhat.com writes:
  I found some packages which embed copies of the Porter stemmer library
  (PostgreSQL, tracker, pl, etc.).  Should I file bugs once I have the
  full list, or should I apply for a bundling exception?
 
Is this an actual library or just a C file distributed from a web page?

Filing bugs is probably needed as the packages will require modification
even if the FPC grants an exception.  However, it may be good to work out
some of the details beforehand.  If an exception were granted, we'd just
need to add virtual provides to the packages.  This is a lot easier than
modifying build scripts to build against the unbundled library.  Note
that the FPC will need many details in order to decide whether to grant an
exception or not.  So they *may* need to have information about each case of
bundling in each package anyway.  (Then again, there may be information that
applies to all of them... I need to write up a draft of our code snippet
exception criteria which this might (or might not) fall under if it's just
distributed as a source file on a website.)

 Well, as far as postgresql goes, you'll get zero interest from upstream
 in unbundling, because the Porter code isn't widely available as a
 prepackaged library.  (AFAIK anyway --- has that situation changed in
 the last few years?)
 
The hard question is really whether the code is modified.  For
distributions, another dependent library is simply one more package to ship.

What usually happens in a simple case of bundling is:

1) We identify it.
2) We package the bundled library separately.
3) We add a flag to the build scripts of the bundling packages to use a
   system library instead of the bundled library (along with the cascade of
   other changes in the build scripts that this requires).
4) We submit that upstream and upstream accepts it.
5) Other distributions realize that they can now build with a system version
   of this library and they start packaging the system library to accommodate
   that (this last point works in the other direction as well... we
   sometimes get the work done by other distributions in this same manner).

There are many non-simple cases (usually revolving around the bundled
library being modified) and a few cases where upstream doesn't want our
work or the package maintainer in Fedora doesn't want to perform the work,
but those are, happily, few.

-Toshio

