Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-07-03 Thread Jeffrey Johnson

On Jul 2, 2012, at 11:21 PM, Alexey Tourbin wrote:

> On Mon, Jul 2, 2012 at 9:17 AM, Jeffrey Johnson  wrote:
>>> All RPMv4.4+ packages, that is, but not RPMv4.0. I find this "file
>>> coloring" business very annoying, by the way, and it took me some time
>>> to realize that "fc" actually stands for "file coloring". :-)
>> 
>> RPMv4.0 was a l-o-n-g time ago.
> 
> It was a long time ago, but it was not that bad. In ALT Linux we still
> use rpm-4.0 code base (with many important backports etc.). Since I've
> recently broken with ALT Linux, I cannot make more claims, you see…
> 

In all cases: whether rpm-4.0.x had file classification is irrelevant to
whether a set:versions "container" can be used for keyword storage.

>> I find multilib quite annoying: I was asked for an implementation,
>> and did so. 'Twas a job mon: already 7y since leaving @redhat, and file
>> colors for multi lib were several years before that.
> 
> I don't like how "multilib" works either. You can no longer identify
> packages by their names, and then there are special rules to resolve
> file conflicts, which basically say that the license is that you can
> swamp files as much as you want, provided that you got the first hand
> in that strange "x86-64" relationship. There is than that recent "x32"
> stuff where you can run in 64-bit mode using only 32-bit pointers. How
> can you address THAT? To me, the world is declining, like the Roman
> Empire. ;-)
> 

Dependency graphs based solely on package names are overrated, particularly
when there are multiple naming schemes around, and packages are continuously
renamed, and so the benefits of consistent naming are already a pleasant lie.

With file colors, the test for x32 elf "magic" sets a file color bit and nothing
else in rpm needs changing.

The existing preference configuration to resolve executable (any non-elf 
executable
file conflict is a different issue) automates the resolution in 2 locations. 
Might
need a tweak to handle secondary chjoices (i.e. when neither of the conflicting
files is the preferred arch) … *shrug*


>>> I see no reason why rpm(1) should ever consider any preferences. To
>>> me, rpm(1) is exactly black-and-white thing, a watchdog which checks
>>> logical assertions. Things are either consistent enough to proceed
>>> (and e.g. to upgrade the packages), or not - and then you get e.g.
>>> non-zero exit code and you are forced to bail out. Higher-level logic
>>> of finding an upgrade plan anyways belongs to something like apt(1) or
>>> yum(1), although it is executed in some basic librpm terms which we
>>> must make efficient enough. I see no reason to discuss closest metrics
>>> or largest overlaps - this is as interesting as irrelevant to our
>>> basic tasks.
>> 
>> Everyone has an opinion: yours is particularly brittle but
>> entirely logical. Now go persuade everyone else that your
>> opinion is The One True Opinion and RPM will surely change too.
>> 
 There's a similar application with dual/triple/... licensed software and
 computing
 per-file, not per-package, license affinity precisely where set:versions 
 (or
 Bloom
 filters) will represent keywords (like "LGPLv2" or "BSD") easily. Licenses
 unlike
 file(1) magic keywords will require name space administration. SUrely LSB
 and LFF
 are seeking something useful to do for RPM packaging these days, and might
 be convinced to make some set of license tokens "standard" so that license
 affinity can be precisely computed in distributed software.
>>> 
>>> You then discuss more applications which are largely irrelevant to our
>>> basic tasks. (I realize that I'm revisiting and older discussion,
>>> which might not be completely fair because our understanding might
>>> have evolved since.)
>> 
>> Um … what are "… our basic tasks."?
> 
> Our basic task, is that when we feed packages to rpm(1), it must
> quickly decide whether the upgrade is feasible or not. Of course this
> involves hard and sometimes speculative considerations whether things
> are going to work. But if things are definitely not going to work, rpm
> should upgrade never! Not without a special flag which reads
> '--upgrade-as-i-wish'.
> 

We disagree on several points here:
1) upgrade decisions based solely on version information do not handle 
rebuilds.
2) FULLSTOP "never!" behavior when something breaks makes no forward
progress when there's some minor issue. A "best effort" is generally a 
better approach
even in the face of occasionally making matters worse.
3) adding Yet Another Disabler that noone knows how to use solves no 
meaningful problem.

> What is not part of "our basic tasks" is to find an upgrade plan. How
> they do that - it's another business, and completely another story. We
> only check if the upgrade is consistent or not.
> 

Assuming transitive closure on dependency assertions isn't all that is needed.

>> I cannot determine what is "fair" wit

Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-07-02 Thread Alexey Tourbin
On Mon, Jul 2, 2012 at 9:17 AM, Jeffrey Johnson  wrote:
>> All RPMv4.4+ packages, that is, but not RPMv4.0. I find this "file
>> coloring" business very annoying, by the way, and it took me some time
>> to realize that "fc" actually stands for "file coloring". :-)
>
> RPMv4.0 was a l-o-n-g time ago.

It was a long time ago, but it was not that bad. In ALT Linux we still
use rpm-4.0 code base (with many important backports etc.). Since I've
recently broken with ALT Linux, I cannot make more claims, you see...

> I find multilib quite annoying: I was asked for an implementation,
> and did so. 'Twas a job mon: already 7y since leaving @redhat, and file
> colors for multi lib were several years before that.

I don't like how "multilib" works either. You can no longer identify
packages by their names, and then there are special rules to resolve
file conflicts, which basically say that the license is that you can
swamp files as much as you want, provided that you got the first hand
in that strange "x86-64" relationship. There is than that recent "x32"
stuff where you can run in 64-bit mode using only 32-bit pointers. How
can you address THAT? To me, the world is declining, like the Roman
Empire. ;-)

>> I see no reason why rpm(1) should ever consider any preferences. To
>> me, rpm(1) is exactly black-and-white thing, a watchdog which checks
>> logical assertions. Things are either consistent enough to proceed
>> (and e.g. to upgrade the packages), or not - and then you get e.g.
>> non-zero exit code and you are forced to bail out. Higher-level logic
>> of finding an upgrade plan anyways belongs to something like apt(1) or
>> yum(1), although it is executed in some basic librpm terms which we
>> must make efficient enough. I see no reason to discuss closest metrics
>> or largest overlaps - this is as interesting as irrelevant to our
>> basic tasks.
>
> Everyone has an opinion: yours is particularly brittle but
> entirely logical. Now go persuade everyone else that your
> opinion is The One True Opinion and RPM will surely change too.
>
>>> There's a similar application with dual/triple/... licensed software and
>>> computing
>>> per-file, not per-package, license affinity precisely where set:versions (or
>>> Bloom
>>> filters) will represent keywords (like "LGPLv2" or "BSD") easily. Licenses
>>> unlike
>>> file(1) magic keywords will require name space administration. SUrely LSB
>>> and LFF
>>> are seeking something useful to do for RPM packaging these days, and might
>>> be convinced to make some set of license tokens "standard" so that license
>>> affinity can be precisely computed in distributed software.
>>
>> You then discuss more applications which are largely irrelevant to our
>> basic tasks. (I realize that I'm revisiting and older discussion,
>> which might not be completely fair because our understanding might
>> have evolved since.)
>
> Um … what are "… our basic tasks."?

Our basic task, is that when we feed packages to rpm(1), it must
quickly decide whether the upgrade is feasible or not. Of course this
involves hard and sometimes speculative considerations whether things
are going to work. But if things are definitely not going to work, rpm
should upgrade never! Not without a special flag which reads
'--upgrade-as-i-wish'.

What is not part of "our basic tasks" is to find an upgrade plan. How
they do that - it's another business, and completely another story. We
only check if the upgrade is consistent or not.

> I cannot determine what is "fair" without knowing what is being compared …
>
>> Anyway, set-versions are not the "next big thing" with plenty of
>> applications. It's rather a very boring stuff which nevertheless
>> answers the question "how we can possibly enhance ABI compatibility
>> control beyond sonames". The answer is that we must involve into
>> set/subset testing - that's the model, that it is very expensive, and
>> that the only reasonable and possibly the best way to go is to replace
>> symbols with numbers, and to treat sets of numbers as special kind of
>> versions. Now why is that? But that's a much better perspective for
>> discussion.
>
> We differ in usage cases. I see a "container", you see a "version".

I realize that the "version" is somehow a contrived concept behind
really a containter. Like: - Whats' that? - It's a version. - Oh my
gosh!  But then again, if you try to organize your thinking beyond
sets, bytes, and characters, the concept of "version" pops up pretty
naturally. If we are going to satisfy some requirement, that's good
for "version".

> There are many usage cases for subset operations in "package management"
> no matter what we think or how the operations are implemented and represented.

Not all subset operations are equally important, or complex. As I
said, sometimes you should make easy things easy, and simply use
 which is Rb-tree or something like this. Sometimes things
go off, though. Like this: - How many symbols do you want to encode? -
10M. - That'

Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-07-01 Thread Jeffrey Johnson

On Jul 1, 2012, at 11:43 PM, Alexey Tourbin  wrote:

> On Sat, Jun 23, 2012 at 10:29 PM, Jeffrey Johnson  wrote:
>> In the interest of getting off negative nerdy obscure discussions, let's
>> try a positive alternative application for Golob-Rice subset operations.
>> 
>> All RPMv4 packages attach (a lightly filtered) file(1) magic string to
>> every file.
> 
> All RPMv4.4+ packages, that is, but not RPMv4.0. I find this "file
> coloring" business very annoying, by the way, and it took me some time
> to realize that "fc" actually stands for "file coloring". :-)
> 

RPMv4.0 was a l-o-n-g time ago.

I find multilib quite annoying: I was asked for an implementation,
and did so. 'Twas a job mon: already 7y since leaving @redhat, and file
colors for multi lib were several years before that.

And the "fc" stands for "file classifier". But feel free
to claim whatever you want just like everyone else.


>> The file(1) data is mostly usable as a "keyword" namespace exactly as is.
>> Yes there are flaws: however magic strings are from file(1) is about
>> as good as any other de facto keyword tagging of file content.
>> 
>> keywords are strings just like elf symbols are, and set:versions (or Bloom
>> filters)
>> are a compact representation from which its rather easy to do subset
>> computations.
>> 
>> One extension that would be needed is a "closest" metric in order to
>> "prefer"
>> the largest subset overlap: with set:versions any contained subset will
>> satisfy the
>> logical assertions, and there's no easy way to prefer the larger sub-set.
> 
> I see no reason why rpm(1) should ever consider any preferences. To
> me, rpm(1) is exactly black-and-white thing, a watchdog which checks
> logical assertions. Things are either consistent enough to proceed
> (and e.g. to upgrade the packages), or not - and then you get e.g.
> non-zero exit code and you are forced to bail out. Higher-level logic
> of finding an upgrade plan anyways belongs to something like apt(1) or
> yum(1), although it is executed in some basic librpm terms which we
> must make efficient enough. I see no reason to discuss closest metrics
> or largest overlaps - this is as interesting as irrelevant to our
> basic tasks.

Everyone has an opinion: yours is particularly brittle but
entirely logical. Now go persuade everyone else that your
opinion is The One True Opinion and RPM will surely change too.

> 
>> There's a similar application with dual/triple/... licensed software and
>> computing
>> per-file, not per-package, license affinity precisely where set:versions (or
>> Bloom
>> filters) will represent keywords (like "LGPLv2" or "BSD") easily. Licenses
>> unlike
>> file(1) magic keywords will require name space administration. SUrely LSB
>> and LFF
>> are seeking something useful to do for RPM packaging these days, and might
>> be convinced to make some set of license tokens "standard" so that license
>> affinity can be precisely computed in distributed software.
> 
> You then discuss more applications which are largely irrelevant to our
> basic tasks. (I realize that I'm revisiting and older discussion,
> which might not be completely fair because our understanding might
> have evolved since.)
> 

Um … what are "… our basic tasks."?

I cannot determine what is "fair" without knowing what is being compared …

> Anyway, set-versions are not the "next big thing" with plenty of
> applications. It's rather a very boring stuff which nevertheless
> answers the question "how we can possibly enhance ABI compatibility
> control beyond sonames". The answer is that we must involve into
> set/subset testing - that's the model, that it is very expensive, and
> that the only reasonable and possibly the best way to go is to replace
> symbols with numbers, and to treat sets of numbers as special kind of
> versions. Now why is that? But that's a much better perspective for
> discussion.

We differ in usage cases. I see a "container", you see a "version".

There are many usage cases for subset operations in "package management"
no matter what we think or how the operations are implemented and represented.

73 de Jeff

> __
> RPM Package Managerhttp://rpm5.org
> Developer Communication Listrpm-devel@rpm5.org

__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-07-01 Thread Alexey Tourbin
On Sat, Jun 23, 2012 at 10:29 PM, Jeffrey Johnson  wrote:
> In the interest of getting off negative nerdy obscure discussions, let's
> try a positive alternative application for Golob-Rice subset operations.
>
> All RPMv4 packages attach (a lightly filtered) file(1) magic string to
> every file.

All RPMv4.4+ packages, that is, but not RPMv4.0. I find this "file
coloring" business very annoying, by the way, and it took me some time
to realize that "fc" actually stands for "file coloring". :-)

> The file(1) data is mostly usable as a "keyword" namespace exactly as is.
> Yes there are flaws: however magic strings are from file(1) is about
> as good as any other de facto keyword tagging of file content.
>
> keywords are strings just like elf symbols are, and set:versions (or Bloom
> filters)
> are a compact representation from which its rather easy to do subset
> computations.
>
> One extension that would be needed is a "closest" metric in order to
> "prefer"
> the largest subset overlap: with set:versions any contained subset will
> satisfy the
> logical assertions, and there's no easy way to prefer the larger sub-set.

I see no reason why rpm(1) should ever consider any preferences. To
me, rpm(1) is exactly black-and-white thing, a watchdog which checks
logical assertions. Things are either consistent enough to proceed
(and e.g. to upgrade the packages), or not - and then you get e.g.
non-zero exit code and you are forced to bail out. Higher-level logic
of finding an upgrade plan anyways belongs to something like apt(1) or
yum(1), although it is executed in some basic librpm terms which we
must make efficient enough. I see no reason to discuss closest metrics
or largest overlaps - this is as interesting as irrelevant to our
basic tasks.

> There's a similar application with dual/triple/... licensed software and
> computing
> per-file, not per-package, license affinity precisely where set:versions (or
> Bloom
> filters) will represent keywords (like "LGPLv2" or "BSD") easily. Licenses
> unlike
> file(1) magic keywords will require name space administration. SUrely LSB
> and LFF
> are seeking something useful to do for RPM packaging these days, and might
> be convinced to make some set of license tokens "standard" so that license
> affinity can be precisely computed in distributed software.

You then discuss more applications which are largely irrelevant to our
basic tasks. (I realize that I'm revisiting and older discussion,
which might not be completely fair because our understanding might
have evolved since.)

Anyway, set-versions are not the "next big thing" with plenty of
applications. It's rather a very boring stuff which nevertheless
answers the question "how we can possibly enhance ABI compatibility
control beyond sonames". The answer is that we must involve into
set/subset testing - that's the model, that it is very expensive, and
that the only reasonable and possibly the best way to go is to replace
symbols with numbers, and to treat sets of numbers as special kind of
versions. Now why is that? But that's a much better perspective for
discussion.
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-07-01 Thread Alexey Tourbin
On Sat, Jun 23, 2012 at 10:10 PM, Jeffrey Johnson  wrote:
>> So I suggest that set-strings must cover all usages of set-subset
>> computations, where static data structures are appropriate (so that
>> you compute it, write into RPM header, and do not expect to modify
>> it). it is still "efficient", despite the fact that we try hard to
>> achieve a better string representation. The fact that it can be
>> packaged nicely does not make it any worse in other respects, except
>> that obviously we can stuff somewhat less bits into alnum characters.
>> But I also wanted it to look "nice" (this is why base62 was devised
>> over much simpler base64 encoding). But so, it looks nice, and it is
>> efficient. Why are you unhappy? Do you necessarily require that you
>> have to do bitmask tests yourself?
>
> I'm not "unhappy". Rather I'm pointing out that -- if your primary
> goal is bandwidth reduction in *.rpm package metadata -- that there are
> equivalent savings that aren't from "pretty" encodings.

Okay, the primary goal was not so much to minimize bandwidth, but just
to make things "nice" overall. But again, realize that set-versions
are O(N) in size, where N is the number of symbols. For example, here
is the Provides of perl-base package in ALT Linux:

$ rpm -q --provides perl-base
perl = 1:5.14.2
perl-version = 0.88
perl-PerlIO = 1:5.14.2
perl-Storable = 1:5.14.2
perl-Digest-MD5
perl-Time-HiRes
perl-MIME-Base64
libperl-5.14.so()(64bit) =
set:odiA4ZlKzevK5y39eQsWZckcCoxsZqg4k4MIETgeNn0RxqjzEscOAdkenBZ7a373IGkMzlRZcFtq8vCvMeJMgzVyfiFR61GhMIUZ81a3aDJtilWwCksMTw3vQxUgc4x1QMFnA2h8lmdZpJ4jHU4wlFn4W2wHpFabuLlqZiIyviaTYTPl1PjrcIm4iZdvp5fOXR9PKnYQyfo6w1AOS7V5qO5ND9AGGby7LRkpBYZuOEZhvKfx9ZEGiSTLG9vx8ka0dtcUm5WXZtFUvjG92kZwnUjHVWc9NZapw9ocaec37K7rO7N4rl3mGGxh21z3bE4FdZ6RI6k52BrCvogEqZqeVZLcBZ9tcAbHq6ObepE2vXR9TQODoNkoeyaOQGbrDchblKzS8wBPTGtnv5Ze8K9FAsQN2dsQxAJ2hD2NOIL1ROhhFydp27xpqCYssiowuiYfJtCdofpKW9qohptrbV1pmkjVb4kI6lhKHZb5ae7p4u65NodjOSHmOw81L7fjwZjoK2uwH5rACj7NxXW9kk10uOsM6EJdRY4nrZqYbkKN2G1gQvUtZzqxPMjWgkh3P9LmcluzYzgCfZoexmZxQP4nRLYuemEWzHXkX1Ys5wqbfMZDxBbi7PtkohqPCqcB4f6zurrib4y7aUHs6aqCiVRHGDdR4thBgZFjpXDxIDN4X2QjAOjRQZhNzilkpBXmCgy75JDndDdw1DCMEZkmfgAnLU4rUEDIZ2pUvC5q3S2lMtq2Zj17f2xDL80cyg11BFQ8thh9iR2SIXCB69Jrda3E7ArKzyXYci6KlHo1jfvWP6Q2lre8VQCGQZoH8qDo7vaq2MGlNFgtgJGQgTGVx0CKZAAhlijag1ZKUGmgqFOdprCVtuZc8IoZBsuiY1DKZyS6e1HPHGYrZcOcd4KNcminIr9spwghADXzujggTmVsCutsEh6LLbjD2boZan6ycEnBMoFsbyIoz5ggnxFKIjuJyZvM6voYIs2qurMGOCoeB2csGGtTV6b82ItcZ2kj7mbamrZsQscXQ8EUT7lflT6IamZAZLZx7GTKUaZjwPyUltZ6gHWKIvw5wbNQMZGvOU6F43xdP7M2lJ0R9g3cVd4oiQig2F8K0AUdnwXz7LVPbjJMk8iBytXGntBKoPI6G3ahWHgSvmhGVVdNJeNXZl0AkcxhvhFqRujtasuOxCEz8WJCLZaO642OhFjtOVZliv4b4Y34Uh2vftwCo1vXxEJxtVVek42wG9La5uiLoz9o77UfQz4pu0yvZmTR8rs1X05ykr61bK73IOzGuMFjfKxQX2uUUL7zkl661NxWK8VR04bP7p1GVLn7SWPfXyfm2MbTZG8ytPzT2vDJYjmKTi6hRJ9RFPpZegntTk7uuL7ahsSej3aiBbEWVKYXWhkntIZ3YfDPOzRxosdgSKMNRY1llmBvZk5JlN3ETltMlAI7v7bq24bCvG9iGyCPyoUoKVLYpL8cycfepxYcgm4ioyz8zt3ZkAMQi4iZfs2zbd34VnocYTB6cyEZxR67x0i7wQRP7VrwDilZfniZgmdH5pAsWxXX0auKdd1nSN93XGZECEQP4nKKWCLwjYVjjAB1iRBohqEbRBstuQLCBMEL8OLapxqEYozeGlFMoSbo6eV6lCFuOKbYwvN6mjofKRtjavsU4hYm2ArDYgZ8alF8tZCw1BFvmhI0zL4w2un5m3S3G2tcZpWdVODLSi3IB4pQyR755EB7zUaPCF8LafHzDKHBSKEZohdNEooVN3lY5DnGpz25rW38dU8TXRG2pckDnGb5LVBVYpxK7VsybdZGsy5JP8Alk1Mf5O4zFmGcpZLs172YECGj3f8JzUcpknwF2PWxqRZgnZapssvh3CIOl8fA1bfZiUGMFg1eu62GXYlWiqWaA9g627YjeQkGdz9AhZCVECGOqNEZFX2Jy6O23lHvgJ2ylDSwi2RJGZdl1ckGMVwZ5z3uKJ207KVuQWyeCv7Ne4wxyWj9JgP0Z2PEmyXsan5QBFEZq323vZmlRyhhmPoPDle2zyxxT7AZ3vL42k3MLOzNPtGpZcTgIJfhy2H38fUU0YIFNpi9ygZGg5NBZdu1XL0NWesCbk2PapUsltirZ7qo6Jwbg2UkwsgEaVsclZ2EPrD5brplmPl9DHlrgLcAZCaNIG9HIFSfZErD7U0tJob9JvoHKZogXZgtaH0UkZjrpma2lCxPpj9qdZdh4gli6C9mqCKElC2NjZKh9T0Uw1ddLDYNmvPB971Xyzx9qSNQ99acW1uP36D7qgjyPsf9qO1ZDqA9fZuJ5oEkNIpIoatW0f157Sp3bTZ7o58XhKir4N8PnJIfj3P4NQjtrjW85vAEHU9B2dZACzNfNACZn7TO2AyRBmnPn6XWYKoAEW9UiH8Z12Z5OBMa2ovftCZJ3Zpqoc0GG9kpGjXEZCN0ypGqk4lnYTCL4X8i1B2AabsfEG100r2HQWbjoIrM1LtGBAqSuPAPwIEVnaSqAvQFfCi9xCHS3EKRSZa9EsDZktcKHply2cMb5ldXl06eDg3i1xHM9uPTvm7laFyWorkUMcCZiV5d0tlXzFuiDjA3pOsOjUFmgAzPNJK17LfQSHA4avpaLbZ4lGSPRdQWOOKBxGDgOYMnK2Z11iom0XUBCvybJ2POnO81yDKHCEtVHJE4ZKTMiruLDjAe876HiwstanPxJ0m6ZeYmWgraCbEF7iaQZoZkgkwHS8ONi1OXSBF7crddfeYDwWLZk9w4lRr5lj5b3bZx2DFkDwk1IwczPHksFPHGgM8hSnBhfBXpcRLalykEkN9p0LGi6u5Zzs7q7dLLOPRrRLNtaz83XPT1ytZAbLKy7dKfRzTBQ314Zbe7tf8HZwplZuD7ZJZi2Juyxq8JdDPBRCJYgD8ZoDjoNbfrnxv7oJ60T8OyKLvkuLjFnWnHZd1M3OwsOM4uDRX3hk8OsAzsQSehIcdLYzRjYAloGknyZGmDCQLMlfpzzz18aIP9BMhLr364JPk6sSFK4qDU8EMcHGgYUWRUaJE9HnLTs9TzLxTveQnBch7PRnniDwgh2hUF0sDUbQmR9wRcDHtZD5p4QZBHaJYKWpdRtiL1nSsNEzvsMZ9PI05mLLNmZIfluzk3mopgeSQWtQGoiHzmRlNFnrhy5AxC9hHegoVllHAHc8VZrOZyNnBZD9b3Vaqrybs1uveBSfZG3mohKqO2WbTUjg3FXD1kkQPIWA0Z2Z3d6Q77wriiou1ZldJgnkru5WtLsY5YAvwTMdFzoa82XjjDRQwCXnzyRRuXqLF6TGDcrEZIAlAtgs8qgZr10JIaHcp7ZiG4oZpCP8nBCCWGxkBOrue2foa1QoJ30LXN7tMFc5U3wuIoXZmSIyBApsgPkgk5
perl(AutoLoader.pm) = 5.710
perl(B.pm) = 1.290
perl(Benchmark.pm) = 1.120
perl(Carp.pm) = 1.200
perl(Carp/Heavy.pm)
perl(Class/Struct.pm) = 0.630
perl(Config

Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-28 Thread Jeffrey Johnson

On Jun 28, 2012, at 1:27 PM, Alexey Tourbin wrote:

> 
> The downside is, of course, that when a dependency R subset P is
> broken, it is not easy to find out which P symbols were deleted or
> renamed (or which R symbols are missing).  But this is largely a
> developer's, or should I say a hacker's, problem.
> 

Calculating set union/disjunction (for Bloom filters, surely almost as
easy with set:versions) for P/R "sets"
union   P | R
intersectionP & R
is natural/rapid/easy even if the symbols are unknown.

FWIW: the "arbitrary tags" implemented @rpm5.org have a similar
one-way lossy mapping where the string representation of the
tag name cannot be unmapped from the 30 bit (truncated SHA1) tag number.

The lack of mapping invertibility matters little to "batch oriented" installer 
usage cases
in RPM libraries even if "enquiring minds" wish concise/complete/informative 
error
messages for diagnosis of problems.

Error messages (and symbols) are overrated and less useful with moderately
sized token dictionaries imho.

I hope to get around to generating set:versions this weekend. I was almost there
10 days ago, but got side-tracked by 1 week's worth of cable modem issues which
has now led to postponed backup/upgrade chores … almost finished.

73 de Jeff__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-28 Thread Alexey Tourbin
On Thu, Jun 28, 2012 at 8:18 PM, Alexey Tourbin
 wrote:
> There is also a philosophical consideration which somehow accompanies
> this practical consideration. There is a short story, I believe by
> Borges, where a clever scientist devises a 1-1 map of reality. A 1-1
> map of reality turns out to be a very true picture of reality. What's
> wrong with this approach? Well, a 1-1 map of the world turns out to be
> an exact copy of the world, which is of no use in terms of being "a
> map". Somehow, the reality must be "construed" and "reduced" to a
> simpler (and somewhat coarser) description to become a useful model.
> This is also why we don't plug ELF sections into RPM headers: we
> believe that much simpler (or at least much shorter) dependencies must
> be used to represent ELF binaries in terms of their
> interconnectedness, and must also omit other less important details.

This philosophical argument applies to set-versions in a
(not-so-)obvious manner, which I will now clarify. It goes like this:
although the ultimate goal is to check that R-set of symbols is a
subset of P-set of symbols, you do not necessarily have to store the
full names of the symbols in order to perform a somewhat stripped-down
check itself. When it only matters if R is subset of P, the names
themselves become largely irrelevant, provided that you can devise a
very clever substitution/encoding scheme. You can make "much simpler
(or at least shorter) dependencies" by getting rid of the names in a
manner which does not destroy the check.

The downside is, of course, that when a dependency R subset P is
broken, it is not easy to find out which P symbols were deleted or
renamed (or which R symbols are missing).  But this is largely a
developer's, or should I say a hacker's, problem.

On the other hand, from the user point of view, and also from rpm(1)
perspective, this approach simply promotes synchronous or rather
"transactional" upgrades. It says like, guys, I will not apply
half-baked updates before you fix it all - so that apps and libraries
match. Which totally makes sense!
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-28 Thread Jeffrey Johnson

On Jun 28, 2012, at 12:18 PM, Alexey Tourbin wrote:

> 
> There is also a philosophical consideration which somehow accompanies
> this practical consideration. There is a short story, I believe by
> Borges, where a clever scientist devises a 1-1 map of reality. A 1-1
> map of reality turns out to be a very true picture of reality. What's
> wrong with this approach? Well, a 1-1 map of the world turns out to be
> an exact copy of the world, which is of no use in terms of being "a
> map". Somehow, the reality must be "construed" and "reduced" to a
> simpler (and somewhat coarser) description to become a useful model.
> This is also why we don't plug ELF sections into RPM headers: we
> believe that much simpler (or at least much shorter) dependencies must
> be used to represent ELF binaries in terms of their
> interconnectedness, and must also omit other less important details.
> 

(aside)
Nice: I'll dig out the Borges story … meanwhile (iirc) there is a story (and 
surely
some sort of literary response to Borges) called
How Much Shall We Bet
in "Cosmicomics" that I pass along for your amusement.

Seriously yes: carrying redundant metadata _EXACTLY_ as is
in the interest of accurate data -- while well intended -- isn't
needed.

And both Bloom filters and set:versions are useful/compact data
stores for existence tests without a need to also carry the _EXACT_
values (if/when the false positive rate can be tolerated, which
is often true for the moderate sized sets needed in package management).

The rest I agree with.

73 de Jeff__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-28 Thread Alexey Tourbin
On Sat, Jun 23, 2012 at 7:10 AM, Jeffrey Johnson  wrote:
>> Why is it any wrong to minimize bandwidth, or, in other words, why it
>> is bad to spend less money? Your answer is like, because the meaning
>> of life is not to spend less money, which is a wrong perspective.
>> Okay, but what's a better perspective? Spending more money for less
>> good is not at all a better perspective.
>
> Nothing I said implies that bandwidth reduction is "wrong".
>
> If you want to minimize bandwidth downloading packages compress
> the metadata and deal with the "legacy compatible" issues however
> you want/need.
>
> My rule of thumb is that metadata is ~15% of the size of a
> typical *.rpm package: assuming 50% compression one might save
> 7.5% of the package download cost (0.50 * 0.15 = 7.5%).

An interesting practical matter behind set-versions is that we can
simply represent the exact set of symbols, which can be encoded and
pictured like this:

Provides: libfoo.so.0 = set:sym1,sym2,sym3,sym4
Requires: libfoo.so.0 = set:sym1,sym3
(where symN stands for direct ELF symbol name - e.g. strcmp). You
simply name the symbols which you require or provide! In fact, this is
exactly how early alpha set-versions were implemented - before I had
some time to ponder over approximate set/subset encoding problem (it
is then how sym1,sym2... sets where converted into numbers, per
symbol, and compressed). The point here is that the price of the exact
set representation may, or may not, be prohibitive. If the price is
not prohibitive, it's a no-brainer: you don't have to involve into
approximate subset business at all. Sometimes, you simply should not
use bloom filters, despite the fact that they might seem appealing.
However, if the price is prohibitive, which it was, the reason for
going into approximate subset business is also a no-brainer: you
should cut down heavily and optimize for size first. If you simply
introduced probabilities without making things less prohibitive, did
you do anything useful at all? You only spoiled things a bit!

There is also a philosophical consideration which somehow accompanies
this practical consideration. There is a short story, I believe by
Borges, where a clever scientist devises a 1-1 map of reality. A 1-1
map of reality turns out to be a very true picture of reality. What's
wrong with this approach? Well, a 1-1 map of the world turns out to be
an exact copy of the world, which is of no use in terms of being "a
map". Somehow, the reality must be "construed" and "reduced" to a
simpler (and somewhat coarser) description to become a useful model.
This is also why we don't plug ELF sections into RPM headers: we
believe that much simpler (or at least much shorter) dependencies must
be used to represent ELF binaries in terms of their
interconnectedness, and must also omit other less important details.

Back to the story of set-versions, with the "original" implementation
(which introduced full Golomb coding), it was estimated that the size
of architecture-dependent pkglist.classic.bz2 metadata is going to go
up from about 3M to about 12M, four-fold! This still was almost
prohibitive to soar up like this. It was considered non-prohibitive
only by the virtue of information-theoretical considerations: since we
are going to encode that many symbols, we must not fool ourselves into
thinking that we could somehow pay a smaller price - that is, without
violating fundamental laws.

By the way, what's the information-theoretical minimum? Say, we want
to encode 1024 20-bit hash values (which yields the false positive
rate at about 0.1%). Well, the first mathematical intuition is that we
need to cut 20-bit range into 1024 smaller stripes, which gives 10
bits per stripe, on average. It is a little bit more complicated than
that, though, exactly because of this "on average" business: we must
also take some bits to encode stripe boundaries. But this is only a
mathematical intuition. The exact formula, in R, is:

> lchoose(2**20,2**10)/log(2)/2**10
[1] 11.43581

(so, on the other hand, and somewhat unexpectedly, it is that old good
"n choose k" business.)  Current implementation lines up at about 11.6
bits per symbol. This is why sometimes I say that set-versions
currently take about 2 alnum character per Provides symbol - this is
because each alnum character can stuff about log2(62)=5.9+ bits.

But compare this to "early alpha" set-versions which represented exact
sets in form of "set:sym1,sym2,sym3,sym4", that is, in terms of
enumerating symbols. With exact sets, can you think of going to
anywhere near 2 characters per symbol, especially that you also need a
separator? This leaves you 1 character per symbol! :-)

There's a saying that Perl makes easy things easy and hard things
possible. Set-versions were designed to make hard things possible.
__
RPM Package Managerhttp://rpm5.org
Developer Communication List 

Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-23 Thread Jeffrey Johnson

On Jun 23, 2012, at 2:29 PM, Jeffrey Johnson wrote:

> 
> On Jun 23, 2012, at 2:10 PM, Jeffrey Johnson wrote:
> 
>> 
>> On Jun 23, 2012, at 1:49 PM, Alexey Tourbin  wrote:
>> 
> 
> <…>
> 
>> Perhaps making base61 encoding MANDATORY in rpm would displease
>> everyone equally: choosing a prime is as "pretty" as all other
>> encoding criteria. 
>> 
> 
> In the interest of getting off negative nerdy obscure discussions, let's
> try a positive alternative application for Golob-Rice subset operations.
> 
> All RPMv4 packages attach (a lightly filtered) file(1) magic string to
> every file.
> 
> The file(1) data is mostly usable as a "keyword" namespace exactly as is.
> Yes there are flaws: however magic strings are from file(1) is about
> as good as any other de facto keyword tagging of file content.
> 
> keywords are strings just like elf symbols are, and set:versions (or Bloom 
> filters)
> are a compact representation from which its rather easy to do subset 
> computations.
> 
> One extension that would be needed is a "closest" metric in order to "prefer"
> the largest subset overlap: with set:versions any contained subset will 
> satisfy the
> logical assertions, and there's no easy way to prefer the larger sub-set.
> 
> There's a similar application with dual/triple/... licensed software and 
> computing
> per-file, not per-package, license affinity precisely where set:versions (or 
> Bloom
> filters) will represent keywords (like "LGPLv2" or "BSD") easily. Licenses 
> unlike
> file(1) magic keywords will require name space administration. SUrely LSB and 
> LFF
> are seeking something useful to do for RPM packaging these days, and might
> be convinced to make some set of license tokens "standard" so that license
> affinity can be precisely computed in distributed software.
> 

The above suggestions were an attempt to generalize assertions to "feature 
vectors" (which
is the standard technique by which SAT solvers can start to handle 
"preference").

Meanwhile here's another application of existing set:versions "containers"
Query-by-ELF-symbol
that's perfectly obvious as well.

1) Generate tables for Provideversion/Requireversion by mentioning in the usual 
macro
that configures Berkeley DB. The table will be automagically generated lazily on
1st access (or do --rebuilddb if you insist on being a Luddite).

2) Generate a set:versions set with the symbols of interest. POPT already has a 
data type for
#define POPT_ARG_BITSET 16U+14U /*!< arg ==> bit set */
which results in a Bloom filter (set:versions can be added to popt just like 
rpmbf was by
swiping code from RPM: just as easy is to add some vectors that RPM fills in on 
startup
so that option processing can generate a set:versions set).

3) Retrieve the set:versions values from the PV/RV table; the retrieval will be 
blazingly
fast because there is no need to do secondary loading of headers, and there is
already means to retrieve by a *RE pattern like "^set:.*$" that essentially do 
a partial retrieve on
"set:"

4) Filter the retrieved set:versions keys down to a set of headers to load; do 
the usual --queryformat
display of NEVRA (or whatever other --queryformat is useful).

None of the above is very hard to do (with @rpm5.org code), and is sure to be 
lots faster
than the equivalent scripted operations because of Berkeley DB and btree 
indexing.

73 de Jeff



Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-23 Thread Jeffrey Johnson

On Jun 23, 2012, at 2:10 PM, Jeffrey Johnson wrote:

> 
> On Jun 23, 2012, at 1:49 PM, Alexey Tourbin  wrote:
> 

<…>

> Perhaps making base61 encoding MANDATORY in rpm would displease
> everyone equally: choosing a prime is as "pretty" as all other
> encoding criteria. 
> 

In the interest of getting off negative nerdy obscure discussions, let's
try a positive alternative application for Golob-Rice subset operations.

All RPMv4 packages attach (a lightly filtered) file(1) magic string to
every file.

The file(1) data is mostly usable as a "keyword" namespace exactly as is.
Yes there are flaws: however magic strings are from file(1) is about
as good as any other de facto keyword tagging of file content.

keywords are strings just like elf symbols are, and set:versions (or Bloom 
filters)
are a compact representation from which its rather easy to do subset 
computations.

One extension that would be needed is a "closest" metric in order to "prefer"
the largest subset overlap: with set:versions any contained subset will satisfy 
the
logical assertions, and there's no easy way to prefer the larger sub-set.

There's a similar application with dual/triple/... licensed software and 
computing
per-file, not per-package, license affinity precisely where set:versions (or 
Bloom
filters) will represent keywords (like "LGPLv2" or "BSD") easily. Licenses 
unlike
file(1) magic keywords will require name space administration. SUrely LSB and 
LFF
are seeking something useful to do for RPM packaging these days, and might
be convinced to make some set of license tokens "standard" so that license
affinity can be precisely computed in distributed software.

73 de Jeff



Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-23 Thread Jeffrey Johnson

On Jun 23, 2012, at 1:49 PM, Alexey Tourbin  wrote:

> On Sat, Jun 23, 2012 at 1:55 AM, Jeffrey Johnson  wrote:
>> There are lots of usage cases for efficient sub-set computations
>> in package management, not just as a de facto API/ABI check
>> using ELF symbols. Most of the other usage cases for efficient
>> sub-set computations are not subject to an ultimately optimal
>> string encoding.
> 
> The possibility of optimal string encoding should not be
> underestimated. If you can't "write it down with a pencil", which by
> the way refers to Alan Turing's style of reasoning, it becomes very
> problematic anyway. Your intuition is probably that raw bytes are
> cheap because you can bitwise-AND them in terms of direct CPU
> instructions, and any "encoding" is expensive because you have to
> crunch bits a lot. This is not very true in practice, though. But you
> should arm yourself with valgrind(1) and spend a few days with it
> before you understand how you waste you CPU powers (and bandwidth, for
> that matter). Set-string encoding has reasonable, and affordable,
> cost. It can be all put together and presented in a rather less
> pessimistic manner. (For example, there is a cache_decode_set routine
> that boosts things by a factor of about o 5. This is simply due to the
> fact that you do not always have to decode the same Provides version
> all over again. This is part of that "fancy-schmancy" stuff which you
> must ignore on the first reading.)
> 
> So I suggest that set-strings must cover all usages of set-subset
> computations, where static data structures are appropriate (so that
> you compute it, write into RPM header, and do not expect to modify
> it). it is still "efficient", despite the fact that we try hard to
> achieve a better string representation. The fact that it can be
> packaged nicely does not make it any worse in other respects, except
> that obviously we can stuff somewhat less bits into alnum characters.
> But I also wanted it to look "nice" (this is why base62 was devised
> over much simpler base64 encoding). But so, it looks nice, and it is
> efficient. Why are you unhappy? Do you necessarily require that you
> have to do bitmask tests yourself?

I'm not "unhappy". Rather I'm pointing out that -- if your primary
goal is bandwidth reduction in *.rpm package metadata -- that there are
equivalent savings that aren't from "pretty" encodings.

The other reason is this: missing values in the {E,V,R} triple
(and attaching some reasonable DWIM semantic) lead to huge
amounts of pointless discussion. Adding all of
twiddle-in-version
set:versions
dpkg-like
is all underway atm. Running digest/encoding strings through
"legacy compatible" (i.e. what most rpm tool chains will
continue to be using for years) rpmvercmp(3) is an accident
waiting to be reported again again again.

Its just as easy to attach a tag with binary data as any other
approach to "pretty" encodings, all of which involve some
aesthetic choice of permitted/excluded ASCII characters, and so
the encodings proliferate to meet all possible opinions.

Perhaps making base61 encoding MANDATORY in rpm would displease
everyone equally: choosing a prime is as "pretty" as all other
encoding criteria. 

73 de Jeff

__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-23 Thread Alexey Tourbin
On Sat, Jun 23, 2012 at 1:55 AM, Jeffrey Johnson  wrote:
> There are lots of usage cases for efficient sub-set computations
> in package management, not just as a de facto API/ABI check
> using ELF symbols. Most of the other usage cases for efficient
> sub-set computations are not subject to an ultimately optimal
> string encoding.

The possibility of optimal string encoding should not be
underestimated. If you can't "write it down with a pencil", which by
the way refers to Alan Turing's style of reasoning, it becomes very
problematic anyway. Your intuition is probably that raw bytes are
cheap because you can bitwise-AND them in terms of direct CPU
instructions, and any "encoding" is expensive because you have to
crunch bits a lot. This is not very true in practice, though. But you
should arm yourself with valgrind(1) and spend a few days with it
before you understand how you waste you CPU powers (and bandwidth, for
that matter). Set-string encoding has reasonable, and affordable,
cost. It can be all put together and presented in a rather less
pessimistic manner. (For example, there is a cache_decode_set routine
that boosts things by a factor of about o 5. This is simply due to the
fact that you do not always have to decode the same Provides version
all over again. This is part of that "fancy-schmancy" stuff which you
must ignore on the first reading.)

So I suggest that set-strings must cover all usages of set-subset
computations, where static data structures are appropriate (so that
you compute it, write into RPM header, and do not expect to modify
it). it is still "efficient", despite the fact that we try hard to
achieve a better string representation. The fact that it can be
packaged nicely does not make it any worse in other respects, except
that obviously we can stuff somewhat less bits into alnum characters.
But I also wanted it to look "nice" (this is why base62 was devised
over much simpler base64 encoding). But so, it looks nice, and it is
efficient. Why are you unhappy? Do you necessarily require that you
have to do bitmask tests yourself?
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-22 Thread Jeffrey Johnson

On Jun 22, 2012, at 8:02 PM, Alexey Tourbin  wrote:

> On Sat, Jun 23, 2012 at 1:55 AM, Jeffrey Johnson  wrote:
>> I would state that compression (of any sort) to minimize
>> bandwidth is entirely the wrong problem to solve.
> 
> So what kind of a problem are we trying to solve? Why are you making
> all these small puns? Are you ready to go out and pronounce that "I am
> the Jefferey Johnson is entirely confident that whichever problem we
> face we must address, but to minimize bandwidth is entirely wrong and
> shall be addressed never!" :-)
> 

For the record: there is no intentional pun in that quote: I
tried to state a misdirection that cannot (and SHOULD not) be
solved in the context of "package management", nothing more.

And I do feel for PTT's in Kzhakistan extracting every ruble
for bandwidth.

> Why is it any wrong to minimize bandwidth, or, in other words, why it
> is bad to spend less money? Your answer is like, because the meaning
> of life is not to spend less money, which is a wrong perspective.
> Okay, but what's a better perspective? Spending more money for less
> good is not at all a better perspective.
> 

Nothing I said implies that bandwidth reduction is "wrong".

If you want to minimize bandwidth downloading packages compress
the metadata and deal with the "legacy compatible" issues however
you want/need.

My rule of thumb is that metadata is ~15% of the size of a
typical *.rpm package: assuming 50% compression one might save
7.5% of the package download cost (0.50 * 0.15 = 7.5%).

The "real" problem is the perceived need to be up-to-date
as of this second, and the duplicated/redundant/necessary
monolithic repository metadata download is an expensive way to
determine
Nothing to upgrade: relax.
An incremental download of changed metadata is simple
engineering (not widely practiced).

> Okay, but what do we actually try to do? Um, we try to bring some
> binary compatibility by using the limited amount of information with
> some contrived data structures which test set-subset relation. If
> space were not a problem, we could simply plug ELF sections into RPM
> headers, couldn't we? Why not? And why is there a distinction between
> the header and the payload? :-)
> 

Hmm .. we have different POV here: I have the luxury of being
able to change anything in RPM that makes sound engineering sense,
which perhaps warps my opinions.

Does it make engineering sense to include ELF headers in their
entirety in *.rpm metadata?
No.

Does it make sense to attempt set:versions for de facto ABI subsets
(if done carefully)?
Yes.

I *like* set:versions or we would not be having this conversation.

>> My contrarian POV should not be taken as opposed to compression
>> or elegance or anything else. Just that minimal size in the
>> representation of bit fields (or bit positions as numbers)
>> overly limits the applicability of set:versions.
>> 
>> There are lots of usage cases for efficient sub-set computations
>> in package management, not just as a de facto API/ABI check
>> using ELF symbols. Most of the other usage cases for efficient
>> sub-set computations are not subject to an ultimately optimal
>> string encoding.
> 
> My POV is that I ask "how the best I can do what I need to do, is it
> doable, what is the price, can it be reduced, etc". Of course, these
> "best to do" and "need to do" terms are not exactly mathematical, and
> I already hear some laughs in the audience. Nevertheless, the only
> thing which you can oppose to this is simply a better implementation.
> Anything else does not count. Why? That's because! Go tell people they
> should spend more money. ;-)
> 

again: I *like* set:versions and see many usage cases, not just in dependencies.

>> E.g. ripping off the base62 encoding and distributing binary
>> assertion fields saves at least as much bandwidth as your
>> guesstimate that a Golomb-Rice code is ~20% more efficient
>> than a Bloom filter.
> 
> I see no point in criticizing base62 encoding being suboptimal as
> compared to raw bytes, because it does just that: squeezes bits into
> alnum characters. It is pretty clear and pointless that raw bytes
> stuff more bits. The goal was just that - to "dump" bits into
> "readable" and "usable" form.
> 

I did not say sub-optimal: I said that the encoding into a string
entails a cost that is largely the same as the difference between
"numbers" in set:versions and bits in Bloom filters.

I can tie binary set:versions into "traditional" EVR tags if I choose
to attempt entirely and entirely remove the need for base62 encoding
if necessary. So could you.

There is no desire to break "legacy compatibility" needlessly however;
base63 is as good as its gonna get if one MUST encode into a string.

>> Just in case: yes acidic sarcasm was fully intended.
> 
> There is no point in your sarcasm, except that probably that you are
> very clever (which I totally agree).
> 

You miss the point of the sarcasm: I don't give a h00t whether

Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-22 Thread Alexey Tourbin
On Sat, Jun 23, 2012 at 1:55 AM, Jeffrey Johnson  wrote:
> I would state that compression (of any sort) to minimize
> bandwidth is entirely the wrong problem to solve.

So what kind of a problem are we trying to solve? Why are you making
all these small puns? Are you ready to go out and pronounce that "I am
the Jefferey Johnson is entirely confident that whichever problem we
face we must address, but to minimize bandwidth is entirely wrong and
shall be addressed never!" :-)

Why is it any wrong to minimize bandwidth, or, in other words, why it
is bad to spend less money? Your answer is like, because the meaning
of life is not to spend less money, which is a wrong perspective.
Okay, but what's a better perspective? Spending more money for less
good is not at all a better perspective.

Okay, but what do we actually try to do? Um, we try to bring some
binary compatibility by using the limited amount of information with
some contrived data structures which test set-subset relation. If
space were not a problem, we could simply plug ELF sections into RPM
headers, couldn't we? Why not? And why is there a distinction between
the header and the payload? :-)

> My contrarian POV should not be taken as opposed to compression
> or elegance or anything else. Just that minimal size in the
> representation of bit fields (or bit positions as numbers)
> overly limits the applicability of set:versions.
>
> There are lots of usage cases for efficient sub-set computations
> in package management, not just as a de facto API/ABI check
> using ELF symbols. Most of the other usage cases for efficient
> sub-set computations are not subject to an ultimately optimal
> string encoding.

My POV is that I ask "how the best I can do what I need to do, is it
doable, what is the price, can it be reduced, etc". Of course, these
"best to do" and "need to do" terms are not exactly mathematical, and
I already hear some laughs in the audience. Nevertheless, the only
thing which you can oppose to this is simply a better implementation.
Anything else does not count. Why? That's because! Go tell people they
should spend more money. ;-)

> E.g. ripping off the base62 encoding and distributing binary
> assertion fields saves at least as much bandwidth as your
> guesstimate that a Golomb-Rice code is ~20% more efficient
> than a Bloom filter.

I see no point in criticizing base62 encoding being suboptimal as
compared to raw bytes, because it does just that: squeezes bits into
alnum characters. It is pretty clear and pointless that raw bytes
stuff more bits. The goal was just that - to "dump" bits into
"readable" and "usable" form.

> Just in case: yes acidic sarcasm was fully intended.

There is no point in your sarcasm, except that probably that you are
very clever (which I totally agree).

> You already made a valid point asking what probability means wrto
> false positives. In fact your 2**10/2**20 is a very different
> estimate of false positive probability than the approx. currently
> in use in rpmbf.c (which is based on a statistical model for
> Bloom filter parameters).

The 2**10/2**20 is the best estimate and you probably cannot further
improve it. The whole business of set-versions, which I have been
pondering recently, boils down to the questions, can you improve the
ratio just a little bit? Or can you pay just a little bit less for a
little bit more? Is there a better data structure? The answer is
basically turns out to be "NO". The "set of numbers" is good to go,
and the Galois fields have some discrepancies which you do not want to
know, and the benefit is very marginal.

> Below is what is in use by RPM which chooses {m, k} given
> an estimated population n and a desired probability of false
> positives e (I forget where I found the actual computation,
> can likely find it again if/when necessary).
>
> What I would like to be able to do with set:versions
> in RPM is to be able to use either of these 2 "container"
> representations interchangeably.
>
> I'd also like to be able to pre-compute either form
> in package tags for whatever purpose without the
> dreadful tyranny of being "legacy compatible" with
> older versions of RPM. Adding a new tag that isn't
> used by older RPM is exactly as compatible as following
> existing string representations for NEVR dependencies
> in existing tags, arguably more compatible because
> older versions of RPM are incapable of interpreting
> unknown tags.
>
> And again: the best solution for the download bandwidth
> problem is
>        Update the metadata less often.
> if you think a bit.

The best solution for "don't be stupid" is just don't be stupid. Of
course, downloading metadata is not the only problem, and not exactly
the one which we exactly try to solve. It all combines into a complex
system. The question is then, to me, can we do any better? How much
does this whole business cost? if it's still expensive, which it is,
how can we cut down on the costs? These kind of questions I am willing
to 

Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-22 Thread Jeffrey Johnson

On Jun 22, 2012, at 5:26 PM, Alexey Tourbin  wrote:

> On Fri, Jun 22, 2012 at 5:30 AM, Jeffrey Johnson  wrote:
>> Sure numbers "make sense".
>> 
>> But God invented 0 and 1 and who needs steenkin carries to do
>> arithmetic in Galois fields?!?
> 
> Jeffery, I understand that you are ironic and sarcastic, but I can't
> see the reason why, as per our discussion. If you ask honest
> questions, like "what this means", I try to do my best to answer the
> question, possibly involving considerations like "numbers comes from
> God", which are questionable but not irrelevant. They may help to
> understand, or may not.
> 

Apologies for the sarcasm. My comments were inline with your observations
regarding undergraduate difficulties with Galois fields and matrices imho.

But math jokes are bone dry …


>>> P: 01010010010101
>>> R: 010001
>> 
>> Yes … but this is "premature optimization" …
> 
> There is no premature optimization here, and it is fair to ask what we
> may possibly want to try to optimize. My answer is: first, it must
> take less space (given the probability) because we have to download
> the repo metadata every time we run the "update"; second, set-versions
> must compare quickly (we must be able to compare them all within a
> second). What's your suggestion? Do you want them to take more space
> and compare slowly, so as to avoid premature optimizations? :-) Or do
> you dislike them exactly because they are not Bloom filters? Go on
> then, make shorter strings, given the same probability, which compare
> faster! But this is largely impossible, exactly because of these
> laughable topics like "what is a number" or "what is a probability"
> which I'm trying to present.
> 

I would state that compression (of any sort) to minimize
bandwidth is entirely the wrong problem to solve.

Even in Kazakistan, one can minimize bandwidth by
Upgrading less often!


My contrarian POV should not be taken as opposed to compression
or elegance or anything else. Just that minimal size in the
representation of bit fields (or bit positions as numbers)
overly limits the applicability of set:versions.

There are lots of usage cases for efficient sub-set computations
in package management, not just as a de facto API/ABI check
using ELF symbols. Most of the other usage cases for efficient
sub-set computations are not subject to an ultimately optimal
string encoding.

E.g. ripping off the base62 encoding and distributing binary
assertion fields saves at least as much bandwidth as your
guesstimate that a Golomb-Rice code is ~20% more efficient
than a Bloom filter.

So don't take my flip comments as being opposed to what you have
implemented.

>> How big the repo is determines how important the
>> distro is and nothing else. Just look at how impo'tent Debian is …
> 
> Your considerations are probably true but are largely irrelevant to
> the discussion.
> Besides that, I can't understand them. Are you sarcastic? I am
> sarcastic too, that's not at all a problem. Shall we talk? :-)
> 

Just in case: yes acidic sarcasm was fully intended.

>>> $ print -l sym1 sym2 |/usr/lib/rpm/mkset 10
>>> set:dbUnz4
>>> $ print -l sym1 sym2 |/usr/lib/rpm/mkset 20
>>> set:nl2ZEALdS
>>> 
>>> In the first run, the probability is modest ("print -l" is zsh syntax
>>> to print an arg per line). In the second run, the probability is much
>>> higher (and it takes more space). There is also a function in scripts
>>> to sustain probabilities at about o 0.1%.
>> 
>> Hmmm … I've been defaulting Bloom filters ultra conservatively
>> at 10**-4 and mobing to 10**-6 at the first hint of a possible problem.
> 
> The number does not matter here, the only consideration was that
> 2^{-10} was kind of cute number, and it works well in practice - that
> is, we catch all or most of the bugs where a library symbol has been
> deleted. But it could also be 2^{-16} or even 2^{-20}. It is a
> trade-off, and I see no ground for criticizing it for just that. Or
> again it boils down to the question how many megabytes you expect to
> download when you run the "update".

You already made a valid point asking what probability means wrto
false positives. In fact your 2**10/2**20 is a very different
estimate of false positive probability than the approx. currently
in use in rpmbf.c (which is based on a statistical model for
Bloom filter parameters).

Below is what is in use by RPM which chooses {m, k} given
an estimated population n and a desired probability of false
positives e (I forget where I found the actual computation,
can likely find it again if/when necessary).

What I would like to be able to do with set:versions
in RPM is to be able to use either of these 2 "container"
representations interchangeably.

I'd also like to be able to pre-compute either form
in package tags for whatever purpose without the
dreadful tyranny of being "legacy compatible" with
older versions of RPM. Adding a new tag that isn't
used by older RPM is exactly as compatible as followin

Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-22 Thread Alexey Tourbin
On Fri, Jun 22, 2012 at 5:30 AM, Jeffrey Johnson  wrote:
> Sure numbers "make sense".
>
> But God invented 0 and 1 and who needs steenkin carries to do
> arithmetic in Galois fields?!?

Jeffery, I understand that you are ironic and sarcastic, but I can't
see the reason why, as per our discussion. If you ask honest
questions, like "what this means", I try to do my best to answer the
question, possibly involving considerations like "numbers comes from
God", which are questionable but not irrelevant. They may help to
understand, or may not.

>> P: 01010010010101
>> R: 010001
>
> Yes … but this is "premature optimization" …

There is no premature optimization here, and it is fair to ask what we
may possibly want to try to optimize. My answer is: first, it must
take less space (given the probability) because we have to download
the repo metadata every time we run the "update"; second, set-versions
must compare quickly (we must be able to compare them all within a
second). What's your suggestion? Do you want them to take more space
and compare slowly, so as to avoid premature optimizations? :-) Or do
you dislike them exactly because they are not Bloom filters? Go on
then, make shorter strings, given the same probability, which compare
faster! But this is largely impossible, exactly because of these
laughable topics like "what is a number" or "what is a probability"
which I'm trying to present.

> How big the repo is determines how important the
> distro is and nothing else. Just look at how impo'tent Debian is …

Your considerations are probably true but are largely irrelevant to
the discussion.
Besides that, I can't understand them. Are you sarcastic? I am
sarcastic too, that's not at all a problem. Shall we talk? :-)

>> $ print -l sym1 sym2 |/usr/lib/rpm/mkset 10
>> set:dbUnz4
>> $ print -l sym1 sym2 |/usr/lib/rpm/mkset 20
>> set:nl2ZEALdS
>>
>> In the first run, the probability is modest ("print -l" is zsh syntax
>> to print an arg per line). In the second run, the probability is much
>> higher (and it takes more space). There is also a function in scripts
>> to sustain probabilities at about o 0.1%.
>
> Hmmm … I've been defaulting Bloom filters ultra conservatively
> at 10**-4 and mobing to 10**-6 at the first hint of a possible problem.

The number does not matter here, the only consideration was that
2^{-10} was kind of cute number, and it works well in practice - that
is, we catch all or most of the bugs where a library symbol has been
deleted. But it could also be 2^{-16} or even 2^{-20}. It is a
trade-off, and I see no ground for criticizing it for just that. Or
again it boils down to the question how many megabytes you expect to
download when you run the "update".
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-22 Thread Michael Schroeder
On Fri, Jun 22, 2012 at 09:39:22PM +0400, Alexey Tourbin wrote:
> What is a probability? (That's the second stunning questions after
> what is a number.) If numbers were unlimited, they could have
> represented symbols exactly. The only reason the numbers "clash" is
> because they "sit" within a limited range. What you have to do to
> control probabilities is to select the appropriate range, which is a
> trade-off between how many clashes are possible and how much bits per
> number you are allowed to take. The basic idea is that, to encode a
> Provides version, you first need to know how many symbols you are
> going to encode, say 1024. You then ask yourself, how many bits per
> symbol should I take. If you take e.g. only 10 bits, that's plain
> stupid, because 10-bit numbers can address only 1024 values (so you
> end up with most of the bits set, or most of the numbers taken, that
> is to say). If you take 20 bits per symbol, though, things get more
> interesting. 20-bit numbers can address 2^{20} different symbols.
> Since the probability is due to range limitations, you get
> 2^{10}/2^{20}=0.1% false positive rate, which is basically what you
> want. When encoding a Requires version, it's another matter: you
> should use the same number of bits as the corresponding Requires
> versions have.

You mean the corresponding Provides, i.e. the symbols in the "target"
library, right? Basically you should never need to downsample when
doing the set comparison, as downsampling kills your probablilities.

Cheers,
  Michael.

-- 
Michael Schroeder   m...@suse.de
SUSE LINUX Products GmbH,  GF Jeff Hawn, HRB 16746 AG Nuernberg
main(_){while(_=~getchar())putchar(~_-1/(~(_|32)/13*2-11)*13);}
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-22 Thread Alexey Tourbin
On Thu, Jun 21, 2012 at 7:51 PM, Jeffrey Johnson  wrote:
> On Jun 18, 2012, at 2:32 PM, Jeffrey Johnson wrote:
>>
>> The "contained in" or subset semantic that applies to the operations "<" and 
>> ">="
>> is rather easy to do as well. E.g. if (assuming on;y existence, not 
>> versioned inequality ranges)
>>       P == Bloom filter of Provides: tokens
>>       R == Bloom filter of a (possible) subset of tokens
>> then the subset computation is nothing more than
>>       (P & R) == R

That's right. Set-versions do the same thing, only reframed in terms
of set of numbers (as opposed to explicit bit masks). The rpmsetcmp
routine basically does just that: (P & R) == R check, written in a
slightly different manner, with a bunch of fancy stuff. :-)

> I should point out the implicit (and tricky) assumption regarding
> using Bloom filters to easily compute set and union intersection:
>        Both P and R MUST have exactly the same parameterization.
> Choosing an a priori parametrization is hard because it MUST
> deal with the worst possible case of the largest estimated population
> which increases the sparseness of all other Bloom filters.

There are a few kinds of "parametrization" which we must ponder.
First, you should use the same hash function for R and P sets, so that
it makes the same number per symbol (or sets the same bits). With
different hash functions, there is no chance to compare sets
meaningfully. Second, you should also ponder what a "number" actually
is (or how high a bit you may want to be able to set). The thing is,
the numbers are not unlimited (or otherwise the whole world can be
represented with just a big number). It helps to think of a number as
a tiny bit of information within a limited range. The range is another
parameter.

> This also applies to "tuning" to reduce the probability of false positives.
>
> There are no obvious ways to rescale a Bloom filter meaningfully either.

There IS a (not-so-)obvious way to rescale Bloom filters, and numbers,
for that matter. Big surprise, big surprise! It boils down to the
question what a number is. If your number is just a few bits in the
range modulo power of 2, you can "rescale" the number down to a
smaller range by simply stripping its higher bits. The equivalent
operation can be devised to "downsample" a bloom filter into a lesser
precision, provided that some (not-so-)obvious conditions are met.
Basically you need to split the filter into two halves and bitwise-OR
them.

> Do set: versions have the same difficulty (even if the parameters are hidden
> as implementation/design constraints somehow)?

There is some difficulty that we always need to use the same hash
function, which is hardwired, and cannot be easily changed (it is
Jenkins on-at-a-time hash with a fancy initial constant). There is no
difficulty to compare set-versions per se, though. Because a
set-versions is just a set of numbers within a limited range modulo
power of two. If ranges are different, you first need to "downsample"
the higher-ranged set by stripping its higher bits and sorting the
numbers again, but then you proceed normally. Note that the range, or
"bpp", has to be encoded explicitly, and is part of a set-version.
Again, it helps to think in terms what a number is. A number is a
thing within the range, modulo power of two. If you don't know the
range, you don't know what the number is.

> Or does the Golomb-Rice encoding "scale" more naturally than Bloom filter 
> parameters?
> If the set:versions implementation "scales" better than Bloom filters (which 
> seems to be the case),
> then there are lots of usage cases (like packages in a repository, or files 
> in a package) where
> an efficient means to compute set membership is quite useful.
>
> (aside)
> Apologies for "scales" imprecision. I'm merely trying to ask is
>        To what extent does set:versions depend on a priori assumptions?
>
> BTW, what is the current false positive failure probability for set:versions?

What is a probability? (That's the second stunning questions after
what is a number.) If numbers were unlimited, they could have
represented symbols exactly. The only reason the numbers "clash" is
because they "sit" within a limited range. What you have to do to
control probabilities is to select the appropriate range, which is a
trade-off between how many clashes are possible and how much bits per
number you are allowed to take. The basic idea is that, to encode a
Provides version, you first need to know how many symbols you are
going to encode, say 1024. You then ask yourself, how many bits per
symbol should I take. If you take e.g. only 10 bits, that's plain
stupid, because 10-bit numbers can address only 1024 values (so you
end up with most of the bits set, or most of the numbers taken, that
is to say). If you take 20 bits per symbol, though, things get more
interesting. 20-bit numbers can address 2^{20} different symbols.
Since the probability is due to range limitations, you get
2^{10}/2^{20

Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-21 Thread Jeffrey Johnson

On Jun 21, 2012, at 8:27 PM, Alexey Tourbin  wrote:

> On Thu, Jun 21, 2012 at 7:28 PM, Jeffrey Johnson  wrote:
>>> More precisely, a set-version can be (in principle) converted to a
>>> Bloom filter which uses only one hash function. The idea is that such
>>> a filter will set bits in a highly sparse set of bits, one by one.
>>> Instead, a set-version remembers the bits simply by their indices.
>>> Setting the bit becomes taking the number, and there is a
>>> straightforward correspondence. It also turns out that a set of
>>> numbers can be easier to deal with, as opposed to a sparse set of
>>> bits.
>> 
>> You've made an unsupported claim here re "numbers can be easier".
>> Presumably you are talking about means of encoding, where clever
>> mathematics derived from numerical transforms can be used to
>> remove redundancy. With a pile of bits in a Bloom filter all one
>> has to work with is a pile of bits.
> 
> Numbers are easier to deal with because you can use 'unsigned v[]'
> array to represent them, which can be seen throughout the code,
> accompanied by 'int c' (which is argc/argv[] style). Bits are somewhat
> more complicated: you need to use something like 
> macros, and you cannot use pointer arithmetic to implement traversal,
> etc.
> 

Well different strokes here. The
#include 
is complicated only because of effete Linux kernel debutantes
with their precious kernel stack advocacy issues re select(2).
These minor defines (swiped directly from  mind you)

are hugely portable:
#define __PBM_NBITS /*@-sizeoftype@*/(8 * sizeof(__pbm_bits))/*@=sizeoftype@*/
#define __PBM_IX(d) ((d) / __PBM_NBITS)
#define __PBM_MASK(d)   ((__pbm_bits) 1 << (((unsigned)(d)) % __PBM_NBITS))
#define __PBM_BITS(set) ((__pbm_bits *)(set)->bits)

#define PBM_FREE(s) _free(s);
#define PBM_SET(d, s)   (__PBM_BITS (s)[__PBM_IX (d)] |= __PBM_MASK (d))
#define PBM_CLR(d, s)   (__PBM_BITS (s)[__PBM_IX (d)] &= ~__PBM_MASK (d))
#define PBM_ISSET(d, s) ((__PBM_BITS (s)[__PBM_IX (d)] & __PBM_MASK (d)) != 0)

#define PBM_ALLOC(d)xcalloc(__PBM_IX (d) + 1, __PBM_NBITS/8)

See .

> Also, to me, a set of numbers just "makes sense", and is good to go.
> If you wonder how much complicated things can get, there's been some
> recent papers out there which use matrix solving for approximate set
> representation. They are basically unreadable (given the undergraduate
> skills), and you can be completely lost because there is apparently no
> connection between the Galois fields and what you actually try to do.
> After you try to read these papers, the "set of numbers" becomes a
> wonderful salvation which comes directly from God and the Holy Spirit
> and brings peace to your soul. This wonderful salvation, which comes
> from God, turns out to be a good thing. You should use it, when you
> have some. :-)
> 

Sure numbers "make sense".

But God invented 0 and 1 and who needs steenkin carries to do
arithmetic in Galois fields?!?

>> The counter argument is this: testing bits (in a Bloom filter) is
>> rather KISS. Other compression/redundancy removal schemes
>> end up trading storage for implementation complexity cost.
>> In a running installer like RPM, there will always be a need
>> for memory dealing with payload contents. Since large amounts
>> of memory are eventually going to be needed/used, savings
>> by using Golomb-Reid codes are mostly irrelevant for the
>> 100K -> 1M set sizes used by RPM no matter whether bits or
>> numbers are used to represent.
> 
> I do not agree that testing bits is more KISS than a set of numbers.
> It you think that testing bits is the only and natural thing to do, I
> disagree. The reason is: Requires versions are much more sparse than
> Provides versions. If the Provides version is optimal, in terms of bit
> usage (50% set), the Requires version must be necessarily suboptimal
> (too few bits set). It can be pictured like this:
> 
> P: 01010010010101
> R: 010001
> 

Yes … but this is "premature optimization" …

> The conclusion is, if you want to stick to bitmasks and reduce subset
> comparison to bitmask tests, there will be a great deal of
> inefficiency because of sparse R-sets. On the other hand, sets of
> numbers can be pictured like this:
> 
> P: 1,3,7,13,15
> R: 1,13
> 
> Note that R-set does not take extra space, and there is a simple
> merge-like algorithm which advances either R or P (or both - which is
> how rpmsetcmp routine is implemented).
> 

… if you are serious abtout minimal size, compressing headers
in *.rpm packages will save more bandwidth than any single other
optimization one might choose to implement.

>> But don't take my argument as being opposed to set:versions whatsoever.
>> Just (for RPM and even for *.rpm) that compression size isn't as important
>> to me as it is to you as the "primary goal" of set: versions.
> 
> The size of *.rpm is very important when you want to update the
> information about a repo, like in "apt-get update" - you have to
> download it

Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-21 Thread Alexey Tourbin
On Thu, Jun 21, 2012 at 7:28 PM, Jeffrey Johnson  wrote:
>> More precisely, a set-version can be (in principle) converted to a
>> Bloom filter which uses only one hash function. The idea is that such
>> a filter will set bits in a highly sparse set of bits, one by one.
>> Instead, a set-version remembers the bits simply by their indices.
>> Setting the bit becomes taking the number, and there is a
>> straightforward correspondence. It also turns out that a set of
>> numbers can be easier to deal with, as opposed to a sparse set of
>> bits.
>
> You've made an unsupported claim here re "numbers can be easier".
> Presumably you are talking about means of encoding, where clever
> mathematics derived from numerical transforms can be used to
> remove redundancy. With a pile of bits in a Bloom filter all one
> has to work with is a pile of bits.

Numbers are easier to deal with because you can use 'unsigned v[]'
array to represent them, which can be seen throughout the code,
accompanied by 'int c' (which is argc/argv[] style). Bits are somewhat
more complicated: you need to use something like 
macros, and you cannot use pointer arithmetic to implement traversal,
etc.

Also, to me, a set of numbers just "makes sense", and is good to go.
If you wonder how much complicated things can get, there's been some
recent papers out there which use matrix solving for approximate set
representation. They are basically unreadable (given the undergraduate
skills), and you can be completely lost because there is apparently no
connection between the Galois fields and what you actually try to do.
After you try to read these papers, the "set of numbers" becomes a
wonderful salvation which comes directly from God and the Holy Spirit
and brings peace to your soul. This wonderful salvation, which comes
from God, turns out to be a good thing. You should use it, when you
have some. :-)

> The counter argument is this: testing bits (in a Bloom filter) is
> rather KISS. Other compression/redundancy removal schemes
> end up trading storage for implementation complexity cost.
> In a running installer like RPM, there will always be a need
> for memory dealing with payload contents. Since large amounts
> of memory are eventually going to be needed/used, savings
> by using Golomb-Reid codes are mostly irrelevant for the
> 100K -> 1M set sizes used by RPM no matter whether bits or
> numbers are used to represent.

I do not agree that testing bits is more KISS than a set of numbers.
It you think that testing bits is the only and natural thing to do, I
disagree. The reason is: Requires versions are much more sparse than
Provides versions. If the Provides version is optimal, in terms of bit
usage (50% set), the Requires version must be necessarily suboptimal
(too few bits set). It can be pictured like this:

P: 01010010010101
R: 010001

The conclusion is, if you want to stick to bitmasks and reduce subset
comparison to bitmask tests, there will be a great deal of
inefficiency because of sparse R-sets. On the other hand, sets of
numbers can be pictured like this:

P: 1,3,7,13,15
R: 1,13

Note that R-set does not take extra space, and there is a simple
merge-like algorithm which advances either R or P (or both - which is
how rpmsetcmp routine is implemented).

> But don't take my argument as being opposed to set:versions whatsoever.
> Just (for RPM and even for *.rpm) that compression size isn't as important
> to me as it is to you as the "primary goal" of set: versions.

The size of *.rpm is very important when you want to update the
information about a repo, like in "apt-get update" - you have to
download it all. The question is then how big the repo is, and how
many set-versions you can afford to download. :-) The "primary goal"
was not to make things much worse. In other words, the price must not
be prohibitive even for the repo of 10k or 15k packages strong.
(Please also realize that the Internet was not a given until very
recent times - some shoddy ISPs in Kamchatka still want to charge
something like $0.1 per 1Mb.)

> AFAICT set:versions also has a finite probability of failure due to 
> (relatively rare)
> hash collisions converting strings to numbers.
>
> Is there any hope that set:versions (or the Golomb-Rice code) can supply a 
> parameterization
> to "tune" the probability of false positives?
>
> Or am I mis-understanding what is implemented?

You can tune the probability by selecting the appropriate "bpp"
paramter, and passing it to "mkset" (which is how scripts work). The
"bpp" indicates how much bits must be used per hash value (the range
is 10..32). For example, if you want to encode Provides versions which
has 1024 symbols, and the error rate has to be fixed at about 0.1%,
you need to use bpp=20 - that is, after each symbol is hashed, its 20
lower bits will be taken into account and encoded. The probability
then is simply the number of symbols over the capacity of the
universe, which is 2^{10}/2^{20}=0.1%. On the other hand,

Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-21 Thread Jeffrey Johnson

On Jun 18, 2012, at 2:32 PM, Jeffrey Johnson wrote:

> 
> The "contained in" or subset semantic that applies to the operations "<" and 
> ">="
> is rather easy to do as well. E.g. if (assuming on;y existence, not versioned 
> inequality ranges)
>   P == Bloom filter of Provides: tokens
>   R == Bloom filter of a (possible) subset of tokens
> then the subset computation is nothing more than
>   (P & R) == R
> 

I should point out the implicit (and tricky) assumption regarding
using Bloom filters to easily compute set and union intersection:
Both P and R MUST have exactly the same parameterization.
Choosing an a priori parametrization is hard because it MUST
deal with the worst possible case of the largest estimated population
which increases the sparseness of all other Bloom filters.

This also applies to "tuning" to reduce the probability of false positives.

There are no obvious ways to rescale a Bloom filter meaningfully either.

Do set: versions have the same difficulty (even if the parameters are hidden
as implementation/design constraints somehow)?

Or does the Golomb-Rice encoding "scale" more naturally than Bloom filter 
parameters?
If the set:versions implementation "scales" better than Bloom filters (which 
seems to be the case),
then there are lots of usage cases (like packages in a repository, or files in 
a package) where
an efficient means to compute set membership is quite useful.

(aside)
Apologies for "scales" imprecision. I'm merely trying to ask is
To what extent does set:versions depend on a priori assumptions?

BTW, what is the current false positive failure probability for set:versions?


73 de Jeff
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-21 Thread Jeffrey Johnson

On Jun 21, 2012, at 12:51 AM, Alexey Tourbin wrote:

> On Thu, Jun 21, 2012 at 12:15 AM, Alexey Tourbin
>  wrote:
>> On Mon, Jun 18, 2012 at 10:32 PM, Jeffrey Johnson  wrote:
>>> Good: the above confirmation of the characteristics allows a set:versions
>>> implementation to proceed.
>> 
>> Hello, there's been some speculation about Bloom filters below, which
>> I cannot address right now, offhand. Nevertheless, I can say that, in
>> some highly mathematical sense, set-versions are exactly equivalent to
>> Bloom filters. They do just the same thing, if you will. The only
>> difference is that set-versions are more compact: they take somewhat
>> less space, which was, if you remember, the number one goal of the
>> original implementation.
> 
> More precisely, a set-version can be (in principle) converted to a
> Bloom filter which uses only one hash function. The idea is that such
> a filter will set bits in a highly sparse set of bits, one by one.
> Instead, a set-version remembers the bits simply by their indices.
> Setting the bit becomes taking the number, and there is a
> straightforward correspondence. It also turns out that a set of
> numbers can be easier to deal with, as opposed to a sparse set of
> bits.
> 

You've made an unsupported claim here re "numbers can be easier".
Presumably you are talking about means of encoding, where clever
mathematics derived from numerical transforms can be used to
remove redundancy. With a pile of bits in a Bloom filter all one
has to work with is a pile of bits.

The counter argument is this: testing bits (in a Bloom filter) is
rather KISS. Other compression/redundancy removal schemes
end up trading storage for implementation complexity cost.
In a running installer like RPM, there will always be a need
for memory dealing with payload contents. Since large amounts
of memory are eventually going to be needed/used, savings
by using Golomb-Reid codes are mostly irrelevant for the
100K -> 1M set sizes used by RPM no matter whether bits or
numbers are used to represent.

But don't take my argument as being opposed to set:versions whatsoever.
Just (for RPM and even for *.rpm) that compression size isn't as important
to me as it is to you as the "primary goal" of set: versions.

> If you want to know more why things have to work like this, and e.g.
> where constants pop up, there is a good starting point at "Cache-,
> Hash-, and Space-efficient Bloom filters" paper by Felix Putze, Peter
> Sanders, and Johannes Singler. Actually this paper helped me a lot to
> put things together and to produce the original implementation.
> Reading this paper requires some working mathematical knowledge,
> though. This requirement must not be underestimated, but also should
> not be overestimated. The paper is very readable: it tells you what
> you may want to do and what you have to do.
> http://algo2.iti.kit.edu/singler/publications/cacheefficientbloomfilters-wea2007.pdf

Thanks for the pointer. I actually read and studied this paper a few years back
trying to decide whether there was a need to attempt cache-line optimization
(but I totally ignored the paragraph that links Golomb codes to Bloom filters).

At the time I decided that the complexity cost to attempt cache-line 
optimization
wasn't a priority in RPM, but rather KISS simplicity while users considered
changes that involve a finite probability of false positive failures.

AFAICT set:versions also has a finite probability of failure due to (relatively 
rare)
hash collisions converting strings to numbers.

Is there any hope that set:versions (or the Golomb-Rice code) can supply a 
parameterization
to "tune" the probability of false positives?

Or am I mis-understanding what is implemented?

73 de Jeff
> __
> RPM Package Managerhttp://rpm5.org
> Developer Communication Listrpm-devel@rpm5.org

__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-20 Thread Alexey Tourbin
On Thu, Jun 21, 2012 at 12:15 AM, Alexey Tourbin
 wrote:
> On Mon, Jun 18, 2012 at 10:32 PM, Jeffrey Johnson  wrote:
>> Good: the above confirmation of the characteristics allows a set:versions
>> implementation to proceed.
>
> Hello, there's been some speculation about Bloom filters below, which
> I cannot address right now, offhand. Nevertheless, I can say that, in
> some highly mathematical sense, set-versions are exactly equivalent to
> Bloom filters. They do just the same thing, if you will. The only
> difference is that set-versions are more compact: they take somewhat
> less space, which was, if you remember, the number one goal of the
> original implementation.

More precisely, a set-version can be (in principle) converted to a
Bloom filter which uses only one hash function. The idea is that such
a filter will set bits in a highly sparse set of bits, one by one.
Instead, a set-version remembers the bits simply by their indices.
Setting the bit becomes taking the number, and there is a
straightforward correspondence. It also turns out that a set of
numbers can be easier to deal with, as opposed to a sparse set of
bits.

If you want to know more why things have to work like this, and e.g.
where constants pop up, there is a good starting point at "Cache-,
Hash-, and Space-efficient Bloom filters" paper by Felix Putze, Peter
Sanders, and Johannes Singler. Actually this paper helped me a lot to
put things together and to produce the original implementation.
Reading this paper requires some working mathematical knowledge,
though. This requirement must not be underestimated, but also should
not be overestimated. The paper is very readable: it tells you what
you may want to do and what you have to do.
http://algo2.iti.kit.edu/singler/publications/cacheefficientbloomfilters-wea2007.pdf
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-20 Thread Alexey Tourbin
On Mon, Jun 18, 2012 at 10:32 PM, Jeffrey Johnson  wrote:
> Good: the above confirmation of the characteristics allows a set:versions
> implementation to proceed.

Hello, there's been some speculation about Bloom filters below, which
I cannot address right now, offhand. Nevertheless, I can say that, in
some highly mathematical sense, set-versions are exactly equivalent to
Bloom filters. They do just the same thing, if you will. The only
difference is that set-versions are more compact: they take somewhat
less space, which was, if you remember, the number one goal of the
original implementation.

Very informal, if you want to encode 1K symbols out of 1M symbols
using a bloom filter, you need at least
1024 * 1.44 * 10 bits of information (which is a factor of 1.44 per symbol)
while with a set-version you need at least
1024 * (10 + 1.44) bits of information (which is an additive constant
of 1.44 per symbol).
It is basically fair to say that set-versions are 20% shorter than the
equivalent bloom filters, which is not unimportant.
By the way, the information-theoretical minimum is
1024 * 10 bits of information -
you cannot go beyond that without violating fundamental laws.
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-18 Thread Jeffrey Johnson

On Jun 15, 2012, at 11:58 PM, Alexey Tourbin wrote:

> On Fri, Jun 1, 2012 at 6:07 AM, Jeffrey Johnson  wrote:
>> I asked 2 very specific questions … the rest is quite important also,
>> but I need to understand precisely what properties set:versions have in order
>> to implement correctly (and I don't fully understand your reply).
>> 
>> Specifically:
>> 
>>1) Is the set:versions VERSION independent of the order of the
>>calls to rpmsetAdd()? (you know the routine as set_add())
> 
> Completely independent - you can add symbols in any order. The symbols
> are then hashed and sorted by their numeric values. The underlying
> idea is that a set-version is just a (sorted) set of numbers. You can
> add whatever symbol to it, possibly twice, the symbol will be hashed
> to a number, in a unique manner, and finally you can get the string
> representation of the set of numbers. This involves much fuss under
> the hood, but basically, you should think of the set of symbols, which
> is just the set of numbers, after each symbol has been hashed
> individually.
> 
>>2) Can the set:versions encoding be compared for more than equality?
>>What set/arithmetic property is the basis for the comparison? What
>>circumstances/constraints are there related to
>>… You cannot always compare
>>set-versions in terms of "greater or equal" (but when you 
>> can, it's
>>important).
> 
> Set-versions compare as sets. There are Euler diagrams to visaulise
> set comparsion, which is an undergraduate matetrial. The idea is that,
> real numbers are linear order: you can always tell either V1 V1>=V2. Sets are quite another matter: you cannot always apply for
> "tertium non datur" (either lt or ge). Which is to say that sets can
> be quite different and do not compare easily. The order can be imposed
> on the sets, though, by requiring the "greater" sets to have at least
> the same elements they compare against (perhaps I'm starting to retell
> the undergraduate material, which is not going to last). To sum up,
> there IS a mathematical basis behind "Requires: foo >= set:asdf"
> dependencies.
> 
>> I can of course answer my own questions with try-and-see test cases.

Good: the above confirmation of the characteristics allows a set:versions
implementation to proceed.

What I am hearing is that set:versions is not fundamentally different from 
these transforms
Create a Bloom filter (with acceptable false positive rate)
Compress the Bloom filter parameters and bits.
Encode the compression in base{16,32,62,64} for transport.

The "contained in" or subset semantic that applies to the operations "<" and 
">="
is rather easy to do as well. E.g. if (assuming on;y existence, not versioned 
inequality ranges)
P == Bloom filter of Provides: tokens
R == Bloom filter of a (possible) subset of tokens
then the subset computation is nothing more than
(P & R) == R

The important difference is that Bloom filters have a false positive failure 
probability which
is likely not present in the Golomb-Rice coding in the current implementation.

I'm expecting that Bloom+Compression+Encoding is comparable in size to the 
Golomb-Rice
encoding. The only way to tell (that I know of) is to try-and-see, which I will 
likely also do in
order to tell how good the compression is with Golomb-Rice coding.

Meanwhile there's lots of usage cases for the set:versions implementation in 
RPM, not just
as a container of Elf symbols that is a de facto ABI sub-set.

73 de Jeff

> __
> RPM Package Managerhttp://rpm5.org
> Developer Communication Listrpm-devel@rpm5.org

__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-06-15 Thread Alexey Tourbin
On Fri, Jun 1, 2012 at 6:07 AM, Jeffrey Johnson  wrote:
> I asked 2 very specific questions … the rest is quite important also,
> but I need to understand precisely what properties set:versions have in order
> to implement correctly (and I don't fully understand your reply).
>
> Specifically:
>
>        1) Is the set:versions VERSION independent of the order of the
>        calls to rpmsetAdd()? (you know the routine as set_add())

Completely independent - you can add symbols in any order. The symbols
are then hashed and sorted by their numeric values. The underlying
idea is that a set-version is just a (sorted) set of numbers. You can
add whatever symbol to it, possibly twice, the symbol will be hashed
to a number, in a unique manner, and finally you can get the string
representation of the set of numbers. This involves much fuss under
the hood, but basically, you should think of the set of symbols, which
is just the set of numbers, after each symbol has been hashed
individually.

>        2) Can the set:versions encoding be compared for more than equality?
>        What set/arithmetic property is the basis for the comparison? What
>        circumstances/constraints are there related to
>                        … You cannot always compare
>                set-versions in terms of "greater or equal" (but when you can, 
> it's
>                important).

Set-versions compare as sets. There are Euler diagrams to visaulise
set comparsion, which is an undergraduate matetrial. The idea is that,
real numbers are linear order: you can always tell either V1=V2. Sets are quite another matter: you cannot always apply for
"tertium non datur" (either lt or ge). Which is to say that sets can
be quite different and do not compare easily. The order can be imposed
on the sets, though, by requiring the "greater" sets to have at least
the same elements they compare against (perhaps I'm starting to retell
the undergraduate material, which is not going to last). To sum up,
there IS a mathematical basis behind "Requires: foo >= set:asdf"
dependencies.

> I can of course answer my own questions with try-and-see test cases.
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-05-31 Thread Jeffrey Johnson

On May 31, 2012, at 7:24 PM, Alexey Tourbin wrote:

>>> 
>> 
>> I'm trying to understand rpmsetcmp() as a "black box" independent
>> of all the gory implementation details of ELF symbols, base62 encoding,
>> and RPM dependencies.
>> 
>> I believe that set:versions are much like Bloom filters:
>>1) strings can be added to a "set" in any order
>>2) the comparison operation implied by
>>Requires: foo >= set:….
>>is identical to "contained in" or "is subset of"
>> Is that the case?


I asked 2 very specific questions … the rest is quite important also,
but I need to understand precisely what properties set:versions have in order
to implement correctly (and I don't fully understand your reply).

Specifically:

1) Is the set:versions VERSION independent of the order of the
calls to rpmsetAdd()? (you know the routine as set_add())

2) Can the set:versions encoding be compared for more than equality?
What set/arithmetic property is the basis for the comparison? What
circumstances/constraints are there related to
… You cannot always compare
set-versions in terms of "greater or equal" (but when you can, 
it's
important).

I can of course answer my own questions with try-and-see test cases.

73 de Jeff__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-05-31 Thread Alexey Tourbin
On Thu, May 31, 2012 at 9:03 AM, Jeffrey Johnson  wrote:
>>> The mixed code case is interesting: what happens
>>> if a set:version encoding contains the literal string "0:V-R"
>>
>> I can't understand you question. A version is either a set-version, or
>> not a set-version.  If a version is a set-version, it has to be
>> prefixed by "set:". Regular RPM versions cannot be prefixed by "set:",
>> because "set" cannot be decoded as a valid serial number, which has to
>> be an integer. There might be some implementation sloppiness out
>> there, but in principle, I believe the encoding scheme is sane, and
>> makes sense.
>>
>> Now the question is, what if a set-version cannot be decoded? But that
>> can be perplexed by a question, what if a regular rpm version cannot
>> be decoded? Or can you decode any junk as a valid RPM version?
>
> I'm trying to understand rpmsetcmp() as a "black box" independent
> of all the gory implementation details of ELF symbols, base62 encoding,
> and RPM dependencies.
>
> I believe that set:versions are much like Bloom filters:
>        1) strings can be added to a "set" in any order
>        2) the comparison operation implied by
>                Requires: foo >= set:….
>        is identical to "contained in" or "is subset of"
> Is that the case?

To me, a set-version is just a VERSION. I can't stress that enough.
When you need a library, it is a legitimate question to ask which
version you need. If you answer that you need at least version 1.0, a
conventional version, you must be kidding. Because there is no
connection between what you actually need and a god-damn number. So a
plausible answer must be "Well, we need at least a version which
provides  symbols which we need to use". Let's take
this approach to the extreme, and say that the VERSION which we need
is simply the one which provides at least  symbols. To
me, this is much better an approach to library versioning, and
possibly the only viable approach. How else can you express your
expectations about a library? Suppose someone is talking to you at a
conference, and says "Mr. Johnson, we are very proud, blah-blah-blah,
because blah-blah-blah". What you want to tell them is basically "Go
out, folks, it works". Now, with set-versions, things really work. :-)

Back to implementation,
1) set-strings should be considered opaque, static, and unmodifiable.
Once they are formed, there is no useful way to alter them. They
express an idea of a VERSION in a manner which cannot be easily
comprehended, but that's not a problem (or otherwise there is
perplexing questions of what regular rpm versions must mean, and
whether any piece of junk can be decoded as a regular rpm version).
2) They must compare as sets, in terms of elements unique to the first
set, common elements, and elements unique to the second set. The
implementation does not quite much yet, because it already tries to
mimic regular versions. Regular versions are linear order.
Set-versions are partial order.  You cannot always compare
set-versions in terms of "greater or equal" (but when you can, it's
important).
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-05-31 Thread Jeffrey Johnson

On May 31, 2012, at 5:07 AM, Michael Schroeder wrote:

> On Wed, May 30, 2012 at 02:56:20PM -0400, Jeffrey Johnson wrote:
>> 
>> On Apr 23, 2012, at 10:32 AM, Jeff Johnson wrote:
>> 
>>> I should point out that writing the attached
>>> message (and sending from the wrong e-mail address) has instantly
>>> led to a different -- and perhaps more natural -- syntax like
>>> 
>>> Requires: set(libfoo.so.1) >= whatever
>>> 
>> 
>> After a month of quietly muddling, I like this namespace like syntax
>> better because it doesn't lead to magic tokens like "set:" in the EVR
>> string and is more consistent with existing namespaces in RPM dependencies.
> 
> Is that just syntax? I.e., will rpm -q --whatprovides libfoo.so.1
> still find the package?
> 

Depends on what I choose to implement. It's quite easy to use
an additional per-namspace table, or to add two keys to the existing table, or
to treat "set(…)" as a probe method.

Because of the length/encoding in set:version data, --whatprovides is rather
awkward and uninformative for humans.

The namespace convention that exists in RPM is that in a statement like
Dependency: foo(N) = E:V-R
that E:V-R applies to the token N, not the wrapped name foo(N).

All other behaviors aren't specified.

73 de Jeff__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-05-31 Thread Michael Schroeder
On Wed, May 30, 2012 at 02:56:20PM -0400, Jeffrey Johnson wrote:
> 
> On Apr 23, 2012, at 10:32 AM, Jeff Johnson wrote:
> 
> > I should point out that writing the attached
> > message (and sending from the wrong e-mail address) has instantly
> > led to a different -- and perhaps more natural -- syntax like
> > 
> > Requires: set(libfoo.so.1) >= whatever
> > 
> 
> After a month of quietly muddling, I like this namespace like syntax
> better because it doesn't lead to magic tokens like "set:" in the EVR
> string and is more consistent with existing namespaces in RPM dependencies.

Is that just syntax? I.e., will rpm -q --whatprovides libfoo.so.1
still find the package?

Cheers,
  Michael.

-- 
Michael Schroeder   m...@suse.de
SUSE LINUX Products GmbH,  GF Jeff Hawn, HRB 16746 AG Nuernberg
main(_){while(_=~getchar())putchar(~_-1/(~(_|32)/13*2-11)*13);}
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-05-30 Thread Jeffrey Johnson

On May 30, 2012, at 11:24 PM, Alexey Tourbin  wrote:

> On Thu, May 31, 2012 at 6:51 AM, Jeffrey Johnson  wrote:
>> We are in violent agreement here over a minor issue
>> of implementation/representation.
> 
> By the way, actual problems that will arise are rarely what you expect
> them to be. In 2010, I was naive and I thought that "char bitv[]" was
> a pretty good representation of bit sequence (which can be still seen
> in set.c). It then took many days to devise a sophisticated decoding
> routine which avoids bitv[] altogether and makes things smooth. So, in
> a violent agreement, don't take things for granted. :-)
> 

In 2001, I thought that typedef'd enum's were the best
implementation for named bit fields because the names
were displayed in gdb backtraces. There is some speshul
painfulness not only from the typedef's in both C/C++
but also from the inability to extend the no. of bits
beyond whatever and "int" contains.

But these are implementation issues which I choose to
call "minor" ;-)

>> The mixed code case is interesting: what happens
>> if a set:version encoding contains the literal string "0:V-R"
> 
> I can't understand you question. A version is either a set-version, or
> not a set-version.  If a version is a set-version, it has to be
> prefixed by "set:". Regular RPM versions cannot be prefixed by "set:",
> because "set" cannot be decoded as a valid serial number, which has to
> be an integer. There might be some implementation sloppiness out
> there, but in principle, I believe the encoding scheme is sane, and
> makes sense.
> 
> Now the question is, what if a set-version cannot be decoded? But that
> can be perplexed by a question, what if a regular rpm version cannot
> be decoded? Or can you decode any junk as a valid RPM version?
> 

I'm trying to understand rpmsetcmp() as a "black box" independent
of all the gory implementation details of ELF symbols, base62 encoding,
and RPM dependencies.

I believe that set:versions are much like Bloom filters:
1) strings can be added to a "set" in any order
2) the comparison operation implied by
Requires: foo >= set:….
is identical to "contained in" or "is subset of"
Is that the case?

73 de Jeff
>> and a match is attempted against a traditional dependency like
>>Provides: foo = 0:V-R
>> If the literal string in the Provides: is encoded on the
>> fly, will setcmp(…) match or not?
> __
> RPM Package Managerhttp://rpm5.org
> Developer Communication Listrpm-devel@rpm5.org

__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-05-30 Thread Alexey Tourbin
On Thu, May 31, 2012 at 6:51 AM, Jeffrey Johnson  wrote:
> We are in violent agreement here over a minor issue
> of implementation/representation.

By the way, actual problems that will arise are rarely what you expect
them to be. In 2010, I was naive and I thought that "char bitv[]" was
a pretty good representation of bit sequence (which can be still seen
in set.c). It then took many days to devise a sophisticated decoding
routine which avoids bitv[] altogether and makes things smooth. So, in
a violent agreement, don't take things for granted. :-)

> The mixed code case is interesting: what happens
> if a set:version encoding contains the literal string "0:V-R"

I can't understand you question. A version is either a set-version, or
not a set-version.  If a version is a set-version, it has to be
prefixed by "set:". Regular RPM versions cannot be prefixed by "set:",
because "set" cannot be decoded as a valid serial number, which has to
be an integer. There might be some implementation sloppiness out
there, but in principle, I believe the encoding scheme is sane, and
makes sense.

Now the question is, what if a set-version cannot be decoded? But that
can be perplexed by a question, what if a regular rpm version cannot
be decoded? Or can you decode any junk as a valid RPM version?

> and a match is attempted against a traditional dependency like
>        Provides: foo = 0:V-R
> If the literal string in the Provides: is encoded on the
> fly, will setcmp(…) match or not?
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-05-30 Thread Jeffrey Johnson

On May 30, 2012, at 10:44 PM, Alexey Tourbin  wrote:

> On Thu, May 31, 2012 at 6:21 AM, Jeffrey Johnson  wrote:
>>> On Mon, Apr 23, 2012 at 6:32 PM, Jeff Johnson  wrote:
 I should point out that writing the attached
 message (and sending from the wrong e-mail address) has instantly
 led to a different -- and perhaps more natural -- syntax like
 
 Requires: set(libfoo.so.1) >= whatever
>>> 
>>> Hello,
>>> Set-versions are just that - versions. One must arguably think of them
>>> in terms of VERSIONS. If you need a library, it is a legitimate
>>> question to ask which version you need. If you think you need the
>>> version at least 1.0, there's a good question: why the heck you think
>>> you need the version at least 1.0 (and whether 2.0 would still fit).
>>> With set-versions, things get straightforward: you need at least a
>>> version which provides  API symbols - that's much
>>> better a description of a library than a god-damn arbitrary number.
>> 
>> Yes VERSIONS.
>> The issue for RPM is how to represent/attach a different VERSION comparison.
> 
> If those are only VERSIONS, they must apply to the same NAME. Look,
> there are two-fold way of dependency resolution in rpm. Set-versions
> were designed to fit into the scheme. First, you look up the name in
> rpmdb/Providename, and fetch the headers. Second, you decide whether
> the versions match.
> 
> The real question is what to do if we are forced to match a
> set-version against a non-set/another kind of version. Well, the
> answer is that you should return as if the NAMEs were different. And
> unless you're in a deeply theoretical mood, I believe the approach is
> perfectly valid.  If two kinds of versions cannot be matched, pretend
> they apply to different names.

We are in violent agreement here over a minor issue
of implementation/representation.

The mixed code case is interesting: what happens
if a set:version encoding contains the literal string "0:V-R"
and a match is attempted against a traditional dependency like
Provides: foo = 0:V-R
If the literal string in the Provides: is encoded on the
fly, will setcmp(…) match or not?

73 de Jeff

__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-05-30 Thread Alexey Tourbin
On Thu, May 31, 2012 at 6:21 AM, Jeffrey Johnson  wrote:
>> On Mon, Apr 23, 2012 at 6:32 PM, Jeff Johnson  wrote:
>>> I should point out that writing the attached
>>> message (and sending from the wrong e-mail address) has instantly
>>> led to a different -- and perhaps more natural -- syntax like
>>>
>>> Requires: set(libfoo.so.1) >= whatever
>>
>> Hello,
>> Set-versions are just that - versions. One must arguably think of them
>> in terms of VERSIONS. If you need a library, it is a legitimate
>> question to ask which version you need. If you think you need the
>> version at least 1.0, there's a good question: why the heck you think
>> you need the version at least 1.0 (and whether 2.0 would still fit).
>> With set-versions, things get straightforward: you need at least a
>> version which provides  API symbols - that's much
>> better a description of a library than a god-damn arbitrary number.
>
> Yes VERSIONS.
> The issue for RPM is how to represent/attach a different VERSION comparison.

If those are only VERSIONS, they must apply to the same NAME. Look,
there are two-fold way of dependency resolution in rpm. Set-versions
were designed to fit into the scheme. First, you look up the name in
rpmdb/Providename, and fetch the headers. Second, you decide whether
the versions match.

The real question is what to do if we are forced to match a
set-version against a non-set/another kind of version. Well, the
answer is that you should return as if the NAMEs were different. And
unless you're in a deeply theoretical mood, I believe the approach is
perfectly valid.  If two kinds of versions cannot be matched, pretend
they apply to different names.
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-05-30 Thread Jeffrey Johnson

On May 30, 2012, at 9:46 PM, Alexey Tourbin  wrote:

> On Mon, Apr 23, 2012 at 6:32 PM, Jeff Johnson  wrote:
>> I should point out that writing the attached
>> message (and sending from the wrong e-mail address) has instantly
>> led to a different -- and perhaps more natural -- syntax like
>> 
>> Requires: set(libfoo.so.1) >= whatever
> 
> Hello,
> Set-versions are just that - versions. One must arguably think of them
> in terms of VERSIONS. If you need a library, it is a legitimate
> question to ask which version you need. If you think you need the
> version at least 1.0, there's a good question: why the heck you think
> you need the version at least 1.0 (and whether 2.0 would still fit).
> With set-versions, things get straightforward: you need at least a
> version which provides  API symbols - that's much
> better a description of a library than a god-damn arbitrary number.
> 

Yes VERSIONS.

The issue for RPM is how to represent/attach a different VERSION comparison.

A prefix marker with set: is mostly sane (until there
are multiple meanings for ':' and '~' and '-' separators
parsing the serialized VERSION representation which is
increasingly the state of affairs @rpm5.org).

So its perhaps a more natural syntax and a cleaner
representation to use the existing name spacing
convention with the form
Requires: set(bar) = VERSION
to distinguish in a namespace the VERSION that needs a
different comparison method attached.

But these are mostly details (until one has to write
the parsing regex and watch the combinatorial explosion
of possibilities with "optional" fields).

But I'm biased at the moment, having just today added
twiddle-in-version ...

> There are some philosophical implications of introducing set-versions,
> in particular, whether it can be extended to describe prototypes and
> calling conventions (e.g. the number of arguments which must be passed
> to a function). This is why I might seem reluctant to participate in
> discussions. I'm thinking! (And perhaps I'm arrogant.)
> 

Thinking is good and arrogance matters little (to me ;-): set:versions
is the right thing to do, collapsing multiple redundant
"just in case" assertions into a single dependency representation.

hth

73 de jeff

>> for set:versions, and for the generalization (for writing strict regression
>> tests,
>> its mostly useless in packaging because there is no mapping that specifies
>> how the mixed DEB <-> RPM version comparison might be done "naturally")
>> 
>> Requires: deb(foo) >= E:V-R
>> Requires: rpm(foo) >= E:V-R
>> 
>> The precedent for foo(bar) name spacing in RPM dependencies with
>> the above syntax is already widely deployed although entirely
>> de facto.
>> 
>> Sure would be nice _NOT_ to have to consider "Have it your own way!"
>> competing syntaxes like
>> Requires: libfoo.so.1 >= set:whatever
>> and
>> Requires: set(libfoo.so.1) >= whatever
> __
> RPM Package Managerhttp://rpm5.org
> Developer Communication Listrpm-devel@rpm5.org

__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-05-30 Thread Alexey Tourbin
On Mon, Apr 23, 2012 at 6:32 PM, Jeff Johnson  wrote:
> I should point out that writing the attached
> message (and sending from the wrong e-mail address) has instantly
> led to a different -- and perhaps more natural -- syntax like
>
> Requires: set(libfoo.so.1) >= whatever

Hello,
Set-versions are just that - versions. One must arguably think of them
in terms of VERSIONS. If you need a library, it is a legitimate
question to ask which version you need. If you think you need the
version at least 1.0, there's a good question: why the heck you think
you need the version at least 1.0 (and whether 2.0 would still fit).
With set-versions, things get straightforward: you need at least a
version which provides  API symbols - that's much
better a description of a library than a god-damn arbitrary number.

There are some philosophical implications of introducing set-versions,
in particular, whether it can be extended to describe prototypes and
calling conventions (e.g. the number of arguments which must be passed
to a function). This is why I might seem reluctant to participate in
discussions. I'm thinking! (And perhaps I'm arrogant.)

> for set:versions, and for the generalization (for writing strict regression
> tests,
> its mostly useless in packaging because there is no mapping that specifies
> how the mixed DEB <-> RPM version comparison might be done "naturally")
>
> Requires: deb(foo) >= E:V-R
> Requires: rpm(foo) >= E:V-R
>
> The precedent for foo(bar) name spacing in RPM dependencies with
> the above syntax is already widely deployed although entirely
> de facto.
>
> Sure would be nice _NOT_ to have to consider "Have it your own way!"
> competing syntaxes like
> Requires: libfoo.so.1 >= set:whatever
> and
> Requires: set(libfoo.so.1) >= whatever
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-05-30 Thread Jeffrey Johnson

On May 30, 2012, at 3:09 PM, Robert Xu wrote:

> On Wed, May 30, 2012 at 2:56 PM, Jeff Johnson  wrote:
>> 
>> OK, I'm in the last stages of adding twiddle-in-version so its time to 
>> consider other changes:
>> 
>>1) Epoch as a string? The only lossage here is when digit strings 
>> differ in length
>>(and strcmp isn't identical to arithmetic comparison). Alternatively, 
>> should Epoch's
>>be forced to a [0-9]+ digit string?
> 
> I don't see the usefulness of Epoch as a string. I would rather have
> it just integers, higher takes precedence.
> Much less mind boggling.
> 

And Epoch: as integer requires special casing of an integer data type 
everywhere. strings
are easier for the implementation and for metadata data types.

I'd rather have a means to eliminate Epoch: … Epoch: is easily dropped
out everywhere by adding this to RPM configuration
%evr_tuple_order  VR
and all epochs will be parsed in dependency strings and ignored.

>> 
>>2) Is there a need for
>>twiddle-in-epoch(assuming a string)
>>twiddle-in-release
>>twiddle-in-distepoch
>>There are/were several ways to implement twiddle-in-version; so far 
>> I've
>>chosen to do only twiddle-in-version because that involved little more
>>than rewriting some regex patterns. OTOH, handling twiddle-in-version
>>as a special case of a more general string comparison (aka rpmvercmp)
>>is at least as much of a problem as just permitting the full 
>> generalization.
>> 
>>3) revisiting dependency ranges with syntax like this
>>Provides: foo = [1.2-3|4.5-6)
>>which would match only Requires in the range, including 1.2-3 because 
>> of the '['
>>and excluding 4.5-6 because of the ')' with the usual mathematical 
>> conventions (in the US).
>> 
>>The problematic character would be choosing a separator (I used '|' 
>> in the example): almost
>>all the usual characters have pre-existing usages, but perhaps 
>> something could still be twitched together.
>> 
> 
> Can you explain twiddle-in-* to me? I'm slightly confused.

Go read the Debian documentation or ask SuSE where the need for 
twiddle-in-version is.

Perhaps the most complete description of already implemented alternatives 
@rpm5.org is here:
https://bugzilla.yoctoproject.org/show_bug.cgi?id=256

I am forced to "compatibility" (with dpkg in Yocto) and feature parity 
(released in rpm-4.10.0) on an
unnecessary (imho) implementation complexity adding twiddle-in-version 
@rpm5.org.

The original RFE (from SuSE) is here
http://www.rpm.org/ticket/56

> (Shows how much I haven't kept up with anything within like the
> past... year >.>)
> 

The twiddle-in-version RFE started in 2008, was proposed in the "OpenSUSE RPM 
Summit" ROADMAP on
September 18th, 2009, and has just been released.

That SHOULD tell you a lot about the urgency and necessity of 
twiddle-in-version in RPM.



73 de Jeff

Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-05-30 Thread Robert Xu
On Wed, May 30, 2012 at 2:56 PM, Jeff Johnson  wrote:
>
> OK, I'm in the last stages of adding twiddle-in-version so its time to 
> consider other changes:
>
>        1) Epoch as a string? The only lossage here is when digit strings 
> differ in length
>        (and strcmp isn't identical to arithmetic comparison). Alternatively, 
> should Epoch's
>        be forced to a [0-9]+ digit string?

I don't see the usefulness of Epoch as a string. I would rather have
it just integers, higher takes precedence.
Much less mind boggling.

>
>        2) Is there a need for
>                twiddle-in-epoch(assuming a string)
>                twiddle-in-release
>                twiddle-in-distepoch
>        There are/were several ways to implement twiddle-in-version; so far 
> I've
>        chosen to do only twiddle-in-version because that involved little more
>        than rewriting some regex patterns. OTOH, handling twiddle-in-version
>        as a special case of a more general string comparison (aka rpmvercmp)
>        is at least as much of a problem as just permitting the full 
> generalization.
>
>        3) revisiting dependency ranges with syntax like this
>                Provides: foo = [1.2-3|4.5-6)
>        which would match only Requires in the range, including 1.2-3 because 
> of the '['
>        and excluding 4.5-6 because of the ')' with the usual mathematical 
> conventions (in the US).
>
>        The problematic character would be choosing a separator (I used '|' in 
> the example): almost
>        all the usual characters have pre-existing usages, but perhaps 
> something could still be twitched together.
>

Can you explain twiddle-in-* to me? I'm slightly confused.
(Shows how much I haven't kept up with anything within like the
past... year >.>)

-- 
later daze. :: Robert Xu :: protocol.by/rxu
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Re: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-05-30 Thread Jeffrey Johnson

On Apr 23, 2012, at 10:32 AM, Jeff Johnson wrote:

> I should point out that writing the attached
> message (and sending from the wrong e-mail address) has instantly
> led to a different -- and perhaps more natural -- syntax like
> 
>   Requires: set(libfoo.so.1) >= whatever
> 

After a month of quietly muddling, I like this namespace like syntax
better because it doesn't lead to magic tokens like "set:" in the EVR
string and is more consistent with existing namespaces in RPM dependencies.

I'll leave the existing "set:" prefix as is until I implement the set(…) syntax.

> for set:versions, and for the generalization (for writing strict regression 
> tests,
> its mostly useless in packaging because there is no mapping that specifies
> how the mixed DEB <-> RPM version comparison might be done "naturally")
> 
>   Requires: deb(foo) >= E:V-R
>   Requires: rpm(foo) >= E:V-R
> 

(aside)
And this is a rather easy/natural extension to choose between alternative
version comparisons (at least for development/testing).

OTOH, its mostly useless for dependency assertion evaluation because there
is no clearly identified "mixed" comparison when
Provides:  deb(foo) >= E:V-R
Requires: rpm(foo) >= E:V-R
I suppose I can attempt a resolution like this
Prefer an explicit comparison in the Provides: if there is a conflict.
Prefer an explicit comparison over an implicit/default comparison
Use the default comparison if no explicit comparison hint is present.

> The precedent for foo(bar) name spacing in RPM dependencies with
> the above syntax is already widely deployed although entirely
> de facto.
> 
> Sure would be nice _NOT_ to have to consider "Have it your own way!"
> competing syntaxes like
>   Requires: libfoo.so.1 >= set:whatever
> and
>   Requires: set(libfoo.so.1) >= whatever
> (and I'll ignore the gawdafulness of existing multilib dependencies like
>   Requires: libfoo.ao.1()(64bit)
> coming along rapidly as a Newer! Better! Bestest! ABI standard now
> that there is support in kernel >= 3.2.
> 
> *shrug*
> 
> Its all implementable, just not very KISS when aesthetic/innate opinions get 
> involved.
> 

OK, I'm in the last stages of adding twiddle-in-version so its time to consider 
other changes:

1) Epoch as a string? The only lossage here is when digit strings 
differ in length
(and strcmp isn't identical to arithmetic comparison). Alternatively, 
should Epoch's
be forced to a [0-9]+ digit string?

2) Is there a need for
twiddle-in-epoch(assuming a string)
twiddle-in-release
twiddle-in-distepoch
There are/were several ways to implement twiddle-in-version; so far I've
chosen to do only twiddle-in-version because that involved little more
than rewriting some regex patterns. OTOH, handling twiddle-in-version
as a special case of a more general string comparison (aka rpmvercmp)
is at least as much of a problem as just permitting the full 
generalization.

3) revisiting dependency ranges with syntax like this
Provides: foo = [1.2-3|4.5-6)
which would match only Requires in the range, including 1.2-3 because 
of the '['
and excluding 4.5-6 because of the ')' with the usual mathematical 
conventions (in the US).

The problematic character would be choosing a separator (I used '|' in 
the example): almost
all the usual characters have pre-existing usages, but perhaps 
something could still be twitched together.

73 de Jeff
__
RPM Package Managerhttp://rpm5.org
Developer Communication Listrpm-devel@rpm5.org


Fwd: EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-04-23 Thread Jeff Johnson
I should point out that writing the attached
message (and sending from the wrong e-mail address) has instantly
led to a different -- and perhaps more natural -- syntax like

Requires: set(libfoo.so.1) >= whatever

for set:versions, and for the generalization (for writing strict regression 
tests,
its mostly useless in packaging because there is no mapping that specifies
how the mixed DEB <-> RPM version comparison might be done "naturally")

Requires: deb(foo) >= E:V-R
Requires: rpm(foo) >= E:V-R

The precedent for foo(bar) name spacing in RPM dependencies with
the above syntax is already widely deployed although entirely
de facto.

Sure would be nice _NOT_ to have to consider "Have it your own way!"
competing syntaxes like
Requires: libfoo.so.1 >= set:whatever
and
Requires: set(libfoo.so.1) >= whatever
(and I'll ignore the gawdafulness of existing multilib dependencies like
Requires: libfoo.ao.1()(64bit)
coming along rapidly as a Newer! Better! Bestest! ABI standard now
that there is support in kernel >= 3.2.

*shrug*

Its all implementable, just not very KISS when aesthetic/innate opinions get 
involved.

hth

73 de Jeff


Begin forwarded message:

> From: Jeff Johnson 
> Subject: EVR issues: set:versions, epoch-as-string, now twiddle-in-version
> Date: April 23, 2012 10:16:28 AM EDT
> To: rpm-devel@rpm5.org
> 
> OK, adoption of the ancient SuSE "twiddle-in-version" patch by
> @rpm.org has now added yet a 3rd incompatibility that needs
> to be addressed within what is commonly known as "rpmvercmp"
> 
> (aside)
> The designator "rpmvercmp" is already horrendously misleading
> because its the name of the comparison routine for each of the
> V and R members of the tuple {E,V,R}. There are many many
> many false starts at package version comparisons that naively
> attempt
>   rc = rpmvercmp("E:V-R", whatever);
> when what is actually needed is a loop of each member of the
> tuple like
>static char * EVRprecedence = "EVR";
>char ptype;
>int i;
> 
>for (i = 0; (ptype = EVRprecedence[i]) != '\0; i++) {
>   switch (ptype) {
>   case 'E':   /* coerce and compare Epoch: digit strings */   break;
>   case 'V':
>   case 'R':
>   case 'D':   /* Distepoch/Disttag comparison is similar to V and R 
> iirc */
>   rc = rpmvercmp(somestring, anotherstring);  break;
>   }
>   if (rc) break;
>}
> 
> I've written out the algorithm so that the asymmetry of the full
> comparison with the existing convention of Epoch: as digit string
> can be more easily seen.
> 
> With 3 "incompatibilities" in existing EVR comparison looming,
> I am tipped in favor of treating Eoch: more generally as a string, not
> just a digit sequence, in order to
>   Handle all the incompatibilities (to disappoint every one equally) at 
> the same time.
> 
> The issue of whether Epoch: SHOULD be treated as a [0-9]+: pattern is
> merely a stricter policy convention on the general case. There are other
> possible policy conventions with V and R that could and should be
> imposed more strictly than
>   Neither V nor R can contain a '-' character.
> which has always been seen as some sort of deficiency in RPM compared
> to APT/DPKG.
> 
> I can also see some usage case if "set" is described as a namespace marker
> for a type of comparison, particularly since '>=" is being overloaded as a
> subset "contained within" semantic instead of "rpmvercmp" comparison.
> 
> The next most useful namespace marker is likely going to be "deb" (or
> "dpkg" or "borg" to taste ;-) to at least attempt some means to also
> write regression tests for the corner cases that are surely going to be
> recognized soon. Because of pattern parsing ambiguity with ':' (which
> has already been overused as a separator of Disttag/Distepoch), this
> would mean that adding a "deb" item to the (being proposed here) EVR
> name space based on a generalization of Epoch: as a string marker
> for dpkg comparison (where Yet Another meaning for ':' is encountered).
> 
> An example is easier than words: Suppose that there is a EVR string like
>   0:1.2-3
> that then needs a "deb" or "rpm" name space marker. The parser pattern
> could/should/would then return a tuple like
>   {"deb:0", "1.2", "3" }
> or
>   {"deb", "0:1.2", "3" }
> Either parsing is deterministic and implementable. I point out

EVR issues: set:versions, epoch-as-string, now twiddle-in-version

2012-04-23 Thread Jeff Johnson
OK, adoption of the ancient SuSE "twiddle-in-version" patch by
@rpm.org has now added yet a 3rd incompatibility that needs
to be addressed within what is commonly known as "rpmvercmp"

(aside)
The designator "rpmvercmp" is already horrendously misleading
because its the name of the comparison routine for each of the
V and R members of the tuple {E,V,R}. There are many many
many false starts at package version comparisons that naively
attempt
rc = rpmvercmp("E:V-R", whatever);
when what is actually needed is a loop of each member of the
tuple like
static char * EVRprecedence = "EVR";
char ptype;
int i;

for (i = 0; (ptype = EVRprecedence[i]) != '\0; i++) {
switch (ptype) {
case 'E':   /* coerce and compare Epoch: digit strings */   break;
case 'V':
case 'R':
case 'D':   /* Distepoch/Disttag comparison is similar to V and R 
iirc */
rc = rpmvercmp(somestring, anotherstring);  break;
}
if (rc) break;
}

I've written out the algorithm so that the asymmetry of the full
comparison with the existing convention of Epoch: as digit string
can be more easily seen.

With 3 "incompatibilities" in existing EVR comparison looming,
I am tipped in favor of treating Eoch: more generally as a string, not
just a digit sequence, in order to
Handle all the incompatibilities (to disappoint every one equally) at 
the same time.

The issue of whether Epoch: SHOULD be treated as a [0-9]+: pattern is
merely a stricter policy convention on the general case. There are other
possible policy conventions with V and R that could and should be
imposed more strictly than
Neither V nor R can contain a '-' character.
which has always been seen as some sort of deficiency in RPM compared
to APT/DPKG.

I can also see some usage case if "set" is described as a namespace marker
for a type of comparison, particularly since '>=" is being overloaded as a
subset "contained within" semantic instead of "rpmvercmp" comparison.

The next most useful namespace marker is likely going to be "deb" (or
"dpkg" or "borg" to taste ;-) to at least attempt some means to also
write regression tests for the corner cases that are surely going to be
recognized soon. Because of pattern parsing ambiguity with ':' (which
has already been overused as a separator of Disttag/Distepoch), this
would mean that adding a "deb" item to the (being proposed here) EVR
name space based on a generalization of Epoch: as a string marker
for dpkg comparison (where Yet Another meaning for ':' is encountered).

An example is easier than words: Suppose that there is a EVR string like
0:1.2-3
that then needs a "deb" or "rpm" name space marker. The parser pattern
could/should/would then return a tuple like
{"deb:0", "1.2", "3" }
or
{"deb", "0:1.2", "3" }
Either parsing is deterministic and implementable. I point out the details
solely to indicate some of the implementation issues that will need
a conventionally defined answer.

So the generalization to epoch-as-string (with an enforcement pattern of [0-9]+
for the Luddites) use now
+1
from me.

And its time to open the discussion for other comments. Note that you likely
have 2-3 weeks to respond before I get around to attempting the epoch-as-string
(and twiddle-in-version and set:versions generation) implementations.

Other opinions?

(aside)
A simple +1, 0, or -1 reply suffices and will be tallied. Talking
about package manager version comparison in public is known to
have carcinogenic side effects inducing brain tumors.

73 de Jeff



smime.p7s
Description: S/MIME cryptographic signature