Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-08 Thread Zbigniew Jędrzejewski-Szmek
On Fri, Apr 04, 2014 at 04:15:59PM +0200, Mikolaj Izdebski wrote:
 As I promised, I prepared a benchmark of lbzip2 and bzip2.
 Decompression of linux-3.12.6.tar.bz2
 -------------------------------------
 
 command    |   real |   user |  sys | memory
 -----------+--------+--------+------+--------
 lbzip2     |   0.79 |  30.72 | 1.70 | 448804
 lbzip2 -u  |   5.85 |  18.62 | 1.83 |  80992
 pbzip2     |  24.48 |  24.27 | 0.61 |  98444
 bzip2      |  23.95 |  23.46 | 0.44 |   4212
That is *impressive*.

Zbyszek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-08 Thread Zbigniew Jędrzejewski-Szmek
On Fri, Apr 04, 2014 at 12:49:25PM -0400, Matthew Miller wrote:
 On Fri, Apr 04, 2014 at 04:15:59PM +0200, Mikolaj Izdebski wrote:
  lbzip2 was the fastest compressor and decompressor in all tests.
  It is the best command for interactive use.
  
  lbzip2 -u always produced smallest files (even smaller than bzip2)
  while consuming the least amount of resources (CPU power and memory).
  This directly translates to lowest bills in cloud, which makes lbzip2
  -u the best choice here.
 
 But... the size difference in your test cases appears to be 0.1% and
 0.02%. Am I reading that right? And compressing linux-3.12.6.tar with xz
 instead of bzip2 gives a 15.6% improvement, or with xz -9, 19.7%. Of course,
 that's very slow, and the other resource factors are important too. (And
 lbzip2 is impressively fast.)
I think that xz is orthogonal: bzip2 files are quite popular and gains
in decompression speed are useful, even if bzip2 isn't the compressor of choice.

OTOH, we could also replace xz with a multithreaded implementation
like pxz by default. In the case of xz it would matter even more, since
xz is generally slower. It would be great if somebody proposed a change
like that.

Zbyszek

Re: Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-08 Thread Laurent Rineau
On Friday 04 April 2014 16:15:59, Mikolaj Izdebski wrote:
 CPU:           Haswell B0, Genuine Intel(R) CPU @ 2.20GHz
 bogomips:      4389.60
 Processors:    56
 NUMA Nodes:    2
 Memory:        31966 MB

It would be fair to also post a benchmark run on a more typical machine,
such as a quad-core (8 threads) with 16 GB of RAM.

-- 
Laurent Rineau
http://fedoraproject.org/wiki/LaurentRineau


Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-04 Thread Mikolaj Izdebski
As I promised, I prepared a benchmark of lbzip2 and bzip2.
I also added pbzip2 for comparison.


Basic information
=================

Test date:     2014-04-04
Tester:        Mikolaj Izdebski
Test subjects: lbzip2 2.5
               bzip2 1.0.6
               pbzip2 1.1.6
Test purpose:  compare performance, memory usage and compression
               ratio of lbzip2, bzip2 and pbzip2 in Fedora

CPU:           Haswell B0, Genuine Intel(R) CPU @ 2.20GHz
bogomips:      4389.60
Processors:    56
NUMA Nodes:    2
Memory:        31966 MB

System:        Fedora release 20 (Heisenbug)
Arch:          x86_64
Inst. method:  anaconda 20.25.15-1 (kickstart)

File system:   tmpfs (/dev/shm)


Methodology
===========

Compress and decompress different payloads:
 - Linux kernel sources.
 - tarball created from /usr

The Linux source tarball was chosen because it is a fairly large bz2
file which can easily be downloaded from the Internet to reproduce the
test results.  MD5 sums are provided for reproducibility.

  linux-3.12.6.tar  544061440  02d8601f28c519a9d4d0a2ae99bb597a
  linux-3.12.6.tar.bz2   91104346  2e1e42cf9c164d8c24bc1e33bb3c7b2b

The tarball created by running "tar cf payload.tar /usr" was chosen
because it contains different types of data (text files, executables,
incompressible files) while still allowing the results to be
reproduced quite easily.

  payload.tar  1463183360
  payload.tar.bz2   424518771

Each compression and decompression was run three times.  The run with
the median real time (wall clock) was chosen; the other two were rejected.

Times and memory usage were measured using the GNU time utility.
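
The run-selection rule above can be sketched in Python.  This is only an
illustration under stated assumptions: the report used the GNU time
utility, whereas this sketch measures wall-clock time only, and the
function names (time_once, median_run, benchmark) are hypothetical.

```python
import subprocess
import time


def time_once(argv):
    """Run argv once and record its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return {"argv": argv, "real": time.perf_counter() - start}


def median_run(runs):
    """Return the run whose real time is the median; the others are rejected."""
    ordered = sorted(runs, key=lambda run: run["real"])
    return ordered[len(ordered) // 2]


def benchmark(argv, repeats=3):
    """Time argv `repeats` times and keep only the median run."""
    return median_run([time_once(argv) for _ in range(repeats)])
```

For example, benchmark(["lbzip2", "-kf", "linux-3.12.6.tar"]) would time
three compressions and return the middle one, assuming lbzip2 is installed.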


Results
=======

real        - elapsed real time (wall clock, seconds)
user        - elapsed user time (seconds)
sys         - elapsed system time (seconds)
memory      - maximum resident set size (kbytes)
compr. size - size of resulting compressed file (bytes)


Decompression of linux-3.12.6.tar.bz2
-------------------------------------

command    |   real |   user |  sys | memory
-----------+--------+--------+------+--------
lbzip2     |   0.79 |  30.72 | 1.70 | 448804
lbzip2 -u  |   5.85 |  18.62 | 1.83 |  80992
pbzip2     |  24.48 |  24.27 | 0.61 |  98444
bzip2      |  23.95 |  23.46 | 0.44 |   4212


Compression of linux-3.12.6.tar
-------------------------------

command    |   real |   user |  sys | memory | compr. size
-----------+--------+--------+------+--------+------------
lbzip2     |   1.30 |  61.45 | 2.35 | 360280 | 91383535
lbzip2 -u  |   2.51 |  44.11 | 1.43 | 211456 | 91084544
pbzip2     |   2.69 | 105.79 | 4.11 | 488840 | 91411005
bzip2      |  66.16 |  65.82 | 0.22 |   7996 | 91104346


Decompression of payload.tar.bz2
--------------------------------

command    |   real |   user |  sys | memory
-----------+--------+--------+------+--------
lbzip2     |   2.19 |  95.16 | 3.81 | 750548
lbzip2 -u  |  23.34 |  60.31 | 5.04 | 120140
pbzip2     |  69.55 |  69.07 | 1.92 | 139060
bzip2      |  68.30 |  66.93 | 1.27 |   4216


Compression of payload.tar
--------------------------

command    |   real |   user |  sys | memory | compr. size
-----------+--------+--------+------+--------+------------
lbzip2     |   3.36 | 170.07 | 6.38 | 380448 | 424676188
lbzip2 -u  |   6.45 | 123.14 | 3.80 | 255524 | 424518771
pbzip2     |   6.78 | 288.33 | 8.90 | 491644 | 425213134
bzip2      | 176.68 | 175.76 | 0.67 |   8000 | 425108407


Conclusions
===========

Memory usage depended on the number of threads used.  The difference in
memory usage between parallel and non-parallel runs can be ignored,
as even parallel tools can be run in non-parallel mode.

lbzip2 was the fastest compressor and decompressor in all tests.
It is the best command for interactive use.
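
To check this against the tables, the wall-clock speedup of lbzip2 over
bzip2 can be computed as in the sketch below.  The numbers are copied
from the Results section; the dictionary layout and the function name
speedup are only illustrative.

```python
# Real (wall clock) times in seconds, copied from the Results tables.
REAL_SECONDS = {
    "decompress linux-3.12.6.tar.bz2": {"lbzip2": 0.79, "bzip2": 23.95},
    "compress linux-3.12.6.tar": {"lbzip2": 1.30, "bzip2": 66.16},
    "decompress payload.tar.bz2": {"lbzip2": 2.19, "bzip2": 68.30},
    "compress payload.tar": {"lbzip2": 3.36, "bzip2": 176.68},
}


def speedup(test):
    """How many times faster lbzip2 finished than bzip2 in one test."""
    times = REAL_SECONDS[test]
    return times["bzip2"] / times["lbzip2"]


for name in REAL_SECONDS:
    print(f"{name}: {speedup(name):.1f}x")
```

On this 56-processor machine the ratio is roughly 30x for decompression
and roughly 50x for compression; fewer cores would give smaller figures.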

lbzip2 -u always produced the smallest files (even smaller than bzip2)
while consuming the least amount of resources (CPU power and memory).
This directly translates to the lowest bills in the cloud, which makes
lbzip2 -u the best choice there.

pbzip2 did not allow parallel decompression.  During compression it
was always the slowest of the parallel tools, used the most CPU time
and memory, and offered the worst compression ratio.

You don't have to take this report on faith; you are free to try lbzip2
and see for yourself.


-- 
Mikolaj Izdebski
Software Engineer, Red Hat
IRC: mizdebsk

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-04 Thread Mikolaj Izdebski
On 04/02/2014 08:03 PM, Bill Nottingham wrote:
 A quick check shows lbzip2 doesn't provide a library interface, much less
 one compatible with libbz2. Is that ever intended?
 
 If it's not, saying lbzip2 is the default bzip2 *implementation* may be a
 bit of a stretch. Perhaps s/implementation/command/.

I have clarified that in the change proposal by explicitly stating that
this change replaces the bzip2 tool only and that libbzip2 is not affected.


http://fedoraproject.org/w/index.php?title=Changes/lbzip2&diff=375469&oldid=375468

-- 
Mikolaj Izdebski
Software Engineer, Red Hat
IRC: mizdebsk

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-04 Thread Michal Schmidt
On 04/04/2014 04:15 PM, Mikolaj Izdebski wrote:
 Compression of payload.tar
 --------------------------
 
 command    |   real |   user |  sys | memory | compr. size
 -----------+--------+--------+------+--------+------------
 lbzip2     |   3.36 | 170.07 | 6.38 | 380448 | 424676188
 lbzip2 -u  |   6.45 | 123.14 | 3.80 | 255524 | 424518771
 pbzip2     |   6.78 | 288.33 | 8.90 | 491644 | 425213134
 bzip2      | 176.68 | 175.76 | 0.67 |   8000 | 425108407
 
 
 Conclusions
 ===========
 [...]
 lbzip2 -u always produced smallest files (even smaller than bzip2)
 while consuming the least amount of resources (CPU power and memory).

The table above says it needs about 30 times *more* memory than bzip2.

Michal


Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-04 Thread Mikolaj Izdebski
On 04/04/2014 05:16 PM, Michal Schmidt wrote:
 On 04/04/2014 04:15 PM, Mikolaj Izdebski wrote:
 Compression of payload.tar
 --------------------------

 command    |   real |   user |  sys | memory | compr. size
 -----------+--------+--------+------+--------+------------
 lbzip2     |   3.36 | 170.07 | 6.38 | 380448 | 424676188
 lbzip2 -u  |   6.45 | 123.14 | 3.80 | 255524 | 424518771
 pbzip2     |   6.78 | 288.33 | 8.90 | 491644 | 425213134
 bzip2      | 176.68 | 175.76 | 0.67 |   8000 | 425108407


 Conclusions
 ===========
 [...]
 lbzip2 -u always produced smallest files (even smaller than bzip2)
 while consuming the least amount of resources (CPU power and memory).
 
 The table above says it needs about 30 times *more* memory than bzip2.

No, it shows that it *used* that much memory.

The system had 32 GB of RAM; lbzip2 using all 56 CPUs used less than
1.2% of available memory.  That is *very* conservative.
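
Both figures in this exchange can be reproduced from the report with a
little arithmetic.  The sketch below assumes that the "kbytes" printed by
GNU time are units of 1024 bytes and that the "MB" in the system
description are units of 1024 kbytes; the function names are
illustrative only.

```python
KBYTES_PER_MB = 1024  # assumption: binary units on both sides


def percent_of_ram(rss_kbytes, total_ram_mb):
    """Peak resident set size as a percentage of installed RAM."""
    return 100.0 * rss_kbytes / (total_ram_mb * KBYTES_PER_MB)


def memory_ratio(rss_kbytes, baseline_kbytes):
    """How many times more memory one tool used than another."""
    return rss_kbytes / baseline_kbytes


# Compression of payload.tar on the 31966 MB test machine.
print(round(percent_of_ram(380448, 31966), 2))  # lbzip2, all threads
print(round(memory_ratio(255524, 8000), 1))     # lbzip2 -u versus bzip2
```

The first value is the "less than 1.2%" above; the second is the "about
30 times" from the quoted objection.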

Memory usage can be limited by lowering the number of threads used (-n)
or by specifying an explicit memory limit (-m, undocumented for now; it
will be fully supported in a future version of lbzip2 after it gets
enough testing).

-- 
Mikolaj Izdebski
Software Engineer, Red Hat
IRC: mizdebsk

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-04 Thread Mikolaj Izdebski
On 04/04/2014 05:26 PM, Mikolaj Izdebski wrote:
 On 04/04/2014 05:16 PM, Michal Schmidt wrote:
 On 04/04/2014 04:15 PM, Mikolaj Izdebski wrote:
 Compression of payload.tar
 --------------------------

 command    |   real |   user |  sys | memory | compr. size
 -----------+--------+--------+------+--------+------------
 lbzip2     |   3.36 | 170.07 | 6.38 | 380448 | 424676188
 lbzip2 -u  |   6.45 | 123.14 | 3.80 | 255524 | 424518771
 pbzip2     |   6.78 | 288.33 | 8.90 | 491644 | 425213134
 bzip2      | 176.68 | 175.76 | 0.67 |   8000 | 425108407


 Conclusions
 ===========
 [...]
 lbzip2 -u always produced smallest files (even smaller than bzip2)
 while consuming the least amount of resources (CPU power and memory).

 The table above says it needs about 30 times *more* memory than bzip2.
 
 No, it shows that it *used* that much memory.
 
 The system had 32 GB of RAM, lbzip2 using all 56 CPUs used less than 1.2
 % of available memory.  That is *very* conservative.
 
 Memory usage can be limited by lowering number of threads used (-n) or
 by specifying explicit memory limit (-m, undocumented for now, it will
 be fully supported in future version of lbzip2 after it gets enough
 testing).

lbzip2 can use less memory than bzip2 while still being much faster.

Example results:

Command being timed: bzip2 -kf linux-3.12.6.tar
User time (seconds): 47.70
System time (seconds): 0.20
Percent of CPU this job got: 95%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:49.92
Maximum resident set size (kbytes): 7884

Command being timed: lbzip2 -kfusn1 linux-3.12.6.tar
User time (seconds): 31.77
System time (seconds): 0.89
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:32.96
Maximum resident set size (kbytes): 7704

-- 
Mikolaj Izdebski
Software Engineer, Red Hat
IRC: mizdebsk

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-04 Thread Matthew Miller
On Fri, Apr 04, 2014 at 04:15:59PM +0200, Mikolaj Izdebski wrote:
 lbzip2 was the fastest compressor and decompressor in all tests.
 It is the best command for interactive use.
 
 lbzip2 -u always produced smallest files (even smaller than bzip2)
 while consuming the least amount of resources (CPU power and memory).
 This directly translates to lowest bills in cloud, which makes lbzip2
 -u the best choice here.

But... the size difference in your test cases appears to be 0.1% and
0.02%. Am I reading that right? And compressing linux-3.12.6.tar with xz
instead of bzip2 gives a 15.6% improvement, or with xz -9, 19.7%. Of course,
that's very slow, and the other resource factors are important too. (And
lbzip2 is impressively fast.)


-- 
Matthew Miller  --  Fedora Project  --  mat...@fedoraproject.org
  Tepid change for the somewhat better!

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-04 Thread Susi Lehtola
On Fri, 4 Apr 2014 12:49:25 -0400
Matthew Miller mat...@fedoraproject.org wrote:
 On Fri, Apr 04, 2014 at 04:15:59PM +0200, Mikolaj Izdebski wrote:
  lbzip2 -u always produced smallest files (even smaller than bzip2)
  while consuming the least amount of resources (CPU power and memory).
  This directly translates to lowest bills in cloud, which makes lbzip2
  -u the best choice here.
 
 But... the size difference in your test cases appears to be 0.1% and
 0.02%. Am I reading that right? And compressing linux-3.12.6.tar with xz
 instead of bzip2 gives a 15.6% improvement, or with xz -9, 19.7%. Of course,
 that's very slow, and the other resource factors are important too. (And
 lbzip2 is impressively fast.)

Well, looking at the table, I calculate size differences of -0.10% and
-0.14% for lbzip2 and lbzip2 -u, respectively, compared to bzip2 for
compression of payload.tar.

.. and +0.31% and -0.02% for linux-3.12.6.tar.
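
For anyone who wants to redo the calculation, here is a small sketch
using the "compr. size" column of the benchmark tables.  The function
name and data layout are made up for illustration; negative values mean
a smaller file than bzip2 produced.  Note that plain lbzip2 output for
the kernel tarball is larger than that of bzip2, so that difference
comes out positive.

```python
# Compressed sizes in bytes, copied from the benchmark tables.
SIZES = {
    "linux-3.12.6.tar": {
        "lbzip2": 91383535,
        "lbzip2 -u": 91084544,
        "pbzip2": 91411005,
        "bzip2": 91104346,
    },
    "payload.tar": {
        "lbzip2": 424676188,
        "lbzip2 -u": 424518771,
        "pbzip2": 425213134,
        "bzip2": 425108407,
    },
}


def size_diff_percent(payload, command, baseline="bzip2"):
    """Relative size difference of `command` output versus `baseline`, in percent."""
    sizes = SIZES[payload]
    return 100.0 * (sizes[command] - sizes[baseline]) / sizes[baseline]


for payload in SIZES:
    for command in ("lbzip2", "lbzip2 -u"):
        print(f"{payload:18} {command:10} {size_diff_percent(payload, command):+.2f}%")
```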
-- 
Susi Lehtola
Fedora Project Contributor
jussileht...@fedoraproject.org

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-04 Thread Mikolaj Izdebski
On 04/04/2014 07:01 PM, Susi Lehtola wrote:
 On Fri, 4 Apr 2014 12:49:25 -0400
 Matthew Miller mat...@fedoraproject.org wrote:
 On Fri, Apr 04, 2014 at 04:15:59PM +0200, Mikolaj Izdebski wrote:
 lbzip2 -u always produced smallest files (even smaller than bzip2)
 while consuming the least amount of resources (CPU power and memory).
 This directly translates to lowest bills in cloud, which makes lbzip2
 -u the best choice here.

 But... the size difference in your test cases appears to be 0.1% and
 0.02%. Am I reading that right? And compressing linux-3.12.6.tar with xz
 instead of bzip2 gives a 15.6% improvement, or with xz -9, 19.7%. Of course,
 that's very slow, and the other resource factors are important too. (And
 lbzip2 is impressively fast.)
 
 Well, looking at the table, I calculate size differences of -0.10% and
 -0.14% for lbzip2 and lbzip2 -u, respectively, compared to bzip2 for
 compression of payload.tar.

In general lbzip2 has a compression ratio very close to that of bzip2.

lbzip2 -u almost always produces marginally smaller files than bzip2.
Without -u it varies: sometimes lbzip2 produces marginally bigger bz2
files, sometimes marginally smaller ones.

<ot>

About xz...

For some types of data bz2 compression works better than xz.  Examples:
sparse disk images containing lots of zeroes, or genome DNA sequences.

$ dd if=/dev/zero of=zero bs=100 count=100
$ lbzip2 -ku zero
$ xz -k zero
-rw-rw-r--. 1 1 Apr  4 19:17 zero
-rw-rw-r--. 1   113 Apr  4 19:17 zero.bz2
-rw-rw-r--. 1 14676 Apr  4 19:17 zero.xz
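
The same effect can be observed without either command-line tool, using
the bz2 and lzma modules from the Python standard library.  This is only
a sketch: it uses libbz2 and liblzma rather than lbzip2 and the xz
command, and the 10 MiB buffer size is an arbitrary choice for
illustration, so the byte counts will differ from the listing above.

```python
import bz2
import lzma


def compressed_sizes(data):
    """Compress data with bzip2 and xz (default settings); return sizes in bytes."""
    return {
        "bz2": len(bz2.compress(data)),
        "xz": len(lzma.compress(data)),
    }


# 10 MiB of zero bytes, standing in for a sparse disk image.
sizes = compressed_sizes(bytes(10 * 1024 * 1024))
print(sizes)
```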

xz doesn't allow parallel decompression in general.  When restoring
backups you are under time pressure and fast decompression can come in
very handy.

When an xz file is damaged, all data after the point of damage is
lost.  But the lbzrecover tool from lbzip2-utils allows easy recovery of
data from the undamaged parts of any bz2 file.

Personally, for the above reasons I recommend that people use lbzip2 for
backups rather than xz.  But I admit xz is a better format for some use
cases.

</ot>

-- 
Mikolaj Izdebski
Software Engineer, Red Hat
IRC: mizdebsk

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-03 Thread Ville Skyttä
On Wed, Apr 2, 2014 at 11:27 PM, Zbigniew Jędrzejewski-Szmek
zbys...@in.waw.pl wrote:
 ** possibly adjust spec files to require or build-require lbzip2 instead of
 bzip2.
 Is this necessary?

I don't think so. A better way would be to change them to depend on
the actual executables they use, /usr/bin/bzip2 etc. Naturally, the
lbzip2/bzip2 alternative packaging needs to be properly done so that
this is possible, but I assume that's going to be done in any case.

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-03 Thread Mikolaj Izdebski
On 04/03/2014 03:47 AM, Chris Adams wrote:
 Many of the common users (such as rpm) are linked
 against the library and don't use the command, so they won't be
 impacted.

rpm does use the bzip2 *command* and it would be impacted by this change.

rpm uses libbz2 only for compression and decompression of rpm package
payloads.  Since Fedora uses LZMA compression, libbz2 is used only when
installing some older third-party packages which happen to be compressed
with bzip2.

-- 
Mikolaj Izdebski
Software Engineer, Red Hat
IRC: mizdebsk

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-03 Thread Ville Skyttä
On Thu, Apr 3, 2014 at 1:03 PM, Mikolaj Izdebski mizde...@redhat.com wrote:
 rpm does use bzip2 *command*

To be more precise, I believe only rpmbuild does.

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-03 Thread Miloslav Trmač
2014-04-02 19:24 GMT+02:00 Jaroslav Reznik jrez...@redhat.com:

 = Proposed System Wide Change:  lbzip2 as default bzip2 implementation =
 https://fedoraproject.org/wiki/Changes/lbzip2


While the speedup is desirable, it's not really obvious that this is the
right time to do the change.

Looking at http://lbzip2.org/news , lbzip2 is still fixing crashes during
compression and decompression.  That's rather troubling: we need the bzip2
implementation to be roughly as stable as the file system.  The Change page
implies that bzip2 is not actively maintained; that may be true but looking
at bugzilla.redhat.com, there has AFAICT never been a bug reporting that
something can't be compressed or decompressed--that's a *very* high bar to
match.  (I do appreciate that assertion failure and silent miscompression
are not the same thing.)

Having the library implementation and the command-line implementation
completely separate may frustrate debugging efforts, because using an
application's built-in compression may give different results from saving
the data uncompressed and compressing it manually.  That's not a
deal-breaker but having a single implementation would certainly simplify
things.

Ultimately the easiest way to make this implementation change happen, not
only in Fedora but in all distributions, would be for the improvements to
be integrated into the upstream bzip2 codebase; has that possibility been
explored at all?
Mirek

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-03 Thread Mikolaj Izdebski
On 04/03/2014 06:08 PM, Miloslav Trmač wrote:
 Looking at http://lbzip2.org/news , lbzip2 is still fixing crashes during
 compression and decompression.  That's rather troubling: we need the bzip2
 implementation to be roughly as stable as the file system.

They say that every non-trivial piece of software has at least one bug.
bzip2 also has bugs; I am aware of a few of them myself.

 The Change page
 implies that bzip2 is not actively maintained; that may be true but looking
 at bugzilla.redhat.com, there has AFAICT never been a bug reporting that
 something can't be compressed or decompressed--that's a *very* high bar to
 match.  (I do appreciate that assertion failure and silent miscompression
 are not the same thing.)

Neither has there been one for lbzip2.  And as a matter of fact, bzip2 is
compiled with most assertions disabled.  But I understand the point.

It may be true that bzip2 is more stable, but that's because it was
given a chance by being included in popular operating systems, and after
the initial bugs were fixed it has stayed there unchanged for years.

I care about lbzip2 quality very much.  I run a test suite consisting of
over 320,000 automated test cases, I compile it with all possible
warnings enabled, and I test it with multiple static analysis tools.  Any
bugs that may be found during testing in Fedora will be taken care of
with high priority.

I believe that lbzip2 deserves to be given a chance and if for some
reason it turns out not to be ready, the Change can be reverted very
easily with a single spec file modification.

 Having the library implementation and the command-line implementation
 completely separate may frustrate debugging efforts when using an
 application-builtin compression and saving uncompressed and compressing
 manually may give different results.  That's not a deal-breaker but having
 a single implementation would certainly simplify things.

Users will still be able to run bzip2 explicitly if needed, or configure
alternatives to use it as the implementation of /usr/bin/bzip2 on their
systems.

Besides that I am willing to provide a library interface for lbzip2 in
future if there is demand.

 Ultimately the easiest way to make this implementation change happen, not
 only in Fedora but in all distributions, would be for the improvements to
 be integrated into the upstream bzip2 codebase; has that possibility been
 explored at all?

lbzip2 as it is now is a merger of 2 projects -- a parallel bzip2-like
tool using libbz2 by Laszlo Ersek (started in 2008) and an improved
low-level bzip2 library by me (started in 2007).  The 2 projects were
merged in 2010 and I have maintained lbzip2 since then.

While I would like the improvements and new features of lbzip2 to be
included in bzip2, I have lost hope of this ever happening.  Both Laszlo
and I tried contacting Julian Seward (the author of bzip2) and
contributing to bzip2, but without any success.

Initially Julian admitted that it would be desirable to parallelise
bzip2.  The plan was to evaluate existing implementations and decide
which one to integrate with bzip2, or start the work from scratch.  But
none of that happened.  The last conversation about improving bzip2
took place in 2009 and only a single patch has been included in bzip2
since then -- a fix for an important security bug which I discovered
(CVE-2010-0405 [1]; it took 6 months to be applied in bzip2).

That said, I am still willing to cooperate with Julian and discuss
possibilities of merging some code or improving bzip2 in any way.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2010-0405

-- 
Mikolaj Izdebski
Software Engineer, Red Hat
IRC: mizdebsk

F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Jaroslav Reznik
= Proposed System Wide Change:  lbzip2 as default bzip2 implementation =
https://fedoraproject.org/wiki/Changes/lbzip2

Change owner(s): Mikolaj Izdebski mizde...@redhat.com

This change aims at making lbzip2 [1] the default bzip2 implementation used
in Fedora.

== Detailed Description ==
lbzip2 is an independent implementation of the bzip2 compression tool. It
provides an interface strictly compatible with bzip2, but also adds several
new features and improvements, such as:

* multi-threaded operation for both compression and decompression, with almost 
linear scalability,
* improved performance, even on single-core systems,
* improved extra utilities (bzdiff, bzless, bzip2recover, etc.),
* improved compatibility with gzip. 

lbzip2 is a mature project and it has been used in production for years. It is 
already packaged for Fedora and it is also available in EPEL.

The case of bzip2 and lbzip2 is an ideal candidate for usage of alternatives - 
both tools provide commands with compatible interfaces. This change proposes 
assigning higher priority to lbzip2 than to bzip2, which will effectively 
cause lbzip2 to be used instead of bzip2, if lbzip2 is installed. If for some 
reason some users don't like the change they can reconfigure alternatives 
manually and keep using bzip2. 

== Scope ==
* Proposal owners:
** make lbzip2 and bzip2 packages use alternatives for binaries and manpages 
they provide,
** set higher priority for lbzip2 in alternatives,
** identify packages which require bzip2 and port some of them to use lbzip2 
instead. 

* Other developers:
** test if their packages work with lbzip2,
** possibly adjust spec files to require or build-require lbzip2 instead of 
bzip2. 

* Release engineering: no action required. 
* Policies and guidelines: no change required. 

[1] http://lbzip2.org/
___
devel-announce mailing list
devel-annou...@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel-announce

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Bill Nottingham
Jaroslav Reznik (jrez...@redhat.com) said: 
 = Proposed System Wide Change:  lbzip2 as default bzip2 implementation =
 https://fedoraproject.org/wiki/Changes/lbzip2
 
 Change owner(s): Mikolaj Izdebski mizde...@redhat.com
 
 This change aims at making lbzip2 [1] default bzip2 implementation used in 
 Fedora. 
 
 == Detailed Description ==
 lbzip2 is an independent implementation of bzip2 compression tool. It provides
 interface strictly compatible with bzip2, but also adds several new features
 and improvements, such as:
 
 * multi-threaded operation for both compression and decompression, with almost
 linear scalability,
 * improved performance, even on single-core systems,
 * improved extra utilities (bzdiff, bzless, bzip2recover, etc.),
 * improved compatibility with gzip.
 
 lbzip2 is a mature project and it has been used in production for years. It is
 already packaged for Fedora and it is also available in EPEL.

A quick check shows lbzip2 doesn't provide a library interface, much less
one compatible with libbz2. Is that ever intended?

If it's not, saying lbzip2 is the default bzip2 *implementation* may be a
bit of a stretch. Perhaps s/implementation/command/.

Bill

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Tom Hughes

On 02/04/14 18:24, Jaroslav Reznik wrote:


== Detailed Description ==
lbzip2 is an independent implementation of bzip2 compression tool. It provides
interface strictly compatible with bzip2, but also adds several new features
and improvements, such as:

* multi-threaded operation for both compression and decompression, with almost
linear scalability,


Does that mean that it creates multiple streams in the compressed file?

If it does then be aware that some bzip2 decoders (notably the Java one)
will not be able to decompress the result.


Tom

--
Tom Hughes (t...@compton.nu)
http://compton.nu/

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Mikolaj Izdebski
On 04/02/2014 08:03 PM, Bill Nottingham wrote:
 Jaroslav Reznik (jrez...@redhat.com) said: 
 = Proposed System Wide Change:  lbzip2 as default bzip2 implementation =
 https://fedoraproject.org/wiki/Changes/lbzip2

 Change owner(s): Mikolaj Izdebski mizde...@redhat.com

 This change aims at making lbzip2 [1] default bzip2 implementation used in 
 Fedora. 

 == Detailed Description ==
 lbzip2 is an independent implementation of bzip2 compression tool. It provides
 interface strictly compatible with bzip2, but also adds several new features
 and improvements, such as:

 * multi-threaded operation for both compression and decompression, with almost
 linear scalability,
 * improved performance, even on single-core systems,
 * improved extra utilities (bzdiff, bzless, bzip2recover, etc.),
 * improved compatibility with gzip.

 lbzip2 is a mature project and it has been used in production for years. It is
 already packaged for Fedora and it is also available in EPEL.
 
 A quick check shows lbzip2 doesn't provide a library interface, much less
 one compatible with libbz2. Is that ever intended?

That was once intended (in 2007-2010), but for now I decided to provide
bzip2-compatible commands only.  If there is demand I will reconsider
providing a library with bzip2-compatible API/ABI.

 If it's not, saying lbzip2 is the default bzip2 *implementation* may be a
 bit of a stretch. Perhaps s/implementation/command/.

You're right, the title and description may be ambiguous.  In this
sentence bzip2 means bzip2 command.

-- 
Mikolaj Izdebski
Software Engineer, Red Hat
IRC: mizdebsk

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Mikolaj Izdebski
On 04/02/2014 08:10 PM, Tom Hughes wrote:
 On 02/04/14 18:24, Jaroslav Reznik wrote:
 
 == Detailed Description ==
 lbzip2 is an independent implementation of bzip2 compression tool. It provides
 interface strictly compatible with bzip2, but also adds several new features
 and improvements, such as:

 * multi-threaded operation for both compression and decompression, with almost
 linear scalability,
 
 Does that mean that it creates multiple streams in the compressed file?
 
 If it does then be aware that some bzip2 decoders (notably the Java one)
 will not be able to decompress the result.

lbzip2 creates only *one* stream per compressed file, even when using
multiple threads.  Such files can be decompressed with all versions of
bzip2, libbz2 and other tools, such as Apache Commons Compress.

This is a difference between lbzip2 and pbzip2, which creates multiple
streams.  Files created with pbzip2 cannot be decompressed by some
software, such as libbzip2 (all versions), bzip2 older than version
0.9.0, and Apache Commons Compress.
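
The single-stream versus multi-stream distinction can be illustrated
with the bz2 module from the Python standard library.  This is a sketch,
not a test of lbzip2 or pbzip2 themselves: the multi-stream input is
simulated by concatenating two independently compressed streams, and the
helper names are hypothetical.

```python
import bz2


def count_streams(data):
    """Count the bzip2 streams concatenated in data."""
    count = 0
    while data:
        decoder = bz2.BZ2Decompressor()
        decoder.decompress(data)
        if not decoder.eof:
            raise ValueError("truncated bzip2 stream")
        count += 1
        # Bytes following the end of this stream belong to the next one.
        data = decoder.unused_data
    return count


def decode_first_stream_only(data):
    """Mimic a decoder that stops at the end of the first stream."""
    return bz2.BZ2Decompressor().decompress(data)


single = bz2.compress(b"hello world")
multi = bz2.compress(b"hello ") + bz2.compress(b"world")

print(count_streams(single), count_streams(multi))
print(decode_first_stream_only(multi))  # only the first part is recovered
print(bz2.decompress(multi))            # a multi-stream-aware decoder gets it all
```

Running count_streams on the output of each tool is one way to verify
the claim above on your own files.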

-- 
Mikolaj Izdebski
Software Engineer, Red Hat
IRC: mizdebsk

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Reindl Harald


Am 02.04.2014 20:18, schrieb Mikolaj Izdebski:
 lbzip2 is a mature project and it has been used in production for years. It
 is already packaged for Fedora and it is also available in EPEL.

 A quick check shows lbzip2 doesn't provide a library interface, much less
 one compatible with libbz2. Is that ever intended?
 
 That was once intended (in 2007-2010), but for now I decided to provide
 bzip2-compatible commands only.  If there is demand I will reconsider
 providing a library with bzip2-compatible API/ABI.
 
 If it's not, saying lbzip2 is the default bzip2 *implementation* may be a
 bit of a stretch. Perhaps s/implementation/command/.
 
 You're right, the title and description may be ambiguous.  In this
 sentence bzip2 means bzip2 command

packages using the library (output of yum remove bzip2-libs-1.0.6-9.fc20.x86_64)

--> Processing Dependency: libbz2.so.1()(64bit) for package: gnupg-1.4.16-2.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: perl-Compress-Raw-Bzip2-2.062-2.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: libsemanage-2.1.10-14.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: ImageMagick-c++-6.8.6.3-3.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: 6:kdelibs-4.12.3-1.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: ffmpeg-libs-2.1.4-4.fc20.20140324.rh.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: bzip2-1.0.6-9.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: ffmpeg-compat-0.6.7-4.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: python-deltarpm-3.6-3.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: libarchive-3.1.2-7.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: gnome-vfs2-2.24.4-14.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: elfutils-libs-0.158-1.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: elinks-0.12-0.36.pre6.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: deltarpm-3.6-3.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: rpm-build-libs-4.11.2-2.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: rpm-4.11.2-2.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: gstreamer-plugins-bad-free-0.10.23-20.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: unzip-6.0-12.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: php-5.5.10-2.fc20.20140306.rh.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: ImageMagick-libs-6.8.6.3-3.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: GraphicsMagick-1.3.18-4.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: zip-3.0-9.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: php-cli-5.5.10-2.fc20.20140306.rh.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: ImageMagick-6.8.6.3-3.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: gstreamer1-plugins-bad-free-1.2.3-1.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: gstreamer-ffmpeg-0.10.13-11.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: strigi-libs-0.7.8-2.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: python3-libs-3.3.2-11.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: clamav-lib-0.98.1-1.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: genisoimage-1.1.11-22.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: gnupg2-2.0.22-1.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: rpm-libs-4.11.2-2.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: 2:gimp-2.8.10-4.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: python-libs-2.7.5-11.fc20.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: ffmpeg-latest-2.2-2.fc20.20140324.rh.x86_64
--> Processing Dependency: libbz2.so.1()(64bit) for package: tokyocabinet-1.4.48-2.fc20.x86_64




Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Tom Hughes

On 02/04/14 19:22, Mikolaj Izdebski wrote:


lbzip2 creates only *one* stream per compressed file, even when using
multiple threads.  Such files can be decompressed with all versions of
bzip2, libbz2 and other tools, such as Apache Commons Compress.

This is a difference between lbzip2 and pbzip2, which creates multiple
streams.  Files created with pbzip2 cannot be decompressed by some
software, such as libbzip2 (all versions), bzip2 older than version
0.9.0, Apache Commons Compress.


That's great then. I was aware that this was an issue with pbzip2 and 
wanted to make sure it wasn't a problem here.


Tom

--
Tom Hughes (t...@compton.nu)
http://compton.nu/

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Josh Stone
On 04/02/2014 11:33 AM, Reindl Harald wrote:
 packages using the library (output of yum remove 
 bzip2-libs-1.0.6-9.fc20.x86_64)

Try: repoquery --whatrequires 'libbz2.so.1()(64bit)'

We'll certainly need to keep the library around.

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Mikolaj Izdebski
 On 04/02/2014 11:33 AM, Reindl Harald wrote:
  packages using the library (output of yum remove
  bzip2-libs-1.0.6-9.fc20.x86_64)
 
 Try: repoquery --whatrequires 'libbz2.so.1()(64bit)'
 
 We'll certainly need to keep the library around.

Yes, but we can have /usr/bin/bzip2 (and other commands) pointing to
lbzip2 binary and at the same time applications can keep using libbz2.

--
Mikolaj Izdebski

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Zbigniew Jędrzejewski-Szmek
On Wed, Apr 02, 2014 at 03:10:23PM -0400, Mikolaj Izdebski wrote:
  On 04/02/2014 11:33 AM, Reindl Harald wrote:
   packages using the library (output of yum remove
   bzip2-libs-1.0.6-9.fc20.x86_64)
  
  Try: repoquery --whatrequires 'libbz2.so.1()(64bit)'
 
  We'll certainly need to keep the library around.
 
 Yes, but we can have /usr/bin/bzip2 (and other commands) pointing to
 lbzip2 binary and at the same time applications can keep using libbz2.
Hm, both python-libs and python3-libs, and rpm are on the list. This
means that basically every system will have two implementations of bzip2.
Not *that* big of an issue, but it certainly would be nicer to add the
library interface so that we can get rid of this duplication.

Zbyszek

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Zbigniew Jędrzejewski-Szmek
 ** possibly adjust spec files to require or build-require lbzip2 instead of 
 bzip2. 
Is this necessary? Wouldn't it be better to have lbzip2 Provide bzip2
or something so that updating all those packages is not necessary,
and also that people who prefer normal bzip2 can still use it?

Zbyszek

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread drago01
On Wed, Apr 2, 2014 at 10:27 PM, Zbigniew Jędrzejewski-Szmek
zbys...@in.waw.pl wrote:
 ** possibly adjust spec files to require or build-require lbzip2 instead of
 bzip2.
 Is this necessary? Wouldn't it be better to have lbzip2 Provide bzip2
 or something so that updating all those packages is not necessary,
 and also that people who prefer normal bzip2 can still use it?

Why would people prefer it? If it is the same but slower?

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Ian Malone
On 2 April 2014 21:46, drago01 drag...@gmail.com wrote:
 On Wed, Apr 2, 2014 at 10:27 PM, Zbigniew Jędrzejewski-Szmek
 zbys...@in.waw.pl wrote:
 ** possibly adjust spec files to require or build-require lbzip2 instead of
 bzip2.
 Is this necessary? Wouldn't it be better to have lbzip2 Provide bzip2
 or something so that updating all those packages is not necessary,
 and also that people who prefer normal bzip2 can still use it?

 Why would people prefer it? If it is the same but slower?

Yes, if it's interface compatible then it's pretty nice. Multithreaded
compression is handy from time to time. Pity about no library
interface (lbzip2 might be an unfortunate name choice...),
particularly for things like perl and python. I suppose from the pov
of minimal systems it might be nice to not have to have both if you
need to fulfil both a bzip2 library requirement and a bzip2
requirement, but that's a very particular case for a few kB saving.

-- 
imalone
http://ibmalone.blogspot.co.uk

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Al Dunsmuir
On Wednesday, April 2, 2014, 4:27:55 PM, Zbigniew Jędrzejewski-Szmek wrote:

 ** possibly adjust spec files to require or build-require lbzip2 instead of 
 bzip2. 
 Is this necessary? Wouldn't it be better to have lbzip2 Provide bzip2
 or something so that updating all those packages is not necessary,
 and also that people who prefer normal bzip2 can still use it?

This  sounds like a very intrusive change that in the worst case could
introduce  errors  and  expose user data to permanent loss without any
means to recover.

Many tools read and write zip files. The actual ZIP standard format is
controlled by (coordinated by) PKZIP via their application note. There
is a lot of discussion about common extensions, but each tool can have
their own private extensions that may be incompatible.

The  InfoZIP  team  that  gives  you the zip and unzip tools has added
support   for   the  bzip2  and  lzma  compression  and  decompression
algorithms,  as  well  as  AES encryption/decryption in their upcoming
beta  release. I've been involved in reworking U*IX build support, and
working towards better support on mainframe platforms (z/OS, z/VM) and
AIX. I do all my primary development and testing on Fedora, but others
have their own preferred platform.

This  implementation  has  been built and extensively tested using the
current  release  of the real bzip2 library. Substituting a completely
different  library  implementation without going through extensive and
explicit validating and testing is risky and unreasonable. At best, it
would   complicate   problem  reporting,  reproduction,  analysis  and
correction.

The  libzip  effort  is not part of the InfoZIP project. They may have
created a wonderful library, but it will not have identical interfaces
and behaviours as the original bzip2 library.

Let the upstream tools decide which bzip implementation(s) to support,
and do the necessary validation to ensure that it all works correctly.
It  is  the  upstream  tool's  reputation that would be damaged if the
Fedora  library  caused  user  data  to be lost. Let them do their job
correctly.

Final gloomy thought of the day: breaking zip files could mean you are
breaking the actual tool used for backups. That makes data loss a very
permanent problem.

Al


Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Mikolaj Izdebski
  ** possibly adjust spec files to require or build-require lbzip2 instead of
  bzip2.
 Is this necessary? Wouldn't it be better to have lbzip2 Provide bzip2
 or something so that updating all those packages is not necessary,
 and also that people who prefer normal bzip2 can still use it?

You could do that, but then two packages would have the same virtual
provide and it wouldn't be well defined which one would be installed.
In such cases YUM seems to choose packages with shorter names, so it
would prefer bzip2 over lbzip2.  At least one package somewhere
low in the system has to require lbzip2 explicitly.

I think that package maintainers who want to use lbzip2 should declare
it as an explicit dependency.  Assuming that packages call the standard
bzip2 command names (bzip2, bzcat and so on), users will still be able
to switch implementations as they want by configuring alternatives,
even if a package has a Requires on lbzip2.

We don't need to migrate all packages to Require lbzip2.  If lbzip2 has
the highest priority in alternatives then it will be used as long as it is
installed.  It doesn't even need to Provide bzip2.  It should be enough
to include lbzip2 in the minimal installation and add it as a dependency of
@buildsys-build to effectively make it the default implementation.
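
The priority rule can be modelled in a few lines of Python.  This is
only a toy model of how alternatives picks a provider; the function
name and the priority numbers are made up for illustration and are not
taken from any package.

```python
def pick_alternative(priorities, installed, manual_choice=None):
    """Toy model of alternatives: return the provider a command resolves to.

    priorities    -- dict mapping provider name to priority (hypothetical values)
    installed     -- set of provider names currently installed
    manual_choice -- provider pinned by the administrator, if any
    """
    # Manual mode: an explicit choice wins as long as it is still installed.
    if manual_choice is not None and manual_choice in installed:
        return manual_choice
    # Automatic mode: the installed provider with the highest priority wins.
    available = {name: prio for name, prio in priorities.items()
                 if name in installed}
    return max(available, key=available.get, default=None)


# Hypothetical priorities: lbzip2 is given the higher number.
priorities = {"bzip2": 10, "lbzip2": 20}

print(pick_alternative(priorities, {"bzip2", "lbzip2"}))
print(pick_alternative(priorities, {"bzip2"}))
print(pick_alternative(priorities, {"bzip2", "lbzip2"}, manual_choice="bzip2"))
```

With both installed the model picks lbzip2, with only bzip2 installed it
picks bzip2, and a manual choice of bzip2 overrides the priority.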

--
Mikolaj Izdebski

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Mikolaj Izdebski
 On 2 April 2014 21:46, drago01 drag...@gmail.com wrote:
  On Wed, Apr 2, 2014 at 10:27 PM, Zbigniew Jędrzejewski-Szmek
  zbys...@in.waw.pl wrote:
  ** possibly adjust spec files to require or build-require lbzip2 instead
  of
  bzip2.
  Is this necessary? Wouldn't it be better to have lbzip2 Provide bzip2
  or something so that updating all those packages is not necessary,
  and also that people who prefer normal bzip2 can still use it?
 
  Why would people prefer it? If it is the same but slower?
 
 Yes, if it's interface compatible then it's pretty nice. Multithreaded
 compression is handy from time to time. Pity about no library
 interface (lbzip2 might be an unfortunate name choice...),
 particularly for things like perl and python. I suppose from the pov
 of minimal systems it might be nice to not have to have both if you
 need to fulfil both a bzip2 library requirement and a bzip2
 requirement, but that's a very particular case for a few kB saving.

It shouldn't be difficult to provide a library interface.  In fact
I planned it from the very beginning, but then lacked motivation.
I'm sure that accepting this Change will be a good enough motivator for me
to finally do this.

--
Mikolaj Izdebski

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Matthew Garrett
On Wed, Apr 02, 2014 at 05:14:53PM -0400, Al Dunsmuir wrote:

 This  implementation  has  been built and extensively tested using the
 current  release  of the real bzip2 library. Substituting a completely
 different  library  implementation without going through extensive and
 explicit validating and testing is risky and unreasonable. At best, it
 would   complicate   problem  reporting,  reproduction,  analysis  and
 correction.

The suggestion is to replace the tool, not the library.

-- 
Matthew Garrett | mj...@srcf.ucam.org

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Al Dunsmuir
Hello Al,

On Wednesday, April 2, 2014, 5:14:53 PM, Al Dunsmuir wrote:
 On Wednesday, April 2, 2014, 4:27:55 PM, Zbigniew Jędrzejewski-Szmek wrote:
 ** possibly adjust spec files to require or build-require lbzip2 instead of
 bzip2. 
 Is this necessary? Wouldn't it be better to have lbzip2 Provide bzip2
 or something so that updating all those packages is not necessary,
 and also that people who prefer normal bzip2 can still use it?

It's  been  a long day. In my haste to reply, I wrote libzip where I
meant to say lbzip2. The rest of my point remains unchanged.

By  the way, part of the reason for my slip is that while I have heard
of a libzip library, I have never heard of lbzip2 before. I've never
seen  it  come  up  with  web  searches related to the InfoZIP work to
support  bzip2  encoding.  I'm  going to post the original post and my
comments in the private InfoZIP development list, but I suspect I will
get responses that I am not the only person becoming aware of this new
tool.

Al


Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Mikolaj Izdebski
 This  implementation  has  been built and extensively tested using the
 current  release  of the real bzip2 library. Substituting a completely
 different  library  implementation without going through extensive and
 explicit validating and testing is risky and unreasonable. At best, it
 would   complicate   problem  reporting,  reproduction,  analysis  and
 correction.

lbzip2 does not (at least not yet) provide interfaces of libbz2 library,
only command line tools.  This Change does *not* affect users of libbz2.

--
Mikolaj Izdebski

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread drago01
On Wed, Apr 2, 2014 at 11:39 PM, Chris Adams li...@cmadams.net wrote:
 Once upon a time, Mikolaj Izdebski mizde...@redhat.com said:
 lbzip2 does not (at least not yet) provide interfaces of libbz2 library,
 only command line tools.  This Change does *not* affect users of libbz2.

 Is there enough of a gain to the system to only partially replace a core
 program like this (especially with alternatives)?  This seems like a
 case where either we get a new and improved and replaces the old
 version (where the new one just obsoletes the old one, such as the
 jpeg-turbo change), or just leave it alone.

 Please understand, I don't mean to attack you or your code.  I just
 think adding a second implementation of a core utility like bzip2 is a
 bad idea unless there is a significant gain.  If there's a point where
 lbzip2 can fully replace bzip2 (so all CLI and API uses), and there are
 good benefits, then Fedora should just replace the old implementation
 with a new one.


Well, the change says "multi-threaded operation for both compression
and decompression, with almost linear scalability". Linear scalability
means speed-ups in the range of 2-8x on current desktop / laptop
systems.
Which I'd call a significant gain ;)

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Chris Adams
Once upon a time, Mikolaj Izdebski mizde...@redhat.com said:
 lbzip2 does not (at least not yet) provide interfaces of libbz2 library,
 only command line tools.  This Change does *not* affect users of libbz2.

Is there enough of a gain to the system to only partially replace a core
program like this (especially with alternatives)?  This seems like a
case where either we get a new and improved and replaces the old
version (where the new one just obsoletes the old one, such as the
jpeg-turbo change), or just leave it alone.

Please understand, I don't mean to attack you or your code.  I just
think adding a second implementation of a core utility like bzip2 is a
bad idea unless there is a significant gain.  If there's a point where
lbzip2 can fully replace bzip2 (so all CLI and API uses), and there are
good benefits, then Fedora should just replace the old implementation
with a new one.

-- 
Chris Adams li...@cmadams.net

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Mikolaj Izdebski
 On Wed, Apr 2, 2014 at 11:39 PM, Chris Adams li...@cmadams.net wrote:
  Once upon a time, Mikolaj Izdebski mizde...@redhat.com said:
  lbzip2 does not (at least not yet) provide interfaces of libbz2 library,
  only command line tools.  This Change does *not* affect users of libbz2.
 
  Is there enough of a gain to the system to only partially replace a core
  program like this (especially with alternatives)?  This seems like a
  case where either we get a new and improved and replaces the old
  version (where the new one just obsoletes the old one, such as the
  jpeg-turbo change), or just leave it alone.

Eventually lbzip2 may replace bzip2, but I don't want to make any drastic
changes.  Using alternatives allows us to have a nice contingency plan
in case something unexpected happens.  Once lbzip2 is used by default,
further changes will be just a matter of agreement between maintainers.

  Please understand, I don't mean to attack you or your code.  I just
  think adding a second implementation of a core utility like bzip2 is a
  bad idea unless there is a significant gain.  If there's a point where
  lbzip2 can fully replace bzip2 (so all CLI and API uses), and there are
  good benefits, then Fedora should just replace the old implementation
  with a new one.

I believe there is a significant gain, see below.

 Well the change says  multi-threaded operation for both compression
 and decompression, with almost linear scalability linear scalability
 means speed ups on the range of 2-8x on current desktop / laptop
 systems.
 Which I'd call a significant gain ;)

I don't have any recent benchmarks comparing the performance of lbzip2 with
bzip2 because doing them doesn't make much sense for me -- lbzip2 is so much
faster.  I only compare different lbzip2 versions to make sure there are no
performance regressions.

However, I have one *old* benchmark made just after the lbzip2 2.0 release
two and a half years ago (the current version is 2.5).  lbzip2 has been
improved since then.

  https://github.com/kjn/lbzip2/wiki/Benchmark:-Opteron-24

I will prepare more benchmarks of recent lbzip2 version for desktop
(1, 2, 4, 8 cores) and bigger multicore server systems.

A test of scalability (pretty old too) is available at:

  http://archive.lbzip2.org/scaling/scaling.html

I hope that's convincing enough for now.

--
Mikolaj Izdebski

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Al Dunsmuir
On Wednesday, April 2, 2014, 2:03:38 PM, Bill Nottingham wrote:
 Jaroslav Reznik (jrez...@redhat.com) said:
 = Proposed System Wide Change:  lbzip2 as default bzip2 implementation =
 https://fedoraproject.org/wiki/Changes/lbzip2
 
 Change owner(s): Mikolaj Izdebski mizde...@redhat.com
 
 This change aims at making lbzip2 [1] default bzip2 implementation used in
 Fedora.
 
 == Detailed Description ==
 lbzip2 is an independent implementation of bzip2 compression tool. It
 provides interface strictly compatible with bzip2, but also adds several
 new features and improvements, such as:
 
 * multi-threaded operation for both compression and decompression, with
 almost linear scalability,
 * improved performance, even on single-core systems,
 * improved extra utilities (bzdiff, bzless, bzip2recover, etc.),
 * improved compatibility with gzip.
 
 lbzip2 is a mature project and it has been used in production for years. It
 is already packaged for Fedora and it is also available in EPEL.

 A quick check shows lbzip2 doesn't provide a library interface, much less
 one compatible with libbz2. Is that ever intended?

 If it's not, saying lbzip2 is the default bzip2 *implementation* may be a
 bit of a stretch. Perhaps s/implementation/command/.

Bill,

This  clarification is significant.  The change proposal text needs to
be updated to reflect this.

As long as the encoding is guaranteed to be byte-for-byte identical to
that produced by the original bzip2 (and libbz2) implementation, the
risks are lowered.

Scenarios   affected  by  this  substitution  are  those  with  direct
invocation of the command (from the command prompt, a shell script, or
system() type call).
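
As a rough sketch of that distinction, the Python below decompresses
the same data two ways.  The helper names are invented for this
example.  The first path goes through Python's bz2 module, which in
typical CPython builds is linked against libbz2 and so would not be
touched by the substitution; the second runs whatever binary the name
"bzip2" resolves to on PATH, which is the kind of call that would be
affected.

```python
import bz2
import shutil
import subprocess


def decompress_with_library(blob: bytes) -> bytes:
    # Library path: Python's bz2 module, independent of /usr/bin/bzip2.
    return bz2.decompress(blob)


def decompress_with_command(blob: bytes, command: str = "bzip2") -> bytes:
    # Command path: runs whichever executable `command` resolves to on PATH,
    # so a different default implementation would change what runs here.
    exe = shutil.which(command)
    if exe is None:
        raise FileNotFoundError(f"{command} not found on PATH")
    # -d decompresses, -c writes to stdout; with no file argument the
    # command reads the compressed data from stdin.
    result = subprocess.run([exe, "-dc"], input=blob,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout


if __name__ == "__main__":
    blob = bz2.compress(b"hello from the library path\n")
    print(decompress_with_library(blob))
    if shutil.which("bzip2"):
        print(decompress_with_command(blob))
```

Both paths should return the same bytes for a valid bz2 stream; only
the second one depends on which implementation is the default command.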

The  lbzip2 utility sounds interesting, and I am now disappointed that
there is no separate library interface with these characteristics that
we  can  investigate using instead of libbz2. As later comments state,
perhaps in the future.

Al


Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Mikolaj Izdebski
 This  clarification is significant.  The change proposal text needs to
 be updated to reflect this.

I will add a clarification tomorrow.

 
 As long as the encoding is guaranteed to be byte-for-byte identical to
 that produced by the original bzip2 (and libbz2) implementation, the
 risks are lowered.

No, encoding is almost never bytewise identical to bzip2, but it doesn't
have to be as long as the resulting bz2 file has correct format.

bzip2 itself changed encoding between versions without any impact on users.
Even the same version of bzip2 can produce different compressed files
for the same input with and with the same block size.

lbzip2 uses improved algorithms, which are not only faster, but also allow
for a slightly better compression ratio, for example:

$ echo test | bzip2 | wc -c
45
$ echo test | lbzip2 | wc -c
43
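
The point that the compressed bytes need not match can also be checked
with Python's standard bz2 module.  This sketch varies the block size
within one implementation rather than comparing bzip2 with lbzip2, so
it is only a stand-in for the comparison above: two valid encodings of
the same input differ byte for byte, yet decode to identical data.

```python
import bz2

data = b"test\n" * 1000

# Same input, two block-size settings.  The stream header records the
# level ("BZh1" vs "BZh9"), so the compressed bytes necessarily differ.
small_blocks = bz2.compress(data, compresslevel=1)
large_blocks = bz2.compress(data, compresslevel=9)

print("headers:", small_blocks[:4], large_blocks[:4])
print("compressed bytes identical:", small_blocks == large_blocks)
print("decoded data identical:",
      bz2.decompress(small_blocks) == bz2.decompress(large_blocks) == data)
```

What matters to a consumer is the last line: any correct decoder gets
the original input back from either file.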

 Scenarios   affected  by  this  substitution  are  those  with  direct
 invocation of the command (from the command prompt, a shell script, or
 system() type call).

That's true.  And even in the unlikely case that something goes wrong,
developers (or even users themselves) can easily switch back to bzip2.

--
Mikolaj Izdebski

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Chris Adams
Once upon a time, drago01 drag...@gmail.com said:
 Well the change says  multi-threaded operation for both compression
 and decompression, with almost linear scalability linear scalability
 means speed ups on the range of 2-8x on current desktop / laptop
 systems.
 Which I'd call a significant gain ;)

That depends.  If I made an alternate mv that ran 10 times faster,
would we go through the pain of alternatives to offer two options for
it?

How much impact on real-world system usage will speeding up the bzip2
command offer?  Many of the common users (such as rpm) are linked
against the library and don't use the command, so they won't be
impacted.  We'll end up with more disk space used, because the current
/usr/bin/bzip2 and friends are linked against the library (so there's
only one copy of bzip2 code).  Since lbzip2 doesn't offer a library, I
guess each program has to have a copy of the code, and other system
tools will still use the bzip2 library.

I think the right way to move forward is to make a library that is at
least API-compatible with the current libbz2.so.1, make all the tools
use it, and just replace bzip2 with lbzip2.
-- 
Chris Adams li...@cmadams.net

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Toshio Kuratomi
On Wed, Apr 02, 2014 at 08:47:11PM -0500, Chris Adams wrote:
 
 I think the right way to move forward is to make a library that is at
 least API-compatible with the current libbz2.so.1, make all the tools
 use it, and just replace bzip2 with lbzip2.

Although I'm still on the fence about whether I'd vote for the Change as is,
I tend to agree with this sentiment.

Having two sets of code with different characteristics seems like
something of a disservice to users (I started bzip'ing my logs and backups
because the performance was suitable for my task when I tested with
/usr/bin/bzip2 but then when I operated on those logs with a custom python
script it was 3x slower!)

From past precedent I agree that getting the new package to the point where
we think it's a suitable replacement and then just making the switch for the
next release makes the most sense.

-Toshio



Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Zbigniew Jędrzejewski-Szmek
On Wed, Apr 02, 2014 at 07:26:59PM -0700, Toshio Kuratomi wrote:
 On Wed, Apr 02, 2014 at 08:47:11PM -0500, Chris Adams wrote:
  
  I think the right way to move forward is to make a library that is at
  least API-compatible with the current libbz2.so.1, make all the tools
  use it, and just replace bzip2 with lbzip2.
 
 Although I'm still on the fence about whether I'd vote for the Change as is,
 I tend to agree with this sentiment.
 
 Having two sets of code with different characteristics seems like
 something of a disservice to users (I started bzip'ing my logs and backups
 because the performance was suitable for my task when I tested with
 /usr/bin/bzip2 but then when I operated on those logs with a custom python
 script it was 3x slower!)
 
 From past precedent I agree that getting the new package to the point where
 we think it's a suitable replacement and then just making the switch for the
 next release makes the most sense.
I agree that this is desirable. OTOH, lbzip2.rpm is 90k, so I guess we can
suffer the extra disk usage :)

Zbyszek
-- 
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct

Re: F21 System Wide Change: lbzip2 as default bzip2 implementation

2014-04-02 Thread Toshio Kuratomi
On Thu, Apr 03, 2014 at 04:48:03AM +0200, Zbigniew Jędrzejewski-Szmek wrote:
 On Wed, Apr 02, 2014 at 07:26:59PM -0700, Toshio Kuratomi wrote:
  On Wed, Apr 02, 2014 at 08:47:11PM -0500, Chris Adams wrote:
   
   I think the right way to move forward is to make a library that is at
   least API-compatible with the current libbz2.so.1, make all the tools
   use it, and just replace bzip2 with lbzip2.
  
  Although I'm still on the fence about whether I'd vote for the Change as is,
  I tend to agree with this sentiment.
  
  Having two sets of code with different characteristics seems like
  something of a disservice to users (I started bzip'ing my logs and backups
  because the performance was suitable for my task when I tested with
  /usr/bin/bzip2 but then when I operated on those logs with a custom python
  script it was 3x slower!)
  
  From past precedent I agree that getting the new package to the point where
  we think it's a suitable replacement and then just making the switch for the
  next release makes the most sense.
 I agree that this is desirable. OTOH, lbzip2.rpm is 90k, so I guess we can
 suffer the extra disk usage :)
 
If it was about disk usage :-)

But it's not.  It's about having two codebases that do the same thing in
different ways.

-Toshio

