Re: [gentoo-dev] bzipped manpages

2017-01-17 Thread james

On 01/17/2017 12:26 PM, Michał Górny wrote:

> [sent off list to reduce amount of spam]
>
> Please cease this off-topic discussion. It has nothing to do with the
> subject debated. If you want to talk over a beer, then please go to a
> pub. If you want to compare your pe^w^w^w embedded systems, then please
> find an appropriate medium for that.


Apologies. Agreed.


Documentation on embedded systems, now that we have a solid definition of
what an embedded system actually is, is quite important, and minimizing
storage/RAM usage whilst accessing those docs is of great importance.
Expecting a workstation or even a server to have the latest mix of
documents for the variety of codes and hardware found in a heterogeneous
cluster is an unreasonable stretch, and often leads to mismatches, imho.
Imagine a variety of embedded systems that morph into a different cluster
matrix dozens or hundreds of times a day.  The admins, programmers or
users with larger codes may need a deep read of the docs sporadically,
but in a time-critical manner. They do not want or need extraneous or
irrelevant docs in the pool. Heterogeneous clusters are now all the rage,
particularly for HPC. Current, accurate and minimized docs are keenly
important for all embedded systems, particularly if they can receive
codes or binaries dynamically and morph.


As the deployment venues of embedded systems morph dynamically, their
internal documentation should be "auto-adjusted" like the system
software, despite being on identical hardware, particularly since
embedded clusters grow more complex with the latest codes and high-level
language codes on top of them.  That is, the role they (dynamically)
play, as well as how individual software is used on otherwise identical
hardware, can change rapidly. Lofty goals, yes, but since the subject of
local documentation on embedded systems came up, it is quite reasonable
to seek the fastest possible diverse usage with the least footprint on
resources.



hth,
James




Re: [gentoo-dev] bzipped manpages

2017-01-17 Thread james

On 01/17/2017 01:05 AM, Daniel Campbell wrote:

On 01/13/2017 08:06 AM, james wrote:

On 01/13/2017 02:45 AM, Sven Eden wrote:


Btw.: Even "embedded experts" wholeheartedly agree that they disagree
what "embedded" actually is. But I do think SoCs actually *do* qualify,
at least to some degree...




Re: [gentoo-dev] bzipped manpages

2017-01-17 Thread Fabian Groffen
On 16-01-2017 22:13:39 -0800, Daniel Campbell wrote:
> On 01/10/2017 05:16 AM, Fabian Groffen wrote:
> > On 09-01-2017 09:08:22 +0100, Jan Stary wrote:
> >> The particular problem I am having is that http://mdocml.bsd.lv/ ,
> >> my manpage formatter of choice, does deliberately not support bzip
> >> (or any other outside decompressors for that matter).
> > 
> > Attached patch works for me.  XZ should be a similar exercise, a little
> > cleanup would be nice then though.
> > 
> This is awesome; has upstream been sent this yet, by any chance?

Nope, given the initial quoted email from the main developer, I gave it
zero chance of success, and not worth the bother.  AFAICT I posted the
patch to the public domain, so anyone can take it and use it as they see
fit.

Also, if Gentoo would ship this package, it probably should have this
patch since it would enable the package to work with our default
settings.  However, we don't ship this package.  I made a preliminary
ebuild, but ran into some issues, e.g. it cannot replace `/usr/bin/man`
because that would break lesspipe.sh's ability to view manpages.
Concluding that (for me) the manpages didn't render any faster, or
better -- I'd say even worse since I got no colours -- I decided to
leave the package as is, because of no gain.

Fabian


-- 
Fabian Groffen
Gentoo on a different level




Re: [gentoo-dev] bzipped manpages

2017-01-16 Thread Daniel Campbell
On 01/10/2017 05:16 AM, Fabian Groffen wrote:
> On 09-01-2017 09:08:22 +0100, Jan Stary wrote:
>> The particular problem I am having is that http://mdocml.bsd.lv/ ,
>> my manpage formatter of choice, does deliberately not support bzip
>> (or any other outside decompressors for that matter).
> 
> Attached patch works for me.  XZ should be a similar exercise, a little
> cleanup would be nice then though.
> 
> Fabian
> 
This is awesome; has upstream been sent this yet, by any chance?

-- 
Daniel Campbell - Gentoo Developer
OpenPGP Key: 0x1EA055D6 @ hkp://keys.gnupg.net
fpr: AE03 9064 AE00 053C 270C  1DE4 6F7A 9091 1EA0 55D6





Re: [gentoo-dev] bzipped manpages

2017-01-16 Thread Daniel Campbell
On 01/13/2017 08:06 AM, james wrote:
> On 01/13/2017 02:45 AM, Sven Eden wrote:
> 
>> Btw.: Even "embedded experts" wholeheartedly agree that they disagree
>> what "embedded" actually is. But I do think SoCs actually *do* qualify,
>> at least to some degree...

Re: [gentoo-dev] bzipped manpages

2017-01-16 Thread Kent Fredric
On Tue, 10 Jan 2017 15:06:47 +0100
Michał Górny  wrote:

> > the manpage formatter needs to call
> > external unpackers. All this to save 40M. I honestly don't think
> > it's worth it.  
> 
> Calling external tools in a pipeline is a pretty normal solution
> in the *nix world. It could be even considered safer than implementing
> multiple disjoint features in a single tool where an issue in one
> module could damage other parts of the program.

It's also possible to handle this with an inside-out mechanism,
using `zrun` from `sys-apps/moreutils`:


---
 zrun head -n 1 /usr/share/man/man1/cat.1.bz2
.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.35.

---

It's a bit of an unusual approach, but it can do things where a normal
pipe becomes difficult.

How it works becomes quickly apparent:

---
zrun md5sum /usr/share/man/man1/{cat,pr}.1.bz2
88ca1b726a36ae027d435e4c21fe50ba  /tmp/8NJoD5mrdJ-cat.1
78bb81a36e4bf7c32dcf4231361a17d6  /tmp/6xSouo9fyC-pr.1
---
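
The temp-file names in the md5sum output suggest what happens: each
compressed argument is unpacked to a temporary file and the command is
run on that instead. A rough Python sketch of the same idea follows. It
is not how `zrun` itself is implemented; the function name
`run_decompressed` and the suffix table are made up for illustration.

```python
import bz2
import gzip
import lzma
import os
import subprocess
import tempfile

# Suffix -> opener. This table is an assumption for the sketch, not the
# list of formats the real zrun supports.
OPENERS = {".bz2": bz2.open, ".gz": gzip.open, ".xz": lzma.open}


def run_decompressed(command, args):
    """Run `command`, replacing compressed file arguments with
    temporary uncompressed copies that are removed afterwards."""
    temp_paths = []
    new_args = []
    try:
        for arg in args:
            stem, suffix = os.path.splitext(arg)
            opener = OPENERS.get(suffix)
            if opener is None or not os.path.isfile(arg):
                new_args.append(arg)  # option or plain file: pass through
                continue
            # Keep the original base name at the end of the temp name,
            # e.g. /tmp/tmpab12cd-cat.1 for cat.1.bz2.
            fd, tmp = tempfile.mkstemp(suffix="-" + os.path.basename(stem))
            temp_paths.append(tmp)
            with os.fdopen(fd, "wb") as out, opener(arg, "rb") as src:
                out.write(src.read())
            new_args.append(tmp)
        return subprocess.run([command] + new_args,
                              capture_output=True, text=True)
    finally:
        for tmp in temp_paths:
            os.unlink(tmp)
```

Something like `run_decompressed("head", ["-n", "1", "/usr/share/man/man1/cat.1.bz2"])`
would then correspond to the first example; the result object carries the
command's output in `.stdout`.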

It's a nice trick to have up your sleeve  :)




Re: [gentoo-dev] bzipped manpages

2017-01-16 Thread Jan Stary
On Jan 10 14:16:47, grob...@gentoo.org wrote:
> On 09-01-2017 09:08:22 +0100, Jan Stary wrote:
> > The particular problem I am having is that http://mdocml.bsd.lv/ ,
> > my manpage formatter of choice, does deliberately not support bzip
> > (or any other outside decompressors for that matter).
> 
> Attached patch works for me.

Works for me too, thanks.

Jan






Re: [gentoo-dev] bzipped manpages

2017-01-14 Thread Kent Fredric
On Tue, 10 Jan 2017 13:05:58 +0100
Jan Stary  wrote:

> I am not really familiar with this system - what would be
> the right piece of information that does relate to this?

Nothing really, because Gentoo doesn't have "a version"; it's a
rolling-release model.

The closest approximation would be the output from 

   emerge --info

Which may need root permissions to run.

That will report:

- Certain data about the timestamps of your package repository
- Certain versions of certain specific interesting important libraries.
- Basic hardware information
- Portage ENV Preferences.





Re: [gentoo-dev] bzipped manpages

2017-01-13 Thread james

On 01/13/2017 02:45 AM, Sven Eden wrote:


Btw.: Even "embedded experts" wholeheartedly agree that they disagree what
"embedded" actually is. But I do think SoCs actually *do* qualify, at least to
some degree...



Huh?

That probably depends on whom you deem an expert; they have not clearly
defined the system types and semantics of an embedded system. An embedded
system is one that is 'closed' to pedestrian/consumer/user modifications,
excepting rooting and other non-normal bypass mechanisms. A modification
is not the same thing as a configuration. An embedded system is designed
with limited functionality or "canned product functionality" for
consumers of very specific task-sets.  Early micros were often more
accurately referred to as 'microcontrollers', as their function was
simply to replace mechanical control systems that were prone to wear and
failure. When programming occurs (again, rooting and hacking do not
count), it is only allowed by the system designer(s). So a Rasp. Pi on
the internet, open to dozens or thousands of coders, is not an embedded
system. At some point it may become an embedded system, but it must be
locked down, limited in functionality and purged of all the software
used for development but not needed to run and function as the
designer(s) intend. Updates are usually in binary form, again under the
strict control of the product (embedded system) developer.



Given that, the reason so many folks are confused as to what an embedded
system actually is, is that there are lots of "open" platforms where
users are encouraged to be the designer, thus having architecture,
coding, and modification access that an ordinary user would not have;
again, security hacks that grant non-normal access do not count. That is,
if you 'hack' into the product or the BIOS of a computer, you have not
converted the device's intended usage into an embedded system, although
you may have low-level access to the hardware, firmware and other
subsystems that the designers did not intentionally make available to
you. When a computer is 'locked down and limited' like a kiosk, it
actually is an embedded system.



Traditionally, the easy way to set up product developers was through
vendors (OEMs like Freescale, Samsung, Broadcom, etc.) via a 'dev board':
example codes, a minimal stack of an RTOS or vendor-supplied software
system, along with documentation and details of the in-situ hardware
that comprises the 'dev board'. Small systems did not have (nor do they
now) an 'OS'; instead they were simple state machines or ran a polling
algorithm. Most embedded systems still operate on these sorts of codes,
even today.



Fast forward: Rasp. Pi et al. are dev boards that can be turned into
open, multi-user systems, say if you make one a typical minimized Linux
system. Some even have inputs for keyboard, mouse and terminal, so that
sort of system would not be an embedded system. Now take the same board,
lock it down so all it does is control the sprinklers in your yard, with
limited functional interfaces to the 'standard user', and it is indeed an
'embedded system'. Most products with a small microprocessor are
'embedded systems'. Most Rasp. Pi boards are user systems because they
are open and unlimited at any given time and are not 'locked down'.
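
To make the "simple state machine or polling algorithm" remark concrete,
here is a hedged sketch of a locked-down sprinkler controller of the kind
just described. Real firmware would normally be C or assembly on bare
metal; Python is used only to keep it short. All names (`State`, `step`,
`run`, `max_watering`) and the time-limit rule are invented for
illustration.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    WATERING = "watering"


def step(state, soil_is_dry, elapsed, max_watering=3):
    """One polling iteration: return (next_state, ticks_spent_watering)."""
    if state is State.IDLE:
        if soil_is_dry:
            return State.WATERING, 0
        return State.IDLE, 0
    # WATERING: stop once the soil is wet or the time limit is reached.
    elapsed += 1
    if not soil_is_dry or elapsed >= max_watering:
        return State.IDLE, 0
    return State.WATERING, elapsed


def run(readings):
    """Poll a sequence of sensor readings; return the states visited."""
    state, elapsed = State.IDLE, 0
    history = []
    for soil_is_dry in readings:
        state, elapsed = step(state, soil_is_dry, elapsed)
        history.append(state)
    return history
```

The whole "product" is the fixed transition table in `step`; there is no
OS, no shell and nothing for the end user to modify, which is the point
being made about embedded designs.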



It takes a designer, or a team of designers, to create an 'embedded
system', particularly if the embedded system is to be turned into a
commercial product. The net effect of boards like the Rasp. Pi is to open
up the opportunity for folks to learn 'product development'. Most have
chosen to create user systems with some functionality not found in
traditional desktop systems. Surely there are edge cases that blur the
lines of distinction, but most are not a finalized product (embedded
system), as they are in a constant state of flux related to the internal
software; thus they are not an 'embedded system'. A properly designed
embedded system can last in its minimized and limited form for decades
or more and operate as intended (think digital alarm clock). Others do
need an update to the firmware (locked-down internal software), but that
is only performed by the product owners or vendors, in the normal case
of operation. Indeterminate hardware is just hardware; it has to be
robustly defined, tested and implemented to be a user system, an
embedded system, or whatever the designer has in mind.



So hopefully, I have articulated the fact that an 'embedded system' is
determined by the designer, not the underlying hardware from a vendor.

Robust embedded system design, regardless of VHDL or C or ? codes, is
more of an art form than a technical exposé on software development.
I know embedded designers that have created thousands of products, some
in a matter of weeks, and other teams that fail to produce a single
robust product in their entire lifetime.  The most prolific designer of
them all is simply referred to as 'doctor

Re: [gentoo-dev] bzipped manpages

2017-01-12 Thread Sven Eden
On Thursday, 12 January 2017, 19:08:05 CET, Walter Dnes wrote:
> On Wed, Jan 11, 2017 at 05:15:25PM +0100, Jan Stary wrote
> 
> > On Jan 11 13:34:09, sven.e...@gmx.de wrote:
> > > On Tuesday, 10 January 2017, 13:36:15 CET, Jan Stary wrote:
> > > > > You are arguing that 40MB is nothing on modern systems (which, by
> > > > > the way, is not exactly true, talking about embedded ones).
> > > > 
> > > > Can you give an example of an embedded system with manpages?
> > > 
> > > My Raspberry Pi 3. ;-)
> > 
> > How is that an embedded system?
> > It's a full blown linux installation.
> 
>   Not every "full blown linux installation" has a multi-terabyte hard
> drive attached.  There are small laptops with SSDs instead of spinning
> disks.

And my Raspberry Pi 3 has less CPU power, less RAM and less storage space than
my goddam Android cell phone. ;-)

Btw.: Even "embedded experts" wholeheartedly agree that they disagree what 
"embedded" actually is. But I do think SoCs actually *do* qualify, at least to 
some degree...




Re: [gentoo-dev] bzipped manpages

2017-01-12 Thread Walter Dnes
On Wed, Jan 11, 2017 at 05:15:25PM +0100, Jan Stary wrote
> On Jan 11 13:34:09, sven.e...@gmx.de wrote:
> > On Tuesday, 10 January 2017, 13:36:15 CET, Jan Stary wrote:
> > > > You are arguing that 40MB is nothing on modern systems (which, by
> > > > the way, is not exactly true, talking about embedded ones).
> > > 
> > > Can you give an example of an embedded system with manpages?
> > 
> > My Raspberry Pi 3. ;-)
> 
> How is that an embedded system?
> It's a full blown linux installation.

  Not every "full blown linux installation" has a multi-terabyte hard
drive attached.  There are small laptops with SSDs instead of spinning
disks.

-- 
Walter Dnes 
I don't run "desktop environments"; I run useful applications



Re: [gentoo-dev] bzipped manpages

2017-01-11 Thread Jan Stary
On Jan 11 13:34:09, sven.e...@gmx.de wrote:
> On Tuesday, 10 January 2017, 13:36:15 CET, Jan Stary wrote:
> > > You are arguing that 40MB is nothing on modern systems (which, by
> > > the way, is not exactly true, talking about embedded ones).
> > 
> > Can you give an example of an embedded system with manpages?
> 
> My Raspberry Pi 3. ;-)

How is that an embedded system?
It's a full blown linux installation.




Re: [gentoo-dev] bzipped manpages

2017-01-11 Thread Michael Orlitzky
On 01/10/2017 06:54 AM, Jan Stary wrote:
> 
> These are workarounds. Let me get back to the original question:
> would you please consider having _uncompressed_ manpages as the default?
> 
> On this particular system, the bzipped /usr/share/man/ is 67M.
> The uncompressed man/ is 108M. That's 40M saved. Seriously?

Since everyone else is giving you a hard time, I think compressing all
of the documentation by default is annoying and over-complicates things.

However, if you asked me what *could* be compressed by default, man
pages with their separate /usr/share/man directory and doman helper
would be at the top of the list. They aren't meant to be read by humans,
the default reader handles them, and they compress well.

And in direct contradiction to my last paragraph, we did invent dohtml
and now have PORTAGE_COMPRESS_EXCLUDE_SUFFIXES set to work around this
exact problem, that web browsers won't call bunzip2 in a pipeline. It
seems a little unfair to let web browsers off the hook but not mdocml.

There are probably fewer people who care about the space than there are
users who have been annoyed that they can't run an example because
ruby/php/python won't run a compressed script. I know that we have
"docompress", but inside the ebuild is the wrong place to battle an
implementation-specific default setting.




Re: [gentoo-dev] bzipped manpages

2017-01-11 Thread Sven Eden
On Tuesday, 10 January 2017, 13:36:15 CET, Jan Stary wrote:
> > You are arguing that 40MB is nothing on modern systems (which, by
> > the way, is not exactly true, talking about embedded ones).
> 
> Can you give an example of an embedded system with manpages?

My Raspberry Pi 3. ;-)

Cheers

Sven




Re: [gentoo-dev] bzipped manpages

2017-01-10 Thread Michał Górny
On Tue, 10 Jan 2017 15:01:15 +0200
Mart Raudsepp  wrote:

> One fine day, Tue, 10.01.2017 at 19:19, Vadim A. Misbakh-Soloviov wrote:
> > that will affect tons of users (which are happy with current
> > "defaults") because of your own local problems (not having root
> > access on the system)?
> 
> Yes, the default should be changed for everyone.
> To PORTAGE_COMPRESS="xz".

Please do not encourage the use of 'wannabe-7zip' further. If you want
LZMA compression, then lzip has less complexity, less overhead and a
slightly better compression ratio than xz. It is also more reliable.

-- 
Best regards,
Michał Górny





Re: [gentoo-dev] bzipped manpages

2017-01-10 Thread Michał Górny
On Tue, 10 Jan 2017 12:54:21 +0100
Jan Stary  wrote:

> On Jan 09 09:30:11, ike...@gentoo.org wrote:
> > Hiya Jan,
> > 
> > The following snippet from Ingo is correct:
> >   
> > > So, you want to hear something constructive?  Your best option is to
> > > just decompress that stuff on your system.  (Gentoo is famous for
> > > its excessive configurability - maybe there is even an option?)  
> > 
> > We are both famous for our excessive configurability and there is even
> > an option already!  5:)  If you look in the manpage (once you've
> > decompressed it somewhere, or online at [1]) for make.conf, you'll see the
> > entry for PORTAGE_COMPRESS, which you can set as follows:
> > 
> > PORTAGE_COMPRESS=""  
> 
> I am only a user on this system,
> and have no control over which packages are installed
> and have no write permissions in /usr/share/man/ or make.conf

If you are only a user, then why don't you contact your sysadmin
instead of trying to work around his choice at the distribution level?
After all, as you already know, he will need to rebuild everything
anyway.

> > As mentioned in [2,3,others].  You'll then need to reinstall all
> > packages.  If you manually decompress the files, then the uncompressed
> > manpages won't be registered with portage and won't get removed if the
> > owning package is uninstalled.  
> 
> Also, the uncompressed manpage will not get updated
> when the package gets updated. I will have two copies,
> a stale *.1 and an up-to-date *.1.bz2.
> 
> These are workarounds. Let me get back to the original question:
> would you please consider having _uncompressed_ manpages as the default?
> 
> On this particular system, the bzipped /usr/share/man/ is 67M.
> The uncompressed man/ is 108M. That's 40M saved. Seriously?

~40% is a pretty good gain. However, if you really insist on comparing
this, there are a few points to consider:

1. Since there are many small files involved, the results highly differ
depending on the filesystem used. On some filesystems (btrfs), it will
be very hard to even get any conclusive numbers.

2. The compression feature extends to all documentation,
including /usr/share/doc and /usr/share/info. So you should really
consider it all rather than limiting your view to manpages.

3. In some cases, the compression can also improve performance by
reducing I/O overhead. While it's debatable whether it will happen on
manpages (see filesystem problem above too), there is no real
performance loss to be considered either. After all, manpages are read
rather rarely in the lifetime of a production system, so any effort in
decompressing them is a minor problem.
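
Point 1 can be checked on a given system by comparing the apparent size
of the files with what the filesystem actually allocated for them. A
minimal sketch, assuming a POSIX system where `st_blocks` is available
and counts 512-byte units; the function name `tree_sizes` is made up:

```python
import os
import stat


def tree_sizes(root):
    """Return (apparent_bytes, allocated_bytes) for regular files under root.

    apparent_bytes sums st_size; allocated_bytes sums st_blocks * 512.
    Symlinks are skipped; hard-linked files are counted once per name,
    which is a simplification.
    """
    apparent = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            info = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(info.st_mode):
                continue
            apparent += info.st_size
            allocated += info.st_blocks * 512
    return apparent, allocated
```

Running it on a compressed and on an uncompressed copy of
/usr/share/man would show how much of the apparent saving survives
block rounding on that particular filesystem.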

> There is an option to support; the packages need to be reinstalled
> or there are untracked files;

Are you arguing for removing the option altogether? Because as far as I
can see, the problems involved in changing the value are rather
an argument not to change it...

> the manpage formatter needs to call
> external unpackers. All this to save 40M. I honestly don't think
> it's worth it.

Calling external tools in a pipeline is a pretty normal solution
in the *nix world. It could be even considered safer than implementing
multiple disjoint features in a single tool where an issue in one
module could damage other parts of the program.

If you really do mind it, pretty much every compression format
supported by Gentoo provides a simple library you could use. There are
also a few libraries that provide support for multiple compression
formats.

To summarize, I'm afraid you don't have any arguments besides using
a non-standard tool whose upstream refuses to support normal Gentoo
installations (is Gentoo really the only distribution using manpage
compression other than gzip?). You have multiple solutions available
that do not require Gentoo to suddenly change the defaults that work
for most of our users (and some of them appreciate them). Those
include:

1. using another man page tool,

2. writing a wrapper that would decompress manpages for your tool,

3. patching your tool to support bzip2 (I see that Fabian provided you
with a patch already),

4. talking to your sysadmin to update the system to meet your needs.
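
Option 2 does not need much code. As a rough sketch only, assuming the
formatter reads a page from standard input (the names `read_manpage`,
`format_manpage` and the suffix table are hypothetical):

```python
import bz2
import gzip
import lzma
import subprocess

# Assumed suffix table; extend as needed.
DECOMPRESSORS = {
    ".bz2": bz2.decompress,
    ".gz": gzip.decompress,
    ".xz": lzma.decompress,
}


def read_manpage(path):
    """Return the uncompressed bytes of a manpage, whatever its suffix."""
    with open(path, "rb") as handle:
        data = handle.read()
    for suffix, decompress in DECOMPRESSORS.items():
        if path.endswith(suffix):
            return decompress(data)
    return data  # no known suffix: assume the page is already plain


def format_manpage(path, formatter):
    """Pipe the uncompressed page into `formatter` (an argv list)."""
    result = subprocess.run(formatter, input=read_manpage(path),
                            capture_output=True, check=True)
    return result.stdout
```

`format_manpage("/usr/share/man/man1/cat.1.bz2", ["mandoc"])` would be
the intended use, provided the tool accepts input on stdin; that part is
an assumption and should be checked against the tool's own manual.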

-- 
Best regards,
Michał Górny





Re: [gentoo-dev] bzipped manpages

2017-01-10 Thread Mart Raudsepp
One fine day, Tue, 10.01.2017 at 14:39, Ulrich Mueller wrote:
> On Tue, 10 Jan 2017, Mart Raudsepp wrote:
> > Yes, the default should be changed for everyone.
> > To PORTAGE_COMPRESS="xz".
> 
> Back in 2013, vapier had made extensive studies of compression tools
> for man pages and documentation, and the conclusion was that bzip2
> gives the best overall compression ratio for these files.
> https://bugs.gentoo.org/show_bug.cgi?id=169260

Nvm then, assuming it also gives the fastest streaming decompression
speed, which I suspect is close enough to that of unxz.

My real point, at the back of my mind, was that these days it can be
faster to read a heavily compressed file from disk and decompress it than
to read a larger uncompressed file from disk. And then we got SSDs, so
that might be different again.




Re: [gentoo-dev] bzipped manpages

2017-01-10 Thread Ulrich Mueller
> On Tue, 10 Jan 2017, Mart Raudsepp wrote:

> Yes, the default should be changed for everyone.
> To PORTAGE_COMPRESS="xz".

Back in 2013, vapier had made extensive studies of compression tools
for man pages and documentation, and the conclusion was that bzip2
gives the best overall compression ratio for these files.
https://bugs.gentoo.org/show_bug.cgi?id=169260
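
For anyone who wants to repeat the comparison on their own files, a
minimal sketch follows. It is not the method used in the bug above; it
assumes Python and uses the stdlib codecs at their highest levels, and
compressed_sizes is a name made up for this example.

```python
import bz2
import gzip
import lzma


def compressed_sizes(data):
    """Return the size in bytes of data, uncompressed and in each format."""
    return {
        "none": len(data),
        "gzip": len(gzip.compress(data, compresslevel=9)),
        "bzip2": len(bz2.compress(data, compresslevel=9)),
        "xz": len(lzma.compress(data, preset=9)),
    }
```

Feeding it the bytes of each file under /usr/share/man and summing the
results per format would give per-tree totals. Man pages are mostly
small files, so fixed per-file overhead can matter as much as the
algorithm itself.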

Ulrich




Re: [gentoo-dev] bzipped manpages

2017-01-10 Thread Fabian Groffen
On 09-01-2017 09:08:22 +0100, Jan Stary wrote:
> The particular problem I am having is that http://mdocml.bsd.lv/ ,
> my manpage formatter of choice, does deliberately not support bzip
> (or any other outside decompressors for that matter).

Attached patch works for me.  XZ should be a similar exercise, a little
cleanup would be nice then though.

Fabian

-- 
Fabian Groffen
Gentoo on a different level
--- mdocml-1.13.4/read.c
+++ mdocml-1.13.4/read.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include <bzlib.h>
 
 #include "mandoc_aux.h"
 #include "mandoc.h"
@@ -62,6 +62,7 @@
 	enum mandoclevel  wlevel; /* ignore messages below this */
 	int		  options; /* parser options */
 	int		  gzip; /* current input file is gzipped */
+	int		  bzip; /* current input file is bzip2ed */
 	int		  filenc; /* encoding of the current file */
 	int		  reparse_count; /* finite interp. stack */
 	int		  line; /* line number in the file */
@@ -610,6 +611,7 @@
 		struct buf *fb, int *with_mmap)
 {
 	gzFile		 gz;
+	BZFILE		*bz;
 	size_t		 off;
 	ssize_t		 ssz;
 
@@ -629,7 +629,7 @@
 	 * concerned that this is going to tank any machines.
 	 */
 
-	if (curp->gzip == 0 && S_ISREG(st.st_mode)) {
+	if (curp->gzip == 0 && curp->bzip == 0 && S_ISREG(st.st_mode)) {
 		if (st.st_size > 0x7fff) {
 			mandoc_msg(MANDOCERR_TOOLARGE, curp, 0, 0, NULL);
 			return 0;
@@ -639,11 +641,15 @@
 	}
 #endif
 
+	gz = NULL;
+	bz = NULL;
 	if (curp->gzip) {
 		if ((gz = gzdopen(fd, "rb")) == NULL)
 			err((int)MANDOCLEVEL_SYSERR, "%s", file);
-	} else
-		gz = NULL;
+	} else if (curp->bzip) {
+		if ((bz = BZ2_bzdopen(fd, "rb")) == NULL)
+			err((int)MANDOCLEVEL_SYSERR, "%s", file);
+	}
 
 	/*
 	 * If this isn't a regular file (like, say, stdin), then we must
@@ -663,9 +669,13 @@
 			}
 			resize_buf(fb, 65536);
 		}
-		ssz = curp->gzip ?
-		gzread(gz, fb->buf + (int)off, fb->sz - off) :
-		read(fd, fb->buf + (int)off, fb->sz - off);
+		if (curp->gzip) {
+			ssz = gzread(gz, fb->buf + (int)off, fb->sz - off);
+		} else if (curp->bzip) {
+			ssz = BZ2_bzread(bz, fb->buf + (int)off, fb->sz - off);
+		} else {
+			ssz = read(fd, fb->buf + (int)off, fb->sz - off);
+		}
 		if (ssz == 0) {
 			fb->sz = off;
 			return 1;
@@ -785,6 +795,7 @@
 	curp->file = file;
 	cp = strrchr(file, '.');
 	curp->gzip = (cp != NULL && ! strcmp(cp + 1, "gz"));
+	curp->bzip = (cp != NULL && ! strcmp(cp + 1, "bz2"));
 
 	/* First try to use the filename as it is. */
 
@@ -804,6 +815,13 @@
 			curp->gzip = 1;
 			return fd;
 		}
+		mandoc_asprintf(&cp, "%s.bz2", file);
+		fd = open(cp, O_RDONLY);
+		free(cp);
+		if (fd != -1) {
+			curp->bzip = 1;
+			return fd;
+		}
 	}
 
 	/* Neither worked, give up. */
--- mdocml-1.13.4/configure
+++ mdocml-1.13.4/configure
@@ -255,7 +255,7 @@
 fi
 
 # --- LDADD ---
-LDADD="${LDADD} ${LD_SQLITE3} ${LD_OHASH} -lz"
+LDADD="${LDADD} ${LD_SQLITE3} ${LD_OHASH} -lz -lbz2"
 echo "LDADD=\"${LDADD}\"" 1>&2
 echo "LDADD=\"${LDADD}\"" 1>&3
 echo 1>&3




Re: [gentoo-dev] bzipped manpages

2017-01-10 Thread Mart Raudsepp
One fine day, Tue, 10.01.2017 at 19:19, Vadim A. Misbakh-Soloviov wrote:
> that will affect tons of users (who are happy with the current
> "defaults") because of your own local problems (not having root
> access on the system)?

Yes, the default should be changed for everyone.
To PORTAGE_COMPRESS="xz".

Note that it doesn't only affect man pages, but also tons of
/usr/share/doc/$PF/* files.




Re: [gentoo-dev] bzipped manpages

2017-01-10 Thread Jan Stary
On Jan 10 19:19:03, gen...@mva.name wrote:
> In a message of Tuesday, 10 January 2017, 13:08:14 +07, Jan Stary wrote:
> > On Jan 10 19:04:47, gen...@mva.name wrote:
> > > > There is an option to support; the packages need to be reinstalled
> > > > or there are untracked files; the manpage formatter needs to call
> > > > external unpackers. All this to save 40M. I honestly don't think
> > > > it's worth it.
> > > 
> > > Why do you care about calling an external unpacker,
> > > but do not care about saving 40MB?
> > 
> > Because not having to call an external unpacker
> > allows for the manpage formatter to be simple;
> > whereas saving 40M of space is of no concern.
> 
> You are arguing that 40MB is nothing on modern systems (which, by the way,
> is not exactly true when talking about embedded ones).

Can you give an example of an embedded system with manpages?

> So I guess that means you have modern, powerful hardware, which is
> pretty fine with some overhead.

If having an extra 40MB is "modern, powerful hardware", then yes.

> Then why do you need a "simple" manpage formatter?

Why do I want software to be simple?

> And actually, why is calling an external unpacker so complicated? Almost any
> programming language I know has a function equivalent to C's system()...

It's not that complicated; it's unneeded, it's another dependency, etc.

> Do you fully understand that you are asking to change "defaults" that will
> affect tons of users (who are happy with the current "defaults") because of
> your own local problems (not having root access on the system)?

This has nothing to do with not having root - that only makes me
unable to use the _workarounds_.

What would be the effect of having uncompressed manpages as the default
(besides having them rendered by any manpage formatter,
and wasting 40MB of space, obviously)?

Jan




Re: [gentoo-dev] bzipped manpages

2017-01-10 Thread Vadim A. Misbakh-Soloviov
In a message of Tuesday, 10 January 2017, 13:08:14 +07, Jan Stary wrote:
> On Jan 10 19:04:47, gen...@mva.name wrote:
> > > There is an option to support; the packages need to be reinstalled
> > > or there are untracked files; the manpage formatter needs to call
> > > external unpackers. All this to save 40M. I honestly don't think
> > > it's worth it.
> > 
> > Why do you care about calling an external unpacker,
> > but do not care about saving 40MB?
> 
> Because not having to call an external unpacker
> allows for the manpage formatter to be simple;
> whereas saving 40M of space is of no concern.

You are arguing that 40MB is nothing on modern systems (which, by the way,
is not exactly true when talking about embedded ones).
So I guess that means you have modern, powerful hardware, which is
pretty fine with some overhead.

Then why do you need a "simple" manpage formatter? Why not use one of the
complicated ones (man-db, vim's Man.vim, whatever) instead?

P.S.:

And actually, why is calling an external unpacker so complicated? Almost any
programming language I know has a function equivalent to C's system()...
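
To give an idea of the size of such a call, here is a sketch in Python
rather than C. unpack_with is a hypothetical helper, and it assumes the
unpacker (for instance bzip2 with its -d and -c options) is installed
and writes the decompressed data to stdout.

```python
import subprocess


def unpack_with(command, path):
    """Run an external unpacker and return what it writes to stdout.

    command is an argument list such as ["bzip2", "-dc"]; the file name
    is appended as the last argument. No shell is involved, so unusual
    characters in path are not interpreted.
    """
    result = subprocess.run(command + [path], stdout=subprocess.PIPE, check=True)
    return result.stdout
```

A call like unpack_with(["bzip2", "-dc"], "/usr/share/man/man1/ls.1.bz2")
would return the page source. The cost is the extra runtime dependency
on the unpacker binary.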

P.P.S:

Do you fully understand that you are asking to change "defaults" that will
affect tons of users (who are happy with the current "defaults") because of
your own local problems (not having root access on the system)?



Re: [gentoo-dev] bzipped manpages

2017-01-10 Thread Jan Stary
On Jan 10 19:04:47, gen...@mva.name wrote:
> > There is an option to support; the packages need to be reinstalled
> > or there are untracked files; the manpage formatter needs to call
> > external unpackers. All this to save 40M. I honestly don't think
> > it's worth it.
> 
> Why do you care about calling an external unpacker,
> but do not care about saving 40MB?

Because not having to call an external unpacker
allows for the manpage formatter to be simple;
whereas saving 40M of space is of no concern.




Re: [gentoo-dev] bzipped manpages

2017-01-10 Thread Jan Stary
> > This is Gentoo 2.2 (4.4.6-gentoo x86_64).
> 
> That doesn't actually tell any Gentoo user anything about your system
> except a very specific few bits of data which do not relate at all to
> the rest of the subject matter of your e-mail.

I am not really familiar with this system - what would be
the right piece of information that does relate to this?

> [1] echo 'PORTAGE_COMPRESS=""' >> /etc/portage/make.conf

-ksh: /etc/portage/make.conf: cannot create [Permission denied]

Jan




Re: [gentoo-dev] bzipped manpages

2017-01-10 Thread Vadim A. Misbakh-Soloviov
> There is an option to support; the packages need to be reinstalled
> or there are untracked files; the manpage formatter needs to call
> external unpackers. All this to save 40M. I honestly don't think
> it's worth it.

Why do you care about calling an external unpacker, but do not care about
saving 40MB?

Double standards?



Re: [gentoo-dev] bzipped manpages

2017-01-10 Thread Jan Stary
On Jan 10 12:54:21, h...@stare.cz wrote:
> Also, the uncompressed manpage will not get updated
> when the package gets updated. I will have two copies,
> a stale *.1 and an up-to-date *.1.bz2.

And things like /usr/share/man/man1/sx.1.bz2
will not get unbzipped, because it's a symlink, now broken.
So special care needs to be taken with the _names_ of these.
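
A quick way to see which links end up broken after a manual
decompression is to walk the tree and test every symlink. This is just
a sketch, assuming Python is available; dangling_symlinks is a name
made up here.

```python
import os


def dangling_symlinks(root):
    """Yield paths under root that are symlinks whose target does not exist."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # os.path.exists() follows symlinks, so it is False for a broken one.
            if os.path.islink(path) and not os.path.exists(path):
                yield path
```

Running it over a private copy of /usr/share/man lists the links that
still point at a *.bz2 name that is no longer there.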

Jan




Re: [gentoo-dev] bzipped manpages

2017-01-10 Thread Jan Stary
On Jan 09 09:30:11, ike...@gentoo.org wrote:
> Hiya Jan,
> 
> The following snippet from Ingo is correct:
> 
> > So, you want to hear something constructive?  Your best option is to
> > just decompress that stuff on your system.  (Gentoo is famous for
> > its excessive configurability - maybe there is even an option?)
> 
> We are both famous for our excessive configurability and there is even
> an option already!  5:)  If you look in the manpage (once you've
> decompressed it somewhere, or online at [1]) for make.conf, you'll see the
> entry for PORTAGE_COMPRESS, which you can set as follows:
> 
> PORTAGE_COMPRESS=""

I am only a user on this system,
and have no control over which packages are installed
and have no write permissions in /usr/share/man/ or make.conf

> As mentioned in [2,3,others].  You'll then need to reinstall all
> packages.  If you manually decompress the files, then the uncompressed
> manpages won't be registered with portage and won't get removed if the
> owning package is uninstalled.

Also, the uncompressed manpage will not get updated
when the package gets updated. I will have two copies,
a stale *.1 and an up-to-date *.1.bz2.
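
Such pairs are at least easy to detect. The sketch below is only an
illustration, assuming Python; duplicated_pages and the suffix list are
invented for this example. It cannot tell which copy is newer, only
that both exist.

```python
import os

# Suffixes treated as "compressed" for this sketch.
COMPRESSED_SUFFIXES = (".bz2", ".gz", ".xz")


def duplicated_pages(root):
    """Return sorted paths of uncompressed pages that also exist compressed."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        names = set(filenames)
        for name in filenames:
            if name.endswith(COMPRESSED_SUFFIXES):
                continue
            # e.g. "ls.1" is reported when "ls.1.bz2" sits next to it.
            if any(name + suffix in names for suffix in COMPRESSED_SUFFIXES):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```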

These are workarounds. Let me get back to the original question:
would you please consider having _uncompressed_ manpages as the default?

On this particular system, the bzipped /usr/share/man/ is 67M.
The uncompressed man/ is 108M. That's 40M saved. Seriously?

There is an option to support; the packages need to be reinstalled
or there are untracked files; the manpage formatter needs to call
external unpackers. All this to save 40M. I honestly don't think
it's worth it.

Jan




Re: [gentoo-dev] bzipped manpages

2017-01-09 Thread Jeroen Roovers
On Mon, 9 Jan 2017 09:08:22 +0100
Jan Stary wrote:

> This is Gentoo 2.2 (4.4.6-gentoo x86_64).

That doesn't actually tell any Gentoo user anything about your system
except a very specific few bits of data which do not relate at all to
the rest of the subject matter of your e-mail.


Kind regards and good luck with [1],
 jer



[1] echo 'PORTAGE_COMPRESS=""' >> /etc/portage/make.conf; \
    emerge -e world  # [2]
[2] This emerge call might be quicker:
    emerge -1 `find /usr/share/man -type f`
    depending on the (maximum) length of the argument list.



Re: [gentoo-dev] bzipped manpages

2017-01-09 Thread Kent Fredric
On Mon, 9 Jan 2017 09:30:11 +
Mike Auty wrote:

> As mentioned in [2,3,others].  You'll then need to reinstall all
> packages. 

Well, most. Probably a subset of "all", and if anything gets stuck halfway,
you'll want to know which remaining packages still need to be merged.

find /usr/share/man/ -name "*.bz2" -print0 | xargs -0 qfile -qC | sort -u

This ought to be a good starting point for determining which packages to
re-emerge.

> If you manually decompress the files, then the uncompressed
> manpages won't be registered with portage and won't get removed if the
> owning package is uninstalled.  Hope that helps...

And if you've already done this, knowing which files are no longer "tracked" by
portage, so you can purge them, might be helpful:

find /usr/share/man/ -type f -print0 | xargs -0 qfile -Co

I myself found some weird things lurking ...

/usr/share/man/:
/usr/share/man/tack:
/usr/share/man/ize ;-) ...  
 [ ok ]






Re: [gentoo-dev] bzipped manpages

2017-01-09 Thread Mike Auty
Hiya Jan,

The following snippet from Ingo is correct:

> So, you want to hear something constructive?  Your best option is to
> just decompress that stuff on your system.  (Gentoo is famous for
> its excessive configurability - maybe there is even an option?)

We are both famous for our excessive configurability and there is even
an option already!  5:)  If you look in the manpage (once you've
decompressed it somewhere, or online at [1]) for make.conf, you'll see the
entry for PORTAGE_COMPRESS, which you can set as follows:

PORTAGE_COMPRESS=""

As mentioned in [2,3,others].  You'll then need to reinstall all
packages.  If you manually decompress the files, then the uncompressed
manpages won't be registered with portage and won't get removed if the
owning package is uninstalled.  Hope that helps...

Mike  5:)

[1] https://dev.gentoo.org/~zmedico/portage/doc/man/make.conf.5.html
[2] https://blog.flameeyes.eu/2007/12/a-word-against-ecompress/
[3] http://www.gossamer-threads.com/lists/gentoo/user/286024


