Bug#930803: new program: runcached

2019-06-26 Thread Andras Korn
On Wed, Jun 26, 2019 at 12:18:25PM +0200, Andras Korn wrote:

> > > I just wrote this script:
> > > https://gist.github.com/akorn/51ee2fe7d36fa139723c851d87e56096 and thought
> > > it might be a good addition to moreutils.
> > > 
> > > It caches the stdout, stderr and exit status of arbitrary commands for a
> > > configurable length of time, returning data from cache on subsequent
> > > invocations if the cache is still fresh.
> > 
> > thanks for your suggestion; I think it's quite an interesting idea!
> > And you made me curious:  which commands are you running through
> > 'runcached'?  All the programs I could think of rely mainly on side effects
> > such as file system or network access.
> 
> Examples:
> 
>  * I have a Makefile to regenerate the data.cdb file for tinydns whenever
>    any of the local source files changed. Part of the process is downloading
>    some remote DNS zones with axfr-get, which is relatively slow. The remote
>    zones don't change frequently; I don't want to download them each time,
>    but I do want to download them at least once every eight hours or so. So
>    I run axfr-get via runcached, with a cache ttl of 8 hours.
>
>  * Some monitoring systems like Zabbix and Munin need to run data-gathering
>    stuff that is time-consuming; it sometimes happens that several plugins
>    need to run the same thing, but extract different data from its output.
>    Instead of having separate, ad-hoc caching in all such plugins, it's
>    better to have a generic caching solution.
>
>  * When looking for space hogs in the filesystem, I often use
>    'du -hscx * .* | sort -h' and then dig down further. Without runcached
>    I would have to either save the output separately, or keep opening new
>    sessions in screen(1), or wait for the same output to be generated again,
>    when I go up to the higher level directory again. With runcached, `du` is
>    cheap the 2nd time, and I don't care if the numbers are slightly off when
>    they come from the cache.
>
>  * Same for ad-hoc log analysis sessions: I may grep through a bunch of
>    logs, then grep for something else, then grep for the same thing again.
>    With runcached, I don't need to worry about saving output I may need
>    again, because runcached does it for me.

* Oh, and ldapsearch. I have some scheduled scripts that each do something
  to members of the same group; before, I had an ad-hoc cache for group
  members, but with runcached I can just re-run ldapsearch with the same
  arguments and get cached results.
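
The "same arguments, cached results" behaviour implies that the cache entry is looked up by the command line. A minimal sketch of one way to derive such a key, in Python rather than the script's zsh; the function name `cache_key` and the choice of SHA-256 are assumptions for illustration, and the real script may key on other things as well (environment, working directory):

```python
import hashlib


def cache_key(argv):
    """Return a stable hex digest identifying a command line (hypothetical helper)."""
    # Join with NUL so that ["a b"] and ["a", "b"] produce different keys;
    # NUL cannot occur inside a real command-line argument.
    joined = "\0".join(argv).encode("utf-8")
    return hashlib.sha256(joined).hexdigest()
```

Two invocations with identical arguments map to the same key, so the second can be answered from the cache; changing any argument yields a different key.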

András

-- 
Win any staring contest: slowly go in for a kiss without breaking eye contact.



Bug#930803: new program: runcached

2019-06-26 Thread Andras Korn
On Tue, Jun 25, 2019 at 08:22:12PM +0200, Nicolas Schier wrote:

Hi,

> > I just wrote this script:
> > https://gist.github.com/akorn/51ee2fe7d36fa139723c851d87e56096 and thought
> > it might be a good addition to moreutils.
> > 
> > It caches the stdout, stderr and exit status of arbitrary commands for a
> > configurable length of time, returning data from cache on subsequent
> > invocations if the cache is still fresh.
> 
> thanks for your suggestion; I think it's quite an interesting idea!
> And you made me curious:  which commands are you running through
> 'runcached'?  All the programs I could think of rely mainly on side effects
> such as file system or network access.

Examples:

 * I have a Makefile to regenerate the data.cdb file for tinydns whenever
   any of the local source files changed. Part of the process is downloading
   some remote DNS zones with axfr-get, which is relatively slow. The remote
   zones don't change frequently; I don't want to download them each time,
   but I do want to download them at least once every eight hours or so. So
   I run axfr-get via runcached, with a cache ttl of 8 hours.

 * Some monitoring systems like Zabbix and Munin need to run data-gathering
   stuff that is time-consuming; it sometimes happens that several plugins
   need to run the same thing, but extract different data from its output.
   Instead of having separate, ad-hoc caching in all such plugins, it's
   better to have a generic caching solution.

 * When looking for space hogs in the filesystem, I often use
   'du -hscx * .* | sort -h' and then dig down further. Without runcached
   I would have to either save the output separately, or keep opening new
   sessions in screen(1), or wait for the same output to be generated again,
   when I go up to the higher level directory again. With runcached, `du` is
   cheap the 2nd time, and I don't care if the numbers are slightly off when
   they come from the cache.

 * Same for ad-hoc log analysis sessions: I may grep through a bunch of
   logs, then grep for something else, then grep for the same thing again.
   With runcached, I don't need to worry about saving output I may need
   again, because runcached does it for me.

> > It currently has semi-esoteric dependencies: it's written in zsh and uses
> > chpst from the runit package for locking. If you're willing to include the
> > script I can change it to use flock(1) instead, but I'm not rewriting it in
> > POSIX sh.
> 
> Adding new scripts to the moreutils collection is usually done by
> forwarding to the upstream maintainer (Joey Hess) and
> asking for script inclusion.  But, as Joey keeps more than just one eye
> on cross platform compatibility, I expect non-POSIX implementations to
> be rejected.  Do you stand by your decision not to rewrite it in POSIX sh?

I modified it to use zsh's system module for locking, but I'm sticking with
zsh; I have no interest in rewriting it in plain Bourne sh. zsh isn't much
less cross-platform than, say, perl or Python.

The --prune-cache functionality probably depends on GNU find(1).
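
What pruning involves is not spelled out in this thread. A minimal sketch, assuming it means deleting cache files whose modification time is older than some maximum age; the function name `prune_cache` and its parameters are hypothetical, and this is Python rather than the script's find(1) call:

```python
import os
import time


def prune_cache(cache_dir, max_age_seconds, now=None):
    """Delete regular files in cache_dir older than max_age_seconds.

    Returns the sorted names of the files that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in os.scandir(cache_dir):
        # Only touch regular files; never follow symlinks out of the cache.
        if not entry.is_file(follow_symlinks=False):
            continue
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            os.unlink(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

Doing the age check in the program itself avoids depending on any particular find(1) implementation, at the cost of more code.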

I'm offering the script in the belief that it might be useful to others, but
getting it into moreutils is no priority for me.

> Have you thought about which license you want to release it under? GPL2+?

I was thinking GPLv3+, but if the rest of moreutils is GPL2+, I'm fine with
that too.

András

-- 
A synonym is a word you use when you can't spell the other one.



Bug#930803: new program: runcached

2019-06-25 Thread Nicolas Schier
tags -1 + moreinfo
thanks

Hi András,

> I just wrote this script:
> https://gist.github.com/akorn/51ee2fe7d36fa139723c851d87e56096 and thought
> it might be a good addition to moreutils.
> 
> It caches the stdout, stderr and exit status of arbitrary commands for a
> configurable length of time, returning data from cache on subsequent
> invocations if the cache is still fresh.

thanks for your suggestion; I think it's quite an interesting idea!
And you made me curious:  which commands are you running through
'runcached'?  All the programs I could think of rely mainly on side effects
such as file system or network access.

> It currently has semi-esoteric dependencies: it's written in zsh and uses
> chpst from the runit package for locking. If you're willing to include the
> script I can change it to use flock(1) instead, but I'm not rewriting it in
> POSIX sh.

Adding new scripts to the moreutils collection is usually done by
forwarding to the upstream maintainer (Joey Hess) and
asking for script inclusion.  But, as Joey keeps more than just one eye
on cross platform compatibility, I expect non-POSIX implementations to
be rejected.  Do you stand by your decision not to rewrite it in POSIX sh?

Have you thought about which license you want to release it under? GPL2+?

Kind regards,
Nicolas

-- 
email: nico...@fjasle.eu   irc://oftc.net/nsc
↳ gpg: 18ed 52db e34f 860e e9fb  c82b 7d97 0932 55a0 ce7f
 -- the fear of the Lord is the beginning of knowledge --




Bug#930803: new program: runcached

2019-06-20 Thread Andras Korn
Package: moreutils
Version: 0.62-1
Severity: wishlist

Hi,

I just wrote this script:
https://gist.github.com/akorn/51ee2fe7d36fa139723c851d87e56096 and thought
it might be a good addition to moreutils.

It caches the stdout, stderr and exit status of arbitrary commands for a
configurable length of time, returning data from cache on subsequent
invocations if the cache is still fresh.
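
The behaviour described above can be sketched roughly as follows. This is an illustration in Python, not the zsh script from the gist: the function name `run_cached`, the JSON on-disk format, and the use of the file's mtime as the freshness check are all assumptions, and locking against concurrent runs is left out.

```python
import hashlib
import json
import os
import subprocess
import time


def run_cached(argv, ttl_seconds, cache_dir):
    """Run argv, or replay its cached result if younger than ttl_seconds.

    Returns (stdout, stderr, exit_status, served_from_cache).
    """
    os.makedirs(cache_dir, exist_ok=True)
    key = hashlib.sha256("\0".join(argv).encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".json")

    try:
        # The entry is fresh if it was written less than ttl_seconds ago.
        if time.time() - os.stat(path).st_mtime < ttl_seconds:
            with open(path, encoding="utf-8") as f:
                entry = json.load(f)
            return entry["stdout"], entry["stderr"], entry["status"], True
    except (OSError, ValueError, KeyError):
        pass  # missing or unreadable entry: fall through and run the command

    proc = subprocess.run(argv, capture_output=True, text=True)
    entry = {"stdout": proc.stdout, "stderr": proc.stderr,
             "status": proc.returncode}

    # Write to a temporary file and rename it, so that readers never see
    # a half-written entry.
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(entry, f)
    os.replace(tmp, path)
    return proc.stdout, proc.stderr, proc.returncode, False
```

A caller would print the returned stdout and stderr to the matching streams and exit with the returned status, so that a cached run is indistinguishable from a real one apart from its speed.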

It currently has semi-esoteric dependencies: it's written in zsh and uses
chpst from the runit package for locking. If you're willing to include the
script I can change it to use flock(1) instead, but I'm not rewriting it in
POSIX sh.
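
The point of the locking is to keep two simultaneous invocations of the same command from both running it and writing the same cache entry. A minimal sketch of that idea using an flock(2)-style advisory lock, the same kind of lock that flock(1) takes; this is Python, not the script's own mechanism, and the helper name `locked` and the lock-file path are hypothetical:

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def locked(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until any other holder releases the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the descriptor would also drop the lock; unlock explicitly
        # to make the intent clear.
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

A cache wrapper would check freshness, run the command, and write the entry all inside `with locked(...)`, so a second caller waits and then finds a fresh entry instead of running the command again.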

Best regards,

András

-- 
  Want to make $$$$ using your computer? Hold shift and press '4' four times.