Re: “What’s in a package”

2021-09-24 Thread Ludovic Courtès
Hello,

Katherine Cox-Buday writes:

> That sounds like a great start. I tossed out some other ideas elsewhere in 
> the thread. Most of them involve meta-inspection of the package, Guix 
> ecosystem, runtime environment and logs. It would be nice in general to have 
> a kind of "agent" that you could run repeatedly over the course of packaging 
> that would suggest next steps on ~stderr~ and next logical packaging 
> definition on ~stdout~. Kind of like pair-programming with Guix :)
>
> It would perform different operations dependent on what stage in the 
> life-cycle the package is at, i.e. ~import~ when no package definition 
> exists, build when one does, and possibly running the result in a container 
> when the package build succeeds.
>
> E.g. your PyTorch example, starting from scratch (note: ~guix import~ may not 
> always feel like the right command to invoke in this example. This may be 
> some larger concept than import; also, the example always redirects to 
> package.scm for brevity, but the user would probably want to look at it 
> first):
>
> #+begin_example
>   $ guix import upstream pytorch
>
>   stderr: This looks like it might be a Python package (heuristics.scm:123 -
> package name starts with py), try this instead:
>   stdout: guix import upstream pypi pytorch

[...]

I like these ideas!

> etc., etc. Typing that out, it feels dangerously close to Microsoft's Clippy, 
> but hopefully more helpful :)

Heh, Clippy was cute.  ;-)

> Heuristics, by definition, wouldn't be correct all the time, but this kind of 
> thing could help new contributors (or experienced contributors with bad 
> memories like me!), and in some cases actually do some of the programming.
>
> And every time someone comes to the mailing list or IRC with a question, we 
> can ask ourselves if this is a common question, and maybe create a new 
> heuristic.

Agreed.

Let’s see if we can get there…

Thanks,
Ludo’.



Re: “What’s in a package”

2021-09-23 Thread Katherine Cox-Buday
Ludovic Courtès writes:

> I agree.  Like Konrad and you wrote, it’s good that we can all have our
> quick-and-dirty packages in personal channels, and it’s good that these
> are separate channels.

Definitely! I would also encourage Guix to adopt some of the tools that make
creating quick, but bad, packages easier. E.g. builds that patch ELF binaries
with the library paths of their dependencies.

> I’m not sure how tooling could help in the way of making quick packages,
> but it’s worth exploring.  Examples that come to mind are: merging the
> npm importer that currently lives in a branch, and providing a generic
> “guix import upstream” importer that would figure out as much as
> possible so that one doesn’t have to start from a blank page.

That sounds like a great start. I tossed out some other ideas elsewhere in the 
thread. Most of them involve meta-inspection of the package, Guix ecosystem, 
runtime environment and logs. It would be nice in general to have a kind of 
"agent" that you could run repeatedly over the course of packaging that would 
suggest next steps on ~stderr~ and next logical packaging definition on 
~stdout~. Kind of like pair-programming with Guix :)

It would perform different operations dependent on what stage in the life-cycle 
the package is at, i.e. ~import~ when no package definition exists, build when 
one does, and possibly running the result in a container when the package build 
succeeds.
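
A minimal sketch of that stage selection, written in Python purely for illustration (a real tool would more likely be Guile Scheme inside Guix). The ~Stage~ names and the ~next_stage~ function are hypothetical, not an existing Guix interface; the sketch only assumes the agent can tell whether a definition file exists and whether the last build succeeded.

```python
from enum import Enum
from pathlib import Path
from typing import Optional


class Stage(Enum):
    """Hypothetical life-cycle stages of the packaging agent."""
    IMPORT = "import"        # no package definition exists yet
    BUILD = "build"          # a definition exists but has not built yet
    CONTAINER = "container"  # the build succeeded; try running the result


def next_stage(definition: Path, build_succeeded: Optional[bool]) -> Stage:
    """Pick the next operation to perform.

    `build_succeeded` is None when no build has been attempted so far.
    """
    if not definition.exists():
        return Stage.IMPORT
    if build_succeeded:
        return Stage.CONTAINER
    # Either no build was attempted or the last one failed: (re)build.
    return Stage.BUILD
```

Running the agent repeatedly then amounts to calling ~next_stage~ after each step and acting on the result.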

E.g. your PyTorch example, starting from scratch (note: ~guix import~ may not 
always feel like the right command to invoke in this example. This may be some 
larger concept than import; also, the example always redirects to package.scm 
for brevity, but the user would probably want to look at it first):

#+begin_example
  $ guix import upstream pytorch

  stderr: This looks like it might be a Python package (heuristics.scm:123 -
          package name starts with py), try this instead:
  stdout: guix import upstream pypi pytorch

  $ guix import upstream pypi pytorch | tee package.scm

  $ guix import upstream package.scm | tee package.scm

  stderr: downloading...
  stderr: It looks like this fails to build because it's missing autoconf
          (heuristics.scm:133 - grepping build output found a missing autoconf
          error). Try adding it as a native-input.
  stdout: (package definition with imports defined and native-input modified)

  $ guix import upstream package.scm

  stderr: downloading...
  stderr: It looks like this package comes with binaries that are available as
          Guix packages (heuristics.scm:143 - unpacking source includes binary
          or object files, heuristics.scm:153 - bundled files match output of
          known packages). Try this package definition instead:
  stdout: (package definition with suggested inputs and overridden phases to
          remove the binaries from the download)

  $ guix import upstream package.scm | tee package.scm

  stderr: It looks like this package vendors libraries that are available as
          Guix packages (heuristics.scm:163 - unpacking source includes
          vendored libraries, heuristics.scm:153 - bundled files match output
          of known packages). Try this package definition instead:
  stdout: (package definition with suggested inputs and overridden phases to
          remove the vendored libraries from the download)

  $ guix import upstream package.scm | tee package.scm
  
  stderr: It looks like this package searches XDG_DATA_DIRS for some files
          (heuristics.scm:163 - grep an strace of a containerized run of the
          output). Try this package definition instead:
  stdout: (package definition with ~native-search-paths~ defined)
#+end_example
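
To make the stderr/stdout split concrete, here is a minimal sketch of two of the heuristics above, in Python purely for illustration. Everything in it is an assumption: ~guix import upstream~ is the command imagined in this thread rather than an existing one, the function names are made up, and the regular expression only matches the "autoconf: command not found" style of message that a shell typically prints.

```python
import re
import sys
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Suggestion:
    reason: str     # human-readable hint, meant for stderr
    next_step: str  # next command or definition, meant for stdout


def python_name(name: str, build_log: str) -> Optional[Suggestion]:
    # Hypothetical heuristic: "package name starts with py".
    if name.startswith("py"):
        return Suggestion(
            reason="This looks like it might be a Python package, "
                   "try this instead:",
            # `guix import upstream` is the imagined command from the thread.
            next_step=f"guix import upstream pypi {name}",
        )
    return None


def missing_autoconf(name: str, build_log: str) -> Optional[Suggestion]:
    # Hypothetical heuristic: grep the build output for a missing autoconf.
    # The message format is an assumption about what the shell prints.
    if re.search(r"autoconf: (command )?not found", build_log):
        return Suggestion(
            reason="It looks like this fails to build because it's missing "
                   "autoconf. Try adding it as a native-input.",
            next_step="(package definition with autoconf in native-inputs)",
        )
    return None


Heuristic = Callable[[str, str], Optional[Suggestion]]
HEURISTICS: List[Heuristic] = [python_name, missing_autoconf]


def suggest(name: str, build_log: str = "") -> List[Suggestion]:
    """Run every heuristic and keep the ones that fire."""
    results = []
    for heuristic in HEURISTICS:
        suggestion = heuristic(name, build_log)
        if suggestion is not None:
            results.append(suggestion)
    return results


def report(suggestions: List[Suggestion]) -> None:
    """Print hints on stderr and the proposed next step on stdout."""
    for s in suggestions:
        print(s.reason, file=sys.stderr)
        print(s.next_step)
```

Adding a heuristic is then a matter of writing one more function and appending it to ~HEURISTICS~.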

etc., etc. Typing that out, it feels dangerously close to Microsoft's Clippy, 
but hopefully more helpful :)

Heuristics, by definition, wouldn't be correct all the time, but this kind of 
thing could help new contributors (or experienced contributors with bad 
memories like me!), and in some cases actually do some of the programming.

And every time someone comes to the mailing list or IRC with a question, we can 
ask ourselves if this is a common question, and maybe create a new heuristic.

-- 
Katherine



Re: “What’s in a package”

2021-09-23 Thread Ludovic Courtès
Hi Katherine,

Katherine Cox-Buday writes:

> This is perhaps a rehash of the "worse is better"[2] conversation, but
> I often struggle with deciding whether to do things the "fast" way, or
> the "correct" way. I think when your path is clear, the correct way
> will get you farther, faster. But when you're doing experiments, or
> exploratory programming, being bogged down with the "correct" way of
> doing things (i.e. Guix packages) might take a lot of time for no
> benefit. E.g. maybe you end up packaging a cluster of things that you
> find out don't work out for you. Of course the challenge is: if you
> choose the fast way, and it works out, do you go back to do it the
> correct way so that you're on sound footing?

I can very much relate to this.  PyTorch was one case where I hesitated
between the “fast way” and the “correct way”; I chose the latter
thinking that it would probably be beneficial to others, otherwise you
wouldn’t have heard about it.  ;-)

> Bringing this back to Guix, and maybe the GNU philosophy, it has been
> very helpful for me to be able to leverage the flexibility of Guix to
> occasionally do things the "fast" way, perhaps by packaging a
> binary. Paradoxically, it has allowed me to stay within the Guix and
> free software ecosystem. In my opinion, flexibility is key to growing
> the ecosystem and community, and I would encourage Guix as a project
> to take every opportunity to give the user options.

I agree.  Like Konrad and you wrote, it’s good that we can all have our
quick-and-dirty packages in personal channels, and it’s good that these
are separate channels.

I’m not sure how tooling could help in the way of making quick packages,
but it’s worth exploring.  Examples that come to mind are: merging the
npm importer that currently lives in a branch, and providing a generic
“guix import upstream” importer that would figure out as much as
possible so that one doesn’t have to start from a blank page.

Thanks,
Ludo’.



Re: “What’s in a package”

2021-09-22 Thread zimoun
Hi,

On Tue, 21 Sep 2021 at 15:20, Katherine Cox-Buday wrote:

>  I.e.,
> when trying to achieve a goal, it is a pain to package things that
> aren't yet packaged, but what I get in return are sane environments,
> deployments, and meta-data about all of these.

I concur! :-)

> This is perhaps a rehash of the "worse is better"[2] conversation, but
> I often struggle with deciding whether to do things the "fast" way, or
> the "correct" way. I think when your path is clear, the correct way
> will get you farther, faster. But when you're doing experiments, or
> exploratory programming, being bogged down with the "correct" way of
> doing things (i.e. Guix packages) might take a lot of time for no
> benefit. E.g. maybe you end up packaging a cluster of things that you
> find out don't work out for you. Of course the challenge is: if you
> choose the fast way, and it works out, do you go back to do it the
> correct way so that you're on sound footing?
>
> Bringing this back to Guix, and maybe the GNU philosophy, it has been
> very helpful for me to be able to leverage the flexibility of Guix to
> occasionally do things the "fast" way, perhaps by packaging a
> binary. Paradoxically, it has allowed me to stay within the Guix and
> free software ecosystem. In my opinion, flexibility is key to growing
> the ecosystem and community, and I would encourage Guix as a project
> to take every opportunity to give the user options.

A long time ago, I watched this badly recorded video [1] about “Haskell is
useless”.  Reframed for packages, its two-axis diagram becomes:

 useful  | trad-pkg ~~> Nirvana
         |                 ^
         |                 | Guix
useless  |
         -------------------------
           unsafe          safe

where ’unsafe’ vs ’safe’ could read ’fast’ vs ’robust’; and trad-pkg
reads apt, conda, spack, yum, etc.


1: 


Cheers,
simon



Re: “What’s in a package”

2021-09-22 Thread Pjotr Prins
Great post Ludovic! We are contributors, consumers and fans of GNU
Guix for good reason.

On Wed, Sep 22, 2021 at 12:08:15AM +0530, Arun Isaac wrote:
> 
> Hi Ludo,
> 
> >   https://hpc.guix.info/blog/2021/09/whats-in-a-package/
> 
> Thanks for writing this article! This article will be very useful to
> share with others who might think our insistence on auditability,
> reproducibility and software freedom to be bordering on the
> pedantic. Frankly, I find it quite mysterious why Guix is not more
> widely adopted in science. But, I suppose we will get there, in time!
> :-)
> 
> Cheers,
> Arun





Re: “What’s in a package”

2021-09-22 Thread Konrad Hinsen
Katherine Cox-Buday writes:

> As we've seen these past years with COVID-19 and the world's supply
> chains, efficiency has some kind of inverse relationship with
> robustness. If you go too far down the path of efficiency, you are not
> very flexible, and you're building sand castles.

That's exactly what I have seen happening in scientific software for a
while:

  https://hal.archives-ouvertes.fr/hal-02117588

> It's for this reason I appreciate having "robust" software underneath
> my sand castle. At least I know only so much can crumble :)

100 % agreement!

> I want to be careful here in what I suggest. I think it is very
> important that Guix remain a bastion of robust software with very high
> standards. I don't want to see the PyPI PyTorch packages of the world

Me neither. My suggestion was for support in Guix the tool, not Guix the
software distribution. People can/should package their sand castles in
their private channels.

> So with your example: make it really easy to transform that PyPI
> package into a terrible Guix primitive of some kind, but don't let me
> commit it to Guix proper.

I trust our maintainer team to not let this happen.

> Maybe interactive software that introspects how a package
> is written and behaves at runtime (in a container?) and utilizes the
> homoiconicity of scheme to suggest modifications of the package, or
> next steps. E.g. expand the linter to suggest things like

That sounds interesting!

> Speaking of industry, I don't think we leverage software to build software 
> enough.

Definitely not.

> And by the way, none of those ideas would be possible if Guix weren't
> such a robust and sane ecosystem.

Exactly. We can discuss (and more) adding sloppy stuff on top of Guix,
but it wouldn't work the other way round.


"Jonathan McHugh" writes:

> Your focus regarding a transition from exploratory to robust is
> important (though may have equal significance in the other
> direction?).

Not equal as I see it, but yes, it matters as well, for dragging a
stable package out into the open again for significant improvements.

> Would security experts have (understandable) criteria to prioritise
> choices for 'robust corridors' within an ecosystem of sourcefiles and
> encapsulated blobs?

I'd love to hear from security experts too!

Konrad.



Re: “What’s in a package”

2021-09-22 Thread Jonathan McHugh
Hi Konrad,

Similarly I found the post excellent.

Your focus regarding a transition from exploratory to robust is important 
(though may have equal significance in the other direction?).

Would security experts have (understandable) criteria to prioritise choices for 
'robust corridors' within an ecosystem of sourcefiles and encapsulated blobs?


Jonathan McHugh
indieterminacy@libre.brussels

September 22, 2021 3:32 PM, "Konrad Hinsen" wrote:

> Hi Katherine and Ludo,
> 
>> I appreciate this post very much. Setting aside questions of freedom,
> 
> +1
> 
>> This is perhaps a rehash of the "worse is better"[2] conversation, but
>> I often struggle with deciding whether to do things the "fast" way, or
>> the "correct" way. I think when your path is clear, the correct way
>> will get you farther, faster. But when you're doing experiments, or
>> exploratory programming, being bogged down with the "correct" way of
>> doing things (i.e. Guix packages) might take a lot of time for no
> 
> Exactly. Most software engineering tools situate themselves somewhere on
> the "fast" vs. "robust" scale, and defend their position as the one and
> only Good Thing. Guix is at the "robust" end of the scale in the
> software management category. And that's what I want for most of the
> software I use, i.e. everything I don't hack on myself. Which is why I
> like Guix :-)
> 
> What is so far insufficiently supported by computing technology is the
> necessary transition from "fast" to "robust". There are a few
> exceptions, such as programming languages with gradual typing. In most
> situations, moving software from exploratory to robust involves a lot of
> rewriting, often manually, with no tooling support.
> 
>> Bringing this back to Guix, and maybe the GNU philosophy, it has been
>> very helpful for me to be able to leverage the flexibility of Guix to
>> occasionally do things the "fast" way, perhaps by packaging a
>> binary. Paradoxically, it has allowed me to stay within the Guix and
>> free software ecosystem. In my opinion, flexibility is key to growing
>> the ecosystem and community, and I would encourage Guix as a project
>> to take every opportunity to give the user options.
> 
> +100 :-)
> 
> There is a lot we can improve here. Tutorials would be a good start.
> Example: How do you package a binary in Guix? In particular, how do you
> deal with binaries that have binary dependencies that they expect in
> /lib etc.? A next step would be tool support: Grab whatever PyPI offers,
> even if it's only binary wheels, and turn that into a Guix package.
> 
> Another aspect would be supporting software development moving from fast
> to robust. Suppose I have software I compile by hand, or via a simple
> Makefile, somewhere in my home directory. How do I go from there to (1)
> a quick-and-dirty Guix package, then (2) a very basic publishable Guix
> package and finally (3) a Guix package with tests and documentation?
> The path should be supported by various tools, from automatic rewriting
> to debugging. As an example, something I have wished for more than once
> is the possibility to run the individual build steps of a Guix package
> under my own account in my home directory, for debugging purposes.
> 
> Konrad
> --
> -
> Konrad Hinsen
> Centre de Biophysique Moléculaire, CNRS Orléans
> Synchrotron Soleil - Division Expériences
> Saint Aubin - BP 48
> 91192 Gif sur Yvette Cedex, France
> Tel. +33-1 69 35 97 15
> E-Mail: konrad DOT hinsen AT cnrs DOT fr
> http://dirac.cnrs-orleans.fr/~hinsen
> ORCID: https://orcid.org/-0003-0330-9428
> Twitter: @khinsen
> -



Re: “What’s in a package”

2021-09-22 Thread Katherine Cox-Buday
Konrad Hinsen writes:

> What is so far insufficiently supported by computing technology is the
> necessary transition from "fast" to "robust".

This is really a large problem in the industry, especially since in most
circles moving fast is considered the preferred way to do things. SaaS and
abstractions are endemic, and while they help to get things going, they can
lead to precarious systems with interdependencies and risks that are not fully
understood or appreciated.

The "fast" path does allow people to test out new ideas very quickly, but there 
is a hidden cost. As we've seen these past years with COVID-19 and the world's 
supply chains, efficiency has some kind of inverse relationship with 
robustness. If you go too far down the path of efficiency, you are not very 
flexible, and you're building sand castles.

It's for this reason I appreciate having "robust" software underneath my sand 
castle. At least I know only so much can crumble :)

> There are a few exceptions, such as programming languages with gradual typing.
> In most situations, moving software from exploratory to robust involves a lot
> of rewriting, often manually, with no tooling support.

I really like this framing. How can we support every step of the continuum with 
a gentle pull towards robustness? That sounds like something to strive for.

>> Bringing this back to Guix, and maybe the GNU philosophy, it has been
>> very helpful for me to be able to leverage the flexibility of Guix to
>> occasionally do things the "fast" way, perhaps by packaging a
>> binary. Paradoxically, it has allowed me to stay within the Guix and
>> free software ecosystem. In my opinion, flexibility is key to growing
>> the ecosystem and community, and I would encourage Guix as a project
>> to take every opportunity to give the user options.
>
> +100 :-)
>
> There is a lot we can improve here. Tutorials would be a good start.
> Example: How do you package a binary in Guix? In particular, how do you
> deal with binaries that have binary dependencies that they expect in
> /lib etc.? A next step would be tool support: Grab whatever PyPI offers,
> even if it's only binary wheels, and turn that into a Guix package.

I want to be careful here in what I suggest. I think it is very important that 
Guix remain a bastion of robust software with very high standards. I don't want 
to see the PyPI PyTorch packages of the world in Guix. I /do/ want to see
tooling in Guix that allows users to package and utilize these things as 
first-class primitives in the Guix world.

In other words, let me create beautiful and terrible things, but don't let me 
unleash them on the world.

So with your example: make it really easy to transform that PyPI package into a
terrible Guix primitive of some kind, but don't let me commit it to Guix proper.

> Another aspect would be supporting software development moving from fast
> to robust. Suppose I have software I compile by hand, or via a simple
> Makefile, somewhere in my home directory. How do I go from there to (1)
> a quick-and-dirty Guix package, then (2) a very basic publishable Guix
> package and finally (3) a Guix package with tests and documentation?
> The path should be supported by various tools, from automatic rewriting
> to debugging. As an example, something I have wished for more than once
> is the possibility to run the individual build steps of a Guix package
> under my own account in my home directory, for debugging purposes.

This kind of stuff really excites me. If we could build tooling that somehow
moves things along the continuum, that would really be something. Maybe
interactive software that introspects how a package is written and behaves at
runtime (in a container?) and utilizes the homoiconicity of Scheme to suggest
modifications of the package, or next steps. E.g. expand the linter to suggest
things like documentation, or to identify at what point on the continuum the
package might currently be, and how to move forward. Does the package vendor
binaries? Does Guix have any packages that look like those binaries? What do
the package's binaries want to link to? What paths do they try to access when
run?
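
As one illustration of the "does the package vendor binaries?" question, here is a minimal sketch in Python that walks an unpacked source tree and reports ELF files. The function names are hypothetical and nothing like this is claimed to exist in Guix; the only hard fact it relies on is that ELF files begin with the four bytes 0x7f 'E' 'L' 'F'. Other binary formats (object archives, Java classes, and so on) would need their own checks.

```python
from pathlib import Path
from typing import List

ELF_MAGIC = b"\x7fELF"  # the first four bytes of every ELF file


def looks_like_elf(header: bytes) -> bool:
    """Return True if `header` starts with the ELF magic number."""
    return header[:4] == ELF_MAGIC


def find_elf_files(source_dir: Path) -> List[Path]:
    """List regular files under `source_dir` that look like ELF binaries."""
    found = []
    for path in sorted(source_dir.rglob("*")):
        # Skip directories and symlinks; only inspect real files.
        if path.is_symlink() or not path.is_file():
            continue
        try:
            with path.open("rb") as f:
                if looks_like_elf(f.read(4)):
                    found.append(path)
        except OSError:
            # Unreadable files are ignored in this sketch.
            continue
    return found
```

A linter-style check could run ~find_elf_files~ on the unpacked source and warn whenever the result is non-empty.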

Speaking of industry, I don't think we leverage software to build software 
enough.

And by the way, none of those ideas would be possible if Guix weren't such a 
robust and sane ecosystem.

-- 
Katherine



Re: “What’s in a package”

2021-09-22 Thread Konrad Hinsen
Hi Katherine and Ludo,

> I appreciate this post very much. Setting aside questions of freedom,

+1

> This is perhaps a rehash of the "worse is better"[2] conversation, but
> I often struggle with deciding whether to do things the "fast" way, or
> the "correct" way. I think when your path is clear, the correct way
> will get you farther, faster. But when you're doing experiments, or
> exploratory programming, being bogged down with the "correct" way of
> doing things (i.e. Guix packages) might take a lot of time for no

Exactly. Most software engineering tools situate themselves somewhere on
the "fast" vs. "robust" scale, and defend their position as the one and
only Good Thing. Guix is at the "robust" end of the scale in the
software management category. And that's what I want for most of the
software I use, i.e. everything I don't hack on myself. Which is why I
like Guix :-)

What is so far insufficiently supported by computing technology is the
necessary transition from "fast" to "robust". There are a few
exceptions, such as programming languages with gradual typing. In most
situations, moving software from exploratory to robust involves a lot of
rewriting, often manually, with no tooling support.

> Bringing this back to Guix, and maybe the GNU philosophy, it has been
> very helpful for me to be able to leverage the flexibility of Guix to
> occasionally do things the "fast" way, perhaps by packaging a
> binary. Paradoxically, it has allowed me to stay within the Guix and
> free software ecosystem. In my opinion, flexibility is key to growing
> the ecosystem and community, and I would encourage Guix as a project
> to take every opportunity to give the user options.

+100 :-)

There is a lot we can improve here. Tutorials would be a good start.
Example: How do you package a binary in Guix? In particular, how do you
deal with binaries that have binary dependencies that they expect in
/lib etc.? A next step would be tool support: Grab whatever PyPI offers,
even if it's only binary wheels, and turn that into a Guix package.

Another aspect would be supporting software development moving from fast
to robust. Suppose I have software I compile by hand, or via a simple
Makefile, somewhere in my home directory. How do I go from there to (1)
a quick-and-dirty Guix package, then (2) a very basic publishable Guix
package and finally (3) a Guix package with tests and documentation?
The path should be supported by various tools, from automatic rewriting
to debugging. As an example, something I have wished for more than once
is the possibility to run the individual build steps of a Guix package
under my own account in my home directory, for debugging purposes.

Konrad
-- 
-
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: konrad DOT hinsen AT cnrs DOT fr
http://dirac.cnrs-orleans.fr/~hinsen/
ORCID: https://orcid.org/-0003-0330-9428
Twitter: @khinsen
-



Re: “What’s in a package”

2021-09-21 Thread Katherine Cox-Buday
Ludovic Courtès writes:

> Hello Guix!
>
> I and others are often disappointed (or angry!) when looking at the
> weaknesses of the most popular software deployment tools.  I felt that
> acutely after packaging PyTorch last month and felt the need to look
> more closely at what others are doing and to document our motivation,
> having put so much sweat in all these packages:
>
>   https://hpc.guix.info/blog/2021/09/whats-in-a-package/
>
> It’s probably no news to people here, but the packaging approach has a
> direct impact on verifiability, and thus on security and transparency,
> as expected from a scientific process.  The idea is to explain all that
> looking at the contents of packages, in particular for pip and CONDA.
>
> Feel free to share with non-Guix people and to comment!
>
> Ludo’.

I appreciate this post very much. Setting aside questions of freedom, and 
security -- both of which I value a lot -- the main benefit of Guix has, for 
me, been: simplicity (but not always ease)[1]. I.e., when trying to achieve a 
goal, it is a pain to package things that aren't yet packaged, but what I get 
in return are sane environments, deployments, and meta-data about all of these.

This is perhaps a rehash of the "worse is better"[2] conversation, but I often 
struggle with deciding whether to do things the "fast" way, or the "correct" 
way. I think when your path is clear, the correct way will get you farther, 
faster. But when you're doing experiments, or exploratory programming, being 
bogged down with the "correct" way of doing things (i.e. Guix packages) might 
take a lot of time for no benefit. E.g. maybe you end up packaging a cluster of 
things that you find out don't work out for you. Of course the challenge is: if 
you choose the fast way, and it works out, do you go back to do it the correct
way so that you're on sound footing?

Bringing this back to Guix, and maybe the GNU philosophy, it has been very 
helpful for me to be able to leverage the flexibility of Guix to occasionally 
do things the "fast" way, perhaps by packaging a binary. Paradoxically, it has 
allowed me to stay within the Guix and free software ecosystem. In my opinion, 
flexibility is key to growing the ecosystem and community, and I would 
encourage Guix as a project to take every opportunity to give the user options.

[1] - https://www.infoq.com/presentations/Simple-Made-Easy/
[2] - https://en.wikipedia.org/wiki/Worse_is_better

-- 
Katherine



Re: “What’s in a package”

2021-09-21 Thread Arun Isaac

Hi Ludo,

>   https://hpc.guix.info/blog/2021/09/whats-in-a-package/

Thanks for writing this article! This article will be very useful to
share with others who might think our insistence on auditability,
reproducibility and software freedom to be bordering on the
pedantic. Frankly, I find it quite mysterious why Guix is not more
widely adopted in science. But, I suppose we will get there, in time!
:-)

Cheers,
Arun




“What’s in a package”

2021-09-20 Thread Ludovic Courtès
Hello Guix!

I and others are often disappointed (or angry!) when looking at the
weaknesses of the most popular software deployment tools.  I felt that
acutely after packaging PyTorch last month and felt the need to look
more closely at what others are doing and to document our motivation,
having put so much sweat in all these packages:

  https://hpc.guix.info/blog/2021/09/whats-in-a-package/

It’s probably no news to people here, but the packaging approach has a
direct impact on verifiability, and thus on security and transparency,
as expected from a scientific process.  The idea is to explain all that
looking at the contents of packages, in particular for pip and CONDA.

Feel free to share with non-Guix people and to comment!

Ludo’.