Re: Dealing with language bindings for libraries.

2018-05-10 Thread Konrad Hinsen

On 09/05/2018 20:00, Julien Lepiller wrote:


> We already have such a case: capstone and python-capstone. There is no
> redundancy since python-capstone knows how to load the shared library
> created in the capstone package. So we have two packages, with the same
> [...]


My situation is a bit different. The package I am working on (OpenBabel, 
http://openbabel.org/wiki/Main_Page) has a monolithic build process 
based on CMake that builds the library plus bindings for all languages 
that it detects in its environment. There is no obvious way to build the 
language bindings to go with an already installed library.


It's almost trivial to package in Guix for any language combination you 
care about: just add the languages you need to the inputs. It's also 
straightforward to make each language binding a separate output of the 
package. But choosing this option makes building the package very expensive.


Konrad.



Re: Dealing with language bindings for libraries.

2018-05-10 Thread Catonano
2018-05-10 11:27 GMT+02:00 Fis Trivial :

>
> Catonano writes:
>
> > 2018-05-09 17:21 GMT+02:00 Fis Trivial :
> >
> >>
> >> Hi, Guixs.
> >>
> >> Recently I encountered some libraries that are written in C++ and have
> >> multiple language bindings (R, Python, Java), each with its own build
> >> system. All the bindings are in tree. During the build, one first
> >> builds the C++ part with cmake, then changes into each language
> >> binding subdirectory and invokes its build system.
> >>
> >> For packaging them in Guix, one option is to build each binding as an
> >> independent package, each with its own dependencies for the specific
> >> binding plus the common dependencies of the C++ part. That way we would
> >> have N independent packages for N language bindings. But it would
> >> waste a lot of computing resources, and I don't want to waste the
> >> precious computing time of Guix's build farm.
> >>
> >
> > Maybe I'm being naive, but I don't understand why this would involve any
> > further overhead
> >
> > You could have the C++ part as a dynamically linked library and the
> > bindings would have it as a dependency
> >
> > What's the overhead in this ?
>
> Thanks for the reply.
>
> This is somewhat like dealing with git submodules: people just include
> the submodule's directory in their build system, for example in a cmake
> script:
>
> add_subdirectory(submodule_directory)
>
> If you want to split it out when packaging, you need to write a cmake
> config script for that submodule and then patch the outer cmake file to
> replace `add_subdirectory` with `find_package`. I don't think there is
> a shortcut for this, but upstream might welcome such a change.
>
>
> In the case of language bindings, the build code for the bindings tries
> to find the shared object and other metadata required for the build
> inside the build tree, not in the system path.


Ah, now I see what you mean.

[...]
> So, to make the C++ part a shared object and let the bindings find it
> at compile time, I would need to rewrite all the build code for the
> bindings, which I don't consider necessary for either upstream or
> packaging.
>

In my view, this is a form of bundling.

Some projects bundle libraries.

This makes life harder for packagers.

It's frowned upon and considered a bad habit.

In fact, the Guix manual has a list of checks to do when you review
a new package submitted to the mailing list.

One of the checks is to see whether the package _bundles_ anything and,
if it does, to try to unbundle it and link to system-provided resources
instead.

I think this case is similar.

I understand it's tedious, but I think the right thing to do would be to
unbundle in this case too.

Complicating the semantics of packages (what a package is) is risky
business.

Of course I'm the least competent system engineer and packager worldwide,
and this is just my opinion.


Re: Dealing with language bindings for libraries.

2018-05-10 Thread Fis Trivial

Catonano writes:

> 2018-05-09 17:21 GMT+02:00 Fis Trivial :
>
>>
>> Hi, Guixs.
>>
>> Recently I encountered some libraries that are written in C++ and have
>> multiple language bindings (R, Python, Java), each with its own build
>> system. All the bindings are in tree. During the build, one first
>> builds the C++ part with cmake, then changes into each language
>> binding subdirectory and invokes its build system.
>>
>> For packaging them in Guix, one option is to build each binding as an
>> independent package, each with its own dependencies for the specific
>> binding plus the common dependencies of the C++ part. That way we would
>> have N independent packages for N language bindings. But it would
>> waste a lot of computing resources, and I don't want to waste the
>> precious computing time of Guix's build farm.
>>
>
> Maybe I'm being naive, but I don't understand why this would involve any
> further overhead
>
> You could have the C++ part as a dynamically linked library and the
> bindings would have it as a dependency
>
> What's the overhead in this ?

Thanks for the reply.

This is somewhat like dealing with git submodules: people just include
the submodule's directory in their build system, for example in a cmake
script:

add_subdirectory(submodule_directory)

If you want to split it out when packaging, you need to write a cmake
config script for that submodule and then patch the outer cmake file to
replace `add_subdirectory` with `find_package`. I don't think there is
a shortcut for this, but upstream might welcome such a change.
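
In a Guix package definition, replacing `add_subdirectory` with
`find_package` could be done in a build phase. The following is only a
sketch under assumptions: the packages `foobar` and `libsubmodule`, the
CMake package name `Submodule`, the URLs and the hashes are all
hypothetical placeholders, and it presumes that the submodule already
installs a cmake config script that `find_package` can locate.

```scheme
(use-modules (guix packages)
             (guix download)
             (guix build-system cmake)
             ((guix licenses) #:prefix license:))

;; Hypothetical helper: a placeholder origin for illustration only.
(define (placeholder-source name)
  (origin
    (method url-fetch)
    (uri (string-append "https://example.org/" name "-1.0.tar.gz"))
    (sha256
     (base32
      "0000000000000000000000000000000000000000000000000000"))))

;; Hypothetical package for the former submodule, built on its own.
(define-public libsubmodule
  (package
    (name "libsubmodule")
    (version "1.0")
    (source (placeholder-source "libsubmodule"))
    (build-system cmake-build-system)
    (home-page "https://example.org/libsubmodule")
    (synopsis "Hypothetical library that used to be a submodule")
    (description "Placeholder package used for illustration only.")
    (license license:expat)))

;; Hypothetical outer project that used to call add_subdirectory.
(define-public foobar
  (package
    (name "foobar")
    (version "1.0")
    (source (placeholder-source "foobar"))
    (build-system cmake-build-system)
    (inputs
     `(("libsubmodule" ,libsubmodule)))
    (arguments
     `(#:phases
       (modify-phases %standard-phases
         ;; Patch the outer CMakeLists.txt so that it looks for the
         ;; separately packaged library instead of the in-tree copy.
         (add-after 'unpack 'use-packaged-submodule
           (lambda _
             (substitute* "CMakeLists.txt"
               (("add_subdirectory\\(submodule_directory\\)")
                "find_package(Submodule REQUIRED)"))
             #t)))))
    (home-page "https://example.org/foobar")
    (synopsis "Hypothetical project with an unbundled submodule")
    (description "Placeholder package used for illustration only.")
    (license license:expat)))
```

The phase only rewrites the one line; any target names that the outer
build refers to would still have to match what the config script exports.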


In the case of language bindings, the build code for the bindings tries
to find the shared object and other metadata required for the build
inside the build tree, not in the system path. And by "tries to find", I
mean that it first invokes the C++ build tool (cmake, autotools, etc.),
then walks through the build tree looking for the shared object. I hope
the following listing makes it a bit clearer what happens when building
without a package manager:

1. Build the C++ part in the project root, using
   /path-to-project/configure:

   $ ./configure && make

2. Change into the language binding subdirectory:

   $ cd /path-to-project/python_binding

   a. Invoke the build:

      $ python setup.py build

   b. The build code for the binding (setup.py in this example) first
      invokes make on ../makefile.

   c. The C++ part has already been built manually in step 1, so
      make has nothing to do.

   d. The build code (setup.py) walks through the build tree to find
      the shared object built by make, and other required metadata.

   e. The binding is built inside
      /path-to-project/python_binding.

3. Repeat step 2 for the other language bindings (replacing setup.py
   and the directory name with the corresponding build script and
   directory name).

So, to make the C++ part a shared object and let the bindings find it
at compile time, I would need to rewrite all the build code for the
bindings, which I don't consider necessary for either upstream or
packaging.
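
For comparison, the in-tree steps listed above could be driven from a
single Guix package with a separate "python" output. This is a rough
sketch, not a tested recipe: the package `foobar`, the URL, the hash and
the `python_binding` directory are placeholders, the setup.py is assumed
to use setuptools, and making the installed binding locate the shared
library in the "out" output is not addressed.

```scheme
(use-modules (guix packages)
             (guix download)
             (guix build-system gnu)
             ((guix licenses) #:prefix license:)
             (gnu packages python))

(define-public foobar
  (package
    (name "foobar")
    (version "1.0")
    (source (origin
              (method url-fetch)
              ;; Placeholder URL and hash.
              (uri "https://example.org/foobar-1.0.tar.gz")
              (sha256
               (base32
                "0000000000000000000000000000000000000000000000000000"))))
    (build-system gnu-build-system)
    ;; One output for the C++ library, one for the Python binding.
    (outputs '("out" "python"))
    (native-inputs
     `(("python" ,python)))
    (arguments
     `(#:phases
       (modify-phases %standard-phases
         ;; Step 2: the standard 'build phase has already run make, so
         ;; setup.py finds the shared object in the build tree.
         (add-after 'build 'build-python-binding
           (lambda _
             (with-directory-excursion "python_binding"
               (invoke "python3" "setup.py" "build"))
             #t))
         ;; Install the binding into its own output.
         (add-after 'install 'install-python-binding
           (lambda* (#:key outputs #:allow-other-keys)
             (with-directory-excursion "python_binding"
               (invoke "python3" "setup.py" "install"
                       (string-append "--prefix="
                                      (assoc-ref outputs "python"))
                       ;; setuptools-specific flags (assumption).
                       "--single-version-externally-managed"
                       "--root=/"))
             #t)))))
    (home-page "https://example.org/foobar")
    (synopsis "Hypothetical C++ library with an in-tree Python binding")
    (description "Placeholder package used for illustration only.")
    (license license:expat)))
```

With this layout every build needs the dependencies of all bindings, but
users of substitutes can install only the output they want.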



Re: Dealing with language bindings for libraries.

2018-05-10 Thread Catonano
2018-05-09 17:21 GMT+02:00 Fis Trivial :

>
> Hi, Guixs.
>
> Recently I encountered some libraries that are written in C++ and have
> multiple language bindings (R, Python, Java), each with its own build
> system. All the bindings are in tree. During the build, one first
> builds the C++ part with cmake, then changes into each language
> binding subdirectory and invokes its build system.
>
> For packaging them in Guix, one option is to build each binding as an
> independent package, each with its own dependencies for the specific
> binding plus the common dependencies of the C++ part. That way we would
> have N independent packages for N language bindings. But it would
> waste a lot of computing resources, and I don't want to waste the
> precious computing time of Guix's build farm.
>

Maybe I'm being naive, but I don't understand why this would involve any
further overhead

You could have the C++ part as a dynamically linked library and the
bindings would have it as a dependency

What's the overhead in this ?


Re: Dealing with language bindings for libraries.

2018-05-09 Thread Fis Trivial

Ricardo Wurmus writes:

> Fis Trivial  writes:
>
>>> We can also
>>> reuse parts of build systems without having to reimplement them
>>> manually.  We would simply reference them with something like this:
>>>
>>> --8<---cut here---start->8---
>>>   (add-after 'install 'strip-jar-timestamps
>>>     (assoc-ref ant:%standard-phases 'strip-jar-timestamps))
>>> --8<---cut here---end--->8---
>>>
>> Oh, I hadn't thought about that before, thanks. But would something
>> similar to this be nicer?
>>
>> --8<---cut here---start->8---
>> (define-public foobar
>>   (package
>>     (name "foobar")
>>     (source (origin ... ))
>>     (build-system cmake-build-system)
>>     (output "python"                  ; builds foobar-python
>>       `(package/inherit foobar
>>          (name "foobar-python")
>>          (source (getcwd))
>>          (build-system python-build-system)
>>          (inputs
>>           `(,@(package-inputs foobar)
>>             ("pytest" ,pytest)))
>>          (arguments
>>           `(#:phases
>>             (modify-phases %standard-phases
>>               (add-before 'configure 'cd
>>                 (lambda* _
>>                   (chdir "./python"))))))))
>>     (home-page "https://foobar.html")
>>     (license ...)))
>> --8<---cut here---end--->8---
>
> An output is not a package and it does not have its own
> inputs.  Nor can a package be recursively defined (here foobar refers to
> foobar itself).
>

Yes, I know. It's just an example I came up with to start a discussion.
I would argue that an output is, at least conceptually, a package. I
don't have much experience with RPM, but I know that it has source
packages and binary packages. To me, in Guix, the Scheme package
definition combined with the upstream source corresponds to an RPM
source package, and the built store output corresponds to an RPM binary
package.

As for the recursion problem, I get that. I haven't really implemented
anything yet, so it's just a thought experiment. I have a hard time
reading Scheme code: thanks to syntax rules, diving into a new project
is like learning a new language by reading its compiler's source code,
which is itself written in that new language. I will take my time to get
into it. Not important here.

> That’s not how the package DSL works, and I don’t think it should work
> like this.

Why do you think it shouldn't work like this? I hope some of the core
members (you people are better program designers than me) can discuss
this. If it's an implementable and acceptable feature, I will try harder
to get into the source code and help.

>
> You cannot have it both ways: include inputs conditionally *and* have
> the thing be one and the same package.  You *can*, however, define a
> procedure that generates closely related packages.  But then these are
> separate packages and don’t share the same build environment.
>

Actually, this topic, optional dependencies, has come up a few times now.
It can be done in Nix and RPM. I am not saying Guix should do what
others have done, but it makes a lot of sense to have that in the toolbox.

>> I read the package definition of python-capstone as pointed out by
>> Julien Lepiller, thanks. It requires manipulating the Python build code
>> to achieve the effect. It's true that we can do that by inspecting the
>> build code, but these language bindings are designed to be built in the
>> source tree, so I don't think the python-capstone approach should be
>> adopted as a universal solution.
>
> I don’t think there *can* be a universal solution.  I’ve seen both kinds
> of packages in the past; the solution depends on the build system.

Maybe not. But maybe we can discuss some possible improvements for the
future. :)

Thanks.


Re: Dealing with language bindings for libraries.

2018-05-09 Thread Ricardo Wurmus

Fis Trivial  writes:

> An ideal scenario would be one where we can specify multiple outputs
> for one package, each output corresponding to one language binding, and
> where we can specify different dependencies and a different build
> system for each output. Is there any chance we can do that in Guix?

Yes, we already use multiple outputs in some packages IIRC.  We can also
reuse parts of build systems without having to reimplement them
manually.  We would simply reference them with something like this:

--8<---cut here---start->8---
  (add-after 'install 'strip-jar-timestamps
    (assoc-ref ant:%standard-phases 'strip-jar-timestamps))
--8<---cut here---end--->8---

Using separate independent outputs means that users who fetch
substitutes can avoid fetching irrelevant substitutes.  This doesn’t
help users who build everything from source as they’ll need to have the
dependencies for all language bindings.

Depending on the way this is implemented we can also have a common
package and use that as an input to the separate language binding
packages.  There would be no wasted cycles as the common parts would not
need to be rebuilt.

--
Ricardo




Re: Dealing with language bindings for libraries.

2018-05-09 Thread Ricardo Wurmus

Fis Trivial  writes:

>> We can also
>> reuse parts of build systems without having to reimplement them
>> manually.  We would simply reference them with something like this:
>>
>> --8<---cut here---start->8---
>>   (add-after 'install 'strip-jar-timestamps
>>     (assoc-ref ant:%standard-phases 'strip-jar-timestamps))
>> --8<---cut here---end--->8---
>>
> Oh, I hadn't thought about that before, thanks. But would something
> similar to this be nicer?
>
> --8<---cut here---start->8---
> (define-public foobar
>   (package
>     (name "foobar")
>     (source (origin ... ))
>     (build-system cmake-build-system)
>     (output "python"                  ; builds foobar-python
>       `(package/inherit foobar
>          (name "foobar-python")
>          (source (getcwd))
>          (build-system python-build-system)
>          (inputs
>           `(,@(package-inputs foobar)
>             ("pytest" ,pytest)))
>          (arguments
>           `(#:phases
>             (modify-phases %standard-phases
>               (add-before 'configure 'cd
>                 (lambda* _
>                   (chdir "./python"))))))))
>     (home-page "https://foobar.html")
>     (license ...)))
> --8<---cut here---end--->8---

That’s not how the package DSL works, and I don’t think it should work
like this.  An output is not a package and it does not have its own
inputs.  Nor can a package be recursively defined (here foobar refers to
foobar itself).

You cannot have it both ways: include inputs conditionally *and* have
the thing be one and the same package.  You *can*, however, define a
procedure that generates closely related packages.  But then these are
separate packages and don’t share the same build environment.
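
A procedure that generates closely related packages could look roughly
like the sketch below. All names here are assumptions for illustration:
the package `foobar`, the procedure `foobar-binding`, the subdirectory
names, the URL and the hash are hypothetical. Making each binding
actually find the installed library instead of the build tree is left
out, since that depends on the upstream build code.

```scheme
(use-modules (guix packages)
             (guix download)
             (guix build-system cmake)
             (guix build-system python)
             (guix build-system r)
             ((guix licenses) #:prefix license:))

;; Hypothetical C++ library, built once with CMake.
(define-public foobar
  (package
    (name "foobar")
    (version "1.0")
    (source (origin
              (method url-fetch)
              ;; Placeholder URL and hash.
              (uri "https://example.org/foobar-1.0.tar.gz")
              (sha256
               (base32
                "0000000000000000000000000000000000000000000000000000"))))
    (build-system cmake-build-system)
    (home-page "https://example.org/foobar")
    (synopsis "Hypothetical C++ library")
    (description "Placeholder package used for illustration only.")
    (license license:expat)))

;; Return a separate package for the binding that lives in SUBDIR of the
;; foobar source tree and is built with BINDING-BUILD-SYSTEM.  The common
;; library is an input, so it is not rebuilt for each binding.
(define (foobar-binding prefix subdir binding-build-system)
  (package
    (inherit foobar)
    (name (string-append prefix "-foobar"))
    (build-system binding-build-system)
    (inputs
     `(("foobar" ,foobar)))
    (arguments
     `(#:phases
       (modify-phases %standard-phases
         ;; Build only the binding, from its own subdirectory.
         (add-after 'unpack 'enter-binding-directory
           (lambda _
             (chdir ,subdir)
             #t)))))))

(define-public python-foobar
  (foobar-binding "python" "python" python-build-system))

(define-public r-foobar
  (foobar-binding "r" "R" r-build-system))
```

Each generated package has its own inputs and its own build environment;
only the definition is shared.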

> I read the package definition of python-capstone as pointed out by
> Julien Lepiller, thanks. It requires manipulating the Python build code
> to achieve the effect. It's true that we can do that by inspecting the
> build code, but these language bindings are designed to be built in the
> source tree, so I don't think the python-capstone approach should be
> adopted as a universal solution.

I don’t think there *can* be a universal solution.  I’ve seen both kinds
of packages in the past; the solution depends on the build system.

--
Ricardo




Re: Dealing with language bindings for libraries.

2018-05-09 Thread Fis Trivial

Ricardo Wurmus writes:

> Fis Trivial  writes:
>
>> An ideal scenario would be one where we can specify multiple outputs
>> for one package, each output corresponding to one language binding, and
>> where we can specify different dependencies and a different build
>> system for each output. Is there any chance we can do that in Guix?
>
> Yes, we already use multiple outputs in some packages IIRC.

That's how I learned the term "output". :)

> We can also
> reuse parts of build systems without having to reimplement them
> manually.  We would simply reference them with something like this:
>
> --8<---cut here---start->8---
>   (add-after 'install 'strip-jar-timestamps
>     (assoc-ref ant:%standard-phases 'strip-jar-timestamps))
> --8<---cut here---end--->8---
>
Oh, I hadn't thought about that before, thanks. But would something
similar to this be nicer?

--8<---cut here---start->8---
(define-public foobar
  (package
    (name "foobar")
    (source (origin ... ))
    (build-system cmake-build-system)
    (output "python"                  ; builds foobar-python
      `(package/inherit foobar
         (name "foobar-python")
         (source (getcwd))
         (build-system python-build-system)
         (inputs
          `(,@(package-inputs foobar)
            ("pytest" ,pytest)))
         (arguments
          `(#:phases
            (modify-phases %standard-phases
              (add-before 'configure 'cd
                (lambda* _
                  (chdir "./python"))))))))
    (home-page "https://foobar.html")
    (license ...)))
--8<---cut here---end--->8---

> Using separate independent outputs means that users who fetch
> substitutes can avoid fetching irrelevant substitutes.  This doesn’t
> help users who build everything from source as they’ll need to have the
> dependencies for all language bindings.

On the other hand, it will help people who don't build everything from
source. I would say that's a large number of people, including me.

>
> Depending on the way this is implemented we can also have a common
> package and use that as an input to the separate language binding
> packages.  There would be no wasted cycles as the common parts would not
> need to be rebuilt.

I read the package definition of python-capstone as pointed out by
Julien Lepiller, thanks. It requires manipulating the Python build code
to achieve the effect. It's true that we can do that by inspecting the
build code, but these language bindings are designed to be built in the
source tree, so I don't think the python-capstone approach should be
adopted as a universal solution.


Re: Dealing with language bindings for libraries.

2018-05-09 Thread Julien Lepiller
On Wed, 9 May 2018 18:25:13 +0200,
Konrad Hinsen wrote:

> On 09/05/2018 17:21, Fis Trivial wrote:
> 
> > An ideal scenario would be one where we can specify multiple
> > outputs for one package, each output corresponding to one language
> > binding, and where we can specify different dependencies and a
> > different build system for each output. Is there any chance we can
> > do that in Guix?
> 
> +1
> 
> I am currently in exactly the same situation.
> 
> Konrad.

We already have such a case: capstone and python-capstone. There is no
redundancy since python-capstone knows how to load the shared library
created in the capstone package. So we have two packages, with the same
build time as with one package, but much easier to create. The only
difficulty is to make the binding find the library (but the same would
be true for different outputs I guess).

HTH



Re: Dealing with language bindings for libraries.

2018-05-09 Thread Konrad Hinsen

On 09/05/2018 17:21, Fis Trivial wrote:


> An ideal scenario would be one where we can specify multiple outputs
> for one package, each output corresponding to one language binding, and
> where we can specify different dependencies and a different build
> system for each output. Is there any chance we can do that in Guix?


+1

I am currently in exactly the same situation.

Konrad.