Re: Guix Data Service - September update

2019-10-06 Thread Ludovic Courtès
Howdy!

Alex Sassmannshausen  skribis:

> Christopher Baines  writes:

[...]

>> So I've got an initial thing working for the version histories now. You
>> can construct a URL like [1], which will show a table about the known
>> versions of the package (icecat in this case) on the master branch.
>>
>> 1: http://data.guix.gnu.org/repository/1/branch/master/package/icecat
>>
>> The same data is available in JSON [2], and that might work for getting
>> the data in the hpcguix-web service.
>>
>> 2: http://data.guix.gnu.org/repository/1/branch/master/package/icecat.json
>
> This is incredibly cool.

Seconded, and the timeline looks nice too!

That means we could build UIs along the lines of:

  guix pull --when=icecat=60.5

Very cool!

Ludo’.



Re: Guix Data Service - September update

2019-10-02 Thread Alex Sassmannshausen


Christopher Baines  writes:

> Ludovic Courtès  writes:
>
>> That would be great.  In the end, it seems to me that there are quite a
>> few services we could build around the Data Service.  I’m not sure how
>> they should interact.
>>
>> For instance, Mumi could talk to data.guix.gnu.org over an HTTP API, or
>> should we replicate the database at issues.guix.gnu.org so that Mumi can
>> tap directly into it?
>>
>> Likewise, how should something like hpcguix-web (the package browser at
>> ) exploit available data, for instance to
>> show the history of package versions?
>
> So I've got an initial thing working for the version histories now. You
> can construct a URL like [1], which will show a table about the known
> versions of the package (icecat in this case) on the master branch.
>
> 1: http://data.guix.gnu.org/repository/1/branch/master/package/icecat
>
> The same data is available in JSON [2], and that might work for getting
> the data in the hpcguix-web service.
>
> 2: http://data.guix.gnu.org/repository/1/branch/master/package/icecat.json

This is incredibly cool.

Suddenly I understand how useful the Data Service could turn out to be!

Alex



Re: Guix Data Service - September update

2019-10-02 Thread Christopher Baines

Ludovic Courtès  writes:

> That would be great.  In the end, it seems to me that there are quite a
> few services we could build around the Data Service.  I’m not sure how
> they should interact.
>
> For instance, Mumi could talk to data.guix.gnu.org over an HTTP API, or
> should we replicate the database at issues.guix.gnu.org so that Mumi can
> tap directly into it?
>
> Likewise, how should something like hpcguix-web (the package browser at
> ) exploit available data, for instance to
> show the history of package versions?

So I've got an initial thing working for the version histories now. You
can construct a URL like [1], which will show a table about the known
versions of the package (icecat in this case) on the master branch.

1: http://data.guix.gnu.org/repository/1/branch/master/package/icecat

The same data is available in JSON [2], and that might work for getting
the data in the hpcguix-web service.

2: http://data.guix.gnu.org/repository/1/branch/master/package/icecat.json

Fetching the data for individual packages definitely won't work well for
all applications, so I'm definitely open to exposing the data in other
ways as well.
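
As a rough illustration of how a client such as hpcguix-web might consume
the JSON endpoint [2], here is a minimal Python sketch. The helper names are
made up for this example, and the layout of the returned JSON is not
described in this thread, so the code only decodes the response and leaves
interpreting it to the caller.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://data.guix.gnu.org"


def package_history_url(package, branch="master", repository_id=1, as_json=True):
    """Build the per-package version history URL (hypothetical helper)."""
    path = "/repository/{}/branch/{}/package/{}".format(
        repository_id, branch, quote(package, safe="")
    )
    return BASE_URL + path + (".json" if as_json else "")


def fetch_package_history(package, branch="master", timeout=30):
    """Fetch and decode the JSON document; its structure is not assumed here."""
    url = package_history_url(package, branch)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_package_history("icecat") would request URL [2] above. Since
this makes one request per package, it fits the caveat about per-package
fetching not suiting every application.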

Chris




Re: Guix Data Service - September update

2019-09-16 Thread zimoun
Hi Chris,

On Sat, 14 Sep 2019 at 12:52, Christopher Baines  wrote:

> Accessing the history of package versions isn't possible yet, but this
> is something I can look at adding, the information is there in the
> database.

This should be awesome. :-)

It often arises when one wants to reproduce a scientific paper
providing the versions of the tools used.

For example see [1].
Well, it is not easy [2]: locate in which .scm file the package is
defined, then checkout the Guix repo and `git log` this file.
And it is even more error-prone if the package has moved to a different
.scm file (e.g., the recent haskell-xyz move).

[1] https://lists.gnu.org/archive/html/help-guix/2019-06/msg00094.html
[2] https://lists.gnu.org/archive/html/help-guix/2019-06/msg00098.html


Well the access to the information will ease the time travel. :-)


Thank you for this initiative even if I am not sure I clearly
understand yet what the Guix Data Service is. :-)


All the best,
simon



Re: Guix Data Service - September update

2019-09-14 Thread Christopher Baines

Ludovic Courtès  writes:

> Hi Chris,
>
> Christopher Baines  skribis:
>
>> For processing jobs, the records in the database used to be deleted when the
>> job was completed, but now the records are kept and there's a page showing
>> jobs [4].
>>
>> 4: http://milano-guix-1.di.unimi.it:8765/jobs
>
> Nice!  The page only shows completed jobs, not queued jobs, right?

It currently shows all jobs: queued, running and completed (based on
the events relating to the job in the database).

>>  - I want to get back to making progress on automating code review for Guix
>>    patches, this was one of the main motivations for getting lint warnings in
>>    the database and on to the comparison page
>
> That would be great.  In the end, it seems to me that there are quite a
> few services we could build around the Data Service.  I’m not sure how
> they should interact.
>
> For instance, Mumi could talk to data.guix.gnu.org over an HTTP API, or
> should we replicate the database at issues.guix.gnu.org so that Mumi can
> tap directly into it?

It's probably better to use some standard interface like a HTTP API
rather than the database directly. What data were you thinking would be
useful for Mumi?

> Likewise, how should something like hpcguix-web (the package browser at
> ) exploit available data, for instance to
> show the history of package versions?

There are some URLs that can be used to access data, for example this
URL should return packages for the latest revision of the master branch
[1].

1: http://data.guix.gnu.org/repository/1/branch/master/latest-processed-revision/packages.json?all_results=on

Accessing the history of package versions isn't possible yet, but this
is something I can look at adding, the information is there in the
database.
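
For anyone wanting to script against that URL, a minimal Python sketch of
building it follows. The function name and its parameters are made up for
this example; only the path and the all_results=on query parameter come from
the URL above, and the API may well change.

```python
from urllib.parse import urlencode

BASE_URL = "http://data.guix.gnu.org"


def latest_packages_url(branch="master", repository_id=1, all_results=True):
    """Build the packages.json URL for a branch's latest processed revision.

    Hypothetical helper: it only mirrors the URL shape shown in [1].
    """
    path = "/repository/{}/branch/{}/latest-processed-revision/packages.json".format(
        repository_id, branch
    )
    # all_results=on asks for the full list, as in the example URL.
    query = urlencode({"all_results": "on"}) if all_results else ""
    return BASE_URL + path + ("?" + query if query else "")
```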

>>  - The Guix package and service definitions haven't been merged, so I want to
>>    look at that once the location for the Git repository is sorted out
>
> Looks like it’s done now:
> .  :-)

Yep, and I've gone ahead and reconfigured milano-guix-1, so
http://data.guix.gnu.org/ is now the URL.

> Are there specific areas where you’d like help?

Yes, or rather I have far more ideas than time. I need to get my
thoughts in order though and write them down somewhere, maybe a ROADMAP
file in the repository, similar to the one in the main Guix
repository...

> Would you encourage people to start and hack tools or services that
> build upon the available data?

Yes, although I'm still unsure how stable the API will be, so that's an
important thing to keep in mind.




Re: Guix Data Service - September update

2019-09-11 Thread Ludovic Courtès
Hi Chris,

Christopher Baines  skribis:

> For processing jobs, the records in the database used to be deleted when the
> job was completed, but now the records are kept and there's a page showing
> jobs [4].
>
> 4: http://milano-guix-1.di.unimi.it:8765/jobs

Nice!  The page only shows completed jobs, not queued jobs, right?

> Most recently, lint warnings for lint checkers that don't require network
> access are stored in the database. The warnings are displayed on the revision
> page, and included on the compare page (for example [5]). This follows on from
> the changes that I started talking about here [6].

Nice.  The under-the-hood changes you mentioned above are also really
cool, it seems to be a solid base now.

> In terms of what's next:
>
>  - I've started writing a proposal for the upcoming Outreachy round relating
>    to internationalisation in the Guix Data Service

Yay!

>  - I want to get back to making progress on automating code review for Guix
>    patches, this was one of the main motivations for getting lint warnings in
>    the database and on to the comparison page

That would be great.  In the end, it seems to me that there are quite a
few services we could build around the Data Service.  I’m not sure how
they should interact.

For instance, Mumi could talk to data.guix.gnu.org over an HTTP API, or
should we replicate the database at issues.guix.gnu.org so that Mumi can
tap directly into it?

Likewise, how should something like hpcguix-web (the package browser at
) exploit available data, for instance to
show the history of package versions?

>  - I want to provide public dumps of the database from milano-guix-1, as well
>    as a small extract of that database. I think restoring a database locally
>    is a good way to get data for local development
>
>  - Relating to the Outreachy proposal but also generally, I want to write some
>    documentation on how to get the Guix Data Service running locally
>
>  - Currently the Git repository is on my personal Git server, and there are
>    discussions about moving it to Savannah
>
>  - The Guix package and service definitions haven't been merged, so I want to
>    look at that once the location for the Git repository is sorted out

Looks like it’s done now:
.  :-)

Are there specific areas where you’d like help?  Would you encourage
people to start and hack tools or services that build upon the available
data?

Thank you!

Ludo’.



Guix Data Service - September update

2019-09-08 Thread Christopher Baines
Hey,

I think I sent out the last update about the Guix Data Service back in May
[1], and quite a few things have changed since then. This is a summary of
changes since then, and a list of things that I'm interested in looking at
next.

1: https://lists.gnu.org/archive/html/guix-devel/2019-05/msg00332.html
   More progress with the Guix Data Service (17th of May)

I ran out of disk space on the server I'd been using to run the Guix Data
Service [2] so I removed it. Thanks to a kind donation of a machine from
UNIMI-DI through Giovanni Biscuolo, I deployed the Guix Data Service to
milano-guix-1, which has a lot more resources. It was down recently due to a
disk failure, but it's back online now [3].

2: https://prototype-guix-data-service.cbaines.net/ (no longer used)
3: http://milano-guix-1.di.unimi.it:8765/

It's not currently running on the standard ports, as I haven't got around to
setting up nginx yet, but that's something I'm looking to work on soon. Quite
a few things have changed (and hopefully improved) in the last few months,
I've tried to give a summary below, but it's probably easier just to have a
look at it running on milano-guix-1 [3].

There have been many changes to the user interface, the index page has changed
to show a list of branches (rather than some revisions and jobs), there are
more links between pages, and some pages now link to cgit where useful. The
404 pages have been improved and cache headers are now set as well.

For processing jobs, the records in the database used to be deleted when the
job was completed, but now the records are kept and there's a page showing
jobs [4]. Also, the code now supports processing jobs without container
support for inferiors in Guix, and can process jobs in parallel, prioritising
the latest revision for each branch. Separate processes are used for each job
to allow concurrency, as well as improving memory management as those
processes exit when the job is finished. The log handling for jobs is also
more efficient.

4: http://milano-guix-1.di.unimi.it:8765/jobs
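
To make the prioritisation concrete, here is a minimal Python sketch of
ordering jobs so that the latest revision for each branch is picked up
first. The Job record and its fields are assumptions made for this example
and do not reflect the actual database schema or the real implementation,
which runs each job in its own process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    branch: str
    commit: str
    received_order: int  # larger means the revision arrived more recently


def prioritise(jobs):
    """Return jobs with the newest revision of each branch first."""
    latest = {}
    for job in jobs:
        current = latest.get(job.branch)
        if current is None or job.received_order > current.received_order:
            latest[job.branch] = job
    heads = set(latest.values())
    # False sorts before True, so the latest revision of every branch comes
    # first; within each group, newer revisions come before older ones.
    return sorted(jobs, key=lambda j: (j not in heads, -j.received_order))
```

A worker pool could then take jobs from the front of this list and start
one process per job, so that memory is released when each process exits.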

In terms of the Guix service, Sqitch is integrated in to the guix-data-service
script that provides the web server, so database migrations can be
automatically run on startup. There's also an option to create a pid file,
which is useful as it prevents the jobs process from starting until migrations
have been applied to the database.

I investigated why the comparison function was broken, and it turned out that
some unique constraints didn't work as intended in the case of columns with
NULL values, and the queries around inserting data failed in a similar
way. This is now handled properly, and there are migrations to remove the
duplicate values from the database. This was breaking some of the comparison
functionality.
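
The NULL behaviour can be reproduced in a few lines. The sketch below uses
SQLite through Python's standard library purely as a stand-in database, and
the table and column names are made up for this example; it shows a unique
constraint letting duplicate rows through when one of the columns is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE example_versions ("
    " name TEXT NOT NULL, version TEXT, UNIQUE (name, version))"
)

# Two rows with the same name and a NULL version are both accepted,
# because the constraint never treats one NULL as equal to another.
conn.execute("INSERT INTO example_versions VALUES ('icecat', NULL)")
conn.execute("INSERT INTO example_versions VALUES ('icecat', NULL)")
null_rows = conn.execute(
    "SELECT COUNT(*) FROM example_versions WHERE version IS NULL"
).fetchone()[0]

# With a non-NULL value the constraint rejects the duplicate as expected.
conn.execute("INSERT INTO example_versions VALUES ('icecat', '60.5')")
try:
    conn.execute("INSERT INTO example_versions VALUES ('icecat', '60.5')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Here null_rows ends up as 2 while the second '60.5' insert is rejected,
which is the kind of duplication the migrations had to clean up.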

Most recently, lint warnings for lint checkers that don't require network
access are stored in the database. The warnings are displayed on the revision
page, and included on the compare page (for example [5]). This follows on from
the changes that I started talking about here [6].

5: http://milano-guix-1.di.unimi.it:8765/compare?base_commit=e1e3fe08480868f960eea3ec1584c0c12b022e25&target_commit=067ea2989fce98f3f3f115534e2e685cfc681039
6: https://lists.gnu.org/archive/html/guix-devel/2019-05/msg00127.html
   Linting, and how to get the information in to the Guix Data Serivce (6th May)

Other smaller things:

 - There are pages for the latest processed revision for a branch (e.g. [7])

 - There's less code duplication for the code relating to inserting new data
   in to the database.

 - Names are associated with each database connection, so it's easier to see
   what each connection is doing

 - glibc-locales from the inferior Guix is used when loading data, which fixes
   some locale issues

 - I hacked some better NULL value support on top of guile-squee [8]

 - I started changing the code to handle data in the natural type (e.g. number
   for numbers), rather than using strings. This worked for a while as squee
   always returned and expected strings, but to provide more adaptable code
   for working with the database, being able to use the type information for
   each value is really useful.

7: http://milano-guix-1.di.unimi.it:8765/repository/1/branch/master/latest-processed-revision
8: https://git.cbaines.net/guix/data-service/commit/?id=14419422008cc1ba42dea5ef90e6fb2762633064

In terms of what's next:

 - I've started writing a proposal for the upcoming Outreachy round relating
   to internationalisation in the Guix Data Service

 - I want to get back to making progress on automating code review for Guix
   patches, this was one of the main motivations for getting lint warnings in
   the database and on to the comparison page

 - I want to provide public dumps of the database from milano-guix-1, as well
   as a small extract of that database. I think restoring a database locally
   is a good way to get data for local development

 - Relating to the Outreachy proposal but also