Re: Downloader for "wrapped" tarball?

2020-06-06 Thread zimoun
Dear Hartmut,

On Sat, 6 Jun 2020 at 17:29, Hartmut Goebel wrote:

> 2. When implementing some "wrapped-fetch" (name TBD), modeled like
> "git-fetch", there is no easy way for the user to verify the hash, as
> this is taken from the "inner" tarball. How does this work with
> substitutes, download-nar and SWH?

Today, if I understand correctly, Guix feeds SWH through only one
channel, "guix lint", and only for 'git-fetch' packages.  The origin
methods of Guix packages break down as follows:

    1 bzr-fetch
    3 cvs-fetch
    9 url-fetch/tarbomb
   24 url-fetch/zipbomb
   28 hg-fetch
   30 computed-origin-method
   67 no-origin
  115 svn-fetch
  135 svn-multi-fetch
 3574 git-fetch
 9690 url-fetch

where the 'svn-multi-fetch' entries are mainly CTAN/TeX packages.  As you
can see, most of the packages are not yet archived in SWH.  Since SWH
supports 'svn-fetch' and 'hg-fetch', it is doable to add them to "guix
lint", but it is low-priority -- at least on my TODO. :-)
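
To put numbers on "most of the packages", here is a small Python sketch
that only redoes the arithmetic on the table above.  The counts are
copied from the table; nothing else is assumed.

```python
# Counts copied from the table above (origin method -> number of packages).
counts = {
    "bzr-fetch": 1,
    "cvs-fetch": 3,
    "url-fetch/tarbomb": 9,
    "url-fetch/zipbomb": 24,
    "hg-fetch": 28,
    "computed-origin-method": 30,
    "no-origin": 67,
    "svn-fetch": 115,
    "svn-multi-fetch": 135,
    "git-fetch": 3574,
    "url-fetch": 9690,
}

total = sum(counts.values())
git_share = 100 * counts["git-fetch"] / total
url_share = 100 * counts["url-fetch"] / total

print(f"total origins: {total}")
print(f"git-fetch: {git_share:.1f}%")
print(f"url-fetch: {url_share:.1f}%")
```

So roughly a quarter of the origins use 'git-fetch', the only method
"guix lint" currently queues, while about 71% are plain 'url-fetch'.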

The SWH-side WIP is about 'url-fetch'.  I have not followed all the
recent developments by lewo but roughly speaking they are implementing
another "lister" [2,3,4] for tarballs.  Well, the final aim is that
SWH automatically ingests https://guix.gnu.org/sources.json which is
automatically generated every X minutes.  Currently, the compliance of
this 'sources.json' is still a WIP; the format is changing and the
specification not yet fixed.

What SWH archives is the upstream source, i.e., *not* "guix build -S"
but what comes from 'origin'.  What happens afterwards, and what Guix
does, does not matter to SWH.

Therefore, if I understand correctly, SWH will archive the initial
tarball.  (Sorry, I am lost with the "inner/outer" terminology.)  Note
that only the package tarball you pointed to [5] needs to be checksummed:
if this initial package tarball matches, then 'contents.tar.gz'
will match too, won't it?
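
To illustrate that reasoning, here is a minimal Python sketch (plain
Python, not Guix code).  It builds a made-up stand-in for a hex.pm
package in memory; the helper names `build_package` and `inner_sha256`
and all payload bytes are assumptions for the illustration.  Since
'contents.tar.gz' is just a member of the initial tarball, pinning the
hash of the initial tarball also pins the hash of the member.

```python
import hashlib
import io
import tarfile


def build_package(contents):
    """Build an in-memory tar archive shaped like a hex.pm package."""
    members = {
        "VERSION": b"3",
        "metadata.config": b"placeholder metadata",
        "contents.tar.gz": contents,
    }
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def sha256(data):
    return hashlib.sha256(data).hexdigest()


def inner_sha256(outer_bytes, member="contents.tar.gz"):
    """Return the SHA-256 of one member of the initial tarball."""
    with tarfile.open(fileobj=io.BytesIO(outer_bytes), mode="r") as tar:
        return sha256(tar.extractfile(member).read())


outer = build_package(b"placeholder for the real source tarball")
pinned_outer = sha256(outer)
pinned_inner = inner_sha256(outer)

# A download matching the pinned outer hash yields the same inner hash.
redownload = bytes(outer)
same_outer = sha256(redownload) == pinned_outer
same_inner = inner_sha256(redownload) == pinned_inner

# Changing the inner tarball changes the outer hash as well.
tampered = build_package(b"something else entirely")
outer_detects_tampering = sha256(tampered) != pinned_outer

print(same_outer, same_inner, outer_detects_tampering)
```

The converse does not hold in general: two different initial tarballs
could wrap the same 'contents.tar.gz', e.g. with different metadata.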

I hope I have not misread or missed something.


All the best,
simon

[1] https://archive.softwareheritage.org/save/
[2] https://docs.softwareheritage.org/devel/swh-lister/index.html
[3] https://forge.softwareheritage.org/D2025
[4] https://forge.softwareheritage.org/T1991
[5] https://github.com/hexpm/specifications/blob/master/package_tarball.md



Re: Downloader for "wrapped" tarball?

2020-06-06 Thread Hartmut Goebel
On 02.06.20 at 21:41, Marius Bakke wrote:
> It would be ideal to have an origin method that could extract the
> "inner" tarball, i.e. contents.tar.gz for hex.pm and data.tar.gz in the
> case of RubyGems.  As zimoun mentioned, a good place to start is to look at
> how other origin methods are implemented such as url-fetch/tarbomb, etc.

I started implementing in this direction and would like your advice on
the design. I found two options:

1. When implementing some "url-fetch/wrapped" (name TBD), *two* items
will be kept in the store: the "outer" and the "inner" tarball. This is
since "url-fetch" and siblings use the built-in downloader, which AFAIK
always puts the downloaded files into the store.

In this case we need to check the hash of the "outer" tarball, as the
built-in downloader requires a hash to be passed and to match. But then
we cannot check the hash of the "inner" tarball. How would this work
with substitutes and download-nar?

2. When implementing some "wrapped-fetch" (name TBD), modeled like
"git-fetch", there is no easy way for the user to verify the hash, as
this is taken from the "inner" tarball. How does this work with
substitutes, download-nar and SWH?

-- 
Regards
Hartmut Goebel

| Hartmut Goebel  | h.goe...@crazy-compilers.com   |
| www.crazy-compilers.com | compilers which you thought are impossible |






Re: Downloader for "wrapped" tarball?

2020-06-02 Thread Marius Bakke
Hartmut Goebel  writes:

> Hi,
>
> As was just written in another mail, I'm currently working on an
> Erlang/rebar build system. This includes an importer from hex.pm, a
> package repository for Elixir and Erlang packages. (Since this is built
> into rebar3, I assume it is what PyPI is for Python and CPAN for Perl.)
>
> At hex.pm, packages are provided in a tarfile [1] wrapping the source
> tar-file:
>
> -rw-r--r-- 0/0       1 2017-06-14 21:57 VERSION
> -rw-r--r-- 0/0      64 2017-06-14 21:57 CHECKSUM
> -rw-r--r-- 0/0     532 2017-06-14 21:57 metadata.config
> -rw-r--r-- 0/0    4744 2017-06-14 21:57 contents.tar.gz
>
> IMHO it does not make sense to keep this wrapping tar-file in the store.
>
> So my idea is to create a "hexpm-fetch" method, which downloads the
> tar-file and only stores the "contents.tar.gz" in the store (using a
> proper name, of course).
>
> How can this be done?

Tarballs from rubygems.org have the same problem; it is worked around by
special support in ruby-build-system.

It would be ideal to have an origin method that could extract the
"inner" tarball, i.e. contents.tar.gz for hex.pm and data.tar.gz in the
case of RubyGems.  As zimoun mentioned, a good place to start is to look at
how other origin methods are implemented such as url-fetch/tarbomb, etc.
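
As a rough illustration of what such a method would have to do, here is
a minimal sketch in plain Python (not Guix code, and not an origin
method).  The function `unwrap`, the table `INNER_MEMBER` and the demo
archive are hypothetical; only the two member names come from the text
above.  It copies the inner tarball out of the wrapping archive and
reports the hash of what was kept.

```python
import hashlib
import io
import os
import tarfile
import tempfile

# Name of the wrapped source archive per repository, as given above.
INNER_MEMBER = {
    "hexpm": "contents.tar.gz",
    "rubygems": "data.tar.gz",
}


def unwrap(outer_path, kind, dest_dir):
    """Copy only the inner tarball of OUTER_PATH into DEST_DIR.

    Returns (path of the inner tarball, its SHA-256 hex digest).
    """
    member = INNER_MEMBER[kind]
    with tarfile.open(outer_path, mode="r") as tar:
        source = tar.extractfile(member)  # KeyError if the member is missing
        if source is None:
            raise ValueError(f"{member} is not a regular file")
        data = source.read()
    inner_path = os.path.join(dest_dir, member)
    with open(inner_path, "wb") as out:
        out.write(data)
    return inner_path, hashlib.sha256(data).hexdigest()


# Demo with a made-up archive laid out like a gem (contents are fake).
with tempfile.TemporaryDirectory() as tmp:
    outer_path = os.path.join(tmp, "demo-1.0.gem")
    payload = b"placeholder for data.tar.gz"
    with tarfile.open(outer_path, mode="w") as tar:
        for name, data in (("metadata.gz", b"fake"), ("data.tar.gz", payload)):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    inner_path, digest = unwrap(outer_path, "rubygems", tmp)
    inner_name = os.path.basename(inner_path)
    print(inner_name, digest)
```

Which of the two hashes (wrapping archive or inner tarball) the package
definition should record is a separate design question that this sketch
does not settle.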




Re: Downloader for "wrapped" tarball?

2020-06-01 Thread Ekaitz Zarraga
On Sunday, May 31, 2020 10:19 AM, Hartmut Goebel wrote:

> On 30.05.20 at 12:24, Ekaitz Zarraga wrote:
>
> > I took a look at guix/download.scm; I think you just need to check what
> > url-fetch/zipbomb does because the use case is similar to what you are
> > looking for.
>
> Yes, I've already seen this. And there is also url-fetch/tarbomb. But
> this "%store-monad" in there discourages me, as I'm afraid this will
> keep the file in the store.


I've been thinking about this and I don't expect those files to be kept in the 
store. Doesn't the store just keep the result of the packaging rather than the 
source of it?


> > Thanks for the work you are doing, I'm interested in it because I want to
> > package Wings3D, so once you are done you'll probably have a tester :)
>
> You can already start testing the rebar3 builder :-)  You can find my
> WIP at
> https://gitlab.digitalcourage.de/htgoebel/guix/-/tree/HG-rebar-build-system

I'll take a look if I have some free time, thanks for the link!




Re: Software heritage and Downloader for "wrapped" tarball?

2020-06-01 Thread zimoun
Dear Hartmut,

On Sun, 31 May 2020 at 10:21, Hartmut Goebel  wrote:

> related to the "wrapped tarball downloader":

Sorry, I have not followed this topic closely; could you provide a
link/entry point about the "wrapped tarball downloader"?


> Will this work with Software Heritage? E.g. will Software Heritage be
> able to archive the unwrapped tarball?

As said above, I do not know exactly what "unwrapped tarball" means, but
the current situation with SWH is: "guix lint" queues the origin if
it is 'git-fetch', and SWH fetches (or will soon fetch) the tarballs from
http://guix.gnu.org/sources.json (type: 'url' for now, if I have not
misread the last updates).


Thanks,
simon



Software heritage and Downloader for "wrapped" tarball?

2020-05-31 Thread Hartmut Goebel
Hi

related to the "wrapped tarball downloader":

Will this work with Software Heritage? E.g. will Software Heritage be
able to archive the unwrapped tarball?

-- 
Kind regards
Hartmut Goebel
Dipl.-Informatiker (univ), CISSP, CSSLP, ISO 27001 Lead Implementer
Information Security Management, Security Governance, Secure Software
Development

Goebel Consult, Landshut
http://www.goebel-consult.de

Blog: https://www.goe-con.de/blog/35.000-gegen-vorratdatenspeicherung
Column:
https://www.goe-con.de/hartmut-goebel/cissp-gefluester/2011-09-kommerz-uber-recht-fdp-die-gefaellt-mir-partei






Re: Downloader for "wrapped" tarball?

2020-05-31 Thread Hartmut Goebel
On 30.05.20 at 12:24, Ekaitz Zarraga wrote:
> I took a look at guix/download.scm; I think you just need to check what
> url-fetch/zipbomb does because the use case is similar to what you are looking
> for.

Yes, I've already seen this. And there is also url-fetch/tarbomb. But
this "%store-monad" in there discourages me, as I'm afraid this will
keep the file in the store.

> Thanks for the work you are doing, I'm interested in it because I want to
> package Wings3D, so once you are done you'll probably have a tester :)

You can already start testing the rebar3 builder :-)  You can find my
WIP at
https://gitlab.digitalcourage.de/htgoebel/guix/-/tree/HG-rebar-build-system



-- 
Regards
Hartmut Goebel

| Hartmut Goebel  | h.goe...@crazy-compilers.com   |
| www.crazy-compilers.com | compilers which you thought are impossible |




Re: Downloader for "wrapped" tarball?

2020-05-30 Thread Ekaitz Zarraga
On Saturday, May 30, 2020 10:39 AM, Hartmut Goebel wrote:

> Hi,
>
> As was just written in another mail, I'm currently working on an
> Erlang/rebar build system. This includes an importer from hex.pm, a
> package repository for Elixir and Erlang packages. (Since this is built
> into rebar3, I assume it is what PyPI is for Python and CPAN for Perl.)
>
> At hex.pm, packages are provided in a tarfile [1] wrapping the source
> tar-file:
>
> -rw-r--r-- 0/0       1 2017-06-14 21:57 VERSION
> -rw-r--r-- 0/0      64 2017-06-14 21:57 CHECKSUM
> -rw-r--r-- 0/0     532 2017-06-14 21:57 metadata.config
> -rw-r--r-- 0/0    4744 2017-06-14 21:57 contents.tar.gz
>
> IMHO it does not make sense to keep this wrapping tar-file in the store.
>
> So my idea is to create a "hexpm-fetch" method, which downloads the
> tar-file and only stores the "contents.tar.gz" in the store (using a
> proper name, of course).
>
> How can this be done?
>
> [1] https://github.com/hexpm/specifications/blob/master/package_tarball.md
>
>

Hi,

Probably you're able to reach the same conclusions as I did but anyway...

I took a look at guix/download.scm; I think you just need to check what
url-fetch/zipbomb does because the use case is similar to what you are looking
for.

Hope this helps at least a little.

Thanks for the work you are doing, I'm interested in it because I want to
package Wings3D, so once you are done you'll probably have a tester :)

Best,
Ekaitz




Downloader for "wrapped" tarball?

2020-05-30 Thread Hartmut Goebel
Hi,

As was just written in another mail, I'm currently working on an
Erlang/rebar build system. This includes an importer from hex.pm, a
package repository for Elixir and Erlang packages. (Since this is built
into rebar3, I assume it is what PyPI is for Python and CPAN for Perl.)

At hex.pm, packages are provided in a tarfile [1] wrapping the source
tar-file:

-rw-r--r-- 0/0       1 2017-06-14 21:57 VERSION
-rw-r--r-- 0/0      64 2017-06-14 21:57 CHECKSUM
-rw-r--r-- 0/0     532 2017-06-14 21:57 metadata.config
-rw-r--r-- 0/0    4744 2017-06-14 21:57 contents.tar.gz

IMHO it does not make sense to keep this wrapping tar-file in the store.

So my idea is to create a "hexpm-fetch" method, which downloads the
tar-file and only stores the "contents.tar.gz" in the store (using a
proper name, of course).

How can this be done?

[1] https://github.com/hexpm/specifications/blob/master/package_tarball.md

-- 
Regards
Hartmut Goebel

| Hartmut Goebel  | h.goe...@crazy-compilers.com   |
| www.crazy-compilers.com | compilers which you thought are impossible |