Re: Broadinstitute catch - elegant way to disable tests requiring online access?

2021-07-28 Thread Steffen Möller



On 28.07.21 14:27, Nilesh Patra wrote:

Hi Steffen,

On 28/07/21 01:06 PM, Steffen Möller wrote:

Hi Nilesh,

  The import of pristine-tar has worked after removing .gitattributes, but
  then the git lfs references were still in the tarball and the upstream
  branch. pristine-tar could be pushed, but then the other branches would
  trigger git lfs errors when pushed to salsa. Only after having all fasta.gz
  lfs files removed, the upload went smoothly and you all now find this on
  https://salsa.debian.org/med-team/broad-catch


  Is that really intended?
  We would not be able to run tests in this case since you essentially ended
  up repacking it, since several tests seem to be using that data.

  BTW, I tried to do a little solution for the lfs thingy, for it to not
  store "references" and committed it to my personal repository[1]
  Can you have a look and let me know if it looks sensible?

  Also, you might as well want to have a look at pristine-lfs[2] which could
  be interesting to use. I've attempted to use this too, please consider
  taking a look.
  I admit, I'm not very used to the lfs workflow, so something could be wrong
  for sure.

  Not sure why the CI fails though -- probably it does not work fine with
  lfs, but just thinking of it as a ground for more ideas here :)
  I can clone it locally in a different space and re-produce the tarball, and
  I fixed a few tests (not all) -- please consider trying once and let me know
  [Edit]: all tests pass now on a clean chroot for me

You have outperformed yourself on this one. Thank you tons!

I just removed the broad-catch repository of mine from salsa, please
kindly inject what you have, instead. I am saying that also since
my SALSA_TOKEN apparently does not have authority to adjust
the CI settings.

Pushed to med-team salsa

https://salsa.debian.org/med-team/broad-catch


May I also ask you to upload the package?

I actually wanted to ask a few things before that:

0) Are my changes even correct - is this the way out?
This is my first time packaging a lfs based repository


No idea if it is _the_ way out - but it was _a_ way out at least that
allowed the package to test.


1) When I did the changes, I realised then that the size of the repository
  grows really, really huge

$ du -sh broad-catch
785M    broad-catch

I did not anticipate the package to grow _that_ much. Without the lfs
data it was just at 22M.

(well, a good portion of this is due to both pristine-tar and
pristine-lfs branches)

So do we really want to go that way? I was initially of the impression
that it would be a few megs, but that's quite unfortunately not the case
We can alternatively go with what you pushed earlier and disable tests
instead




2) If we intend to upload this as is, then we really, _really_ need to
either remove installation of all the example fasta files, or we need to
do a separate -examples binary package

Yes. Or get a solution with getData as a test-dependency. But we would
not want to run this with every commit.

3) Would FTP masters be happy with this? Is such a large size OK to go
into the archive?


I first read this as "zero K" :)

The data they ship was last updated in 2018 if I read this right. In
that sense it is nothing volatile.

If we had our own Debian Med repository then this would all be a
no-brainer. We would upload the whole thing or just the data to it.


4) Is the package *that* useful to do that sort of solution for the
same?

No. And even if. Heck, let the folks install it themselves :)

5) Is it easy to long-term maintain it in the current state?

I admit, my connection would not be so reliable to upload all of it in
one shot - at least not today.

@Andreas, can you take over the uploading work?
And would you also chime into the discussion here?


Next steps would be to tell routine-update not to try updating this one. I
can do that.

I think uscan will work fine on this. The next step would be to
introduce pristine-lfs support instead, I guess?


I need to educate myself on pristine-lfs.


And then I would like someone from upstream to comment on
this package and direct us a bit on what else would be good to have in
Debian to help their cause. My "plan" about that is to just go the
official route and introduce the package in a github issue - if it is a
regular user of that package replying, I guess we are just about as happy.

Yes, that sounds sensible
However, I mightn't have a lot of time to spend over this


Maybe we should just leave it in salsa for now and wait to see if any
demand for that software surfaces on the mailing list. We can also post
the github issue while the work resides in salsa.

Best,
Steffen



Re: Broadinstitute catch - elegant way to disable tests requiring online access?

2021-07-28 Thread Nilesh Patra
Hi Steffen,

On 28/07/21 01:06 PM, Steffen Möller wrote:
> Hi Nilesh,
> > >  The import of pristine-tar has worked after removing .gitattributes, but
> > >  then the git lfs references were still in the tarball and the upstream
> > >  branch. pristine-tar could be pushed, but then the other branches would
> > >  trigger git lfs errors when pushed to salsa. Only after having all fasta.gz
> > >  lfs files removed, the upload went smoothly and you all now find this on
> > >  https://salsa.debian.org/med-team/broad-catch
> > >
> >  Is that really intended?
> >  We would not be able to run tests in this case since you essentially ended
> >  up repacking it, since several tests seem to be using that data.
> >
> >  BTW, I tried to do a little solution for the lfs thingy, for it to not
> >  store "references" and committed it to my personal repository[1]
> >  Can you have a look and let me know if it looks sensible?
> >
> >  Also, you might as well want to have a look at pristine-lfs[2] which could
> >  be interesting to use. I've attempted to use this too, please consider
> >  taking a look.
> >  I admit, I'm not very used to the lfs workflow, so something could be wrong
> >  for sure.
> >
> >  Not sure why the CI fails though -- probably it does not work fine with
> >  lfs, but just thinking of it as a ground for more ideas here :)
> >  I can clone it locally in a different space and re-produce the tarball, and
> >  I fixed a few tests (not all) -- please consider trying once and let me know
> >  [Edit]: all tests pass now on a clean chroot for me
> 
> You have outperformed yourself on this one. Thank you tons!
> 
> I just removed the broad-catch repository of mine from salsa, please
> kindly inject what you have, instead. I am saying that also since
> my SALSA_TOKEN apparently does not have authority to adjust
> the CI settings.

Pushed to med-team salsa

https://salsa.debian.org/med-team/broad-catch

> May I also ask you to upload the package?

I actually wanted to ask a few things before that:

0) Are my changes even correct - is this the way out?
This is my first time packaging a lfs based repository

1) When I did the changes, I realised then that the size of the repository
 grows really, really huge

$ du -sh broad-catch
785M    broad-catch

(well, a good portion of this is due to both pristine-tar and
pristine-lfs branches)

So do we really want to go that way? I was initially of the impression
that it would be a few megs, but that's quite unfortunately not the case
We can alternatively go with what you pushed earlier and disable tests
instead

2) If we intend to upload this as is, then we really, _really_ need to
either remove installation of all the example fasta files, or we need to
do a separate -examples binary package

3) Would FTP masters be happy with this? Is such a large size OK to go
into the archive?

4) Is the package *that* useful to do that sort of solution for the
same?

5) Is it easy to long-term maintain it in the current state?
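
To see where a total like the 785M under 1) comes from, the tree can be tallied by category. The following is only a minimal sketch, not something from the thread: the function name tally_sizes and the three buckets are made up for illustration, and it sums apparent file sizes, so the numbers will differ somewhat from what du reports as disk usage. The ".git" bucket is where the pristine-tar and pristine-lfs branches end up.

```python
from pathlib import Path


def tally_sizes(root: Path) -> dict[str, int]:
    """Sum apparent file sizes (bytes) below root in three buckets.

    The bucket names are illustrative only:
      git_dir  - everything under .git (branch data such as pristine-tar)
      fasta_gz - working-tree files whose name ends in .fasta.gz
      other    - everything else in the working tree
    """
    totals = {"git_dir": 0, "fasta_gz": 0, "other": 0}
    for path in root.rglob("*"):
        # Skip symlinks and anything that is not a regular file.
        if path.is_symlink() or not path.is_file():
            continue
        size = path.stat().st_size
        if path.relative_to(root).parts[0] == ".git":
            totals["git_dir"] += size
        elif path.name.endswith(".fasta.gz"):
            totals["fasta_gz"] += size
        else:
            totals["other"] += size
    return totals
```

Calling tally_sizes(Path("broad-catch")) in the parent directory of a clone would give a rough idea of how much a separate -examples binary package would take out of the main one.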

I admit, my connection would not be so reliable to upload all of it in
one shot - at least not today.

@Andreas, can you take over the uploading work?
And would you also chime into the discussion here?

> Next steps would be to tell routine-update not to try updating this one. I
> can do that.

I think uscan will work fine on this. The next step would be to
introduce pristine-lfs support instead, I guess?

> And then I would like someone from upstream to comment on
> this package and direct us a bit on what else would be good to have in
> Debian to help their cause. My "plan" about that is to just go the
> official route and introduce the package in a github issue - if it is a
> regular user of that package replying, I guess we are just about as happy.

Yes, that sounds sensible
However, I mightn't have a lot of time to spend over this

Nilesh




Re: ArrayExpressHTS ran into removed fastx-toolkit

2021-07-28 Thread Steffen Möller



On 28.07.21 06:30, Andreas Tille wrote:

Hi,

On Wed, Jul 28, 2021 at 01:13:38PM +0900, Charles Plessy wrote:

I had a quick look (git grep) at the source code of ArrayExpressHTS and
although it has an option to indicate the path to fasta_formatter, it
appears to never use it.

I patched it out. Many thanks for spotting that!

Kind regards

   Andreas.

... and maybe report this issue to ArrayExpressHTS upstream?



With that boost, I also had a look at patching out tophat, but it seems
that the protocol depends heavily on all the bits and pieces that
were nice and shiny when the package was introduced in 2010. We should
discuss with upstream what to do about the package. At the moment, the
only values I see are
 a) retrieving data from ArrayExpress for RNA-seq data (judge for
    yourselves from
    https://www.bioconductor.org/packages/release/bioc/vignettes/ArrayExpressHTS/inst/doc/ArrayExpressHTS.R),
    which was my motivation to package it
 b) increasing %age of what we cover from Bioconductor
 c) increasing %age of what we cover from CRAN

Best,
Steffen




Re: Broadinstitute catch - elegant way to disable tests requiring online access?

2021-07-28 Thread Steffen Möller

Hi Nilesh,

On 27.07.21 23:53, Nilesh Patra wrote:

On 28 July 2021 3:00:08 am IST, Nilesh Patra  wrote:

Hi Steffen,

On Tue, 27 Jul 2021 at 23:05, Steffen Möller wrote:


Hi Nilesh,

Thank you tons for thinking along.

It took me a bit. Too long. The answer is that git does not lose the
sensation of a file being a git lfs reference even when you download a
tar.gz. For some reason I had expected that all genomes were truly gzipped
fasta files, but no, they were still references. Maybe I had inadvertently
transformed a few to what they point to during a first build and that is
why I then did not find the issue a bit earlier.

The import of pristine-tar has worked after removing .gitattributes, but
then the git lfs references were still in the tarball and the upstream
branch. pristine-tar could be pushed, but then the other branches would
trigger git lfs errors when pushed to salsa. Only after having all fasta.gz
lfs files removed, the upload went smoothly and you all now find this on
https://salsa.debian.org/med-team/broad-catch
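
One way to check an unpacked tarball for the problem described above is to look for files that are still Git LFS pointers rather than real data. The sketch below is an illustration and not what was actually run for this package; the function names are made up. It relies only on the fact that a Git LFS pointer is a small text file whose first line is "version https://git-lfs.github.com/spec/v1", whereas a real gzipped fasta file starts with the gzip magic bytes.

```python
from pathlib import Path

# First line of every Git LFS pointer file, as defined by the LFS pointer spec.
LFS_MAGIC = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if the file starts like a Git LFS pointer, not real data."""
    try:
        with path.open("rb") as handle:
            head = handle.read(len(LFS_MAGIC))
    except OSError:
        # Unreadable files are reported as "not a pointer" in this sketch.
        return False
    return head == LFS_MAGIC


def find_lfs_pointers(root: Path) -> list[Path]:
    """List all regular files below root that are still LFS pointers."""
    return sorted(
        p for p in root.rglob("*") if p.is_file() and is_lfs_pointer(p)
    )
```

Running find_lfs_pointers(Path(".")) in the unpacked source tree would list every file that is still a reference; an empty result suggests that the fasta.gz files are the real thing.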


Is that really intended?
We would not be able to run tests in this case since you essentially ended
up repacking it, since several tests seem to be using that data.

BTW, I tried to do a little solution for the lfs thingy, for it to not
store "references" and committed it to my personal repository[1]
Can you have a look and let me know if it looks sensible?

Also, you might as well want to have a look at pristine-lfs[2] which could
be interesting to use. I've attempted to use this too, please consider
taking a look.
I admit, I'm not very used to the lfs workflow, so something could be wrong
for sure.

Not sure why the CI fails though -- probably it does not work fine with
lfs, but just thinking of it as a ground for more ideas here :)
I can clone it locally in a different space and re-produce the tarball, and
I fixed a few tests (not all) -- please consider trying once and let me know

Edit: all tests pass now on a clean chroot for me


You have outperformed yourself on this one. Thank you tons!

I just removed the broad-catch repository of mine from salsa, please
kindly inject what you have, instead. I am saying that also since
my SALSA_TOKEN apparently does not have authority to adjust
the CI settings. May I also ask you to upload the package?

Next steps would be to tell routine-update not to try updating this one. I
can do that. And then I would like someone from upstream to comment on
this package and direct us a bit on what else would be good to have in
Debian to help their cause. My "plan" about that is to just go the
official route and introduce the package in a github issue - if it is a
regular user of that package replying, I guess we are just about as happy.

Many thanks again!

Steffen