Just following up to suggest that the ergonomics here are less than
ideal. I'm blissfully unaware of the constraints that samtools
developers face, but maybe a solution would be to create the REF_CACHE
entries from a given fasta in a tmp dir (or in memory, or in
$REF_CACHE) when they are needed and not present.

The long version is that I am trying to index a cram file for which I
have the reference fasta. I am getting messages about REF_CACHE. If
samtools needs the reference in order to index, how can I specify
it? I'm sure I've indexed without a reference before, but I am hitting
this problem in this case.

After some time it says:
[W::cram_populate_ref] Creating reference cache directory /root/.cache/hts-ref
It also tells me: "This may become large; see the samtools(1) manual
page REF_CACHE discussion"

The man page (http://www.htslib.org/doc/samtools.html) does not
describe REF_CACHE and REF_PATH in enough detail to be able to use
them, though there is more detail in the workflow page
(http://www.htslib.org/workflow/). In case anyone else encounters
this, in order to get this working without having it autodownload to
/root/.cache, I had to do:

wget -q https://raw.githubusercontent.com/samtools/samtools/develop/misc/seq_cache_populate.pl
perl seq_cache_populate.pl -root $(pwd)/cache ${fasta}
export REF_PATH=$(pwd)/cache/%2s/%2s/%s:http://www.ebi.ac.uk/ena/cram/md5/%s
export REF_CACHE=xx
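
For anyone puzzling over the REF_PATH template above: as I understand
the samtools(1) man page, each %2s consumes two characters of the
sequence MD5 and the final %s takes the rest, so every sequence ends up
in a two-level directory under the cache root. Here is a minimal shell
sketch of that expansion. Note that ref_cache_path is a hypothetical
helper written only for illustration (it is not part of samtools), and
the MD5 below is a made-up placeholder, not a real sequence checksum.

```shell
# Hypothetical helper (not part of samtools): show where a sequence with
# the given MD5 would live under a root laid out as %2s/%2s/%s.
ref_cache_path() {
    root=$1
    md5=$2
    first=$(printf '%s' "$md5" | cut -c1-2)    # first %2s: characters 1-2
    second=$(printf '%s' "$md5" | cut -c3-4)   # second %2s: characters 3-4
    rest=${md5#????}                           # final %s: everything after 4 chars
    printf '%s/%s/%s/%s\n' "$root" "$first" "$second" "$rest"
}

# Placeholder MD5, for illustration only.
ref_cache_path /tmp/cache 0123456789abcdef0123456789abcdef
# prints /tmp/cache/01/23/456789abcdef0123456789abcdef
```

Comparing that path against the files seq_cache_populate.pl wrote
should be a quick way to check that REF_PATH points at the right place
before running samtools.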

-Brent


On Wed, Feb 28, 2018 at 10:40 AM Brent Pedersen <bpede...@gmail.com> wrote:
>
> On Thu, Feb 8, 2018 at 11:13 AM, James Bonfield <j...@sanger.ac.uk> wrote:
> > On Thu, Feb 08, 2018 at 09:32:00AM -0700, Brent Pedersen wrote:
> >> I've been working more with crams lately and in some cases, it seems
> >> the default behavior of htslib is to automatically start downloading
> >> reference files locally even though I have not set REF_CACHE or
> >> REF_PATH. This is deep in a complex pipeline behind several layers of
> >> abstraction, but I think this is the default.
> >
> > If you don't set them it'll automatically default to using the EBI for
> > REF_PATH and ~/.cache/hts-seqs (IIRC) for REF_CACHE.
> >
> > If you really wish to totally disable them, keep REF_CACHE clear and
> > set REF_PATH to somewhere with no files, eg /foo, and it'll fail the
> > search.
> >
> > It should then fail.
> >
> > James
> >
> > --
> > James Bonfield (j...@sanger.ac.uk)
> > The Sanger Institute, Hinxton, Cambs, CB10 1SA
> >
> >
> > --
> >  The Wellcome Sanger Institute is operated by Genome Research
> >  Limited, a charity registered in England with number 1021457 and a
> >  company registered in England with number 2742969, whose registered
> >  office is 215 Euston Road, London, NW1 2BE.
>
>
> Thanks for the reply. It seems a better solution would be to fail
> unless REF_CACHE or REF_PATH are set.
> I'm having scenarios where a user uses the wrong reference and
> samtools detects this and starts downloading with message: "Creating
> reference cache directory $HOME/.cache/hts-ref"
>
> I never want to have stuff download automatically. Even setting
> REF_PATH=xx , it will start downloading if I specify the wrong
> reference.
> -Brent


_______________________________________________
Samtools-help mailing list
Samtools-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/samtools-help
