[pygr] Re: multiz44way?

Namshin Kim Fri, 04 Sep 2009 18:54:16 -0700

Hi Paul,
Would you test it again?

I made a mistake when I first updated the server: I saved the downloadable
resources into WORLDBASEPATH (they should be saved in a different path,
separate from WORLDBASEPATH). I thought I had deleted all of them, but I
forgot to delete the .fasta and .txt entries, the actual SourceURLs, in
WORLDBASEPATH.


Now, it is clean. Would you test it again?

Thanks!
Namshin Kim






On Sat, Sep 5, 2009 at 10:08 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:

> Hi Namshin,
> I just wanted to point out that the MSA pulls the incorrect url for
> ponAbe2.  For example, if I now attempt to download that genome separately
> from the UCLA XMLRPC, the downloader obtains the correct file:
>
> import os
> os.environ['WORLDBASEPATH'] = '.,http://biodb2.bioinformatics.ucla.edu:5000'
> from pygr import worldbase
> g = worldbase('Bio.Seq.Genome.PONAB.ponAbe2',download=True)
> INFO downloader.download_unpickler: Beginning download of
> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to
> /tmp/ponAbe2.gz..
>
>
> However, as you can see from the error log of the MSA resource download,
> the downloader is pulling the wrong ponAbe2 file from the biodb server:
>
>
>
> INFO downloader.download_unpickler: Beginning download of
> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
>
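Side note on the two log lines quoted above: a minimal sketch (Python 3,
standard library only) for pulling the URL and destination out of a
"Beginning download" line, so the two ponAbe2 requests can be compared
directly. The helper names are made up for illustration and are not part of
pygr; the log lines are the ones from this thread, joined back onto one line.

```python
import re
from urllib.parse import urlparse

# Matches the downloader INFO line quoted in this thread once its wrapped
# pieces are joined; tolerates "of" fused to the URL and trailing dots.
DOWNLOAD_RE = re.compile(r"Beginning download of\s*(\S+?)\s+to\s+(\S+?)\.*$")


def parse_download_line(line):
    """Return (url, destination) from a log line, or None if it does not match."""
    match = DOWNLOAD_RE.search(" ".join(line.split()))
    if match is None:
        return None
    return match.group(1), match.group(2)


def remote_filename(url):
    """Return the last path component of a URL."""
    return urlparse(url).path.rsplit("/", 1)[-1]


genome_line = (
    "INFO downloader.download_unpickler: Beginning download of "
    "http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to "
    "/tmp/ponAbe2.gz.."
)
msa_line = (
    "INFO downloader.download_unpickler: Beginning download of "
    "http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to "
    "/extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/"
    "ponAbe2.tar.gz..."
)

genome_url, genome_dest = parse_download_line(genome_line)
msa_url, msa_dest = parse_download_line(msa_line)

print(remote_filename(genome_url))
print(remote_filename(msa_url))
print("same URL:", genome_url == msa_url)
```

The two remote file names differ (ponAbe2.gz vs. chromFa.tar.gz), which
matches what Paul describes.
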
> Thanks for looking into this!!
>
> Paul
>
>
> On Fri, Sep 4, 2009 at 2:43 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>
>> Hi Namshin,
>>
>> Ah, gotcha.
>> So the downloader picks up the binaries from biodb, and when using the
>> XMLRPC server it unpickles the URL for the tarballs.  But I'm still getting
>> an error when the downloader attempts to pick up the ponAbe2 tar file.
>> Could you fix the path on your server, since the URL stored by the
>> worldbase XMLRPC server is wrong?
>>
>> Please see the attached log.
>>
>> Thanks,
>> Paul
>>
>>
>>
>> On Thu, Sep 3, 2009 at 6:29 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>
>>> Hi Paul,
>>> Actual files for all downloadable resources are saved on biodb, not
>>> biodb2. biodb2 holds the binary files for genomes and NLMSA, and is not
>>> accessible via http. The files are too big to store on one server...
>>>
>>> biodb - only for downloading via http
>>> biodb2 - only for accessing via xmlrpc
>>>
>>> These are the biodb URLs for public resources. You can download files
>>> from the URLs below.
>>>
>>> http://biodb.bioinformatics.ucla.edu/PYGRDATA # for text converted binaries (NLMSA)
>>> http://biodb.bioinformatics.ucla.edu/MEGATEST # for megatest files
>>> http://biodb.bioinformatics.ucla.edu/GENOMES # for genome assemblies, compressed files
>>>
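To make the split concrete, here is a rough sketch that maps a URL to the
role of its host as described above. The table and the function name are
illustrative assumptions, not a pygr API; only the two host names and their
roles come from this thread.

```python
from urllib.parse import urlparse

# Roles as described in this thread; everything else is reported as unknown.
SERVER_ROLES = {
    "biodb.bioinformatics.ucla.edu": "http download",
    "biodb2.bioinformatics.ucla.edu": "xmlrpc",
}


def server_role(url):
    """Return the role of the host in url, ignoring any port number."""
    host = urlparse(url).hostname
    return SERVER_ROLES.get(host, "unknown")


print(server_role("http://biodb2.bioinformatics.ucla.edu:5000"))
print(server_role("http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz"))
```

Under this split, a download URL that points at biodb while WORLDBASEPATH
points at biodb2 is expected and is not by itself a sign of a wrong URL.
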
>>> Yours,
>>> Namshin Kim
>>>
>>>
>>>
>>> On Fri, Sep 4, 2009 at 10:20 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>
>>>> Hi Namshin,
>>>> Downloading the genome works for me too.  But downloading the multiple
>>>> alignment doesn't pull all the necessary resources.
>>>>
>>>> However, I think you might have missed the latter part of my email in
>>>> which I described a possible error with the hard-coded XMLRPC server url
>>>> which points to 'biodb' instead of 'biodb2' as you've described in your
>>>> instructions on this thread.
>>>>
>>>> I'm now starting to download the 44way alignment separately.  Even
>>>> though I've explicitly set WORLDBASEPATH to use the 'biodb2' server,
>>>> the URL still points to 'biodb'.  See the log below.  So I'm guessing that
>>>> the XMLRPC server needs to be updated to reflect the actual locations of
>>>> the gzipped pickled files.
>>>>
>>>> Thanks,
>>>> Paul
>>>>
>>>> ===log===
>>>> (2.6.2)06:00 PM 20129 
>>>> pri...@mine-17/extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way
>>>> $ time python downloadMSA.py
>>>> INFO downloader.download_unpickler: Beginning download of
>>>> http://biodb.bioinformatics.ucla.edu/PYGRDATA/hg18_multiz44way.txt.gz to
>>>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/hg18_multiz44way.txt.gz...
>>>>
>>>> INFO downloader.download_monitor: downloaded 7628292096 bytes (10.0%)...
>>>>
>>>>
>>>> On Thu, Sep 3, 2009 at 6:10 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>
>>>>> Hi Paul,
>>>>> I am testing it and there is no problem.
>>>>>
>>>>> >>> from pygr import worldbase
>>>>> >>> ponAbe2 = worldbase.Bio.Seq.Genome.PONAB.ponAbe2(download=True)
>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to
>>>>> /home/deepreds/test/ponAbe2.gz...
>>>>>
>>>>> And, you may need to unset PYGRDATAPATH as well.
>>>>>
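A small sketch of the environment setup being discussed here: unset
PYGRDATAPATH, then point WORLDBASEPATH at the current directory plus the
biodb2 XMLRPC server. The function name is hypothetical; the comma-separated
value is the one Paul reports using, and as in his snippet it would need to
be set before "from pygr import worldbase" runs.

```python
import os


def configure_worldbase(entries, env=os.environ):
    """Drop PYGRDATAPATH and set WORLDBASEPATH to a comma-separated list."""
    env.pop("PYGRDATAPATH", None)  # no error if it was never set
    env["WORLDBASEPATH"] = ",".join(entries)
    return env["WORLDBASEPATH"]


path = configure_worldbase([".", "http://biodb2.bioinformatics.ucla.edu:5000"])
print(path)
```
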
>>>>> Yours,
>>>>> Namshin Kim
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Sep 4, 2009 at 10:04 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>>
>>>>>> Hi Namshin,
>>>>>> It's not set; my script sets that environment variable to
>>>>>> '.,http://biodb2.bioinformatics.ucla.edu:5000'. I've made sure to remove
>>>>>> .pygrdata files, etc.
>>>>>> Thanks
>>>>>> Paul
>>>>>>
>>>>>>
>>>>>> On Thu, Sep 3, 2009 at 6:02 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Paul,
>>>>>>> Would you please give your WORLDBASEPATH?
>>>>>>>
>>>>>>> $ echo $WORLDBASEPATH
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Namshin Kim
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 4, 2009 at 9:54 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Namshin,
>>>>>>>> I'm still encountering the same error as before regarding the
>>>>>>>> download of genome resources.  Downloading the 44way alignment also
>>>>>>>> does not download the necessary dependencies. Could there be something
>>>>>>>> wrong with the URL paths saved in the resources?  You specified
>>>>>>>> 'biodb2', but during the download the URLs are directed to 'biodb'.
>>>>>>>>
>>>>>>>> With WORLDBASEPATH not set, I get the following error.
>>>>>>>>
>>>>>>>> Please see:
>>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
>>>>>>>> /extra/baldig1/genomics/pygrdata/genomes/ponAbe2.tar.gz...
>>>>>>>> INFO downloader.download_monitor: downloaded 8192 bytes (3593.0%)...
>>>>>>>> INFO downloader.download_unpickler: Download done.
>>>>>>>> INFO downloader.uncompress_file: untarring
>>>>>>>> /extra/baldig1/genomics/pygrdata/genomes/ponAbe2.tar.gz...
>>>>>>>> ...
>>>>>>>> ReadError: not a gzip file
>>>>>>>>
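The "not a gzip file" error after an 8192-byte download suggests the server
returned something other than the archive. A quick check before untarring is
to look at the first two bytes of the file; gzip streams start with 0x1f
0x8b. This is only a sketch: the function name is made up, and the HTML body
below is a stand-in, since the log does not show what was actually returned.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(path):
    """Return True if the file at path starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.tar.gz")
    bad = os.path.join(tmp, "bad.tar.gz")
    with gzip.open(good, "wb") as handle:
        handle.write(b"payload")
    with open(bad, "wb") as handle:
        # Hypothetical error page saved under a .tar.gz name.
        handle.write(b"<html>404 Not Found</html>")
    results = (looks_like_gzip(good), looks_like_gzip(bad))

print(results)
```
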
>>>>>>>> However, when I set my WORLDBASEPATH to use the biodb2 server, the
>>>>>>>> download is successful.  So there's something wrong with the hardcoded
>>>>>>>> URL that's distributed with pygr (0.8 beta). But perhaps the XMLRPC
>>>>>>>> resources on the server should be fixed as well.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Paul
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Sep 3, 2009 at 4:22 PM, cjlee112 <cjlee...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, please pass on to Namshin any debugging information you can
>>>>>>>>> supply.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On Sep 3, 2009, at 3:26 PM, Paul Rigor (gmail) wrote:
>>>>>>>>>
>>>>>>>>> > Hi Chris,
>>>>>>>>> >
>>>>>>>>> > Does this mean that the 44way alignment distributed by the UCLA
>>>>>>>>> > server is foobar since I ran exactly those same commands and no
>>>>>>>>> > genome dependencies were downloaded?
>>>>>>>>> >
>>>>>>>>> > Thanks,
>>>>>>>>> > Paul
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Paul Rigor
>>>>>> Graduate Student
>>>>>> Institute for Genomics and Bioinformatics
>>>>>> Donald Bren School of Information and Computer Sciences
>>>>>> University of California, Irvine
>>>>>> http://www.paulrigor.net/
>>>>>> http://www.ics.uci.edu/~prigor <http://www.ics.uci.edu/%7Eprigor>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"pygr-dev" group.
To post to this group, send email to pygr-dev@googlegroups.com
To unsubscribe from this group, send email to 
pygr-dev+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/pygr-dev?hl=en
-~----------~----~----~----~------~----~------~--~---
