[pygr] Re: multiz44way?

Paul Rigor (gmail) Fri, 04 Sep 2009 19:05:30 -0700

Great, I'm re-running my script now.  Will keep you posted.
Paul

On Fri, Sep 4, 2009 at 6:54 PM, Namshin Kim <deepr...@gmail.com> wrote:


> Hi Paul,
> Would you test it again?
>
> I made a mistake when I first updated the server: I saved downloadable
> resources into WORLDBASEPATH (downloadable resources should be saved into a
> different path, separate from WORLDBASEPATH). I thought I had deleted all of
> them, but I forgot to delete the .fasta and .txt entries (the actual
> SourceURLs) in WORLDBASEPATH.
>
> Now, it is clean. Would you test it again?
>
> Thanks!
> Namshin Kim
>
>
>
>
>
>
> On Sat, Sep 5, 2009 at 10:08 AM, Paul Rigor (gmail)
> <paulri...@gmail.com> wrote:
>
>> Hi Namshin,
>> I just wanted to point out that the MSA pulls the incorrect url for
>> ponAbe2.  For example, if I now attempt to download that genome separately
>> from the UCLA XMLRPC, the downloader obtains the correct file:
>>
>> import os
>> os.environ['WORLDBASEPATH'] = '.,http://biodb2.bioinformatics.ucla.edu:5000'
>> from pygr import worldbase
>> g = worldbase('Bio.Seq.Genome.PONAB.ponAbe2',download=True)
>> INFO downloader.download_unpickler: Beginning download of
>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to
>> /tmp/ponAbe2.gz..
>>
>>
>> However, as you can see from the error log of the MSA resource download,
>> the downloader is pulling the wrong ponAbe2 file from the biodb server:
>>
>>
>>
>> INFO downloader.download_unpickler: Beginning download of
>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
>>
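To compare the two log excerpts above mechanically, the URL and destination can be pulled out of each "Beginning download of ... to ..." message. This is only a minimal sketch: the message format is copied from the logs in this thread, while `parse_download` and its regular expression are hypothetical helpers, not part of pygr.

```python
import re

# Matches the downloader message quoted in this thread (illustrative regex,
# not taken from pygr): "Beginning download of <url> to <destination>..."
DOWNLOAD_RE = re.compile(
    r"Beginning download\s*of\s*(?P<url>https?://\S+)\s+to\s+(?P<dest>\S+)"
)


def parse_download(message):
    """Return (url, destination) from one log message, or None if absent."""
    # Collapse line wraps and repeated spaces so wrapped log lines still match.
    flat = " ".join(message.split())
    match = DOWNLOAD_RE.search(flat)
    if match is None:
        return None
    # The log ends the destination with trailing dots; drop them.
    return match.group("url"), match.group("dest").rstrip(".")


# The two messages reported above: genome download vs. MSA dependency download.
GENOME_LOG = (
    "INFO downloader.download_unpickler: Beginning download of "
    "http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to "
    "/tmp/ponAbe2.gz.."
)
MSA_LOG = (
    "INFO downloader.download_unpickler: Beginning download of "
    "http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to "
    "/extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/"
    "ponAbe2.tar.gz..."
)

genome_url, _ = parse_download(GENOME_LOG)
msa_url, _ = parse_download(MSA_LOG)
urls_differ = genome_url != msa_url
```

Here `urls_differ` flags that the same genome is being fetched from two different URLs depending on how the resource was requested.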
>> Thanks for looking into this!!
>>
>> Paul
>>
>>
>> On Fri, Sep 4, 2009 at 2:43 PM, Paul Rigor (gmail)
>> <paulri...@gmail.com> wrote:
>>
>>> Hi Namshin,
>>>
>>> Ah, gotcha.
>>> So the downloader picks up the binaries from biodb.  So when using the
>>> XMLRPC, it unpickles the URL for the tarballs.  But I'm still getting an
>>> error when the downloader attempts to pick up the ponAbe2 tar file.  Could
>>> you fix the path on your server, since the URL stored by the worldbase
>>> XMLRPC server is wrong?
>>>
>>> Please see the attached log.
>>>
>>> Thanks,
>>> Paul
>>>
>>>
>>>
>>> On Thu, Sep 3, 2009 at 6:29 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>
>>>> Hi Paul,
>>>> Actual files for all downloadable resources are saved on biodb, not
>>>> biodb2. biodb2 has binary files for genomes and NLMSA, and is not
>>>> accessible via HTTP. The files are too big to save on one server...
>>>>
>>>> biodb - only for downloading via http
>>>> biodb2 - only for accessing via xmlrpc
>>>>
>>>> Here are the biodb URLs for public resources. You can download files from
>>>> the URLs below.
>>>>
>>>> http://biodb.bioinformatics.ucla.edu/PYGRDATA # for text converted
>>>> binaries (NLMSA)
>>>> http://biodb.bioinformatics.ucla.edu/MEGATEST # for megatest files
>>>> http://biodb.bioinformatics.ucla.edu/GENOMES # for genome assemblies,
>>>> compressed files
>>>>
>>>> Yours,
>>>> Namshin Kim
>>>>
>>>>
>>>>
>>>> On Fri, Sep 4, 2009 at 10:20 AM, Paul Rigor (gmail) <
>>>> paulri...@gmail.com> wrote:
>>>>
>>>>> Hi Namshin,
>>>>> Downloading the genome works for me too.  But downloading the multiple
>>>>> alignment doesn't pull all the necessary resources.
>>>>>
>>>>> However, I think you might have missed the latter part of my email in
>>>>> which I described a possible error with the hard-coded XMLRPC server url
>>>>> which points to 'biodb' instead of 'biodb2' as you've described in your
>>>>> instructions on this thread.
>>>>>
>>>>> I'm now starting to download the 44way alignment separately.  Even
>>>>> though I've explicitly set WORLDBASEPATH to use the 'biodb2' server,
>>>>> the URL still points to 'biodb'.  See the log below.  So I'm guessing that
>>>>> the XMLRPC server needs to be updated to reflect the actual locations of
>>>>> the gzipped pickled files.
>>>>>
>>>>> Thanks,
>>>>> Paul
>>>>>
>>>>> ===log===
>>>>> (2.6.2)06:00 PM 20129 
>>>>> pri...@mine-17/extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way
>>>>> $ time python downloadMSA.py
>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>> http://biodb.bioinformatics.ucla.edu/PYGRDATA/hg18_multiz44way.txt.gz to
>>>>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/hg18_multiz44way.txt.gz...
>>>>>
>>>>> INFO downloader.download_monitor: downloaded 7628292096 bytes (10.0%)...
>>>>>
>>>>>
>>>>> On Thu, Sep 3, 2009 at 6:10 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>
>>>>>> Hi Paul,
>>>>>> I am testing it and there is no problem.
>>>>>>
>>>>>> >>> from pygr import worldbase
>>>>>> >>> ponAbe2 = worldbase.Bio.Seq.Genome.PONAB.ponAbe2(download=True)
>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to
>>>>>> /home/deepreds/test/ponAbe2.gz...
>>>>>>
>>>>>> And, you may need to unset PYGRDATAPATH as well.
>>>>>>
>>>>>> Yours,
>>>>>> Namshin Kim
>>>>>>
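The two environment settings discussed above (unsetting PYGRDATAPATH and pointing WORLDBASEPATH at the local directory plus the biodb2 XMLRPC server) can be applied together from a script. The following is a rough sketch; `configure_worldbase_env` is a hypothetical helper name, and only the variable names and the server URL come from this thread.

```python
import os


def configure_worldbase_env(entries, environ=None):
    """Hypothetical helper: set WORLDBASEPATH from `entries`, drop PYGRDATAPATH."""
    if environ is None:
        environ = os.environ
    # Namshin suggests unsetting PYGRDATAPATH; pop() is a no-op if it is absent.
    environ.pop("PYGRDATAPATH", None)
    # WORLDBASEPATH is written as a comma-separated list in this thread.
    environ["WORLDBASEPATH"] = ",".join(entries)
    return environ["WORLDBASEPATH"]


# Demonstrated on a plain dict so the real process environment is untouched.
demo_env = {"PYGRDATAPATH": "/some/old/path"}
value = configure_worldbase_env(
    [".", "http://biodb2.bioinformatics.ucla.edu:5000"], demo_env
)
```

In a real script the helper would be called without the `environ` argument and before `from pygr import worldbase`, mirroring the order used elsewhere in this thread.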
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 4, 2009 at 10:04 AM, Paul Rigor (gmail) <
>>>>>> paulri...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Namshin,
>>>>>>> It's not set; my script sets that environment variable to
>>>>>>> '.,http://biodb2.bioinformatics.ucla.edu:5000'. I've made sure to remove
>>>>>>> .pygrdata files, etc.
>>>>>>> Thanks
>>>>>>> Paul
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Sep 3, 2009 at 6:02 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Paul,
>>>>>>>> Would you please give your WORLDBASEPATH?
>>>>>>>>
>>>>>>>> $ echo $WORLDBASEPATH
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Namshin Kim
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Sep 4, 2009 at 9:54 AM, Paul Rigor (gmail) <
>>>>>>>> paulri...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Namshin,
>>>>>>>>> I'm still encountering the same error as before regarding the
>>>>>>>>> download of genome resources.  Downloading the 44way alignment also
>>>>>>>>> does not download the necessary dependencies. Could there be something
>>>>>>>>> wrong with the URL paths saved in the resources?  You specified
>>>>>>>>> 'biodb2', but during the download the URLs are directed to 'biodb'.
>>>>>>>>>
>>>>>>>>> With WORLDBASEPATH not set, I get the following error.
>>>>>>>>>
>>>>>>>>> Please see:
>>>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
>>>>>>>>> /extra/baldig1/genomics/pygrdata/genomes/ponAbe2.tar.gz...
>>>>>>>>> INFO downloader.download_monitor: downloaded 8192 bytes (3593.0%)...
>>>>>>>>> INFO downloader.download_unpickler: Download done.
>>>>>>>>> INFO downloader.uncompress_file: untarring
>>>>>>>>> /extra/baldig1/genomics/pygrdata/genomes/ponAbe2.tar.gz...
>>>>>>>>> ...
>>>>>>>>> ReadError: not a gzip file
>>>>>>>>>
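The `ReadError: not a gzip file` above suggests that the 8192 bytes written to disk were not gzip data at all (possibly an error page from the server, though the log does not say). A quick way to check a downloaded file before untarring it is to look at its first two bytes. This is a minimal sketch using only the standard library; `looks_like_gzip` is a hypothetical helper and the demo file names and contents are made up.

```python
import gzip
import os
import tempfile

# Every gzip stream starts with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(path):
    """Return True if the file at `path` begins with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


# Demo on two throwaway files: one real gzip file, one plain-text stand-in
# for whatever a wrong URL might have returned.
workdir = tempfile.mkdtemp()
good_path = os.path.join(workdir, "good.tar.gz")
bad_path = os.path.join(workdir, "bad.tar.gz")

with gzip.open(good_path, "wb") as handle:
    handle.write(b"example payload")
with open(bad_path, "wb") as handle:
    handle.write(b"<html>not the file you wanted</html>")

good_ok = looks_like_gzip(good_path)
bad_ok = looks_like_gzip(bad_path)
```

A file that fails this check is not worth passing to the untar step, whatever its extension says.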
>>>>>>>>> However, when I set my WORLDBASEPATH to use the biodb2 server, the
>>>>>>>>> download is successful.  So there's something wrong with the
>>>>>>>>> hardcoded URL that's distributed with pygr (0.8 beta). But perhaps the
>>>>>>>>> XMLRPC resources on the server should be fixed as well.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Paul
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Sep 3, 2009 at 4:22 PM, cjlee112 <cjlee...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes, please pass on to Namshin any debugging information you can
>>>>>>>>>> supply.
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On Sep 3, 2009, at 3:26 PM, Paul Rigor (gmail) wrote:
>>>>>>>>>>
>>>>>>>>>> > Hi Chris,
>>>>>>>>>> >
>>>>>>>>>> > Does this mean that the 44way alignment distributed by the UCLA
>>>>>>>>>> > server is foobar since I ran exactly those same commands and no
>>>>>>>>>> > genome dependencies were downloaded?
>>>>>>>>>> >
>>>>>>>>>> > Thanks,
>>>>>>>>>> > Paul
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Paul Rigor
>>>>>>> Graduate Student
>>>>>>> Institute for Genomics and Bioinformatics
>>>>>>> Donald Bren School of Information and Computer Sciences
>>>>>>> University of California, Irvine
>>>>>>> http://www.paulrigor.net/
>>>>>>> http://www.ics.uci.edu/~prigor <http://www.ics.uci.edu/%7Eprigor>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>>


-- 
Paul Rigor
Graduate Student
Institute for Genomics and Bioinformatics
Donald Bren School of Information and Computer Sciences
University of California, Irvine
http://www.paulrigor.net/
http://www.ics.uci.edu/~prigor
