Thanks Namshin, the MSA download seems to be proceeding without any
problems.  I'll keep you posted on any issues I may encounter. Also, it
looks like the resources are no longer built as BlastDBs.
Thanks,
Paul
On Sun, Sep 6, 2009 at 4:52 AM, Namshin Kim <deepr...@gmail.com> wrote:

> Oops... I figured out what caused the problem.
>
> We had added a UniProt mnemonic keyword for every genome assembly. But for
> ponAbe2, it looks like I used PONPA before and PONAB now. Thus, the two
> IDs point to different locations...
>
> >>> worldbase.dir('ponAbe2', matchType='r')
> ['Bio.MSA.UCSC.ponAbe2_multiz8way', 'Bio.MSA.UCSC.ponAbe2_pairwiseCalJac1',
> 'Bio.MSA.UCSC.ponAbe2_pairwiseGalGal3', 'Bio.MSA.UCSC.ponAbe2_pairwiseHg18',
> 'Bio.MSA.UCSC.ponAbe2_pairwiseHg19', 'Bio.MSA.UCSC.ponAbe2_pairwiseMm9',
> 'Bio.MSA.UCSC.ponAbe2_pairwiseMonDom4',
> 'Bio.MSA.UCSC.ponAbe2_pairwiseOrnAna1',
> 'Bio.MSA.UCSC.ponAbe2_pairwisePanTro2',
> 'Bio.MSA.UCSC.ponAbe2_pairwiseRheMac2', 'Bio.Seq.Genome.PONAB.ponAbe2',
> 'Bio.Seq.Genome.PONPA.ponAbe2']
>
> Now I have fixed it, and PONAB.ponAbe2 and PONPA.ponAbe2 will reference the
> same URL.
>
> Eventually, I will have to remove PONPA.ponAbe2 and replace all NLMSAs that
> reference it.
>
> Sometimes there is no mnemonic keyword until UniProt decides to assign one,
> so I used a temporary keyword...
>
> Yours,
> Namshin Kim
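
For readers following the thread: the duplicate registration described above can be spotted mechanically from a `worldbase.dir()` listing. The sketch below is only an illustration written for Python 3. It assumes genome IDs follow the `Bio.Seq.Genome.<MNEMONIC>.<assembly>` pattern seen in the listing, and `find_alias_conflicts` is a hypothetical helper, not part of pygr.

```python
from collections import defaultdict


def find_alias_conflicts(resource_ids, prefix="Bio.Seq.Genome."):
    """Report assemblies registered under more than one mnemonic.

    Assumes IDs look like 'Bio.Seq.Genome.<MNEMONIC>.<assembly>';
    anything else (for example the Bio.MSA.UCSC entries) is ignored.
    """
    by_assembly = defaultdict(set)
    for rid in resource_ids:
        if not rid.startswith(prefix):
            continue
        parts = rid[len(prefix):].split(".")
        if len(parts) != 2:
            continue
        mnemonic, assembly = parts
        by_assembly[assembly].add(mnemonic)
    # Keep only assemblies that have two or more distinct mnemonics.
    return {a: sorted(m) for a, m in by_assembly.items() if len(m) > 1}


# A subset of the IDs from the listing quoted above.
ids = [
    "Bio.MSA.UCSC.ponAbe2_multiz8way",
    "Bio.MSA.UCSC.ponAbe2_pairwiseHg18",
    "Bio.Seq.Genome.PONAB.ponAbe2",
    "Bio.Seq.Genome.PONPA.ponAbe2",
]
print(find_alias_conflicts(ids))  # {'ponAbe2': ['PONAB', 'PONPA']}
```

Any assembly that shows up in the result has more than one genome ID, so each ID should be checked to confirm it resolves to the same download URL.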
>
>
>
>
> On Sat, Sep 5, 2009 at 3:04 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>
>> Hi Namshin,
>> Sorry, no dice.  Still getting the same error:
>>  INFO downloader.download_unpickler: Beginning download of
>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
>>  INFO downloader.download_monitor: downloaded 8192 bytes (3593.0%)...
>> INFO downloader.download_unpickler: Download done.
>> INFO downloader.uncompress_file: untarring
>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
>>
>>
>> The chromFa.tar.gz file still does not exist.
>>
>> I made sure to start the download in a newly created folder.  Also, the
>> WORLDBASEPATH env variable was unset.
>>
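
For readers following the thread: a download of only 8192 bytes for a genome tarball, together with the "ReadError: not a gzip file" reported elsewhere in this thread, suggests the server returned something other than the archive. That it was an error page is an assumption; the log does not say. A minimal Python 3 sketch of checking a downloaded file before untarring it follows; `looks_like_gzip` and `demo` are hypothetical names, not part of pygr's downloader.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def demo():
    """Write one real gzip file and one HTML file, then check both."""
    tmpdir = tempfile.mkdtemp()
    good = os.path.join(tmpdir, "good.tar.gz")
    bad = os.path.join(tmpdir, "bad.tar.gz")
    with gzip.open(good, "wb") as handle:
        handle.write(b"ACGT")
    # Stand-in for whatever a server might send for a missing file (assumed).
    with open(bad, "wb") as handle:
        handle.write(b"<html>404 Not Found</html>")
    return looks_like_gzip(good), looks_like_gzip(bad)


print(demo())  # (True, False)
```

Running a check like this on the saved ponAbe2.tar.gz would show whether the payload is an archive at all, without having to wait for tarfile to fail.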
>>
>>
>> On Fri, Sep 4, 2009 at 7:05 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>
>>> Great, I'm re-running my script now.  Will keep you posted.
>>> Paul
>>>
>>>
>>> On Fri, Sep 4, 2009 at 6:54 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>
>>>> Hi Paul,
>>>> Would you test it again?
>>>>
>>>> I made a mistake when I first updated the server: I saved the
>>>> downloadable resources into WORLDBASEPATH (downloadable resources should be
>>>> saved into a different path, separate from WORLDBASEPATH). I thought I had
>>>> deleted all of them, but I forgot to delete the .fasta and .txt entries
>>>> (the actual SourceURLs) in WORLDBASEPATH.
>>>>
>>>> Now, it is clean. Would you test it again?
>>>>
>>>> Thanks!
>>>> Namshin Kim
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Sep 5, 2009 at 10:08 AM, Paul Rigor (gmail) <
>>>> paulri...@gmail.com> wrote:
>>>>
>>>>> Hi Namshin,
>>>>> I just wanted to point out that the MSA pulls the incorrect url for
>>>>> ponAbe2.  For example, if I now attempt to download that genome separately
>>>>> from the UCLA XMLRPC, the downloader obtains the correct file:
>>>>>
>>>>> import os
>>>>> os.environ['WORLDBASEPATH'] = '.,http://biodb2.bioinformatics.ucla.edu:5000'
>>>>> from pygr import worldbase
>>>>> g = worldbase('Bio.Seq.Genome.PONAB.ponAbe2', download=True)
>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to
>>>>> /tmp/ponAbe2.gz..
>>>>>
>>>>>
>>>>> However, as you can see from the error log of the MSA resource
>>>>> download, the downloader is pulling the wrong ponAbe2 file from the biodb
>>>>> server:
>>>>>
>>>>>
>>>>>
>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
>>>>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
>>>>>
>>>>> Thanks for looking into this!!
>>>>>
>>>>>  Paul
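
For readers following the thread: the WORLDBASEPATH value used in this message, '.,http://biodb2.bioinformatics.ucla.edu:5000', is a comma-separated list of places to look for resources. The Python 3 sketch below only illustrates how such a value breaks down. It assumes that entries starting with http:// are XMLRPC servers and everything else is a local directory. `split_worldbasepath` is a hypothetical helper and does not reproduce pygr's own parsing.

```python
def split_worldbasepath(value):
    """Split a comma-separated WORLDBASEPATH into (kind, location) pairs."""
    entries = []
    for raw in value.split(","):
        # Strip whitespace so a value wrapped across lines still parses.
        loc = raw.strip()
        if not loc:
            continue
        kind = "xmlrpc" if loc.startswith(("http://", "https://")) else "local"
        entries.append((kind, loc))
    return entries


print(split_worldbasepath(".,http://biodb2.bioinformatics.ucla.edu:5000"))
# [('local', '.'), ('xmlrpc', 'http://biodb2.bioinformatics.ucla.edu:5000')]
```

Printing the parsed entries before importing worldbase is a cheap way to confirm which server a script is really configured to query.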
>>>>>
>>>>>
>>>>> On Fri, Sep 4, 2009 at 2:43 PM, Paul Rigor (gmail) <
>>>>> paulri...@gmail.com> wrote:
>>>>>
>>>>>> Hi Namshin,
>>>>>>
>>>>>> Ah, gotcha.
>>>>>> So the downloader picks up the binaries from biodb, and when using the
>>>>>> XMLRPC server, it unpickles the URLs for the tarballs.  But I'm still
>>>>>> getting an error when the downloader attempts to pick up the ponAbe2 tar
>>>>>> file.  Could you fix the path on your server, since the URL stored by the
>>>>>> worldbase XMLRPC server is wrong?
>>>>>>
>>>>>> Please see the attached log.
>>>>>>
>>>>>> Thanks,
>>>>>> Paul
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Sep 3, 2009 at 6:29 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Paul,
>>>>>>> The actual files for all downloadable resources are saved on biodb, not
>>>>>>> biodb2. biodb2 holds the binary files for genomes and NLMSAs and is not
>>>>>>> accessible via http. The files are too big to keep on one server...
>>>>>>>
>>>>>>> biodb - only for downloading via http
>>>>>>> biodb2 - only for accessing via xmlrpc
>>>>>>>
>>>>>>> Here are the biodb URLs for public resources. You can download files
>>>>>>> from the URLs below.
>>>>>>>
>>>>>>> http://biodb.bioinformatics.ucla.edu/PYGRDATA  # text-converted binaries (NLMSA)
>>>>>>> http://biodb.bioinformatics.ucla.edu/MEGATEST  # megatest files
>>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES   # genome assemblies, compressed files
>>>>>>>
>>>>>>> Yours,
>>>>>>> Namshin Kim
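
For readers following the thread: the split described above (biodb for http downloads, biodb2 for xmlrpc access) explains why a download URL can point at biodb even when WORLDBASEPATH names biodb2. A small Python 3 sketch of that mapping follows. The `HOST_ROLES` table and the `host_role` helper are hypothetical illustrations built from the two host names in this message; nothing like them ships with pygr.

```python
from urllib.parse import urlparse

# Roles as described in the message above; the table itself is illustrative.
HOST_ROLES = {
    "biodb.bioinformatics.ucla.edu": "http download",
    "biodb2.bioinformatics.ucla.edu": "xmlrpc access",
}


def host_role(url):
    """Return the role of the server named in `url`, or 'unknown'."""
    host = urlparse(url).hostname  # lower-cased, without the port
    return HOST_ROLES.get(host, "unknown")


print(host_role("http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz"))
# http download
print(host_role("http://biodb2.bioinformatics.ucla.edu:5000"))
# xmlrpc access
```

Under this reading, a biodb URL in the download log is expected; what matters is whether the path on biodb is the right one.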
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 4, 2009 at 10:20 AM, Paul Rigor (gmail) <
>>>>>>> paulri...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Namshin,
>>>>>>>> Downloading the genome works for me too.  But downloading the
>>>>>>>> multiple alignment doesn't pull all the necessary resources.
>>>>>>>>
>>>>>>>> However, I think you might have missed the latter part of my email,
>>>>>>>> in which I described a possible error with the hard-coded XMLRPC server
>>>>>>>> URL, which points to 'biodb' instead of 'biodb2' as you described in your
>>>>>>>> instructions on this thread.
>>>>>>>>
>>>>>>>> I'm now starting to download the 44way alignment separately.  Even
>>>>>>>> though I've explicitly set WORLDBASEPATH to use the 'biodb2'
>>>>>>>> server, the URL still points to 'biodb'.  See the log below.  So I'm
>>>>>>>> guessing that the XMLRPC server needs to be updated to reflect the actual
>>>>>>>> locations of the gzipped pickled files.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Paul
>>>>>>>>
>>>>>>>> ===log===
>>>>>>>> (2.6.2)06:00 PM 20129 
>>>>>>>> pri...@mine-17/extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way
>>>>>>>> $ time python downloadMSA.py
>>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>>> http://biodb.bioinformatics.ucla.edu/PYGRDATA/hg18_multiz44way.txt.gz to
>>>>>>>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/hg18_multiz44way.txt.gz...
>>>>>>>>
>>>>>>>> INFO downloader.download_monitor: downloaded 7628292096 bytes
>>>>>>>> (10.0%)...
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Sep 3, 2009 at 6:10 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Paul,
>>>>>>>>> I am testing it and there is no problem.
>>>>>>>>>
>>>>>>>>> >>> from pygr import worldbase
>>>>>>>>> >>> ponAbe2 = worldbase.Bio.Seq.Genome.PONAB.ponAbe2(download=True)
>>>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to
>>>>>>>>> /home/deepreds/test/ponAbe2.gz...
>>>>>>>>>
>>>>>>>>> And, you may need to unset PYGRDATAPATH as well.
>>>>>>>>>
>>>>>>>>> Yours,
>>>>>>>>> Namshin Kim
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>   On Fri, Sep 4, 2009 at 10:04 AM, Paul Rigor (gmail) <
>>>>>>>>> paulri...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Namshin,
>>>>>>>>>> It's not set; my script sets that environment variable to
>>>>>>>>>> '.,http://biodb2.bioinformatics.ucla.edu:5000'. I've made sure to
>>>>>>>>>> remove .pygrdata files, etc.
>>>>>>>>>> Thanks
>>>>>>>>>> Paul
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>     On Thu, Sep 3, 2009 at 6:02 PM, Namshin Kim <
>>>>>>>>>> deepr...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Paul,
>>>>>>>>>>> Would you please give your WORLDBASEPATH?
>>>>>>>>>>>
>>>>>>>>>>> $ echo $WORLDBASEPATH
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Namshin Kim
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Sep 4, 2009 at 9:54 AM, Paul Rigor (gmail) <
>>>>>>>>>>> paulri...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Namshin,
>>>>>>>>>>>> I'm still encountering the same error as before regarding the
>>>>>>>>>>>> download of genome resources.  Downloading the 44way alignment also
>>>>>>>>>>>> does not download the necessary dependencies. Could there be something
>>>>>>>>>>>> wrong with the URL paths saved in the resources?  You specified
>>>>>>>>>>>> 'biodb2', but during the download, the URLs are directed to 'biodb'.
>>>>>>>>>>>>
>>>>>>>>>>>> With WORLDBASEPATH not set, I get the following error.
>>>>>>>>>>>>
>>>>>>>>>>>> Please see:
>>>>>>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
>>>>>>>>>>>> /extra/baldig1/genomics/pygrdata/genomes/ponAbe2.tar.gz...
>>>>>>>>>>>> INFO downloader.download_monitor: downloaded 8192 bytes (3593.0%)...
>>>>>>>>>>>> INFO downloader.download_unpickler: Download done.
>>>>>>>>>>>> INFO downloader.uncompress_file: untarring
>>>>>>>>>>>> /extra/baldig1/genomics/pygrdata/genomes/ponAbe2.tar.gz...
>>>>>>>>>>>> ...
>>>>>>>>>>>>  ReadError: not a gzip file
>>>>>>>>>>>>
>>>>>>>>>>>> However, when I set my WORLDBASEPATH to use the biodb2 server,
>>>>>>>>>>>> the download is successful.  So there's something wrong with the
>>>>>>>>>>>> hardcoded URL that's distributed with pygr (0.8 beta). But perhaps the
>>>>>>>>>>>> XMLRPC resources on the server should be fixed as well.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Paul
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Sep 3, 2009 at 4:22 PM, cjlee112 <cjlee...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, please pass on to Namshin any debugging information you
>>>>>>>>>>>>> can supply.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sep 3, 2009, at 3:26 PM, Paul Rigor (gmail) wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> > Hi Chris,
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Does this mean that the 44way alignment distributed by the UCLA
>>>>>>>>>>>>> > server is foobar since I ran exactly those same commands and no
>>>>>>>>>>>>> > genome dependencies were downloaded?
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Thanks,
>>>>>>>>>>>>> > Paul
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>>  Paul Rigor
>>>>>>>>>> Graduate Student
>>>>>>>>>> Institute for Genomics and Bioinformatics
>>>>>>>>>> Donald Bren School of Information and Computer Sciences
>>>>>>>>>> University of California, Irvine
>>>>>>>>>> http://www.paulrigor.net/
>>>>>>>>>> http://www.ics.uci.edu/~prigor


-- 
Paul Rigor
Graduate Student
Institute for Genomics and Bioinformatics
Donald Bren School of Information and Computer Sciences
University of California, Irvine
http://www.paulrigor.net/
http://www.ics.uci.edu/~prigor
