[pygr] Re: multiz44way?

Namshin Kim Sun, 06 Sep 2009 04:52:11 -0700

Oops... I figured out what caused the problem.

We added a UniProt mnemonic keyword for every genome assembly. For ponAbe2,
though, it looks like I used PONPA before and PONAB now, so the two IDs were
pointing to different locations...


>>> worldbase.dir('ponAbe2', matchType='r')
['Bio.MSA.UCSC.ponAbe2_multiz8way', 'Bio.MSA.UCSC.ponAbe2_pairwiseCalJac1',
'Bio.MSA.UCSC.ponAbe2_pairwiseGalGal3', 'Bio.MSA.UCSC.ponAbe2_pairwiseHg18',
'Bio.MSA.UCSC.ponAbe2_pairwiseHg19', 'Bio.MSA.UCSC.ponAbe2_pairwiseMm9',
'Bio.MSA.UCSC.ponAbe2_pairwiseMonDom4',
'Bio.MSA.UCSC.ponAbe2_pairwiseOrnAna1',
'Bio.MSA.UCSC.ponAbe2_pairwisePanTro2',
'Bio.MSA.UCSC.ponAbe2_pairwiseRheMac2', 'Bio.Seq.Genome.PONAB.ponAbe2',
'Bio.Seq.Genome.PONPA.ponAbe2']

I have fixed it now, so PONAB.ponAbe2 and PONPA.ponAbe2 reference the same URL.
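
In case it helps catch this kind of thing earlier, here is a rough sketch of
a consistency check: group resource IDs by assembly name and report any
group whose IDs disagree on the source URL. It works on a plain dict, not on
worldbase itself. find_conflicting_aliases is a made-up helper, and the
"before" mapping is an assumption pieced together from the URLs in the logs
quoted below (in particular, that the old PONPA entry was the one pointing at
chromFa.tar.gz).

```python
from collections import defaultdict


def find_conflicting_aliases(source_urls):
    """Group resource IDs by assembly name (the last dotted component) and
    return only the groups whose members disagree on the source URL."""
    by_assembly = defaultdict(dict)
    for resource_id, url in source_urls.items():
        assembly = resource_id.rsplit('.', 1)[-1]
        by_assembly[assembly][resource_id] = url
    return dict((assembly, ids) for assembly, ids in by_assembly.items()
                if len(set(ids.values())) > 1)


BASE = 'http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/'

# Assumed state before the fix: the two mnemonics point at different files.
before_fix = {
    'Bio.Seq.Genome.PONAB.ponAbe2': BASE + 'ponAbe2.gz',
    'Bio.Seq.Genome.PONPA.ponAbe2': BASE + 'chromFa.tar.gz',
}
# State after the fix: both mnemonics reference the same URL.
after_fix = dict((resource_id, BASE + 'ponAbe2.gz')
                 for resource_id in before_fix)

print(sorted(find_conflicting_aliases(before_fix)))  # ['ponAbe2']
print(sorted(find_conflicting_aliases(after_fix)))   # []
```

The grouping key is only the assembly name, so the alignment resources
(ponAbe2_multiz8way and so on) fall into their own groups and are not
compared against the genome entries.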

Eventually, I will have to remove PONPA.ponAbe2 and update all the NLMSAs
that reference it.
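
Until that cleanup is done, a stale URL would presumably fail the same way as
in the log quoted below: only 8192 bytes arrive and untarring stops with
"not a gzip file". A small guard that looks at the gzip magic bytes before
unpacking would make that failure more obvious. This is only a sketch using
the standard library; looks_like_gzip is a hypothetical helper, not something
pygr's downloader provides, and the "bad" file contents are invented for the
demo.

```python
import gzip
import os
import tempfile

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b'\x1f\x8b'


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, 'rb') as handle:
        return handle.read(2) == GZIP_MAGIC


# Demo: one real gzip file and one file that only has a .gz name.
tmpdir = tempfile.mkdtemp()
good = os.path.join(tmpdir, 'good.gz')
bad = os.path.join(tmpdir, 'bad.tar.gz')

handle = gzip.open(good, 'wb')
handle.write(b'>chr1\nACGT\n')
handle.close()

with open(bad, 'wb') as handle:
    handle.write(b'<html>not found</html>')  # invented stand-in content

print(looks_like_gzip(good))
print(looks_like_gzip(bad))
```

The check only says the file claims to be gzip; a truncated download would
still pass it, so it does not replace comparing the downloaded size against
the expected one.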

Sometimes there is no mnemonic keyword until UniProt assigns one, so I used
a temporary keyword...

Yours,
Namshin Kim




On Sat, Sep 5, 2009 at 3:04 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:

> Hi Namshin,
> Sorry, no dice.  Still getting the same error:
>  INFO downloader.download_unpickler: Beginning download of
> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
>  INFO downloader.download_monitor: downloaded 8192 bytes (3593.0%)...
> INFO downloader.download_unpickler: Download done.
> INFO downloader.uncompress_file: untarring
> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
>
>
> The chromFa.tar.gz file still does not exist.
>
> I made sure to start the download in a newly created folder.  Also, the
> WORLDBASEPATH env variable was unset.
>
>
>
> On Fri, Sep 4, 2009 at 7:05 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>
>> Great, I'm re-running my script now.  Will keep you posted.
>> Paul
>>
>>
>> On Fri, Sep 4, 2009 at 6:54 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>
>>> Hi Paul,
>>> Would you test it again?
>>>
>>> I made a mistake when I first updated the server: I saved downloadable
>>> resources into WORLDBASEPATH (downloadable resources should be saved into
>>> a different path, separate from WORLDBASEPATH). I thought I had deleted
>>> all of them, but I forgot to delete the .fasta and .txt ones, the actual
>>> SourceURLs in WORLDBASEPATH.
>>>
>>> Now, it is clean. Would you test it again?
>>>
>>> Thanks!
>>> Namshin Kim
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Sat, Sep 5, 2009 at 10:08 AM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>
>>>> Hi Namshin,
>>>> I just wanted to point out that the MSA pulls the incorrect url for
>>>> ponAbe2.  For example, if I now attempt to download that genome separately
>>>> from the UCLA XMLRPC, the downloader obtains the correct file:
>>>>
>>>> import os
>>>> # search the current directory first, then the UCLA XMLRPC server
>>>> os.environ['WORLDBASEPATH'] = '.,http://biodb2.bioinformatics.ucla.edu:5000'
>>>> from pygr import worldbase
>>>> g = worldbase('Bio.Seq.Genome.PONAB.ponAbe2', download=True)
>>>> INFO downloader.download_unpickler: Beginning download of
>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to
>>>> /tmp/ponAbe2.gz..
>>>>
>>>>
>>>> However, as you can see from the error log of the MSA resource download,
>>>> the downloader is pulling the wrong ponAbe2 file from the biodb server:
>>>>
>>>>
>>>>
>>>> INFO downloader.download_unpickler: Beginning download of
>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
>>>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/ponAbe2.tar.gz...
>>>>
>>>> Thanks for looking into this!!
>>>>
>>>>  Paul
>>>>
>>>>
>>>> On Fri, Sep 4, 2009 at 2:43 PM, Paul Rigor (gmail) <paulri...@gmail.com> wrote:
>>>>
>>>>> Hi Namshin,
>>>>>
>>>>> Ah, gotcha.
>>>>> So the downloader picks up the binaries from biodb, and when using the
>>>>> XMLRPC it unpickles the URL for the tarballs.  But I'm still getting an
>>>>> error when the downloader attempts to pick up the ponAbe2 tar file.  Could
>>>>> you fix the path on your server, since the URL stored by the worldbase
>>>>> XMLRPC server is wrong?
>>>>>
>>>>> Please see the attached log.
>>>>>
>>>>> Thanks,
>>>>> Paul
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Sep 3, 2009 at 6:29 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>
>>>>>> Hi Paul,
>>>>>> The actual files for all downloadable resources are saved on biodb, not
>>>>>> biodb2. biodb2 has the binary files for genomes and NLMSAs and is not
>>>>>> accessible via http. The files are too big to keep on one server...
>>>>>>
>>>>>> biodb - only for downloading via http
>>>>>> biodb2 - only for accessing via xmlrpc
>>>>>>
>>>>>> These are the biodb URLs for public resources. You can download files
>>>>>> from the URLs below.
>>>>>>
>>>>>> http://biodb.bioinformatics.ucla.edu/PYGRDATA # for text converted
>>>>>> binaries (NLMSA)
>>>>>> http://biodb.bioinformatics.ucla.edu/MEGATEST # for megatest files
>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES # for genome assemblies,
>>>>>> compressed files
>>>>>>
>>>>>> Yours,
>>>>>> Namshin Kim
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Sep 4, 2009 at 10:20 AM, Paul Rigor (gmail) <
>>>>>> paulri...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Namshin,
>>>>>>> Downloading the genome works for me too.  But downloading the
>>>>>>> multiple alignment doesn't pull all the necessary resources.
>>>>>>>
>>>>>>> However, I think you might have missed the latter part of my email in
>>>>>>> which I described a possible error with the hard-coded XMLRPC server url
>>>>>>> which points to 'biodb' instead of 'biodb2' as you've described in your
>>>>>>> instructions on this thread.
>>>>>>>
>>>>>>> I'm now starting to download the 44way alignment separately.  Even
>>>>>>> though I've explicitly set WORLDBASEPATH to use the 'biodb2'
>>>>>>> server, the url still points to 'biodb'.  See the log below.  So I'm
>>>>>>> guessing that the XMLRPC server needs to be updated to reflect the
>>>>>>> actual locations of the gzipped pickled files.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Paul
>>>>>>>
>>>>>>> ===log===
>>>>>>> (2.6.2)06:00 PM 20129 
>>>>>>> pri...@mine-17/extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way
>>>>>>> $ time python downloadMSA.py
>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>> http://biodb.bioinformatics.ucla.edu/PYGRDATA/hg18_multiz44way.txt.gz to
>>>>>>> /extra/baldig1/genomics/pygrdata/alignments/human/hg18/multiz44way/hg18_multiz44way.txt.gz...
>>>>>>>
>>>>>>> INFO downloader.download_monitor: downloaded 7628292096 bytes
>>>>>>> (10.0%)...
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Sep 3, 2009 at 6:10 PM, Namshin Kim <deepr...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Paul,
>>>>>>>> I am testing it and there is no problem.
>>>>>>>>
>>>>>>>> >>> from pygr import worldbase
>>>>>>>> >>> ponAbe2 = worldbase.Bio.Seq.Genome.PONAB.ponAbe2(download=True)
>>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/ponAbe2.gz to
>>>>>>>> /home/deepreds/test/ponAbe2.gz...
>>>>>>>>
>>>>>>>> And, you may need to unset PYGRDATAPATH as well.
>>>>>>>>
>>>>>>>> Yours,
>>>>>>>> Namshin Kim
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>   On Fri, Sep 4, 2009 at 10:04 AM, Paul Rigor (gmail) <
>>>>>>>> paulri...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Namshin,
>>>>>>>>> It's not set; my script sets that environment variable to
>>>>>>>>> '.,http://biodb2.bioinformatics.ucla.edu:5000'. I've made sure to
>>>>>>>>> remove .pygrdata files, etc.
>>>>>>>>> Thanks
>>>>>>>>> Paul
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>     On Thu, Sep 3, 2009 at 6:02 PM, Namshin Kim <
>>>>>>>>> deepr...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Paul,
>>>>>>>>>> Would you please give your WORLDBASEPATH?
>>>>>>>>>>
>>>>>>>>>> $ echo $WORLDBASEPATH
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Namshin Kim
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 4, 2009 at 9:54 AM, Paul Rigor (gmail) <
>>>>>>>>>> paulri...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Namshin,
>>>>>>>>>>> I'm still encountering the same error as before regarding the
>>>>>>>>>>> download of genome resources.  Downloading the 44way alignment also
>>>>>>>>>>> does not download the necessary dependencies.  Could there be
>>>>>>>>>>> something wrong with the URL paths saved in the resources?  You
>>>>>>>>>>> specified 'biodb2', but during the download the URLs are directed
>>>>>>>>>>> to 'biodb'.
>>>>>>>>>>>
>>>>>>>>>>> With WORLDBASEPATH not set, I get the following error.
>>>>>>>>>>>
>>>>>>>>>>> Please see:
>>>>>>>>>>> INFO downloader.download_unpickler: Beginning download of
>>>>>>>>>>> http://biodb.bioinformatics.ucla.edu/GENOMES/ponAbe2/chromFa.tar.gz to
>>>>>>>>>>> /extra/baldig1/genomics/pygrdata/genomes/ponAbe2.tar.gz...
>>>>>>>>>>> INFO downloader.download_monitor: downloaded 8192 bytes
>>>>>>>>>>> (3593.0%)...
>>>>>>>>>>> INFO downloader.download_unpickler: Download done.
>>>>>>>>>>> INFO downloader.uncompress_file: untarring
>>>>>>>>>>> /extra/baldig1/genomics/pygrdata/genomes/ponAbe2.tar.gz...
>>>>>>>>>>> ...
>>>>>>>>>>>  ReadError: not a gzip file
>>>>>>>>>>>
>>>>>>>>>>> However, when I set my WORLDBASEPATH to use the biodb2 server,
>>>>>>>>>>> the download is successful.  So there's something wrong with the
>>>>>>>>>>> hardcoded url that's distributed with pygr (0.8 beta). But perhaps
>>>>>>>>>>> the xmlrpc resources on the server should be fixed as well.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Paul
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Sep 3, 2009 at 4:22 PM, cjlee112 <cjlee...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, please pass on to Namshin any debugging information you can
>>>>>>>>>>>> supply.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On Sep 3, 2009, at 3:26 PM, Paul Rigor (gmail) wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> > Hi Chris,
>>>>>>>>>>>> >
>>>>>>>>>>>> > Does this mean that the 44way alignment distributed by the UCLA
>>>>>>>>>>>> > server is foobar since I ran exactly those same commands and no
>>>>>>>>>>>> > genome dependencies were downloaded?
>>>>>>>>>>>> >
>>>>>>>>>>>> > Thanks,
>>>>>>>>>>>> > Paul
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>  Paul Rigor
>>>>>>>>> Graduate Student
>>>>>>>>> Institute for Genomics and Bioinformatics
>>>>>>>>> Donald Bren School of Information and Computer Sciences
>>>>>>>>> University of California, Irvine
>>>>>>>>> http://www.paulrigor.net/
>>>>>>>>> http://www.ics.uci.edu/~prigor

