Re: [Bioc-devel] Best object structure for representing a pairwise genome alignment ?
Hi Charles, Vince,

Yes, a PairwiseAlignments object will contain the sequences of the 2 genomes being aligned, so it will be big. This could be mitigated by using one object per chromosome instead of trying to represent the full genome alignment in a single object, but then you lose the ability to represent regions that align across chromosomes.

Other downsides of using PairwiseAlignments are:

- You lose the nice/simple block-to-block mapping that GRangePairs gives you, together with the easy/straightforward way to annotate the links between blocks (via the metadata columns of the GRangePairs).

- A PairwiseAlignments object can only represent replacements and indels, while the block-to-block mapping in a GRangePairs object can support rearrangements (in addition to indels and replacements).

- The GRangePairs approach even allows you to represent a many-to-many relationship between the blocks/regions of the 2 genomes, something that a PairwiseAlignments-based approach cannot do.

So the GRangePairs approach seems more flexible.

Maybe a better way to support an arbitrary relationship between the blocks/regions of the 2 genomes would be to use a 3-slot data structure: 2 slots for 2 GRanges objects defining regions on the 2 genomes + 1 slot for representing the links between the regions defined on each genome (these links could be stored in a Hits object). Note that this is a classic bipartite graph. It would particularly make sense if the mapping between the regions is expected to be many-to-many.

This kind of container would be able to represent a side-by-side comparison of 2 arbitrary genomes, in its more general form, not just a pairwise genome alignment, which is more restrictive.

Cheers,
H.
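A minimal sketch of the 3-slot container described above, to make the idea concrete. The class name GenomeLinks and the slot names genome1, genome2 and links are hypothetical (nothing here is an existing Bioconductor class). The coordinates are the first three blocks from Charles's example, and the fourth link is invented purely to show a one-to-many relationship.

```r
library(S4Vectors)      # Hits(), from(), to(), nLnode(), nRnode()
library(GenomicRanges)  # GRanges(), IRanges()

## Hypothetical container: regions on each genome plus the links between them.
setClass("GenomeLinks",
    representation(
        genome1 = "GRanges",  # regions defined on the first genome
        genome2 = "GRanges",  # regions defined on the second genome
        links   = "Hits"      # edges of the bipartite graph
    ),
    validity = function(object) {
        ## The node counts of the Hits must match the regions on each side.
        if (nLnode(object@links) != length(object@genome1) ||
            nRnode(object@links) != length(object@genome2))
            return("'links' must have one left/right node per region")
        TRUE
    }
)

genome1 <- GRanges("S1",
    IRanges(start = c(162, 1833, 3769), end = c(550, 3738, 4212)),
    strand = "+")
genome2 <- GRanges("XSR",
    IRanges(start = c(909374, 910181, 913510),
            end   = c(909853, 913291, 913953)))

## One link per aligned block, plus one invented link (block 2 of genome 1
## also linked to block 3 of genome 2) to illustrate a one-to-many mapping.
links <- Hits(from = c(1L, 2L, 2L, 3L), to = c(1L, 2L, 3L, 3L),
              nLnode = length(genome1), nRnode = length(genome2),
              sort.by.query = TRUE)

x <- new("GenomeLinks", genome1 = genome1, genome2 = genome2, links = links)

## Regions of genome 2 linked to block 2 of genome 1.
x@genome2[to(x@links)[from(x@links) == 2L]]
```

Annotations on the links (a score, for instance) could be carried as metadata columns on the Hits object, much as GRangePairs carries them on the pairs.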
On 9/18/20 02:41, Vincent Carey wrote:
> Starting from
>
> PairwiseAlignments-class      package:Biostrings      R Documentation
>
> PairwiseAlignments, PairwiseAlignmentsSingleSubject, and
> PairwiseAlignmentsSingleSubjectSummary objects
>
> Description:
>
>      The ‘PairwiseAlignments’ class is a container for storing a set of
>      pairwise alignments.
>
>      The ‘PairwiseAlignmentsSingleSubject’ class is a container for
>      storing a set of pairwise alignments with a single subject.
>
>      The ‘PairwiseAlignmentsSingleSubjectSummary’ class is a container
>      for storing the summary of a set of pairwise alignments.
>
> Usage:
>
>      ## Constructors:
>      ## When subject is missing, pattern must be of length 2
>      ## S4 method for signature 'XString,XString'
>      PairwiseAlignments(pattern, subject,
>          type = "global", substitutionMatrix = NULL, gapOpening = 0,
>          gapExtension = 1)
>      ## S4 method for signature 'XStringSet,missing'
>      PairwiseAlignments(pattern, subject,
>          type = "global", substitutionMatrix = NULL, gapOpening = 0,
>          gapExtension = 1)
>      ## S4 method for signature 'character,character'
>      PairwiseAlignments(pattern, subject,
>          type = "global", substitutionMatrix = NULL, gapOpening = 0,
>          gapExtension = 1,
>          baseClass = "BString")
>
> ...
>
> my question would be whether this is a relevant starting place? Clearly
> the focus is not on coordinates, but perhaps a structure that maintains
> genomic content and coordinates together would be of use?
>
>
> On Fri, Sep 18, 2020 at 2:49 AM Charles Plessy wrote:
>
>> Dear Bioc developers,
>>
>> I am currently analysing pairwise genome alignments with Bioconductor,
>> and I represent them with a GRanges object of the first genome,
>> containing one element by alignment block, and storing the coordinates
>> in the other genome in a metadata column containing another GRanges object.
>>
>> Something like this.
>>
>> GRanges object with 36582 ranges and 2 metadata columns:
>>           seqnames      ranges strand |     score                query
>>       [1]       S1     162-550      + |       861    XSR:909374-909853
>>       [2]       S1   1833-3738      + |      7238    XSR:910181-913291
>>       [3]       S1   3769-4212      + |      1165    XSR:913510-913953
>>       [4]       S1   4246-4381      + |       359    XSR:914134-914275
>>       [5]       S1   4532-5990      + |      2977 chr2:6694031-6695569
>>       ...      ...         ...    ... .       ...                  ...
>>   [36578]      S99 17228-17759      - |       793 chr1:2375870-2376379
>>   [36579]      S99 16417-16935      - |       632 chr1:2376612-2377077
>>   [36580]      S99 12370-12759      - |       773 chr1:2379949-2380343
>>   [36581]      S99   5270-5384      - |       295   chr1:843397-843511
>>   [36582]      S99   1949-3053      - |      2105   chr1:845358-846326
>>   ---
>>
>> Using "Pairwise genome alignment" as a keyword in a search engine, I
>> found that the package CNEr is doing something similar, although it
>> uses a dedicated "GRangePairs" object for the purpose.
>>
>> Before I start to invest
Re: [Bioc-devel] "length(url) == 1 is not TRUE" error for vignette only on build system
Hi all,

I've updated the cache calls in netDx per Martin's suggestion above:

> bfc <- BiocFileCache()
> path <- bfcrpath(bfc, url)

The package now builds on netbbiolo but times out on malbec1 and tokay1, and has an error on merida1. I've never had a package time out before. Is this because of the updated cache calls? A mix of outcomes on different platforms!

The error on the Windows machine (riesling) seems related to a message Herve sent me, about java no longer being on the path.

Any guidance on how to resolve this would be appreciated.

Thanks,
Shraddha

On Mon, Sep 14, 2020 at 12:21 PM Shepherd, Lori <lori.sheph...@roswellpark.org> wrote:

> To follow up on this:
>
> Please check the package code for adding/accessing resources in the netDx
> package. If the BiocFileCache options were used correctly there should not
> have been multiple entries added to the builders cache as Martin
> demonstrated below.
>
> Once the package has been updated to ensure duplicate entries will not be
> added, we can clean up the builders. The package code should be adjusted
> first to ensure that users will not encounter the same situation.
>
>
> Lori Shepherd
>
> Bioconductor Core Team
>
> Roswell Park Comprehensive Cancer Center
>
> Department of Biostatistics & Bioinformatics
>
> Elm & Carlton Streets
>
> Buffalo, New York 14263
> --
> *From:* Bioc-devel on behalf of Martin Morgan
> *Sent:* Friday, September 11, 2020 6:22 PM
> *To:* Pages, Herve; Shraddha Pai <shraddha@utoronto.ca>
> *Cc:* bioc-devel@r-project.org
> *Subject:* Re: [Bioc-devel] "length(url) == 1 is not TRUE" error for
> vignette only on build system
>
> bfcquery() just searches the cache, so if you've created two resources
> that match the name then you end up with two rids
>
> > xx = bfcnew(bfc, "foo")
> > bfcquery(bfc, "foo", "rname")
> # A tibble: 1 x 10
>   rid   rname create_time access_time rpath rtype fpath last_modified_t… etag
> 1 BFC3  foo   2020-09-11… 2020-09-11… /Use… rela… 64ea… NA               NA
> # … with 1 more variable: expires
> > xx = bfcnew(bfc, "foo_bar")
> > bfcquery(bfc, "foo", "rname")
> # A tibble: 2 x 10
>   rid   rname create_time access_time rpath rtype fpath last_modified_t… etag
> 1 BFC3  foo   2020-09-11… 2020-09-11… /Use… rela… 64ea… NA               NA
> 2 BFC4  foo_… 2020-09-11… 2020-09-11… /Use… rela… 64ea… NA               NA
> # … with 1 more variable: expires
>
> and even
>
> > xx = bfcnew(bfc, "foo_bar")
> > bfcquery(bfc, "foo", "rname")
> # A tibble: 3 x 10
>   rid   rname create_time access_time rpath rtype fpath last_modified_t… etag
> 1 BFC3  foo   2020-09-11… 2020-09-11… /Use… rela… 64ea… NA               NA
> 2 BFC4  foo_… 2020-09-11… 2020-09-11… /Use… rela… 64ea… NA               NA
> 3 BFC5  foo_… 2020-09-11… 2020-09-11… /Use… rela… 64ea… NA               NA
> # … with 1 more variable: expires
>
> If the cache is under netDx control, then you can be careful about
> creating new resources by checking first whether the resource exists, as
> outlined in
>
> http://bioconductor.org/packages/devel/bioc/vignettes/BiocFileCache/inst/doc/BiocFileCache.html#cache-to-manage-package-data
>
> If one is managing internet resources, just use bfcadd
>
> > bfc <- BiocFileCache()
> > path <- bfcrpath(bfc, url)
> adding rname 'https://bioconductor.org/index.html'
> |==| 100%
>
> > path <- bfcrpath(bfc, url)
> >
>
> where the first time 'url' is used (as a unique key) the resource is
> downloaded to the cache; the second time it is simply accessed from the
> cache.
>
> It might be necessary to manually 'clean up' the builders
>
> Martin
>
>
> On 9/11/20, 4:10 PM, "Bioc-devel on behalf of Pages, Herve" <
> bioc-devel-boun...@r-project.org on behalf of hpa...@fredhutch.org> wrote:
>
>     I'm not a BiocFileCache expert, sorry. You will probably get better
>     help by opening a BiocFileCache issue on GitHub.
>
>     Cheers,
>     H.
>
>
>     On 9/11/20 12:00, Shraddha Pai wrote:
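The "check first whether the resource exists" advice in the quoted message can be sketched as below. This is only an illustration under stated assumptions: it uses a throwaway cache in a temporary directory rather than the default cache, get_or_create() is a hypothetical helper (not part of BiocFileCache or netDx), and it relies on the exact = TRUE argument of bfcquery() to avoid the partial matching shown in Martin's output.

```r
library(BiocFileCache)

## Throwaway cache so the sketch does not touch the default user cache.
bfc <- BiocFileCache(tempfile(), ask = FALSE)

## Hypothetical helper: return the cached path for 'rname', creating the
## resource only when no entry with exactly that name exists yet.
get_or_create <- function(bfc, rname) {
    hit <- bfcquery(bfc, rname, field = "rname", exact = TRUE)
    if (nrow(hit) == 0L)
        return(unname(bfcnew(bfc, rname)))
    hit$rpath[1]
}

p1 <- get_or_create(bfc, "foo")      # creates the entry
p2 <- get_or_create(bfc, "foo")      # reuses it, no duplicate is added
p3 <- get_or_create(bfc, "foo_bar")  # different name, so a new entry

## 'foo' was created only once, so the cache holds two resources in total.
bfccount(bfc)

## A default (non-exact) query on "foo" still matches both names, which is
## why a bare bfcquery() result cannot be assumed to have length 1.
nrow(bfcquery(bfc, "foo", field = "rname"))
```

For resources that live at a URL, the single call `bfcrpath(bfc, url)` shown in the thread covers the same ground, since the URL acts as the unique key.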
Re: [Bioc-devel] Update R package I developed which has been released by bioconductor
Bioconductor would not have your package on GitHub. Bioconductor has your repository on a git server. If you already have a clone of your GitHub repository, you could follow the instructions here to add the remote to the Bioconductor repository you referenced:

http://bioconductor.org/developers/how-to/git/sync-existing-repositories/

so:

  git clone
  cd epihet

Make sure you do the

  git fetch --all
  git pull upstream master

before making any changes!! This will make sure you get the release version bumps the core team performed and that everything is up-to-date with our version of your repository at git.bioconductor.org.

Once you add the remote upstream, if you do

  git remote -v

origin should point to your GitHub and upstream should point to Bioconductor. It would be something similar to

  origin https://github.com/ .git (fetch)
  origin https://github.com/.git (push)
  upstream g...@git.bioconductor.org:packages/epihet (fetch)
  upstream g...@git.bioconductor.org:packages/epihet (push)

So when you push changes:

  git push origin master

would update YOUR GitHub; it is saying git push to the origin, which is your GitHub, on the master branch.

To update on Bioconductor:

  git push upstream master

it is saying git push changes to the upstream, which is the Bioconductor remote, master branch (devel).

Make sure the version number has a version bump! Our version is at 1.5.0, so you would have to bump the z of version x.y.z for it to register on Bioconductor (>= 1.5.1).

Hope that helps clarify.

Lori Shepherd
Bioconductor Core Team
Roswell Park Comprehensive Cancer Center
Department of Biostatistics & Bioinformatics
Elm & Carlton Streets
Buffalo, New York 14263

From: Bioc-devel on behalf of Xiaowen Chen
Sent: Monday, September 21, 2020 8:57 AM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] Update R package I developed which has been released by bioconductor

Hi,

I submitted an R package called epihet in 2018. It has been accepted and released on Bioconductor. Recently, we received the reviewers' comments about the corresponding paper for epihet, and the reviewers need us to add more parameters for some functions in epihet. Could you tell me what I should do, step by step?

Now I know the links

1. http://bioconductor.org/developers/how-to/git/sync-existing-repositories/ for syncing an existing GitHub repository with Bioconductor
2. http://bioconductor.org/developers/how-to/git/bug-fix-in-release-and-devel/ for fixing bugs in devel and release

but I am still not sure which GitHub repository I need to git clone on my local machine, my own GitHub or the Bioconductor GitHub, and I did not find my package "epihet" on the Bioconductor GitHub.

Thank you.

Best,
Xiaowen

---
The information in this email, including attachments, may be confidential and is intended solely for the addressee(s). If you believe you received this email by mistake, please notify the sender by return email as soon as possible.
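The version bump mentioned in the reply above (from 1.5.0 to at least 1.5.1) can be scripted. A minimal sketch in base R, run against a throwaway DESCRIPTION file rather than the real package; bump_z() is a hypothetical helper, not a Bioconductor tool, and the two-field DESCRIPTION is a stand-in for the real file.

```r
## Hypothetical helper: increment the last component (the "z" of x.y.z).
bump_z <- function(version) {
    parts <- unlist(strsplit(version, ".", fixed = TRUE))
    parts[length(parts)] <- as.integer(parts[length(parts)]) + 1L
    paste(parts, collapse = ".")
}

## Stand-in DESCRIPTION file in a temporary directory.
desc_file <- file.path(tempdir(), "DESCRIPTION")
writeLines(c("Package: epihet", "Version: 1.5.0"), desc_file)

## Read the fields, bump the version, and write the file back.
desc <- read.dcf(desc_file)
old_version <- desc[1, "Version"]
desc[1, "Version"] <- bump_z(old_version)
write.dcf(desc, desc_file)

## The new version must be strictly greater than the previous one.
new_version <- read.dcf(desc_file, fields = "Version")[1, 1]
package_version(new_version) > package_version(old_version)
```

After a bump like this, the change would be committed and pushed to both remotes as described in the reply.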
[Bioc-devel] Update R package I developed which has been released by bioconductor
Hi,

I submitted an R package called epihet in 2018. It has been accepted and released on Bioconductor. Recently, we received the reviewers’ comments about the corresponding paper for epihet, and the reviewers need us to add more parameters for some functions in epihet. Could you tell me what I should do, step by step?

Now I know the links

1. http://bioconductor.org/developers/how-to/git/sync-existing-repositories/ for syncing an existing GitHub repository with Bioconductor
2. http://bioconductor.org/developers/how-to/git/bug-fix-in-release-and-devel/ for fixing bugs in devel and release

but I am still not sure which GitHub repository I need to git clone on my local machine, my own GitHub or the Bioconductor GitHub, and I did not find my package “epihet” on the Bioconductor GitHub.

Thank you.

Best,
Xiaowen

---
The information in this email, including attachments, may be confidential and is intended solely for the addressee(s). If you believe you received this email by mistake, please notify the sender by return email as soon as possible.

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel