Re: [Bioc-devel] Best object structure for representing a pairwise genome alignment ?

2020-09-21 Thread Pages, Herve
Hi Charles, Vince,

Yes, a PairwiseAlignments object will contain the sequences of the 2 
genomes being aligned, so it will be big. This could be mitigated by using 
one object per chromosome instead of trying to represent the full genome 
alignment in a single object, but then you lose the ability to 
represent regions that align across chromosomes.

Other downsides of using PairwiseAlignments are:
- You lose the nice/simple block-to-block mapping that GRangePairs 
gives you, together with the easy/straightforward way to annotate the 
links between blocks (via the metadata columns of the GRangePairs).
- A PairwiseAlignments object can only represent replacements and indels, 
while the block-to-block mapping in a GRangePairs object can support 
rearrangements (in addition to indels and replacements).
- The GRangePairs approach even allows you to represent a many-to-many 
relationship between the blocks/regions of the 2 genomes, something that 
a PairwiseAlignments-based approach cannot do.

So the GRangePairs approach seems more flexible.

Maybe a better way to support an arbitrary relationship between the 
blocks/regions of the 2 genomes would be to use a 3-slot data structure: 
2 slots for 2 GRanges objects defining regions on the 2 genomes + 1 slot 
for representing the links between the regions defined on each genome 
(these links could be stored in a Hits object). Note that this is a 
classic bipartite graph. Would particularly make sense if the mapping 
between the regions is expected to be many-to-many. This kind of 
container would be able to represent a side-by-side comparison of 2 
arbitrary genomes, in its more general form, not just a pairwise genome 
alignment, which is more restrictive.

Cheers,
H.

On 9/18/20 02:41, Vincent Carey wrote:
> Starting from
> 
> PairwiseAlignments-class  package:Biostrings   R Documentation
> 
> PairwiseAlignments, PairwiseAlignmentsSingleSubject, and
> PairwiseAlignmentsSingleSubjectSummary objects
> 
> Description:
> 
>   The ‘PairwiseAlignments’ class is a container for storing a set of
>   pairwise alignments.
> 
>   The ‘PairwiseAlignmentsSingleSubject’ class is a container for
>   storing a set of pairwise alignments with a single subject.
> 
>   The ‘PairwiseAlignmentsSingleSubjectSummary’ class is a container
>   for storing the summary of a set of pairwise alignments.
> 
> Usage:
> 
>   ## Constructors:
>   ## When subject is missing, pattern must be of length 2
>   ## S4 method for signature 'XString,XString'
>   PairwiseAlignments(pattern, subject,
>       type = "global", substitutionMatrix = NULL, gapOpening = 0,
>       gapExtension = 1)
>   ## S4 method for signature 'XStringSet,missing'
>   PairwiseAlignments(pattern, subject,
>       type = "global", substitutionMatrix = NULL, gapOpening = 0,
>       gapExtension = 1)
>   ## S4 method for signature 'character,character'
>   PairwiseAlignments(pattern, subject,
>       type = "global", substitutionMatrix = NULL, gapOpening = 0,
>       gapExtension = 1,
>       baseClass = "BString")
> 
> ...
> 
> my question would be whether this is a relevant starting place?  Clearly
> the focus is not on coordinates, but perhaps a structure that maintains
> genomic content and coordinates together would be of use?
> 
> 
> On Fri, Sep 18, 2020 at 2:49 AM Charles Plessy 
> wrote:
> 
>> Dear Bioc developers,
>>
>> I am currently analysing pairwise genome alignments with Bioconductor,
>> and I represent them with a GRanges object of the first genome,
>> containing one element by alignment block, and storing the coordinates
>> in the other genome in a metadata column containing another GRanges object.
>>
>> Something like this.
>>
>> GRanges object with 36582 ranges and 2 metadata columns:
>>           seqnames      ranges strand |     score                query
>>       [1]       S1     162-550      + |       861    XSR:909374-909853
>>       [2]       S1    833-3738      + |      7238    XSR:910181-913291
>>       [3]       S1   3769-4212      + |      1165    XSR:913510-913953
>>       [4]       S1   4246-4381      + |       359    XSR:914134-914275
>>       [5]       S1   4532-5990      + |      2977 chr2:6694031-6695569
>>       ...      ...         ...    ... .       ...                  ...
>>   [36578]      S99 17228-17759      - |       793 chr1:2375870-2376379
>>   [36579]      S99 16417-16935      - |       632 chr1:2376612-2377077
>>   [36580]      S99 12370-12759      - |       773 chr1:2379949-2380343
>>   [36581]      S99   5270-5384      - |       295   chr1:843397-843511
>>   [36582]      S99   1949-3053      - |      2105   chr1:845358-846326
>>   ---
>>
>> Using "Pairwise genome alignment" as a keyword in a search engine, I
>> found that the CNEr package does something similar, although it
>> uses a dedicated "GRangePairs" object for the purpose.
>>
>> Before I start to invest 

Re: [Bioc-devel] "length(url) == 1 is not TRUE" error for vignette only on build system

2020-09-21 Thread Shraddha Pai
Hi all,
I've updated the cache calls in netDx per Martin's suggestion above:
> bfc <- BiocFileCache()
> path <- bfcrpath(bfc, url)

The package now builds on netbbiolo but times out on malbec1 and tokay1,
and has an error on merida1.
I've never had a package timeout before. Is this because of the updated
cache calls?
A mix of outcomes on different platforms!

The error on the Windows machine (riesling) seems related to a message
Herve sent me, about java no longer being on the path.

Any guidance on how to resolve would be appreciated.
Thanks,
Shraddha

On Mon, Sep 14, 2020 at 12:21 PM Shepherd, Lori <
lori.sheph...@roswellpark.org> wrote:

> To follow up on this:
>
> Please check the package code for adding/accessing resources in the netDx
> package.  If the BiocFileCache options were used correctly there should not
> have been multiple entries added to the builders cache as Martin
> demonstrated below.
>
> Once the package has been updated to ensure duplicate entries will not be
> added, we can clean up the builders.  The package code should be adjusted
> first to ensure that users will not encounter the same situation.
>
>
> Lori Shepherd
>
> Bioconductor Core Team
>
> Roswell Park Comprehensive Cancer Center
>
> Department of Biostatistics & Bioinformatics
>
> Elm & Carlton Streets
>
> Buffalo, New York 14263
> --
> *From:* Bioc-devel  on behalf of Martin
> Morgan 
> *Sent:* Friday, September 11, 2020 6:22 PM
> *To:* Pages, Herve ; Shraddha Pai <
> shraddha@utoronto.ca>
> *Cc:* bioc-devel@r-project.org 
> *Subject:* Re: [Bioc-devel] "length(url) == 1 is not TRUE" error for
> vignette only on build system
>
> bfcquery() just searches the cache, so if you've created two resources
> that match the name then you end up with two rids
>
> > xx = bfcnew(bfc, "foo")
> > bfcquery(bfc, "foo", "rname")
> # A tibble: 1 x 10
>   rid   rname create_time access_time rpath rtype fpath last_modified_t… etag
> 1 BFC3  foo   2020-09-11… 2020-09-11… /Use… rela… 64ea…               NA NA
> # … with 1 more variable: expires
> > xx = bfcnew(bfc, "foo_bar")
> > bfcquery(bfc, "foo", "rname")
> # A tibble: 2 x 10
>   rid   rname create_time access_time rpath rtype fpath last_modified_t… etag
> 1 BFC3  foo   2020-09-11… 2020-09-11… /Use… rela… 64ea…               NA NA
> 2 BFC4  foo_… 2020-09-11… 2020-09-11… /Use… rela… 64ea…               NA NA
> # … with 1 more variable: expires
>
> and even
>
> > xx = bfcnew(bfc, "foo_bar")
> > bfcquery(bfc, "foo", "rname")
> # A tibble: 3 x 10
>   rid   rname create_time access_time rpath rtype fpath last_modified_t… etag
> 1 BFC3  foo   2020-09-11… 2020-09-11… /Use… rela… 64ea…               NA NA
> 2 BFC4  foo_… 2020-09-11… 2020-09-11… /Use… rela… 64ea…               NA NA
> 3 BFC5  foo_… 2020-09-11… 2020-09-11… /Use… rela… 64ea…               NA NA
> # … with 1 more variable: expires
>
> If the cache is under netDx control, then you can be careful about
> creating new resources by checking first whether the resource exists, as
> outlined in
> http://bioconductor.org/packages/devel/bioc/vignettes/BiocFileCache/inst/doc/BiocFileCache.html#cache-to-manage-package-data
>
> If one is managing internet resources, just use bfcrpath(), which adds
> the resource (via bfcadd) when it is not yet in the cache
>
> > bfc <- BiocFileCache()
> > path <- bfcrpath(bfc, url)
> adding rname 'https://bioconductor.org/index.html'
>   |==| 100%
>
> > path <- bfcrpath(bfc, url)
> >
>
> where the first time 'url' is used (as a unique key) the resource is
> downloaded to the cache; the second time it is simply accessed from the
> cache.
>
> It might be necessary to manually 'clean up' the builders
>
> Martin
>
>
> On 9/11/20, 4:10 PM, "Bioc-devel on behalf of Pages, Herve" <
> bioc-devel-boun...@r-project.org on behalf of hpa...@fredhutch.org> wrote:
>
> I'm not a BiocFileCache expert, sorry. You will probably get better
> help by opening a BiocFileCache issue on GitHub.
>
> Cheers,
> H.
>
>
> On 9/11/20 12:00, Shraddha Pai wrote:
> 

Re: [Bioc-devel] Update R package I developed which has been released by bioconductor

2020-09-21 Thread Shepherd, Lori
Bioconductor does not keep your package on GitHub; it keeps your repository on 
its own git server (git.bioconductor.org). If you already have a clone of your 
GitHub repository, you can follow the instructions you referenced to add the 
Bioconductor repository as a remote:
http://bioconductor.org/developers/how-to/git/sync-existing-repositories/

So:
git clone <your GitHub repository>
cd epihet

Make sure you run

git fetch --all
git pull upstream master

before making any changes! This makes sure you get the release version bumps 
the core team performed and that everything is up to date with our version of 
your repository at git.bioconductor.org.

Once you have added the upstream remote, git remote -v should show that
origin points to your GitHub repository and
upstream points to Bioconductor.
It would look something like this:

origin https://github.com/ .git (fetch)
origin https://github.com/.git (push)
upstream g...@git.bioconductor.org:packages/epihet (fetch)
upstream g...@git.bioconductor.org:packages/epihet (push)


So when you push changes:

git push origin master
updates YOUR GitHub repository: it pushes the master branch to origin, which 
is your GitHub remote.

To update Bioconductor:
git push upstream master
This pushes your changes to upstream, which is the Bioconductor remote, on its 
master branch (devel).
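
The two-remote workflow described above can be tried end to end without touching the real servers. The following is only a sketch under stated assumptions: two local bare repositories stand in for GitHub and git.bioconductor.org, and the starting version 1.4.0, the demo identity, and the commit messages are made up for illustration (only 1.5.0 and 1.5.1 come from this thread).

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
# Placeholder identity, used only for the demo commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org

# Bare repositories standing in for the two real remotes (assumption).
git init -q --bare "$work/github.git"   # plays the role of your GitHub repo
git init -q --bare "$work/bioc.git"     # plays the role of git.bioconductor.org

# Seed both stand-ins, then add a core-team-style version bump to the
# Bioconductor stand-in only, so that GitHub is one commit behind.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
printf 'Package: epihet\nVersion: 1.4.0\n' > DESCRIPTION   # 1.4.0 is illustrative
git add DESCRIPTION
git commit -q -m "initial import"
git push -q "$work/github.git" master
git push -q "$work/bioc.git" master
printf 'Package: epihet\nVersion: 1.5.0\n' > DESCRIPTION
git commit -q -am "core team version bump"
git push -q "$work/bioc.git" master

# The workflow itself: clone your own repository, add upstream, sync first.
git clone -q --branch master "$work/github.git" "$work/epihet"
cd "$work/epihet"
git remote add upstream "$work/bioc.git"
git fetch -q --all
git pull -q upstream master             # brings in the 1.5.0 bump

# Make the change, bump z, and push to both remotes.
printf 'Package: epihet\nVersion: 1.5.1\n' > DESCRIPTION
git commit -q -am "add parameters; bump version to 1.5.1"
git push -q origin master               # updates the GitHub stand-in
git push -q upstream master             # updates the Bioconductor stand-in
git remote -v
```

Against the real servers, the only differences should be the remote URLs and the authentication set up through the Bioconductor git instructions linked above.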

Make sure the version number is bumped! Our version is at 1.5.0, so you would 
have to bump the z of version x.y.z for the change to register on 
Bioconductor (>= 1.5.1).
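
The z bump can be done by hand in the DESCRIPTION file; as a minimal sketch, it could also be scripted. The helper name bump_z and the throwaway DESCRIPTION contents below are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
set -eu

# bump_z FILE: print FILE with the z of "Version: x.y.z" incremented by one.
# Hypothetical helper; assumes a plain three-part x.y.z version.
bump_z() {
    awk -F'[.]' '
        /^Version:/ { printf "%s.%s.%d\n", $1, $2, $3 + 1; next }
        { print }
    ' "$1"
}

# Demo on a throwaway DESCRIPTION (contents are illustrative only).
tmp=$(mktemp -d)
printf 'Package: epihet\nVersion: 1.5.0\n' > "$tmp/DESCRIPTION"
bump_z "$tmp/DESCRIPTION" > "$tmp/DESCRIPTION.new"
mv "$tmp/DESCRIPTION.new" "$tmp/DESCRIPTION"
cat "$tmp/DESCRIPTION"
```

Only the line starting with "Version:" is rewritten; every other line is passed through unchanged.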

Hope that helps clarify.




Lori Shepherd

Bioconductor Core Team

Roswell Park Comprehensive Cancer Center

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


From: Bioc-devel  on behalf of Xiaowen Chen 

Sent: Monday, September 21, 2020 8:57 AM
To: bioc-devel@r-project.org 
Subject: [Bioc-devel] Update R package I developed which has been released by 
bioconductor

Hi,
I submitted an R package called epihet in 2018. It has been accepted and 
released on Bioconductor.
Recently, we received the reviewers’ comments on the corresponding paper for 
epihet, and the reviewer needs us to add more parameters to some functions in 
epihet.
Could you tell me what I should do, step by step? I know of these links:

  1.  http://bioconductor.org/developers/how-to/git/sync-existing-repositories/
for syncing an existing GitHub repository with Bioconductor
  2.  http://bioconductor.org/developers/how-to/git/bug-fix-in-release-and-devel/
for fixing bugs in devel and release

but I am still not sure which GitHub repository I need to git clone on my local 
machine, my own GitHub or the Bioconductor GitHub; I did not find my package 
“epihet” on the Bioconductor GitHub.

Thank you.
Best,
Xiaowen

---

The information in this email, including attachments, may be confidential and 
is intended solely for the addressee(s). If you believe you received this email 
by mistake, please notify the sender by return email as soon as possible.

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


This email message may contain legally privileged and/or confidential 
information.  If you are not the intended recipient(s), or the employee or 
agent responsible for the delivery of this message to the intended 
recipient(s), you are hereby notified that any disclosure, copying, 
distribution, or use of this email message is prohibited.  If you have received 
this message in error, please notify the sender immediately by e-mail and 
delete this email message from your computer. Thank you.


[Bioc-devel] Update R package I developed which has been released by bioconductor

2020-09-21 Thread Xiaowen Chen
Hi,
I submitted an R package called epihet in 2018. It has been accepted and 
released on Bioconductor.
Recently, we received the reviewers’ comments on the corresponding paper for 
epihet, and the reviewer needs us to add more parameters to some functions in 
epihet.
Could you tell me what I should do, step by step? I know of these links:

  1.  http://bioconductor.org/developers/how-to/git/sync-existing-repositories/
for syncing an existing GitHub repository with Bioconductor
  2.  http://bioconductor.org/developers/how-to/git/bug-fix-in-release-and-devel/
for fixing bugs in devel and release

but I am still not sure which GitHub repository I need to git clone on my local 
machine, my own GitHub or the Bioconductor GitHub; I did not find my package 
“epihet” on the Bioconductor GitHub.

Thank you.
Best,
Xiaowen
