Re: [Bioc-devel] GenomeInfoDb::Seqinfo(genome) broken?

2016-11-03 Thread Raymond Cavalcante
Awesome, thanks for the quick fix!

> On Nov 3, 2016, at 14:50, Hervé Pagès  wrote:
> 
> Should now be fixed in release (GenomeInfoDb 1.10.1) and devel
> (GenomeInfoDb 1.11.2).
> 
> H.
> 
> -- 
> Hervé Pagès
> 
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
> 
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Re: [Bioc-devel] GenomeInfoDb::Seqinfo(genome) broken?

2016-11-03 Thread Hervé Pagès

Should now be fixed in release (GenomeInfoDb 1.10.1) and devel
(GenomeInfoDb 1.11.2).
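
To check whether an installed copy already includes the fix, one can compare its version string against the two numbers above. This is a minimal sketch in base R. `has_seqinfo_fix()` is a hypothetical helper, and treating 1.10.x as the release series and 1.11.x as the devel series is an assumption read off the version numbers quoted here.

```r
# Hypothetical helper: TRUE if `version` is at or past the fixed version
# for its series (1.10.1 for release, 1.11.2 for devel, as stated above).
has_seqinfo_fix <- function(version,
                            release_fix = "1.10.1",
                            devel_fix = "1.11.2") {
    v <- package_version(version)
    # Assumption: anything from 1.11.0 upwards belongs to the devel series.
    if (v >= "1.11.0") v >= devel_fix else v >= release_fix
}

has_seqinfo_fix("1.10.0")
has_seqinfo_fix("1.11.2")
```

Calling `has_seqinfo_fix(as.character(packageVersion("GenomeInfoDb")))` would report on the locally installed package.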

H.

On 11/03/2016 11:12 AM, Hervé Pagès wrote:
> I'll look at this. I think Seqinfo(genome="hg19") needs to query
> NCBI to get some information (e.g. SequenceRole) that allows ordering
> the sequences in the returned Seqinfo in the "natural" order.
>
> H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


Re: [Bioc-devel] GenomeInfoDb::Seqinfo(genome) broken?

2016-11-03 Thread Hervé Pagès

I'll look at this. I think Seqinfo(genome="hg19") needs to query
NCBI to get some information (e.g. SequenceRole) that allows ordering
the sequences in the returned Seqinfo in the "natural" order.
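
To illustrate what a "natural" order means for sequence names, here is a small sketch in base R. It sorts on the names alone and does not use NCBI's SequenceRole, so it is not how GenomeInfoDb orders sequences. `natural_order()` and the "chr" prefix convention are assumptions made for the example.

```r
# Hypothetical helper: put numbered chromosomes first in numeric order,
# then X, Y, M, then everything else alphabetically.
natural_order <- function(seqnames) {
    core <- sub("^chr", "", seqnames)
    num <- suppressWarnings(as.integer(core))   # NA for non-numeric names
    special <- match(core, c("X", "Y", "M"))    # NA for anything else
    key <- ifelse(!is.na(num), num,
                  1000L + ifelse(is.na(special), 99L, special))
    seqnames[order(key, seqnames)]
}

natural_order(c("chr10", "chrX", "chr2", "chrM", "chr1", "chrY"))
```

A plain `sort()` would place "chr10" before "chr2", which is the ordering problem the extra information is meant to avoid.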

H.

On 11/03/2016 05:47 AM, Michael Lawrence wrote:
> I think this is because the NCBI server switched to https (via a
> redirect that I guess the R url() connection fails to follow). The
> reason rtracklayer still works is that it's only querying UCSC.
> GenomeInfoDb also queries NCBI to get the mappings to the NCBI
> seqlevels. Does that really need to happen when only getting the
> Seqinfo?



Re: [Bioc-devel] What are the differences between Bioc Views: MethylSeq, DNAMethylation and DifferentialMethylation

2016-11-03 Thread Marcin Kosiński
Links to those Bioc Views:

   - http://bioconductor.org/packages/devel/BiocViews.html#___MethylSeq
   - http://bioconductor.org/packages/devel/BiocViews.html#___DNAMethylation
   - http://bioconductor.org/packages/devel/BiocViews.html#___DifferentialMethylation


2016-11-03 16:56 GMT+01:00 Marcin Kosiński :

> Dear Bioc devs,
>
> I am looking for a tool to jointly analyze methylation and expression data
> from TCGA.
>
> As I haven't previously used R to find differentially methylated bases or
> regions, I thought Bioc Views would be a good place to start looking for a
> tool to analyze the methylation data.
>
> I am wondering: what are the differences between these 3 Bioc Views:
> *MethylSeq, DNAMethylation and DifferentialMethylation*? Aren't they all
> meant to list packages that aim to manage, analyze and visualize methylation
> datasets? Can't there be one Bioc View called `Methylation`?
>
> Thanks for the response and the indulgence,
> Marcin
>


[Bioc-devel] What are the differences between Bioc Views: MethylSeq, DNAMethylation and DifferentialMethylation

2016-11-03 Thread Marcin Kosiński
Dear Bioc devs,

I am looking for a tool to jointly analyze methylation and expression data
from TCGA.

As I haven't previously used R to find differentially methylated bases or
regions, I thought Bioc Views would be a good place to start looking for a
tool to analyze the methylation data.

I am wondering: what are the differences between these 3 Bioc Views:
*MethylSeq, DNAMethylation and DifferentialMethylation*? Aren't they all
meant to list packages that aim to manage, analyze and visualize methylation
datasets? Can't there be one Bioc View called `Methylation`?

Thanks for the response and the indulgence,
Marcin



Re: [Bioc-devel] GenomeInfoDb::Seqinfo(genome) broken?

2016-11-03 Thread Michael Lawrence
I think this is because the NCBI server switched to https (via a
redirect that I guess the R url() connection fails to follow). The
reason rtracklayer still works is that it's only querying UCSC.
GenomeInfoDb also queries NCBI to get the mappings to the NCBI
seqlevels. Does that really need to happen when only getting the
Seqinfo?
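
One way to test the redirect guess above is to look at the raw response headers for the http address. This is a sketch under assumptions: `https_redirect_target()` is a hypothetical helper, and the headers are expected in the form returned by base R's `curlGetHeaders(url, redirect = FALSE)`, with `url` standing in for whichever address is being queried.

```r
# Hypothetical helper: return the https target of a redirect found in a
# character vector of response header lines, or NA if there is none.
https_redirect_target <- function(headers) {
    loc <- grep("^location:", headers, ignore.case = TRUE, value = TRUE)
    if (length(loc) == 0L) return(NA_character_)
    # Drop the field name and surrounding whitespace (including "\r\n").
    target <- trimws(sub("^location:", "", loc[1L], ignore.case = TRUE))
    if (startsWith(tolower(target), "https://")) target else NA_character_
}

# Made-up headers for illustration, not a captured server response.
example_headers <- c("HTTP/1.1 301 Moved Permanently\r\n",
                     "Location: https://example.org/data.txt\r\n",
                     "\r\n")
https_redirect_target(example_headers)
```

A non-NA result for an http address would be consistent with the explanation that the connection is not following the redirect.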


On Thu, Nov 3, 2016 at 5:13 AM, Raymond Cavalcante  wrote:
> Hello,
>
> Sometime yesterday calls like GenomeInfoDb::Seqinfo(genome = 'hg19') stopped 
> working with the error:
>
>> Error in file(file, "rt") : cannot open the connection
>
> From the documentation, that call relies on fetchExtendedChromInfoFromUCSC() 
> and requires an internet connection, which I had and continue to have. I'm 
> not really sure how to deal with this problem because the goldenPath link 
> still works
> (http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/chromInfo.txt.gz),
> so something else is broken...
>
> Oddly, calls to rtracklayer::import.bed() that specify a genome work. I don't 
> have any BSgenome packages installed where I'm running it, and from the 
> documentation for genome, "An attempt will be made to derive the ‘seqinfo’ on 
> the return value using either an installed BSgenome package or UCSC, if 
> network access is available." So I would guess that rtracklayer::import.bed() 
> would use the same fetchExtendedChromInfoFromUCSC()...?
>
> On a related note, is there a non-BSgenome package that has the chromosome 
> length / seqinfo information that doesn't require an internet connection 
> (other than to download the package)? BSgenome is too large to require of 
> users just for chromosome lengths. The org.db packages have chromosome 
> lengths, but only with respect to one genome version for that organism, and 
> from the documentation it isn't clear which version.
>
> Thanks,
> Raymond Cavalcante

Re: [Bioc-devel] GenomeInfoDb::Seqinfo(genome) broken?

2016-11-03 Thread Raymond Cavalcante
> GenomeInfoDb::Seqinfo(genome='hg19')
Error in file(file, "rt") : cannot open the connection
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin15.5.0 (64-bit)
Running under: OS X 10.12.1 (Sierra)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] GenomeInfoDb_1.10.0 IRanges_2.8.0   S4Vectors_0.12.0   
[4] BiocGenerics_0.20.0

> On Nov 3, 2016, at 08:19, Vincent Carey  wrote:
> 
> sessionInfo()?

Re: [Bioc-devel] GenomeInfoDb::Seqinfo(genome) broken?

2016-11-03 Thread Vincent Carey
sessionInfo()?

On Thu, Nov 3, 2016 at 5:13 AM, Raymond Cavalcante wrote:

> Hello,
>
> Sometime yesterday calls like GenomeInfoDb::Seqinfo(genome = 'hg19')
> stopped working with the error:
>
> > Error in file(file, "rt") : cannot open the connection
>
> From the documentation, that call relies on fetchExtendedChromInfoFromUCSC()
> and requires an internet connection, which I had and continue to have. I'm
> not really sure how to deal with this problem because the goldenPath link
> still works
> (http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/chromInfo.txt.gz),
> so something else is broken...
>
> Oddly, calls to rtracklayer::import.bed() that specify a genome work. I
> don't have any BSgenome packages installed where I'm running it, and from
> the documentation for genome, "An attempt will be made to derive the
> ‘seqinfo’ on the return value using either an installed BSgenome package or
> UCSC, if network access is available." So I would guess that
> rtracklayer::import.bed() would use the same
> fetchExtendedChromInfoFromUCSC()...?
>
> On a related note, is there a non-BSgenome package that has the chromosome
> length / seqinfo information that doesn't require an internet connection
> (other than to download the package)? BSgenome is too large to require of
> users just for chromosome lengths. The org.db packages have chromosome
> lengths, but only with respect to one genome version for that organism, and
> from the documentation it isn't clear which version.
>
> Thanks,
> Raymond Cavalcante

Re: [Bioc-devel] A handful of check to follow up on R CMD BiocCheck

2016-11-03 Thread Kevin RUE
Apologies for the additional spam, for two reasons:

   - The diff files that I've previously sent had the base and modified
   versions swapped. This new one fixes that.
   - This new diff file (always relative to the code I cloned from
   Bioconductor-mirror) also fixes a bug whereby the updated code would fail
   on packages that do not break any of the three guidelines.

Best,
Kevin

On Thu, Nov 3, 2016 at 11:49 AM, Kevin RUE  wrote:

> Hi all,
>
> Please find attached the diff relative to the code that I cloned from
> Bioconductor-mirror yesterday (please ignore the previous diff file).
>
> Basically three new features:
>
>    - As per previous email: display up to the first 6 lines that are over
>    80 characters long
>    - *New*: display up to the first 6 lines that are not indented by a
>    multiple of 4 spaces
>    - *New*: display up to the first 6 lines that use TAB instead of 4
>    spaces for indentation
>
> I also attach the output of the updated code, to illustrate the changes.
>
> Notes:
>
>    - For demonstration purpose, I indented a handful of lines from the
>    checks.R file itself with TAB characters. I assume that's OK, as some lines
>    were already longer than 80 characters and not indented by a multiple of 4
>    spaces.
>
> All the best,
> Kevin
>
diff --git a/R/checks.R b/R/checks.R
index 9b1f273..69c61ea 100644
--- a/R/checks.R
+++ b/R/checks.R
@@ 

[Bioc-devel] GenomeInfoDb::Seqinfo(genome) broken?

2016-11-03 Thread Raymond Cavalcante
Hello,

Sometime yesterday calls like GenomeInfoDb::Seqinfo(genome = 'hg19') stopped 
working with the error:

> Error in file(file, "rt") : cannot open the connection

From the documentation, that call relies on fetchExtendedChromInfoFromUCSC() 
and requires an internet connection, which I had and continue to have. I'm not 
really sure how to deal with this problem because the goldenPath link still 
works (http://hgdownload.cse.ucsc.edu/goldenpath/hg19/database/chromInfo.txt.gz),
so something else is broken...

Oddly, calls to rtracklayer::import.bed() that specify a genome work. I don't 
have any BSgenome packages installed where I'm running it, and from the 
documentation for genome, "An attempt will be made to derive the ‘seqinfo’ on 
the return value using either an installed BSgenome package or UCSC, if network 
access is available." So I would guess that rtracklayer::import.bed() would use 
the same fetchExtendedChromInfoFromUCSC()...?

On a related note, is there a non-BSgenome package that has the chromosome 
length / seqinfo information that doesn't require an internet connection (other 
than to download the package)? BSgenome is too large to require of users just 
for chromosome lengths. The org.db packages have chromosome lengths, but only 
with respect to one genome version for that organism, and from the 
documentation it isn't clear which version.

Thanks,
Raymond Cavalcante

Re: [Bioc-devel] A handful of check to follow up on R CMD BiocCheck

2016-11-03 Thread Kevin RUE
Hi all,

Please find attached the diff relative to the code that I cloned from
Bioconductor-mirror yesterday (please ignore the previous diff file).

Basically three new features:

   - As per previous email: display up to the first 6 lines that are over
   80 characters long
   - *New*: display up to the first 6 lines that are not indented by a
   multiple of 4 spaces
   - *New*: display up to the first 6 lines that use TAB instead of 4
   spaces for indentation
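
The three checks listed above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the code in the attached diff or in BiocCheck itself: `find_offenders()`, its arguments, and the returned data.frame layout are all hypothetical.

```r
# Hypothetical helper: for a character vector of source lines, report up to
# `n` offending line numbers for each of the three formatting rules.
find_offenders <- function(lines, file = "example.R", n = 6L) {
    # Leading whitespace of each line ("" when there is none).
    indent <- sub("^([ \t]*).*$", "\\1", lines)
    has_tab <- grepl("\t", indent, fixed = TRUE)
    rules <- list(
        too_long   = nchar(lines) > 80L,
        tab_indent = has_tab,
        # Space-only indentation that is not a multiple of 4.
        bad_indent = !has_tab & nchar(indent) %% 4L != 0L
    )
    lapply(rules, function(hit) {
        idx <- head(which(hit), n)
        data.frame(file = rep(file, length(idx)), line = idx,
                   stringsAsFactors = FALSE)
    })
}

src <- c("x <- 1", "\ty <- 2", "   z <- 3", "    ok()")
find_offenders(src)
```

Reporting only the file name and line number follows the suggestion made earlier in the thread that those two are enough to track down a line.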

I also attach the output of the updated code, to illustrate the changes.

Notes:

   - For demonstration purpose, I indented a handful of lines from the
   checks.R file itself with TAB characters. I assume that's OK, as some lines
   were already longer than 80 characters and not indented by a multiple of 4
   spaces.

All the best,
Kevin

On Wed, Nov 2, 2016 at 10:00 PM, Kevin RUE  wrote:

> Me again :)
>
> Please find attached the first patch to print the first 6 lines over 80
> characters long. (I'll get to the tabulation offenders next).
>
> Note that all the offending lines are stored in the "df.length"
> data.frame. How about an option like "fullReport=c(FALSE, TRUE)" that prints
> *all* the offending lines?
> The data.frame also stores the content of the lines for the record, but
> does not print them. I think Kasper is right: filename and line should be
> enough to track down the line.
>
> All the best,
> Kevin
>
>
>
> On Wed, Nov 2, 2016 at 8:08 PM, Kevin RUE  wrote:
>
>> Thanks for the feedback!
>>
>> I also tend to prefer *all* the lines being reported (or to be honest,
>> that was really true when I had lots of them; a problem that I largely
>> mitigated by fixing all of them once and subsequently paying more attention
>> while developing).
>>
>> Printing the content of the offending line somewhat helps me spot the
>> line faster (more so for tab issues). But I must admit that showing the
>> whole line is somewhat "overkill". I just started thinking of a compromise
>> being to only show the first N characters of the line, with N being 80
>> minus the number of characters necessary to print the filename and line
>> number.
>>
>> Thanks Martin for pointing out the lines in BiocCheck. (Now I feel bad
>> for not having checked sooner.. hehe!)
>> I think the idea of BiocCheck showing the first 6 offenders is
>> quite nice, as I rarely have more since I use the RStudio "Tools >
>> Global Options > Code > Display > Show Margin > Margin column: 80" feature.
>>
>> I'll give a go at both approaches (developing BiocCheck and my own
>> scripts)
>>
>> Cheers,
>> Kevin
>>
>>
>> On Wed, Nov 2, 2016 at 7:41 PM, Kasper Daniel Hansen <
>> kasperdanielhan...@gmail.com> wrote:
>>
>>> I would prefer all line numbers reported, but on the other hand I am
>>> indifferent wrt. the content of the line, unless (say) TABs are marked up
>>> somehow.
>>>
>>> Kasper
>>>
>>> On Wed, Nov 2, 2016 at 3:17 PM, Martin Morgan <
>>> martin.mor...@roswellpark.org> wrote:
>>>
>>>> On 11/02/2016 02:49 PM, Kevin RUE wrote:
>>>>
>>>>> Dear all,
>>>>>
>>>>> Just thought I'd share a handful of scripts that I wrote to follow up
>>>>> on certain NOTE messages thrown by R CMD BiocCheck.
>>>>>
>>>>> https://github.com/kevinrue/BiocCheckTools
>>>>>
>>>>> They're very simple, but I occasionally find them quite convenient.
>>>>> Apologies if something similar already exists somewhere :)
>>>>
>>>> Maybe consider creating a diff against the source code that, e.g.,
>>>> reported the first 6 offenders? The relevant lines are near
>>>>
>>>> https://github.com/Bioconductor-mirror/BiocCheck/blob/master/R/checks.R#L1081
>>>>
>>>> Martin
>>>>
>>>>> All the best,
>>>>> Kevin
diff --git a/R/checks.R b/R/checks.R
index 02e3841..9b1f273 100644
--- a/R/checks.R
+++ b/R/checks.R
@@ -1057,15 +1057,12 @@ checkFormatting <- function(pkgdir)
 tablines <- 0L
 badindentlines <- 0L
 ok <- TRUE
-
-df.length <- data.frame(stringsAsFactors=FALSE)
-df.indent <- data.frame(stringsAsFactors=FALSE)
-df.tab <- data.frame(stringsAsFactors=FALSE)
+
 for (file in files)
 {
-pkgname <- getPkgNameFromPkgDir(pkgdir)
 if (file.exists(file) && file.info(file)$size == 0)
 {
+pkgname <- getPkgNameFromPkgDir(pkgdir)
 handleNote(sprintf("Add content to the empty file %s.",
 mungeName(file, pkgname)))
 }
@@ -1075,22