Re: [Bioc-devel] Empty DataFrame Causes SummarizedExperiment Constructor Error

2023-05-20 Thread Hervé Pagès

On 5/17/23 23:59, Robert Castelo wrote:

Not sure whether this is relevant, but I observed that an empty base R 
'data.frame()' gives zero-length character vectors for both row and 
column names, whereas an empty 'DataFrame()' gives a zero-length 
character vector for column names but NULL for row names. Shouldn't 
this be consistent with base R 'data.frame()'?


dimnames(data.frame())
[[1]]
character(0)

[[2]]
character(0)

dimnames(DataFrame())
[[1]]
NULL

[[2]]
character(0)


This is a feature. Handling of the rownames of a DataFrame deviates from 
data.frame as documented in ?DataFrame.


H.



robert.

--
Hervé Pagès

Bioconductor Core Team
hpages.on.git...@gmail.com

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] Empty DataFrame Causes SummarizedExperiment Constructor Error

2023-05-18 Thread Robert Castelo
FWIW, it seems to me that the constructor expects the assay data and 
the column data to be consistent with each other. Given the correct row 
names, there is no error:


SummarizedExperiment(assays = list(counts = countsMini), colData = 
DataFrame(row.names=colnames(countsMini)))

class: SummarizedExperiment
dim: 10 10
metadata(0):
assays(1): counts
rownames(10): Gene 1 Gene 2 ... Gene 9 Gene 10
rowData names(0):
colnames(10): Cell 1 Cell 2 ... Cell 9 Cell 10
colData names(0):
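
The consistency requirement described above can be sketched in base R. This is only an illustration under stated assumptions: `check_col_data` is a hypothetical helper, not part of SummarizedExperiment, and a plain `data.frame` stands in for `DataFrame`. It shows how a row-count check placed up front could produce a more readable message than "invalid rownames length".

```r
# Hypothetical helper (not from SummarizedExperiment): verify that the
# sample annotation has exactly one row per assay column.
check_col_data <- function(assay, col_data) {
    if (nrow(col_data) != ncol(assay)) {
        stop(sprintf("'colData' has %d row(s) but the assay has %d column(s)",
                     nrow(col_data), ncol(assay)), call. = FALSE)
    }
    invisible(TRUE)
}

counts <- matrix(rpois(100, 100), ncol = 10)
colnames(counts) <- paste("Cell", 1:10)

# A zero-column frame with one row per sample passes the check.
check_col_data(counts, data.frame(row.names = colnames(counts)))

# An empty frame has zero rows, so the check fails with a readable message.
msg <- tryCatch(check_col_data(counts, data.frame()),
                error = function(e) conditionMessage(e))
print(msg)
```

The point of the sketch is only that a frame with rows but no columns is a different object from a frame with neither, which is why supplying the row names makes the call succeed.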

Not sure whether this is relevant, but I observed that an empty base R 
'data.frame()' gives zero-length character vectors for both row and 
column names, whereas an empty 'DataFrame()' gives a zero-length 
character vector for column names but NULL for row names. Shouldn't 
this be consistent with base R 'data.frame()'?


dimnames(data.frame())
[[1]]
character(0)

[[2]]
character(0)

dimnames(DataFrame())
[[1]]
NULL

[[2]]
character(0)

robert.

--
Robert Castelo, PhD
Associate Professor
Dept. of Medicine and Life Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514



Re: [Bioc-devel] Empty DataFrame Causes SummarizedExperiment Constructor Error

2023-05-17 Thread Hervé Pagès
Not sure why the colData default is DataFrame(). Seems like this has 
been the default since the birth of the SummarizedExperiment class back 
in 2010 (FWIW the class was born in the GenomicRanges package). Anyways, 
it should probably be NULL, like for rowData. Can you please open an 
issue on GitHub for this? Thanks
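
One possible reading of the difference between the default and an explicit DataFrame() can be sketched in base R. The sketch rests on an assumption that is not confirmed in this thread: that the constructor builds a correctly sized colData only when the argument is missing, and otherwise uses the supplied object as is. `toy_experiment` is a hypothetical stand-in, and `data.frame` replaces `DataFrame`.

```r
# Hypothetical constructor; NOT the SummarizedExperiment source.
toy_experiment <- function(assay, col_data = data.frame()) {
    if (missing(col_data)) {
        # Argument not supplied: the written default is ignored and a
        # zero-column frame with one row per assay column is built instead.
        col_data <- data.frame(row.names = colnames(assay))
    }
    # Argument supplied: it is used unchanged, so a zero-row frame
    # cannot take ten row names and this assignment fails.
    rownames(col_data) <- colnames(assay)
    list(assay = assay, col_data = col_data)
}

counts <- matrix(rpois(100, 100), ncol = 10)
colnames(counts) <- paste("Cell", 1:10)

# Omitting the argument works.
ok <- toy_experiment(counts)

# Passing the same value as the written default does not.
res <- tryCatch(toy_experiment(counts, col_data = data.frame()),
                error = function(e) conditionMessage(e))
```

Under that assumption, the default written in the signature is effectively a placeholder, which would also be consistent with the suggestion that NULL is the better default.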


H.

--
Hervé Pagès

Bioconductor Core Team
hpages.on.git...@gmail.com



[Bioc-devel] Empty DataFrame Causes SummarizedExperiment Constructor Error

2023-05-12 Thread Dario Strbenac via Bioc-devel
Good day,

The default value of colData is DataFrame(). Not specifying an informative 
colData is fine.

library(SummarizedExperiment)

countsMini <- matrix(rpois(100, 100), ncol = 10)
colnames(countsMini) <- paste("Cell", 1:10)
rownames(countsMini) <- paste("Gene", 1:10)
# Creates the object successfully.
SummarizedExperiment(assays = list(counts = countsMini))

But, explicitly specifying an empty DataFrame triggers an error. I don't 
understand why it is not equivalent to the constructor's default.

SummarizedExperiment(assays = list(counts = countsMini), colData = DataFrame())
Error in `rownames<-`(`*tmp*`, value = .get_colnames_from_first_assay(assays)) :
  invalid rownames length

What is the subtle difference? It also seems like there could be a clearer 
error message emitted if this is caught in the right place.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia