Re: [Bioc-devel] Empty DataFrame Causes SummarizedExperiment Constructor Error
On 5/17/23 23:59, Robert Castelo wrote:

> not sure whether this is relevant, but I observed that while an empty base R 'data.frame()' constructor gives zero-length character vectors for row and column names, the empty 'DataFrame()' constructor gives also a zero-length character vector for column names, but NULL for row names, shouldn't this be consistent with base R 'data.frame()'?
>
> dimnames(data.frame())
> [[1]]
> character(0)
>
> [[2]]
> character(0)
>
> dimnames(DataFrame())
> [[1]]
> NULL
>
> [[2]]
> character(0)

This is a feature. Handling of the rownames of a DataFrame deviates from data.frame as documented in ?DataFrame.

H.

> robert.
>
> On 5/17/23 20:45, Hervé Pagès wrote:
> > Not sure why the colData default is DataFrame(). Seems like this has been the default since the birth of the SummarizedExperiment class back in 2010 (FWIW the class was born in the GenomicRanges package). Anyways, it should probably be NULL, like for rowData. Can you please open an issue on GitHub for this?
> >
> > Thanks
> > H.
> >
> > On 5/12/23 07:00, Dario Strbenac via Bioc-devel wrote:
> > > Good day,
> > >
> > > The default value of colData is DataFrame(). Not specifying an informative colData is fine.
> > >
> > > countsMini <- matrix(rpois(100, 100), ncol = 10)
> > > colnames(countsMini) <- paste("Cell", 1:10)
> > > rownames(countsMini) <- paste("Gene", 1:10)
> > > SummarizedExperiment(assays = list(counts = countsMini))
> > > # Creates the object successfully.
> > >
> > > But, explicitly specifying an empty DataFrame triggers an error. I don't understand why it is not equivalent to the constructor's default.
> > >
> > > SummarizedExperiment(assays = list(counts = countsMini), colData = DataFrame())
> > > Error in `rownames<-`(`*tmp*`, value = .get_colnames_from_first_assay(assays)) :
> > >   invalid rownames length
> > >
> > > What is the subtle difference? It also seems like there could be a clearer error message emitted if this is caught in the right place.
> > >
> > > --
> > > Dario Strbenac
> > > University of Sydney
> > > Camperdown NSW 2050
> > > Australia

--
Hervé Pagès
Bioconductor Core Team
hpages.on.git...@gmail.com

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
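The row-name difference discussed in this message can be checked directly. The following is a minimal sketch, assuming the Bioconductor package S4Vectors is installed; the helper `has_no_rows()` is a hypothetical name introduced only for illustration and does not come from the thread or from any package.

```r
# Assumes S4Vectors (Bioconductor) is installed; it provides DataFrame().
library(S4Vectors)

df_base <- data.frame()
df_bioc <- DataFrame()

# Both objects have zero rows and zero columns.
dim(df_base)
dim(df_bioc)

# Base R reports zero-length row names; DataFrame reports none at all (NULL),
# matching the dimnames() output quoted in the thread.
rownames(df_base)
rownames(df_bioc)

# Hypothetical helper: code that has to accept either class can test the
# row count rather than the row names, which sidesteps the difference.
has_no_rows <- function(x) nrow(x) == 0L

has_no_rows(df_base)
has_no_rows(df_bioc)
```

Testing `nrow()` rather than `rownames()` is one way to stay agnostic about which of the two conventions an object follows.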
Re: [Bioc-devel] Empty DataFrame Causes SummarizedExperiment Constructor Error
FWIW, it seems to me that the constructor expects the integrity between the assay data and the column data. giving the correct row names, there's no error:

SummarizedExperiment(assays = list(counts = countsMini), colData = DataFrame(row.names=colnames(countsMini)))
class: SummarizedExperiment
dim: 10 10
metadata(0):
assays(1): counts
rownames(10): Gene 1 Gene 2 ... Gene 9 Gene 10
rowData names(0):
colnames(10): Cell 1 Cell 2 ... Cell 9 Cell 10
colData names(0):

not sure whether this is relevant, but I observed that while an empty base R 'data.frame()' constructor gives zero-length character vectors for row and column names, the empty 'DataFrame()' constructor gives also a zero-length character vector for column names, but NULL for row names, shouldn't this be consistent with base R 'data.frame()'?

dimnames(data.frame())
[[1]]
character(0)

[[2]]
character(0)

dimnames(DataFrame())
[[1]]
NULL

[[2]]
character(0)

robert.

On 5/17/23 20:45, Hervé Pagès wrote:
> Not sure why the colData default is DataFrame(). Seems like this has been the default since the birth of the SummarizedExperiment class back in 2010 (FWIW the class was born in the GenomicRanges package). Anyways, it should probably be NULL, like for rowData. Can you please open an issue on GitHub for this?
>
> Thanks
> H.
>
> On 5/12/23 07:00, Dario Strbenac via Bioc-devel wrote:
> > Good day,
> >
> > The default value of colData is DataFrame(). Not specifying an informative colData is fine.
> >
> > countsMini <- matrix(rpois(100, 100), ncol = 10)
> > colnames(countsMini) <- paste("Cell", 1:10)
> > rownames(countsMini) <- paste("Gene", 1:10)
> > SummarizedExperiment(assays = list(counts = countsMini))
> > # Creates the object successfully.
> >
> > But, explicitly specifying an empty DataFrame triggers an error. I don't understand why it is not equivalent to the constructor's default.
> >
> > SummarizedExperiment(assays = list(counts = countsMini), colData = DataFrame())
> > Error in `rownames<-`(`*tmp*`, value = .get_colnames_from_first_assay(assays)) :
> >   invalid rownames length
> >
> > What is the subtle difference? It also seems like there could be a clearer error message emitted if this is caught in the right place.
> >
> > --
> > Dario Strbenac
> > University of Sydney
> > Camperdown NSW 2050
> > Australia

--
Robert Castelo, PhD
Associate Professor
Dept. of Medicine and Life Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
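The workaround described in this message can be wrapped in a small function. This is only a sketch, assuming the SummarizedExperiment package is installed and that the assay matrix has column names; `empty_col_data()` is a hypothetical helper name, not part of any package.

```r
# Assumes SummarizedExperiment (Bioconductor) is installed; attaching it
# also makes DataFrame() available.
library(SummarizedExperiment)

countsMini <- matrix(rpois(100, 100), ncol = 10)
colnames(countsMini) <- paste("Cell", 1:10)
rownames(countsMini) <- paste("Gene", 1:10)

# Hypothetical helper: a zero-column DataFrame with one row per assay column,
# so that its row names agree with the assay's column names.
empty_col_data <- function(assay) {
  stopifnot(!is.null(colnames(assay)))
  DataFrame(row.names = colnames(assay))
}

se <- SummarizedExperiment(
  assays  = list(counts = countsMini),
  colData = empty_col_data(countsMini)
)

# colData has one row per sample and no columns.
dim(colData(se))
```

The point of the helper is that the "empty" colData still carries the right number of rows, which is the integrity between assay and column data mentioned above.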
Re: [Bioc-devel] Empty DataFrame Causes SummarizedExperiment Constructor Error
Not sure why the colData default is DataFrame(). Seems like this has been the default since the birth of the SummarizedExperiment class back in 2010 (FWIW the class was born in the GenomicRanges package). Anyways, it should probably be NULL, like for rowData. Can you please open an issue on GitHub for this?

Thanks
H.

On 5/12/23 07:00, Dario Strbenac via Bioc-devel wrote:
> Good day,
>
> The default value of colData is DataFrame(). Not specifying an informative colData is fine.
>
> countsMini <- matrix(rpois(100, 100), ncol = 10)
> colnames(countsMini) <- paste("Cell", 1:10)
> rownames(countsMini) <- paste("Gene", 1:10)
> SummarizedExperiment(assays = list(counts = countsMini))
> # Creates the object successfully.
>
> But, explicitly specifying an empty DataFrame triggers an error. I don't understand why it is not equivalent to the constructor's default.
>
> SummarizedExperiment(assays = list(counts = countsMini), colData = DataFrame())
> Error in `rownames<-`(`*tmp*`, value = .get_colnames_from_first_assay(assays)) :
>   invalid rownames length
>
> What is the subtle difference? It also seems like there could be a clearer error message emitted if this is caught in the right place.
>
> --
> Dario Strbenac
> University of Sydney
> Camperdown NSW 2050
> Australia

--
Hervé Pagès
Bioconductor Core Team
hpages.on.git...@gmail.com
[Bioc-devel] Empty DataFrame Causes SummarizedExperiment Constructor Error
Good day,

The default value of colData is DataFrame(). Not specifying an informative colData is fine.

countsMini <- matrix(rpois(100, 100), ncol = 10)
colnames(countsMini) <- paste("Cell", 1:10)
rownames(countsMini) <- paste("Gene", 1:10)
SummarizedExperiment(assays = list(counts = countsMini))
# Creates the object successfully.

But, explicitly specifying an empty DataFrame triggers an error. I don't understand why it is not equivalent to the constructor's default.

SummarizedExperiment(assays = list(counts = countsMini), colData = DataFrame())
Error in `rownames<-`(`*tmp*`, value = .get_colnames_from_first_assay(assays)) :
  invalid rownames length

What is the subtle difference? It also seems like there could be a clearer error message emitted if this is caught in the right place.

--
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia
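In R, a function can tell an omitted argument apart from one that was passed explicitly with the same value as its default, by calling `missing()`. The base R sketch below shows how that can make "default" and "explicit default" behave differently. Whether the SummarizedExperiment constructor does this is an assumption; the thread does not say. `make_container()` and its arguments are hypothetical names that stand in for a constructor, and the error text is likewise illustrative.

```r
# Hypothetical constructor, NOT the SummarizedExperiment source code.
make_container <- function(values, labels = character(0)) {
  if (missing(labels)) {
    # Argument omitted: the nominal default is ignored and the labels
    # are derived from the data instead.
    labels <- names(values)
  } else if (length(labels) != length(values)) {
    # Argument supplied: it is validated as given, even if it happens to
    # equal the default, and a descriptive error is raised on mismatch.
    stop("'labels' must have one entry per value (got ", length(labels),
         ", expected ", length(values), ")")
  }
  list(values = values, labels = labels)
}

x <- c(a = 1, b = 2, c = 3)

# Omitting 'labels' succeeds because the default is replaced internally.
ok <- make_container(x)

# Passing the same value as the default takes the validation path and fails.
res <- tryCatch(
  make_container(x, labels = character(0)),
  error = function(e) conditionMessage(e)
)
res
```

Checking the lengths up front, as the hypothetical `stop()` call does, is also one way to produce the kind of clearer error message asked for above.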