On Thu, May 14, 2009 at 5:19 AM, Michael Lawrence <[email protected]> wrote:
>
> On Thu, May 14, 2009 at 5:07 AM, Nicolas Delhomme <[email protected]> wrote:
>
>> Hi Michael,
>>
>> Thanks a lot. This is working very nicely. However, the user has to make
>> sure that the query and subject are ordered in the same way in order to
>> use the generated index properly.
>>
>> For example, my subject has the names: "2L" "2R" "3L" "3R" "4" "X" and
>> the query: "4" "X" "3R" "2R" "2L" "3L". The overlap function takes care of
>> this and compares the right spaces. It returns a RangesMatchingList with
>> names:
>> "4" "X" "3R" "2R" "2L" "3L". This means that when I export the result as
>> a matrix, the indices will be corrupted.
>>
>
> Yes, I noted this in the documentation. I am not sure I would say the
> results are "corrupted".
>
>>
>> I can think of two solutions:
>> Either a warning should be emitted (when doing the overlap, if the
>> names are not ordered the same),
>> or, and that would be my preferred solution, the RangesMatchingList
>> should have an additional slot holding the mapping index from the query
>> names to the subject names. This could then be used by the as.matrix
>> method to return the "correct" indices. This should make it safe for the
>> case where the user does not provide a query and a subject ordered in the
>> same way. And it should be robust to the cases where the query and subject
>> spaces are not entirely identical.
>>
>
> I was thinking of doing something like this. Thanks for giving me the
> motivation to actually do it.
>

Just checked this into the devel branch (trunk).

>
> Michael
>
>>
>> Well, this is just my two cents' worth as I'm not (yet) so familiar with
>> the code.
>>
>> Best,
>>
>> ---------------------------------------------------------------
>> Nicolas Delhomme
>>
>> High Throughput Functional Genomics Center
>>
>> European Molecular Biology Laboratory
>>
>> Tel: +49 6221 387 8426
>> Email: [email protected]
>> Meyerhofstrasse 1 - Postfach 10.2209
>> 69102 Heidelberg, Germany
>> ---------------------------------------------------------------
>>
>> On 13 May 2009, at 22:20, Michael Lawrence wrote:
>>
>>>
>>> On Wed, May 13, 2009 at 7:05 AM, Nicolas Delhomme <[email protected]>
>>> wrote:
>>> Hi all,
>>>
>>> I've got the impression that the as.matrix method of the
>>> RangesMatchingList does not work as it should.
>>>
>>> I have a RangesMatchingList which I obtained by using the overlap (from
>>> the RangesList class) function that takes two RangesList as input. When I
>>> apply as.matrix() on the RangesMatchingList, it gives me the following
>>> error:
>>>
>>> Error in .Method(..., deparse.level = deparse.level) :
>>>   number of rows of matrices must match (see arg 2)
>>>
>>> The function is pretty easy:
>>>
>>> setMethod("as.matrix", "RangesMatchingList", function(x) {
>>>   cbind(space = space(x), do.call(cbind, lapply(x, as.matrix)))
>>> })
>>>
>>> When I replace the cbind in the do.call by an rbind, it's already better
>>>
>>> Thanks, yes this was a bug. As the documentation states,
>>> RangesMatchingList was considered experimental and not something that was
>>> really tested. But I should have done a better job.
>>>
>>> Warning message:
>>> In .Method(..., deparse.level = deparse.level) :
>>>   number of rows of result is not a multiple of vector length (arg 1)
>>>
>>> This is due to the fact that space(x) returns many more spaces than there
>>> are overlaps.
>>>
>>> This is a bug in space().
>>>
>>> I could solve that by changing the function into:
>>>
>>> setMethod("as.matrix", "RangesMatchingList", function(x) {
>>>   # Build one block per space, tagging each row with the space name
>>>   do.call(rbind, lapply(seq_along(x), function(i) {
>>>     mat <- as.matrix(x[[i]])
>>>     cbind(space = rep(names(x)[[i]], nrow(mat)), mat)
>>>   }))
>>> })
>>>
>>> Now, I do not know if I might have a particular use-case (having a
>>> RangesMatchingList coming from the RangesList overlap function) that you
>>> guys did not think of.
>>>
>>> It turns out that I had to rethink this method. As above, the user will
>>> receive a character matrix, which probably isn't very useful. I could
>>> translate the space names into integer IDs, but in order to use that, one
>>> would have to split the matrix and loop over each block. In that case, it
>>> would just be easier to loop over the RangesMatchingList. Thus, I changed
>>> the function to return a doublet matrix, just like RangesMatching, where the
>>> indices are adjusted so that they are aligned with the result of calling
>>> 'unlist' on the subject and query RangesLists (i.e. the index is global). I
>>> think this will satisfy more use cases, but I'm not sure.
>>>
>>> These changes were applied in both trunk and release.
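
A minimal base-R sketch of the global-index idea described in the paragraph above. It is not the IRanges implementation: plain two-column integer matrices stand in for the per-space matchings, the per-space counts `query_len` and `subject_len` are made up, and `global_hits` is a hypothetical helper name. It also assumes that the hit list and both count vectors are in the same space order. Each local index is shifted by the number of elements in the preceding spaces, so that it points into the concatenated (unlisted) vector.

```r
# Hypothetical stand-in for a RangesMatchingList: one two-column matrix of
# local (query, subject) hit indices per space.
hits <- list(
  "2L" = cbind(query = c(1L, 2L), subject = c(1L, 3L)),
  "2R" = cbind(query = 1L, subject = 2L)
)

# Assumed number of ranges per space in the query and in the subject,
# listed in the same space order as `hits`.
query_len   <- c("2L" = 2L, "2R" = 3L)
subject_len <- c("2L" = 3L, "2R" = 2L)

global_hits <- function(hits, query_len, subject_len) {
  # Offset of a space = total number of elements in the spaces before it.
  q_off <- cumsum(c(0L, head(unname(query_len), -1L)))
  s_off <- cumsum(c(0L, head(unname(subject_len), -1L)))
  blocks <- lapply(seq_along(hits), function(i) {
    m <- hits[[i]]
    cbind(query   = m[, "query"]   + q_off[i],
          subject = m[, "subject"] + s_off[i])
  })
  # Stack the per-space blocks into one two-column matrix.
  do.call(rbind, blocks)
}

m <- global_hits(hits, query_len, subject_len)
```

With these made-up inputs, the single hit in "2R" (local query 1, local subject 2) is shifted by 2 and 3 respectively, because "2L" contributes 2 query ranges and 3 subject ranges.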
>>>
>>> Thanks for the feedback, and I'd appreciate more if you have any,
>>> Michael
>>>
>>> Just let me know,
>>>
>>> Best,
>>>
>>> ---------------------------------------------------------------
>>> Nicolas Delhomme
>>>
>>> High Throughput Functional Genomics Center
>>>
>>> European Molecular Biology Laboratory
>>>
>>> Tel: +49 6221 387 8426
>>> Email: [email protected]
>>> Meyerhofstrasse 1 - Postfach 10.2209
>>> 69102 Heidelberg, Germany
>>>
>>> _______________________________________________
>>> Bioc-sig-sequencing mailing list
>>> [email protected]
>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
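
The name-ordering issue raised earlier in the thread can be illustrated with base R alone. This is only a sketch under assumptions, not the change that was checked into IRanges: the space names are the ones quoted in the thread, the per-space counts in `subject_len` are made up, and all variable names are hypothetical. `match()` returns, for each query space, the position of the same-named space in the subject.

```r
# Space names as quoted in the thread.
subject_names <- c("2L", "2R", "3L", "3R", "4", "X")
query_names   <- c("4", "X", "3R", "2R", "2L", "3L")

# For each query space, the position of the same-named subject space
# (NA if the subject has no such space).
query_to_subject <- match(query_names, subject_names)

# Made-up number of ranges per subject space, in subject order.
subject_len <- c("2L" = 10L, "2R" = 20L, "3L" = 30L,
                 "3R" = 40L, "4" = 5L, "X" = 15L)

# Offsets into the unlisted subject, first in subject order,
# then looked up in query order through the mapping.
subject_off <- cumsum(c(0L, head(unname(subject_len), -1L)))
subject_off_by_query <- subject_off[query_to_subject]
```

A query space that is missing from the subject yields NA in `query_to_subject`, which is one way to detect the case where the two sets of spaces are not identical.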
