Re: [Bioc-sig-seq] the as.matrix method of the RangesMatchingList

Michael Lawrence Thu, 14 May 2009 05:21:35 -0700

On Thu, May 14, 2009 at 5:07 AM, Nicolas Delhomme <[email protected]> wrote:


> Hi Michael,
>
> Thanks a lot. This is working very nicely. However, the user has to make
> sure that the query and subject are ordered in the same way in order to
> use the generated index properly.
>
> For example, my subject has the names "2L" "2R" "3L" "3R" "4" "X" and the
> query has "4" "X" "3R" "2R" "2L" "3L". The overlap function takes care of
> this and compares the right spaces. It returns a RangesMatchingList with
> the names "4" "X" "3R" "2R" "2L" "3L". This means that when I export the
> result as a matrix, the indices will be corrupted.
>

Yes, I noted this in the documentation. I am not sure I would say the
results are "corrupted".


>
> I can think of two solutions:
> Either a warning should be emitted when doing the overlap if the names are
> not ordered the same way,
> or, and this would be my preferred solution, the RangesMatchingList could
> have an additional slot holding the mapping index from the query names to
> the subject names. The as.matrix method could then use it to return the
> "correct" indices. This should make it safe for the case where the user does
> not provide a query and a subject ordered in the same way. It should also be
> robust to the cases where the query and subject spaces are not entirely
> identical.
>

I was thinking of doing something like this. Thanks for giving me the
motivation to actually do it.

Michael
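
To illustrate the mapping index proposed above, here is a minimal sketch in
base R. It does not use the IRanges API; the variable names (subject_names,
query_names, query_to_subject) are hypothetical and only the space names are
taken from the thread. The assumption is that match() can give, for each
query space, its position among the subject spaces, with NA where a query
space has no counterpart.

```r
# Space names as given in the thread.
subject_names <- c("2L", "2R", "3L", "3R", "4", "X")
query_names   <- c("4", "X", "3R", "2R", "2L", "3L")

# Hypothetical mapping index: for each query space, its position in the
# subject. match() returns NA for a query space absent from the subject,
# so sets of spaces that are not entirely identical are handled too.
query_to_subject <- match(query_names, subject_names)
names(query_to_subject) <- query_names
print(query_to_subject)

# A query space with no counterpart in the subject maps to NA.
partial <- match(c("4", "Y"), subject_names)
print(partial)
```

How such an index would be stored in a slot, and how as.matrix would use it,
is left open here.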


>
> Well, this is just my two cents' worth as I'm not (yet) so familiar with
> the code.
>
> Best,
>
> ---------------------------------------------------------------
> Nicolas Delhomme
>
> High Throughput Functional Genomics Center
>
> European Molecular Biology Laboratory
>
> Tel: +49 6221 387 8426
> Email: [email protected]
> Meyerhofstrasse 1 - Postfach 10.2209
> 69102 Heidelberg, Germany
> ---------------------------------------------------------------
>
>
>
> On 13 May 2009, at 22:20, Michael Lawrence wrote:
>
>
>>
>> On Wed, May 13, 2009 at 7:05 AM, Nicolas Delhomme <[email protected]>
>> wrote:
>> Hi all,
>>
>> I've got the impression that the as.matrix method of the
>> RangesMatchingList does not work as it should.
>>
>> I have a RangesMatchingList which I obtained by using the overlap (from
>> the RangesList class) function that takes two RangesList as input. When I
>> apply as.matrix() on the RangesMatchingList, it gives me the following
>> error:
>>
>> Error in .Method(..., deparse.level = deparse.level) :
>>  number of rows of matrices must match (see arg 2)
>>
>> The function is pretty easy:
>>
>> setMethod("as.matrix", "RangesMatchingList", function(x) {
>>  cbind(space = space(x), do.call(cbind, lapply(x, as.matrix)))
>> })
>>
>> When I replace the cbind in the do.call by an rbind, it's already better
>>
>> Thanks, yes this was a bug. As the documentation states,
>> RangesMatchingList was considered experimental and not something that was
>> really tested. But I should have done a better job.
>>
>>
>> Warning message:
>> In .Method(..., deparse.level = deparse.level) :
>>  number of rows of result is not a multiple of vector length (arg 1)
>>
>> This is due to the fact that space(x) returns many more spaces than there
>> are overlaps.
>>
>> This is a bug in space().
>>
>>
>> I could solve that by changing the function into:
>>
>> setMethod("as.matrix", "RangesMatchingList", function(x) {
>>  # Stack the per-space matrices, labelling each row with its space name.
>>  do.call(rbind, lapply(seq_along(x), function(i) {
>>    mat <- as.matrix(x[[i]])
>>    cbind(space = rep(names(x)[[i]], nrow(mat)), mat)
>>  }))
>> })
>>
>> Now, I do not know if I might have a particular use-case (having a
>> RangesMatchingList coming from the RangesList overlap function) that you
>> guys did not think of.
>>
>> It turns out that I had to rethink this method. As above, the user will
>> receive a character matrix, which probably isn't very useful. One could
>> translate the space names into integer IDs, but in order to use that, one
>> would have to split the matrix and loop over each block. In that case, it
>> would be easier just to loop over the RangesMatchingList. Thus, I changed
>> the function to return a doublet matrix, just like RangesMatching, where the
>> indices are adjusted so that they are aligned with the result of calling
>> 'unlist' on the subject and query RangesLists (i.e. the index is global). I
>> think this will satisfy more use cases, but I'm not sure.
>>
>> These changes were applied in both trunk and release.
>>
>> Thanks for the feedback, and I'd appreciate more if you have any,
>> Michael
>>
>>
>> Just let me know,
>>
>> Best,
>>
>> ---------------------------------------------------------------
>> Nicolas Delhomme
>>
>> High Throughput Functional Genomics Center
>>
>> European Molecular Biology Laboratory
>>
>> Tel: +49 6221 387 8426
>> Email: [email protected]
>> Meyerhofstrasse 1 - Postfach 10.2209
>> 69102 Heidelberg, Germany
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> [email protected]
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>>
>
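
The global indexing described in the quoted reply of 13 May (indices aligned
with the result of calling 'unlist' on the query and subject) can be sketched
in base R as follows. This is only an illustration under stated assumptions,
not the IRanges implementation: the space names chrA and chrB, the sizes, the
hits and the helper offsets() are all hypothetical, and query and subject are
assumed to list their spaces in the same order.

```r
# Hypothetical per-space sizes: number of ranges in each space.
query_sizes   <- c(chrA = 3L, chrB = 2L)
subject_sizes <- c(chrA = 4L, chrB = 5L)

# Hypothetical per-space hits: local (within-space) query/subject indices.
hits <- list(
  chrA = cbind(query = c(1L, 3L), subject = c(2L, 4L)),
  chrB = cbind(query = 2L, subject = 1L)
)

# Offset of a space = number of elements in all preceding spaces,
# i.e. where that space starts in the unlisted vector (minus one).
offsets <- function(sizes) {
  setNames(c(0L, cumsum(sizes)[-length(sizes)]), names(sizes))
}
query_offset   <- offsets(query_sizes)
subject_offset <- offsets(subject_sizes)

# Shift each block by its offsets, then stack the blocks with rbind
# (not cbind, since the blocks have different numbers of rows).
global_hits <- do.call(rbind, lapply(names(hits), function(space) {
  m <- hits[[space]]
  cbind(query   = m[, "query"]   + query_offset[[space]],
        subject = m[, "subject"] + subject_offset[[space]])
}))
print(global_hits)
```

With these made-up inputs, the single hit in chrB (local indices 2 and 1)
becomes global indices 5 and 5, because chrA contributes 3 query ranges and
4 subject ranges before it.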


