[R] A simple question

2015-07-14 Thread Alex Kim via R-help
Hello,

I am trying to create a matrix that looks like this, using the
stri_locate_all function.

> x <- "ABCDJAKSLABCDAKJSABCD"
> m <- stri_locate_all_regex(x, 'ABCD')
> m
[[1]]
     start end
[1,]     1   4
[2,]    10  13
[3,]    18  21

I tried converting m into a matrix, however it always seems to wrap around
the wrong way:

> output <- matrix(unlist(m), ncol = 2, byrow = TRUE)
> output
     [,1] [,2]
[1,]    1   10
[2,]   18    4
[3,]   13   21

I want to output the start locations in the first column and the end
locations in the second column into a matrix to look like this.

     [,1] [,2]
[1,]    1    4
[2,]   10   13
[3,]   18   21

Thank you for your help,
Alex
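
A minimal sketch of one way to get that layout, assuming m has the shape printed above (a list holding one start/end matrix). Here m is rebuilt by hand so the snippet runs without stringi; the names output and output2 are only for illustration. unlist() flattens the matrix column by column, so dropping byrow = TRUE is enough:

```r
# m mirrors the list printed above: one "start"/"end" matrix per input string.
m <- list(matrix(c(1L, 10L, 18L, 4L, 13L, 21L), ncol = 2,
                 dimnames = list(NULL, c("start", "end"))))

# unlist() returns all starts followed by all ends, so the default
# column-wise fill of matrix() puts starts in column 1 and ends in column 2.
output <- matrix(unlist(m), ncol = 2)

# With a single input string, the list element is already the matrix;
# unname() only drops the "start"/"end" column labels.
output2 <- unname(m[[1]])
```

With several input strings, m would hold one matrix per string, and do.call(rbind, m) would stack them instead.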

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Matrix Manipulation

2015-07-04 Thread Alex Kim via R-help
Hi guys,

Suppose I have an extremely large data frame with 2 columns and 0.5 million
rows. For example, the last 7 rows may look like this:
.
..
...
 89 100
 93 120
 95 125
101  NA
115  NA
123  NA
124  NA

I would like to manipulate this data frame to output a data frame that
looks like:

100  89, 93, 95
120  101, 115
125  123, 124

What would be the absolute quickest way to do this, given that there are
many rows? Currently I have this:

# m is the large two-column data frame
end <- na.omit(m[,'V2'])
out <- data.frame(End=end,
  Start=unname(sapply(split(m[,'V1'], findInterval(m[,'V1'], end))[as.character(0:(length(end)-1))], paste, collapse='.')))

However this is taking a little bit too long.

Thank you for your help!
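
A hedged sketch of a variant that may cut some overhead: findInterval() is called once, the groups are fixed through factor levels instead of indexing the split list by character names, and vapply() replaces sapply(). It assumes the non-NA V2 values are sorted ascending, and it joins with ", " to match the desired output rather than the '.' used in the code above. The small m below is a stand-in built from the example rows; whether this is actually faster on 0.5 million rows has not been measured.

```r
# Small stand-in for the large data frame m (values from the example rows).
m <- data.frame(V1 = c(89, 93, 95, 101, 115, 123, 124),
                V2 = c(100, 120, 125, NA, NA, NA, NA))

end <- m$V2[!is.na(m$V2)]        # interval boundaries, assumed sorted ascending
grp <- findInterval(m$V1, end)   # 0 for V1 < end[1], 1 for the next interval, ...

# Fixed factor levels keep exactly one group per boundary, even an empty one.
# V1 values at or beyond the last boundary fall outside the levels and are dropped.
starts <- split(m$V1, factor(grp, levels = seq_along(end) - 1L))

out <- data.frame(End = end,
                  Start = unname(vapply(starts, paste, character(1),
                                        collapse = ", ")),
                  stringsAsFactors = FALSE)
```

If this is still too slow, the paste() step over many small groups is the likely remaining cost, so keeping starts as a list column may be worth considering.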
