Arun,

Thanks for your reply. The issue is here (if there's anything else I can do to help solve this problem, let me know):
https://github.com/Rdatatable/data.table/issues/700

Also thanks for mentioning rollends.

M

On 06/20/2014 07:41 PM, Arunkumar Srinivasan wrote:
> Michael,
>
> Excellent example. Perfectly reproducible on 1.9.2 and 1.9.3. And it
> works fine on 1.8.10. The answer should've only 3 rows.
> It'd be even more nice of you if you could file it as a bug report.
>
> PS: On another note.. you maybe also interested in `CS[SP, roll=TRUE,
> rollends=TRUE]`
> Arun
>
> From: Michael Smith [email protected]
> Reply: Michael Smith [email protected]
> Date: June 20, 2014 at 1:30:09 PM
> To: Arunkumar Srinivasan [email protected]
> Cc: [email protected]
> Subject: Re: [datatable-help] Bug when Merging with nomatch=0 and roll=T?
>
>> OK, no problem, here's the code. If there are any problems pasting it
>> into R let me know (I used parts of dput, so maybe the email line
>> endings are messed up). If you want I can also file a bug report on
>> github, just let me know.
>>
>> CS <-
>>   data.table(
>>     structure(list(LPERMCO = c(7L, 33L),
>>                    datadate = structure(c(15912, 15912), class = "Date"),
>>                    me = c(626550.35284, 7766.385)),
>>               .Names = c("LPERMCO", "datadate", "me"),
>>               class = "data.frame", row.names = c(NA, -2L)),
>>     key = "LPERMCO,datadate")
>> SP <-
>>   data.table(
>>     structure(list(PERMCO = c(7L, 7L, 33L, 33L, 33L, 33L),
>>                    date = structure(c(15884, 15917, 15884, 15884, 15917, 15917),
>>                                     class = "Date"),
>>                    RET = c(-0.118303, 0.141225, -0.03137, -0.02533, 0.045967,
>>                            0.043694)),
>>               .Names = c("PERMCO", "date", "RET"),
>>               class = "data.frame", row.names = c(NA, -6L)),
>>     key = "PERMCO,date")
>> sapply(CS[SP, nomatch = 0, roll = T], length)
>>
>>
>> The relevant output looks like this, both in 1.9.2 and in dev-1.9.3, and
>> for sapply, the "me" column should be 5 but it's 3:
>>
>> > CS
>>    LPERMCO   datadate         me
>> 1:       7 2013-07-26 626550.353
>> 2:      33 2013-07-26   7766.385
>> > SP
>>    PERMCO       date       RET
>> 1:      7 2013-06-28 -0.118303
>> 2:      7 2013-07-31  0.141225
>> 3:     33 2013-06-28 -0.031370
>> 4:     33 2013-06-28 -0.025330
>> 5:     33 2013-07-31  0.045967
>> 6:     33 2013-07-31  0.043694
>> > CS[SP, nomatch = 0, roll = T]
>>    LPERMCO   datadate         me       RET
>> 1:       7 2013-07-31 626550.353  0.141225
>> 2:      33 2013-06-28   7766.385 -0.031370
>> 3:      33 2013-06-28   7766.385 -0.025330
>> 4:      33 2013-07-31 626550.353  0.045967
>> 5:      33 2013-07-31   7766.385  0.043694
>> Warning message:
>> In cbind(LPERMCO = c(" 7", "33", "33", "33", "33"), datadate =
>> c("2013-07-31", :
>> number of rows of result is not a multiple of vector length (arg 3)
>> > sapply(CS[SP, nomatch = 0, roll = T], length)
>>  LPERMCO datadate       me      RET
>>        5        5        3        5
>>
>>
>> Thanks,
>> M
>>
>>
>> On 06/20/2014 05:17 PM, Arunkumar Srinivasan wrote:
>> >> For a given data.table, is there any condition … Ergo, it's a bug,
>> >> right?
>> >
>> > Yes.
>> >
>> >> I'll be glad
>> >> to try to boil this down to something that's reproducible.
>> >
>> > That'd be great.
>> >
>> >
>> > Arun
>> >
>> > From: Michael Smith [email protected]
>> > Reply: Michael Smith [email protected]
>> > Date: June 20, 2014 at 5:37:24 AM
>> > To: [email protected]
>> > Subject: Re: [datatable-help] Bug when Merging with nomatch=0 and roll=T?
>> >
>> >> So let me rephrase my question (haven't received an answer so far):
>> >>
>> >> For a given data.table, is there any condition under which the lengths
>> >> of the vectors in each column may differ? Based on my understanding,
>> >> each data.table is also a data.frame, and with a data frame this should
>> >> not be possible. For example, it's not possible to have a data.frame
>> >> where the first column is a vector of length eight, and the second
>> >> column is a vector of length nine. Ergo, it's a bug, right?
>> >>
>> >> If my understanding is correct, please do let me know and I'll be glad
>> >> to try to boil this down to something that's reproducible.
>> >>
>> >> Thanks,
>> >> M
>> >>
>> >> On 06/19/2014 11:59 AM, Michael Smith wrote:
>> >> > By the way, I know it's not reproducible with the code below. Before
>> >> > going into further detail, I first wanted to ask whether this looks like
>> >> > a bug, or whether I've overlooked something obvious and this is expected
>> >> > behavior.
>> >> >
>> >> > Thanks,
>> >> > M
>> >> >
>> >> > On 06/19/2014 11:51 AM, Michael Smith wrote:
>> >> >> I got the following result on my keyed data tables `CS` and `SP`, which
>> >> >> seems like a bug (in 1.9.2 and 1.9.3 dev version) to me, since all
>> >> >> columns should have the _same_ length:
>> >> >>
>> >> >>> ## Works as expected:
>> >> >>> all((l <- sapply(CS[SP, roll = TRUE], length)) == l[1])
>> >> >> [1] TRUE
>> >> >>> ## Works as expected:
>> >> >>> all((l <- sapply(CS[SP, nomatch = 0], length)) == l[1])
>> >> >> [1] TRUE
>> >> >>> ## Here's the potential _bug_, when combining both:
>> >> >>> all((l <- sapply(CS[SP, nomatch = 0, roll = TRUE], length)) == l[1])
>> >> >> [1] FALSE
>> >> >>
>> >> >>
>> >> >> Thanks,
>> >> >>
>> >> >> M
>> >>
>> >> _______________________________________________
>> >> datatable-help mailing list
>> >> [email protected]
>> >> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
