Hi,

I noticed that serialize() gives different results depending on the R version, which has implications for the digest() function in the digest package. Note that it does give the same output across platforms. I know that serialize() is under development, but is this expected? For example, is there some kind of header in the result that specifies "who" generated the stream, and if so, exactly which bytes are they?
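To make the question concrete, here is a minimal sketch that decodes the ASCII stream back into text fields. The bytes are hard-coded from what R v2.4.0 prints for serialize(1, connection=NULL, ascii=TRUE), so the snippet does not depend on the R version running it. The helper unpack_version() is hypothetical, and the packing it uses (major * 65536 + minor * 256 + patch) is only my guess at what the third field might hold; I have not confirmed it against the R sources.

```r
# The 25 bytes printed by R v2.4.0 for serialize(1, connection=NULL, ascii=TRUE).
bytes_a <- as.raw(c(
  0x41, 0x0a, 0x32, 0x0a,
  0x31, 0x33, 0x32, 0x30, 0x39, 0x36, 0x0a,
  0x31, 0x33, 0x31, 0x38, 0x34, 0x30, 0x0a,
  0x31, 0x34, 0x0a, 0x31, 0x0a, 0x31, 0x0a
))

# The stream is ASCII, so it can be read as newline-separated text fields.
fields <- strsplit(rawToChar(bytes_a), "\n", fixed = TRUE)[[1]]
print(fields)

# Hypothetical helper. ASSUMPTION: the field is packed as
# major * 65536 + minor * 256 + patch. This is a guess, not a confirmed format.
unpack_version <- function(v) {
  v <- as.integer(v)
  paste(v %/% 65536L, (v %% 65536L) %/% 256L, v %% 256L, sep = ".")
}

# Third field as produced by the three R versions I tested.
versions <- unpack_version(c("132096", "132097", "132352"))
print(versions)
```

Read this way, bytes 5 to 10 are the text "132096", and bytes 8 to 10 are its last three digits. If the packing assumption holds, the three values decode to 2.4.0, 2.4.1 and 2.5.0, which would match the three R versions tested.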
SETUP:

R versions:
A) R v2.4.0 (2006-10-03)
B) R v2.4.1pat (2007-01-13 r40470)
C) R v2.5.0dev (2006-12-12 r40167)

This is on WinXP, and I start R with Rterm --vanilla.

Example: Identical serialize() calls using the different R versions.

> raw <- serialize(1, connection=NULL, ascii=TRUE)
> print(raw)

gives:

(A):
 [1] 41 0a 32 0a 31 33 32 30 39 36 0a 31 33 31 38 34 30 0a 31 34 0a 31 0a 31 0a
(B):
 [1] 41 0a 32 0a 31 33 32 30 39 37 0a 31 33 31 38 34 30 0a 31 34 0a 31 0a 31 0a
(C):
 [1] 41 0a 32 0a 31 33 32 33 35 32 0a 31 33 31 38 34 30 0a 31 34 0a 31 0a 31 0a

Note the difference in raw bytes 8 to 10, i.e.

> raw[7:11]

(A):
[1] 32 30 39 36 0a
(B):
[1] 32 30 39 37 0a
(C):
[1] 32 33 35 32 0a

Do bytes 8, 9 and 10 in the raw vector somehow contain information about the R version or similar? The following poor man's test suggests that they are the only difference. On all three R versions, the following gives identical results:

> raw <- serialize(1:1e4, connection=NULL, ascii=TRUE)
> raw <- as.integer(raw[-c(8:10)])
> sum(raw)
[1] 2147884
> sum(log(raw))
[1] 177201.2

If it is true that there is an R-version-specific header in serialized objects, then the digest() function should exclude that header in order to produce consistent results across R versions, because currently digest(1) gives different results depending on the R version.

Thank you

Henrik

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel