Thank you for the lead, Peter. It may be useful for other packages I write.
As to the strings, I think I have to take what is already there. I agree that strings would be better managed in malloc-style fashion (probably with a reference counter) and not by gc(). However, I don't want a system with two different string classes; such close relatives seldom coexist peacefully.

BTW, the slowness of mkChar explains why R is so slow when it needs to compute names for long vectors.

Thank you for an interesting discussion,
Vadim

> -----Original Message-----
> From: Peter Dalgaard [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, June 08, 2004 3:35 PM
> To: Vadim Ogranovich
> Cc: R-Help
> Subject: Re: [R] fast mkChar
>
> "Vadim Ogranovich" <[EMAIL PROTECTED]> writes:
>
> > I am no expert in memory management in R so it's hard for me to tell
> > what is and what is not doable. From reading the code of allocVector()
> > in memory.c I think that the critical part is to vectorize
> > CLASS_GET_FREE_NODE and use the vectorized version along the lines of
> > the code fragment below (taken from memory.c).
> >
> >     if (node_class < NUM_SMALL_NODE_CLASSES) {
> >         CLASS_GET_FREE_NODE(node_class, s);
> >
> > If this is possible then the rest is just a matter of code refactoring.
> >
> > By vectorizing I mean writing a macro CLASS_GET_FREE_NODE2(node_class,
> > s, n) which in one go allocates n little objects of class node_class
> > and "inscribes" them into the elements of vector s, which is assumed
> > to be long enough to hold these objects.
> >
> > If this is doable then the only missing piece would be a new function
> > setChar(CHARSXP rstr, const char * cstr) which copies 'cstr' into 'rstr'
> > and (re)allocates the heap memory if necessary. Here the setChar()
> > macro is safe since s[i]-s are all brand new and thus are not shared
> > with any other object.
>
> I had a similar idea initially, but I don't think it can fly:
> First, allocating n objects at once is not likely to be much
> faster than allocating them one-by-one, especially when you
> consider the implications of having to deal with
> near-out-of-memory conditions.
> Second, you have to know the string lengths when allocating,
> since the structure of a vector object (CHARSXP) is a header
> immediately followed by the data.
>
> A more interesting line to pursue is that - depending on what
> it really is that you need - you might be able to create a
> different kind of object that could "walk and quack" like a
> character vector, but is stored differently internally. E.g.
> you could set up a representation that is just a block of
> pointers, pointing to strings that are being maintained in
> malloc-style.
>
> Have a look at External pointers and finalization.
>
> --
>    O__  ---- Peter Dalgaard             Blegdamsvej 3
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - ([EMAIL PROTECTED])             FAX: (+45) 35327907
>
> ______________________________________________
> [EMAIL PROTECTED] mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
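
For reference, here is a minimal sketch in plain C of the "block of pointers" representation Peter describes above: a table whose slots point to strings kept in malloc-style. It is only an illustration under stated assumptions. The type strtab and the functions strtab_new, strtab_set, strtab_get and strtab_free are hypothetical names invented for this example; none of them is part of R, and the code does not touch R's allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a block of pointers to malloc-managed C strings. */
typedef struct {
    size_t n;   /* number of slots */
    char **s;   /* s[i] is NULL or a malloc'ed, NUL-terminated string */
} strtab;

/* Allocate a table with n empty slots; returns NULL on failure. */
strtab *strtab_new(size_t n)
{
    strtab *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->n = n;
    t->s = malloc((n ? n : 1) * sizeof *t->s);
    if (t->s == NULL) {
        free(t);
        return NULL;
    }
    for (size_t i = 0; i < n; i++)
        t->s[i] = NULL;
    return t;
}

/* Copy cstr into slot i, replacing any previous contents.
   Unlike a CHARSXP, the slot can be refilled with a string of any length.
   Returns 0 on success, -1 if memory could not be allocated. */
int strtab_set(strtab *t, size_t i, const char *cstr)
{
    assert(t != NULL && cstr != NULL && i < t->n);
    size_t len = strlen(cstr) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, cstr, len);
    free(t->s[i]);          /* free(NULL) is a no-op */
    t->s[i] = copy;
    return 0;
}

/* Read slot i; an unset slot reads as the empty string. */
const char *strtab_get(const strtab *t, size_t i)
{
    assert(t != NULL && i < t->n);
    return t->s[i] ? t->s[i] : "";
}

/* Release every string and the table itself.
   This is the cleanup a finalizer would have to perform. */
void strtab_free(strtab *t)
{
    if (t == NULL)
        return;
    for (size_t i = 0; i < t->n; i++)
        free(t->s[i]);
    free(t->s);
    free(t);
}
```

On the R side, the idea would presumably be to hand such a table to R as an external pointer (R_MakeExternalPtr) and register a C finalizer (R_RegisterCFinalizerEx) that calls the cleanup routine once the object is collected. That wiring is left out here so the sketch compiles without R's headers.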