The failure of the strdict-01 regression test on Windows has again highlighted the design fault in strings, in both Felix and C++.
When you call

    char const *p = string("hello").c_str();

in C++ you have a dangling pointer p. The call succeeds because string("hello") is a non-const rvalue, but there's no mechanism in C++ for a method to require an lvalue object: you can insist that a function argument is an lvalue, but not the object of a non-static member function. Consequently the string("hello") temporary may evaporate immediately after p is set to point at its internal buffer, and the destructor deletes the storage, leaving p dangling.

In Felix, this means calling "Hello".cstr is more or less guaranteed to be invalid. It is not even valid to do this:

    fun f(x:string) = { g (strdup (x.cstr)); ...

because "x" is a val, and vals can be either eagerly or lazily evaluated. If the function above is inlined, a call such as f("hello") can reduce to

    g ( strdup ("hello".cstr) )

and the pointer can be left dangling even before strdup is called.

Note this problem is not limited to cstr -- it applies to STL begin() and end() too. Anything that gets a pointer into any data structure can have the pointer invalidated if the data structure is destroyed, and if the data structure is a temporary that can happen very quickly.

It turns out the Microsoft compiler MSVC++ is much more aggressive about this than gcc or clang, which presumably are dumbed down to cope with dumb end user legacy code: they appear to remove temporaries at the end of the containing statement, instead of immediately after use. Of course the converse may be possible too: gcc aggressively removes copying whereas MS doesn't. This would lead to the same situation, because gcc would then be keeping a single value around, with the lifetime of the original extended to the copy that it didn't make.

In any case, use of STL iterators or cstr in Felix is affected by this misdesign of strings in C++. I have to say here IT'S MY FAULT. I voted in favour of making strings STL containers. Only Pete Becker stood against this and he was right.
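To make the lifetime issue concrete, here is a minimal C++ sketch of the rule the language standard gives: a temporary lives until the end of the full-expression that creates it. Tracked, use, events, safe_within_expression and safe_copy are hypothetical names invented for illustration, not Felix or STL code. Tracked stands in for a string temporary and simply logs when it is constructed and destroyed.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical event log, used only to make the destruction order visible.
static std::vector<std::string> events;

// Hypothetical stand-in for a string temporary that records its lifetime.
struct Tracked {
    std::string buf;
    explicit Tracked(const char *s) : buf(s) { events.push_back("ctor"); }
    ~Tracked() { events.push_back("dtor"); }
    const char *cstr() const { return buf.c_str(); }
};

// Stand-in for a C function that reads the buffer while it is still alive.
std::size_t use(const char *p) {
    events.push_back("use");
    return std::strlen(p);
}

// The temporary and the call that reads its buffer share one full-expression,
// so the buffer is still valid inside use().
std::size_t safe_within_expression() {
    std::size_t n = use(Tracked("hello").cstr());
    events.push_back("after");
    return n;
}

// Copy the characters out before the temporary is destroyed.
std::string safe_copy() {
    std::string copy(Tracked("hello").cstr()); // copied within the full-expression
    return copy;
}
```

On a conforming compiler, calling safe_within_expression() with an empty log records ctor, use, dtor, after, in that order. Anything that keeps the raw pointer past the end of that expression, as the p example does, ends up pointing at storage that has already been released.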
The correct way to handle this is to copy the string buffer immediately:

    char const * get_c_string (string const &x) {
      char const *p = x.c_str();
      return strdup (p);
    }

The reason this must work is that any temporary used as an argument may not be destroyed until after the function returns. By that time p is dangling, but we've duplicated the buffer. Of course in C++ this causes a memory leak. In Felix we can use a varray instead, which is basically a garbage collected array the contents of which are always on the heap.

Another idea is to change the Felix cstr implementation to use a function that only accepts an lvalue:

    char const *get_c_string (string &x) { return x.c_str(); }

Felix will never know you called the function wrongly with an rvalue, but the C++ compiler will barf. The delayed error message is a bad thing .. the improved performance is a good thing. The problem is that this isn't safe either. Even an lvalue can evaporate.

Another, safe, idea is to change the implementation of Felix strings. Much as I'd love to design my own string class, C++ compatibility dictates that it's sane to stick with C++ strings, even if they're broken. But we can have our cake and eat it too: we can use a POINTER to a C++ string, with the string on the heap. That makes strings first class Objects in Felix instead of values though!

Alternatively we might trick the Felix compiler into ensuring that C++ strings derived from literals are universally assigned to variables so they can't evaporate. Of course this only fixes the problem for literals, and not string expressions in general ;(

The Right Way (TM) to do this is:

    void get_c_string (string x, char *p, int len) { strncpy (p, x.c_str(), len); }

That is, copy the string contents into a buffer .. what I mean is that in C++ this function should be added and c_str() should be removed.

Well, I don't know what to do.
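For concreteness, the two copying ideas above could be sketched roughly as follows. This is only a sketch under assumptions: copy_c_string and owned_c_string are hypothetical names, memcpy is swapped in for strncpy so that the result is always NUL-terminated even when truncated, and std::vector<char> is used as a plain C++ stand-in for the garbage collected varray (it frees the copy when it goes out of scope, instead of leaking like strdup).

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copy at most len-1 characters of x into the caller's
// buffer p and always NUL-terminate. Returns the number of characters copied,
// not counting the terminator. Copies nothing if len is 0.
std::size_t copy_c_string(const std::string &x, char *p, std::size_t len) {
    if (len == 0) return 0;
    std::size_t n = x.size() < len - 1 ? x.size() : len - 1;
    std::memcpy(p, x.data(), n);
    p[n] = '\0';
    return n;
}

// Hypothetical helper: return an owned, NUL-terminated copy of the buffer.
// c_str() points at size()+1 characters, including the terminator.
std::vector<char> owned_c_string(const std::string &x) {
    const char *s = x.c_str();
    return std::vector<char>(s, s + x.size() + 1);
}
```

Both take the string by const reference, so they are safe to call with a temporary: the copy is finished before the function returns. The first costs no allocation but needs the caller to supply a buffer; the second pays the heap allocation on every call, which is the same penalty the varray approach would carry.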
The universally safe solution, using varray, incurs a horrible penalty doing a heap allocation every time we need a char * instead of a C++ string, which is every call to C functions. Of course I could provide both, "safe_cstr" and "__unsafe_cstr".

-- john skaller skal...@users.sourceforge.net http://felix-lang.org