On 16/09/2014, at 3:24 PM, srean wrote:

> I think there are enough extra copies and other overheads that can be removed
> to beat Python at it. Note the C++ code itself is twice or more faster than
> the Python code, so passing the buck to C++ wont help.

I haven't seen the C++ version of it.

> One neednt allocate that list upfront although that turned out to be faster.
> There's got to be a way to make Felix yield competitive.

Of course there is, but it doesn't involve copying strings about. Since Felix
is a pass-by-value language, copying seems inevitable, even with some inlining,
if one is using higher-level functional operations like split.

RE2's StringPiece would help if the base char array can be made to persist.
It's basically a struct { length, char const* } thing.

But that raises another deficiency in Felix. When you have a Felix native
structure containing pointers, you can nest it in another Felix native
structure. The compiler traverses the tree when building the array of pointer
offsets for the top-level type.

There's no way to do this for any C data type *except* a type that is already
a pointer, which you can label like this:

    _gc_pointer type fred = "fred*";

Now, there is a way to model a *complete* C data type without knowing the
offsets:

    type fat = "fat" scanner "fred_scanner";

The scanner is a C function that finds all the offsets. This is how Judy
Arrays are integrated into Felix, with a custom scanner.

However, the scanner has to be applied to a *pointer* onto the heap. You
cannot actually put one of these objects into another Felix object because
the compiler doesn't know how to find the offsets. The run time system does,
via the scanner function, but that's no use.

For every data type without a custom scanner, the compiler generates a single
flat offset array. In fact, the compiler calls a "standard custom scanner" and
passes it the offset array. I think the function is called "scan_by_offsets" :)

So the point is that for a StringPiece implemented in C++ (as the RE2 one
is), the fact that I know where the contained char* is doesn't help. I can
make a custom scanner for it, but then every StringPiece has to be a whole
object. Technically: either "on the machine stack" or "whole objects on the
heap".
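
To make the offset-array idea concrete, here is a minimal C++ sketch. All names
in it (string_piece, token, token_offsets, scan_by_offsets_sketch) are
hypothetical: this is neither RE2's StringPiece nor the real Felix run time
scanner, just an illustration of why a flat offset table needs the layout of
every nested type.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical non-owning view, modelled on the
// struct { length, char const* } description above.
// This is NOT RE2's actual StringPiece.
struct string_piece {
    std::size_t length;
    char const* data;
};

// Hypothetical outer structure that nests a string_piece by value.
struct token {
    int kind;
    string_piece text;
    char const* filename;
};

// Flat table of byte offsets at which a `token` holds pointers.
// It can only be written because the layout of the nested
// string_piece is known here.
static const std::size_t token_offsets[] = {
    offsetof(token, text) + offsetof(string_piece, data),
    offsetof(token, filename),
};

// Illustrative stand-in for a scan-by-offsets routine (an assumption,
// not the Felix implementation): collect every pointer stored at the
// listed offsets inside the object at `obj`.
std::vector<void const*> scan_by_offsets_sketch(void const* obj,
                                                std::size_t const* offsets,
                                                std::size_t n) {
    std::vector<void const*> found;
    unsigned char const* base = static_cast<unsigned char const*>(obj);
    for (std::size_t i = 0; i < n; ++i) {
        void const* p;
        std::memcpy(&p, base + offsets[i], sizeof p);  // read the pointer slot
        found.push_back(p);
    }
    return found;
}
```

Given a token t whose text.data points into some buffer, calling
scan_by_offsets_sketch(&t, token_offsets, 2) returns that buffer pointer and
t.filename. If string_piece were an opaque C++ type, there would be nothing to
add to offsetof(token, text), which is the problem described above.
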
Conservative scan takes care of the stack case.

There are some ways to fix this. One is to represent a data structure type at
run time not with a single flat RTTI object but recursively. In other words, a
"struct" with three fields would be represented by an array of three pointers
to the field types. At run time a recursive descent can find all the offsets.
This is obviously better because it makes run time type construction a breeze.
The downside, however, is that scanning for offsets would be slower.

The bottom line is that if I want string pieces, I have to implement them in
Felix.

--
john skaller
skal...@users.sourceforge.net
http://felix-lang.org