I have now got the Felix compiler to generate encoders for data types (without changing the RTTI system).
Consider this code. Note that p *. m in Felix is shorthand for (*p).m, which means the same as p->m in C. We cannot use the C notation because the precedence is wrong, being dictated by the precedence required for function types. So:

/////////////
struct X { a:int; b:int; };
var b = new X(1,2);
println$ b*.a;

struct Y { c:X; d:string; }
var c = new Y ( X(1,2), "hello");
println$ c*.d;
//////////////

We get these:

////////////////////////////////////
// TESTING ENCODER for type X
::std::string _s41399t_57573_encoder(void *d) {
  char *p = (char*)d;
  ::std::string b = "";
  b+=::flx::gc::generic::blit(p,sizeof(_s41399t_57573)); // pod
  return b;
}

// TESTING ENCODER for type Y
::std::string _s41401t_57575_encoder(void *d) {
  char *p = (char*)d;
  ::std::string b = "";
  //Struct
  b+=::flx::gc::generic::blit(p+offsetof(_s41401t_57575,c),sizeof(_s41399t_57573)); // pod
  b+=::flx::gc::generic::string_blit(::flx::gc::generic::string_encoder(p+offsetof(_s41401t_57575,d))); //prim
  return b;
}

// TESTING ENCODER for type string
::std::string _a13047t_57553_encoder(void *d) {
  char *p = (char*)d;
  ::std::string b = "";
  b+=::flx::gc::generic::string_blit(::flx::gc::generic::string_encoder(p)); //prim
  return b;
}
/////////////////////////////////

Note these encoders are simply generated and compiled but not used yet.

First, for a primitive type T which is a "pod" the encoding is done by "blit", which just returns a string with the binary image of the type.

For a non-pod primitive a user defined encoder is called. This is a function named with the syntax

  type mytype = "mytype" requires encoder "myencoder";

and has the type

  string myencoder (void *p);

Given a pointer to an object of the type, the user function has to convert it to a string of any length. The system provides an encoder for the type string, which just returns the string.

For every primitive with an encoder the compiler wraps the string by prefixing the string with its length (in binary).
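To make the pod / non-pod distinction concrete, here is a minimal C++ sketch of the two framing schemes. It is not the Felix runtime: the functions in namespace sketch are hypothetical stand-ins that only borrow the names blit and string_blit, unblit_string is an invented helper, and storing the length as a native-endian std::size_t is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

namespace sketch {

// Stand-in for a pod encoder: copy the raw bytes of the object.
// No length is stored; a decoder is assumed to know sizeof(T).
inline std::string blit(const void *p, std::size_t n) {
  return std::string(static_cast<const char *>(p), n);
}

// Stand-in for the non-pod wrapper: prefix the encoded string with its
// length in binary (assumed here to be a native-endian std::size_t).
inline std::string string_blit(const std::string &s) {
  std::size_t n = s.size();
  return blit(&n, sizeof n) + s;
}

// Hypothetical inverse of string_blit: read one length-prefixed string
// starting at pos and advance pos past it.
inline std::string unblit_string(const std::string &buf, std::size_t &pos) {
  std::size_t n = 0;
  assert(pos + sizeof n <= buf.size());
  std::memcpy(&n, buf.data() + pos, sizeof n);
  pos += sizeof n;
  assert(pos + n <= buf.size());
  std::string s = buf.substr(pos, n);
  pos += n;
  return s;
}

} // namespace sketch
```

With these, an int followed by a string encodes as sketch::blit(&x, sizeof x) + sketch::string_blit(s), and a reader can skip sizeof(int) bytes and then call sketch::unblit_string to recover s.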
For pod this isn't necessary because the length is known as sizeof(T). [This may change when we have run time defined types]

OK so now for non-primitives. I have changed the definition of "pod" so that a "pod" is any data type not containing a non-pod primitive. So all pointers are now pod. Structs are pod unless a member is not pod. Same for tuples and records. Unions are always pod because they're pointers or plain ints.

Note the previously mentioned issue with cstructs. [I forget what I implemented but I think they're non-pod but I expect this will cause problems]

The encoding stuff I have shown ignores pointers. So you can use it to encode anything, but any pointers just get blitted out in binary.

So we have a first stage encoder for many types now. Not functions yet! Just types!

Now, for this to be useful we need a routine that does two things.

(1) Encode an object
(2) Find all the pointers, and make sure what they point at is encoded too.

Then we cat the results together and that's the encoding.

We can find the pointers from the shape offset table. The algorithm will basically have two sets: already encoded and not yet encoded. When we grab a pointer we first convert it to a head pointer (start of heap object). Then we add that to the not yet encoded set (unless it is already in one of the two sets). If the pointer isn't a Felix pointer we have to just leave it. Perhaps abort. (But not if NULL, that's OK.)

So we form a closure of encodings of all the linked objects this way and just concatenate them, with their lengths AND the original address.

To decode, we split up the stream into substrings (we have the lengths). We make the objects, recording the old pointer and the new pointer in a list. For a pod we just allocate store and blit the data in. For a non-pod primitive we have to use the user supplied decoder. For a composite non-pod we just do it memberwise, undoing the encoding.
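The closure walk and the decode steps can be sketched in C++ roughly as follows. This is an illustration under simplifying assumptions, not the Felix implementation: Shape, Node, put, encode_closure and decode_closure are all hypothetical names, every object shares one shape, every pointer is already a head pointer, all objects are pod, and a Node* is assumed to have the same representation as a void*.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <deque>
#include <map>
#include <set>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for a shape: object size plus the offsets of
// the pointer fields inside the object.
struct Shape {
  std::size_t size;
  std::vector<std::size_t> offsets;
};

// Example object type used to exercise the sketch.
struct Node {
  int value;
  Node *next;
};

inline void put(std::string &out, const void *p, std::size_t n) {
  out.append(static_cast<const char *>(p), n);
}

// Encode root and everything reachable from it. Each record is:
// old address, length, raw bytes (pointer fields blitted unchanged).
inline std::string encode_closure(void *root, const Shape &sh) {
  std::string out;
  std::set<void *> seen;    // already encoded or queued
  std::deque<void *> todo;  // queued, not yet encoded
  if (root) { seen.insert(root); todo.push_back(root); }
  while (!todo.empty()) {
    char *obj = static_cast<char *>(todo.front());
    todo.pop_front();
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(obj);
    put(out, &addr, sizeof(addr));
    put(out, &sh.size, sizeof(sh.size));
    put(out, obj, sh.size);
    for (std::size_t off : sh.offsets) {
      void *q;
      std::memcpy(&q, obj + off, sizeof(q));
      // NULL is skipped; anything already seen is not queued again.
      if (q && seen.insert(q).second) todo.push_back(q);
    }
  }
  return out;
}

// Rebuild the objects, then patch each pointer field through the
// old -> new map. Objects come back in stream order, root first.
inline std::vector<void *> decode_closure(const std::string &in,
                                          const Shape &sh) {
  std::vector<void *> objs;
  std::map<std::uintptr_t, void *> remap;
  std::size_t pos = 0;
  while (pos < in.size()) {
    std::uintptr_t old_addr;
    std::size_t len;
    assert(pos + sizeof(old_addr) + sizeof(len) <= in.size());
    std::memcpy(&old_addr, in.data() + pos, sizeof(old_addr));
    pos += sizeof(old_addr);
    std::memcpy(&len, in.data() + pos, sizeof(len));
    pos += sizeof(len);
    assert(len == sh.size && pos + len <= in.size());
    void *obj = std::malloc(len);  // pod: allocate store and blit in
    assert(obj);
    std::memcpy(obj, in.data() + pos, len);
    pos += len;
    remap[old_addr] = obj;
    objs.push_back(obj);
  }
  for (void *obj : objs) {
    for (std::size_t off : sh.offsets) {
      char *field = static_cast<char *>(obj) + off;
      void *old_ptr;
      std::memcpy(&old_ptr, field, sizeof(old_ptr));
      if (!old_ptr) continue;  // NULL stays NULL
      void *fresh = remap.at(reinterpret_cast<std::uintptr_t>(old_ptr));
      std::memcpy(field, &fresh, sizeof(fresh));
    }
  }
  return objs;
}

} // namespace sketch
```

A two-node cycle (a.next = &b, b.next = &a) encoded from &a and decoded again yields two fresh objects whose next fields point at each other. The caller owns the decoded objects and frees them with std::free.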
When we're finished, we have a list of new object addresses, and we have a map old pointer -> new pointer. So we run through the objects in the list, and using the offset tables we grab each old pointer, look it up in the map, and put the new pointer in.

This mechanism is VERY nice because user encoders/decoders just don't have to worry about Felix pointers, only foreign pointers (for example in a string, the pointer to the array). Really nasty objects like Google RE2 objects are easy if we're slightly hacky: we just grab the string regexp and use that as the encoding. The decoder can rebuild the RE2 object from just the regexp.

The point here is that the top level serialisation routines can be written in Felix. The compiler only needs to maintain the low level, per-object serialisation that ignores Felix pointers.

--
john skaller
skal...@users.sourceforge.net
http://felix-lang.org