Hi, After spending some time with library, looking through docs and code, compiling it and comparing with my expectations I see following 5 major issues with submitted library:
[Issue 1] Registration/reflection facility should be completely separated from serialization implementation and became registration policy template parameter [Issue 2] Library should be subdivided into 2 *independent* library serving purpose of serialization and deserialization. [Issue 3] Library seems to hardcode important part of functionality that users may want to overwrite. Here I refer in most part to archive/object preamble. [Issue 4] Presented solution for serialization of STL collections is seriously flawed. [Issue 5] Quality of code/implementation. I believe Boost should demonstrate examples of solidly written code; requirement submitted library not always comply to. I was able to compile library and tests with MSVC6.5. test.cpp produce couple runtime failures. See attached. Now let take a look in more details: Documentation --------------- As it what mentioned by some other reviewers documentation is incomplete and sometimes incorrect and most importantly missing very important topics. Tutorial: page 2: after // close archive you are closing ifs - should be ofs page 3: in function load incorrect type of the second argument: int file_version Reference: Section that is definitely missing is one describing in details (including contract) how to write custom archives. Operators for Pointers: const * T should be const T * (or T const* but this comment later) Why are you using "backward" logic while describing issues addressed to properly save and restore pointers. Reader will read the document from top to bottom and all these "must" "need" "have to be" are confusing. Operators for Templates: Operators you are describing in this section does not exist in reality and STL containers serialization is implemented using different means (even for compiler that support [partial specialization it would not look like what you described). IMO you confusing readers and simple incorrect. Serialization of Classes: code snippet at the end and one before using array "a" that never declared. Serialization of templates: Passage about "not working as expected" simply incorrect. It does world if partial specialization is supported. Following example successfully compiled and with both gcc and bcc32 I have access to: #include <iostream> template <typename T> struct A{}; template<typename T> struct B { enum { value = 0 }; }; template <typename T> struct B<A<T> > { enum { value = 1 }; }; int main() { typedef A<int> a_int; std::cout << B<a_int>::value; std::cout << B<int>::value; } In reality what should be discussed here is that library supports several ways to specialize the serialization logic and user may choose any way he prefer depends on it's preferences and used compiler. And BTW it's not specific to the templates this 3-layer logic is used for *all" classes. What I mean here (and it should be clear from docs that currently library is using following 3 layer logic for overwriting serialization algorithm: Layer 3: one could use partial ordering for specializing serialization for type T by overloading function void boost::serialization::sterilization_save( basic_oachive& ar, T&, long ); by default it referee to the serialization<T>::save(...) Note that I moved it out of details into serialization namespace. (BTW it was one of the reviewer requested - it's already in) Layer 2: one could provide partial/full specialization of traits class specialization by default if refer to the T::save(...) Layer 1: one could provide the function save(...) as member of type T accessible for serialization<T> Moreover I think that it should be clearly specified that Layer 3 is "temporary" and may be eliminated in a future so it is recommended to use one of the layers 2 or 1. Very large numbers of Small Objects: In function serialization<RBG>::save_ptr extra "(" near BOOST_STATIC _ASSERT. vector data? doesn't vector need any template parameters? When you serializing the struct line it's unclear how you save/load vector size Exception safety: First paragraph: "this problem" What problem? It should be clearly stated or rephrased. Each scenario you are describing should be accompanied by the simple but clear example code. "Other cases" refer to the altered shared_count.hpp which permits the serialization of shared_ptr. Does this means that the things will stay as it is users won't be able to serialize instances of shared_ptr? Whose fault is it? Serialization Implementation included in the library: "This header is automatically included in any program..." This is incorrect decision and docs will need to be fixed together with code. See below. Advice: Should be called "Advises" Working around ... "... compiler will attempt to declare an instance of an abstract base class as part of type traits package". Why? Where? could not you provide a workaround yourself. Rationale: quiet - should be quite rationale for not making "serialization" member of boost namespace is weak in IMO and decision was incorrect. Design ------- In general design of the submitted library is solid and clear. But there are several very important issues. Major [Issue 1]: I personally in my practice never used type_info and may be willing to supply my own type registration system and accordingly would not want any decision to be forced on me. In this sense "all" reference on type type_info and operator typeid should be eliminated from core library. RestrationPolicy should be supplied as template parameter and this policy will define type_id type and function for getting type_id from value of type T. Interface will need to be discussed in more details in a separate thread. Library could supply one implementation if registration policy based in type_info (and it will be the only place where type_info will be referenced). Major [Issue 2]: I believe it design error to couple both sides of serialization together in one library. It should be separated . So that user should be able to use only required part. It's not only eliminate not required dependency, but also make code much more clear and easier to navigate and compiled code smaller. Major [Issue 3]: Submitted library is somewhat limited in a means to modify what is written in archive header (I can change it but I will need to supply full fledged specific archive cause currently this logic is in constructor). And even more limited in a means to modify object header together with logic bound to it. Let me give couple examples. 1. Serialization signature is 22 bytes. Let say I am serializing messages into binary format for sending them through network. Size of my serialized code is 10 bytes. I do not believe I could accept 22 bytes of "signature information". The same about the version. I may or may not be interested in versioning of serialization system. 2. Binary archive constructor save some "guard" information. I may or may not want to do this. Or I may want to be even more careful. 3. Object save/load function contains pretty expensive logic for making sure that object is stored/loaded only once. If for any reason (performance) I could not afford constant searching but have external knowledge that all objects are different I may want to skip lookup phase. 4. Let say I am sending data by contract. It could be fixed contract in external file loaded during initialization or dynamic one sent before sending real data. The purpose id to save the bandwidth and eliminate all object/class ids from serialized stream. What should I do? Also as mention in other reviews it's seems from library design that there is no way to implement proper bracketing of the data for formats like XML. Other issues mostly minor. Object_id generation logic is transparent to the data type. You may save both CPU and space if you choose to use type specific object identifiers. version should be static constant member of the traits or user class and not a function at all. Implementation --------------- Major [Issue 4]: support of serialization of stl data structures is flawed. I would never agree to include header that will bring also: vector, list, deque, map, hash_map, slist. First I may not be using them at all. Second even if I do use vector I should include <boost/serialization/vector.hpp>. The same with any other STL data structure (only some of them are supported currently, while should be all). IOW serialization library will need to repeat the structure of STL (header-wise). And users of the library will need to include appropriate header. Library should be using following namespaces: boost: basic_..achive family only. The only part of public user interface boost::serialization struct traits (currently serialization class), used by used supply UDT sterilization traits boost::serialization::detail rest of the code including void cast (which is better be named runtime_cast cause the name should emphasize not void part but namely dynamic runtime essence of it) Accordingly directory structure should reflect this: archive.hpp should be the only header under boost. Rest under serialization and serialization/details directories. Use of template set to keep all static information is questionable. In several cases usage pattern looks like several inserts and a lot of querying. sorted vector would work better in this case. See Matt Austern article on this topic. basic_[i|o]archive uses virtual functions for serializing intrinsic types. In some rare cases it may be expensive Why rare? because in majority of the cases price of double dispatching is incomparable with i/o operations. But it may be the case when you serializing directly in memory lots of intrinsics. When user may want to eliminate double dispatching somehow. And even if it is not first priority task (IMO) you may think in a future instead if hardcoding virtual functions into basic_ classes to use mpl base generation facility that will allow serialization users to use compile time polymorphism. Code ------ I have serious concerns to the quality of the some part of the code supplied. It's lacking proper comments, incur unreasonable dependencies contains errors and not recommended styles. Let see the details: void_cast.hpp: see attached file for reworked version 1. header guard named with _H at the end. I would prefer _HPP 2. <set> <functional> should go into implementation file together with set variable 3. <typeinfo> should go 4. <boost/type_traits.hpp> <boost/type_traits/conversion_traits.hpp> should go. use is_const and is_polymorphic that you are really using. 5. namespace boost::serialization::detail 6. collection is used only in implementation file. should be moved there - no need to pollute the header with <set> dependency. less functor, operator<, collection_imp go together. And become free functor. 7. direction_enum. I would prefer name cast_direction. It's obvious that it's enum from the code. 8. inverse function never used 9. Why data members, constructors, virtual functions all messed up in seems like an arbitrary order. And all public. 10. Why cast is the pure virtual function? It's the only reason why void_caster_argument exist. Make return NULL; default implementation and eliminate second class. 11. Why insert is static function with argument? Couldn't we just self_register (see the code)? 12. void_cast_primitive, void_cast_register template parameters all uppercase. It's recommended to use mixed case in this case. 13. static_caster, dynamic_caster template parameters names better be decrypted from F, T to From, To 14. __declspec(noinline) doesn't work with MSVC6.5 15. void_cast_register need argument Base* dummy = NULL for it to work with MSVC 6.5 16. names of the format function arguments are using leading _. It's reserved for compilers. Better be trailing one. 17. void_cast register is extremely weird: { return expr1; expr2; } It's unclear for me why it should work at all. 18. Why void_cast_primitive and void_cast_derived are using different logic for instantiation. Why do you try to instantiate second on stack? It does not buy you anything - the price is complicated logic. 19. For the symmetry I would name void_cast void_upcast 20. You do not need const_cast to convert from void* to void const* 21. Boost recommends to keep * and & close to type not to variable. 22. why direction_enum is the member of void_caster? 23. at the end of the file in a comment wrong header guard. void_cast.cpp 1. After change in item 6 above you don't really need operator< any more. It's logic will move into one free "compare" functor. You don't need else clauses in this implementation. 2. void_cast implementation IMHO need some empty line separators. 3. at the end of void_cast implementation wrong namespace name in a comment collection_imp.hpp is not really used it's copy/pasted into stl.hpp and should not be part of the submission in it's current state. In fact in my proposition it should exist and all vector.hpp map.hpp reuse it. serialization_imp.hpp 1. eliminate type_traits header. use specific 2. set included but never used!! 3. #define typename is not only bad style it simply glitch cause it make MSVC to fail badly, as it was reported already in some post. The intend was solve MSVC dependent typename issue. Look through boost code for several fixes for this one (I really think it should be part of MSVC configuration. 4 .class serialization comment I would start with "for each type..." 5. In the same comment: passage " Note that I chose to avoid..." IMO is incorrect and alternative "failed" variant should be used instead. 6. type_info_extended_save/load hide constructors. which it should not be doing at all cause it inherited from boost::noncopyable and have non-default constructor already. 7. type_info_extended_save/load have repeated logic that could be shared though the base class 8. From the code unclear why above classes exist at all. IIUC in attempt to place an implementation into source file. Value is questionable considering size of the functions. 9. save_pointer_type template parameter should be TPtr. 10. It you take a look onto instantiate_nothing implementation you will be amused to see following: template<class T> struct instantiate_nothing { static type_info_extended_save & instantiate(){ BOOST_STATIC_ASSERT(false); type_info_extended_save *tix = NULL; return static_cast<type_info_extended_save &>(*tix); } }; a. Why this code dereference NULL pointer? b. Given BOOST_STATIC_ASSERT(false); why anything else exist at all? c. If this is not supposed to be instantiated why not to check directly in instantiate(...) that T is not fundamental and not enum? The same with both save and load sides 11 assert_polymorphic: why exist? Why not use static assert? do we need this check at all? serialization.cpp 1. comment on top state that it implements Persistence class 2. I would prefer STL header to *follow* local ones. Not vice-versa. 3. cobject_arg - the same story as with void_cast_argument. It should be eliminated by defining default behavior in base class. 4. near to the statement where find returns NULL there is worrying comment. Should not it be reported to the user? archive.hpp register_type function need T* dummy = NULL argument count == is.gcount needs static_cast archive.cpp why ARCHIVE_SIGNATURE is not std::string? hack in line 96 should be ifdefed Testing ------- void_test_cast.cpp Includes fstream and never use it. main had wrong specification. Did it work for you? I change it to use test/minimal.hpp and BOOST_CHECK See attached file test.cpp Should not it be names serialization_test.cpp I would consider separating it into several simple tests. MSVC6.5 was complaining about compiler limit reached. Conclusion ----------- In conclusion of this review I would like to thank Robert for his work that has definite potential. But in a light of above issues I vote NO to include the library now in boost. Once major issues will be resolved we will need to return to the library to check for some more subtle issues. Especially with registration. Once code is separated and cleaned a bit it will be easier to analyze it also. Looking forward to see those fixes. Regards, Gennadiy. _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost