I've been working on the same issue. So far I have mostly just been researching various options, but I can give you my two cents...
It really depends on your goals and constraints. I have narrowed it down to two major families of serialization for storage and networking.

One is the JSON/YAML/XML style, where you generate a serialized version of data structures that are built primarily from vectors and hashes containing only simple data types. (Note: JSON is a subset of YAML, so a YAML parser can read JSON but not vice versa.) This is by far the fastest to develop and the most lightweight in terms of programmer time: basically one line each for read and write. The potential hidden cost depends on what data structures you use in your program. If you have clearly defined chunks of data to serialize, YAML works nicely, but for more complex structures you often have to do an intermediate conversion to simpler data structures, where you deal by hand with things like circular references and pointers to ephemeral data that you don't want serialized.

These text formats are, however, inefficient for storage, transmission and parsing in comparison to a more strictly defined protocol. If you need raw performance and you are willing to spend the effort defining your protocol, then I think something like Google's Protocol Buffers or Facebook's Thrift is a good option. They are basically the new-school versions of CORBA RPC. In essence, you define a schema for your messages or data serialization units, and then some tools generate classes or functions that are used to read, write and transmit this data. (SOAP pretty much works the same way, but it idiotically sits on XML too, so you get the worst of both worlds...) Again, if the data units to be serialized are self-contained this can work pretty smoothly, but with more complex structures you will also have to convert between the simple, generated classes and your more complex application classes. The real work, though, is in creating and maintaining your protocol definitions and the code that uses the generated classes.
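To make the first family concrete, here is a rough sketch of the "one line each for read/write" idea plus the intermediate conversion step. Note that it swaps in Clojure's own printer and EDN reader rather than YAML, since those ship with the language (clojure.edn needs Clojure 1.5 or later). The names ->storable, dump, load-data and the :socket key are all made up for illustration, so treat this as a sketch under those assumptions and not as a finished design.

```clojure
(require '[clojure.edn :as edn])

(defn ->storable
  "Hypothetical intermediate conversion: drops the entry that holds
  ephemeral data (an assumed :socket key) before serialization."
  [m]
  (dissoc m :socket))

(defn dump
  "Serializes plain Clojure data to a string using the printer."
  [data]
  (pr-str data))

(defn load-data
  "Reads a string produced by dump back into Clojure data.
  edn/read-string only reads data; it does not evaluate code."
  [s]
  (edn/read-string s))

;; An application map mixing plain data with something ephemeral.
(def session {:user "alice" :tags [:db :kv] :socket (Object.)})

;; Strip the ephemeral part, write to a string, read it back.
(def round-tripped (load-data (dump (->storable session))))
```

This handles keywords, vectors and maps, but it only works for data the reader understands; anything else still needs a conversion like ->storable first.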
I think the default for a language like Clojure should be YAML too. For dynamic languages where developer time is the focus, it is by far the quickest mechanism to get up and running with databases, configuration files, networking, etc. Maybe we should look into integrating the built-in Clojure data types with a YAML library, or otherwise creating a new one, so we can dump and load directly between serialized strings and Clojure data structures.

If you run up against the limits of YAML, then I would go with Protocol Buffers. They seem like a clean and efficient way to support multi-language communication without wasting time writing a bunch of custom serialization methods. It would be interesting if there were a way to generate .proto files by example, by sniffing YAML on the wire or something... It could at least help bootstrap the protocol definition phase.

Hopefully that helps.

-Jeff

Tayssir John Gabbour wrote:
> Hi!
>
> How should I approach serialization? I made a little test function
> which serializes and deserializes Clojure objects. It works for
> strings, integers, symbols, LazilyPersistentVectors and.. oddly..
> PersistentHashMaps that have exactly one element. (My Clojure is about
> a month old.)
>
> But for other things, like keywords and most PersistentHashMaps, it
> throws NotSerializableException.
>
> My imagined possible solutions:
>
> * Implement Serializable for Clojure data -- but is it possible in a
> dynamic "Hey I'll just write a new method!" way?
>
> * Go into Clojure's source and implement Serializable to the Java
> classes.
>
>
> My end goal is using a nonrelational DB like Tokyo Cabinet or
> BerkeleyDB.
>
> Thanks,
> Tayssir
>
>
> PS: Here's my test code:
>
> (defn my-identity "Copies obj through serialization and
> deserialization."
>   [obj]
>   (let [byte-out (new java.io.ByteArrayOutputStream)
>         obj-out  (new java.io.ObjectOutputStream byte-out)]
>     (try (.writeObject obj-out obj)
>          (finally (.close obj-out)))
>     (let [obj-in (new java.io.ObjectInputStream
>                       (new java.io.ByteArrayInputStream
>                            (.toByteArray byte-out)))]
>       (try (.readObject obj-in)
>            (finally (.close obj-in))))))