I apologize if this already exists somewhere, but I made a quick set
of routines that will compress/decompress strings, using Deflate.

This kind of thing was useful for compressing big XML documents into
something manageable that could be stored as text in a db. The text
string is compressed, then the compressed output is converted to a
base64 string, so it can be safely used in any context where a string
is expected (no funky binary chars). Again, these routines are meant
to compress human-readable text such as XML.

Because base64 encoding inflates the output by about a third (and
because of the nature of deflate + its dictionaries), there is a
certain minimum size the string being compressed must reach before
you'll see actual compression. I think that minimum may be around
600 bytes (unless your string is highly repetitive), but we used 1000
as our threshold.
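
To make the threshold idea concrete, here is a rough sketch of how one
might check whether compressing a given string pays off. It is not part
of the routines below: encoded-length and worth-compressing? are
hypothetical helper names, and it assumes Java 8+ for java.util.Base64.

```clojure
(import '(java.io ByteArrayOutputStream)
        '(java.util Base64)
        '(java.util.zip DeflaterOutputStream))

;; Hypothetical helper: number of characters in the deflated,
;; base64-encoded form of s. Assumes Java 8+ (java.util.Base64).
(defn encoded-length [^String s]
  (let [out (ByteArrayOutputStream.)]
    ;; with-open closes the deflater, which flushes the final block
    (with-open [deflater (DeflaterOutputStream. out)]
      (.write deflater (.getBytes s "UTF-8")))
    (count (.encodeToString (Base64/getEncoder) (.toByteArray out)))))

;; Hypothetical helper: true when the encoded form is shorter than s.
(defn worth-compressing? [^String s]
  (< (encoded-length s) (count s)))
```

A short string such as "hello" comes out longer than it went in, while a
long, repetitive XML-like string shrinks considerably, which is why a
fixed cutoff like our 1000 is a reasonable shortcut.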

The code is below. Perhaps someone can make it far more idiomatic, or
point me to the routines that already do this the Clojure way.

I used Sun's base64 encode/decode because I needed the decoded output
as raw bytes so as not to cause problems during decompression. Something
is lost in going from string -> bytes when using encode-str from contrib.
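
As an aside, on Java 8 and later the standard java.util.Base64 class
works directly on byte arrays, so the sun.misc classes are not needed
there. A minimal sketch, assuming Java 8+; encode64-str and
decode64-bytes are hypothetical names, not the functions used below:

```clojure
(import '(java.util Base64))

;; Hypothetical helper: raw bytes -> base64 String (assumes Java 8+).
(defn encode64-str [^bytes b]
  (.encodeToString (Base64/getEncoder) b))

;; Hypothetical helper: base64 String -> raw bytes, with no charset
;; conversion in between, so arbitrary binary data survives intact.
(defn decode64-bytes [^String s]
  (.decode (Base64/getDecoder) s))
```

Because the decoder hands back a byte array rather than a string, the
compressed data never passes through a character encoding on its way to
the inflater.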

;;; code begins here

(use 'clojure.contrib.duck-streams)

(defn str-to-bytes [s]
  (.getBytes s))

(defn str-from-bytes [b]
  (new String b))

; encode a raw array of bytes in base64, result as a string
(defn encode64 [b]
  (. (new sun.misc.BASE64Encoder) encode b))

; decode a string encoded in base 64, result as array of bytes
(defn decode64 [s]
  (let [decoder (new sun.misc.BASE64Decoder)]
    (. decoder decodeBuffer s)))

; compress human readable string and return it as base64 encoded
(defn compress [s]
  (let [b (str-to-bytes s)
        output (new java.io.ByteArrayOutputStream)
        deflater (new java.util.zip.DeflaterOutputStream
                      output
                      (new java.util.zip.Deflater) 1024)]
    (. deflater write b)
    (. deflater close)
    (str-from-bytes (encode64 (. output toByteArray)))))

; take a string that was compressed & base64 encoded.. and undo all that
(defn decompress [s]
  (let [b (decode64 s)
        input (new java.io.ByteArrayInputStream b)
        inflater (new java.util.zip.InflaterInputStream
                      input
                      (new java.util.zip.Inflater) 1024)
        result (to-byte-array inflater)]
    (. inflater close)
    (str-from-bytes result)))
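
For anyone on a newer JVM, here is a rough, self-contained sketch of the
same round trip without sun.misc or contrib. It is only one way to do
it: compress-str and decompress-str are hypothetical names, and it
assumes Java 8+ (java.util.Base64), clojure.java.io, and UTF-8 as the
text encoding.

```clojure
(require '[clojure.java.io :as io])
(import '(java.io ByteArrayInputStream ByteArrayOutputStream)
        '(java.util Base64)
        '(java.util.zip DeflaterOutputStream InflaterInputStream))

;; Hypothetical name: deflate the UTF-8 bytes of s, return base64 text.
(defn compress-str [^String s]
  (let [out (ByteArrayOutputStream.)]
    ;; closing the deflater stream writes out the remaining data
    (with-open [deflater (DeflaterOutputStream. out)]
      (.write deflater (.getBytes s "UTF-8")))
    (.encodeToString (Base64/getEncoder) (.toByteArray out))))

;; Hypothetical name: undo compress-str.
(defn decompress-str [^String s]
  (let [out (ByteArrayOutputStream.)]
    (with-open [inflater (InflaterInputStream.
                           (ByteArrayInputStream.
                             (.decode (Base64/getDecoder) s)))]
      ;; copy every inflated byte into the in-memory buffer
      (io/copy inflater out))
    (String. (.toByteArray out) "UTF-8")))
```

Usage would look like (decompress-str (compress-str xml)), which should
give back the original string. Naming the charset on both sides avoids
depending on the platform default encoding.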
