Trouble here is, I only want to call the hash functions as needed. This is doing file differencing, and if I only have a single file of (say) 200 megabytes, I never need to calculate its hash, as I'll never actually compare it to another file of exactly the same size.
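
One way to get this "only as needed" behaviour might be Clojure's delay and force, which postpone a computation until it is first requested and then cache the result. The following is only a rough sketch, not a tested solution for real files: file-info, exact-match?, quick-hash, full-hash and the calls atom are hypothetical names, and the two hash functions are string stand-ins (as in the quoted example below) rather than real file hashing.

```clojure
;; Hypothetical instrumentation: records which expensive functions actually ran.
(def calls (atom []))

;; Stand-ins for the real hashes; they hash the name string, not file contents.
(defn quick-hash [s]
  (swap! calls conj [:quickhash s])
  (hash (take 3 s)))

(defn full-hash [s]
  (swap! calls conj [:hash s])
  (hash s))

(defn file-info
  "Builds an immutable map for one file. The size is computed eagerly;
  both hashes are wrapped in delays, so they run at most once and only
  when something forces them."
  [filename]
  {:name      filename
   :size      (count filename)
   :quickhash (delay (quick-hash filename))
   :hash      (delay (full-hash filename))})

(defn exact-match?
  "Compares cheapest-first. `and` short-circuits, so a size mismatch
  never forces either hash."
  [a b]
  (and (= (:size a) (:size b))
       (= (force (:quickhash a)) (force (:quickhash b)))
       (= (force (:hash a)) (force (:hash b)))))
```

With this shape, comparing (file-info "abcdef") against (file-info "abcdefgh") stops at the size check and leaves calls empty, and a file that never meets another of the same size never has either hash computed. The cached values live with the rest of the file information, and each delay closes over the same filename string held in :name rather than a copy of it.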
- Korny

On Thu, May 28, 2009 at 1:25 PM, Mikio Hokari <mikiohok...@gmail.com> wrote:
>
> Hello.
>
> I am new to Clojure, but I gave it a try.
>
> code:
>
> (defn get-size [filename]
>   (println 'size filename)
>   (count filename))
> (defn get-quickhash [filename]
>   (println 'quickhash filename)
>   (hash (take 3 filename)))
> (defn get-hash [filename]
>   (println 'hash filename)
>   (hash filename))
>
> (def get-info (memoize
>   (fn [filename]
>     (map #(% filename) [get-size get-quickhash get-hash]))))
>
> (println "A")
> (nth (get-info "abc") 0)
> (nth (get-info "abc") 0)
> (println "B")
> (nth (get-info "abc") 0)
> (nth (get-info "abc") 1)
> (println "C")
> (nth (get-info "abc") 0)
> (nth (get-info "abc") 1)
> (nth (get-info "abc") 2)
> (println "D")
> (nth (get-info "abc") 0)
> (nth (get-info "abc") 1)
> (nth (get-info "abc") 2)
>
> result:
>
> A
> size abc
> B
> quickhash abc
> C
> hash abc
> D
>
>
> For memory efficiency, I suppose you could use state monads and a trie.
> But that would need a lot of lines of code, and it is too hard for me.
>
> Regards,
>
> ---------
> Mikio Hokari
>
> 2009/5/28 Korny Sietsma <ko...@sietsma.com>:
>>
>> Hi all,
>>
>> I have some ruby code that I'm thinking of porting to clojure, but I'm
>> not sure how to translate this idiom to a functional world:
>> I have objects that are externally immutable, but have internal
>> mutable state they use for optimisation, specifically in this case to
>> defer un-needed calculations.
>>
>> Basically, I have a FileInfo class that wraps a data file, used to
>> compare lots of files on my system.
>> It has an "exact_match" method similar to:
>>   def exact_match(other)
>>     return false if size != other.size
>>     return false if quickhash() != other.quickhash()
>>     return hash() == other.hash()
>>   end
>>
>> quickhash and hash store their results in instance variables so they
>> only need to do the expensive calculations once - and quite often they
>> never need to get calculated at all; I'm looking for duplicate files,
>> but many files have no duplicate, so probably never need to have their
>> contents hashed.
>>
>> How would I do this in a functional way? My first effort would be
>> something like
>>   (defn hash [filename] (memoize (... hash function ...)))
>> but I have a couple of problems with this:
>> - it doesn't seem to store the hash value with the rest of the file
>> information, which feels a bit ugly
>> - I assume it means storing the full filename three times, once in
>> the original file info structure, once in the memoized hash function,
>> and once in the memoized quickhash function. My program struggles to
>> get enough RAM to track as many files as I'd like already - storing
>> the filename multiple times would blow out memory quite badly.
>>
>> I guess I could define a unique key for each filename, and define hash
>> as a function on that key, but then hash would need to be able to
>> access the list of filenames somehow. It's starting to get beyond me
>> - I'm hoping there's a simpler option!
>>
>> Any suggestions? I'd hope this is not an uncommon idiom.
>>
>> - Korny
>>
>> --
>> Kornelis Sietsma  korny at my surname dot com
>> "Every jumbled pile of person has a thinking part
>> that wonders what the part that isn't thinking
>> isn't thinking of"

--
Kornelis Sietsma  korny at my surname dot com
"Every jumbled pile of person has a thinking part
that wonders what the part that isn't thinking
isn't thinking of"