I'm having trouble writing a function that

   1. is idiomatic Clojure, and
   2. doesn't blow the stack.

The problem is that I have a time series of events, e.g.

({:idhistory 78758272, :timestamp #inst "2016-03-31T19:34:27.313000000-00:00",
  :nameid 5637, :stringvalue nil, :value 8000.0}
 {:idhistory 78756591, :timestamp #inst "2016-03-31T19:33:31.697000000-00:00",
  :nameid 5637, :stringvalue nil, :value 7368.0}
 {:idhistory 78754249, :timestamp #inst "2016-03-31T19:32:17.100000000-00:00",
  :nameid 5637, :stringvalue nil, :value 6316.0}
 {:idhistory 78753165, :timestamp #inst "2016-03-31T19:31:41.843000000-00:00",
  :nameid 5637, :stringvalue nil, :value 5263.0}
 {:idhistory 78751187, :timestamp #inst "2016-03-31T19:30:36.213000000-00:00",
  :nameid 5637, :stringvalue nil, :value 4211.0}
 {:idhistory 78749476, :timestamp #inst "2016-03-31T19:29:41.363000000-00:00",
  :nameid 5637, :stringvalue nil, :value 3158.0} ...)

which is to say, each event is a map, and each event has two critical keys,
:timestamp and :value. The series is sorted in descending order by
timestamp, i.e. most recent event first. A series can run to millions of
events; the average length is about half a million events. However, many
contain successive events at which the value does not change, and where the
value doesn't change I want to retain only the first (i.e. earliest) event
of each such run.
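
To make the intended behaviour concrete, here is a minimal sketch, not a definitive implementation. It assumes the value can be read directly with the :value key (the get-value-for-event helper used further down is not shown in this post), and the name consolidate-by-runs and the integer timestamps are hypothetical, chosen only for illustration.

```clojure
;; Sketch only: `consolidate-by-runs` is a hypothetical name, and events are
;; assumed to be maps with a :value key, as in the sample data above.
(defn consolidate-by-runs
  "Keep only the last event of each run of equal :value in `series`.
   Because the series is sorted newest-first, the last event of a run is
   the earliest one chronologically."
  [series]
  (map last (partition-by :value series)))

;; Integer timestamps are used here purely to keep the example short.
(def sample
  [{:timestamp 5 :value 8000.0}
   {:timestamp 4 :value 6316.0}
   {:timestamp 3 :value 6316.0}
   {:timestamp 2 :value 6316.0}
   {:timestamp 1 :value 5263.0}])

(consolidate-by-runs sample)
;; => ({:timestamp 5, :value 8000.0}
;;     {:timestamp 2, :value 6316.0}
;;     {:timestamp 1, :value 5263.0})
```

Both partition-by and map are lazy, so this sketch does not recurse once per event; each run of equal values is walked once by last.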

So far what I've got is:

(defn consolidate-events
  "Return a time series like this `series`, but without those events whose
   value is identical to the value of the preceding event."
  [series]
  (let [[car cadr & cddr] series]
    (cond
      (empty? series) series
      (= (get-value-for-event car)
         (get-value-for-event cadr)) (consolidate-events (rest series))
      :else (cons car (consolidate-events (rest series))))))


Obviously, with millions of events, or even merely hundreds of thousands, a
recursive function blows the stack. Furthermore, this one isn't even tail
call optimisable. I tried creating an inner function which I naively
thought should be tail call optimisable, but it fails to compile with
'Can only recur from tail position':

(defn consolidate-events
  "Return a time series like this `series`, but without those events whose
   value is identical to the value of the preceding event."
  [series]
  (remove
    nil?
    (let [inner (fn [series]
                  (let [[car cadr & cddr] series]
                    (if
                      (not (empty? series))
                      ;; then
                      (cons
                        (if
                          (= (get-value-for-event car)
                             (get-value-for-event cadr))
                          ;; then
                          nil
                          ;; else
                          car)
                        (if
                          (not (empty? series))
                          (recur (rest series)))))))]
    (inner series))))
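
For comparison, here is a minimal sketch of a variant in which recur really is in tail position. It is offered under assumptions rather than as the definitive fix: the name consolidate-events-loop is hypothetical, the value is read with :value instead of get-value-for-event (which is not shown in this post), and the result is built as a vector instead of a lazy sequence.

```clojure
;; Sketch only: `consolidate-events-loop` is a hypothetical name, and events
;; are assumed to be maps with a :value key.
(defn consolidate-events-loop
  "Return the events of `series`, as a vector, without any event whose
   :value equals the :value of the event that follows it in `series`."
  [series]
  (loop [remaining (seq series)   ; nil once the series is exhausted
         acc       []]            ; events kept so far, in original order
    (if (nil? remaining)
      acc
      (let [event (first remaining)
            more  (next remaining)]
        (if (and more (= (:value event) (:value (first more))))
          (recur more acc)                    ; same value follows: drop this event
          (recur more (conj acc event)))))))  ; value changes or series ends: keep it

(consolidate-events-loop
  [{:timestamp 4 :value 6316.0}
   {:timestamp 3 :value 6316.0}
   {:timestamp 2 :value 5263.0}])
;; => [{:timestamp 3, :value 6316.0} {:timestamp 2, :value 5263.0}]
```

Here nothing remains to be done after either recur, because the result is carried along in acc rather than assembled by a cons that wraps the recursive call. That wrapping cons is what takes recur out of tail position in the attempt above.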


Test for the function is as follows:

(deftest consolidate-events-test
  (testing "consolidate-events"
    (let [s1 [{:timestamp #inst "2016-03-31T19:34:27.313000000-00:00", :value 8000.0}
              {:timestamp #inst "2016-03-31T19:33:31.697000000-00:00", :value 7368.0}
              {:timestamp #inst "2016-03-31T19:32:17.100000000-00:00", :value 6316.0}
              {:timestamp #inst "2016-03-31T19:31:41.843000000-00:00", :value 5263.0}
              {:timestamp #inst "2016-03-31T19:30:36.213000000-00:00", :value 4211.0}
              {:timestamp #inst "2016-03-31T19:29:41.363000000-00:00", :value 3158.0}]
          s2 [{:timestamp #inst "2016-03-31T19:34:27.313000000-00:00", :value 8000.0}
              {:timestamp #inst "2016-03-31T19:33:31.697000000-00:00", :value 7368.0}
              {:timestamp #inst "2016-03-31T19:33:17.100000000-00:00", :value 6316.0}
              {:timestamp #inst "2016-03-31T19:32:27.100000000-00:00", :value 6316.0}
              {:timestamp #inst "2016-03-31T19:32:17.100000000-00:00", :value 6316.0}
              {:timestamp #inst "2016-03-31T19:31:41.843000000-00:00", :value 5263.0}
              {:timestamp #inst "2016-03-31T19:30:36.213000000-00:00", :value 4211.0}
              {:timestamp #inst "2016-03-31T19:29:41.363000000-00:00", :value 3158.0}]]
      (is (= s1 (consolidate-events s1))
          "There are no events in s1 that can be consolidated")
      (is (= s1 (consolidate-events s2))
          "When consolidated, s2 = s1")
      (is (not (= s2 (consolidate-events s2)))
          "When consolidated, s2 no longer equals s2"))))


Any help gratefully accepted! 
