Time series with Clojure

Fernando Saldanha Fri, 19 Apr 2013 09:53:39 -0700

I am new to the Clojure world. After years of developing finance 
applications in R, I am trying to convert a relatively big R/Finance 
project into Clojure/Incanter. Some things are going very smoothly. I can 
see how the number of LOC is drastically reduced and the code is clean and 
concise.


However, in the core area of dealing with financial time series I am having 
difficulties. Here are some thoughts:

1) In R one works with matrices and data frames, analogous to Incanter's 
matrices and datasets. In R you can do calculations with both types; in 
Incanter, only with matrices. Data frames and datasets both allow 
heterogeneous data; matrices, in either language, do not.

2) In R a matrix can have both column and row names, in Incanter it can 
have neither.

3) From 1) and 2) it seems to me that in Incanter every time you want to do 
calculations you lose the naming of your data. This gives me a feeling of 
insecurity as I have to think about the ordering of the rows (which is 
usually not a problem) and of the columns (which is a big problem).

4) Finance people work primarily with time series. They tend to work with 
data frames or matrices in which each column is a time series. The R data 
frame structure fits nicely with this since a data frame is a list of its 
columns (loosely speaking). Although I don't know what is going on in the 
innards of Incanter, it seems to be focused on rows. I wonder if that has a 
performance penalty when one is working with columns. One could think of 
representing time series as rows in datasets, but that would lead to a loss 
of naming, as datasets don't have row names. Or one could work with columns 
in datasets and rows in matrices, which would require systematically 
transposing the data, which is expensive.

5) I understand that working with Clojure one loses the possibility of 
writing code like 

A[i, j] = something 

where A is a matrix. Here i and j may be numbers or vectors. Actually, in R 
a new matrix A is created when one executes a command like this, so 
performance is not the issue; convenience is. Would it be possible to have 
a function in Clojure/Incanter that, when called in the following way

(def B (foo A i j z))

would create a matrix B with the same dimensions and entries as A except 
that the subset of rows and columns defined by i and j would be replaced by 
z? (Here i and j could be vectors like [3 5 12] or just ints like 4)
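
A minimal sketch of what such a function might look like, written against 
plain Clojure vectors of row vectors rather than Incanter matrices. The 
names replace-submatrix and as-indices are hypothetical, indices are 
0-based (unlike R), and z is assumed to be either a scalar or a nested 
vector with one row per index in i and one column per index in j.

```clojure
(defn- as-indices
  "Wraps a single integer index in a vector; turns any other collection
   of indices into a vector."
  [idx]
  (if (number? idx) [idx] (vec idx)))

(defn replace-submatrix
  "Returns a copy of A (a vector of row vectors) in which the entries at
   rows i and columns j are replaced by z. A itself is not modified."
  [A i j z]
  (let [rows     (as-indices i)
        cols     (as-indices j)
        ;; A scalar z fills the whole selection; a nested z is read by position.
        value-at (fn [r c] (if (number? z) z (get-in z [r c])))]
    (reduce (fn [m [r c]]
              (assoc-in m [(nth rows r) (nth cols c)] (value-at r c)))
            A
            (for [r (range (count rows))
                  c (range (count cols))]
              [r c]))))

(def A [[1 4 7]
        [2 5 8]
        [3 6 9]])

;; Set column 1 of rows 0 and 2 to zero.
(def B (replace-submatrix A [0 2] 1 0))
;; B is [[1 0 7] [2 5 8] [3 0 9]]; A still holds its original entries.
```

Because Clojure vectors are persistent, B shares the untouched rows with A 
instead of copying them. Whether the same approach carries over efficiently 
to Incanter's matrix type is a separate question.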

6) To complement the functionality in 5) one would like to be able to apply 
a function to the columns or rows of a matrix, *with the parameters varying 
with the column*. Sometimes in R the functionality is already built into 
the function, as with pmin (parallel min). So, if I have 
a matrix

mat
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

I can call pmin and get the following:

pmin(mat, c(3, 4, 5))
     [,1] [,2] [,3]
[1,]    1    3    3
[2,]    2    4    4
[3,]    3    5    5

(This uses R's "recycling," which is not how Incanter deals with vectors of 
different lengths)

If a function does not have that functionality, one can write

t(apply(mat, 1, function(x, z) {x + z}, c(0, 1, 2)))

     [,1] [,2] [,3]
[1,]    1    5    9
[2,]    2    6   10
[3,]    3    7   11

In any case, just the ability to apply a function to a given column or row 
of a matrix, or to all of them, would be a big help. 

I wrote the functions

(defn matrix-map-col
  "Applies a function on each element of a column of a matrix."
  [A foo j] (matrix-map foo ($ :all j A)))

(defn matrix-maps-cols
  "Applies a sequence of functions to the elements of the columns of a matrix.
   The return value is a matrix with the same dimensions as the argument
   matrix."
  [A foos xs]
  (trans (matrix (map #(matrix-map-col A %1 %2) foos xs))))

which would be part of the solution. One would still have to add the 
ability to vary the parameters and return a matrix. Given my short 
experience with Clojure I wonder if they could be made faster/better and 
what would be the best way to implement the remaining functionality.
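
For comparison, here is a sketch of how the two R examples in 6) might be 
reproduced in plain Clojure, again on vectors of row vectors rather than 
Incanter matrices. The names map-cols-with and map-rows-with are 
hypothetical. Note that in the pmin example R's recycling runs down the 
columns, so the vector c(3, 4, 5) ends up paired with the rows.

```clojure
(defn map-cols-with
  "Applies (f x p) to every entry x of mat, where p is the element of
   params at the same column position as x."
  [f mat params]
  (mapv (fn [row] (mapv f row params)) mat))

(defn map-rows-with
  "Applies (f x p) to every entry x of mat, where p is the element of
   params at the same row position as x."
  [f mat params]
  (mapv (fn [row p] (mapv #(f % p) row)) mat params))

(def mat [[1 4 7]
          [2 5 8]
          [3 6 9]])

;; Counterpart of t(apply(mat, 1, function(x, z) {x + z}, c(0, 1, 2)))
(def shifted (map-cols-with + mat [0 1 2]))
;; shifted is [[1 5 9] [2 6 10] [3 7 11]]

;; Counterpart of pmin(mat, c(3, 4, 5))
(def capped (map-rows-with min mat [3 4 5]))
;; capped is [[1 3 3] [2 4 4] [3 5 5]]
```

Both functions return a matrix of the same shape as the input, since mapv 
over a row and a parameter vector of equal length keeps the row length.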

7) I wrote in R a class that extends R's matrices. It associates a Date 
object to each row and provides many other capabilities. When I extract 
data from an object I can see the date range and the series I am dealing 
with, which is crucial for checking calculations. I understand Clojure is 
not OO. How could I get similar capabilities? What kind of construct in 
Clojure could I use? I thought about a map with the following keys:

:date - a Java date vector or just a vector of strings of the form 
"20130415"

:data - One of the following two alternatives

a) A vector of maps, each one of which would have as a key the column name 
(a string) and as value a time series

b) An Incanter dataset

Each alternative has advantages and disadvantages. Has anyone thought about 
these issues?  Any comments would be very welcome.
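
To make the idea concrete, here is a rough sketch of the map-based 
structure, assuming dates are stored as "yyyyMMdd" strings under :date and 
that :data is a single map from series name to a vector of values (a 
variation on alternative a). The function names date-range, window and 
map-series are hypothetical, and the numbers are made up for illustration.

```clojure
(def prices
  ;; Illustrative values only, not real market data.
  {:date ["20130415" "20130416" "20130417" "20130418"]
   :data {"AAA" [10.0 10.5 10.2 10.8]
          "BBB" [20.0 19.5 19.8 20.1]}})

(defn date-range
  "Returns the first and last dates of the time series."
  [ts]
  [(first (:date ts)) (peek (:date ts))])

(defn window
  "Returns the sub-series whose dates fall within [from, to], inclusive.
   yyyyMMdd strings sort chronologically, so compare works on them directly."
  [ts from to]
  (let [keep? (fn [d] (and (<= (compare from d) 0)
                           (<= (compare d to) 0)))
        idxs  (vec (keep-indexed (fn [k d] (when (keep? d) k)) (:date ts)))
        pick  (fn [v] (mapv v idxs))]
    {:date (pick (:date ts))
     :data (into {}
                 (map (fn [[series-name xs]] [series-name (pick xs)]))
                 (:data ts))}))

(defn map-series
  "Applies f to every value of the named series, keeping dates and names."
  [ts series-name f]
  (update-in ts [:data series-name] #(mapv f %)))

(def w (window prices "20130416" "20130417"))
;; (date-range w)           is ["20130416" "20130417"]
;; (get-in w [:data "AAA"]) is [10.5 10.2]
```

With this layout the names and the date range travel with the data through 
every operation, which addresses the concern in 3). The cost is that 
numerical work on the values no longer goes through Incanter's matrix 
functions unless the columns are converted first.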

These are my thoughts for now. I will also post this on the Incanter group 
(I hope that is not a problem).

FS 
