Hi Fernando,

I would just like to point out, for those not yet aware, that there is a 
group of people interested in bringing proper multi-dimensional matrix 
capabilities to Clojure in the core.matrix project. It is somewhat inspired 
by NumPy, but with a distinctively Clojure flavour. See the discussion group 
here:
https://groups.google.com/forum/?fromgroups#!forum/numerical-clojure

Incanter pre-dates this initiative, and as such has its own way of dealing 
with matrices (which is fine, but not as comprehensive a solution). In the 
medium term, I hope it is possible to integrate the two, i.e. Incanter will 
use core.matrix as the underlying matrix library. Then we won't have this 
disconnect, and Incanter will benefit from all of the advanced features in 
core.matrix.

To some extent you can already use core.matrix quite well with Incanter: 
core.matrix already supports treating arbitrary Clojure sequences and 
persistent vectors as matrices / vectors.

I'll try and quickly comment on your thoughts from a core.matrix 
perspective.

On Saturday, 20 April 2013 00:49:08 UTC+8, Fernando Saldanha wrote:
>
> I am new to the Clojure world. After years of developing finance 
> applications in R, I am trying to convert a relatively big R/Finance 
> project into Clojure/Incanter. Some things are going very smoothly. I can 
> see how the number of LOC is drastically reduced and the code is clean and 
> concise.
>
> However, in the core area of dealing with financial time series I am 
> having difficulties. Here are some thoughts:
>
> 1) In R one works with matrices and data frames, analogous to Incanter's 
> matrices and datasets. In R you can do calculations with both types, in 
> Incanter only with matrices, but not with datasets. Both data frames and 
> datasets allow for heterogeneous data, both matrices do not.
>

core.matrix allows for matrices with heterogeneous data, so you can use 
them as general purpose datasets. It also supports specialised matrices 
with pure numbers (e.g. the pure double matrices in vectorz-clj).
 

>
> 2) In R a matrix can have both column and row names, in Incanter it can 
> have neither.
>

It might be smart to use metadata for row and column names in a standard 
way.
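
As a rough sketch of the metadata idea: this is plain Clojure on nested 
persistent vectors, not core.matrix API, and the :row-names / :col-names 
keys and both helper names are made up for illustration.

```clojure
;; Assumption: the matrix is a nested persistent vector. The metadata keys
;; :row-names and :col-names are hypothetical, not an agreed convention.
(defn named-matrix
  "Attaches row and column names to a nested-vector matrix as metadata."
  [rows row-names col-names]
  (with-meta rows {:row-names (vec row-names)
                   :col-names (vec col-names)}))

(defn named-get
  "Looks up one element by row name and column name."
  [m row-name col-name]
  (let [{:keys [row-names col-names]} (meta m)
        i (.indexOf row-names row-name)
        j (.indexOf col-names col-name)]
    (when (or (neg? i) (neg? j))
      (throw (ex-info "Unknown row or column name"
                      {:row row-name :col col-name})))
    (get-in m [i j])))

(def prices
  (named-matrix [[10.0 20.0]
                 [11.0 19.5]]
                ["20130415" "20130416"]
                ["AAA" "BBB"]))

(named-get prices "20130416" "BBB")
;; => 19.5
```

Since metadata does not take part in equality, prices still compares equal 
to the bare nested vector, so the numeric code does not need to know about 
the names.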
 

>
> 3) From 1) and 2) it seems to me that in Incanter every time you want to 
> do calculations you lose the naming of your data. This gives me a feeling 
> of insecurity as I have to think about the ordering of the rows (which is 
> usually not a problem) and of the columns (which is a big problem).
>
> 4) Finance people work primarily with time series. They tend to work with 
> data frames or matrices in which each column is a time series. The R data 
> frame structure fits nicely with this since a data frame is a list of its 
> columns (loosely speaking). Although I don't know what is going on in the 
> innards of Incanter, it seems to be focused on rows. I wonder if that has a 
> performance penalty when one is working with columns. One could think of 
> representing time series as rows in datasets, but that would lead to a loss 
> of naming, as datasets don't have row names. Or one could work with columns 
> in datasets and rows in matrices, which would require systematically 
> transposing the data, which is expensive.
>
> 5) I understand that working with Clojure one loses the possibility of 
> writing code like 
>
> A[i, j] = something 
>
> where A is a matrix. Here i and j may be numbers or vectors. Actually in R 
> a new matrix A is created when one executes a command like this, so it is not 
> the performance that is the issue. It is rather the convenience. Would it 
> be possible to have a function in Clojure/Incanter that when called in the 
> following way
>
> (def B (foo A i j z))
>
> would create a matrix B with the same dimensions and entries as A except 
> that the subset of rows and columns defined by i and j would be replaced by 
> z? (Here i and j could be vectors like [3 5 12] or just ints like 4)
>

core.matrix provides the mset function to set individual array elements, 
e.g. (mset A i j z), which returns a new array and leaves A unchanged.
Some implementations also support mutable arrays with (mset! A i j z): use 
these at your own risk, as they are not concurrency-safe.

I haven't yet put in an API function for setting entire submatrices at 
once, but I can see the value in that: I will look at adding this in the 
next iteration.
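
In the meantime, something along these lines might do for the (foo A i j z) 
case. It is only a sketch in plain Clojure on nested persistent vectors, not 
a core.matrix function; the names set-submatrix and as-indices are 
hypothetical, and z is assumed to be a single scalar.

```clojure
(defn- as-indices
  "Accepts a single int or a collection of ints; returns a collection."
  [x]
  (if (sequential? x) x [x]))

(defn set-submatrix
  "Returns a copy of nested-vector matrix A in which every element at the
   intersection of rows i and columns j is replaced by the scalar z.
   A itself is not modified."
  [A i j z]
  (reduce (fn [m [r c]] (assoc-in m [r c] z))
          A
          (for [r (as-indices i)
                c (as-indices j)]
            [r c])))

(def A [[1 2 3]
        [4 5 6]
        [7 8 9]])

;; Rows 0 and 2, column 1, replaced by 0.
(def B (set-submatrix A [0 2] 1 0))
;; B => [[1 0 3] [4 5 6] [7 0 9]], and A is unchanged
```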
 

>
> 6) To complement the functionality in 5) one would like to be able to 
> apply a function to the columns or rows of a matrix, *with the parameters 
> varying with the column*. Sometimes in R the functionality is already 
> embedded in the function. For example, the function pmin (parallel min). 
> So, if I have a matrix
>
> mat
>      [,1] [,2] [,3]
> [1,]    1    4    7
> [2,]    2    5    8
> [3,]    3    6    9
>
> I can call pmin and get the following:
>
> pmin(mat, c(3, 4, 5))
>      [,1] [,2] [,3]
> [1,]    1    3    3
> [2,]    2    4    4
> [3,]    3    5    5
>
> (This uses R's "recycling," which is not how Incanter deals with vectors 
> of different lengths)
>
> If a function does not have that functionality, one can write
>
> t(apply(mat, 1, function(x, z) {x + z}, c(0, 1, 2)))
>
>      [,1] [,2] [,3]
> [1,]    1    5    9
> [2,]    2    6   10
> [3,]    3    7   11
>
> In any case, just the ability to apply a function to a given column or 
> row, or to all the columns or rows of a matrix, would be a big help. 
>
> I wrote the functions
>
> (defn matrix-map-col
>   "Applies a function on each element of a column of a matrix."
>   [A foo j] (matrix-map foo ($ :all j A)))
>
> (defn matrix-maps-cols
>   "Applies a sequence of functions to the elements of the columns of a 
> matrix.
>    The return value is a matrix with the same dimensions as the argument 
> matrix."
>   [A foos xs]
>   (trans (matrix (map #(matrix-map-col A %1 %2) foos xs))))
>
> which would be part of the solution. One would still have to add the 
> ability to vary the parameters and return a matrix. Given my short 
> experience with Clojure I wonder if they could be made faster/better and 
> what would be the best way to implement the remaining functionality.
>

These seem useful: I'll add them to the TODO list for core.matrix.
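
For the "parameters varying with the column" part, a minimal sketch in plain 
Clojure could look like the following. It assumes a row-major nested vector 
rather than an Incanter or core.matrix matrix, and map-cols-with is a 
hypothetical name, not an existing function in either library.

```clojure
(defn map-cols-with
  "Applies (f x p) to every element x of nested-vector matrix A, where p is
   the parameter belonging to x's column. params needs one entry per column."
  [f A params]
  (mapv (fn [row] (mapv f row params)) A))

(def mat [[1 4 7]
          [2 5 8]
          [3 6 9]])

;; Same result as the R call t(apply(mat, 1, function(x, z) {x + z}, c(0, 1, 2)))
(map-cols-with + mat [0 1 2])
;; => [[1 5 9] [2 6 10] [3 7 11]]
```

Because each row is zipped against the parameter vector, no transpose is 
needed to get the result back in the original orientation.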
 

>
> 7) I wrote in R a class that extends R's matrices. It associates a Date 
> object to each row and provides many other capabilities. When I extract 
> data from an object I can see the date range and the series I am dealing 
> with, which is crucial for checking calculations. I understand Clojure is 
> not OO. How could I have similar capabilities? What kind of construct in 
> Clojure could I use? I thought about a map with the following keys:
>

core.matrix allows you to extend a set of protocols so that you can treat 
*any* type as a matrix. This is a powerful tool (it means that we can do 
things like provide core.matrix support for multiple Java matrix libraries, 
for example). So you could create your own type (with deftype or defrecord) 
with whatever features you desire. However - this approach might be 
overkill for a simple extension like this.
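
To give a flavour of the custom-type route, here is a small sketch with 
defrecord. The DateIndexed protocol, the DatedMatrix record and its field 
names are all hypothetical and only for illustration; they are not the 
core.matrix protocols, which you would additionally have to extend to make 
such a type usable as a matrix.

```clojure
(defprotocol DateIndexed
  "Hypothetical protocol for illustration; not part of core.matrix."
  (date-range [this] "Returns [first-date last-date].")
  (row-at [this date] "Returns the data row for the given date, or nil."))

(defrecord DatedMatrix [dates col-names rows]
  DateIndexed
  (date-range [_] [(first dates) (last dates)])
  (row-at [_ date]
    (let [i (.indexOf dates date)]
      (when-not (neg? i)
        (nth rows i)))))

(def dm (->DatedMatrix ["20130415" "20130416" "20130417"]
                       ["AAA" "BBB"]
                       [[10.0 20.0] [11.0 19.5] [12.0 19.0]]))

(date-range dm)        ;; => ["20130415" "20130417"]
(row-at dm "20130416") ;; => [11.0 19.5]
```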

I'd be tempted to try and do something with metadata: put some :date 
metadata on each row. You might need to be a bit careful about which 
operations preserve metadata and which won't, but apart from that it should 
all work fairly neatly, and it avoids the problem of date-tracking making 
your "core" data calculations more complicated.
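
A quick sketch of what I mean, again in plain Clojure with each row as a 
persistent vector. The helper names dated-row and map-row are made up for 
illustration; the point is mainly to show which operations keep the 
metadata and which drop it.

```clojure
(defn dated-row
  "Builds a row vector carrying its date as metadata."
  [date values]
  (with-meta (vec values) {:date date}))

(def series [(dated-row "20130415" [10.0 20.0])
             (dated-row "20130416" [11.0 19.5])])

(map (comp :date meta) series)
;; => ("20130415" "20130416")

;; assoc on a vector keeps the metadata...
(:date (meta (assoc (first series) 0 10.5)))
;; => "20130415"

;; ...but mapv builds a fresh vector, so the metadata is gone.
(meta (mapv inc (first series)))
;; => nil

(defn map-row
  "Like mapv over a single row, but copies the row's metadata to the result."
  [f row]
  (with-meta (mapv f row) (meta row)))

(:date (meta (map-row inc (first series))))
;; => "20130415"
```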
 

>
> :date - a Java date vector or just a vector of strings of the form 
> "20130415"
>
> :data - One of the following two alternatives
>
> a) A vector of maps, each one of which would have as a key the column name 
> (a string) and as value a time series
>
> b) An Incanter dataset
>
> Each alternative has advantages and disadvantages. Has anyone thought 
> about these issues?  Any comments would be very welcome.
>

> These are my thoughts for now. I will also post this on the Incanter group 
> (I hope that is not a problem).
>
> FS 
>
>

-- 
-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en