I'm trying to learn how to use J to do the kinds of data manipulation
that I currently do in R.  As an example, I tried to read in a file of
quarterly home price index (HPI) data, linearly interpolate it to
monthly, and write it out.

I got a partial solution to work.  The data contains series for all
states, plus the US as a while, and I'm currently just interpolating the
series for the US.  Ideally, I'd generate output for all states as well,
but I couldn't figure out how to do that.

Also, this just doesn't feel "fluent."  Picking the columns out into
variables seems clumsy.  I suspect there's a better way to do the
interpolation.  Etc.

I'd appreciate any style hints, suggestions, or pointers to examples to
read.

Here's what I came up with, with the actual data load replaced with a
literal subset of the data:

   load 'tables/dsv'
   load 'tables/csv'
   load 'files'
   
   NB. Load the data
   NB. htStr=. readcsv '~/hpi_fiserv.csv'
   htData=. 0 : 0
   area,source,year,qtr,hpi
   CA,Fiserv,2009,1,          128.170000
   CA,Fiserv,2009,2,          128.650000
   CA,Fiserv,2009,3,          135.700000
   CA,Fiserv,2009,4,          138.870000
   CA,Fiserv,2010,1,          138.580000
   CA,Fiserv,2010,2,          143.650000
   CA,Fiserv,2010,3,          142.620000
   CA,Fiserv,2010,4,          138.020000
   NY,Fiserv,2009,1,          176.120000
   NY,Fiserv,2009,2,          176.700000
   NY,Fiserv,2009,3,          179.810000
   NY,Fiserv,2009,4,          178.210000
   NY,Fiserv,2010,1,          174.640000
   NY,Fiserv,2010,2,          178.780000
   NY,Fiserv,2010,3,          179.750000
   NY,Fiserv,2010,4,          174.400000
   US,Fiserv,2009,1,          129.180000
   US,Fiserv,2009,2,          133.180000
   US,Fiserv,2009,3,          137.510000
   US,Fiserv,2009,4,          136.000000
   US,Fiserv,2010,1,          132.100000
   US,Fiserv,2010,2,          138.230000
   US,Fiserv,2010,3,          135.640000
   US,Fiserv,2010,4,          130.380000
   )
   htStr=. fixcsv htData
   ht=. makenum htStr
   
   NB. Extract the columns, so I can manipulate them.
   area=. >}.0{"1 ht
   year=. >}.2{"1 ht
   qtr=. >}.3{"1 ht
   hpi=. >}.4{"1 ht
   
   NB. month index (numbers of months since 2000-01)
   NB. quarters are centered, 2000-Q1=2000-02=1, etc.
   month=. (1+3*<:qtr)+12*year-2000
   
   NB. Pick out the series for the US.
   isUS=. area-:"1 'US'
   hpiUS=. isUS#hpi
   monthUS=. isUS#month
   yearUS=. isUS#year
   
   NB. For testing, pick out the values from 2010
   h=. (yearUS=2010)#hpiUS
   m=. (yearUS=2010)#monthUS
   NB. h=. 132.1 138.23 135.64 130.38
   NB. m=. 121 124 127 130
   
   NB. Linearly interpolate to monthly.
   hm=. (+/ % #)0 1 2|.!.''"0 _ ]3#h
   mm=. ({.m)+i.3*#m
   
   NB. Last two values are junk, drop them.
   hm=. _2}.hm
   mm=. _2}.mm
   
   NB. Output the interpolated version to a file.
   ('month,hpi',LF,makecsv mm,.hm) fwrites 'test.csv'


In particular, I'm not sure how to represent a table of data.  For
example, given the definitions above, would it be better to use
something like

  (;:'month hpi'),:m;h

or the transpose of that?  Are there any utilities around for
manipulating tables of data like that?

Thanks,
Johann

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to