I am analyzing some data and would like ideas on how to explain differences
between periods. The suggestions can either be in J or completely outside
of J. For example, perhaps I need to do some sort of weighted Poisson
regression. Or maybe the answer is that there's not enough information to
draw any inferences.
I would like to answer:
1. How much of the overall change in total cost per mile can be attributed
to different cars being driven?
2. How much can be attributed to less/more fuel efficient cars being driven
farther?
3. How much of the change can be attributed to fuel cost?
4. How much of the change can be attributed to change in speed?
My first cut is to look at what's the same and explain the differences
between matching records.
Here is fake data for 4 cars in period 1. The columns are period, car
number, hours driven, miles driven, and cost in dollars. All cars have
traveled for 1/2 hour and 30 miles at a cost of $2.75.
]Month1=.(4 # 1),.(i. 4),.(4 # 0.5),.(4 # 30),.(4 # 2.75)
1 0 0.5 30 2.75
1 1 0.5 30 2.75
1 2 0.5 30 2.75
1 3 0.5 30 2.75
Let's replace one of the cars with a gas guzzler
]Month1=.(4 # 1),.(i. 4),.(4 # 0.5),.(30 30 30 15),.(4 # 2.75)
1 0 0.5 30 2.75
1 1 0.5 30 2.75
1 2 0.5 30 2.75
1 3 0.5 15 2.75
The 4th car (car #3) went only 15 miles on $2.75 of gas, so its cost per
mile is double that of the others:
] 4{"1 Month1 % 3{"1 Month1
0.0916667 0.0916667 0.0916667 0.183333
Here is fake data for 3 cars in period 2. Two of the cars (#0 and #1) also
appear in period 1, and one car (#99) is new.
]Month2=.(3 # 2),.(0 1 99),.(3 # 0.5),.(30, 25, 40),.(2.75, 2.5,3.0)
2 0 0.5 30 2.75
2 1 0.5 25 2.5
2 99 0.5 40 3
I can calculate cost per mile:
cPM=: [: +/ 4{"1 ] % [: +/ 3{"1 ]
] cPM Month1
0.104762
] cPM Month2
0.0868421
The total cost per mile went down in Month 2 because the gas guzzler
(car #3, a Hummer?) is gone and car 99 was added with better fuel economy,
even though car #1's cost per mile got slightly worse.
We can look at these matches to better understand:
]Month1Matches=:((1{"1 Month1) e. (1{"1 Month2)) # Month1
1 0 0.5 30 2.75
1 1 0.5 30 2.75
]Month2Matches=:((1{"1 Month2) e. (1{"1 Month1)) # Month2
2 0 0.5 30 2.75
2 1 0.5 25 2.5
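One rough way to separate "different cars" from "the same cars behaving
differently" is to apply cPM to the matched rows only and compare that with
the overall change. This is only a sketch: treating the remainder as the
effect of cars entering and leaving is my own assumption, not an established
decomposition, and the names sameCar, total and mix are made up for
illustration. The data and verbs are repeated so the block runs on its own.

```j
NB. Data and verbs repeated from above
Month1=: (4 # 1),.(i. 4),.(4 # 0.5),.(30 30 30 15),.(4 # 2.75)
Month2=: (3 # 2),.(0 1 99),.(3 # 0.5),.(30 25 40),.(2.75 2.5 3)
cPM=: [: +/ 4{"1 ] % [: +/ 3{"1 ]
Month1Matches=: ((1{"1 Month1) e. (1{"1 Month2)) # Month1
Month2Matches=: ((1{"1 Month2) e. (1{"1 Month1)) # Month2

NB. Change in cost per mile among cars present in both months
]sameCar=: (cPM Month2Matches) - cPM Month1Matches   NB. 0.00378788

NB. Overall change in cost per mile
]total=: (cPM Month2) - cPM Month1                   NB. _0.0179198

NB. Remainder, ASSUMED here to reflect cars entering and leaving
]mix=: total - sameCar                               NB. _0.0217077
```

With these numbers the matched cars got slightly more expensive per mile,
so the whole decrease (and a bit more) would come from the change in which
cars were driven.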
Car #1 traveled 5 fewer miles and its cost dropped by 25 cents.
That is a 17% reduction in miles:
(3{"1 Month2Matches % 3{"1 Month1Matches)-1
0 _0.166667
And only a 9% reduction in cost:
(4{"1 Month2Matches % 4{"1 Month1Matches)-1
0 _0.0909091
One possible explanation is the fuel cost went up.
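To make that concrete, here is a sketch of the per-car cost per mile for the
matched cars in each month. It assumes the matched rows come out in the same
car order in both months (they do here: cars 0 and 1). The names cpm1 and
cpm2 are made up for illustration. Note that the data has no gallons column,
so a higher cost per mile cannot be split between fuel price and fuel
efficiency from these numbers alone.

```j
NB. Data and matches repeated from above so the block runs on its own
Month1=: (4 # 1),.(i. 4),.(4 # 0.5),.(30 30 30 15),.(4 # 2.75)
Month2=: (3 # 2),.(0 1 99),.(3 # 0.5),.(30 25 40),.(2.75 2.5 3)
Month1Matches=: ((1{"1 Month1) e. (1{"1 Month2)) # Month1
Month2Matches=: ((1{"1 Month2) e. (1{"1 Month1)) # Month2

NB. Cost per mile for each matched car: cost (column 4) % miles (column 3)
]cpm1=: (4{"1 Month1Matches) % 3{"1 Month1Matches   NB. 0.0916667 0.0916667
]cpm2=: (4{"1 Month2Matches) % 3{"1 Month2Matches   NB. 0.0916667 0.1

NB. Relative change in cost per mile, car by car
(cpm2 % cpm1) - 1                                   NB. 0 0.0909091
```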
Since both months show 30 minutes of drive time, I can conclude that the
car also slowed from an average of 60 mph to an average of 50 mph.
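The speed arithmetic can be sketched the same way, as miles divided by
hours for each matched car. The names mph1 and mph2 are made up for
illustration, and again the matched rows are assumed to be in the same car
order in both months.

```j
NB. Data and matches repeated from above so the block runs on its own
Month1=: (4 # 1),.(i. 4),.(4 # 0.5),.(30 30 30 15),.(4 # 2.75)
Month2=: (3 # 2),.(0 1 99),.(3 # 0.5),.(30 25 40),.(2.75 2.5 3)
Month1Matches=: ((1{"1 Month1) e. (1{"1 Month2)) # Month1
Month2Matches=: ((1{"1 Month2) e. (1{"1 Month1)) # Month2

NB. Average speed: miles (column 3) % hours (column 2)
]mph1=: (3{"1 Month1Matches) % 2{"1 Month1Matches   NB. 60 60
]mph2=: (3{"1 Month2Matches) % 2{"1 Month2Matches   NB. 60 50

NB. Change in average speed, car by car
mph2 - mph1                                         NB. 0 _10
```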
At this point I feel like I'm getting stuck and making fuzzy assumptions.
It seems like there should be a better way. If there were hundreds or
thousands of records, would that change things?
Thanks for any ideas
Joe