[R] Smoothing Data-dplyr

2014-10-18 Thread umair durrani
Please note that I have already asked this question on stackoverflow.com but 
did not get a satisfactory answer. I have a data set containing velocities of 
2169 vehicles recorded at 
intervals of 0.1 seconds. So, there are many rows for an individual 
vehicle. Here I am reproducing the data only for the vehicle # 2:   
> dput(uma)
structure(list(Vehicle.ID = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2), Frame.ID = 13:445, Vehicle.velocity = c(40, 40, 40, 40, 
40, 40, 40, 40.02, 40.03, 39.93, 39.61, 39.14, 38.61, 38.28, 
38.42, 38.78, 38.92, 38.54, 37.51, 36.34, 35.5, 35.08, 34.96, 
34.98, 35, 34.99, 34.98, 35.1, 35.49, 36.2, 37.15, 38.12, 38.76, 
38.95, 38.95, 38.99, 39.18, 39.34, 39.2, 38.89, 38.73, 38.88, 
39.28, 39.68, 39.94, 40.02, 40, 39.99, 39.99, 39.65, 38.92, 38.52, 
38.8, 39.72, 40.76, 41.07, 40.8, 40.59, 40.75, 41.38, 42.37, 
43.37, 44.06, 44.29, 44.13, 43.9, 43.92, 44.21, 44.59, 44.87, 
44.99, 45.01, 45.01, 45, 45, 45, 44.79, 44.32, 43.98, 43.97, 
44.29, 44.76, 45.06, 45.36, 45.92, 46.6, 47.05, 47.05, 46.6, 
45.92, 45.36, 45.06, 44.96, 44.97, 44.99, 44.99, 44.99, 44.99, 
45.01, 45.02, 44.9, 44.46, 43.62, 42.47, 41.41, 40.72, 40.49, 
40.6, 40.76, 40.72, 40.5, 40.38, 40.43, 40.38, 39.83, 38.59, 
37.02, 35.73, 35.04, 34.85, 34.91, 34.99, 34.99, 34.97, 34.96, 
34.98, 35.07, 35.29, 35.54, 35.67, 35.63, 35.53, 35.53, 35.63, 
35.68, 35.55, 35.28, 35.06, 35.09, 35.49, 36.22, 37.08, 37.8, 
38.3, 38.73, 39.18, 39.62, 39.83, 39.73, 39.58, 39.57, 39.71, 
39.91, 40, 39.98, 39.97, 40.08, 40.38, 40.81, 41.27, 41.69, 42.2, 
42.92, 43.77, 44.49, 44.9, 45.03, 45.01, 45, 45, 45, 45, 45, 
45, 45, 45, 45, 45, 45, 44.99, 45.03, 45.26, 45.83, 46.83, 48.2, 
49.68, 50.95, 51.83, 52.19, 52, 51.35, 50.38, 49.38, 48.63, 48.15, 
47.87, 47.78, 48.01, 48.63, 49.52, 50.39, 50.9, 50.96, 50.68, 
50.3, 50.05, 49.94, 49.87, 49.82, 49.82, 49.88, 49.96, 50, 50, 
49.98, 49.98, 50.16, 50.64, 51.43, 52.33, 53.01, 53.27, 53.22, 
53.25, 53.75, 54.86, 56.36, 57.64, 58.28, 58.29, 57.94, 57.51, 
57.07, 56.64, 56.43, 56.73, 57.5, 58.27, 58.55, 58.32, 57.99, 
57.89, 57.92, 57.74, 57.12, 56.24, 55.51, 55.1, 54.97, 54.98, 
55.02, 55.03, 54.86, 54.3, 53.25, 51.8, 50.36, 49.41, 49.06, 
49.17, 49.4, 49.51, 49.52, 49.51, 49.45, 49.24, 48.84, 48.29, 
47.74, 47.33, 47.12, 47.06, 47.07, 47.08, 47.05, 47.04, 47.25, 
47.68, 47.93, 47.56, 46.31, 44.43, 42.7, 41.56, 41.03, 40.92, 
40.92, 40.98, 41.19, 41.45, 41.54, 41.32, 40.85, 40.37, 40.09, 
39.99, 39.99, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 39.98, 
39.97, 40.1, 40.53, 41.36, 42.52, 43.71, 44.57, 45.01, 45.1, 
45.04, 45, 45, 45, 45, 45, 45, 44.98, 44.97, 45.08, 45.39, 45.85, 
46.2, 46.28, 46.21, 46.29, 46.74, 47.49, 48.35, 49.11, 49.63, 
49.89, 49.94, 49.97, 50.14, 50.44, 50.78, 51.03, 51.12, 51.05, 
50.85, 50.56, 50.26, 50.06, 50.1, 50.52, 51.36, 52.5, 53.63, 
54.46, 54.9, 55.03, 55.09, 55.23, 55.35, 55.35, 55.23, 55.07, 
54.99, 54.98, 54.97, 55.06, 55.37, 55.91, 56.66, 57.42, 58.07, 
58.7, 59.24, 59.67, 59.95, 60.02, 60, 60, 60, 60, 60, 60.01, 
60.06, 60.23, 60.65, 61.34, 62.17, 62.93, 63.53, 64, 64.41, 64.75, 
65.04, 65.3, 65.57, 65.75, 65.74, 65.66, 65.62, 65.71, 65.91, 
66.1, 66.26, 66.44, 66.61, 66.78, 66.91, 66.99, 66.91, 66.7, 
66.56, 66.6, 66.83, 67.17, 67.45, 67.75, 68.15, 68.64, 69.15, 
69.57, 69.79, 69.79, 69.72, 69.72, 69.81, 69.94, 70, 70.01, 70.02, 
70.03)), row.names = c(NA, 433L), class = "data.frame", .Names = 
c("Vehicle.ID", "Frame.ID", "Vehicle.velocity"))
I am trying to smooth the data using dplyr. Here is the code:
uma <- tbl_df(uma)
uma <- uma %>% # take data frame
  group_by(Vehicle.ID) %>% # group by Vehicle ID
  mutate(i = 1:length(Frame.ID), im1 = i-1, Nai = length(Frame.ID) - i,
 Dv = pmin(im1, 

[R] Extracting values from rows which meet a condition in R 3.0.2

2014-04-14 Thread umair durrani
Hi, I have a big data frame with millions of rows and more than 20 columns. Let 
me first describe what the data is to make question more clear. The original 
data frame consists of locations, velocities and accelerations of 2169 vehicles 
during a 15 minute period. Each vehicle has a unique Vehicle.ID, an ID of the 
time frame in which it was observed i.e. Frame.ID, the velocity of vehicle in 
that frame i.e. svel, the acceleration of vehicle in that frame i.e. sacc and 
the class of that vehicle, vehicle.class, i.e. 1= motorcycle, 2= car, 3 = 
truck. These variables were recorded after every 0.1 seconds i.e. each frame is 
0.1 seconds. Here are the first 6 rows:
> dput(head(df))
structure(list(Vehicle.ID = c(2L, 2L, 2L, 2L, 2L, 2L), Frame.ID = 133:138,
Vehicle.class = c(2L, 2L, 2L, 2L, 2L, 2L), Lane = c(2L, 2L, 2L, 2L, 2L, 2L),
svel = c(37.29, 37.11, 36.96, 36.83, 36.73, 36.64), sacc = c(0.07, 0.11,
0.15, 0.19, 0.22, 0.25)), .Names = c("Vehicle.ID", "Frame.ID",
"Vehicle.class", "Lane", "svel", "sacc"), row.names = 7750:7755, class =
"data.frame")
There are some instances in the vehicles' journeys during the 15 minute 
recording period when they stop completely, i.e. svel==0. This continues for 
some frames and then the vehicles gain speed again. For the purpose of 
reproducibility I am creating an example data set as follows:
x <- data.frame(Vehicle.ID = c(rep(10,5), rep(20,5), rep(30,5), rep(40,5), 
rep(50,5)), vehicle.class = c(rep(2,10), rep(3,10), rep(1,5)), svel = 
rep(c(1,0,0,0,3),5), sacc = rep(c(0.3,0.001,0.001,0.002,0.5),5))
As described above some vehicles stop and have zero velocity for some time but 
later accelerate to get up to speed. I want to find the acceleration, sacc they 
apply after having zero velocity for some time (moving from standstill 
position). This means that I should be able to look at the FIRST row AFTER the 
last frame in which svel==0. In the example data this means that the car 
(vehicle.class==2) having a Vehicle.ID==10 had a velocity, svel equal to 1 as 
seen in the first row. Later, it stopped for 3 frames (3 consecutive rows) and 
then accelerated to velocity, svel, equal to 3. I want the acceleration sacc it 
applied in those 2 frames (rows 4 and 5 for vehicle 10, which come out to be 
0.002 and 0.500). This means that for example data, following should be the 
output by vehicle.class:
output <- data.frame(Vehicle.ID = c(10,10,20,20,30,30,40,40,50,50), 
vehicle.class = c(2,2,2,2,3,3,3,3,1,1), xf = rep(c('l','f'),5), sacc = 
rep(c(0.002,0.500),5))
xf identifies the last row ('l') in which svel==0 and the first row ('f') after 
that. I have tried using plyr and a for loop to split by vehicle.class but am 
not sure how to extract the sacc. Please note that xf should be a part of the 
output. 
It is not in given data. The original data frame df has 2169 vehicles, some 
stopped and some did not so not all vehicles had svel==0. The vehicles which 
did stop didn't do it at the same time. Also, the number of rows in which 
svel==0 is different vehicle to vehicle.
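
A minimal base R sketch of one way to pull out the 'l' and 'f' rows described above. It assumes the rows of each vehicle are already in frame order; the helper name pick_restart and the result name restarts are made up for illustration and are not from the original post.

```r
# Example data from the post (5 vehicles, 5 frames each)
x <- data.frame(
  Vehicle.ID    = rep(c(10, 20, 30, 40, 50), each = 5),
  vehicle.class = c(rep(2, 10), rep(3, 10), rep(1, 5)),
  svel          = rep(c(1, 0, 0, 0, 3), 5),
  sacc          = rep(c(0.3, 0.001, 0.001, 0.002, 0.5), 5)
)

# Hypothetical helper: for one vehicle, keep the last stopped row ("l") of
# each stop and the first moving row after it ("f").
pick_restart <- function(d) {
  stopped <- d$svel == 0
  n <- nrow(d)
  # TRUE at row i when row i is stopped and row i + 1 is moving
  last_stop <- c(stopped[-n] & !stopped[-1], FALSE)
  idx <- which(last_stop)
  if (length(idx) == 0) return(NULL)       # vehicle never restarted
  rows <- as.vector(rbind(idx, idx + 1))   # interleave l and f rows
  out <- d[rows, c("Vehicle.ID", "vehicle.class", "sacc")]
  out$xf <- rep(c("l", "f"), times = length(idx))
  out
}

# Split by vehicle, apply the helper, and bind the pieces back together
restarts <- do.call(rbind, lapply(split(x, x$Vehicle.ID), pick_restart))
rownames(restarts) <- NULL
```

Vehicles that never stop simply contribute no rows, and a vehicle that stops more than once contributes one 'l'/'f' pair per stop.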
Thanks,
Umair Durrani
Master's candidate 
Civil and Environmental Engineering
University of Windsor 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Detecting Vehicle locations using R

2014-02-21 Thread umair durrani
The problem is resolved already. Please don't include this question in future 
mailing list

Umair Durrani

email: umairdurr...@outlook.com


 Subject: Re: [R] Detecting Vehicle locations using R
 From: jdnew...@dcn.davis.ca.us
 Date: Thu, 20 Feb 2014 20:06:28 -0800
 To: umairdurr...@outlook.com; r-help@r-project.org
 
 Please read the Posting Guide, which offers several applicable tips, such as:
 Don't post in HTML format... it tends to corrupt your code samples.
 Please provide a hand-generated example result that should be what the 
 solution should transform your sample data into.
 Please show the code that did not work... you may be closer to the solution 
 than you think, or we may see from it that you could benefit from learning a 
 concept you don't know exists yet. This is not supposed to be a forum that 
 does your work for you.
 ---
 Jeff NewmillerThe .   .  Go Live...
 DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
   Live:   OO#.. Dead: OO#..  Playing
 Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
 /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
 --- 
 Sent from my phone. Please excuse my brevity.
 
 On February 20, 2014 6:30:37 PM PST, umair durrani umairdurr...@outlook.com 
 wrote:
 I have a data frame of vehicle trajectories. Here's a snapshot:
 dput(head(df))
 structure(list(vehicle = c(2L, 2L, 2L, 2L, 2L, 2L),
 frame = 43:48, globalx = c(6451214.156, 6451216.824, 6451219.616,
 6451222.548, 6451225.462, 6451228.376), class = c(2L, 2L, 2L, 2L,
 2L, 2L), velocity = c(37.76, 37.9, 38.05, 38.18, 38.32, 38.44),
 lane = c(2L, 2L, 2L, 2L, 2L, 2L)), .Names = c("vehicle", "frame",
 "globalx", "class", "velocity", "lane"), row.names = c(NA, 6L), class =
 "data.frame")
 where, vehicle= vehicle id, frame= frame id of time frames in which it
 was observed, globalx = x coordinate of the front center of the
 vehicle, class=type of vehicle (1=motorcycle, 2=car, 3=truck),
 velocity=speed of vehicles in feet per second, lane= lane number (there
 are 6 lanes).The 'frame' represents one tenth of a second i.e. one
 frame is 0.1 seconds long. At frame 't' the vehicle has globalx
 coordinate x(t) and at frame 't-1' (0.1 seconds before) it was x(t-1).
 If the reference location has globalx coordinate=6451179.1116 then I
 simply want a new column in df called 'u' which has 'yes' in the row
 where globalx of the vehicle was greater than reference coordinate at
 'U' AND the previous consecutive globalx coordinate of this vehicle was
 less than reference coordinate at 'U'(i.e. reference coordinate is
 between the 2 locations of vehicle in two consecutive frames). This
 means that if df has 100 vehicles then there will be 100 'yes' in 'u'
 column because every vehicle will meet the above criteria only once. I have
 tried to do this by running the function with ifelse and also tried to do
 the same using a for loop but it doesn't work for me.
 
 



[R] Detecting Vehicle locations using R

2014-02-20 Thread umair durrani
I have a data frame of vehicle trajectories. Here's a snapshot:
dput(head(df))
structure(list(vehicle = c(2L, 2L, 2L, 2L, 2L, 2L), frame = 
43:48, globalx = c(6451214.156, 6451216.824, 6451219.616, 6451222.548, 
6451225.462, 6451228.376), class = c(2L, 2L, 2L, 2L, 2L, 2L), velocity = 
c(37.76, 37.9, 38.05, 38.18, 38.32, 38.44), lane = c(2L, 2L, 2L, 2L, 2L, 
2L)), .Names = c("vehicle", "frame", "globalx", "class", "velocity", "lane"), 
row.names = c(NA, 6L), class = "data.frame")
where vehicle = vehicle ID, frame = ID of the time frame in which it was 
observed, globalx = x coordinate of the front center of the vehicle, class = 
type of vehicle (1=motorcycle, 2=car, 3=truck), velocity = speed of the vehicle 
in feet per second, lane = lane number (there are 6 lanes). Each 'frame' 
represents one tenth of a second, i.e. one frame is 0.1 seconds long. At frame 
't' the vehicle has globalx coordinate x(t) and at frame 't-1' (0.1 seconds 
before) it was x(t-1). If the reference location 'U' has globalx 
coordinate=6451179.1116 then I simply want a new column in df called 'u' which 
has 'yes' in the row where globalx of the vehicle was greater than the 
reference coordinate at 'U' AND the previous consecutive globalx coordinate of 
this vehicle was less than the reference coordinate at 'U' (i.e. the reference 
coordinate is between the 2 locations of the vehicle in two consecutive 
frames). This means that if df has 100 vehicles then there will be 100 'yes' 
in the 'u' column because every vehicle will meet the above criteria only 
once. I have tried to do this by running the function with ifelse and also 
tried to do the same using a for loop but it doesn't work for me.
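
One possible vectorised approach to the crossing test described above, as a sketch only. The three columns and their values are invented for illustration (only the reference coordinate comes from the post), and the strict inequalities follow the wording "greater than" and "less than", so a vehicle sitting exactly on the reference coordinate would not be flagged.

```r
# Made-up trajectories for two vehicles; only `ref` is taken from the post
df <- data.frame(
  vehicle = c(1, 1, 1, 1, 2, 2, 2),
  frame   = c(10, 11, 12, 13, 20, 21, 22),
  globalx = c(6451175.0, 6451178.0, 6451180.5, 6451183.0,
              6451177.5, 6451179.9, 6451182.2)
)
ref <- 6451179.1116

# Make sure consecutive rows are consecutive frames of the same vehicle
df <- df[order(df$vehicle, df$frame), ]

# Previous globalx within the same vehicle (NA on each vehicle's first row)
prev_x <- ave(df$globalx, df$vehicle,
              FUN = function(v) c(NA, v[-length(v)]))

# 'yes' only where the reference lies between the previous and current position
crossed <- !is.na(prev_x) & prev_x < ref & df$globalx > ref
df$u <- ifelse(crossed, "yes", "no")
```

With this made-up data, each vehicle gets exactly one 'yes', in the first frame after it passes the reference coordinate.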


  


[R] R 3.0.2 How to Split-Apply-Combine using various Columns

2014-01-25 Thread umair durrani
I am sorry for the previous mail; it was garbled by the tables I copied into 
the mail. Here is the text version:

Hello everyone,
I have a large vehicle trajectory data set, of which the following is a small part:
vehicle frame globalx class velocity lane
1       221   6451260 2     23.37    5
1       222   6451261 2     23.16    5
1       223   6451263 2     22.94    5
1       224   6451264 2     22.85    5
2       115   6451181 2     35.00    4
2       116   6451184 2     35.01    4
2       117   6451186 2     35.03    4
2       118   6451188 2     34.92    4
2       119   6451191 2     34.49    4
2       120   6451193 2     33.66    4
2       121   6451195 2     32.50    4

vehicle = unique ID of the vehicle. It is repeated (in the column) for every frame in which it was observed
frame = ID of the frame in which the vehicle was observed. One frame is 0.1 seconds long
class = class of vehicle, i.e. 1=motorcycle, 2=car, 3=truck
velocity = velocity of the vehicle in feet per second
lane = lane number in which the vehicle is present in a particular frame

'frame' number can also repeat e.g. in frame 120 the example data shows vehicle 
2 was observed but in the original data many more vehicles might have been 
observed in this frame. Similarly, 'class' is defined above and all three 
classes are present in the original data (here example data only shows classes 
2 and 3 i.e. cars and trucks).
I need to determine two things:
1) Number of vehicles observed in every 30 seconds, i.e. 300 frames
2) Average velocity of each vehicle class in every 30 seconds
 This means that the first step might be to determine the minimum and maximum 
 frame numbers and then divide them into slots so that every slot has 300 
 frames. In my original data I found 22 as min and 9233 as max frame number. 
 This makes 31 time slots: 22-322, 322-622, ..., 9022-9233. I need the 
 following output: 
TimeSlot   Total-Cars Total-Trucks Total-Motorcycles MeanVelocity-Cars MeanVelocity-Trucks MeanVelocity-Motorcycles
22-322
322-622
...
9022-9233


Umair Durrani

email: umairdurr...@outlook.com


 Date: Fri, 24 Jan 2014 19:45:27 -0800
 From: smartpink...@yahoo.com
 Subject: Re: [R] R 3.0.2 How to Split-Apply-Combine using various Columns
 To: umairdurr...@outlook.com
 
 Hi,
 
 Please check your post and see how helpful it is for another person to copy 
 and paste your example dataset to run the code. It is always useful to use 
 ?dput()
 dput(head(data,10)). Also, please post using plain text.

  


[R] Corrected - R 3.0.2 How to Split-Apply-Combine using various Columns

2014-01-25 Thread umair durrani
Hello everyone,
Here is the version using dput. I am sorry for the junk I posted before. I 
have a large vehicle trajectory data set, of which the following is a small 
part:
structure(list(vehicle = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), frame = 
c(221L, 222L, 223L, 224L, 115L, 116L, 117L, 118L, 119L, 120L, 121L), globalx = 
c(6451259.685, 6451261.244, 6451262.831, 6451264.362, 6451181.179, 6451183.532, 
6451185.884, 6451188.237, 6451190.609, 6451192.912, 6451195.132), class = c(2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), velocity = c(23.37, 23.16, 22.94, 
22.85, 35, 35.01, 35.03, 34.92, 34.49, 33.66, 32.5), lane = c(5L, 5L, 5L, 5L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L)), .Names = c("vehicle", "frame", "globalx", 
"class", "velocity", "lane"), row.names = c(85L, 86L, 87L, 88L, 447L, 448L, 
449L, 450L, 451L, 452L, 453L), class = "data.frame")
Explanation of columns:
vehicle = unique ID of the vehicle. It is repeated (in the column) for every frame in which it was observed
frame = ID of the frame in which the vehicle was observed. One frame is 0.1 seconds long
class = class of vehicle, i.e. 1=motorcycle, 2=car, 3=truck
velocity = velocity of the vehicle in feet per second
lane = lane number in which the vehicle is present in a particular frame

'frame' number can also repeat e.g. in frame 120 the example data shows vehicle 
2 was observed but in the original data many more vehicles might have been 
observed in this frame. Similarly, 'class' is defined above and all three 
classes are present in the original data (here example data only shows classes 
2 and 3 i.e. cars and trucks).
I need to determine two things:
1) Number of vehicles observed in every 30 seconds, i.e. 300 frames
2) Average velocity of each vehicle class in every 30 seconds
 This means that the first step might be to determine the minimum and maximum 
 frame numbers and then divide them into slots so that every slot has 300 
 frames. In my original data I found 22 as min and 9233 as max frame number. 
 This makes 31 time slots: 22-322, 322-622, ..., 9022-9233. I need the 
 following columns in one table as output (note that the TimeSlot column 
 should contain the time intervals as described before): TimeSlot, Total-Cars, 
 Total-Trucks, Total-Motorcycles, MeanVelocity-Cars, MeanVelocity-Trucks, 
 MeanVelocity-Motorcycles
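
A rough base R sketch of the split-apply-combine step described above, using cut() to build the 300-frame slots and tapply() to summarise within them. The small traj data frame is invented for illustration; only the frame range 22 to 9233 and the class coding come from the post. Slots are taken as half-open, [22,322), [322,622), ..., with the last one closed at 9233, which is an assumption about how the shared edges should be handled.

```r
# Made-up observations spanning the frame range quoted in the post
traj <- data.frame(
  vehicle  = c(1, 1, 2, 2, 3, 3, 4),
  frame    = c(22, 150, 300, 322, 330, 621, 9233),
  class    = c(2, 2, 2, 2, 3, 3, 1),
  velocity = c(30, 32, 40, 42, 20, 22, 50)
)

# Slot edges every 300 frames from the minimum frame; the last slot is
# shortened so that it ends at the maximum frame.
breaks <- unique(c(seq(min(traj$frame), max(traj$frame), by = 300),
                   max(traj$frame)))
traj$TimeSlot <- cut(traj$frame, breaks = breaks,
                     include.lowest = TRUE, right = FALSE, dig.lab = 5)
traj$class <- factor(traj$class, levels = c(1, 2, 3),
                     labels = c("Motorcycles", "Cars", "Trucks"))

# Distinct vehicles per slot and class (0 where none were observed)
counts <- tapply(traj$vehicle, list(traj$TimeSlot, traj$class),
                 function(v) length(unique(v)))
counts[is.na(counts)] <- 0

# Mean velocity per slot and class (NA where a class was not observed)
means <- tapply(traj$velocity, list(traj$TimeSlot, traj$class), mean)
```

Both results are matrices with one row per time slot and one column per class, so cbind(counts, means) would give a single table in the requested layout.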



  


[R] R 3.0.2 How to Split-Apply-Combine using various Columns

2014-01-24 Thread umair durrani

.
.
.
9022-9233

I have tried many things and also used some suggestions from stackoverflow but 
still am unable to get output like this with input data. Please help.



Umair Durrani

email: umairdurr...@outlook.com
  


Re: [R] How to get the proportions of data with respect to two variables in R?

2013-12-01 Thread umair durrani
Thanks for your answers Arun. Unfortunately the code didn't work and I am 
getting the error: arguments must have same length. Here are sample input and 
output:
INPUT:
Vehicle ID Vehicle Class Vehicle Length Vehicle Width
2 2 13.5 4.5
2 2 13.5 4.5
2 2 13.5 4.5
2 2 13.5 4.5
3 2 13.5 4.0
3 2 13.5 4.0
3 2 13.5 4.0
3 2 13.5 4.0
4 2 10.0 4.5
4 2 10.0 4.5
4 2 10.0 4.5
4 2 10.0 4.5
5 3 23.0 4.5
5 3 23.0 4.5
5 3 23.0 4.5
5 3 23.0 4.5
6 3 76.5 4.5
6 3 76.5 4.5
6 3 76.5 4.5
6 3 76.5 4.5
6 3 76.5 4.5
7 1 10.0 3.0
7 1 10.0 3.0
7 1 10.0 3.0
7 1 10.0 3.0
8 2 13.5 5.5
8 2 13.5 5.5
8 2 13.5 5.5
8 2 13.5 5.5

Note that in this input: Total number of cars=4, trucks=2, motorcycles=1

Sample Output

Group: cars
VehicleLength VehicleWidth Proportion
13.5 4.5 0.25
13.5 4.0 0.25
13.5 5.5 0.25
10.0 4.5 0.25

Group: trucks
VehicleLength VehicleWidth Proportion
23.0 4.5 0.5
76.5 4.5 0.5

Group: motorcycles
VehicleLength VehicleWidth Proportion
10.0 3.0 1.0
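
One way the sample output above could be produced in base R, as a sketch. The column names id, class, length and width are shortened stand-ins for the real 'Vehicle ID', 'Vehicle Class', 'Vehicle Length' and 'Vehicle Width' columns, and the data frame b is rebuilt from the sample input.

```r
# Sample input from the post: one row per observation of each vehicle
n_obs <- c(4, 4, 4, 4, 5, 4, 4)
b <- data.frame(
  id     = rep(c(2, 3, 4, 5, 6, 7, 8), times = n_obs),
  class  = rep(c(2, 2, 2, 3, 3, 1, 2), times = n_obs),
  length = rep(c(13.5, 13.5, 10, 23, 76.5, 10, 13.5), times = n_obs),
  width  = rep(c(4.5, 4, 4.5, 4.5, 4.5, 3, 5.5), times = n_obs)
)

# One row per vehicle, so repeated observations do not inflate the counts
veh <- unique(b)

# Number of vehicles per (class, length, width) model
models <- aggregate(id ~ class + length + width, data = veh, FUN = length)
names(models)[names(models) == "id"] <- "n"

# Share of each model within its own class
models$Proportion <- models$n / ave(models$n, models$class, FUN = sum)
models <- models[order(models$class, models$length, models$width), ]
```

split(models, models$class) would then give one table per vehicle class, matching the grouped layout of the sample output.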

Umair Durrani

email: umairdurr...@outlook.com


 Date: Sat, 30 Nov 2013 23:41:28 -0800
 From: smartpink...@yahoo.com
 Subject: Re: [R] How to get the proportions of data with respect to two 
 variables in R?
 To: r-help@r-project.org
 CC: umairdurr...@outlook.com
 
 Hi,
 It is better to provide a reproducible example.
 May be this helps:
 set.seed(252)
 dat1 <- data.frame(`Vehicle ID`=sample(150,150,replace=FALSE),
   `Vehicle Class`=rep(1:4,c(20,40,30,60)),
   `Vehicle length`=sample(15:25,150,replace=TRUE),
   `Vehicle width`=sample(4:10,150,replace=TRUE),check.names=FALSE)
 cars <- subset(dat1,`Vehicle Class`==2)
 by(cars,INDICES=cars$`Vehicle length`,FUN=table(cars$`Vehicle width`))
 #Error in FUN(X[[1L]], ...) : could not find function "FUN"
 
 by(cars$`Vehicle width`,INDICES=cars$`Vehicle length`, table)
 by(dat1$`Vehicle width`,list(dat1$`Vehicle Class`,dat1$`Vehicle length`),
   table)
 
 
 #Also, you may check
 
 ftable(dat1[2:4])
 prop.table(ftable(dat1[2:4]),1)
 
 
 A.K.
 
 
 
 
 
 On Sunday, December 1, 2013 12:08 AM, umair durrani 
 umairdurr...@outlook.com wrote:
 I have 4 columns: Vehicle ID, Vehicle Class, Vehicle Length and Vehicle 
 Width. Every vehicle has a unique vehicle ID (e.g. 2, 4, 5,...) and the data 
 was collected every 0.1 seconds which means that vehicle IDs are repeated in 
 Vehicle ID column for the number of times they were observed. There are three 
 vehicle classes i.e. 1=motorcycles, 2=cars, 3=trucks in the Vehicle Class 
 column and the lengths and widths are in their respective columns against 
 every vehicle ID. I want to subset the data by vehicle class and then find 
 the proportions of each vehicle model (unique length and width) within every 
 class. For example, for the Vehicle Class = 2 i.e. car, I want to find 
 different models of cars (unique length and width) and their proportions with 
 respect to total number of cars. Here is what I have done so far:
 To subset data by Vehicle Class
 cars <- subset(b, b$'Vehicle class'==2)
 trucks <- subset(b, b$'Vehicle class'==3)
 motorcycles <- subset(b, b$'Vehicle class'==1)
 To find the number of cars
 numofcars <- length(unique(cars$'Vehicle ID')) # 2830
 numoftrucks <- length(unique(trucks$'Vehicle ID')) # 137
 numofmotorcycles <- length(unique(motorcycles$'Vehicle ID')) # 45
 The above code worked but I could not find the proportions by using the 
 code below:
 by (cars, INDICES=cars$'Vehicle Length', FUN=table(class$'Vehicle width'))
 R gives an error stating that it could not find 'FUN'. Please help me in 
 finding the proportions of each model within all classes of vehicles.
 
 Umair Durrani
 
 email: umairdurr...@outlook.com
   


[R] How to get the proportions of data with respect to two variables in R?

2013-11-30 Thread umair durrani
I have 4 columns: Vehicle ID, Vehicle Class, Vehicle Length and Vehicle Width. 
Every vehicle has a unique vehicle ID (e.g. 2, 4, 5,...) and the data was 
collected every 0.1 seconds which means that vehicle IDs are repeated in 
Vehicle ID column for the number of times they were observed. There are three 
vehicle classes i.e. 1=motorcycles, 2=cars, 3=trucks in the Vehicle Class 
column and the lengths and widths are in their respective columns against every 
vehicle ID. I want to subset the data by vehicle class and then find the 
proportions of each vehicle model (unique length and width) within every class. 
For example, for the Vehicle Class = 2 i.e. car, I want to find different 
models of cars (unique length and width) and their proportions with respect to 
total number of cars. Here is what I have done so far:
To subset data by Vehicle Class
cars <- subset(b, b$'Vehicle class'==2)
trucks <- subset(b, b$'Vehicle class'==3)
motorcycles <- subset(b, b$'Vehicle class'==1)
To find the number of cars
numofcars <- length(unique(cars$'Vehicle ID')) # 2830
numoftrucks <- length(unique(trucks$'Vehicle ID')) # 137
numofmotorcycles <- length(unique(motorcycles$'Vehicle ID')) # 45
The above code worked but I could not find the proportions by using the code 
below:
by (cars, INDICES=cars$'Vehicle Length', FUN=table(class$'Vehicle width'))
R gives an error stating that it could not find 'FUN'. Please help me in 
finding the proportions of each model within all classes of vehicles.

Umair Durrani

email: umairdurr...@outlook.com
  


Re: [R] Apply function to one specific column / Alternative to for loop

2013-11-16 Thread umair durrani
This might be of some use : 
http://nsaunders.wordpress.com/2010/08/20/a-brief-introduction-to-apply-in-r/

Umair Durrani

email: umairdurr...@outlook.com


 Date: Sat, 16 Nov 2013 07:30:29 -0800
 From: ron...@gmx.net
 To: r-help@r-project.org
 Subject: [R] Apply function to one specific column / Alternative to for loop
 
 Hi guys, I am a total newbie to R, so I hope this isn't a totally dumb
 question. I have a dataframe with a title in one row and the corresponding
 values in the next rows. Let's take this example: 
 
 test_df <- data.frame(cbind(titel = "", x = 4:5, y = 1:2))
 test_df = rbind(cbind(titel="1.Test", x="", y=""), test_df,
 cbind(titel="2.Test", x="", y=""), test_df, cbind(titel="3.Test", x="",
 y=""), test_df)
 
 test_df
    titel x y
 1 1.Test
 2        4 1
 3        5 2
 4 2.Test
 5        4 1
 6        5 2
 7 3.Test
 8        4 1
 9        5 2
 
 What I want to have is:
titel x y
 2 1.Test 4 1
 3 1.Test 5 2
 5 2.Test 4 1
 6 2.Test 5 2
 8 3.Test 4 1
 9 3.Test 5 2
 
 In my example, the title is in every third line, but in my real data there
 is no pattern. Each title has at least one line but can have x lines.
 
 I was able to solve my problem in a for loop with the following code:
 test_df$titel <- as.character(test_df$titel)
 for (i in 1:nrow(test_df))
 {
   if (nchar(test_df$titel[i])==0){
     test_df$titel[i]=test_df$titel[i-1]
   }
 }
 test_df <- subset(test_df,test_df$x!="")
 
 
 The problem is, I have a lot of data and the for loop is obviously very
 slow. Is there a more elegant way to achieve the same? I think I have to use
 the apply function, but I don't know how to use it with just one column.
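
 A loop-free variant of the fill-down step, sketched under the assumption that the first row always holds a title (otherwise the indexing below would come up short). The data frame is rebuilt directly with character columns rather than with the rbind/cbind construction above, and the names is_title and result are made up for illustration.

```r
# Same layout as the example: a title row followed by its value rows
test_df <- data.frame(
  titel = c("1.Test", "", "", "2.Test", "", "", "3.Test", "", ""),
  x     = c("", "4", "5", "", "4", "5", "", "4", "5"),
  y     = c("", "1", "2", "", "1", "2", "", "1", "2"),
  stringsAsFactors = FALSE
)

is_title <- nchar(test_df$titel) > 0

# cumsum() numbers the blocks 1, 1, 1, 2, 2, 2, ...; indexing the vector of
# titles by that block number carries each title down to its value rows.
test_df$titel <- test_df$titel[is_title][cumsum(is_title)]

# Drop the original title rows, which have no x value
result <- test_df[test_df$x != "", ]
```

 Because everything is a vector operation, the number of rows per title does not need to follow any pattern.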
 
 
 
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Apply-function-to-one-specific-column-Alternative-to-for-loop-tp4680566.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] R for a stats intro for undergrads in the US?

2013-11-16 Thread umair durrani
Hi Spencer,
I would definitely recommend R for an introductory stats course because it is 
free and easy to learn. You can visit www.twotorials.com for two-minute 
tutorials on R. Also, www.coursera.org offers many free courses on R; for intro 
stats check this out: https://www.coursera.org/course/stats1
Hope this helps,
Umair Durrani

email: umairdurr...@outlook.com


 Date: Sat, 16 Nov 2013 18:19:16 -0800
 From: spencer.gra...@prodsyse.com
 To: R-help@r-project.org
 Subject: [R] R for a stats intro for undergrads in the US?
 
 Hello, All:
 
 
Would anyone recommend R for an introductory statistics class for 
 freshman psychology students in the US?  If yes, might there be any 
 notes for such available?
 
 
I just checked r-projects.org and CRAN contributed documentation 
 and found nothing.
 
 
I have a friend who teaches such a class, and wondered if R might 
 be suitable.  The alternative is SPSS at $406 per student.
 
 
Thanks,
Spencer
 
 
 -- 
 Spencer Graves, PE, PhD
 President and Chief Technology Officer
 Structure Inspection and Monitoring, Inc.
 751 Emerson Ct.
 San José, CA 95126
 ph:  408-655-4567
 web:  www.structuremonitoring.com
 


[R] How to sum a function over a specific range in R?

2013-11-12 Thread umair durrani



I am new to R and have already posted this question on stack overflow. The 
problem is that I did not understand the answers as the R documentation about 
the discussed functions (e.g. 'convolve') is quite complicated for a newbie 
like me. Here's the question:
I have a big text file with more than 3 million rows. The following is the 
example of the three columns I want to use:
indx    vehID   LocalY
1   2   35.381
2   2   39.381
3   2   43.381
4   2   47.38
5   2   51.381
6   2   55.381
7   2   59.381
8   2   63.379
9   2   67.383
10  2   71.398
where
indx = Index
vehID = Vehicle ID (here only '2' is shown but in fact there are 2169 vehicle IDs and each one repeats several times because the data was collected every 0.1 seconds)
LocalY = the y coordinate of the vehicle at a particular time (the time column is not shown here)
What I want to do is to create a new column 'SmoothedY' using the following 
formula:
SmoothedY_i = \frac{1}{Z_i} \sum_{k=i-15}^{i+15} LocalY_k \, e^{-|i-k|/5}
where
i = indx
Z_i = \sum_{k=i-15}^{i+15} e^{-|i-k|/5}
How can I apply this formula to create the new column 'SmoothedY'? This is 
actually a data smoothing problem but default smoothing algorithms in R are not 
suitable for my data and I have to use this custom formula. 
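
A sketch of how the weighted sum above might be computed per vehicle in base R. The function name smooth_y and its arguments half_width and delta are hypothetical, and the window is simply clipped at the start and end of each vehicle's series, which is an assumption since the post does not say how the edges should be treated.

```r
# Exponentially weighted moving average over a window of +/- half_width rows.
# Weights are exp(-|i - k| / delta) and are normalised by their sum Z.
smooth_y <- function(y, half_width = 15, delta = 5) {
  n <- length(y)
  vapply(seq_len(n), function(i) {
    k <- max(1, i - half_width):min(n, i + half_width)  # window clipped at the ends
    w <- exp(-abs(i - k) / delta)
    sum(w * y[k]) / sum(w)
  }, numeric(1))
}

# The ten example rows from the post
dat <- data.frame(
  indx   = 1:10,
  vehID  = 2,
  LocalY = c(35.381, 39.381, 43.381, 47.38, 51.381,
             55.381, 59.381, 63.379, 67.383, 71.398)
)

# ave() applies the smoother separately within each vehicle ID
dat$SmoothedY <- ave(dat$LocalY, dat$vehID, FUN = smooth_y)
```

On several million rows the inner loop may be slow; it is meant to show the arithmetic rather than to be the fastest route.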
Thanks in advance.

Umair Durrani


  


[R] R 3.0.2 - How to create intervals and group another variable in those intervals?

2013-11-06 Thread umair durrani
I have two columns for speed ('Smoothed velocity') and Spacing. What I want to 
do is to first create intervals of speed (minimum value = 0, maximum value = 
85.53), group the Spacing values falling in a particular Speed interval, find 
the average Spacing for each interval and finally plot the average spacing of 
each interval against the mid-point of the Speed interval. I want to have 
fixed intervals of 4.5 feet per second, i.e. 0-4.5, 4.5-9, ..., xx-85.53.

After hours of searching I found a function for creating intervals called 
classIntervals() but I can't figure out how to create fixed intervals of 4.5. 
Here is what I tried:
classIntervals(s21[,'Smoothed velocity'], style='fixed', fixedBreaks=4.5)
But the results were unexpected and there was a warning message:
In classIntervals(s21[, "Smoothed velocity"], style = "fixed", fixedBreaks = 4.5) :
  variable range greater than fixedBreaks
Even after the intervals are created, I need to group the spacing and find the 
average for every interval. How can I do this? I have tried what I could, 
please help.
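
A sketch of the binning and averaging with base R's cut() and tapply() instead of classIntervals(). The data frame s21 below is synthetic and its column names velocity and spacing are placeholders for the real 'Smoothed velocity' and Spacing columns; only the range 0 to 85.53 and the 4.5 width come from the post. The warning quoted above suggests that fixedBreaks is read as the full set of break points rather than as a bin width, which is why an explicit vector of breaks is built here.

```r
# Synthetic stand-in for the real data
s21 <- data.frame(velocity = seq(0, 85.53, length.out = 200))
s21$spacing <- 20 + 1.5 * s21$velocity

# Break points every 4.5 ft/s, with the stated maximum appended as the last edge
breaks <- unique(c(seq(0, 85.53, by = 4.5), 85.53))

# Assign each speed to an interval; include.lowest keeps speed 0 in the first bin
bin <- cut(s21$velocity, breaks = breaks, include.lowest = TRUE)

# Average spacing per speed interval and the mid-point of each interval
avg_spacing <- tapply(s21$spacing, bin, mean)
mid <- (head(breaks, -1) + tail(breaks, -1)) / 2
```

plot(mid, avg_spacing) would then draw the average spacing against the interval mid-points. Note that the last interval, 85.5 to 85.53, is very narrow because the maximum is not a multiple of 4.5.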

Umair Durrani

email: umairdurr...@outlook.com
  