[R] Smoothing Data-dplyr
Please note that I have already asked this question on stackoverflow.com but did not get a satisfactory answer.

I have a data set containing the velocities of 2169 vehicles recorded at intervals of 0.1 seconds, so there are many rows for each vehicle. Here I am reproducing the data only for vehicle #2:

dput(uma)
structure(list(Vehicle.ID = rep(2, 433), Frame.ID = 13:445, Vehicle.velocity = c(40, 40, 40, 40, 40, 40, 40, 40.02, 40.03, 39.93, 39.61, 39.14, 38.61, 38.28, 38.42, 38.78, 38.92, 38.54, 37.51, 36.34, 35.5, 35.08, 34.96, 34.98, 35, 34.99, 34.98, 35.1, 35.49, 36.2, 37.15, 38.12, 38.76, 38.95, 38.95, 38.99, 39.18, 39.34, 39.2, 38.89, 38.73, 38.88, 39.28, 39.68, 39.94, 40.02, 40, 39.99,
39.99, 39.65, 38.92, 38.52, 38.8, 39.72, 40.76, 41.07, 40.8, 40.59, 40.75, 41.38, 42.37, 43.37, 44.06, 44.29, 44.13, 43.9, 43.92, 44.21, 44.59, 44.87, 44.99, 45.01, 45.01, 45, 45, 45, 44.79, 44.32, 43.98, 43.97, 44.29, 44.76, 45.06, 45.36, 45.92, 46.6, 47.05, 47.05, 46.6, 45.92, 45.36, 45.06, 44.96, 44.97, 44.99, 44.99, 44.99, 44.99, 45.01, 45.02, 44.9, 44.46, 43.62, 42.47, 41.41, 40.72, 40.49, 40.6, 40.76, 40.72, 40.5, 40.38, 40.43, 40.38, 39.83, 38.59, 37.02, 35.73, 35.04, 34.85, 34.91, 34.99, 34.99, 34.97, 34.96, 34.98, 35.07, 35.29, 35.54, 35.67, 35.63, 35.53, 35.53, 35.63, 35.68, 35.55, 35.28, 35.06, 35.09, 35.49, 36.22, 37.08, 37.8, 38.3, 38.73, 39.18, 39.62, 39.83, 39.73, 39.58, 39.57, 39.71, 39.91, 40, 39.98, 39.97, 40.08, 40.38, 40.81, 41.27, 41.69, 42.2, 42.92, 43.77, 44.49, 44.9, 45.03, 45.01, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 44.99, 45.03, 45.26, 45.83, 46.83, 48.2, 49.68, 50.95, 51.83, 52.19, 52, 51.35, 50.38, 49.38, 48.63, 48.15, 47.87, 47.78, 48.01, 48.63, 49.52, 50.39, 50.9, 50.96, 50.68, 50.3, 50.05, 49.94, 49.87, 49.82, 49.82, 49.88, 49.96, 50, 50, 49.98, 49.98, 50.16, 50.64, 51.43, 52.33, 53.01, 53.27, 53.22, 53.25, 53.75, 54.86, 56.36, 57.64, 58.28, 58.29, 57.94, 57.51, 57.07, 56.64, 56.43, 56.73, 57.5, 58.27, 58.55, 58.32, 57.99, 57.89, 57.92, 57.74, 57.12, 56.24, 55.51, 55.1, 54.97, 54.98, 55.02, 55.03, 54.86, 54.3, 53.25, 51.8, 50.36, 49.41, 49.06, 49.17, 49.4, 49.51, 49.52, 49.51, 49.45, 49.24, 48.84, 48.29, 47.74, 47.33, 47.12, 47.06, 47.07, 47.08, 47.05, 47.04, 47.25, 47.68, 47.93, 47.56, 46.31, 44.43, 42.7, 41.56, 41.03, 40.92, 40.92, 40.98, 41.19, 41.45, 41.54, 41.32, 40.85, 40.37, 40.09, 39.99, 39.99, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 39.98, 39.97, 40.1, 40.53, 41.36, 42.52, 43.71, 44.57, 45.01, 45.1, 45.04, 45, 45, 45, 45, 45, 45, 44.98, 44.97, 45.08, 45.39, 45.85, 46.2, 46.28, 46.21, 46.29, 46.74, 47.49, 48.35, 49.11, 49.63, 49.89, 49.94, 49.97, 50.14, 50.44, 50.78, 51.03, 51.12, 51.05, 50.85, 50.56, 50.26, 50.06, 
50.1, 50.52, 51.36, 52.5, 53.63, 54.46, 54.9, 55.03, 55.09, 55.23, 55.35, 55.35, 55.23, 55.07, 54.99, 54.98, 54.97, 55.06, 55.37, 55.91, 56.66, 57.42, 58.07, 58.7, 59.24, 59.67, 59.95, 60.02, 60, 60, 60, 60, 60, 60.01, 60.06, 60.23, 60.65, 61.34, 62.17, 62.93, 63.53, 64, 64.41, 64.75, 65.04, 65.3, 65.57, 65.75, 65.74, 65.66, 65.62, 65.71, 65.91, 66.1, 66.26, 66.44, 66.61, 66.78, 66.91, 66.99, 66.91, 66.7, 66.56, 66.6, 66.83, 67.17, 67.45, 67.75, 68.15, 68.64, 69.15, 69.57, 69.79, 69.79, 69.72, 69.72, 69.81, 69.94, 70, 70.01, 70.02, 70.03)), row.names = c(NA, 433L), class = "data.frame", .Names = c("Vehicle.ID", "Frame.ID", "Vehicle.velocity"))

I am trying to smooth the data using dplyr. Here is the code:

uma <- tbl_df(uma)
uma <- uma %>%                  # take data frame
  group_by(Vehicle.ID) %>%      # group by Vehicle ID
  mutate(i = 1:length(Frame.ID), im1 = i-1, Nai = length(Frame.ID) - i, Dv = pmin(im1,
[R] Extracting values from rows which meet a condition in R 3.0.2
Hi,

I have a big data frame with millions of rows and more than 20 columns. Let me first describe the data to make the question clearer. The original data frame consists of the locations, velocities and accelerations of 2169 vehicles during a 15-minute period. Each row has the vehicle's unique Vehicle.ID, the ID of the time frame in which it was observed (Frame.ID), the velocity of the vehicle in that frame (svel), its acceleration in that frame (sacc) and the class of the vehicle (vehicle.class: 1 = motorcycle, 2 = car, 3 = truck). These variables were recorded every 0.1 seconds, i.e. each frame is 0.1 seconds long. Here are the first 6 rows:

dput(head(df))
structure(list(Vehicle.ID = c(2L, 2L, 2L, 2L, 2L, 2L), Frame.ID = 133:138, Vehicle.class = c(2L, 2L, 2L, 2L, 2L, 2L), Lane = c(2L, 2L, 2L, 2L, 2L, 2L), svel = c(37.29, 37.11, 36.96, 36.83, 36.73, 36.64), sacc = c(0.07, 0.11, 0.15, 0.19, 0.22, 0.25)), .Names = c("Vehicle.ID", "Frame.ID", "Vehicle.class", "Lane", "svel", "sacc"), row.names = 7750:7755, class = "data.frame")

At some points during the 15-minute recording period, some vehicles come to a complete stop, i.e. svel==0. This lasts for some frames and then the vehicles gain speed again. For reproducibility I am creating an example data set as follows:

x <- data.frame(Vehicle.ID = c(rep(10,5), rep(20,5), rep(30,5), rep(40,5), rep(50,5)), vehicle.class = c(rep(2,10), rep(3,10), rep(1,5)), svel = rep(c(1,0,0,0,3),5), sacc = rep(c(0.3,0.001,0.001,0.002,0.5),5))

As described above, some vehicles stop and have zero velocity for some time but later accelerate to get up to speed. I want to find the acceleration, sacc, they apply after having had zero velocity for some time (moving off from a standstill). This means that I should be able to look at the FIRST row AFTER the last frame in which svel==0. In the example data, the car (vehicle.class==2) with Vehicle.ID==10 had a velocity, svel, equal to 1, as seen in the first row. Later, it stopped for 3 frames (3 consecutive rows) and then accelerated to a velocity, svel, equal to 3. I want the acceleration sacc it applied in those 2 frames (rows 4 and 5 for vehicle 10, which come out to be 0.002 and 0.500). For the example data, the output by vehicle.class should therefore be:

output <- data.frame(Vehicle.ID = c(10,10,20,20,30,30,40,40,50,50), vehicle.class = c(2,2,2,2,3,3,3,3,1,1), xf = rep(c('l','f'),5), sacc = rep(c(0.002,0.500),5))

xf identifies the last row in which svel==0 ('l') and the first row after it ('f'). I have tried using plyr and a for loop to split by vehicle.class but am not sure how to extract sacc. Please note that xf should be part of the output; it is not in the given data.

The original data frame df has 2169 vehicles. Some stopped and some did not, so not all vehicles have svel==0. The vehicles that did stop did not do so at the same time. Also, the number of rows in which svel==0 differs from vehicle to vehicle.

Thanks,
Umair Durrani
Master's candidate
Civil and Environmental Engineering
University of Windsor

[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
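One possible way to pull out the 'l' and 'f' rows described above is sketched here in base R. It is only a sketch under assumptions: rows are taken to be in frame order within each vehicle, only the last stop of each vehicle is considered, and the helper name pick_restart is made up for illustration.

```r
# Example data from the post
x <- data.frame(
  Vehicle.ID    = rep(c(10, 20, 30, 40, 50), each = 5),
  vehicle.class = c(rep(2, 10), rep(3, 10), rep(1, 5)),
  svel          = rep(c(1, 0, 0, 0, 3), 5),
  sacc          = rep(c(0.3, 0.001, 0.001, 0.002, 0.5), 5)
)

# Hypothetical helper: for one vehicle's rows (in frame order), return the
# last row with svel == 0 and the row right after it, or NULL if there is none
pick_restart <- function(d) {
  stopped <- which(d$svel == 0)
  if (length(stopped) == 0) return(NULL)   # vehicle never stopped
  last <- max(stopped)
  if (last == nrow(d)) return(NULL)        # still stopped when the record ends
  out <- d[c(last, last + 1), c("Vehicle.ID", "vehicle.class", "sacc")]
  out$xf <- c("l", "f")
  out
}

# Split by vehicle, apply the helper, and bind the pieces back together
output <- do.call(rbind, lapply(split(x, x$Vehicle.ID), pick_restart))
rownames(output) <- NULL
output
```

Vehicles that never stop simply drop out of the result because the helper returns NULL for them. If a vehicle can stop several times and every restart is needed, the helper would have to look for each run of zeros instead of only the last one.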
Re: [R] Detecting Vehicle locations using R
The problem is resolved already. Please don't include this question in future mailing lists.

Umair Durrani
email: umairdurr...@outlook.com

Subject: Re: [R] Detecting Vehicle locations using R
From: jdnew...@dcn.davis.ca.us
Date: Thu, 20 Feb 2014 20:06:28 -0800
To: umairdurr...@outlook.com; r-help@r-project.org

Please read the Posting Guide, which offers several applicable tips, such as:

Don't post in HTML format... it tends to corrupt your code samples.

Please provide a hand-generated example of the result that the solution should transform your sample data into.

Please show the code that did not work... you may be closer to the solution than you think, or we may see from it that you could benefit from learning a concept you don't know exists yet. This is not supposed to be a forum that does your work for you.

Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)

Sent from my phone. Please excuse my brevity.
[R] Detecting Vehicle locations using R
I have a data frame of vehicle trajectories. Here's a snapshot:

dput(head(df))
structure(list(vehicle = c(2L, 2L, 2L, 2L, 2L, 2L), frame = 43:48, globalx = c(6451214.156, 6451216.824, 6451219.616, 6451222.548, 6451225.462, 6451228.376), class = c(2L, 2L, 2L, 2L, 2L, 2L), velocity = c(37.76, 37.9, 38.05, 38.18, 38.32, 38.44), lane = c(2L, 2L, 2L, 2L, 2L, 2L)), .Names = c("vehicle", "frame", "globalx", "class", "velocity", "lane"), row.names = c(NA, 6L), class = "data.frame")

where vehicle = vehicle ID, frame = ID of the time frame in which it was observed, globalx = x coordinate of the front center of the vehicle, class = type of vehicle (1 = motorcycle, 2 = car, 3 = truck), velocity = speed of the vehicle in feet per second, and lane = lane number (there are 6 lanes). A 'frame' represents one tenth of a second, i.e. one frame is 0.1 seconds long.

At frame 't' the vehicle has globalx coordinate x(t), and at frame 't-1' (0.1 seconds before) it was x(t-1). If the reference location 'U' has globalx coordinate 6451179.1116, then I simply want a new column in df called 'u' which has 'yes' in the row where the globalx of the vehicle was greater than the reference coordinate at 'U' AND the previous consecutive globalx coordinate of this vehicle was less than the reference coordinate at 'U' (i.e. the reference coordinate lies between the 2 locations of the vehicle in two consecutive frames). This means that if df has 100 vehicles then there will be 100 'yes' values in the 'u' column, because every vehicle will meet the above criterion only once.

I have tried to do this by running a function with ifelse and also tried to do the same using a for loop, but it doesn't work for me.
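The crossing test described above can be sketched with a per-vehicle lag of globalx. This is a minimal base-R sketch, not a definitive answer: the sample rows below are invented so that each vehicle actually crosses the reference (the six rows in the post all lie beyond it), and prev_x is an illustrative name.

```r
# Hypothetical rows: two vehicles that each cross the reference once
df <- data.frame(
  vehicle = c(2L, 2L, 2L, 2L, 3L, 3L, 3L),
  frame   = c(43:46, 50:52),
  globalx = c(6451170.2, 6451175.0, 6451181.3, 6451186.0,
              6451172.5, 6451178.9, 6451183.1)
)
ref <- 6451179.1116   # reference coordinate 'U' from the post

# Sort so that consecutive rows of a vehicle are consecutive frames
df <- df[order(df$vehicle, df$frame), ]

# Previous globalx within each vehicle (NA on a vehicle's first row)
prev_x <- ave(df$globalx, df$vehicle,
              FUN = function(x) c(NA, head(x, -1)))

# 'yes' only where the reference lies between the previous and current position
df$u <- ifelse(!is.na(prev_x) & prev_x < ref & df$globalx > ref, "yes", "no")
df
```

The strict inequalities follow the wording of the question, so a vehicle observed exactly on the reference coordinate would not be flagged; switching one side to `<=` would change that.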
[R] R 3.0.2 How to Split-Apply-Combine using various Columns
I am sorry for the previous mail; it happened because of the tables I copied into the mail. Here is the text version:

Hello everyone,

I have a large vehicle trajectory data set, of which the following is a small part:

vehicle  frame  globalx  class  velocity  lane
1        221    6451260  2      23.37     5
1        222    6451261  2      23.16     5
1        223    6451263  2      22.94     5
1        224    6451264  2      22.85     5
2        115    6451181  2      35.00     4
2        116    6451184  2      35.01     4
2        117    6451186  2      35.03     4
2        118    6451188  2      34.92     4
2        119    6451191  2      34.49     4
2        120    6451193  2      33.66     4
2        121    6451195  2      32.50     4

vehicle = unique ID of the vehicle. It is repeated (in the column) for every frame in which it was observed;
frame = ID of the frame in which the vehicle was observed. One frame is 0.1 seconds long;
class = class of vehicle, i.e. 1 = motorcycle, 2 = car, 3 = truck;
velocity = velocity of the vehicle in feet per second;
lane = number of the lane in which the vehicle is present in a particular frame.

A 'frame' number can also repeat, e.g. in frame 120 the example data shows that vehicle 2 was observed, but in the original data many more vehicles might have been observed in this frame. Similarly, 'class' is defined above and all three classes are present in the original data (the example data here only shows class 2, i.e. cars).

I need to determine two things:
1) Number of vehicles observed in every 30 seconds, i.e. 300 frames
2) Average velocity of each vehicle class in every 30 seconds

This means that the first step might be to determine the minimum and maximum frame numbers and then divide them into slots so that every slot has 300 frames. In my original data I found 22 as the minimum and 9233 as the maximum frame number. This makes 31 time slots: 22-322, 322-622, ..., 9022-9233.

I need the following output:

TimeSlot   Total-Cars  Total-Trucks  Total-Motorcycles  MeanVelocity-Cars  MeanVelocity-Trucks  MeanVelocity-Motorcycles
22-322
322-622
...
9022-9233

Umair Durrani
email: umairdurr...@outlook.com

Date: Fri, 24 Jan 2014 19:45:27 -0800
From: smartpink...@yahoo.com
Subject: Re: [R] R 3.0.2 How to Split-Apply-Combine using various Columns
To: umairdurr...@outlook.com

Hi,
Please check your post and see how helpful it is for another person trying to copy and paste your example data set to run the code. It is always useful to use dput() (see ?dput), e.g. dput(head(data,10)). Also, please post using plain text.
[R] Corrected - R 3.0.2 How to Split-Apply-Combine using various Columns
Hello everyone,

Here is the version using dput. I am sorry for the junk I posted before. I have a large vehicle trajectory data set, of which the following is a small part:

structure(list(vehicle = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), frame = c(221L, 222L, 223L, 224L, 115L, 116L, 117L, 118L, 119L, 120L, 121L), globalx = c(6451259.685, 6451261.244, 6451262.831, 6451264.362, 6451181.179, 6451183.532, 6451185.884, 6451188.237, 6451190.609, 6451192.912, 6451195.132), class = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), velocity = c(23.37, 23.16, 22.94, 22.85, 35, 35.01, 35.03, 34.92, 34.49, 33.66, 32.5), lane = c(5L, 5L, 5L, 5L, 4L, 4L, 4L, 4L, 4L, 4L, 4L)), .Names = c("vehicle", "frame", "globalx", "class", "velocity", "lane"), row.names = c(85L, 86L, 87L, 88L, 447L, 448L, 449L, 450L, 451L, 452L, 453L), class = "data.frame")

Explanation of columns:
vehicle = unique ID of the vehicle. It is repeated (in the column) for every frame in which it was observed;
frame = ID of the frame in which the vehicle was observed. One frame is 0.1 seconds long;
class = class of vehicle, i.e. 1 = motorcycle, 2 = car, 3 = truck;
velocity = velocity of the vehicle in feet per second;
lane = number of the lane in which the vehicle is present in a particular frame.

A 'frame' number can also repeat, e.g. in frame 120 the example data shows that vehicle 2 was observed, but in the original data many more vehicles might have been observed in this frame. Similarly, 'class' is defined above and all three classes are present in the original data (the example data here only shows class 2, i.e. cars).

I need to determine two things:
1) Number of vehicles observed in every 30 seconds, i.e. 300 frames
2) Average velocity of each vehicle class in every 30 seconds

This means that the first step might be to determine the minimum and maximum frame numbers and then divide them into slots so that every slot has 300 frames. In my original data I found 22 as the minimum and 9233 as the maximum frame number. This makes 31 time slots: 22-322, 322-622, ..., 9022-9233.

I need the following columns in one table as output (note that the TimeSlot column should contain the time intervals as described above):

TimeSlot, Total-Cars, Total-Trucks, Total-Motorcycles, MeanVelocity-Cars, MeanVelocity-Trucks, MeanVelocity-Motorcycles
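The slotting step described above could be sketched with cut() and tapply() in base R. This is a sketch under assumptions rather than the thread's answer: the rows below are invented so that more than one class and slot appear, each vehicle is counted once per slot, and the names traj, slot_len, once, counts and means are illustrative.

```r
# Hypothetical trajectory rows with the post's column names
traj <- data.frame(
  vehicle  = c(1, 1, 2, 2, 3, 3, 4),
  frame    = c(30, 31, 115, 116, 400, 401, 410),
  class    = c(2, 2, 2, 2, 3, 3, 1),
  velocity = c(23.4, 23.2, 35.0, 35.0, 30.0, 32.0, 40.0)
)

# 300-frame slots starting at the smallest frame number
slot_len <- 300
breaks <- seq(min(traj$frame), max(traj$frame) + slot_len, by = slot_len)
traj$slot <- cut(traj$frame, breaks = breaks, right = FALSE, dig.lab = 6)
traj$type <- factor(traj$class, levels = 1:3,
                    labels = c("Motorcycles", "Cars", "Trucks"))

# Count every vehicle once per slot, then tabulate by class
once   <- unique(traj[, c("slot", "vehicle", "type")])
counts <- table(once$slot, once$type)

# Mean velocity over all observations per slot and class (NA where none)
means <- tapply(traj$velocity, list(traj$slot, traj$type), mean)

counts
means
```

With `right = FALSE` each slot is half-open, e.g. [22,322), so a frame on a boundary falls in exactly one slot, and the final slot is labeled with its full 300-frame width rather than ending at the largest frame. The two results can be combined into one table with cbind() if a single output is needed.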
[R] R 3.0.2 How to Split-Apply-Combine using various Columns
. . . 9022-9233

I have tried many things and also used some suggestions from stackoverflow, but I am still unable to get output like this from the input data. Please help.

Umair Durrani
email: umairdurr...@outlook.com
Re: [R] How to get the proportions of data with respect to two variables in R?
Thanks for your answers, Arun. Unfortunately the code didn't work and I am getting the error: arguments must have same length. Here are sample input and output:

INPUT:

Vehicle ID  Vehicle Class  Vehicle Length  Vehicle Width
2           2              13.5            4.5
2           2              13.5            4.5
2           2              13.5            4.5
2           2              13.5            4.5
3           2              13.5            4.0
3           2              13.5            4.0
3           2              13.5            4.0
3           2              13.5            4.0
4           2              10.0            4.5
4           2              10.0            4.5
4           2              10.0            4.5
4           2              10.0            4.5
5           3              23.0            4.5
5           3              23.0            4.5
5           3              23.0            4.5
5           3              23.0            4.5
6           3              76.5            4.5
6           3              76.5            4.5
6           3              76.5            4.5
6           3              76.5            4.5
6           3              76.5            4.5
7           1              10.0            3.0
7           1              10.0            3.0
7           1              10.0            3.0
7           1              10.0            3.0
8           2              13.5            5.5
8           2              13.5            5.5
8           2              13.5            5.5
8           2              13.5            5.5

Note that in this input: total number of cars = 4, trucks = 2, motorcycles = 1.

SAMPLE OUTPUT:

Group: cars
VehicleLength  VehicleWidth  Proportion
13.5           4.5           0.25
13.5           4.0           0.25
13.5           5.5           0.25
10.0           4.5           0.25

Group: trucks
VehicleLength  VehicleWidth  Proportion
23.0           4.5           0.5
76.5           4.5           0.5

Group: motorcycles
VehicleLength  VehicleWidth  Proportion
10.0           3.0           1.0

Umair Durrani
email: umairdurr...@outlook.com

Date: Sat, 30 Nov 2013 23:41:28 -0800
From: smartpink...@yahoo.com
Subject: Re: [R] How to get the proportions of data with respect to two variables in R?
To: r-help@r-project.org
CC: umairdurr...@outlook.com

Hi,
It is better to provide a reproducible example. Maybe this helps:

set.seed(252)
dat1 <- data.frame(`Vehicle ID`=sample(150,150,replace=FALSE), `Vehicle Class`=rep(1:4,c(20,40,30,60)), `Vehicle length`=sample(15:25,150,replace=TRUE), `Vehicle width`=sample(4:10,150,replace=TRUE), check.names=FALSE)
cars <- subset(dat1, `Vehicle Class`==2)
by(cars, INDICES=cars$`Vehicle length`, FUN=table(cars$`Vehicle width`))
#Error in FUN(X[[1L]], ...) : could not find function "FUN"
by(cars$`Vehicle width`, INDICES=cars$`Vehicle length`, table)
by(dat1$`Vehicle width`, list(dat1$`Vehicle Class`, dat1$`Vehicle length`), table)
#Also, you may check
ftable(dat1[2:4])
prop.table(ftable(dat1[2:4]),1)

A.K.
[R] How to get the proportions of data with respect to two variables in R?
I have 4 columns: Vehicle ID, Vehicle Class, Vehicle Length and Vehicle Width. Every vehicle has a unique vehicle ID (e.g. 2, 4, 5, ...) and the data was collected every 0.1 seconds, which means that vehicle IDs are repeated in the Vehicle ID column as many times as they were observed. There are three vehicle classes in the Vehicle Class column (1 = motorcycles, 2 = cars, 3 = trucks), and the lengths and widths are in their respective columns against every vehicle ID.

I want to subset the data by vehicle class and then find the proportions of each vehicle model (unique length and width) within every class. For example, for Vehicle Class = 2, i.e. cars, I want to find the different models of cars (unique length and width) and their proportions with respect to the total number of cars. Here is what I have done so far.

To subset the data by vehicle class:

cars <- subset(b, b$'Vehicle class'==2)
trucks <- subset(b, b$'Vehicle class'==3)
motorcycles <- subset(b, b$'Vehicle class'==1)

To find the number of vehicles in each class:

numofcars <- length(unique(cars$'Vehicle ID')) # 2830
numoftrucks <- length(unique(trucks$'Vehicle ID')) # 137
numofmotorcycles <- length(unique(motorcycles$'Vehicle ID')) # 45

The above code worked, but I could not find the proportions by using the code below:

by(cars, INDICES=cars$'Vehicle Length', FUN=table(class$'Vehicle width'))

R gives an error stating that it could not find 'FUN'. Please help me find the proportions of each model within all classes of vehicles.

Umair Durrani
email: umairdurr...@outlook.com
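One way to get the per-class model proportions asked for above is sketched here in base R. It is a sketch under assumptions: the rows are a shortened, made-up version of the layout described in the post, a "model" is taken to be a unique length and width pair, and the names veh and models are illustrative.

```r
# Hypothetical data in the post's layout (vehicle IDs repeat across frames)
b <- data.frame(
  `Vehicle ID`     = c(2, 2, 3, 3, 4, 5, 5, 6, 7, 8),
  `Vehicle Class`  = c(2, 2, 2, 2, 2, 3, 3, 3, 1, 2),
  `Vehicle Length` = c(13.5, 13.5, 13.5, 13.5, 10, 23, 23, 76.5, 10, 13.5),
  `Vehicle Width`  = c(4.5, 4.5, 4, 4, 4.5, 4.5, 4.5, 4.5, 3, 5.5),
  check.names = FALSE
)

# Keep one row per vehicle so repeated observations do not inflate the counts
veh <- unique(b)

# Number of vehicles for each (class, length, width) combination
models <- aggregate(
  list(n = veh$`Vehicle ID`),
  by = list(class  = veh$`Vehicle Class`,
            length = veh$`Vehicle Length`,
            width  = veh$`Vehicle Width`),
  FUN = length
)

# Share of each model within its own class
models$proportion <- models$n / ave(models$n, models$class, FUN = sum)
models <- models[order(models$class, models$length, models$width), ]
models
```

The original by() call fails because `FUN` must be a function, whereas `table(...)` is the result of already calling one. Passing a function such as `table` itself, or aggregating as in this sketch, avoids that error.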
Re: [R] Apply function to one specific column / Alternative to for loop
This might be of some use: http://nsaunders.wordpress.com/2010/08/20/a-brief-introduction-to-apply-in-r/

Umair Durrani
email: umairdurr...@outlook.com

Date: Sat, 16 Nov 2013 07:30:29 -0800
From: ron...@gmx.net
To: r-help@r-project.org
Subject: [R] Apply function to one specific column / Alternative to for loop

Hi guys,

I am a total newbie to R, so I hope this isn't a totally dumb question. I have a data frame with a title in one row and the corresponding values in the next rows. Let's take this example:

test_df <- data.frame(cbind(titel = "", x = 4:5, y = 1:2))
test_df = rbind(cbind(titel="1.Test", x="", y=""), test_df, cbind(titel="2.Test", x="", y=""), test_df, cbind(titel="3.Test", x="", y=""), test_df)

test_df
   titel x y
1 1.Test
2        4 1
3        5 2
4 2.Test
5        4 1
6        5 2
7 3.Test
8        4 1
9        5 2

What I want to have is:

   titel x y
2 1.Test 4 1
3 1.Test 5 2
5 2.Test 4 1
6 2.Test 5 2
8 3.Test 4 1
9 3.Test 5 2

In my example the title is in every third line, but in my real data there is no pattern. Each title has at least one line but can have any number of lines. I was able to solve my problem in a for loop with the following code:

test_df$titel <- as.character(test_df$titel)
for (i in 1:nrow(test_df)) {
  if (nchar(test_df$titel[i])==0) {
    test_df$titel[i]=test_df$titel[i-1]
  }
}
test_df <- subset(test_df, test_df$x!="")

The problem is that I have a lot of data and the for loop is obviously very slow. Is there a more elegant way to achieve the same? I think I have to use the apply function, but I don't know how to use it with just one column.

--
View this message in context: http://r.789695.n4.nabble.com/Apply-function-to-one-specific-column-Alternative-to-for-loop-tp4680566.html
Sent from the R help mailing list archive at Nabble.com.
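The fill-forward loop in the question can be replaced by a vectorized lookup. The following is a minimal sketch, assuming the first row is always a title row and that title rows are exactly the rows with a non-empty titel; the names is_title and result are illustrative, and the sample data uses an uneven number of value rows per title.

```r
# Hypothetical data in the question's layout, with 2, 1 and 2 value rows
test_df <- data.frame(
  titel = c("1.Test", "", "", "2.Test", "", "3.Test", "", ""),
  x     = c("", "4", "5", "", "4", "", "4", "5"),
  y     = c("", "1", "2", "", "1", "", "1", "2"),
  stringsAsFactors = FALSE
)

# TRUE on title rows, FALSE on value rows
is_title <- nzchar(test_df$titel)

# cumsum() gives every row the index of the most recent title above it,
# so one vectorized lookup carries each title down to its value rows
test_df$titel <- test_df$titel[is_title][cumsum(is_title)]

# Drop the title rows themselves
result <- test_df[!is_title, ]
result
```

Because the work is done by `cumsum()` and a single indexing step, the cost grows linearly with the number of rows without an R-level loop. If the data could start with value rows before any title, those rows would need to be handled separately first.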
Re: [R] R for a stats intro for undergrads in the US?
Hi Spencer,

I would definitely recommend R for an introductory stats course because it is free and easy to learn. You can visit www.twotorials.com for two-minute tutorials on R. Also, www.coursera.org offers many free courses on R; for intro stats, check this out: https://www.coursera.org/course/stats1

Hope this helps,
Umair Durrani
email: umairdurr...@outlook.com

Date: Sat, 16 Nov 2013 18:19:16 -0800
From: spencer.gra...@prodsyse.com
To: R-help@r-project.org
Subject: [R] R for a stats intro for undergrads in the US?

Hello, All:

Would anyone recommend R for an introductory statistics class for freshman psychology students in the US? If yes, might there be any notes available for such a class? I just checked r-projects.org and the CRAN contributed documentation and found nothing. I have a friend who teaches such a class, and wondered if R might be suitable. The alternative is SPSS at $406 per student.

Thanks,
Spencer

--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph: 408-655-4567
web: www.structuremonitoring.com
[R] How to sum a function over a specific range in R?
I am new to R and have already posted this question on Stack Overflow. The problem is that I did not understand the answers, as the R documentation for the functions discussed (e.g. 'convolve') is quite complicated for a newbie like me. Here's the question:

I have a big text file with more than 3 million rows. The following is an example of the three columns I want to use:

indx  vehID  LocalY
1     2      35.381
2     2      39.381
3     2      43.381
4     2      47.38
5     2      51.381
6     2      55.381
7     2      59.381
8     2      63.379
9     2      67.383
10    2      71.398

where
indx = index
vehID = vehicle ID (only '2' is shown here, but in fact there are 2169 vehicle IDs and each one repeats several times because the data was collected every 0.1 seconds)
LocalY = the y coordinate of the vehicle at a particular time (the time column is not shown here)

What I want to do is to create a new column 'SmoothedY' using the following formula:

SmoothedY(i) = (1/Z) * sum for k = (i-15) to (i+15) of LocalY(k) * exp(-abs(i-k)/5)

where
i = indx
Z = sum for k = (i-15) to (i+15) of exp(-abs(i-k)/5)

How can I apply this formula to create the new column 'SmoothedY'? This is actually a data smoothing problem, but the default smoothing algorithms in R are not suitable for my data and I have to use this custom formula.

Thanks in advance.
Umair Durrani
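The weighted sum in the formula above can be sketched directly in base R. This is a minimal sketch, not the method from any of the answers: the formula does not say what to do within 15 rows of the start or end of a vehicle's record, so the window is shrunk symmetrically there as an assumption, and the function name smooth_sema and its arguments half_width and delta are made up for illustration.

```r
# Hypothetical helper: exponentially weighted moving average over a window of
# +/- half_width rows with weights exp(-|i - k| / delta)
smooth_sema <- function(y, half_width = 15, delta = 5) {
  n <- length(y)
  vapply(seq_len(n), function(i) {
    d <- min(half_width, i - 1, n - i)   # assumption: shrink window near the ends
    k <- (i - d):(i + d)
    w <- exp(-abs(i - k) / delta)
    sum(w * y[k]) / sum(w)               # sum(w) plays the role of Z
  }, numeric(1))
}

# The ten rows shown in the post
dat <- data.frame(
  indx   = 1:10,
  vehID  = 2,
  LocalY = c(35.381, 39.381, 43.381, 47.38, 51.381,
             55.381, 59.381, 63.379, 67.383, 71.398)
)

# Smooth each vehicle separately; rows must be in time order within a vehicle
dat$SmoothedY <- ave(dat$LocalY, dat$vehID, FUN = smooth_sema)
dat
```

Using `ave()` keeps the window from running across the boundary between two vehicles. For millions of rows the per-row closure may be slow; a convolution-based version would be faster but needs the same decision about the edges.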
[R] R 3.0.2 - How to create intervals and group another variable in those intervals?
I have two columns, one for speed ('Smoothed velocity') and one for Spacing. What I want to do is to first create intervals of speed (minimum value = 0, maximum value = 85.53), group the Spacing values falling in a particular speed interval, find the average Spacing for each interval and finally plot the average spacing of each interval against the mid-point of the speed interval. I want to have fixed intervals of 4.5 feet per second, i.e. 0-4.5, 4.5-9, ..., xx-85.53.

After hours of searching I found a function for creating intervals called classIntervals(), but I can't figure out how to create fixed intervals of 4.5. Here is what I tried:

classIntervals(s21[,'Smoothed velocity'], style='fixed', fixedBreaks=4.5)

But the results were unexpected and there was a warning message:

In classIntervals(s21[, "Smoothed velocity"], style = "fixed", fixedBreaks = 4.5) : variable range greater than fixedBreaks

Even after the intervals are created, I need to group the spacing values and find the average for every interval. How can I do this? I have tried what I could, please help.

Umair Durrani
email: umairdurr...@outlook.com
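The binning and averaging described above can be sketched with cut() and tapply() from base R. This is a sketch under assumptions: the sample values are invented, every bin is given the full 4.5 width (so the top bin runs past the maximum speed instead of ending at 85.53), and the names width, v, bin, avg_spacing and mid are illustrative. As far as I know, fixedBreaks in classIntervals() expects the complete vector of break points rather than a bin width, which would explain the warning for a single value of 4.5.

```r
# Hypothetical data with the two columns named in the post
s21 <- data.frame(
  `Smoothed velocity` = c(0, 2.0, 4.4, 4.5, 8.9, 20.0, 85.53),
  Spacing             = c(10, 12, 14, 20, 22, 50, 120),
  check.names = FALSE
)

width <- 4.5
v <- s21$`Smoothed velocity`

# Break points 0, 4.5, 9, ... up to the first multiple of 4.5 above max(v)
breaks <- seq(0, ceiling(max(v) / width) * width, by = width)

# Assign each speed to an interval [lower, upper)
bin <- cut(v, breaks = breaks, right = FALSE, include.lowest = TRUE)

# Average spacing per interval (NA for intervals with no observations)
avg_spacing <- tapply(s21$Spacing, bin, mean)

# Mid-point of every interval, in the same order as the bins
mid <- head(breaks, -1) + width / 2

plot(mid, avg_spacing,
     xlab = "Speed interval mid-point (ft/s)", ylab = "Average spacing")
```

Empty intervals show up as NA in `avg_spacing` and are skipped by `plot()`. The same `breaks` vector could also be passed to classIntervals() as `fixedBreaks` if that package is preferred.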