Does this do what you want: > library(tidyverse)
> input <- read_delim("PERMNO DATE Spread + 111 19940103 0.025464308 + 111 19940104 0.064424296 + 111 19940105 0.018579337 + 111 19940106 0.018872211 ..." ... [TRUNCATED] > # drop last two digits to get the month > monthly <- input %>% + group_by(PERMNO, month = DATE %/% 100) %>% + summarise(avg = mean(Spread)) > monthly # A tibble: 12 x 3 # Groups: PERMNO [3] PERMNO month avg <dbl> <dbl> <dbl> 1 111 199401 0.0416 2 111 199402 0.0508 3 111 199403 0.0567 4 111 199404 0.0466 5 112 199401 0.000533 6 112 199402 0.000593 7 112 199403 0.000471 8 112 199404 0.000587 9 113 199401 0.000692 10 113 199402 0.000591 11 113 199403 0.000677 12 113 199404 0.000555 > Jim Holtman *Data Munger Guru* *What is the problem that you are trying to solve?Tell me what you want to do, not how you want to do it.* On Sun, Oct 20, 2019 at 5:10 AM Subhamitra Patra <subhamitra.pa...@gmail.com> wrote: > Dear Sir, > > Thank you very much for your suggestions. > > Due to certain inconveniences, I was unable to work on your suggestions. > > Today I worked on both suggestions and got the result that I really wanted > that monthly averages for each country. > > Here, I am asking one more query (just for learning purpose) that if my > country name and its respective variable is in the panel format, and I want > to take the monthly average for each country, how the code will be > arranged. For your convenience, I am providing a small data sample below. > > PERMNO DATE Spread > 111 19940103 0.025464308 > 111 19940104 0.064424296 > 111 19940105 0.018579337 > 111 19940106 0.018872211 > 111 19940107 0.065279782 > 111 19940110 0.063485905 > 111 19940111 0.018355453 > 111 19940112 0.064135683 > 111 19940113 0.063519987 > 111 19940114 0.018277351 > 111 19940117 0.018628417 > 111 19940118 0.065630229 > 111 19940119 0.018713152 > 111 19940120 0.019119037 > 111 19940121 0.068342043 > 111 19940124 0.020843244 > 111 19940125 0.019954211 > 111 19940126 0.018980321 > 111 19940127 0.066827165 > 111 19940128 0.067459235 > 111 19940131 0.068682559 > 111 19940201 0.02081465 > 111 19940202 0.068236091 > 111 19940203 0.068821406 > 111 19940204 0.020075648 > 111 19940207 0.066070584 > 111 19940208 0.066068837 > 111 19940209 0.019077072 > 111 19940210 0.065894875 > 111 19940211 0.018847478 > 111 19940214 0.065040844 > 111 19940215 0.01880332 > 111 19940216 0.018836199 > 111 19940217 0.066888865 > 111 19940218 0.067116793 > 111 19940221 0.068809742 > 111 19940222 0.068230213 > 111 19940223 0.069502855 > 111 19940224 0.070383523 > 111 19940225 0.020430811 > 111 19940228 0.067087257 > 111 19940301 0.066776479 > 111 19940302 0.019959031 > 111 19940303 0.066596469 > 111 19940304 0.019131334 > 111 19940307 0.019312528 > 111 19940308 0.067349909 > 111 19940309 0.068916431 > 111 19940310 0.068620043 > 111 19940311 0.070494844 > 111 19940314 0.071056842 > 111 19940315 0.071042517 > 111 19940316 0.072401771 > 111 19940317 0.071940001 > 111 19940318 0.07352884 > 111 19940321 0.072671688 > 111 19940322 0.072652595 > 111 19940323 0.021352138 > 111 19940324 0.069933727 > 111 19940325 0.068717467 > 111 19940328 0.020470748 > 111 19940329 0.020003748 > 111 19940330 0.065833717 > 111 19940331 0.065268388 > 111 19940401 0.018762356 > 111 19940404 0.064914179 > 111 19940405 0.064706743 > 111 19940406 0.018764175 > 111 19940407 0.06524806 > 111 19940408 0.018593449 > 111 19940411 0.064913949 > 111 19940412 0.01872089 > 111 19940413 0.018729328 > 111 19940414 0.018978773 > 111 19940415 0.065477137 > 111 19940418 0.064614365 > 111 19940419 0.064184148 > 111 19940420 0.018553192 > 111 19940421 0.066872771 > 111 19940422 0.06680782 > 111 19940425 0.067467961 > 111 19940426 0.02014297 > 111 19940427 0.062464016 > 111 19940428 0.062357052 > 112 19940429 0.000233993 > 112 19940103 0.000815264 > 112 19940104 0.000238165 > 112 19940105 0.000813632 > 112 19940106 0.000236915 > 112 19940107 0.000809102 > 112 19940110 0.000801642 > 112 19940111 0.000797932 > 112 19940112 0.000795251 > 112 19940113 0.000795186 > 112 19940114 0.000231359 > 112 19940117 0.000232134 > 112 19940118 0.000233718 > 112 19940119 0.000233993 > 112 19940120 0.000234694 > 112 19940121 0.000235753 > 112 19940124 0.000808653 > 112 19940125 0.000235604 > 112 19940126 0.000805068 > 112 19940127 0.000802337 > 112 19940128 0.000801768 > 112 19940131 0.000233517 > 112 19940201 0.000797431 > 112 19940202 0.000233338 > 112 19940203 0.000233826 > 112 19940204 0.000799519 > 112 19940207 0.000798105 > 112 19940208 0.000792245 > 112 19940209 0.000231113 > 112 19940210 0.000233413 > 112 19940211 0.000798168 > 112 19940214 0.000233282 > 112 19940215 0.000797848 > 112 19940216 0.000785165 > 112 19940217 0.000228426 > 112 19940218 0.000786783 > 112 19940221 0.00078343 > 112 19940222 0.000781459 > 112 19940223 0.000776264 > 112 19940224 0.000226399 > 112 19940225 0.000779066 > 112 19940228 0.000773603 > 112 19940301 0.000226487 > 112 19940302 0.000775233 > 112 19940303 0.000227017 > 112 19940304 0.000227854 > 112 19940307 0.000782814 > 112 19940308 0.000229164 > 112 19940309 0.000787033 > 112 19940310 0.000784049 > 112 19940311 0.000228984 > 112 19940314 0.00078697 > 112 19940315 0.000782567 > 112 19940316 0.000228516 > 112 19940317 0.000786347 > 112 19940318 0.000229236 > 112 19940321 0.000230107 > 112 19940322 0.000792689 > 112 19940323 0.000787284 > 112 19940324 0.000787221 > 112 19940325 0.000227978 > 112 19940328 0.000228713 > 112 19940329 0.000228894 > 112 19940330 0.000229255 > 112 19940331 0.000231003 > 112 19940401 0.000796567 > 112 19940404 0.000790668 > 112 19940405 0.00078195 > 112 19940406 0.000780475 > 112 19940407 0.000228355 > 112 19940408 0.000781723 > 112 19940411 0.000775741 > 112 19940412 0.000226647 > 112 19940413 0.000778876 > 112 19940414 0.000777336 > 112 19940415 0.000775253 > 112 19940418 0.000226362 > 112 19940419 0.000779554 > 112 19940420 0.000774824 > 112 19940421 0.000225582 > 112 19940422 0.000225724 > 112 19940425 0.000773361 > 112 19940426 0.0002256 > 112 19940427 0.000776416 > 113 19940428 0.000280542 > 113 19940429 0.000964148 > 113 19940103 0.000962654 > 113 19940104 0.000281768 > 113 19940105 0.000962219 > 113 19940106 0.000961965 > 113 19940107 0.000958602 > 113 19940110 0.000280056 > 113 19940111 0.000956348 > 113 19940112 0.000952171 > 113 19940113 0.000948176 > 113 19940114 0.000275607 > 113 19940117 0.000275773 > 113 19940118 0.000276738 > 113 19940119 0.000947068 > 113 19940120 0.000940959 > 113 19940121 0.000275224 > 113 19940124 0.000948489 > 113 19940125 0.000940076 > 113 19940126 0.0009309 > 113 19940127 0.000269955 > 113 19940128 0.000270328 > 113 19940131 0.000924234 > 113 19940201 0.000924038 > 113 19940202 0.000269088 > 113 19940203 0.000270247 > 113 19940204 0.000270562 > 113 19940207 0.00092656 > 113 19940208 0.000921819 > 113 19940209 0.000920361 > 113 19940210 0.000268958 > 113 19940211 0.000924758 > 113 19940214 0.000266768 > 113 19940215 0.000911325 > 113 19940216 0.000909294 > 113 19940217 0.000905887 > 113 19940218 0.000262919 > 113 19940221 0.000262978 > 113 19940222 0.000263189 > 113 19940223 0.000904439 > 113 19940224 0.000263512 > 113 19940225 0.000906184 > 113 19940228 0.000265198 > 113 19940301 0.000906126 > 113 19940302 0.000264357 > 113 19940303 0.000265392 > 113 19940304 0.000912495 > 113 19940307 0.000910641 > 113 19940308 0.000266143 > 113 19940309 0.000910113 > 113 19940310 0.000909277 > 113 19940311 0.000905056 > 113 19940314 0.00090285 > 113 19940315 0.000898831 > 113 19940316 0.000896118 > 113 19940317 0.000261294 > 113 19940318 0.000892563 > 113 19940321 0.000890852 > 113 19940322 0.00088639 > 113 19940323 0.000258509 > 113 19940324 0.000260286 > 113 19940325 0.000889354 > 113 19940328 0.000888373 > 113 19940329 0.000885049 > 113 19940330 0.000259116 > 113 19940331 0.000259474 > 113 19940401 0.000260316 > 113 19940404 0.000897493 > 113 19940405 0.000894592 > 113 19940406 0.000260435 > 113 19940407 0.000260989 > 113 19940408 0.000262061 > 113 19940411 0.000262262 > 113 19940412 0.000263604 > 113 19940413 0.000908682 > 113 19940414 0.000265348 > 113 19940415 0.000265637 > 113 19940418 0.00026862 > 113 19940419 0.000918882 > 113 19940420 0.000909904 > 113 19940421 0.000901725 > 113 19940422 0.000900062 > 113 19940425 0.000893547 > 113 19940426 0.000260899 > Here, the 1st column is the name of the countries panel which I identified > in number, 2nd is the date column arranged in yyyy-mm-dd format, and 3rd is > the daily variable for each country for 4 months of 1994. I need to take > the monthly average of spread variable for each country (i.e. noted as 111, > 112, and 113) in the above example. In short, my monthly spread variable > should be sorted on the basis of both PERMNO (i.e. country identifier), and > months of the particular year. > > Please educate me that in this data format, how the average code can be > written? > > Thank you very much. > > > > [image: Mailtrack] > < > https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality5& > > > Sender > notified by > Mailtrack > < > https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality5& > > > 10/20/19, > 05:31:23 PM > > On Mon, Sep 16, 2019 at 4:31 PM PIKAL Petr <petr.pi...@precheza.cz> wrote: > > > Original email did not come through (some problems with formating). > > > > Hi > > > > No, on contrary. I **am** suggesting to change date column to real date > > asi it is easy to handle with appropriate functions. > > > > Here are some fake data > > > > > str(spdat) > > 'data.frame': 260 obs. of 3 variables: > > $ dates : Date, format: "1995-01-01" "1995-01-02" "1995-01-03" > > "1995-01-04" ... > > $ coutryA : num 0.188 0.405 -0.107 -0.596 -0.529 ... > > $ countryB: num 9.4 10.76 11.24 8.26 10.71 .. > > > > > head(spdat) > > dates coutryA countryB > > 1 1995-01-01 0.1875060 9.402851 > > 2 1995-01-02 0.4045193 10.755112 > > 3 1995-01-03 -0.1073904 11.243663 > > 4 1995-01-04 -0.5959683 8.256424 > > 5 1995-01-05 -0.5293772 10.705431 > > 6 1995-01-06 -0.2228029 10.171461 > > > > First I melt it > > spdat.m <- melt(spdat, id.var="dates") > > > > > head(spdat.m) > > dates variable value > > 1 1995-01-01 coutryA 0.1875060 > > 2 1995-01-02 coutryA 0.4045193 > > 3 1995-01-03 coutryA -0.1073904 > > 4 1995-01-04 coutryA -0.5959683 > > 5 1995-01-05 coutryA -0.5293772 > > 6 1995-01-06 coutryA -0.2228029 > > > > I do aggregation > > > > > spdat.ag <- aggregate(spdat.m$value, list(format(spdat.m$dates, > > "%m.%Y"), spdat.m$variable), mean) > > > > And now I use dcast to get required result. > > > > > dcast(spdat.ag, Group.1~Group.2) > > Using x as value column: use value.var to override. > > Group.1 coutryA countryB > > 1 01.1995 0.098688137 10.177696 > > 2 02.1995 0.352264682 9.609261 > > 3 03.1995 0.155521876 10.043503 > > 4 04.1995 -0.166092393 10.129844 > > 5 05.1995 0.164665188 10.308275 > > 6 06.1995 0.260633585 10.210129 > > 7 07.1995 0.003671979 10.549016 > > 8 08.1995 0.045295990 10.087435 > > 9 09.1995 -0.145488206 9.689876 > > 10 10.1995 -0.225645950 9.743744 > > 11 11.1995 0.030273383 10.025435 > > 12 12.1995 0.043557468 10.105626 > > > > Cheers > > Petr > > > > Here are the data. > > > > > dput(spdat) > > spdat <- structure(list(dates = structure(c(9131, 9132, 9133, 9134, > 9135, > > 9136, 9137, 9138, 9139, 9140, 9141, 9142, 9143, 9144, 9145, 9146, > > 9147, 9148, 9149, 9150, 9151, 9152, 9162, 9163, 9164, 9165, 9166, > > 9167, 9168, 9169, 9170, 9171, 9172, 9173, 9174, 9175, 9176, 9177, > > 9178, 9179, 9180, 9181, 9190, 9191, 9192, 9193, 9194, 9195, 9196, > > 9197, 9198, 9199, 9200, 9201, 9202, 9203, 9204, 9205, 9206, 9207, > > 9208, 9209, 9210, 9211, 9212, 9221, 9222, 9223, 9224, 9225, 9226, > > 9227, 9228, 9229, 9230, 9231, 9232, 9233, 9234, 9235, 9236, 9237, > > 9238, 9239, 9240, 9251, 9252, 9253, 9254, 9255, 9256, 9257, 9258, > > 9259, 9260, 9261, 9262, 9263, 9264, 9265, 9266, 9267, 9268, 9269, > > 9270, 9271, 9272, 9273, 9282, 9283, 9284, 9285, 9286, 9287, 9288, > > 9289, 9290, 9291, 9292, 9293, 9294, 9295, 9296, 9297, 9298, 9299, > > 9300, 9301, 9302, 9303, 9312, 9313, 9314, 9315, 9316, 9317, 9318, > > 9319, 9320, 9321, 9322, 9323, 9324, 9325, 9326, 9327, 9328, 9329, > > 9330, 9331, 9332, 9343, 9344, 9345, 9346, 9347, 9348, 9349, 9350, > > 9351, 9352, 9353, 9354, 9355, 9356, 9357, 9358, 9359, 9360, 9361, > > 9362, 9363, 9364, 9365, 9374, 9375, 9376, 9377, 9378, 9379, 9380, > > 9381, 9382, 9383, 9384, 9385, 9386, 9387, 9388, 9389, 9390, 9391, > > 9392, 9393, 9394, 9404, 9405, 9406, 9407, 9408, 9409, 9410, 9411, > > 9412, 9413, 9414, 9415, 9416, 9417, 9418, 9419, 9420, 9421, 9422, > > 9423, 9424, 9456, 9435, 9436, 9437, 9438, 9439, 9440, 9441, 9442, > > 9443, 9444, 9445, 9446, 9447, 9448, 9449, 9450, 9451, 9452, 9453, > > 9454, 9455, 9486, 9465, 9466, 9467, 9468, 9469, 9470, 9471, 9472, > > 9473, 9474, 9475, 9476, 9477, 9478, 9479, 9480, 9481, 9482, 9483, > > 9484, 9151), class = "Date"), coutryA = c(0.187506004416315, > > 0.404519257417805, -0.107390371811605, -0.595968278805544, > > -0.529377240936012, > > -0.222802921207767, 0.413182392872818, 0.689673026532298, > > -1.2768723266992, > > -0.506308625809406, 0.113859233745174, -0.0963423819877653, > > 0.323987304768398, > > 1.63846917270538, 0.893233423250338, 0.297732439150487, > 0.949323101836486, > > -0.599518074708052, 0.366372319197032, -2.25734971953878, > > -0.190971733204918, > > -0.0874143568874351, 1.46699645184047, 0.00702170238687361, > > 0.11221346278474, > > -0.8060359607624, 0.340842350476532, 0.798838328074708, > 0.449214745851041, > > -0.664972890558734, 0.521830282184173, -1.35020467264521, > > -0.95240631225826, > > 1.25607320999849, 1.57018988549379, 0.99477900888445, -0.936218787378207, > > 1.48489932847779, 0.529222943794807, 0.0995675049147771, > > 0.477770516727839, > > 1.64567253670186, -0.0212651530684566, 0.558952796713992, > > 0.0409979382929057, > > 0.428675380654606, 0.0919422583362682, -0.819694497340459, > > 1.23998830450888, > > 0.607498144489643, -1.27724580163097, 1.41634774644371, > > -0.579094515769707, > > 2.02039606694223, 0.0740478208705996, -1.69826944583929, > > -0.321482399813063, > > -0.489198601400924, 2.0066750686302, -1.90624857752322, 0.46762405849973, > > 1.31264724137396, -0.0473627194710677, 0.141362267796145, > > 0.329709761206515, > > 0.518454586458572, -1.39489985851779, -0.388303591187678, > > -0.668922704543522, > > 0.0735115674875065, 1.30737242978235, 0.198503397980751, > > 0.257831448122427, > > -1.31173539205588, -1.45147941969116, 0.359725782295977, > > 0.612882118056585, > > -0.0733768753346202, -0.508349204402508, 1.35776663767231, > > 0.997807735669086, > > -1.41717534266382, -0.894170593324238, -0.68578120845151, > > -0.211509378018794, > > 0.436738904337909, -1.46932152770435, 0.0817388759874159, > > -0.0389350881653141, > > 0.709198476466861, -0.963669144724435, -0.548607422521798, > > -0.896886885575286, > > 0.322231150840934, 1.37327611339939, 0.0310213133870952, > > 0.796577750757324, > > -0.2010067423637, -0.241723752424226, 1.37547329580654, > -1.15382202538982, > > 0.101454200596915, 0.273663839664217, 1.8315140887841, 1.86096518756473, > > -0.536393730924719, -0.45845011727266, 1.10226256157127, > > -0.385596991265563, > > 3.20218061566932, -1.25865250042183, -0.13613128784276, > 0.483329357746514, > > -0.597187329618306, 0.710977603908319, -1.07945708269043, > > -0.477626236401394, > > 1.51034914684104, 2.35886426985999, -0.0250526828683629, > > -0.29439443478131, > > 0.665774016744828, 0.464027472251246, 0.226658374792016, > > -0.802597030454373, > > 0.825517059805602, -1.11293193130819, -1.27677400513873, > 1.60776237113347, > > 1.12490009531342, 0.95767047134623, 0.0475745549797055, > > -0.0591587460876868, > > -0.690617365240342, -1.62111622938126, -1.3545210707469, 1.8607927043106, > > 0.764367674339969, 1.49261525602638, 0.549570728337346, > -1.29658399741794, > > -1.6289903797869, 0.00573336252135834, 0.0300702149640632, > > 0.440810830115721, > > 0.663568666361326, -0.126685900835146, -0.00221628368438927, > > 0.815321995886579, -0.499280888368945, -0.271814047751667, > > -0.071025546459042, > > 1.73165491816826, -0.0294770299043331, 0.833605607221529, > > -0.670108794857159, > > -0.303323318026829, 1.29039844459134, -0.818806702120603, > > -0.445515595649677, > > -0.0128796557666887, 0.320923705586147, 0.230597275812536, > > -1.54009153212366, > > -0.294702981688559, 0.581209734391958, 0.121384768986639, > > 0.502914098451111, > > -1.59018268505718, -0.635101104166451, 1.48005776676403, > > -0.25631761189957, > > 0.171947814411552, 0.444646195980014, 0.172655758440111, > > -0.00432159794094836, > > -0.549321974240026, 0.585055026451421, -1.22813371480849, > > 0.846807540195381, > > 0.319629441352597, 0.393525732059709, -1.40275675444594, > 1.11062585584811, > > 0.214809571213853, -0.636432711800391, -0.283087127251573, > > -1.46385553207618, > > 0.436928676930225, -1.34231945433777, 0.451281957595763, > > -0.523155001924496, > > -2.69416779107545, 1.5513477373689, 0.989632029400036, 1.34636075948993, > > 0.346147428691405, -0.464527560160041, 0.337233933370495, > > 1.11331396366389, > > -1.00060600083316, -0.734784444487169, 1.40476315358621, > 1.01671092179193, > > -0.0144306250829694, -0.923555930346906, -1.02275966525015, > > 0.619422010219383, > > 0.603484309754755, -0.774553813657576, 0.0932792545556387, > > -0.651884521428279, > > -0.61965612647073, -1.22104834441579, -1.31439612639271, > > -2.87707752518163, > > -0.0343801084491906, -0.640678302378492, -1.38653452986558, > > 0.884963139028743, > > -0.657454283462004, 0.462842665244993, -0.20881674837534, > 0.6345884135548, > > 0.707165108434729, -0.162090928425892, -0.998662309785188, > > 1.3130254639318, > > 0.191890764940071, -0.0493619237876962, -0.55183232511689, > > 0.470263932874487, > > -0.217088645692971, 0.231550037620628, -0.530406537266415, > > -0.616522469083808, > > 0.329347084038265, 1.49420692610475, 1.91750823142859, 0.753536143872474, > > 0.766584887163714, -0.259803384094296, -0.402463714097741, > > -0.0229799209735185, > > -0.259677990559218, -1.41529707261105, 0.191362852138627, > > 1.54483266684747, > > -1.17947655378489, -0.426265411073274, 0.723010460481118, > > 1.37405142869537, > > -0.374771207936141, 0.0513905365832423, -0.369432731236118, > > -0.945441984794364, > > 0.179506648255554, 0.31971255725438, -1.25117960937319, 2.46228549580083 > > ), countryB = c(9.4028512714591, 10.7551115504652, 11.2436629228434, > > 8.25642360904389, 10.7054313972395, 10.1714609666091, 10.3726975056206, > > 10.6594299429162, 8.56250595443296, 10.5612153841067, 8.07612112826519, > > 9.94704207511951, 11.392407273156, 10.4257690445268, 10.6339442533038, > > 10.5314883825356, 10.3506665399033, 10.2426403362978, 10.8437715647564, > > 10.8247200587034, 11.2449815064171, 9.2898697883112, 9.05418978124619, > > 10.6080277672463, 9.19882175737148, 11.3589722806948, 10.4139334238189, > > 9.44305216810892, 9.58426470056472, 8.84208362003176, 10.8125431356391, > > 7.71357872650814, 8.73526671289828, 10.714693958224, 9.49976972594189, > > 9.41896864941478, 7.33073349261249, 10.5268398982262, 9.92255671125184, > > 10.5665378092202, 10.5035704895405, 7.93682068228084, 10.882421050424, > > 9.3237610577468, 8.42128120513304, 9.46103753451174, 10.3472215515392, > > 11.0483414005193, 10.3421689244599, 7.85120280889754, 11.6327644046254, > > 9.57620745972098, 10.6553844719749, 10.8490250129346, 10.2742492933876, > > 9.55428072119304, 9.0976351049804, 10.0731951766966, 10.6956488509516, > > 11.1530744146062, 10.3496303024767, 9.52734980693306, 9.64478424894734, > > 9.28301632295047, 10.9568790570688, 11.6052870914912, 9.58530202776537, > > 11.1338134902295, 8.66685735149472, 11.0230863576875, 10.8000609212302, > > 10.6510296259782, 11.831292326569, 9.53836122448974, 9.55805411377422, > > 9.90336204189518, 9.36377040999133, 11.7041009433341, 9.95628297574831, > > 10.718111342931, 10.4562688422472, 8.85976383099186, 8.94085496683824, > > 8.19538394018434, 10.1058448260449, 9.70821289789561, 9.08757962570738, > > 10.6599997541876, 10.0521137258219, 9.9924295699559, 11.8730269098299, > > 11.2634470795907, 11.3854762443416, 9.56742053529845, 10.4101561978503, > > 9.53376547865009, 9.75410966995361, 9.92804558924886, 8.36231430067066, > > 10.7486459346681, 12.0143881312685, 11.0083060332839, 9.32820954213586, > > 10.8420346742049, 9.73064414798223, 10.7593902723319, 10.976622155215, > > 10.1039774975157, 8.36317871802524, 9.21809894958653, 10.1015362220683, > > 11.4655736295123, 9.65528297274543, 9.67844310028008, 10.1516820910267, > > 8.38764450852642, 10.163558398201, 11.1432463477237, 12.0509818193223, > > 10.9896913965091, 11.1772406550953, 9.14396687337779, 9.93338627749979, > > 10.9548864433126, 8.64911301751956, 11.706463972364, 11.1012846649741, > > 8.7805267197408, 11.5802098773954, 10.2268513542863, 10.3509617168731, > > 9.09646558899397, 11.2706666647314, 11.3984335011704, 11.4808985388742, > > 10.5608771133999, 10.3684356806175, 10.4815588822618, 10.5818867877558, > > 12.2561035284691, 8.6464271477849, 10.3412351841865, 10.7577574534162, > > 11.1124067479261, 9.91627943243343, 10.6356898895291, 10.2107566441478, > > 10.0672734202575, 10.2385787014999, 11.7112606160069, 10.0453801263575, > > 8.84654136100724, 10.2173421609193, 9.27919801705716, 10.4755578829547, > > 7.69340209082122, 9.24705253848083, 10.8415406794597, 8.69603117680965, > > 11.2589214416702, 10.5425642239737, 10.1389355042458, 9.17267675180435, > > 12.3052338002213, 10.0181674985326, 12.2715476751051, 9.64516268052739, > > 10.6305299379912, 10.1829347684655, 9.97983942366781, 10.2559847744715, > > 10.3092266661814, 9.75215330673072, 10.250464278709, 9.31118800061454, > > 10.3310666767171, 9.09703848990093, 10.241195105962, 8.57290406448477, > > 8.98090855172704, 8.64653101832229, 12.6791587435376, 9.56000538681993, > > 10.4062255533723, 11.067091476284, 10.5255014737268, 10.2240941949978, > > 9.13081571869084, 9.5942352120783, 9.2753466212409, 10.2789293993548, > > 8.10255065585342, 9.48751297655077, 8.51198576785003, 9.46310532206947, > > 9.86727270762806, 11.5149248124739, 9.31557156735022, 9.34351230206303, > > 10.022139448869, 11.4111350893792, 8.57891783464065, 10.3761090924661, > > 9.38300408584683, 9.33694577526158, 9.25815555686085, 9.29856853889735, > > 8.4250073823245, 8.83022950824832, 9.1510846172981, 10.2553042376765, > > 10.0739540955956, 9.04955917463259, 10.8927827168631, 9.44611041690694, > > 10.7883395708593, 10.6010088332078, 7.72560864006592, 10.1760839916637, > > 11.5576569894392, 11.384809257294, 8.73504353987083, 9.00585942714512, > > 9.62327893504013, 10.3527072699866, 10.5220100705827, 8.74921668696853, > > 8.56415116683662, 12.1348451793815, 10.9496674323819, 9.64443817181322, > > 9.52977454697087, 10.4281877186725, 8.52701721410292, 11.6911584965782, > > 10.2300108250139, 8.65368821276485, 11.7733431942379, 10.2060233777681, > > 9.57291673029552, 9.82687667895106, 10.5939736188493, 11.2510605726337, > > 10.3383384488323, 9.92301237292945, 10.0164623230529, 10.4939857044034, > > 10.5631769648289, 10.935731043532, 11.0659359187168, 8.51697010486427, > > 9.79512310587405, 9.35132038807071, 11.3286703149903, 10.4621597293933, > > 10.4099459919071, 8.86246315190942, 9.30054044639769, 9.40346575227191, > > 9.59278722974697)), row.names = c(NA, -260L), class = "data.frame") > > > > > > > > > > > > From: Subhamitra Patra <mailto:subhamitra.pa...@gmail.com> > > Sent: Friday, September 13, 2019 3:59 PM > > To: PIKAL Petr <mailto:petr.pi...@precheza.cz>; r-help mailing list > > <mailto:r-help@r-project.org> > > Subject: Re: [R] Query about calculating the monthly average of daily > data > > columns > > > > Dear PIKAL, > > > > Thank you very much for your suggestion. > > > > I tried your previous suggested code and getting the average value for > > each month for both country A, and B. But in your recent email, you are > > suggesting not to change the date column to real date. If I am going > > through your recently suggested code, i.e. > > > > "aggregate(value column, list(format(date column, "%m.%Y"), country > > column), mean)" > > > > I am getting an Error that "aggregate(value, list(format(date, "%m.%Y"), > > country), mean) : object 'value' not found". > > > > Here, my query "may I need to define the date column, country column, and > > value column separately?" > > > > Further, I need something the average value result like below in the data > > frame > > > > Month Country A Country B > > Jan 1994 26.66 35.78 > > Feb 1994 26.13 29.14 > > > > so that it will be easy for me to export to excel, and to use for the > > further calculations. > > > > Please suggest me in this regard. > > > > Thank you. > > > > > > > > > > > > > > > https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality5& > > Sender notified by > > > > > https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_campaign=signaturevirality5& > > 09/13/19, 07:22:53 PM > > > > > > > > On Fri, Sep 13, 2019 at 7:03 PM PIKAL Petr <mailto: > petr.pi...@precheza.cz> > > wrote: > > Hi > > > > I am almost 100% sure that you would spare yourself much trouble if you > > changed your date column to real date > > > > ?as.Date > > > > reshape your wide format to long one > > library(reshape2) > > ?melt > > > > to get 3 column data.frame with one date column, one country column and > > one value column > > > > use ?aggregate and ?format to get summary value > > > > something like > > aggregate(value column, list(format(date column, "%m.%Y"), country > > column), mean) > > > > But if you insist to scratch your left ear with right hand accross your > > head, you could continue your way. > > > > Cheers > > Petr > > > > > -----Original Message----- > > > From: R-help <mailto:r-help-boun...@r-project.org> On Behalf Of > > Subhamitra > > > Patra > > > Sent: Friday, September 13, 2019 3:20 PM > > > To: Jim Lemon <mailto:drjimle...@gmail.com>; r-help mailing list > > <r-help@r- > > > http://project.org> > > > Subject: Re: [R] Query about calculating the monthly average of daily > > data > > > columns > > > > > > Dear Sir, > > > > > > Yes, I understood the logic. But, still, I have a few queries that I > > mentioned > > > below your answers. > > > > > > "# if you only have to get the monthly averages, it can be done this > way > > > > spdat$month<-sapply(strsplit(spdat$dates,"-"),"["*,2*) > > > > spdat$year<-sapply(strsplit(spdat$dates,"-"),"[",*3*)" > > > > > > > > B. Here, I need to define the no. of months, and years separately, > > right? > > > > or else what 2, and 3 (in bold) indicates? > > > > > > > > > > To get the grouping variable of sequential months that you want, you > only > > > need the month and year values of the dates in the first column. First > I > > used > > > the "strsplit" function to split the date field at the hyphens, then > used > > > "sapply" to extract ("[") the second (month) and *third (year)* parts > as > > two > > > new columns. Because you have more than one year of data, you need the > > > year values or you will group all Januarys, all Februarys and so on. > > > Notice how I pass both of the new columns as a list (a data frame is a > > type of > > > list) in the call to get the mean of each month. > > > > > > 1. Here, as per my understanding, the "3" indicates the 3rd year, > right? > > > But, you showed an average for 2 months of the same year. Then, what > "3" > > > in the spdat$year object indicate? > > > > > > > > > C. From this part, I got the exact average values of both January and > > > > February of 1994 for country A, and B. But, in code, I have a query > > > > that I need to define spdat$returnA, and spdat$returnB separately > > > > before writing this code, right? Like this, I need to define for each > > > > 84 countries separately with their respective number of months, and > > > > years before writing this code, right? > > > > > > > > > > I don't think so. Because I don't know what your data looks like, I am > > > guessing that for each row, it has columns for each of the 84 > countries. > > I > > > don't know what these columns are named, either. Maybe: > > > > > > date Australia Belarus ... Zambia > > > 01/01/1994 20 21 22 > > > ... > > > > > > Here, due to my misunderstanding about the code, I was wrong. But, what > > > data structure you guessed, it is absolutely right that for each row, I > > have > > > columns for each of the 84 countries. So, I think, I need to define the > > date > > > column with no. of months, and years once for all the countries. > > > Therefore, I got my answer to the first and third question in the > > previous > > > email (what you suggested) that I no need to define the column of each > > > country, as the date, and no. of observations are same for all > countries. > > > But, the no. of days are different for each month, and similarly, for > > each > > > year. So, I think I need to define date for each year separately. > > Hence, I have > > > given an example of 12 months, for 2 years (i.e. 1994, and 1995), and > > have > > > written the following code. Please correct me in case I am wrong. > > > > > > spdat<-data.frame( > > > > > > > > > dates=paste(c(1:21,1:20,1:23,1:21,1:22,1:22,1:21,1:23,1:22,1:21,1:22,1:22),c(r > > > ep(1,21),rep(2,20), > > > rep(3,23), rep(4,21), > > > > > > rep(5,22),rep(6,22),rep(7,21),rep(8,23),rep(9,22),rep(10,21),rep(11,22),rep(12 > > > ,22) > > > ),rep(1994,260) > > > dates1= > > > > > > paste(c(1:22,1:20,1:23,1:20,1:23,1:22,1:21,1:23,1:21,1:22,1:22,1:21),c(rep(1,2 > > > 2),rep(2,20), > > > rep(3,23), rep(4,20), > > > > > > rep(5,23),rep(6,22),rep(7,21),rep(8,23),rep(9,21),rep(10,21),rep(11,22),rep(12 > > > ,21) > > > ),rep(1995,259) ,sep="-") > > > > > > Concerning the exporting of structure of the dataset to excel, I will > > have > > > 12*84 matrix. But, please suggest me the way to proceed for the large > > > sample. I have mentioned below what I understood from your code. Please > > > correct me if I am wrong. > > > 1. I need to define the date for each year as the no. of days in each > > month > > > are different for each year (as mentioned in my above code). For > > instance, in > > > my data file, Jan 1994 has 21 days while Jan 1995 has 22 days. > > > 2. Need to define the date column as character. > > > 3. Need to define the monthly average for each month, and year. So, now > > > code will be as follows. > > > > > > spdat$month<-sapply(strsplit(spdat$dates,"-"),"[",2,3,4,5,6,7,8,9,10,11,12) > > > %%%%As I need all months average sequentially. > > > spdat$year<-sapply(strsplit(spdat$dates,"-"),"[",3) > > > > > > Here, this meaning of "3", I am really unable to get. > > > > > > 4. Need to define each country with each month and year as mentioned in > > > the last part of your code. > > > > > > Please suggest me in this regard. > > > > > > Thank you. > > > > > > > > > > > > > > > > > > > > > > > > [image: Mailtrack] > > > <https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_ca > > > mpaign=signaturevirality5&> > > > Sender > > > notified by > > > Mailtrack > > > <https://mailtrack.io?utm_source=gmail&utm_medium=signature&utm_ca > > > mpaign=signaturevirality5&> > > > 09/13/19, > > > 06:41:41 PM > > > > > > On Fri, Sep 13, 2019 at 4:24 PM Jim Lemon <mailto:drjimle...@gmail.com > > > > wrote: > > > > > > > Hi Subhamitra, > > > > I'll try to write my answers adjacent to your questions below. > > > > > > > > On Fri, Sep 13, 2019 at 6:08 PM Subhamitra Patra < > > > > mailto:subhamitra.pa...@gmail.com> wrote: > > > > > > > >> Dear Sir, > > > >> > > > >> Thank you very much for your suggestion. > > > >> > > > >> Yes, your suggested code worked. But, actually, I have data from 3rd > > > >> January 1994 to 3rd August 2017 for very large (i.e. for 84 > > > >> countries) sample. From this, I have given the example of the years > > > >> up to 2000. Before applying the same code for the long 24 years, I > > > >> want to learn the logic behind the code. Actually, some part of the > > > >> code is not understandable to me which I mentioned in the bold > letter > > as > > > follows. > > > >> > > > >> "spdat<-data.frame( > > > >> > > dates=paste(c(1:30,1:28),c(rep(1,30),rep(2,28)),rep(1994,58),sep="-"), > > > >> returnA=sample(*15:50*,58,TRUE),returnB=sample(*10:45*,58,TRUE))" > > > >> > > > >> A. Here, I need to define the no. of days in a month, and the no. of > > > >> countries name separately, right? But, what is meant by 15:50, and > > > >> 10:45 in return A, and B respectively? > > > >> > > > > > > > > To paraphrase Donald Trump, this is FAKE DATA! I have no idea what > the > > > > real values of return are, so I made them up using the "sample" > > function. > > > > However, this is not meant to mislead anyone, just to show how > > > > whatever numbers are in your data can be used in calculations. The > > > > colon (":") operator creates a sequence of numbers starting with the > > > > one to the left and ending with the one to the right. > > > > > > > >> > > > >> "# if you only have to get the monthly averages, it can be done this > > > >> way > > > >> spdat$month<-sapply(strsplit(spdat$dates,"-"),"["*,2*) > > > >> spdat$year<-sapply(strsplit(spdat$dates,"-"),"[",*3*)" > > > >> > > > >> B. Here, I need to define the no. of months, and years separately, > > right? > > > >> or else what 2, and 3 (in bold) indicates? > > > >> > > > > > > > > To get the grouping variable of sequential months that you want, you > > > > only need the month and year values of the dates in the first column. > > > > First I used the "strsplit" function to split the date field at the > > > > hyphens, then used "sapply" to extract ("[") the second (month) and > > > > third (year) parts as two new columns. Because you have more than one > > > > year of data, you need the year values or you will group all > Januarys, > > > > all Februarys and so on. Notice how I pass both of the new columns as > > > > a list (a data frame is a type of > > > > list) in the call to get the mean of each month. > > > > > > > >> > > > >> "# get the averages by month and year - is this correct? > > > >> monthlyA<-by(*spdat$returnA*,spdat[,c("month","year")],mean) > > > >> monthlyB<-by(*spdat$returnB*,spdat[,c("month","year")],mean)" > > > >> > > > >> C. From this part, I got the exact average values of both January > and > > > >> February of 1994 for country A, and B. But, in code, I have a query > > > >> that I need to define spdat$returnA, and spdat$returnB separately > > > >> before writing this code, right? Like this, I need to define for > each > > > >> 84 countries separately with their respective number of months, and > > > >> years before writing this code, right? > > > >> > > > > > > > > I don't think so. Because I don't know what your data looks like, I > am > > > > guessing that for each row, it has columns for each of the 84 > > > > countries. I don't know what these columns are named, either. Maybe: > > > > > > > > date Australia Belarus ... Zambia > > > > 01/01/1994 20 21 22 > > > > ... > > > > > > > > > > > >> Yes, after obtaining the monthly average for each country's data, I > > > >> need to use them for further calculations. So, I want to export the > > > >> result to excel. But, until understanding the code, I think I > willn't > > > >> able to apply for the entire sample, and cannot be able to discuss > > > >> the format of the resulted column to export to excel. > > > >> > > > > > > > > Say that we perform the grouped mean calculation for the first two > > > > country columns like this: > > > > monmeans<-sapply(spdat[,2:3],by,spdat[,c("month","year")],mean) > > > > monmeans > > > > Australia Belarus > > > > [1,] 29.70000 30.43333 > > > > [2,] 34.17857 27.39286 > > > > > > > > We are presented with a 2x2 matrix of monthly means in just the > format > > > > someone might use for importing into Excel. The first row is January > > > > 1994, the second February 1994 and so on. By expanding the columns to > > > > include all the countries in your data, You should have the result > you > > want. > > > > > > > > Jim > > > > > > > > > > > > > -- > > > *Best Regards,* > > > *Subhamitra Patra* > > > *Phd. Research Scholar* > > > *Department of Humanities and Social Sciences* *Indian Institute of > > > Technology, Kharagpur* > > > *INDIA* > > > > > > [[alternative HTML version deleted]] > > > > > > ______________________________________________ > > > mailto:R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, > see > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > PLEASE do read the posting guide http://www.R-project.org/posting- > > > guide.html > > > and provide commented, minimal, self-contained, reproducible code. > > Osobní údaje: Informace o zpracování a ochraně osobních údajů obchodních > > partnerů PRECHEZA a.s. jsou zveřejněny na: > > https://www.precheza.cz/zasady-ochrany-osobnich-udaju/ | Information > > about processing and protection of business partner’s personal data are > > available on website: > > https://www.precheza.cz/en/personal-data-protection-principles/ > > Důvěrnost: Tento e-mail a jakékoliv k němu připojené dokumenty jsou > > důvěrné a podléhají tomuto právně závaznému prohláąení o vyloučení > > odpovědnosti: https://www.precheza.cz/01-dovetek/ | This email and any > > documents attached to it may be confidential and are subject to the > legally > > binding disclaimer: https://www.precheza.cz/en/01-disclaimer/ > > > > > > > > -- > > Best Regards, > > Subhamitra Patra > > Phd. Research Scholar > > Department of Humanities and Social Sciences > > Indian Institute of Technology, Kharagpur > > INDIA > > ______________________________________________ > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > -- > *Best Regards,* > *Subhamitra Patra* > *Phd. Research Scholar* > *Department of Humanities and Social Sciences* > *Indian Institute of Technology, Kharagpur* > *INDIA* > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.