My suggested approach:
dta <- structure(list(Prod_name = c("Banana", "Apple", "Orange", "Yoghurt",
"Eggs", "Milk", "Day_num"), X1.1.2000 = c("1", "0", "4", "3",
"6", "2", "1"), X2.1.2000 = c("2", "4", "1", "5", "3", "0", "2"
), X3.1.2000 = c("1", "5", "2", "3", "0", "4", "3"), X4.1.2000 = c("2",
"4", "4", "1", "0", "0", "4"), X5.1.2000 = c("0", "0", "1", "0",
"2", "3", "5"), X6.1.2000 = c("1", "3", "2", "1", "4", "1", "6"
), X7.1.2000 = c("5", "4", "5", "2", "2", "1", "7")), .Names = c("Prod_name",
"X1.1.2000", "X2.1.2000", "X3.1.2000", "X4.1.2000", "X5.1.2000",
"X6.1.2000", "X7.1.2000"), row.names = c(NA, 7L), class = "data.frame")
# The Day_num values ARE NOT data you will be aggregating and
# should not be in the data frame with meaningful values.
dta <- dta[ 1:6, ] # forget last garbage line
# assuming your data are intended to be numeric
for( i in 2:8 ) {
  dta[[ i ]] <- as.numeric( dta[[ i ]] )
}
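The same conversion can also be written without an explicit loop. This is only a minimal sketch, assuming every column except the first holds numbers stored as character strings; `demo` is a made-up stand-in for `dta`, not part of the original data.

```r
# hypothetical two-product stand-in for dta (values stored as character)
demo <- data.frame( Prod_name = c( "Banana", "Apple" )
                  , X1.1.2000 = c( "1", "0" )
                  , X2.1.2000 = c( "2", "4" )
                  , stringsAsFactors = FALSE
                  )
# convert every column except the first in one step
demo[ -1 ] <- lapply( demo[ -1 ], as.numeric )
# one logical per column: TRUE where the column is now numeric
colTypes <- sapply( demo, is.numeric )
```

Inspecting `colTypes` (or calling `str( demo )`) before going further is a cheap way to confirm that only `Prod_name` is still character.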
# you didn't say what computation you want to accomplish on the data
# assuming you want to add values up by product and part of week
# first, using only base R functions
# generally useful to set timezone when using POSIXt types
Sys.setenv( TZ="Etc/GMT" )
# gather data values from multiple columns into long form
# I find this function very confusing, but it does work if you
# don't like depending on contributed packages that are easier to
# understand
dtaLong <- reshape( dta
                  , idvar = "Prod_name"
                  , varying = 1+seq.int( length( dta ) - 1 )
                  , v.names = "value"
                  , timevar = "XDates"
                  , times = names( dta )[ 1+seq.int( length( dta ) - 1 ) ]
                  , direction = "long"
                  )
# extract Date values from column names
dtaLong$Dates <- as.Date( dtaLong$XDates, format="X%d.%m.%Y" )
# read about POSIX types in the help page ?DateTimeClasses
dt_lt <- as.POSIXlt( dtaLong$Dates )
# extract the weekday information from the POSIXlt
dtaLong$wday <- dt_lt$wday # Sunday==0
# identify rows corresponding to time of week
dtaLong$WkPart <- ifelse( dtaLong$wday %in% c( 0, 6 )
                        , "Weekend"
                        , "Weekday" )
# aggregate by sum the value grouping by Prod_name and WkPart
dtaAgg <- aggregate( dtaLong$value
                   , dtaLong[ , c( "Prod_name", "WkPart" ), drop=FALSE ]
                   , FUN=sum
                   )
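Weekday numbering differs between functions, so it can help to check it against dates whose weekday is known. 1 January 2000 fell on a Saturday. The following is a small sketch using base R only; the `chk` data frame and its column names are made up for illustration.

```r
chk <- data.frame( XDates = c( "X1.1.2000", "X2.1.2000", "X3.1.2000" )
                 , stringsAsFactors = FALSE
                 )
chk$Dates <- as.Date( chk$XDates, format = "X%d.%m.%Y" )
# POSIXlt counts 0 (Sunday) through 6 (Saturday)
chk$wday <- as.POSIXlt( chk$Dates )$wday
# format "%u" counts 1 (Monday) through 7 (Sunday), returned as character
chk$isoDay <- format( chk$Dates, "%u" )
# Saturday and Sunday are 6 and 0 in the POSIXlt convention
chk$WkPart <- ifelse( chk$wday %in% c( 0, 6 ), "Weekend", "Weekday" )
```

Printing `chk` shows both numbering schemes side by side, which makes it easier to pick the right pair of values for whichever function is in use.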
# or using the contributed packages dplyr, tidyr and lubridate
library(dplyr)
library(tidyr)
library(lubridate)
# "pipe" data frames from one step to the next
dtaAgg2.a <- ( dta
             # tidyr way of making long form data
             %>% gather( XDates, value, -Prod_name )
             )
# dtaAgg2.a is purely for studying what is happening
dtaAgg2.b <- ( dta
             # tidyr way of making long form data
             %>% gather( XDates, value, -Prod_name )
             %>% mutate( Dates = as.Date( XDates, format="X%d.%m.%Y" )
                       # lubridate::wday() counts Sunday==1 ... Saturday==7 by default
                       , WkPart = ifelse( wday( Dates ) %in% c( 1, 7 )
                                        , "WeekEnd"
                                        , "WeekDay" )
                       )
             )
# dtaAgg2.b is also for studying what happens
# finally, run the whole pipeline of calculations
dtaAgg2 <- ( dta
           # tidyr way of making long form data
           %>% gather( XDates, value, -Prod_name )
           %>% mutate( Dates = as.Date( XDates, format="X%d.%m.%Y" )
                     # lubridate::wday() counts Sunday==1 ... Saturday==7 by default
                     , WkPart = ifelse( wday( Dates ) %in% c( 1, 7 )
                                      , "WeekEnd"
                                      , "WeekDay" )
                     )
           %>% group_by( Prod_name, WkPart )
           %>% summarise( SumOfValues = sum( value ) )
           )
# the group_by and summarise steps work together
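In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). The sketch below shows roughly how the same pipeline might look with it. It assumes tidyr >= 1.0.0 and dplyr >= 1.0.0 are installed; `demo` and `dtaAgg3` are made-up names, and the weekend test uses base format( ..., "%u" ) instead of lubridate so that the numbering is explicit.

```r
library(dplyr)
library(tidyr)

# hypothetical small stand-in for dta, already numeric
demo <- data.frame( Prod_name = c( "Banana", "Apple" )
                  , X1.1.2000 = c( 1, 0 )   # a Saturday
                  , X2.1.2000 = c( 2, 4 )   # a Sunday
                  , X3.1.2000 = c( 1, 5 )   # a Monday
                  , stringsAsFactors = FALSE
                  )

dtaAgg3 <- ( demo
           # newer tidyr way of making long form data
           %>% pivot_longer( -Prod_name
                           , names_to = "XDates"
                           , values_to = "value" )
           %>% mutate( Dates = as.Date( XDates, format = "X%d.%m.%Y" )
                     # "%u" gives "1" (Monday) through "7" (Sunday)
                     , WkPart = ifelse( format( Dates, "%u" ) %in% c( "6", "7" )
                                      , "WeekEnd"
                                      , "WeekDay" )
                     )
           %>% group_by( Prod_name, WkPart )
           %>% summarise( SumOfValues = sum( value ), .groups = "drop" )
           )
```

The result has one row per product and part of week, in the same shape as `dtaAgg2`.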
On Sun, 4 Sep 2016, Filippos Katsios wrote:
Dear all,
I believe that this will be a more helpful way to put the problem:
structure(list(Prod_name = c("Banana", "Apple", "Orange", "Yoghurt",
"Eggs", "Milk", "Day_num"), X1.1.2000 = c("1", "0", "4", "3",
"6", "2", "1"), X2.1.2000 = c("2", "4", "1", "5", "3", "0", "2"
), X3.1.2000 = c("1", "5", "2", "3", "0", "4", "3"), X4.1.2000 = c("2",
"4", "4", "1", "0", "0", "4"), X5.1.2000 = c("0", "0", "1", "0",
"2", "3", "5"), X6.1.2000 = c("1", "3", "2", "1", "4", "1", "6"
), X7.1.2000 = c("5", "4", "5", "2", "2", "1", "7")), .Names = c("Prod_name",
"X1.1.2000", "X2.1.2000", "X3.1.2000", "X4.1.2000", "X5.1.2000",
"X6.1.2000", "X7.1.2000"), row.names = c(NA, 7L), class = "data.frame")
and the code:
https://gist.github.com/anonymous/750b02ad5db448d45c92a79059bf9844
Thank you for your help
Filippos
On 4 September 2016 at 19:30, Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:
Please use Reply-all to keep the mailing list in the loop. I cannot provide
private assistance, and others may provide valuable input or respond faster
than I can.
It is very common that people cannot provide the original data. That means
more work for YOU, though, not for us. It is up to you to create a small
simulated data set and process it as if it were your original data.
Your idea will indeed be a good algorithm, but you will fail in R if you don't
set it up differently. Read [1] and provide us with a reproducible example
data set and desired result and someone here will be able to show you how to
do it correctly.
[1] http://adv-r.had.co.nz/Reproducibility.html
--
Sent from my phone. Please excuse my brevity.
On September 4, 2016 8:28:39 AM PDT, Filippos Katsios
<katsi...@gmail.com> wrote:
>Dear Jeff,
>I am sorry but I am not allowed to share the original data. You are right
>about the Prod_name row. However, my goal is to split the columns "Date 1"
>etc into weekdays and weekends and manipulate them separately. I thought
>this would be the best way to do that (Assign to each day a number from 1:7
>and then splitting them by a logical vector). Thank you for your help and
>your time!
>
>Filippos
>
>On 4 September 2016 at 18:20, Jeff Newmiller <jdnew...@dcn.davis.ca.us>
>wrote:
>
>> The "c" function creates vectors. Rows of data frames are data frames, not
>> vectors.
>>
>> # check.names = FALSE keeps the names "Date 1" etc. so they match data_may
>> new_row <- data.frame( Prod_name = "Day_num", `Date 1`=1, `Date 2`=2,
>>                        `Date 3`=3, check.names = FALSE )
>> data_may <- rbind( new_row, data_may )
>>
>> Furthermore, data frames are NOT spreadsheets. "Day_num" looks
>> suspiciously UNlike a product name, which may mean the corresponding values
>> in that row are not Dates, which would also lead you into trouble.
>>
>> Please read the Posting Guide. In particular, you should read about making
>> your examples reproducible. Part of that is posting in plain text and using
>> the dput function to give us your sample data, because all too often the
>> problem lies in the details of how you have imported and manipulated your
>> data and the shortest way for us to see that the data are okay is to see it
>> as it exists in your R script so far.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On September 4, 2016 6:22:48 AM PDT, Filippos Katsios <katsi...@gmail.com>
>> wrote:
>> >Dear All,
>> >
>> >I am relatively new to R and certainly new to the e-mailing list. I need
>> >your help. I am working on a data frame, which looks like this:
>> >
>> >Prod_name | Date 1 | Date 2 | Date 3 |
>> >------------------|-------------|------------|--------------|
>> >Product 1 | 3 | 4 | 0 |
>> >------------------|-------------|------------|--------------|
>> >Product 2 | 5 | 3 | 3 |
>> >------------------|-------------|------------|--------------|
>> >Product 3 | 2 | 8 | 5 |
>> >
>> >I am trying to add a new row with the following results:
>> >
>> >Prod_name | Date 1 | Date 2 | Date 3 |
>> >------------------|-------------|------------|--------------|
>> >Day_num | 1 | 2 | 3 |
>> >------------------|-------------|------------|--------------|
>> >Product 1 | 3 | 4 | 0 |
>> >------------------|-------------|------------|--------------|
>> >Product 2 | 5 | 3 | 3 |
>> >------------------|-------------|------------|--------------|
>> >Product 3 | 2 | 8 | 5 |
>> >
>> >Below you can find the things I tried and the results.
>> >1)
>> >r <- 1
>> >newrow <- rep(1:7, 5, len=ncol(data_may)-1)
>> >insertRow <- function(data_may, newrow, r) {
>> >data_may[seq(r+1,nrow(data_may)+1),] <-
>> >data_may[seq(r,nrow(data_may)),]
>> > data_may[r,] <- newrow
>> > data_may
>> >}
>> >
>> >It doesn't put the new row.
>> >2)
>> >data_may<-rbind(data_may,c("Day_num",newrow))
>> >
>> >Error: cannot convert object to a data frame
>> >
>> >3)
>> >data_may[2093,]<-c("Day_num",rep(1:7, 5, len=ncol(data_may)-1))
>> >
>> >It makes all the columns characters and when I try to change it it says
>> >that you can change a list
>> >
>> >How can I add the row while keeping the columns (apart from the first one)
>> >as numeric or double or integer?
>> >
>> >Thank you, in advance, for your help!
>> >
>> >Kind regards
>> >Filippos
>> >
>> >
>> >______________________________________________
>> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >https://stat.ethz.ch/mailman/listinfo/r-help
>> >PLEASE do read the posting guide
>> >http://www.R-project.org/posting-guide.html
>> >and provide commented, minimal, self-contained, reproducible code.
>>
>>
---------------------------------------------------------------------------
Jeff Newmiller The ..... ..... Go Live...
DCN:<jdnew...@dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
---------------------------------------------------------------------------