Hi Andreas,
Please post a small sample of your data, and show how you want it to look
after the manipulation.
Consider using dput() to produce the sample (see ?dput).
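
For example, a minimal sketch, assuming your data frame looks roughly like the excerpt in your message. The object name `survey`, the column name `V2` (standing in for your "V..." columns) and the values are only my guesses for illustration, not your real data:

```r
# Hypothetical three-row sample modelled on the excerpt in the question.
# The name `survey`, the column `V2` and all values are assumptions.
survey <- data.frame(
  Date     = c("22/03/1995 23:01:12", "21/03/1995 21:01:12", "1/04/1995 17:01:06"),
  V1       = c(1, 3, 3),
  V2       = c(3, 3, 2),
  VN       = c(2, 1, 1),
  Region   = c(15, 9, 3),
  Industry = c("A", "C", "B"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object, including column types.
# Paste that printed text into your e-mail so others can rebuild the data.
dput(head(survey, 3))
```

Anyone on the list can then assign the pasted text to a name and work with the same object you have.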



----------------Contact Details:-------------------------------------------------------
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
----------------------------------------------------------------------------------------------




On Tue, Jan 24, 2012 at 11:54 AM, ak13 <andreas.ka...@gmail.com> wrote:

> Hi,
>
> I am a total newbie to R so I apologize if the answer to my question is too
> obvious.  I have a data set of the following form:
>
> Date                  V1   V...   VN   Region   Industry
> 22/03/1995 23:01:12   1    3      2    15       A
> 21/03/1995 21:01:12   3    3      1    9        C
> 1/04/1995 17:01:06    3    2      1    3        B
>
> Now I would like to analyze the data in the data.frame by Region, Industry,
> and Date (I would like to collapse the whole thing to weekly data), and by
> the three different answer options {1,2,3} in V1...VN. In Stata, which I
> used before, I did this step by step with a loop over all questions
> (V1...VN):
> egen pos_`X'=total(`X'==1), by(industry week_year); egen
> pos_`X'=total(`X'==2), by(industry week_year). This step-by-step procedure
> works because Stata, even if the dates are displayed as weeks, doesn't
> aggregate the values immediately. Unfortunately there seems to be no
> command in R that works in exactly the same manner as Stata's by(). My
> most successful attempt so far to accomplish the task described above used:
>
> as.data.frame(tapply(euwifo[,1]==1, list(df$date, df$region, df$industry),
> mean))
>
> (where date is formatted as ISO-weekly %U)
> Of course I would have to loop this over all questions (20) and all
> answer options (3), but at least it gives me output with the following
> structure:
>
> .         industry.region   industry.region   industry.region   industry.region
> 10-1995   32                45                10                9
> 15-1995   2                 47                5                 6
>
> I could live with that because I could recombine the resulting data frames
> afterwards. My problem, however, is that tapply doesn't preserve the
> data frame's format as a time series (xts). This means R aggregates by time
> (week) (and industry and region), but the weeks on the x-axis are not in the
> right order. I also tried apply.weekly(), but this doesn't seem to do what
> I want.
>
> Could anyone give me a hint how I could do this? Maybe by formatting the
> data frame as time series data beforehand and preserving that format during
> the procedure. And maybe somebody also has an idea how I can avoid all
> this looping.
>
> I would appreciate it very much if one of you could give me a
> hint!
>
> Best regards,
>
> Andreas
>
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Splitting-up-large-set-of-survey-data-into-categories-tp4323327p4323327.html
> Sent from the R help mailing list archive at Nabble.com.
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
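
In the meantime, here is a rough sketch of one way to count answers per week, industry and region without tapply. It is only a guess at what you want until we see real data: the object name `survey`, the column names and the values are assumptions taken from your excerpt. It builds the week key with the year first, on the assumption that the wrong ordering comes from week-first labels such as "10-1995" being sorted as text.

```r
# Hypothetical sample; names and values are assumptions, not the real data.
survey <- data.frame(
  Date     = c("22/03/1995 23:01:12", "21/03/1995 21:01:12", "1/04/1995 17:01:06"),
  V1       = c(1, 3, 3),
  V2       = c(3, 3, 2),
  Region   = c(15, 9, 3),
  Industry = c("A", "C", "B"),
  stringsAsFactors = FALSE
)

# Parse the timestamps, then build a year-week key. With the year first,
# the keys sort chronologically even when treated as plain strings.
when <- as.POSIXct(survey$Date, format = "%d/%m/%Y %H:%M:%S", tz = "UTC")
survey$week <- format(when, "%Y-%U")

# Count how often each answer (1, 2, 3) occurs per week/industry/region
# for one question; as.data.frame() turns the table into long format.
counts_v1 <- as.data.frame(
  table(week = survey$week, industry = survey$Industry,
        region = survey$Region, answer = factor(survey$V1, levels = 1:3)),
  responseName = "n"
)

# Drop the combinations that never occur.
counts_v1 <- counts_v1[counts_v1$n > 0, ]
```

The same call could be wrapped in lapply() over the question columns instead of an explicit loop, and the long result reshaped afterwards if you need one column per industry and region.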
