A user could rate a movie multiple times, but let's please ignore that, as it is not the case in this data. I just made up the fake data below and made a mistake.
The new column is what I want to compute using data.table and add to the original data. So the output should be:

user,mv,rating,somedate,Avg_disimilar_users
1,2,3,date,NA
1,3,4,date,NA
2,3,4,date,NA
2,4,5,date,1
3,5,1,date,NA
4,4,1,date,5
4,7,3,date,NA

Input (each row is: user, movie, rating, time):

user,mv,rating,somedate
1,2,3,date
1,3,4,date
2,3,4,date
2,4,5,date
3,5,1,date
4,4,1,date
4,7,3,date

So here, for user 2, movie 4, the new column, called avg dissimilar rating, should be 1. For user 4, movie 4, the new column should be 5.

The new column Avg_disimilar_users that I want is the average rating of the users who rated differently on the movie in question on that row of the data. When users don't have movies in common rated 4 apart, the result should be NA or empty.

Hope that helps.
Dhruv

-----Original Message-----
From: "Matthew Dowle" [[email protected]]
Date: 02/24/2012 04:43 AM
To: [email protected]
CC: [email protected]
Subject: Re: converting sql to data table with subqueries and exists clauses

Dhruv,

But user 4 voted for movie 4 twice (first 1*, then 3*). Which is something I already asked about (but you haven't addressed).

Further, you've described a new column. That implies a new value for every row. There are 7 rows but you've only stated what 2 of those 7 values should be.

Please read through this in full:
http://www.catb.org/~esr/faqs/smart-questions.html

Matthew

each row is: user, movie, rating, time.

user,mv,rating,somedate
1,2,3,date
1,3,4,date
2,3,4,date
2,4,5,date
3,5,1,date
4,4,1,date
4,4,3,date

so here for user 2, movie 4, the new column called avg dissimilar rating should be 1. for user 4, movie 4, the new column should be 5. this column is a dissimilar column.

Dhruv

_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
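
To pin down the rule for the requested column, here is a minimal sketch in plain Python (not data.table, and not anyone's posted solution). It rests on one reading of the description: for each row, take the ratings that other users gave the same movie, keep those that differ from this row's rating by at least 4, and average them; if none qualify, the value is NA. The names avg_dissimilar and MIN_GAP are made up for illustration, and treating "4 apart" as "at least 4" is an assumption (on a 1 to 5 scale it coincides with "exactly 4").

```python
from collections import defaultdict

# Fake sample rows from the message: (user, mv, rating, somedate).
rows = [
    (1, 2, 3, "date"),
    (1, 3, 4, "date"),
    (2, 3, 4, "date"),
    (2, 4, 5, "date"),
    (3, 5, 1, "date"),
    (4, 4, 1, "date"),
    (4, 7, 3, "date"),
]

# Assumption: "4 apart" means the two ratings differ by at least 4.
MIN_GAP = 4


def avg_dissimilar(rows, min_gap=MIN_GAP):
    """For each row, return the mean rating that other users gave the same
    movie, counting only ratings at least min_gap away from this row's
    rating. Rows with no such rating get None (standing in for NA)."""
    # Group (user, rating) pairs by movie so each row only scans its own movie.
    by_movie = defaultdict(list)
    for user, mv, rating, _ in rows:
        by_movie[mv].append((user, rating))

    result = []
    for user, mv, rating, _ in rows:
        others = [
            r for u, r in by_movie[mv]
            if u != user and abs(r - rating) >= min_gap
        ]
        result.append(sum(others) / len(others) if others else None)
    return result


# Print the original columns plus the new one, in the same CSV layout.
print("user,mv,rating,somedate,Avg_disimilar_users")
for row, value in zip(rows, avg_dissimilar(rows)):
    print(*row, "NA" if value is None else value, sep=",")
```

On the sample rows this gives a value only for the two movie 4 rows (user 2 and user 4), matching the expected output in the message; every other row is NA. In data.table the same idea would most likely be a self-join on mv followed by a grouped mean, but that translation is not shown here.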
