Thanks so much !
At 12:21 on 24 April 2012, Devaraj k <devara...@huawei.com> wrote:

> Hi Lac,
>
> As I understand your problem description, you need to do the following:
>
> 1. Mapper: Write a mapper that reads records from the input files and
> converts them into keys and values. The key should contain the teacher
> id, class id and number of students; the value can be empty (or null).
> 2. Partitioner: Write a custom partitioner that sends all the records
> for a teacher id to one reducer.
> 3. Grouping Comparator: Write a comparator that groups the records by
> teacher id.
> 4. Sorting Comparator: Write a comparator that sorts the records by
> teacher id and number of students.
> 5. Reducer: In the reducer, you will get the records for each teacher
> one after another, already sorted (by number of students) within a
> teacher id. You can keep as many top records as you want in the reducer
> and write them out at the end.
>
> You can refer to this doc for reference:
> http://www.inf.ed.ac.uk/publications/thesis/online/IM100859.pdf
>
> Thanks
> Devaraj
>
> ________________________________________
> From: Lac Trung [trungnb3...@gmail.com]
> Sent: Tuesday, April 24, 2012 10:11 AM
> To: common-user@hadoop.apache.org
> Subject: Re: Determine the key of Map function
>
> Ah, as I said before, I have no experience programming MapReduce. So,
> can you give me some documents or websites or something about programming
> the things you described above? ("Thousand things start hard" - VietNam)
> Thanks so much ^^!
>
> At 10:54 on 24 April 2012, Lac Trung <trungnb3...@gmail.com> wrote:
>
> > Thanks Jay so much !
> > I will try this.
> > ^^
> >
> > At 10:52 on 24 April 2012, Jay Vyas <jayunit...@gmail.com> wrote:
> >
> >> Ahh... Well then the key will be teacher, and the value will simply be
> >>
> >> <-1 * # students, class_id> .
> >>
> >> Then, you will see in the reducer that the first 3 entries will always
> >> be the ones you wanted.
> >>
> >> On Mon, Apr 23, 2012 at 10:17 PM, Lac Trung <trungnb3...@gmail.com>
> >> wrote:
> >>
> >> > Hi Jay !
> >> > I think it's a bit different here. I want to get the 30 classIds with
> >> > the most students for each teacherId.
> >> > For example: get 3 classIds.
> >> > (File1)
> >> > 1) Teacher1, Class11, 30
> >> > 2) Teacher1, Class12, 29
> >> > 3) Teacher1, Class13, 28
> >> > 4) Teacher1, Class14, 27
> >> > ... n ...
> >> >
> >> > n+1) Teacher2, Class21, 45
> >> > n+2) Teacher2, Class22, 44
> >> > n+3) Teacher2, Class23, 43
> >> > n+4) Teacher2, Class24, 42
> >> > ... n+m ...
> >> >
> >> > => return the 3 lines 1, 2, 3 for Teacher1 and lines n+1, n+2, n+3
> >> > for Teacher2
> >> >
> >> > At 09:52 on 24 April 2012, Jay Vyas <jayunit...@gmail.com> wrote:
> >> >
> >> > > It's somewhat tricky to understand exactly what you need from your
> >> > > explanation, but I believe you want the teachers who have the most
> >> > > students in a given class. So for English, I have 10 teachers
> >> > > teaching the class, and I want the ones with the highest # of
> >> > > students.
> >> > >
> >> > > You can output key=<classid> and value=<-1*#ofstudent,teacherid>.
> >> > >
> >> > > The values will then be sorted by # of students. You can thus pick
> >> > > the teacher in the first value of your reducer, and that will be the
> >> > > teacher for class id = xyz with the highest number of students.
> >> > >
> >> > > You can also be smart in your mapper by running a combiner to remove
> >> > > the teacherids who are clearly not maximal.
> >> > >
> >> > > On Mon, Apr 23, 2012 at 9:38 PM, Lac Trung <trungnb3...@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Hello everyone !
> >> > > >
> >> > > > I have a problem with MapReduce [:(] like this:
> >> > > > I have 4 input files with 3 fields: teacherId, classId,
> >> > > > numberOfStudent (numberOfStudent is in descending order for each
> >> > > > teacher).
> >> > > > The output is the top 30 classIds with the highest numberOfStudent
> >> > > > for each teacher.
> >> > > > My approach is MapReduce like the Wordcount example. But I don't
> >> > > > know how to determine the key for the map function.
> >> > > > I ran the Wordcount example and understand its code, but I have no
> >> > > > experience programming MapReduce.
> >> > > >
> >> > > > Can anyone help me to resolve this problem ?
> >> > > > Thanks so much !
> >> > > >
> >> > > > --
> >> > > > Lạc Trung
> >> > > > 20083535
> >> > >
> >> > > --
> >> > > Jay Vyas
> >> > > MMSB/UCHC
> >> >
> >> > --
> >> > Lạc Trung
> >> > 20083535
> >>
> >> --
> >> Jay Vyas
> >> MMSB/UCHC
> >
> > --
> > Lạc Trung
> > 20083535
>
> --
> Lạc Trung
> 20083535

--
Lạc Trung
20083535
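
For anyone reading this thread later, here is a minimal sketch of the five steps Devaraj lists (mapper, partitioner, grouping comparator, sorting comparator, reducer). It is a plain-Python simulation and not the Hadoop Java API, so none of these functions exist in Hadoop: map_record, partition, sort_key, reduce_top_n and run are hypothetical names, and the two-reducer setup is an assumption made only for illustration. The sort is assumed to be descending on number of students, which has the same effect as Jay's -1 * #students trick. The count is placed in the key because Hadoop sorts keys during the shuffle and does not guarantee any order among the values of a key.

```python
from itertools import groupby, islice

TOP_N = 3  # the thread asks for 30; 3 matches Lac's small example


def map_record(line):
    """Mapper: parse 'teacher, class, students' into a composite key; value is empty."""
    teacher, class_id, students = [field.strip() for field in line.split(",")]
    return (teacher, int(students), class_id), None


def partition(key, num_reducers):
    """Partitioner: route on teacher id only, so one reducer sees all of a teacher's rows."""
    return hash(key[0]) % num_reducers


def sort_key(key):
    """Sorting comparator: teacher id ascending, then number of students descending."""
    teacher, students, class_id = key
    return (teacher, -students, class_id)


def reduce_top_n(sorted_keys, n=TOP_N):
    """Grouping comparator + reducer: group on teacher id, keep the first n rows of each group."""
    for teacher, group in groupby(sorted_keys, key=lambda k: k[0]):
        for _, students, class_id in islice(group, n):
            yield teacher, class_id, students


def run(lines, num_reducers=2):
    """Simulate map -> partition -> sort -> reduce in a single process."""
    buckets = [[] for _ in range(num_reducers)]
    for line in lines:
        key, _ = map_record(line)
        buckets[partition(key, num_reducers)].append(key)
    results = []
    for bucket in buckets:
        results.extend(reduce_top_n(sorted(bucket, key=sort_key)))
    # Order the combined output only to make it easy to read and compare.
    return sorted(results, key=lambda row: (row[0], -row[2]))


# Lac's example rows, deliberately out of order to show that the sort does the work.
SAMPLE = [
    "Teacher1, Class14, 27",
    "Teacher2, Class22, 44",
    "Teacher1, Class11, 30",
    "Teacher2, Class24, 42",
    "Teacher1, Class13, 28",
    "Teacher2, Class21, 45",
    "Teacher1, Class12, 29",
    "Teacher2, Class23, 43",
]

if __name__ == "__main__":
    for row in run(SAMPLE):
        print(row)
```

In a real Hadoop job the same roles would be filled by a custom key type, a Partitioner, and the job's sort and grouping comparators; the sketch only shows how the pieces fit together, not how to configure them.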