On Fri, Mar 5, 2010 at 11:38 AM, N Kapshoo <nkaps...@gmail.com> wrote:
> I am looking at the student/courses hbase schema example to get my head
> wrapped around the basics. (Many-many relationship between students and
> courses).
>
> Now how do I tie this to something like
> 'get all courses for a student'?
> 'get all students that have taken a course'?
>
> The way I envision is: store all courseIds for a given studentId in a
> column in Student table. Then retrieve all of them for the given
> studentId, create the row-ids corresponding to studentId_courseId and
> then do a scan of the StudentCourse table. Am I on the right track?
>

You're on the right track, and what you are saying would work, but it
might be helpful to take the de-normalization a step further. Instead of
storing just the ids in the student's course listing, you may want to
store the basic info about a class as some serialized object. That is:

  student_id -> courses col fam -> course id qualifier -> json/bson/protobuf course info

This would allow you to do those common operations with only one fetch.
The stored value may only be basic info -- in this example, course title
and instructor. There may be info in the courses table that isn't
duplicated (e.g. required books, instructor's office info, ...).

Similarly, the courses table may store all students in the class, or at
least their names.
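
To make the layout concrete, here is a rough sketch that uses plain Python dicts as stand-ins for the two HBase tables. It does not use the HBase client at all. The table names, the `courses:` / `students:` family prefixes, the JSON fields, and the helper functions are all made up for illustration. Each "row" is just a map from `family:qualifier` to a serialized value.

```python
import json

# In-memory stand-ins for two HBase tables: row key -> {"family:qualifier": value}.
# All names below (tables, families, fields) are illustrative assumptions.
students = {}
courses = {}


def enroll(student_id, student_name, course_id, course_info):
    """Write the denormalized cells on both sides of the relationship."""
    # Student row: one cell per course, holding only the basic course info.
    summary = {
        "title": course_info["title"],
        "instructor": course_info["instructor"],
    }
    students.setdefault(student_id, {})["courses:" + course_id] = json.dumps(summary)
    # Course row: one cell per student, holding at least the student's name.
    courses.setdefault(course_id, {})["students:" + student_id] = json.dumps(
        {"name": student_name}
    )


def cells_in_family(row, family):
    """Decode every cell of one column family from a single fetched row."""
    prefix = family + ":"
    return {
        key[len(prefix):]: json.loads(value)
        for key, value in row.items()
        if key.startswith(prefix)
    }


def courses_for_student(student_id):
    # 'Get all courses for a student' is one row fetch on the student table.
    return cells_in_family(students.get(student_id, {}), "courses")


def students_in_course(course_id):
    # 'Get all students that have taken a course' is one row fetch on the course table.
    return cells_in_family(courses.get(course_id, {}), "students")


# "books" stays out of the student row: it is not part of the duplicated summary.
cs101 = {"title": "Intro to CS", "instructor": "Dr. Smith", "books": ["SICP"]}
math201 = {"title": "Linear Algebra", "instructor": "Dr. Jones", "books": []}

enroll("s1", "Ada", "cs101", cs101)
enroll("s1", "Ada", "math201", math201)
enroll("s2", "Bob", "cs101", cs101)

print(courses_for_student("s1"))
print(students_in_course("cs101"))
```

The cost of this layout is on the write side: if a course title or instructor changes, every student row that carries a copy has to be rewritten, so it fits best when the duplicated fields change rarely.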