Thank you. That makes perfect sense.

I guess following your guideline, I would have 2 options:
student_id -> courses col fam -> course id qualifier -> json/bson/protobuf course info

OR

student_id -> courses col fam -> course id qualifier -> course year
student_id -> courses col fam -> course id qualifier -> course status
etc
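
To make the two options concrete, here is a rough sketch in Python. It uses
plain dicts as a stand-in for a single student row and is not a real HBase
client. The composite qualifier "<course_id>:<field>" in the second option and
the sample course "cs101" are my own assumptions for illustration; the helper
names are hypothetical.

```python
import json

# Hypothetical in-memory stand-in for one HBase row:
# {column_family: {qualifier: value_bytes}}. Not a real HBase client.

def blob_row(courses):
    """Option 1: one cell per course, value is a serialized object."""
    return {"courses": {cid: json.dumps(info, sort_keys=True).encode()
                        for cid, info in courses.items()}}

def flat_row(courses):
    """Option 2: one cell per course field (assumed qualifier '<course_id>:<field>')."""
    cells = {}
    for cid, info in courses.items():
        for field, value in info.items():
            cells[f"{cid}:{field}"] = str(value).encode()
    return {"courses": cells}

def set_status_blob(row, cid, status):
    # Changing one field means reading, modifying and rewriting the whole object.
    info = json.loads(row["courses"][cid])
    info["status"] = status
    row["courses"][cid] = json.dumps(info, sort_keys=True).encode()

def set_status_flat(row, cid, status):
    # Only the affected cell is written.
    row["courses"][f"{cid}:status"] = status.encode()

# Made-up sample data for illustration only.
courses = {"cs101": {"year": 2010, "status": "enrolled"}}
blob = blob_row(courses)
flat = flat_row(courses)
set_status_blob(blob, "cs101", "completed")
set_status_flat(flat, "cs101", "completed")
```

In this sketch the serialized option keeps one cell per course but needs a
read-modify-write to change a single field, while the flattened option writes
only the cell that changed at the cost of more cells per course.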

Basically, what considerations should I look at to decide whether to
flatten the JSON object into separate cells or store it as one big object?

Thanks.


On Fri, Mar 5, 2010 at 4:19 PM, Kevin Peterson <kpeter...@biz360.com> wrote:

> On Fri, Mar 5, 2010 at 11:38 AM, N Kapshoo <nkaps...@gmail.com> wrote:
>
> > I am looking at the student/courses hbase schema example to get my head
> > wrapped around the basics. (Many-many relationship between students and
> > courses).
> >
> > Now how do I tie this to something like
> > 'get all courses for a student'?
> > 'get all students that have taken a course'?
> >
> > The way I envision it is: store all courseIds for a given studentId in a
> > column in the Student table. Then retrieve all of them for the given
> > studentId, create the row-ids corresponding to studentId_courseId and then
> > do a scan of the StudentCourse table. Am I on the right track?
> >
>
> You're on the right track, and what you are saying would work, but it might
> be helpful to take the de-normalization a step further.  Instead of storing
> just the ids in the student's course listing, you may want to store the
> basic info about a class as some serialized object. That is:
>
> student_id -> courses col fam -> course id qualifier -> json/bson/protobuf course info
>
> This would allow you to do those common operations with only one fetch. This
> may only be basic info -- in this example, course title and instructor. There
> may be info in the courses table that isn't duplicated (e.g. required books,
> instructor's office info, ...).
>
> Similarly, courses may store all students in the class, or at least their
> names.
>
