Hello list, I have some time now to try out HBase and want to use it for a private project.
Questions like "How do I transfer one-to-many or many-to-many relations from my RDBMS schema to HBase?" seem to be common. I hope we can collect all the best practices that are out there in this thread.

As the wiki states: one should create two tables, one for students and another for courses. In the students table, each row gets some columns for the student itself (name, birthday, sex etc.) plus one column per selected course, carrying the course_id. Conversely, each row in the courses table gets some columns which describe the course itself (name, teacher, begin, end, year, location etc.) plus one column per student_id.

So far, so good. How do I access these tables efficiently?

A common case would be to show all courses per student. To do so, one has to access the students table and get all of the student's course columns. Let's say their names are prefixed ids. One has to remove the prefix and then access the courses table to get all the courses and their metadata (name, teacher, location etc.).

How do I do this kind of operation efficiently? The naive, brute-force approach seems to be to use one Get object per course and fetch the necessary data. Another approach seems to be to use the HTable class and unleash the power of "multigets" via the batch() method.

All of the above is theoretical, since I have not used it in code yet (I am still learning the fundamentals of HBase). That's why I pass the question on to you: how do you do this kind of operation in HBase?

Kind regards,
Em
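
P.S. To make the question concrete, here is a minimal sketch of the two-step lookup I have in mind. It is a plain-Java model only: the two tables are in-memory maps, no HBase client is involved, and the class name, the method names and the "course:" prefix are all made up for illustration. With the real client, I assume the second step would become a list of Get objects handed to the table in a single call rather than a loop over a map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for the students and courses tables (not the HBase API).
final class CourseLookupSketch {
    // Hypothetical prefix that marks a student column as a link to a course.
    static final String COURSE_PREFIX = "course:";

    // Step 1: scan the columns of one student row and strip the prefix
    // from every link column to obtain the course row keys.
    static List<String> courseIdsOf(Map<String, String> studentRow) {
        List<String> ids = new ArrayList<>();
        for (String qualifier : studentRow.keySet()) {
            if (qualifier.startsWith(COURSE_PREFIX)) {
                ids.add(qualifier.substring(COURSE_PREFIX.length()));
            }
        }
        return ids;
    }

    // Step 2: look up all course rows in one call instead of one call per
    // course. Against a real cluster this is where one batched request
    // would replace the loop. Unknown ids are skipped.
    static Map<String, Map<String, String>> fetchCourses(
            Map<String, Map<String, String>> coursesTable, List<String> courseIds) {
        Map<String, Map<String, String>> found = new LinkedHashMap<>();
        for (String id : courseIds) {
            Map<String, String> row = coursesTable.get(id);
            if (row != null) {
                found.put(id, row);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // One student row: a plain column plus two course link columns.
        Map<String, String> student = new TreeMap<>();
        student.put("name", "Alice");
        student.put(COURSE_PREFIX + "c1", "");
        student.put(COURSE_PREFIX + "c2", "");

        // Courses table keyed by course_id.
        Map<String, Map<String, String>> courses = new TreeMap<>();
        Map<String, String> algebra = new TreeMap<>();
        algebra.put("name", "Algebra");
        algebra.put("teacher", "Smith");
        courses.put("c1", algebra);
        Map<String, String> physics = new TreeMap<>();
        physics.put("name", "Physics");
        physics.put("teacher", "Jones");
        courses.put("c2", physics);

        System.out.println(fetchCourses(courses, courseIdsOf(student)));
    }
}
```

What I would like to know is whether this "collect the ids, then fetch them together" shape is how people actually do it, or whether there is a better pattern.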