Re: Data Model

aaron morton Fri, 14 Sep 2012 02:01:26 -0700

> Consider a course_students col family which gives a list of students for a 
> course


I would use two CF's:

Course CF:
        * Each row is one course
        * Columns are the properties and values of the course

CourseEnrolements CF
        * Each row is one course
        * Column name is the student ID. 
        * Column value may be blank or some useful value. 

Hope that helps. 

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/09/2012, at 3:19 PM, Michael Morris <michael.m.mor...@gmail.com> wrote:

> I'm fairly new to Cassandra myself, but had to solve a similar problem.  If 
> ordering of the student number values is not important to you, you can store 
> them as UTF8 values (Ascii would work too, may be a better choice?), and the 
> resulting columns would be sorted by the lexical ordering of the numeric 
> values (as opposed to numeric sorting), so it would be 1, 10, 11, 2, 3, 31, 
> 5...
> 
> In my situation, I had a document, composed of digitized images of the pages. 
>  My row key is the doc id number, and there were document level attributes, 
> as well as page level attributes I wanted to capture in the row.  So in my 
> model, I used composite columns of 3 UTF8 values.  The first is an attribute 
> descriptor, I used 'a' to indicate a document level attribute, and 'p' as a 
> page level attribute.  The second composite value depends on the 1st, for 'a' 
> types, the 2nd value is the actual attribute identifier (ex, form type, 
> scanner number, etc...).  For 'p' types, it refers to the page number.  
> Having Cassandra preserve the order of the page numbers is not a priority for 
> me, so I can deal with the page number being sorted in String order in the 
> database, I'll deal with numerical sorting that in the app logic (since the 
> largest documents we process are only about 100 pages long).  The 3rd 
> composite value is empty for all 'a' types, and for 'p' types, it refers to a 
> page level attribute (page landmark information, image skew angle, location 
> on disk, etc...).
> 
> As an example:
> 
> key1 => a:form=x, a:scanner=1425436, p:1:skew=0.0042142, p:1:file=page1.png, 
> p:2:skew=0.0042412, p:2:file=page2.png
> key2 => a:form=q, a:scanner=935625, p:1:skew=0.00032352, p:1:file=other1.png, 
> p:2:skew=:0.0002355, p:2:file=other2.png
> 
> It's been working well for me when using the Hector client.
> 
> Thanks,
> 
> Mike
> 
> On Thu, Sep 13, 2012 at 12:59 PM, Soumya Acharya <cse.sou...@gmail.com> wrote:
> I just started learning Cassandra any suggestion where to start with ??
> 
> 
> Thanks
> Soumya 
> 
> On Thu, Sep 13, 2012 at 10:54 AM, Roshni Rajagopal 
> <roshni_rajago...@hotmail.com> wrote:
> I want to learn how we can model a mix of static and dynamic columns in a 
> family.
> 
> Consider a course_students col family which gives a list of students for a 
> course
> with row key- Course Id
> Columns - Name, Teach_Nm, StudID1, StudID2, StudID3
> Values - Maths, Prof. Abc, 20,21,25 
> where 20,21,25 are IDs of students.
> 
> We have
> fixed columns like Course Name, Teacher Name, and a dynamic number of
> 
> columns like 'StudID1', 'StudID2' etc, and my thoughts were that we could
> 
> look for 'StudID' and get all the columns with the student Ids in Hector. But 
> the
> 
> question was how would we determine the number for the column, like to add
> 
> StudID3 we need to read the row and identify that 2 students are there,
> 
> and this is the third one.
> 
> 
> 
> So we can remove the number in the column name, altogether and keep
> 
> columns like Course Name, Teacher Name, Student:20,Student:21, Student:25,
> 
> where the second part is the actual student id. However here we run into
> 
> the second issue that we cannot have some columns of a composite format
> 
> and some of another format, when we use static column families- all
> 
> columns would need to be in the format UTF8:integer We may want to treat
> 
> it as a composite column key and not use a delimiter- to get sorting,
> 
> validate the types of the parts of the key, not have to search for the
> 
> delimiter and separate the 2 components  manually etc. 
> 
> 
> 
> A third option is to put only data in the column name for students like
> 
> Course Name, Teacher Name, 20,21,25 - it would be difficult to identify
> 
> that columns with name 20, 21, 25 actually stand for student names - a bit
> 
> unreadable.
> 
> 
> 
> I hope this is not confusing, and would like to hear your thoughts on 
> this.The question is
> 
> around when you de-normalize & want to have some static info like name ,
> 
> and a dynamic list - whats the best way to model this.
> 
> 
> 
> Regards,
> 
> Roshni
> 
> 
> 
> 
> -- 
> Regards and Thanks 
> Soumya Kanti Acharya
>

Re: Data Model

Reply via email to