Thanks for the response. I haven't checked on the status of phpcassa in a while,
but does it now work with 0.7?
That was one of the main reasons I switched to pandra; it seemed more up to date.


From: Tyler Hobbs 
Sent: Monday, March 07, 2011 2:40 AM
To: user@cassandra.apache.org 
Subject: Re: Designing a decent data model for an online music 
shop...confused/stuck on decisions


Regarding PHP performance with Cassandra, THRIFT-638 was recently resolved and 
it shows some big performance improvements.  I'll be upgrading the Thrift 
package that ships with phpcassa soon to include this fix, so you may want to 
compare performance numbers before and after.


On Sun, Mar 6, 2011 at 8:03 PM, Courtney <e-mailadr...@hotmail.com> wrote:

  We're in a bit of a predicament: we have an e-music store currently built in
  PHP using CodeIgniter/MySQL.
  The current system has 100K+ users and a decent song collection. Over the
  last few months I've been playing with Cassandra. Needless to say I'm
  impressed, but I have a few questions.
  Firstly, I want to avoid rewriting the entire site if possible, so my
  instinct is to replace the database layer in CodeIgniter. Is this something
  anyone would recommend, and are there any gotchas in doing that?

  I can't say I've been terribly happy with PHP accessing Cassandra. When
  sample data of the same size and type was put into MySQL and Cassandra
  (30K records in the table), the pages with PHP connecting to Cassandra took
  longer to load.
  I thought maybe my setup needed tweaking, and I've played with as many
  options as I could, but the best I've gotten is matching query time.
  The query speed test was simply taking timestamps right before the query
  call and right after it returned.
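
  A single before/after timestamp pair can be dominated by connection setup
  or cache warm-up. As a rough sketch only (written in Python for brevity;
  `time_query` and its parameters are hypothetical names, not part of any
  Cassandra client), repeating the call and reporting the median gives a
  steadier number to compare across backends:

```python
import statistics
import time


def time_query(run_query, repeats=20, warmup=3):
    """Time a zero-argument callable several times.

    run_query is a placeholder for whatever issues the query
    (a MySQL call, a Cassandra client call, ...).
    Returns (median, fastest, slowest) in seconds.
    """
    # Warm-up runs are discarded so connection setup does not skew results.
    for _ in range(warmup):
        run_query()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)

    return statistics.median(samples), min(samples), max(samples)
```

  The same callable-based harness can wrap each client in turn, so the
  comparison uses identical measurement code for every backend.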

  Is this something anyone else has seen? Any comments or suggestions? I've
  tried using Thrift, phpcassa and pandra, with pretty similar numbers.

  My other thought was that maybe it was the way I designed my CFs. At first I
  used super columns to model the user account CF, based on a post I read
  by Arin (WTF is a super column), but I later changed to using normal CFs.

  I'm trying to make this work, but I get the feeling my approach is
  somewhat... I don't know, misguided.

  Here's a breakdown of the current model.
  CF:Users{
      uid
      fname
      lname
      username
      password
      street
      ....
  }
  Some additional columns in place for a user but keeping it simple...
  CF:Library{
      uid
      songid
      ...
      other info about user library
  }

  CF:Songs{
      songid
      title
      artistid
  }
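
  To make the access pattern concrete, here is a minimal sketch of the three
  CFs as plain Python dictionaries (row key -> columns). It assumes uid is
  the row key for Users and Library, songid is the row key for Songs, and
  each Library column name is a songid; those choices and all sample values
  are hypothetical, not taken from the real schema.

```python
# Hypothetical sample rows; keys and values are illustrative only.
users = {
    "u1": {"fname": "Ann", "lname": "Lee", "username": "annlee"},
}

# Assumed layout: one row per user, one column per songid (a wide row).
library = {
    "u1": {"s1": "", "s2": ""},
}

songs = {
    "s1": {"title": "First Song", "artistid": "a1"},
    "s2": {"title": "Second Song", "artistid": "a1"},
}


def library_titles(uid):
    """Return the titles in a user's library.

    Costs one read of the Library row plus one Songs lookup per songid,
    which is the extra round trip a normalized layout implies.
    """
    return [songs[songid]["title"] for songid in library[uid]]
```

  Under this layout, listing a library needs a second lookup per song, which
  is the cost that copying song details into the Library row would remove.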

  This is all still very relational (considering I go on to have CFs for
  playlists and artists), and I'm not sure it's a good design for the data.
  But when I looked into combining some of the info and removing some CFs, I
  ran into the issue of replicating data all over the place. If, for example,
  I stored the artist name in the library for each record, then the artist
  name would be replicated for every song they have, for every user who has
  that song in their library....
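
  The amount of duplication described here is easy to estimate. The sketch
  below counts how many copies of each artist name a denormalized Library
  would hold; the function name, the input shape and the sample data are
  assumptions made for illustration.

```python
from collections import Counter


def artist_name_copies(library_rows):
    """Count stored copies of each artist name.

    library_rows maps uid -> list of (songid, artist_name) pairs, i.e. a
    Library where the artist name is copied into every entry.
    """
    copies = Counter()
    for entries in library_rows.values():
        for _songid, artist_name in entries:
            copies[artist_name] += 1
    return copies


# Hypothetical data: two users, one artist with two songs.
sample = {
    "u1": [("s1", "Artist A"), ("s2", "Artist A")],
    "u2": [("s1", "Artist A")],
}
print(artist_name_copies(sample))
```

  Each copy is a short string, so the storage cost is usually small; the
  harder cost is that a change to an artist name must be written to every
  copy.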

  Where do you sort of draw the line on deciding how much is okay to be 
replicated?

  As much as I dislike the idea of building the application from scratch,
  I'm considering rebuilding it in Java/JSP just to get the benefit of using
  the Hector client. (Efforts from the guys doing the PHP libs are much
  appreciated, but PHP doesn't seem to go too well with Cas.)

  I'm in the process of making decisions because the upgrade/rebuild needs to
  have a fairly steady working version by October, and I don't want to go
  wrong before even starting.

  Recommendations, suggestions and advice are all welcome. (Any experience
  with PHP and Cas. is also welcome; since all my fav. libs are in PHP, I'm
  reluctant to turn away.)



-- 
Tyler Hobbs
Software Engineer, DataStax
Maintainer of the pycassa Cassandra Python client library
