Hi Matthew, If it were me, I would want to make the rowkey the player name, "MyPlayerName2009" in the hbase table. And all other *Id names i would use as the column families: playerAchievement: , achievement: , guild:, player: Within each column family, you have infinite number of columns to use (dynamically created). And you could insert/update rowkeys with cell data like: rowkey = 'MyPlayerName2009' ( guild:level = "Beginner", player:address = "123 main street", achievement:some_new_column = "some new value", etc.. )
HTH On Tue, Mar 31, 2009 at 12:20 PM, Matthew Runo <[email protected]>wrote: > Hello folks! > > I'm running a site (http://www.realmranker.com) that works on medium sized > data sets from the World of Warcraft armory. I'm currently doing everything > in MySQL on a "Large" EC2 instance, but hitting some budgetary limits in > how > much more data I can put there. > > First, here is my current MySQL architecture: > > player - table that holds the player data, currently about 500,000 rows > (expecting 2,000,000+ rows long term) > achievement - table that holds the root achievement data, currently a few > 1000 rows, not expecting all that much growth > player_achievement - table that maps a player to all the achievement's > they've completed. Currentl about 60,000,000 rows, expecting 250,000,000+ > long term > > Here is the layout of player_achievment right now: > CREATE TABLE `player_achievement` ( > `playerAchievementId` int(11) unsigned NOT NULL AUTO_INCREMENT, > `playerId` int(11) unsigned NOT NULL, > `achievementId` int(11) unsigned NOT NULL, > `guildId` int(11) unsigned DEFAULT NULL, > `dateCompleted` varchar(100) NOT NULL, > PRIMARY KEY (`playerAchievementId`), > UNIQUE KEY `playerId` (`playerId`,`achievementId`), > KEY `playerId_2` (`playerId`), > KEY `achievementId` (`achievementId`), > KEY `guildId` (`guildId`), > KEY `dateCompleted` (`dateCompleted`) > ) > > I'd like to move this table into HBase, since it's simply getting too large > for the amount of server resources I can afford. I assume that HBase would > preform faster than MySQL on my (small by HBase standards) dataset. > > My question is simply what is the best way to do this? I'm assuming > something like this: > > player_achievement table > achievementId row > players family > player1, player2, player3.....playerN columns > date completed would be the data under each column > > I need to do the following types of queries: > - What players have a given achievement? > - What achievements does a give player have? > - What achievements does a given player NOT have? > - What are all the possible achievements? > - Who are the first 25 people to have a given achievement? > > I'm sorry for the really high level of this email - I'm still getting up to > speed on HBase, and not quite sure it's exactly the answer to my prayers > that I think it might be.. > > --Matthew Runo >
