Re: why are the files of one table so big?
I want to copy the files of one table from one cluster to another cluster. I do it in three steps: 1. bin/hadoop fs -copyToLocal on cluster A; 2. scp the files from A to B; 3. bin/hadoop fs -copyFromLocal on cluster B.

The day before yesterday I scanned the table and saved the data to a file in a format I defined myself, and that file is only 18M. It is strange that the files I copied with -copyFromLocal take 99G. Why?

2011/4/1 Jean-Daniel Cryans jdcry...@apache.org:

Depends what you're trying to do? Like I said, you didn't give us a lot of information, so we're pretty much in the dark regarding what you're trying to achieve. At first you asked why the files were so big; I don't see the relation with the log files. Also, I'm not sure why you referred to the number of versions: unless you are overwriting your data, it's irrelevant to on-disk size. Again, not enough information about what you're trying to do.

J-D

On Thu, Mar 31, 2011 at 12:27 AM, 陈加俊 cjjvict...@gmail.com wrote:

Can I skip the log files?

On Thu, Mar 31, 2011 at 2:17 PM, 陈加俊 cjjvict...@gmail.com wrote:

I found there are so many log files under the table folder, and they are very big!

On Thu, Mar 31, 2011 at 1:37 PM, 陈加俊 cjjvict...@gmail.com wrote:

Thank you JD. The type of the key is Long, and the family's versions setting is 5.

On Thu, Mar 31, 2011 at 12:42 PM, Jean-Daniel Cryans jdcry...@apache.org wrote:

(Trying to answer with the very little information you gave us.) In HBase every cell is stored along with its row key, family name, qualifier and timestamp (plus the length of each). Depending on how big your keys are, that can grow your total dataset, so it's not just a function of value sizes.

J-D

On Wed, Mar 30, 2011 at 9:34 PM, 陈加俊 cjjvict...@gmail.com wrote:

I scanned the table. It has just 29000 rows and each row is under 1K. I saved it to files totalling 18M. But the data I loaded with /app/cloud/hadoop/bin/hadoop fs -copyFromLocal takes 99G. Why?

-- Thanks Best regards jiajun
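For a feel of the per-cell overhead J-D describes, here is a rough estimate. This is a sketch only: the field widths below match the 0.90-era KeyValue on-disk layout, and the example sizes are hypothetical, not taken from this thread.

public class CellSizeEstimate {
    // On-disk layout of one cell in an HFile:
    // keyLen(4) + valLen(4) + rowLen(2) + row + famLen(1) + family
    //   + qualifier + timestamp(8) + type(1) + value
    static long approxCellBytes(int row, int family, int qualifier, int value) {
        return 4 + 4 + 2 + row + 1 + family + qualifier + 8 + 1 + value;
    }

    public static void main(String[] args) {
        // e.g. an 8-byte Long row key, 1-byte family name, 10-byte qualifier, 100-byte value:
        System.out.println(approxCellBytes(8, 1, 10, 100)); // 139 bytes, 39 of them overhead
    }
}

Even so, roughly 40 bytes of overhead per cell cannot turn 18M of values into 99G on its own, so it would be worth checking what else -copyToLocal picked up under the table directory (for instance with bin/hadoop fs -du), such as the log files mentioned in this thread or leftover files from compactions.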
I want to upgrade HBase to 0.90.x
I want to upgrade HBase from 0.20.6 to 0.90.x. Which version should I use now? And does anyone have detailed steps? -- Thanks Best regards jiajun
A question about region hot spot
In HBase 0.20.6, contiguous regions were not assigned to the same region server, so the daughters of a split were kept apart and hot spots were avoided; performance could improve. In 0.90.1, a daughter region is opened on the same region server as its parent. When a region server holds thousands of regions, a random balancer is unlikely to pick the contiguous region to move away, so that region server stays a hot spot. Should the balance method first choose contiguous regions, and only then pick at random?
Re: LZO Compression changes in 0.90 ?
Hi, I had faced the same error. I rebuilt hadoop-gpl-compression individually on all nodes and copied the files to $HBASE_HOME/lib/native as well as $HADOOP_HOME/lib/native, and it started working. Is that fine? Or should I recompile from github as suggested above?

Thanks, Hari

On Tue, Mar 29, 2011 at 1:20 PM, Todd Lipcon t...@cloudera.com wrote:

Yep. Kevin has apparently gotten too many job promotions recently, so it's mostly me maintaining it these days :) But he usually reviews and pulls my changes within a few days.

-Todd

On Mon, Mar 28, 2011 at 9:05 AM, Stack st...@duboce.net wrote:

It doesn't matter IIUC (correct me if I'm wrong, Todd). They keep up on each other's git repo changes.

St.Ack

On Mon, Mar 28, 2011 at 8:57 AM, George P. Stathis gstat...@traackr.com wrote:

BTW, which github version should I be cloning? Todd's or Kevin's?

On Mon, Mar 28, 2011 at 11:35 AM, Friso van Vollenhoven fvanvollenho...@xebia.com wrote:

Yes, things changed a bit. The compressors now have a reinit(...) method that allows HBase to reuse the objects. I believe Hadoop still works. Here's the version that should be good: https://github.com/toddlipcon/hadoop-lzo

If you're already on a recent version, then something else might have gone wrong. I think the stack trace you posted indicates that there is an earlier error in the RS logs that shows what is wrong with the compressor: "Algorithm 'lzo' previously failed test." Look for when the test happened.

Cheers, Friso

On 28 mrt 2011, at 17:01, George P. Stathis wrote:

Has anything changed with the way compression is handled in 0.90? I'm in the process of testing 0.90 CDH3B4 on my development machine (OS X). Up until the 0.89 snapshots, I have been able to re-use the same LZO jar and native libs that I compiled back in 0.20.x. With this latest upgrade, I now see this in the region server logs:

2011-03-28 10:54:51,787 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=monitors,,1287788998658.a6c3f7f6a927fda960a856c274b470c0.
java.io.IOException: Compression algorithm 'lzo' previously failed test.
    at org.apache.hadoop.hbase.util.CompressionTest.testCompression(CompressionTest.java:77)
    at org.apache.hadoop.hbase.regionserver.HRegion.checkCompressionCodecs(HRegion.java:2555)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2544)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2532)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:262)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:94)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:151)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)

Just want to make sure I'm not missing something before I start re-compiling those libs.

-GS

-- Todd Lipcon Software Engineer, Cloudera
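If it helps others hitting this: a quick way to check whether the LZO codec and its native library are visible to a given JVM is to load and exercise the codec directly. A minimal sketch, assuming the hadoop-lzo jar is on the classpath and java.library.path points at the native libs (com.hadoop.compression.lzo.LzoCodec is the class hadoop-lzo ships):

import java.io.ByteArrayOutputStream;
import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;

public class LzoLoadCheck {
    public static void main(String[] args) throws Exception {
        // Throws ClassNotFoundException if the hadoop-lzo jar is missing
        CompressionCodec codec = Class.forName("com.hadoop.compression.lzo.LzoCodec")
                .asSubclass(CompressionCodec.class).newInstance();
        ((Configurable) codec).setConf(new Configuration());
        // Fails here if the native lzo libraries can't be loaded; this is
        // roughly what HBase's CompressionTest exercises at region open
        codec.createOutputStream(new ByteArrayOutputStream()).close();
        System.out.println("lzo OK");
    }
}

Running this with the same classpath and -Djava.library.path that the region server uses tells you whether that particular JVM setup can see the codec.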
Re: A question about region hot spot
That's a good point, that one of two contiguous regions should be moved on balance. There is HBASE-3373, but I just opened an umbrella balancing issue at HBASE-3724. Would you mind adding your observation there? Thank you Gaojinchao.

St.Ack

On Fri, Apr 1, 2011 at 2:39 AM, Gaojinchao gaojinc...@huawei.com wrote:

In HBase 0.20.6, contiguous regions were not assigned to the same region server, so the daughters of a split were kept apart and hot spots were avoided... [snip]
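A sketch of the heuristic being proposed here (hypothetical code, not the actual 0.90 LoadBalancer; the types are simplified): when unloading an overloaded server, first look for a pair of regions that are contiguous in the key space, since those are likely split daughters that were opened together, and only fall back to a random pick when no such pair exists.

import java.util.Arrays;
import java.util.List;
import java.util.Random;

class BalancerSketch {
    static class Region {
        byte[] startKey, endKey;
    }

    // 'hosted' is the overloaded server's regions, sorted by start key.
    static Region pickRegionToMove(List<Region> hosted, Random rnd) {
        for (int i = 0; i + 1 < hosted.size(); i++) {
            Region a = hosted.get(i), b = hosted.get(i + 1);
            if (Arrays.equals(a.endKey, b.startKey)) {
                // a and b are contiguous: probably daughters of a split
                // that never got separated, so move one of them first
                return b;
            }
        }
        // No contiguous pair left on this server: pick at random as today
        return hosted.get(rnd.nextInt(hosted.size()));
    }
}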
What is overload?
I extracted this from a regionserver's log:

2011-04-01 19:17:22,716 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_CLOSE: cjjHTML,http://news.ifeng.com/gundong/detail_2011_03/15/5154913_0.shtml,1300245193111: Overloaded
2011-04-01 19:17:22,716 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_CLOSE: cjjHTML,http://news.ifeng.com/gundong/detail_2011_03/15/5154913_0.shtml,1300245193111: Overloaded

Why overloaded?

-- Thanks Best regards jiajun
row_counter map reduce job 0.90.1
I'm able to run this job from the hadoop machine (where the job/task tracker also runs):

/hadoop jar /home/maryama/hbase-0.90.1/hbase-0.90.1.jar rowcounter usertable

But I'm not able to run the same job from a) an hbase client machine (full hbase and hadoop installed) or b) the hbase server machines (ditto). I get:

File /home/.../hdfs/tmp/mapred/system/job_201103311630_0024/libjars/hadoop-0.20.2-core.jar does not exist.

Any idea how this jar file gets packaged / where it is looking for it?

thanks v
Re: HBase strangeness and double deletes of HDFS blocks and writing to closed blocks
Thanks for taking the time to upload all those logs, I really appreciate it.

So from the looks of it, only one region wasn't able to split during the time of those logs, and it successfully rolled back. At first I thought the data could have been deleted in the parent region, but we don't do that in the region server (it's the master that's responsible for that deletion), meaning that you couldn't lose data. Which makes me think, those rows that are missing... are they part of that region, or are they also in other regions? If it's the latter, then maybe this is just a red herring.

You say that you insert into two different families at different row keys. IIUC that means you would insert row A in family f1 and row B in family f2, and so on. And you say only one of the rows is there... I guess you don't really mean that you were inserting into 2 rows for 11 hours and one of them was missing, right? More like, all the data in one family was missing for those 1B rows? Is that right?

Thx!

J-D

On Thu, Mar 31, 2011 at 7:15 PM, Chris Tarnas c...@email.com wrote:

Thanks for your help J.D., answers inline:

On Mar 31, 2011, at 8:00 PM, Jean-Daniel Cryans wrote:

I wouldn't worry too much at the moment about what seems to be double deletes of blocks, I'd like to concentrate on the state of your cluster first. So if you run hbck, do you see any inconsistencies?

Clean, no inconsistencies, as is hadoop fsck.

In the datanode logs, do you see any exceptions regarding xcievers (just in case)?

The only place it shows up is the BlockAlreadyExistsException:

2011-03-29 22:55:12,726 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.56.24.12:50010, storageID=DS-1245200625-10.56.24.12-50010-1297226452434, infoPort=50075, ipcPort=50020):DataXceiver
org.apache.hadoop.hdfs.server.datanode.BlockAlreadyExistsException: Block blk_5380327420628627923_1036814 is valid, and cannot be written to.
    at org.apache.hadoop.hdfs.server.datanode.FSDataset.writeToBlock(FSDataset.java:1314)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:99)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:318)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:122)

In the region server logs, after the split failed, do you see a line that starts with "Successful rollback of failed split of..."? I would hope so, as the region server would have aborted in case the split failed.

Yes, the rollback was successful.

Actually, would it be possible to see bigger logs? As much as I like the deep investigation you did, I don't want to go down a single debug path when something else might have happened and we never got a chance to see it. Very often, what you think are HDFS issues can turn out to be network issues (did you do those checks?).

I didn't see any issues with the network (no errors, dropped packets, etc). Here is more of the master log: http://pastebin.com/PZqDZ1TV Regionserver: http://pastebin.com/pCDMPbB2 Datanode (had to snip some stuff to fit in pastebin): http://pastebin.com/DrWzgykH

Same for the master logs, it'd be nice to look for any weirdness.

Thx! J-D

On Thu, Mar 31, 2011 at 5:03 PM, Christopher Tarnas c...@email.com wrote:

I've been trying to track down some HBase strangeness from what looks to be lost HBase puts: in one thrift put we insert data into two different column families at different rowkeys, but only one of the rows is there. There were no errors at the client or in the thrift log, which is a little disturbing.
This is fairly well tested code that has worked flawlessly up to this point, so I started to look for problems in the HBase and Hadoop logs. We are using CDH3b4. This happened during a large 1 billion row load over 11 hours. In the regionserver logs it looks like they were having trouble talking to the datanodes during splits. I'd see worrying stuff like this in the regionserver log:

2011-03-29 22:55:12,946 INFO org.apache.hadoop.hdfs.DFSClient: Could not complete file /hbase/sequence/d13ef276819124e550bb6e0be9c5cdc8/splits/c6f102f1781897a0bd2025bd8252c3cd/core/9219969397457794945.d13ef276819124e550bb6e0be9c5cdc8 retrying...
2011-03-29 22:55:23,628 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: Running rollback of failed split of sequence,MI6Q0pMyu3mQDR7hp71RBA\x093d6ae329777234c5ae9019cf8f5cfe80-A262-1\x09,1301367397529.d13ef276819124e550bb6e0be9c5cdc8.; Took too long to split the files and create the references, aborting split
2011-03-29 22:55:23,661 WARN org.apache.hadoop.ipc.Client: interrupted waiting to send params to server
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1279)

More of the regionserver log here: http://pastebin.com/vnzHWXKT

I saw those
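For reference, the xcievers check J-D asks about concerns the datanode's transceiver limit, which heavy load is known to exhaust; the common advice of that era was to raise it on every datanode along these lines (the value 4096 is the usual recommendation, not something from this thread):

<!-- hdfs-site.xml on each datanode; note the property name's historical misspelling -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>

A datanode restart is needed for the new value to take effect.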
Re: Migration Path for 0.90 to 0.91
Data migration would only be necessary if there are file format changes, and I don't believe there are any at the moment. But you can't really be certain about that until we get closer to a release of 0.92. The coprocessor feature does break RPC protocol compatibility, so you will have to do a full HBase cluster restart when upgrading from 0.90.x to 0.92. That's about all I can say with certainty at the moment.

On Wed, Mar 30, 2011 at 12:35 PM, Nichole Treadway kntread...@gmail.com wrote:

No, I was wondering about 0.91 actually. We are interested in the Coprocessor code there, but have a development cluster running 0.90.1 right now.

On Wed, Mar 30, 2011 at 3:27 PM, Jean-Daniel Cryans jdcry...@apache.org wrote:

Are you talking about 0.90.1? 0.91.0 is far from being released, so I guess it can't be it. 0.90.1 is a minor revision; it's supposed to be forward and backward compatible, so code compiled against 0.90.0 should work on 0.90.1 (if not, then it's a bug).

J-D

On Wed, Mar 30, 2011 at 12:22 PM, Nichole Treadway kntread...@gmail.com wrote:

All, Is there a migration path for upgrading between 0.90 and 0.91? Do we need to do data migration?

Thank you, Nichole
Re: row_counter map reduce job 0.90.1
Yeah, I tried that as well as what Ted suggested. It still can't find the hadoop jar. Plain Hadoop map reduce jobs work fine; it's just HBase map reduce jobs that fail with this error.

tx

-----Original Message-----
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Fri, Apr 1, 2011 12:39 pm
Subject: Re: row_counter map reduce job 0.90.1

Does where you are running from have a build/classes dir and a hadoop-0.20.2-core.jar at top level? If so, try cleaning out the build/classes. Also, you could try something like this:

HADOOP_CLASSPATH=/home/stack/hbase-0.90.2-SNAPSHOT/hbase-0.90.2-SNAPSHOT-tests.jar:/home/stack/hbase-0.90.2-SNAPSHOT/hbase-0.90.2-SNAPSHOT.jar:`/home/stack/hbase-0.90.2-SNAPSHOT/bin/hbase classpath` ./bin/hadoop jar /home/stack/hbase-0.90.2-SNAPSHOT/hbase-0.90.2-SNAPSHOT.jar rowcounter usertable

... only make sure the hadoop jar is in HADOOP_CLASSPATH. But you shouldn't have to do the latter at least. Compare where it works to where it doesn't. Something is different.

St.Ack

On Fri, Apr 1, 2011 at 9:26 AM, Venkatesh vramanatha...@aol.com wrote:

Definitely yes. It's all referenced in the -classpath option of the JVMs of the tasktracker/jobtracker/datanode/namenode, and the file does exist on the cluster. But the error I get is on the client:

File /home/hdfs/tmp/mapred/system/job_201103311630_0027/libjars/hadoop-0.20.2-core.jar does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
    at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:509)
    at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:629)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:761)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)

So, in theory it shouldn't expect it on the client... correct? This is the only thing that is stopping me from moving to 0.90.1.

-----Original Message-----
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Fri, Apr 1, 2011 12:19 pm
Subject: Re: row_counter map reduce job 0.90.1

On Fri, Apr 1, 2011 at 9:06 AM, Venkatesh vramanatha...@aol.com wrote:

I'm able to run this job from the hadoop machine (where job task tracker also runs)... [snip]

Is that jar present on the cluster?

St.Ack
Re: Modelling threaded messages
Ted, this is a pretty clever idea.

On Thu, Mar 31, 2011 at 9:27 PM, Ted Dunning tdunn...@maprtech.com wrote:

Solr/Elastic search is a fine solution, but probably won't be quite as fast as a well-tuned HBase solution.

One key assumption you seem to be making is that you will store messages only once. If you are willing to make multiple updates to tables, then you can arrange the natural ordering of the table to get what you want. For instance, you could keep the most recent messages (say the last 10 from each of the 1000 most recently updated threads) in an in-memory table. Then you could store messages in a thread table indexed by thread:timestamp. Finally, you could store messages in a table indexed by user:thread or user:timestamp. This would allow you to display the most recent messages or threads in near zero time, to display all or the most recent messages from a particular thread with only one retrieval, and all of the messages from a particular user in time order in one retrieval.

On Thu, Mar 31, 2011 at 5:56 PM, Mark Jarecki mjare...@bigpond.net.au wrote:

Hi all, I'm modelling a schema for storing and retrieving threaded messages, where, for planning purposes:
- there are many millions of users.
- a user might have up to 1000 threads.
- each thread might have up to 5 messages (with some threads being sparse with only a few messages).
- the Stargate REST interface is used.

I want to be able to execute the following queries:
- retrieve x latest active threads, with the latest message.
- retrieve x latest active threads, with the latest message, offset by y.
- retrieve x latest messages from a thread.
- retrieve x latest messages from a thread, offset by y.

I've come up with a few possible methods for modelling this, but any insights would be greatly appreciated. Thanks in advance, Mark

Possible solution 1:

TABLE: threads  KEY: userID : threadID  COLUMN: latest_message
TABLE: messages KEY: userID : threadID : timestamp  COLUMN: message

Messages are first written to the messages table, and then the thread's row in the threads table is updated with the latest message. To fetch the latest x active threads, with the latest message, I retrieve all threads and then sort and reduce the results on the client. A concern with this is the fetching of all threads to sort on each request. This could be unwieldy!

Possible solution 2:

TABLE: threads  KEY: userID : timestamp : threadID  COLUMN: latest_message
TABLE: messages KEY: userID : threadID : timestamp  COLUMN: message

Messages are first written to the messages table, and then the threads table is updated with the latest message. The previous latest message is then deleted from the threads table. To fetch the latest x active threads, with the latest message, I scan the threads table until I get x unique threads. A concern with this could be the issue of keeping the threads table in sync with the messages table - especially with the deletion of old latest messages.

Possible solution 3:

TABLE: messages KEY: userID : timestamp : threadID  COLUMN: message

To fetch the latest x active threads, with the latest message, I scan the messages table until I get x unique threads. One of my concerns with this method is that some threads will be busier than others, forcing a scan through nearly all of a user's messages. And there will be an ever increasing number of messages. A periodic archiving process - moving older messages to another table - might alleviate things here.

Possible solution 4: Use SOLR/Elastic search or equivalent.
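A minimal sketch of solution 2's read path with the Java client of that era (the table and column names are Mark's; the single family "f", the zero-padded reversed timestamp in the row key, and the helper names are assumptions added here so a plain forward scan returns the newest threads first):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class LatestThreads {
    // userID : reversed, zero-padded timestamp : threadID -- newest sorts first
    static byte[] threadRow(String user, long ts, String thread) {
        return Bytes.toBytes(user + ":" + String.format("%019d", Long.MAX_VALUE - ts)
                + ":" + thread);
    }

    static void printLatestThreads(Configuration conf, String user, int x)
            throws IOException {
        HTable threads = new HTable(conf, "threads");
        // ':' is 0x3A and ';' is 0x3B, so this brackets exactly one user's rows
        Scan scan = new Scan(Bytes.toBytes(user + ":"), Bytes.toBytes(user + ";"));
        scan.setCaching(x);
        ResultScanner scanner = threads.getScanner(scan);
        try {
            int n = 0;
            for (Result r : scanner) {
                System.out.println(Bytes.toString(r.getRow()) + " => "
                        + Bytes.toString(r.getValue(Bytes.toBytes("f"),
                                Bytes.toBytes("latest_message"))));
                if (++n >= x) break; // only the x most recently active threads
            }
        } finally {
            scanner.close();
        }
    }
}

The "offset by y" variants fall out of the same scan by skipping the first y rows, at the cost of reading them.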
Re: Modelling threaded messages
Have you considered using Cassandra?

Kevin

On Fri, Apr 1, 2011 at 11:01 PM, M. C. Srivas mcsri...@gmail.com wrote:

Ted, this is a pretty clever idea. [snip]
Re: Modelling threaded messages
Not original with me, I have to admit. Some of the ideas are best described in the OpenTSDB descriptions.

On Fri, Apr 1, 2011 at 8:01 PM, M. C. Srivas mcsri...@gmail.com wrote:

Ted, this is a pretty clever idea. [snip]
Re: Modelling threaded messages
Depending on the speed requirements associated with retrieving bunches of messages, HBase may have a real edge here. This is a special problem in that there are common query patterns that allow contiguous reads of lots of data. That gives a huge advantage to systems like HBase that store data organized by key. You might view it as the karmic opposite of the common hot-spotting problem due to storing elements by time-stamp.

On Fri, Apr 1, 2011 at 8:10 PM, Kevin Apte technicalarchitect2...@gmail.com wrote:

Have you considered using Cassandra? Kevin [snip]
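For completeness, a matching write path for solution 2, reusing threadRow() from the earlier sketch (still hypothetical code; the delete of the previous "latest" row is exactly the sync hazard Mark points out, since a failure between these steps leaves a stale threads entry):

import java.io.IOException;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PostMessage {
    static void postMessage(HTable messages, HTable threads, String user,
            String thread, long prevTs, long ts, byte[] msg) throws IOException {
        byte[] f = Bytes.toBytes("f"); // hypothetical single column family

        // 1. Append the message itself, keyed userID:threadID:timestamp
        //    (epoch-ms timestamps stay 13 digits until 2286, so they sort as strings)
        Put m = new Put(Bytes.toBytes(user + ":" + thread + ":" + ts));
        m.add(f, Bytes.toBytes("message"), msg);
        messages.put(m);

        // 2. Write the thread's new "latest" entry
        Put t = new Put(LatestThreads.threadRow(user, ts, thread));
        t.add(f, Bytes.toBytes("latest_message"), msg);
        threads.put(t);

        // 3. Remove the previous "latest" entry for this thread
        threads.delete(new Delete(LatestThreads.threadRow(user, prevTs, thread)));
    }
}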