Re: hbase data

2012-05-31 Thread Andrew Nguyen
for later processing. We are working with medical/physiological sensor data. -- Andrew Nguyen On Tuesday, May 29, 2012 at 10:13 AM, Josh Patterson wrote: unless you need low latency access to all of this time series, it might be a more cost efficient path to store large archives

Re: HBase storage sizing

2010-08-06 Thread Andrew Nguyen
With respect to the comment below, I'm trying to determine what the minimum IO requirements are for us... For any given value being stored into HBase, is it accurate to calculate the size of the row key, family, qualifier, timestamp, and value and use their sum as the amount of data that needs to
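As a rough illustration of the sizing question above, the sketch below compares the plain sum of the five fields with the size of one serialized cell. It assumes the classic HBase KeyValue layout (4-byte key length, 4-byte value length, 2-byte row length, 1-byte family length, 8-byte timestamp, 1-byte key type); the function names and the sample row, family, and qualifier are hypothetical, and HFile block, index, compression, and HDFS replication overheads are deliberately left out.

```python
# Sketch only: per-cell size under the classic KeyValue layout (an assumption,
# not something stated in the thread). All names here are hypothetical.
KEY_LEN_FIELD = 4      # int holding the key length
VALUE_LEN_FIELD = 4    # int holding the value length
ROW_LEN_FIELD = 2      # short holding the row key length
FAMILY_LEN_FIELD = 1   # byte holding the family length
TIMESTAMP_BYTES = 8    # long timestamp
KEY_TYPE_BYTES = 1     # Put/Delete marker


def naive_size(row: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Sum of row key, family, qualifier, timestamp, and value, as in the question."""
    return len(row) + len(family) + len(qualifier) + TIMESTAMP_BYTES + len(value)


def keyvalue_size(row: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Serialized size of one cell, including the fixed length and type fields."""
    key_len = (ROW_LEN_FIELD + len(row) + FAMILY_LEN_FIELD + len(family)
               + len(qualifier) + TIMESTAMP_BYTES + KEY_TYPE_BYTES)
    return KEY_LEN_FIELD + VALUE_LEN_FIELD + key_len + len(value)


if __name__ == "__main__":
    cell = (b"row-0001", b"d", b"hr", b"\x00" * 8)  # made-up sample cell
    print(naive_size(*cell), keyvalue_size(*cell))
```

Under these assumptions the fixed overhead is 12 bytes per cell on top of the plain sum, so short values and long row keys are dominated by key bytes rather than by the payload.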

Re: HBase minimum block size for sequential access

2010-07-28 Thread Andrew Nguyen
So, I ran the following commands in the shell: alter 'tablename', {NAME => 'cfname', BLOCKSIZE => 1045876} major_compact 'tablename' How do I know the major compact completed successfully? I saw that the number of regions has grown quite a bit but I'm not quite sure how to know when it's all finished

Zero-copy reads

2010-07-27 Thread Andrew Nguyen
Hello all, I recently saw some references to zero copy reads in Lars' blog post as well as some powerpoints, jira comments, etc. Is there any additional information available on this topic? I saw some comments in jira that mentioned the loss of zero copy reads, while others mention that it's

Re: HBase minimum block size for sequential access

2010-07-27 Thread Andrew Nguyen
Perfect, thanks. I will run some experiments and keep you posted. Aside from just getting elapsed time on scans of various sizes, are there any other tips on what sorts of measurements to perform? Also, since I'm doing the experiments with various block sizes anyway, any requests for other
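One possible shape for such an experiment is sketched below: time the same scan several times and report the spread and throughput, not just a single elapsed time. This is only a sketch; `scan_fn` is a hypothetical callable standing in for whatever client code performs the scan, and it is assumed to return the number of rows and bytes it read.

```python
import statistics
import time


def time_scan(scan_fn, repeats=5):
    """Run scan_fn `repeats` times and summarise the timings.

    scan_fn is a hypothetical zero-argument callable that performs one scan
    and returns a (rows_read, bytes_read) tuple.
    """
    elapsed = []
    rows = nbytes = 0
    for _ in range(repeats):
        start = time.perf_counter()
        rows, nbytes = scan_fn()
        elapsed.append(time.perf_counter() - start)
    median = statistics.median(elapsed)
    return {
        "rows": rows,
        "bytes": nbytes,
        "min_s": min(elapsed),
        "median_s": median,
        "max_s": max(elapsed),
        # Guard against a zero median on very fast fake scans.
        "rows_per_s": rows / median if median > 0 else float("inf"),
        "bytes_per_s": nbytes / median if median > 0 else float("inf"),
    }


if __name__ == "__main__":
    def fake_scan():
        # Stand-in for a real scan; pretends to read 1000 rows of 64 bytes.
        time.sleep(0.01)
        return 1000, 64000

    print(time_scan(fake_scan, repeats=3))
```

Repeating each run and recording min, median, and max makes cache warm-up effects visible, which matters when comparing block sizes.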

Re: HBase minimum block size for sequential access

2010-07-27 Thread Andrew Nguyen
Never mind my last question. I thought BLOCKSIZE was a table attribute but it is specific to a column family. On Jul 27, 2010, at 1:19 PM, Andrew Nguyen wrote: I just attempted to change the blocksize and it doesn't seem to be taking. I am doing the following in the shell: alter

Re: Running an jython import job

2010-07-23 Thread Andrew Nguyen
, and increased the buffer. Anything else I should consider? Thanks! --Andrew -- Andrew Nguyen and...@ucsfcti.org The information contained in this electronic message and any attachments to this message are intended for the exclusive use of the addressee(s) and may contain confidential

Re: Running an jython import job

2010-07-23 Thread Andrew Nguyen
Thanks for the info. I actually used that blog post as a starting point for my work with jython. I will also take a look at the bulk loading you referenced below. We are currently only doing single-cf imports. --Andrew -- Andrew Nguyen and...@ucsfcti.org

Re: Modeling column families

2010-06-04 Thread Andrew Nguyen
are the downsides to having hundreds of different tables that have the same schema otherwise? Thanks! --Andrew -- Andrew Nguyen and...@ucsfcti.org