You could take a look at the files in HDFS.  Use the HFile tool to
look at one of the HBase StoreFiles/HFiles.  See
http://people.apache.org/~stack/hbase-0.90.0-candidate-2/docs/ch08s02.html#hfile_tool
for how to use it.  Notice how each cell entry includes the
row+column+timestamp.  Are you using LZO?  That'd probably tidy up
your space in the filesystem some.
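
To put rough numbers on the row+column+timestamp point, here is a minimal sketch in plain Java (no HBase dependency) that adds up the key bytes stored alongside every cell. It assumes the KeyValue layout of a 4-byte key length, 4-byte value length, 2-byte row length, 1-byte family length, 8-byte timestamp and 1-byte type, plus the row, family and qualifier bytes. The 8-digit account number and the sample UUID are made up for illustration; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class CellOverheadEstimate {
    // Fixed bytes per cell: key length (4) + value length (4) + row length (2)
    // + family length (1) + timestamp (8) + key type (1).
    static final int FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1;

    static int len(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes stored for one cell, not counting the value itself.
    static long cellOverhead(String row, String family, String qualifier) {
        return FIXED_OVERHEAD + len(row) + len(family) + len(qualifier);
    }

    // Sum of the per-cell overhead for every column written in one row.
    static long rowOverhead(String row, String family, String[] qualifiers) {
        long total = 0;
        for (String q : qualifiers) {
            total += cellOverhead(row, family, q);
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumption: an 8-digit account number, a dash, and a 36-character UUID string.
        String row = "12345678-123e4567-e89b-12d3-a456-426614174000";
        String family = "attributes_family";
        String[] qualifiers = {"accountNo", "activityId", "random", "line", "something"};

        long perRow = rowOverhead(row, family, qualifiers);
        long rows = 10000000L;
        System.out.println("key overhead per row: " + perRow + " bytes");
        System.out.println("key overhead for " + rows + " rows: " + (perRow * rows) + " bytes");
    }
}
```

With those assumed key sizes this comes to 448 bytes of key material per row before any values are counted, so around 4.5 GB over 10 million rows. It ignores HFile indexes, write-ahead logs and compression, so treat it as a lower-bound illustration rather than an accounting of the full 13 gigs. Shorter family and qualifier names shrink it directly.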

St.Ack

On Wed, Dec 29, 2010 at 1:34 PM, Hiller, Dean  (Contractor)
<[email protected]> wrote:
> I have dfs.replication set to 1 and a 1.8 gig file on HDFS.  After
> my map reduce job, which pretty much just puts each row of the file into
> a row in the database, I end up with 14.8 gigs of usage - 1.8 = 13 gigs
> used by hbase???
>
>
>
> I think this is starting to seem normal now, after thinking about
> it a bit.  Here are the details though, just in case...
>
>
>
> My 10 million rows each have a
>
> Key=<accountNo>-<UUID>  //ok, this UUID is extra space too that I eat up
>
>
>
> And my other code just comes from the file....
>
>
>
>     Put put = new Put(key.getBytes());
>
>     // all below is from file
>     add(put, "attributes_family", "accountNo", values[0]);   // int
>     add(put, "attributes_family", "activityId", values[1]);  // int
>     add(put, "attributes_family", "random", values[2]);      // int
>     add(put, "attributes_family", "line", values[3]);        // long string
>     add(put, "attributes_family", "something", values[4]);   // long string
>
>
>
> In an RDBMS, I was not using any space for column names, but now they
> take up space, right?  And my UUID is not in the file, so it adds
> some space as well.  Does this sound about right to people?  (I have no
> idea what the size would look like if I read that into an RDBMS, and of
> course indexing, etc. can play a role too.)
>
>
>
> Thanks,
>
> Dean
