[ 
https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978054#action_12978054
 ] 

Nicolas Spiegelberg commented on HBASE-3421:
--------------------------------------------

For interested parties...

From: Ted Yu
Hi,
I used the command you suggested in HBASE-3421 on a table and got:

K: 0012F2157E58883070B9814047048E8B/v:_/1283909035492/Put/vlen=1308 
K: 0041A80A545C4CBF412865412065BF5E/v:_/1283909035492/Put/vlen=1311 
K: 00546F4AA313020E551E049E848949C6/v:_/1283909035492/Put/vlen=1866 
K: 0068CC263C81CE65B65FC5425EFEBBCD/v:_/1283909035492/Put/vlen=1191 
K: 006DB8745D6D1B624F77E0F06C177C0B/v:_/1283909035492/Put/vlen=1021 
K: 006F9037BD7A8F081B54C5B03756C143/v:_/1283909035492/Put/vlen=1382 
...

Can you briefly describe what conclusion can be drawn here ?

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: Nicolas Spiegelberg

You're seeing every KeyValue in that HFile, one per line.  The format of each 
line is:

K: <KeyValue.toString()>

If you look at KeyValue.toString(), you'll see that the format is roughly:

row/family:qualifier/timestamp/type/vlen=<value_length>

So, it looks like you have only one qualifier per row, and each row holds 
roughly 1500 bytes of data.  For the user with 30K columns per row, you should 
see output that contains a ton of lines with the same row.  If you grep for 
that row, cut the number after vlen=, and sum the values, you can see the size 
of that row on a per-HFile basis.
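
As a rough illustration of the grep/cut/sum step, here is a minimal Python sketch that totals the vlen values per row from output like the lines above. It assumes each line looks exactly like the sample (K: row/family:qualifier/timestamp/type/vlen=N) and that the qualifier contains no '/' characters; the function names are made up for this example and are not part of HBase.

```python
from collections import defaultdict


def parse_line(line):
    """Return (row, value_length) for a 'K: ...' line, or None if it does not match.

    Assumes the layout row/family:qualifier/timestamp/type/vlen=N seen in the
    sample output, and that the qualifier itself contains no '/'.
    """
    line = line.strip()
    if not line.startswith("K: "):
        return None
    # Split from the right so that a '/' inside the row key is preserved.
    parts = line[3:].rsplit("/", 4)
    if len(parts) != 5 or not parts[4].startswith("vlen="):
        return None
    try:
        return parts[0], int(parts[4][len("vlen="):])
    except ValueError:
        return None


def row_sizes(lines):
    """Sum the value lengths per row across all parsed lines."""
    totals = defaultdict(int)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            row, vlen = parsed
            totals[row] += vlen
    return dict(totals)


def largest_rows(totals, n=10):
    """Return the n rows with the largest summed value length, biggest first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]
```

Feeding the saved output of one HFile into row_sizes() and then largest_rows() shows which rows are widest in that file. Note that this counts value bytes only, not the key bytes of each KeyValue.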


> Very wide rows -- 30M plus -- cause us OOME
> -------------------------------------------
>
>                 Key: HBASE-3421
>                 URL: https://issues.apache.org/jira/browse/HBASE-3421
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.0
>            Reporter: stack
>
> From the list, see 'jvm oom' in 
> http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it 
> looks like wide rows -- 30M or so -- cause OOME during compaction.  We 
> should check it out. Can the scanner used during compactions use the 'limit' 
> when nexting?  If so, this should save our OOME'ing (or, we need to add to 
> the next a max size rather than count of KVs).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
