[ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978054#action_12978054 ]
Nicolas Spiegelberg commented on HBASE-3421:
--------------------------------------------
For interested parties...
From: Ted Yu
Hi,
I used the command you suggested in HBASE-3421 on a table and got:
K: 0012F2157E58883070B9814047048E8B/v:_/1283909035492/Put/vlen=1308
K: 0041A80A545C4CBF412865412065BF5E/v:_/1283909035492/Put/vlen=1311
K: 00546F4AA313020E551E049E848949C6/v:_/1283909035492/Put/vlen=1866
K: 0068CC263C81CE65B65FC5425EFEBBCD/v:_/1283909035492/Put/vlen=1191
K: 006DB8745D6D1B624F77E0F06C177C0B/v:_/1283909035492/Put/vlen=1021
K: 006F9037BD7A8F081B54C5B03756C143/v:_/1283909035492/Put/vlen=1382
...
Can you briefly describe what conclusion can be drawn here?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: Nicolas Spiegelberg
You're seeing all the KeyValues in that HFile. Each line of the output has
the form:
K: <KeyValue.toString()>
If you look at KeyValue.toString(), you'll see that the format is roughly:
row/family:qualifier/timestamp/type/value_length
So it looks like you have only one qualifier per row, and each row holds
roughly 1500 bytes of value data. For the user with 30K columns per row, the
output should contain a ton of lines with the same row key. If you grep for
that row, cut the number after vlen=, and sum the values, you get the size of
that row on a per-HFile basis.
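The grep/cut/sum steps can also be done for every row at once. Below is a
minimal Python sketch, not part of the original exchange. It assumes each line
looks exactly like the "K: ..." samples above and that row keys contain no '/'
character; the function name row_value_sizes is made up for illustration. Note
that vlen counts value bytes only, so the totals leave out key overhead.

```python
import re
from collections import defaultdict

# Matches one dump line of the form shown above, e.g.
#   K: 0012F2157E58883070B9814047048E8B/v:_/1283909035492/Put/vlen=1308
# Assumption: the row key contains no '/' (true for the hex keys in the sample).
KV_LINE = re.compile(
    r"^K: (?P<row>[^/]+)/(?P<family>[^:]+):(?P<qualifier>[^/]*)"
    r"/(?P<timestamp>\d+)/(?P<type>\w+)/vlen=(?P<vlen>\d+)"
)


def row_value_sizes(lines):
    """Return {row key: total value bytes} for the KeyValue lines in one dump.

    Lines that do not match the expected format are skipped.
    """
    totals = defaultdict(int)
    for line in lines:
        match = KV_LINE.match(line.strip())
        if match:
            totals[match.group("row")] += int(match.group("vlen"))
    return dict(totals)


if __name__ == "__main__":
    # Two lines copied from the output quoted above.
    sample = [
        "K: 0012F2157E58883070B9814047048E8B/v:_/1283909035492/Put/vlen=1308",
        "K: 0041A80A545C4CBF412865412065BF5E/v:_/1283909035492/Put/vlen=1311",
    ]
    # Print the widest rows first.
    sizes = row_value_sizes(sample)
    for row, size in sorted(sizes.items(), key=lambda item: item[1], reverse=True):
        print(f"{size:>12} {row}")
```

To run it on a real dump, pass the lines of the saved command output to
row_value_sizes instead of the sample list. A row that shows up with a very
large total in one HFile is a candidate for the wide-row problem described in
this issue.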
> Very wide rows -- 30M plus -- cause us OOME
> -------------------------------------------
>
> Key: HBASE-3421
> URL: https://issues.apache.org/jira/browse/HBASE-3421
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.90.0
> Reporter: stack
>
> From the list, see 'jvm oom' in
> http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser, it
> looks like wide rows -- 30M or so -- cause OOME during compaction. We
> should check it out. Can the scanner used during compactions use the 'limit'
> when nexting? If so, this should save our OOME'ing (or, we need to add to
> the next a max size rather than count of KVs).