[
https://issues.apache.org/jira/browse/HBASE-16425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15423675#comment-15423675
]
Jean-Marc Spaggiari commented on HBASE-16425:
---------------------------------------------
I like this thread!
Another thing related to bulk load: if someone bulk loads a cell which is WAY
too big (say, a 2GB cell), the region server might not be able to load it and
will fail. It might be nice to detect that and alert the user, log the issue,
or skip the cell.
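To make the "detect, log, or skip" idea concrete, here is a rough sketch of a
pre-load size check. It is not HBase code: the (row, qualifier, value) tuple
layout, the name split_oversized and the MAX_CELL_BYTES limit are all
hypothetical, picked only for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical limit for illustration; not an HBase setting from this thread.
MAX_CELL_BYTES = 10 * 1024 * 1024

Cell = Tuple[bytes, bytes, bytes]        # (row, qualifier, value)
Rejected = Tuple[bytes, bytes, int]      # (row, qualifier, size in bytes)


def split_oversized(
    cells: Iterable[Cell], max_bytes: int = MAX_CELL_BYTES
) -> Tuple[List[Cell], List[Rejected]]:
    """Separate cells that fit under max_bytes from cells that do not.

    Oversized cells are reported with their coordinates and size so the
    caller can alert the user, log the issue, or skip them.
    """
    kept: List[Cell] = []
    rejected: List[Rejected] = []
    for row, qualifier, value in cells:
        size = len(row) + len(qualifier) + len(value)
        if size > max_bytes:
            rejected.append((row, qualifier, size))
        else:
            kept.append((row, qualifier, value))
    return kept, rejected
```

A bulk load tool could run something like this over its input before writing
files and print the rejected list, so the operator sees which row holds the
oversized cell instead of finding out from a failing region server.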
> [Operability] Autohandling 'bad data'
> -------------------------------------
>
> Key: HBASE-16425
> URL: https://issues.apache.org/jira/browse/HBASE-16425
> Project: HBase
> Issue Type: Brainstorming
> Components: Operability
> Reporter: stack
>
> This is a brainstorming issue. It came up chatting w/ a couple of operators
> talking about 'bad data'; i.e. no matter how you control your clients,
> someone by mistake or under a misconception will load an out-of-spec Cell or
> Row. In this particular case, two types of 'bad data' were talked about:
> (on) The Big Cell: An upload of a 'big cell' came in via bulkload, but it so
> happened that their frontends all arrived at the malignant Cell at the same
> time, so hundreds of threads were requesting the big Cell. The RS OOME'd. Then
> when the region opened on the new RS, it OOME'd too, etc. Could we switch to
> chunking when a Server sees that it has a large Cell on its hands? I suppose
> bulk load could defeat any Put chunking we had in place, but it would be good
> to have this too. Chatting w/ Matteo, we probably want to just move to the
> streaming Interface that we've talked of in the past at various times; the Get
> would chunk out the big Cell for assembly on the Client, or just give back the
> Cell in pieces -- an OutputStream for the Application to suck on. New API
> and/or old API could use it when Cells are big.
> (on) The user had a row with 29M Columns in it because the default entity had
> id=-1.... In this case chunking the Scan (v1.1+) helps but the operator was
> having trouble finding the problem row. How could we surface anomalies like
> this for operators? On flush, we could add even more metadata to the HFile
> (Yahoo! Data Sketches, as [~jleach] has been suggesting) and then have an
> offline tool read the metadata and run it through a few simple rules. Data
> Sketches are mergeable, so we could build up a region-view or store-view....
> This is sketchy and I'm pretty sure it repeats stuff in old issues, but I'm
> parking this note here while the encounter is still fresh.
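The chunking idea in the description ("the Get would chunk out the big Cell
for assembly on the Client") could look roughly like the sketch below. This is
only an illustration of the shape of the thing, not an HBase API: iter_chunks
and assemble are hypothetical names, and a real server would stream from disk
rather than from a value already held in memory.

```python
from typing import Iterable, Iterator


def iter_chunks(value: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield a large value as consecutive pieces of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    view = memoryview(value)  # slice without copying the whole value each time
    for start in range(0, len(view), chunk_size):
        yield bytes(view[start:start + chunk_size])


def assemble(chunks: Iterable[bytes]) -> bytes:
    """Client-side reassembly of the pieces into the original value."""
    return b"".join(chunks)
```

The point of handing back pieces is that an application that only needs to
copy the value somewhere else can consume one piece at a time and never hold
the whole Cell; assemble is there for callers that do need it in one buffer.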