On Wed, Jun 26, 2019 at 1:08 PM Vitaliy Semochkin <vitaliy...@gmail.com> wrote:
> Hi,
>
> I have an analytical report that would be very easy to build
> if I could store thousands of cells in one row, each cell storing about
> 2kb of information.
> I don't need those rows to be stored in any cache, because they will
> be used only occasionally for analytical reports in Flink.
>
> The question is, what is the biggest size of a row hbase can handle?
> Should I store 2kb cells as MOBs, or is the regular format ok?
>
> There are old articles that say that large rows, i.e. rows whose total
> size is larger than 10mb, can affect hbase performance;
> is this statement still valid for the modern hbase versions?
> What is the largest row size hbase can handle these days without
> performance issues?
> Is it possible to read a row so that its whole content is not read
> into memory (e.g. I would like to read the row's content cell by cell)?

See https://hbase.apache.org/2.0/apidocs/org/apache/hadoop/hbase/client/Scan.html#setAllowPartialResults-boolean-

It speaks to your question. See the 'See Also:' on this method too. It only works for Scan; it doesn't work if you Get a row (you could Scan just one row if you need the above partial result).

HBase has no 'streaming' API that would return a Cell at a time, so big rows are a problem if you don't use the above partial results. The big row is materialized server-side in memory and then again client-side.

10MB is a conservative upper bound. 2kb Cells should work nicely, even a few thousand of them, especially if you can use partial results.

S

> Best Regards
> Vitaliy
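For the archives, a minimal sketch of the single-row partial-results Scan described above, against the HBase 2.x client API. The table name "report", the row key "wide-row", and the 2MB per-RPC cap are placeholders, not anything from the thread; it also assumes a reachable cluster configured via hbase-site.xml.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class PartialRowScan {
    public static void main(String[] args) throws Exception {
        byte[] row = Bytes.toBytes("wide-row");   // hypothetical row key
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("report"))) { // hypothetical table
            Scan scan = new Scan()
                .withStartRow(row)
                .withStopRow(row, true)            // inclusive stop: scan only this one row
                .setAllowPartialResults(true)      // allow a Result to carry part of the row
                .setMaxResultSize(2L * 1024 * 1024); // ~2MB per RPC instead of the whole row
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result partial : scanner) {
                    // each Result holds only a chunk of the row's cells;
                    // mayHaveMoreCellsInRow() says whether more chunks follow
                    for (Cell cell : partial.rawCells()) {
                        process(cell);
                    }
                }
            }
        }
    }

    private static void process(Cell cell) {
        // consume one ~2kb cell at a time
    }
}
```

With setAllowPartialResults(true), neither the server nor the client has to materialize the full multi-MB row at once; without it, a scan still returns the entire row in a single Result.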