Thank you very much for the fast reply!

What would be a non-conservative upper bound for such a case?
Is it possible to use 500GB rows with this approach?

Regards,
Vitaliy


On Fri, Jun 28, 2019 at 7:55 AM Stack <st...@duboce.net> wrote:
>
> On Wed, Jun 26, 2019 at 1:08 PM Vitaliy Semochkin <vitaliy...@gmail.com>
> wrote:
>
> > Hi,
> >
> > I have an analytical report that would be very easy to build
> > if I could store thousands of cells in one row, each cell storing about
> > 2kb of information.
> > I don't need those rows to be stored in any cache, because they will
> > be used only occasionally for analytical reports in Flink.
> >
> > The question is, what is the biggest size of a row HBase can handle?
> > Should I store the 2kb cells as MOBs, or is the regular format OK?
> >
> > There are old articles saying that large rows, i.e. rows whose total
> > size is larger than 10MB, can affect HBase performance;
> > is this statement still valid for modern HBase versions?
> > What is the largest row size HBase can handle these days without
> > performance issues?
> > Is it possible to read a row so that its whole content is not read
> > into memory (e.g. I would like to read the row's content cell by cell)?
> >
> >
> See
> https://hbase.apache.org/2.0/apidocs/org/apache/hadoop/hbase/client/Scan.html#setAllowPartialResults-boolean-
> It speaks to your question. See the 'See Also:' on this method too.
>
> This only works for Scan; it doesn't work if you Get a row (you could
> Scan just that one row if you need the above partial results).
>
> HBase has no 'streaming' API that would allow you to return a Cell at a
> time, so big rows are a problem if you don't use the above partial-results
> approach. The big row is materialized server-side in memory and then again
> client-side. 10MB is a conservative upper bound.
>
> 2kb Cells should work nicely -- even a few thousand of them -- especially
> if you can use partial results.
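>
> An untested sketch with the 2.x client API (the table name "mytable", row
> key "bigrow", and the 2MB chunk size are just placeholders for illustration):
>
> import java.io.IOException;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.Cell;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.TableName;
> import org.apache.hadoop.hbase.client.Connection;
> import org.apache.hadoop.hbase.client.ConnectionFactory;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.client.ResultScanner;
> import org.apache.hadoop.hbase.client.Scan;
> import org.apache.hadoop.hbase.client.Table;
> import org.apache.hadoop.hbase.util.Bytes;
>
> public class PartialRowRead {
>   public static void main(String[] args) throws IOException {
>     Configuration conf = HBaseConfiguration.create();
>     byte[] row = Bytes.toBytes("bigrow");               // placeholder row key
>     try (Connection conn = ConnectionFactory.createConnection(conf);
>          Table table = conn.getTable(TableName.valueOf("mytable"))) {
>       Scan scan = new Scan()
>           .withStartRow(row)
>           .withStopRow(row, true)             // scan just this one row
>           .setAllowPartialResults(true)       // server may return the row in chunks
>           .setMaxResultSize(2 * 1024 * 1024); // cap each RPC at ~2MB, not the whole row
>       try (ResultScanner scanner = table.getScanner(scan)) {
>         for (Result partial : scanner) {      // each Result is one chunk of the row
>           for (Cell cell : partial.rawCells()) {
>             // process one ~2kb cell at a time; the full row is never
>             // materialized client-side
>           }
>         }
>       }
>     }
>   }
> }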
>
> S
>
>
>
> > Best Regards
> > Vitaliy
> >
