handle overly large column family in one row
--------------------------------------------

                 Key: HBASE-2007
                 URL: https://issues.apache.org/jira/browse/HBASE-2007
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: Andrew Purtell
             Fix For: 0.20.3, 0.21.0


From a team in TM:

{quote}
I tried to create a large number of columns in one row and one family. The value 
of each column is only 10 bytes. When I created the 804226850th column, the 
region server crashed. I found that all columns of one row and one family are 
stored in the same region, so the region server crashed because this region was 
too large. When I then restarted HBase, the region servers crashed one by one 
after several minutes.

If one row and one family cannot be split, then no matter how many machines are 
in the HBase system, the capacity of HBase is limited by one machine. I want to 
know whether this problem is a bug. If the number of columns in one row and one 
family is limited, can you tell me the safe range?
{quote}

Currently a row cannot be split across regions, so an individual row can grow 
only up to a finite limit set by the capacity of the region server hosting it.

I am impressed that a row was able to successfully contain 804,226,849 columns. 
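
For a rough sense of scale, here is a back-of-the-envelope estimate of the data 
held in that single row. It is only a sketch: the column count and the 10-byte 
value size come from the report above, while the per-cell key overhead (row 
key, family, qualifier, timestamp) is not given there, so it appears only as a 
clearly labeled assumed parameter. The class and method names are illustrative 
and not part of HBase.

```java
// Back-of-the-envelope size estimate for one very wide row.
// WideRowEstimate and totalBytes are illustrative names, not HBase APIs.
final class WideRowEstimate {

    // Total bytes for `columns` cells, each carrying `valueBytes` of value
    // plus `perCellOverheadBytes` of key material (an assumed figure).
    static long totalBytes(long columns, int valueBytes, int perCellOverheadBytes) {
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(columns, (long) valueBytes + perCellOverheadBytes);
    }

    public static void main(String[] args) {
        long columns = 804_226_849L; // columns written successfully, per the report
        int valueBytes = 10;         // value size, per the report

        // Lower bound: values only, ignoring all key material.
        long valuesOnly = totalBytes(columns, valueBytes, 0);
        System.out.printf("values only: %d bytes (%.2f GiB)%n",
                valuesOnly, valuesOnly / (double) (1L << 30));

        // ASSUMPTION: 40 bytes of key material per cell. The real figure
        // depends on row key and qualifier lengths, which the report omits.
        int assumedOverhead = 40;
        long withKeys = totalBytes(columns, valueBytes, assumedOverhead);
        System.out.printf("with assumed key overhead: %d bytes (%.2f GiB)%n",
                withKeys, withKeys / (double) (1L << 30));
    }
}
```

Even the values-only lower bound is roughly 8 GB that must live in a single 
region, which is consistent with the region server running out of capacity.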

The HBase storage capability goals are currently "billions of rows, millions of 
columns, thousands of tables". A test involving hundreds of millions of columns 
is very challenging. 

Most importantly, HBase should not accept input beyond the limit at which it 
produces a cascading failure. 
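
As one possible shape for such a limit, the following is a minimal sketch of a 
guard that refuses new columns once a (row, family) pair reaches a configured 
maximum. Everything in it is hypothetical: ColumnLimitGuard, tryAddColumn and 
the limit parameter are not HBase APIs, the counts are kept in memory purely 
for illustration, and every call is counted as a new column. A real fix would 
have to enforce the limit on the region server and pick the threshold from 
measured capacity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical guard, not an HBase API: rejects writes that would push a
// (row, family) pair past a configured number of columns.
final class ColumnLimitGuard {
    private final long maxColumnsPerRowFamily;

    // ASSUMPTION: in-memory counters, row -> family -> column count.
    private final Map<String, Map<String, Long>> counts = new HashMap<>();

    ColumnLimitGuard(long maxColumnsPerRowFamily) {
        if (maxColumnsPerRowFamily <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        this.maxColumnsPerRowFamily = maxColumnsPerRowFamily;
    }

    // Records one more column and returns true if the pair is still under
    // the limit; returns false and records nothing otherwise.
    boolean tryAddColumn(String row, String family) {
        Map<String, Long> perFamily = counts.computeIfAbsent(row, r -> new HashMap<>());
        long current = perFamily.getOrDefault(family, 0L);
        if (current >= maxColumnsPerRowFamily) {
            return false; // reject instead of letting the row grow unbounded
        }
        perFamily.put(family, current + 1);
        return true;
    }

    public static void main(String[] args) {
        ColumnLimitGuard guard = new ColumnLimitGuard(3);
        for (int i = 1; i <= 5; i++) {
            System.out.println("column " + i + " accepted: "
                    + guard.tryAddColumn("row1", "cf"));
        }
    }
}
```

Rejecting the write with a clear error keeps the failure local to the client 
that exceeded the limit, rather than taking down the region server.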

I think we also want to have the architectural discussion about rows so large 
that they must span region servers. 
