Pardon, hit send too early. "You will run into OS and HDFS level issues attempting this, I don't recommend it."
On Sat, Dec 21, 2013 at 12:26 PM, Andrew Purtell <[email protected]> wrote:

> Bear in mind that how many files you'll have open simultaneously is a
> function of the number of regions, the number of column families, and how
> compaction organizes the HBase files on disk (the strategy in effect and
> its parameters, the current ingest rate, and so on). You can ballpark this
> as such: If you have one column family in a table, and store data into all
> the regions, then you will have one file open on the cluster per region, or
> more. If you have 100,000 column families in a table, and store data into
> all the regions and CFs, then you will have 100,000 files open on the
> cluster per region, *or more*. You will run into OS and HDFS levels
> attempting this, I don't recommend it.
>
> I don't think any reasonable schema design needs to produce a requirement
> for 100,000 column *families*. You can have any number of keys with
> <column>:<qualifier> in a column family; varying the <qualifier> over
> 100,000 or 1,000,000 or more unique values is no problem. Can you say more
> about what you are trying to accomplish?
>
>
> On Sat, Dec 21, 2013 at 7:17 AM, 乃岩 <[email protected]> wrote:
>
>> Hi,
>>     Can anybody tell me if a future HBase release will integrate 3149,
>> "Make flush decisions per column family"?
>>
>>     By the way, for current HBase, is the simultaneous flush the only
>> issue? I mean, creating 100000 CFs will not be a problem, right?
>>
>> Thanks in advance!
>>
>> N.Y.
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
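A minimal sketch of the qualifier-based layout described above, assuming the
1.x+ HBase Java client API (Connection/Table) and hypothetical table and
column family names ("metrics", "d"); the point is that the per-key
variability lives in the qualifier, not in additional column families:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ManyQualifiersSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("metrics"))) {

                // One column family ("d"); the variability that 100,000
                // column families would otherwise encode goes into the
                // qualifier instead.
                byte[] family = Bytes.toBytes("d");

                for (int row = 0; row < 10; row++) {
                    Put put = new Put(Bytes.toBytes("row-" + row));
                    // Qualifiers are arbitrary bytes; 100,000 or 1,000,000
                    // distinct qualifier values across the table is fine.
                    for (int q = 0; q < 1000; q++) {
                        put.addColumn(family, Bytes.toBytes("q" + q),
                                Bytes.toBytes(row * 1000 + q));
                    }
                    table.put(put);
                }
            }
        }
    }

With a single column family, the number of open store files is driven by the
number of regions and by compaction, not by how many distinct qualifiers the
data contains.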
