Dear Wiki user, You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.
The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseArchitecture

------------------------------------------------------------------------------
  by [wiki:udanax Udanax] [[MailTo(webmaster AT SPAMFREE udanax DOT org)]]

- It's need to be much smaller, much faster, managed for high-demand analytics and can be sparse.
- So, BigTable(Hbase) must Column storing like C-Store for wide and sparse data.
- In a column oriented, NULLs are much easier to handle, and impose a significantly smaller performance overhead.
+ I think Hbase should be compact (space-efficient), fast and should be able to manage high-demand load. It should be able to handle sparse tables efficiently.
+ So, for wide and sparse data, Hbase must store data by columns like C-Store does.
+ A column-oriented system handles NULLs more easily with significantly smaller performance overhead,
- And supports both Horizontal/Vertical Parallel Processing.
+ and supports both Horizontal and Vertical Parallel Processing.

- Do you know RDF(Resource Description Framework) Storage?
- We Can put it.
+ Let's consider the following case:
+ You may be familiar to RDF(Resource Description Framework) Storage from W3C, which is
   * Storing and managing very large amounts of structured data
   * Row/column space can be sparse
@@ -286, +286 @@
   * Because of the design of the system, columns are easy to create (and are created implicitly)
   * Column families can be split into locality groups (Ontologies!)

- And then, assume some job.
- I wanna get clustered document set by one of RDF Properties.
- It can be Readed only vertical(Column) Data from Table, because Column-stored.
- if you are not in agreement on this point, let me show your ideas via attach me through MSN Messenger([EMAIL PROTECTED])
+ Let's assume a large amount of RDF documents are stored in the system.
+ And then, vertical(column) data set by one of RDF properties can be read fast from Table, because it is column-stored.
+ Please let me know if you don't agree with me.
+ ----
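
The column-stored argument in the change above can be illustrated with a minimal Python sketch. It assumes a toy in-memory table; `ColumnStore`, `put`, `scan_column`, `cell_count`, and the `dc:*` property names are hypothetical and are not Hbase's API. It only shows that absent cells cost nothing and that reading one RDF property touches a single column.

```python
from collections import defaultdict


class ColumnStore:
    """Toy column-oriented table (hypothetical, not the Hbase API)."""

    def __init__(self):
        # column name -> {row id -> value}; absent cells (NULLs) are never stored
        self.columns = defaultdict(dict)

    def put(self, row, column, value):
        """Store one cell; the column is created implicitly on first use."""
        self.columns[column][row] = value

    def scan_column(self, column):
        """Return (row, value) pairs for one column only, sorted by row id."""
        return sorted(self.columns.get(column, {}).items())

    def cell_count(self):
        """Number of cells actually stored across all columns."""
        return sum(len(cells) for cells in self.columns.values())


# Assumed sample data: three RDF-like documents with sparse properties.
store = ColumnStore()
store.put("doc1", "dc:title", "Hbase Architecture")
store.put("doc1", "dc:creator", "udanax")
store.put("doc2", "dc:title", "C-Store")
store.put("doc3", "dc:subject", "RDF")

# A vertical read: only the "dc:title" column is visited.
titles = store.scan_column("dc:title")
print(titles)
print(store.cell_count())
```

In this sketch a dense layout of the same three rows and three properties would hold nine cells, while only the four populated cells are stored here.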