Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for 
change notification.

The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseArchitecture

------------------------------------------------------------------------------
  
  by [wiki:udanax Udanax] [[MailTo(webmaster AT SPAMFREE udanax DOT org)]]
  
- It's need to be much smaller, much faster, managed for high-demand analytics 
and can be sparse.
- So, BigTable(Hbase) must Column storing like C-Store for wide and sparse data.
- In a column oriented, NULLs are much easier to handle, and impose a 
significantly smaller performance overhead.
+ I think Hbase should be compact (space-efficient), fast, and able to manage 
high-demand load. It should also handle sparse tables efficiently.
+ So, for wide and sparse data, Hbase must store data by columns like C-Store 
does.
+ A column-oriented system handles NULLs more easily with significantly smaller 
performance overhead,
- And supports both Horizontal/Vertical Parallel Processing.
+ and supports both Horizontal and Vertical Parallel Processing.
  
- Do you know RDF(Resource Description Framework) Storage?
- We Can put it.
+ Let's consider the following case:
+ You may be familiar with RDF (Resource Description Framework) storage from the 
W3C, which is
  
   * Storing and managing very large amounts of structured data
   * Row/column space can be sparse
@@ -286, +286 @@

   * Because of the design of the system, columns are easy to create (and are 
created implicitly) 
   * Column families can be split into locality groups (Ontologies!) 
  
- And then, assume some job.
- I wanna get clustered document set by one of RDF Properties.
- It can be Readed only vertical(Column) Data from Table, because Column-stored.
- if you are not in agreement on this point, let me show your ideas via attach 
me through MSN Messenger([EMAIL PROTECTED])
+ Let's assume a large number of RDF documents are stored in the system.
+ Then the vertical (column) data set for any one RDF property can be read 
quickly from the table, because it is stored by column.
+ Please let me know if you don't agree with me.
+ 
  
  ----
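
The column-read argument in the change above can be illustrated with a minimal Python sketch. It is not Hbase code: the `ColumnStore` class, its method names, and the sample documents and property names are all hypothetical, chosen only to show why clustering documents by one RDF property needs to touch a single column when the data is stored by column, and why absent (NULL) cells cost nothing.

```python
from collections import defaultdict


class ColumnStore:
    """Toy column-oriented table (hypothetical, not the Hbase API).

    Each column is its own dict keyed by row id, so an absent cell
    (a NULL in a row-oriented layout) simply has no entry.
    """

    def __init__(self):
        self._columns = defaultdict(dict)

    def put(self, row, column, value):
        # Columns are created implicitly on first write.
        self._columns[column][row] = value

    def scan_column(self, column):
        # Reads one column only; other columns are never touched.
        return dict(self._columns.get(column, {}))

    def group_rows_by(self, column):
        # Cluster row ids by the value they hold in a single column.
        groups = defaultdict(list)
        for row, value in sorted(self._columns.get(column, {}).items()):
            groups[value].append(row)
        return dict(groups)

    def cell_count(self):
        # Number of cells physically stored (NULLs are not stored).
        return sum(len(cells) for cells in self._columns.values())


# Hypothetical sample data: three documents, two RDF properties.
store = ColumnStore()
store.put("doc1", "rdf:type", "Article")
store.put("doc2", "rdf:type", "Person")
store.put("doc3", "rdf:type", "Article")
store.put("doc1", "dc:title", "Hbase notes")

# Cluster the documents by one property, reading only that column.
clusters = store.group_rows_by("rdf:type")
print(clusters)
print(store.cell_count())
```

In this sketch a dense row-oriented layout would hold 3 rows x 2 columns = 6 cells, two of them NULL, while the column layout stores only the 4 cells that exist.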
  
