It reminds me of the /proc filesystem, but also of SQL Views. Basically we provide additional metadata, and there's a View on that data which looks and feels like a regular HQL table. I have always found Views very useful for giving applications a consistent view of a table, even if the underlying table structure changes between versions.
In Hypertable a View would just be a dispatcher to either the regular column families or to the pseudo-tables. Later we could maybe implement "real" Views if the need comes up.

bye
Christoph

2013/3/5 Doug Judd <[email protected]>

> This is a proposal for the introduction of *pseudo-tables* into
> Hypertable. This idea came about when trying to come up with an
> inexpensive way to discover large rows in a table. We zeroed in on the
> CellStore indexes because they contain information that can be used to
> estimate large rows cheaply. However, the next question was how do we
> provide access to the CellStore indexes through the API? Instead of
> adding some special-purpose *ReadCellStoreIndexes* API, I propose that we
> use the existing API as-is and surface the CellStore index information via
> a *pseudo-table*. A pseudo-table is a virtual table with no real table
> behind it. When a query comes in for the CellStore index pseudo-table,
> the CellStore indexes will get read directly to satisfy the query. This
> approach is exactly analogous to the /proc filesystem in Linux
> <http://www.ibm.com/developerworks/library/l-proc/index.html>.
>
> The pseudo-table that represents the CellStore indexes for a given table,
> *foo*, would have the name *foo*^.cellstore.index and the following
> schema:
>
>   create table foo^.cellstore.index (
>     Size,
>     CompressedSize,
>     KeyCount
>   );
>
> For each column family, there would be one qualified column for each block
> in the CellStore indexes. The column qualifier would have the format:
> <filename>:<hex-offset>. Also, the row key would be the same as the row
> key in the CellStore index entries (we assume that's what most people will
> want to aggregate this info on).
> So for example, the CellStore index block
> entry for file 2/2/default/ZwmE_ShYJKgim-IL/cs103 at offset 0x28A61 might
> generate the following keys:
>
>   [email protected] Size:2/2/default/ZwmE_ShYJKgim-IL/cs103:0000000000028A61 171728
>   [email protected] CompressedSize:2/2/default/ZwmE_ShYJKgim-IL/cs103:0000000000028A61 65231
>   [email protected] KeyCount:2/2/default/ZwmE_ShYJKgim-IL/cs103:0000000000028A61 281
>
> To query the cellstore.index pseudo-table for table *foo* to find an
> estimate of large rows, you would issue a query along the lines of the
> following:
>
>   SELECT sum(Size) FROM foo^.cellstore.index WHERE sum(Size) > 100000000;
>
> Please respond with feedback or if you have any questions. Thanks!
>
> - Doug
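
To make the proposed key layout concrete, here is a minimal Python sketch of what a client could do with cells returned from such a pseudo-table. It is not part of the proposal and does not use any Hypertable API: the `(row, column, value)` tuples, the row keys `row-a` and `row-b`, the second block offset, and the helper names `parse_column` and `estimate_large_rows` are all assumptions made for illustration. It splits the `<filename>:<hex-offset>` qualifier and sums the Size cells per row key, which is roughly what the `sum(Size)` query above asks for.

```python
from collections import defaultdict


def parse_column(column):
    """Split 'Family:<filename>:<hex-offset>' into (family, filename, offset)."""
    # The family name ends at the first colon; the hex offset follows the last one.
    family, qualifier = column.split(":", 1)
    filename, hex_offset = qualifier.rsplit(":", 1)
    return family, filename, int(hex_offset, 16)


def estimate_large_rows(cells, threshold):
    """Sum the Size cells per row key and keep rows whose total exceeds threshold."""
    totals = defaultdict(int)
    for row, column, value in cells:
        family, _filename, _offset = parse_column(column)
        if family == "Size":
            totals[row] += int(value)
    return {row: total for row, total in totals.items() if total > threshold}


# Hypothetical cells; the row keys and the second offset are made up.
cells = [
    ("row-a", "Size:2/2/default/ZwmE_ShYJKgim-IL/cs103:0000000000028A61", "171728"),
    ("row-a", "CompressedSize:2/2/default/ZwmE_ShYJKgim-IL/cs103:0000000000028A61", "65231"),
    ("row-a", "KeyCount:2/2/default/ZwmE_ShYJKgim-IL/cs103:0000000000028A61", "281"),
    ("row-b", "Size:2/2/default/ZwmE_ShYJKgim-IL/cs103:0000000000030000", "500"),
]

large = estimate_large_rows(cells, threshold=100000)
print(large)
```

Since an index entry describes a whole block, the per-row totals are only an estimate, as the proposal itself says.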
