Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for 
change notification.

The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseShell

------------------------------------------------------------------------------
- [[TableOfContents(5)]]
+ [[TableOfContents(4)]]
- ----
- 
- = Hbase Shell Plan Draft =
- Plan is to significantly expand the set of shell operators.  Basic data 
manipulation and data definition operators will be extended and evolved to be 
more SQL-like ([wiki:Hbase/HbaseShell/HQL HQL]).  More sophisticated 
manipulations to do relational and linear algebra, matrix additions, 
multiplications, etc., will be added to a HBase subshell to keep the two 
operator types -- SQL-like vs. non-SQL -- distinct.
- 
-  ''-- After POC(proof of concept) review, many things can change.[[BR]]-- If 
you have constructive ideas, Please advise me. [[MailTo(webmaster AT SPAMFREE 
udanax DOT org)]]''
- 
- This project is currently in the planning stage.  
[https://issues.apache.org/jira/browse/HADOOP-1608 HADOOP-1608] to add 
"Relational Algrebra Operators" is currently in process.
  
  ----
+ = Hbase Shell Introduction =
+ Hbase Shell is a basic, interactive command-line 'shell' for 
manipulating tables in Hbase. It supports a small set of SQL-inspired 
operations. Results are presented in an ASCII-table format.
  
- == Suggested Hbase Shell altools plans ==
- I suggest to develop HBase Shell in SQL-style, and develop '''al'''gebraic 
'''tools''' as a sub shell in Intuitionalized-style as described below. 
+ The Hbase Shell aims to be to Hbase what the mysql client command-line tool 
is to mysqld, and what sqlplus is to Oracle.
+ 
+ Hbase Shell was first added to TRUNK in July, 2007.
+ 
+  * [http://issues.apache.org/jira/browse/hadoop-1720 HADOOP-1720] to update 
"[wiki:Hbase/HbaseShell/HQL HQL]" is currently in progress. 
+  * See [wiki:Hbase/ShellPlans Hbase Shell plans] page for discussion and 
description of future operators. The intent is to add more support for 
non-interactive usage as well as operators for algebraic, relational, and 
matrix manipulations. 
+ 
+ == People Involved ==
+  * [:udanax:Edward Yoon] [[MailTo(webmaster AT SPAMFREE udanax DOT org)]] 
(Research and Development center, NHN corp.) -- Initial contributor
+  * [:InchulSong:Inchul Song] [[MailTo(icsong AT SPAMFREE gmail DOT com)]] 
(Database Lab, KAIST)
+ 
+ ----
+ = How to Start a Shell =
+ Run the following on the command-line:
+ 
+ {{{${HBASE_HOME}/bin/hbase shell}}}
+ 
+ You will be presented with the following prompt:
+ 
+ {{{HBase Shell, 0.0.1 version.
+ Copyright (c) 2007 by udanax, licensed to Apache Software Foundation.
+ Type 'help;' for usage.
+ 
+ HBase >}}}
+ 
+ All commands are terminated with a semicolon. For example, type 'help;' to 
see the list of available commands.
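
As a rough illustration of the semicolon rule, the Python sketch below collects input lines until one ends with a semicolon, similar in spirit to the `-->` continuation prompt shown in the examples on this page. The function name `read_commands` is hypothetical and this is not the shell's actual parser.

```python
def read_commands(lines):
    """Group input lines into commands, each ending at a semicolon.

    Hypothetical helper for illustration only. The real shell may behave
    differently, for example with semicolons inside quoted values.
    """
    commands = []
    buffer = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines between commands
        buffer.append(stripped)
        if stripped.endswith(";"):
            # Terminator reached: join the pieces and drop the semicolon.
            commands.append(" ".join(buffer)[:-1].strip())
            buffer = []
    return commands


if __name__ == "__main__":
    typed = [
        "CREATE movieLog_table",
        "COLUMNFAMILIES('year', 'length')",
        "LIMIT=1;",
        "help;",
    ]
    for command in read_commands(typed):
        print(command)
```

Input that never reaches a semicolon is left in the buffer and not returned, which corresponds to the shell still waiting for the end of the command.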
+ 
+ = Hbase Shell Commands =
+ '''Note''' that data is located by its row, column, and timestamp.
+ 
+ ||<bgcolor="#ececec">'''Command''' ||<bgcolor="#ececec">'''Explanation''' ||
+ ||Help ||<99%>'''Help''' command provides information about the use of the 
shell.[[BR]][[BR]]~-''HELP [function_name];''-~ ||
+ ||Show ||<99%>'''Show''' command lists tables ''or files 
(DFS)''.[[BR]][[BR]]~-''SHOW tables[ or files];''-~ ||
+ ||Describe ||'''Describe''' command provides information about the 
columnfamilies in a table.[[BR]][[BR]]~-''DESC table_name;''-~ ||
+ ||Create ||'''Create''' command creates a new table.[[BR]][[BR]]~-''CREATE 
table_name[[BR]]COLUMNFAMILIES('columnfamily_name1'[, 'columnfamily_name2', 
...])[[BR]][LIMIT=limitNumber_of_Version];''-~ ||
+ ||Drop ||'''Drop''' command drops tables, or columnfamilies in a 
table.[[BR]][[BR]]~-''DROP table_name1[, table_name2, ...] or 
columnfamily_name1[, columnfamily_name2, ...];''-~ ||
+ ||Clear ||<99%>'''Clear''' the screen.[[BR]][[BR]]~-''CLEAR;''-~ ||
+ ||Exit ||<99%>'''Exit''' from the current 
shell.[[BR]][[BR]]~-''EXIT;''-~ ||
+ The following commands manipulate data at the row and column level.
+ ||<bgcolor="#ececec">'''Command''' ||<bgcolor="#ececec">'''Explanation''' ||
+ ||Insert ||<99%>'''Insert''' command inserts one row into the table with 
values for the specified columns.[[BR]][[BR]]~-''INSERT table_name 
('columnfamily_name1:column_key'[, 'columnfamily_name2:column_key', ...])[[BR]] 
VALUES ('entry1'[, 'entry2', ...])[[BR]]WHERE row='row_key';''-~ ||
+ ||Delete ||'''Delete''' command deletes the specified rows in a table. 
[[BR]][[BR]]~-''DELETE table_name[[BR]]WHERE row='row_key'[[BR]][AND 
column='columnfamily_name:column_key'];''-~ ||
+ ||Select ||<99%>'''Select''' command retrieves rows from a 
table.[[BR]][[BR]]~-''SELECT table_name[[BR]][WHERE row='row_key'][[BR]][AND 
column='columnfamily_name:column_key'][[BR]][AND 
time='Specified_Timestamp'][[BR]][LIMIT=Number_of_Version];''-~ ||
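
To make the INSERT syntax concrete, here is a minimal Python sketch that assembles a statement string in the same form as the INSERT example elsewhere on this page. The helper `build_insert` is hypothetical and not part of Hbase. It does not escape single quotes inside values, because this page does not document an escape rule.

```python
def build_insert(table, row_key, cells):
    """Return an INSERT statement string for the Hbase Shell.

    `cells` maps 'columnfamily_name:column_key' to a value. Hypothetical
    helper for illustration; embedded single quotes are not escaped.
    """
    if not cells:
        raise ValueError("at least one column is required")
    # Dicts keep insertion order, so columns and values stay aligned.
    columns = ", ".join("'%s'" % name for name in cells)
    values = ", ".join("'%s'" % value for value in cells.values())
    return "INSERT %s (%s) VALUES (%s) WHERE row='%s';" % (
        table, columns, values, row_key)


if __name__ == "__main__":
    print(build_insert(
        "movieLog_table",
        "Star Wars",
        {"year:": "1977", "vote:user_1": "5"},
    ))
```

The generated text could be pasted at the `HBase >` prompt. Whether the shell accepts statements from non-interactive input is not covered here.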
+ 
+ ----
+ = Example Of Hbase Shell Use =
+ == Create a table in Hbase ==
  
  {{{
- HBase > altools;
+ HBase > CREATE movieLog_table
+     --> COLUMNFAMILIES('year', 'length', 'inColor', 'studioName', 'vote', 
'producer')
+     --> LIMIT=1; 
  
+ HBase > CREATE movieStar_table
+     --> COLUMNFAMILIES('biography', 'filmography', 'gender', 'birthDate')
+     --> LIMIT=1;
- Hbase altools, 0.0.1 version
- Type 'help;' for Hbase altools usage.
- 
- Hbase.altools > who are you;
- 
-  Hadoop + Hbase based algebraic manipulation tools
- 
- Hbase.altools > exit;
- Hbase > exit;
  }}}
  
- Hbase altools is an Hbase Shell sub 'interpreter' (or 'shell)' program to 
provide scalable data processing capabilities like  aggregation, algebraic 
calculation(groups and sets, commutative rings, algebraic geometry, and linear 
algebra) on Hadoop + Hbase based parallel machines. especially, it will focus 
on storing and manipulating very large sparse matrices on Hbase.
+ == Insert data into a table ==
+ {{{
+ HBase > INSERT movieLog_table ('year:', 'length:', 'inColor:', 'studioName:', 
'vote:user_1', 'producer:')
+     --> VALUES ('1977', '124', 'true', 'Fox', '5', 'George Lucas')
+     --> WHERE row='Star Wars';
  
-  ''-- Altools Matrix operations will show how Google search's LSI, Google 
Earth's algebraic topology, Google News' recommendation system are related to 
Bigtable.''
  
- === Background ===
- I expect Hadoop + Hbase to handle sparsity and data explosion very well in 
near future. Moreover, i believe the design of the multi-dimensional map 
structure and the 3d space model of the data are optimized for rapid ad-hoc 
information retrieval in any orientation, as well as for fast, flexible 
calculation and transformation of raw data based on formulaic relationships. It 
is advantageous with respect to Analysis Processing as it allows users to 
easily formulate complex queries, and filter or slice data into meaningful 
subsets, among other things.
+ HBase > INSERT movieStar_table ('biography:', 'filmography:Star Wars', 
'gender:', 'birthDate:')
+     --> VALUES ('blah~', 'starring', 'male', 'March 31, 1971')
+     --> WHERE row='Ewan Gordon Mc.Gregor'; 
+ }}}
  
- ----
+ == Show all data in a table ==
+ {{{
+ HBase > SELECT movieLog_table;
+ }}}
  
+ ||Row Key ||<-12>Column Families ||
+ ||<rowbgcolor="#ececec">title ||<-2> year ||<-2>length ||<-2>inColor ||<-2> 
studioName ||<-2> vote ||<-2> producer ||
+ ||Star Wars ||year: || 1977 ||length: || 124 ||inColor: || true ||studioName: 
|| Fox || vote:''user_1'' || 5 || producer: || George Lucas ||
+ || || || || || || || || || || vote:''user_2'' || 2 || || ||
+ ||Mighty Ducks ||year: || 1991 ||length: || 104 ||inColor: || true 
||studioName: || Disney || vote:''user_1'' || 2 || producer: || Blair Peters ||
+ || || || || || || || || || || vote:''user_3'' || 4 || || ||
+ ||Wayne's World ||year: || 1992 ||length: || 95 ||inColor: || true 
||studioName: || Paramount || vote:''user_2'' || 3 || producer: || Penelope 
Spheeris ||
+ || || || || || || || || || || vote:''user_3'' || 4 || || ||
- == Suggested Hbase altools Syntax ==
- '''Note''' that Data should be located by their row, column, and timestamp.
- 
- === Commands ===
- ||<bgcolor="#E5E5E5">'''Command''' ||<bgcolor="#E5E5E5">'''Explanation''' ||
- ||Table ||<99%>'''Table''' command loads specified table. 
[[BR]][[BR]]~-''Table('movieLog_table');''-~ ||
- ||Matrix ||<99%>'''Matrix''' command constructs the configuration of the 
logic matrix.[[BR]]'''Options''' : features not yet. 
[[BR]][[BR]]~-''Matrix(table_name, columnfamily_name[, option]);''-~ ||
- ||Substitute ||<99%>'''Substitute''' expression to [A~Z][[BR]][[BR]]~-''A = 
Table('movieLog_table');''-~ ||
- ||IF...ELSE ||<99%>'''IF...ELSE''', Imposes conditions on the execution. 
[[BR]][[BR]]~-''IF ( boolean_expression )[[BR]]B = 
command_statements;[[BR]]ELSE[[BR]]B = command_statements;''-~||
- ||Store ||<99%>'''Store''' command will store results to specified table. 
[[BR]][[BR]]~-''A = Table('movieLog_table'); [[BR]]B = A.Selection(length > 
100); [[BR]]Store B TO table('tmp_table')[or file('backup.dat')];''-~ ||
- 
- === Relational Operators ===
- ||<bgcolor="#E5E5E5">'''Operator''' ||<bgcolor="#E5E5E5">'''Explanation''' ||
- ||Projection ||<99%>'''Projection''' of a relation ~+R+~, It makes a new 
relation as the set that is obtained when all tuples(rows) in ~+R+~ are 
restricted to the set 
{columnfamily,,1,,,...,columnfamily,,n,,}.[[BR]][[BR]]~-''A = 
Table('movieLog_table');[[BR]]B = A.Projection('year','length'); 
'''//π,,year.length,,(A)''' ''-~ ||
- ||Selection ||<99%>'''Selection''' of a relation ~+R+~, It makes a new 
relation as the set of specified tuples(rows) of the relation 
~+R+~.[[BR]]'''Set Operations''' : ~-''OR, AND, NOT''-~[[BR]][[BR]]~-''A = 
Table('movieLog_table');[[BR]]B = A.Selection(length > 100 AND studioName = 
'Fox'); '''//σ,,length > 100.studioName='Fox',,(A)''' ''-~ ||
- ||JOINs ||<99%>Table '''JOIN''' operations, linking and extracting data from 
two different internal source.[[BR]]'''Operations''' : ~-''naturalJoin(), 
thetaJoin(), cartesianProduct() ''-~ [[BR]][[BR]]~-''R = 
Table('movieLog_table');[[BR]]S = Table('movieStar_table');[[BR]]C = 
R.naturalJoin(S); '''//C = R▷◁S''' ''-~ ||
- ||Group ||<99%>'''Group''' tuples by value of an attribute and apply 
aggregate function independently to each group of tuples.[[BR]]'''Aggregate 
Functions''' : ~-''AVG( attribute ), SUM( attribute ), COUNT( attribute ), MIN( 
attribute ), MAX( attribute )''-~[[BR]][[BR]]~-''A = 
Table('movieLog_table);[[BR]]B = A.Group('studioName', MIN('year')); 
'''//γ,,studioName.MIN( year ),,(A)''' ''-~ ||
- ||Sort ||<99%>'''Sort''' of tuples(rows) of R, ordered according to 
columnfamilies on columnfamily-list.[[BR]][[BR]]~-''A = 
Table('movieLog_table');[[BR]]B = Sort A by ('length'); '''//τ,,length,,(A)''' 
''-~ ||
- 
- '''(ex. 1)''' Search the subject and the year of the movies which were 
produced by 'Fox' company and where running time is more than 100 minutes.
- [[BR]]~-''π ,,title.year,, (σ ,,length > 100,, (movieLog_table) ∩ σ 
,,studioName = 'Fox',, (movieLog_table))''-~
  
  {{{
+ HBase > SELECT movieStar_table;
- Hbase.altools > A = Table('movieLog_table'); 
- Hbase.altools > B = A.Selection(length > 100 AND studioName = 'Fox'); 
- Hbase.altools > C = B.Projection('year'); 
- 
- Hbase.altools > store C to table('result_table'); 
  }}}
  
- '''(ex. 2)''' Theta Join : ▷◁,,C,,
- [[BR]]~-''movieStars_table▷◁,,actor < year,,movieLog_table''-~
+ ||Row Key ||<-8>Column Families ||
+ ||<rowbgcolor="#ececec">starName ||<-2> biography ||<-2>filmography 
||<-2>gender ||<-2> birthDate ||
+ ||Ewan Gordon Mc.Gregor ||biography: ||blah blah ||filmography:Star Wars 
||starring ||gender: ||male ||birthDate: ||March 31, 1971 ||
+ || || || ||filmography:Emma ||extra || || || || ||
+ ||Kenan Thompson ||biography: ||blah blah ||filmography:Mighty Ducks 
||starring ||gender: ||male ||birthDate: ||May 10, 1978 ||
+ || || || ||filmography:Big Fat Liar  ||cameo || || || || ||
+ ||Keanu Reeves ||biography: ||blah blah ||filmography:Constantine ||starring 
||gender: ||male ||birthDate: ||September 2, 1964||
+ || || || ||filmography:The Matrix Reloaded ||starring || || || || ||
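
The listing above can be read as a map from row key to 'columnfamily:column_key' to value. The Python sketch below models two of the rows that way and mimics the optional WHERE clauses of SELECT. The names `movie_star_table` and `select` are hypothetical, and timestamps are left out for brevity even though Hbase also locates data by timestamp.

```python
# Two rows copied from the movieStar_table listing; timestamps omitted.
movie_star_table = {
    "Ewan Gordon Mc.Gregor": {
        "filmography:Star Wars": "starring",
        "filmography:Emma": "extra",
        "gender:": "male",
        "birthDate:": "March 31, 1971",
    },
    "Kenan Thompson": {
        "filmography:Mighty Ducks": "starring",
        "filmography:Big Fat Liar": "cameo",
        "gender:": "male",
        "birthDate:": "May 10, 1978",
    },
}


def select(table, row=None, column=None):
    """Mimic SELECT table [WHERE row=...] [AND column=...] on a nested dict."""
    result = {}
    for row_key, cells in table.items():
        if row is not None and row_key != row:
            continue  # row filter given and this row does not match
        matched = {name: value for name, value in cells.items()
                   if column is None or name == column}
        if matched:
            result[row_key] = matched
    return result


if __name__ == "__main__":
    print(select(movie_star_table, row="Kenan Thompson", column="gender:"))
```

Calling `select(movie_star_table)` with no filters returns every row, which corresponds to `SELECT movieStar_table;`.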
  
- {{{
- Hbase.altools > A = Table('movieStars_table'); 
- Hbase.altools > B = Table('movieLog_table');
- Hbase.altools > C = A.thetaJoin(B);
- 
- Hbase.altools > store C to table('result_table'); 
- }}}
- 
- === Matrix Arithmetic Operators ===
- ||<bgcolor="#E5E5E5">'''Operator''' ||<bgcolor="#E5E5E5">'''Explanation''' ||
- ||Addition ||<99%>'''Adding''' entries with the same indices. 
[[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B = 
Matrix('m_table','cf_2');[[BR]]C = A + B; '''// c,,ij,, = a,,ij,, + b,,ij,, (i 
: row key, j : column key)''' ''-~ ||
- ||Subtraction ||<99%>'''Subtracting''' entries with the same 
indices.[[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B = 
Matrix('m_table','cf_2');[[BR]]C = A - B; '''// c,,ij,, = a,,ij,, - b,,ij,, (i 
: row key, j : column key)''' ''-~ ||
- ||Multiplication ||<99%>'''Multiplication''' of two matrices, Product C of 
two matrices A and B.[[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B = 
Matrix('m_table','cf_2');[[BR]]C = A * B; '''//C = A · B''' ''-~ ||
- ||Division ||<99%>'''Division''' is solving the matrix equation AX = B for 
X.[[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B = 
Matrix('m_table','cf_2');[[BR]]C = A /[or \] B; '''// C = A / B''' ''-~||
- ||Transpose ||<99%>'''Transpose''' of a Matrix, A matrix which is formed by 
turning all the rows of a given matrix into columns and 
vice-versa.[[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B = Transpose(A); 
'''// B = A'''' ''-~||
- 
- '''(ex. 1)''' The product C of two matrices A and B
- [[BR]]~-''C,,ij,, = ΣA,,ik,,B,,kj,, (1 ≤ i ≤ m , 1 ≤ j ≤n)''-~
- 
- {{{
- Hbase.altools > A = Matrix('m_table','cf_1');
- Hbase.altools > B = Matrix('m_table','cf_2');
- Hbase.altools > C = A * B;  
- }}}
- 
- === Factorizations and Decompositions ===
- 
- ||<bgcolor="#E5E5E5">'''Function''' ||<bgcolor="#E5E5E5">'''Explanation''' ||
- ||LU ||<99%>'''LU Decomposition'''[[BR]]A procedure for decomposing an N by N 
matrix A into a product of a lower triangular matrix L and an upper triangular 
matrix U, LU = A.[[BR]]'''Functions''' : ~-''getL(), getU(), isSingular(), 
getPivot()''-~ [[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B = 
LUDecomposition(A);[[BR]]C = getU(B);[[BR]]D = getL(A);''-~||
- ||QR ||<99%>'''QR Decomposition'''[[BR]]For an m-by-n matrix A with m >= n, 
the QR decomposition is an m-by-n orthogonal matrix Q and an n-by-n upper 
triangular matrix R so that A = Q*R.[[BR]]'''Functions''' : ~-''getH(), getQ(), 
getR()''-~[[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B = 
QRDecomposition(A);[[BR]]C = getH(B);''-~||
- ||Cholesky ||<99%>'''Cholesky Decomposition'''[[BR]]It is a special case of 
LU decomposition applicable only if matrix to be decomposed is symmetric 
positive definite.[[BR]]'''Functions''' : ~-''getL(), isSPD()''-~ 
[[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B = 
CholeskyDecomposition(A);[[BR]]C = getL(A);''-~||
- ||SVD ||<99%>'''SV(Singular Value) Decomposition'''[[BR]]For an m-by-n matrix 
A with m >= n, the singular value decomposition is an m-by-n orthogonal matrix 
U, an n-by-n diagonal matrix S, and an n-by-n orthogonal matrix V so that A = 
U*S*V'.[[BR]]'''Functions''' : ~-''getS(), getU(), getV(), 
getSingularValues()''-~ [[BR]][[BR]]~-''A = Matrix('m_table','cf_1');[[BR]]B = 
SVDecomposition(A);[[BR]]C = getU(B);''-~||
- 
- '''(ex. 1)''' To find the Singular Value decomposition in Altools, do the 
following:
- [[BR]]~-''M = UΣV*''-~
- 
- {{{
- Hbase.altools > M = Matrix('m_table','cf_1'); //Set up the matrix M from 
mapped matrix in hbase.
- Hbase.altools > U = M.getU();
- Hbase.altools > V = M.getV();
- }}}
- 
