Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for 
change notification.

The following page has been changed by stack:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseShell

The comment on the change is:
Split page in two -- current and future -- after chatting with Edward Yoon

------------------------------------------------------------------------------
-  * https://issues.apache.org/jira/browse/HADOOP-1375 (resolved)
-  * Work in progress
- 
  [[TableOfContents(4)]]
  
  ----
- = Hbase Shell Introduction =
+ = HBase Shell Introduction =
- Hbase Shell is an 'interpreter' (or 'shell)' to provide scalable data processing capabilities like [[BR]]aggregation, algebraic calculation on Hadoop + Hbase.
+ HBase Shell is a basic, interactive command-line 'shell' for manipulating tables in HBase.  It supports a small set of SQL-inspired operations and presents results in an ASCII-table format.
  
+ The HBase Shell aims to be to HBase what the mysql client command-line tool 
is to mysqld and sqlplus is to Oracle.
- == Hbase Shell Goals ==
- HBase Shell is developed to achieve the following goals.
  
+ HBase Shell was first added to TRUNK in July, 2007.
-  * A Simplified Import/Export/Migrate Functionality Between different data 
sources (Hadoop, HBase)
-  * A Simplified processing of a logical data model
-  * A Simplified algebraic operations
-  * A Simplified Parallel Numerical Analysis by abstracting/numericalizing 
points, lines, [[BR]]or plane data across multiple maps in HBase.
- 
- == Background ==
- 
- I expect Hadoop + Hbase to handle sparsity and data explosion very well in 
near future. [[BR]]Moreover, i believe the design of the multi-dimensional 
structure and the 3-dim space model of the data are [[BR]]optimized for rapid 
ad-hoc information retrieval in any orientation, as well as for fast, flexible 
calculation and transformation of [[BR]]raw data based on formulaic 
relationships.
- 
- Then, I thought it would require a more user-friendly interface to enable 
querying the data interactive.
- 
- == Rationale ==
- 
- It will probably take a while for Hadoop + HBase to provide reliable 
real-time service like other DBMS. 
- [[BR]]Thus, I decided to develop a shell to process linear algebraic 
computing 
- [[BR]]and large scale data using Hadoop's parallel processing and HBase 
storage. 
- 
- ''Then you may ask "What is a difference from MapReduce using MapFiles?"''
- 
- I don't expect it to give us a high-performance just yet, 
- [[BR]]but it will sure make data management and development much easier. 
- [[BR]]First, let's take a look at HBase's data model. 
- 
- HBase provides a unified data model and it represents a data in 3-dimensional 
- [[BR]]- Row, Column, and TImestamp. Also, Row and Column may be extended 
infinitely. 
-   
- If we decide to cut the data model in time version, then we may view the new 
data as a 2D table. 
- [[BR]]If index is in string, we may view it as a huge map. If index is in 
integer, then it is one huge 2D array. 
- 
- So each table may have such data storages in 3D (ColumnFamilies)
- [[BR]]Locality Group(Columnfamilies) is a relationship that can occur between 
multiple references 
- [[BR]]whenever one reference brings in much of the data used by the other 
references.
- 
-   ''-- I hope physical files on networks are grouped together with locality 
grouping.[[BR]]by [:udanax:udanax].''
  
  == People Involved ==
  
-  * [:udanax:Edward Yoon] [EMAIL PROTECTED] (NHN corp.)
+  * [:udanax:Edward Yoon] [[MailTo(udanax AT SPAMFREE nhncorp DOT com)]] (NHN 
corp.)
-  * [:boyo:Sewon Kim] [EMAIL PROTECTED] (Empas, Inc.)
-  * [:mskim:Minsu Kim] [EMAIL PROTECTED] (Daum, Inc.)
  
  ----
- = Hbase Shell Client Syntax Definition =
+ = How to Start a Shell =
+ Run the following on the command-line:
+ 
+ {{{${HBASE_HOME}/bin/hbase shell}}}
+ 
+ You will be presented with the following prompt:
+ 
+ {{{HBase Shell, 0.0.1 version.
+ Copyright (c) 2007 by udanax, licensed to Apache Software Foundation.
+ Type 'help;' for usage.
+ 
+ HBase >}}}
+ 
+ All commands are terminated with a semicolon; for example, type 'help;' to see the list of available commands.
+ 
+ = HBase Shell Commands =
  '''Note''' that data must be located by row, column, and timestamp.
  
- 
- == Commands ==
  ||<bgcolor="#ececec">'''Command''' ||<bgcolor="#ececec">'''Explanation''' ||
  ||Help ||<99%>'''Help''' command provides information about the use of shell 
script.[[BR]][[BR]]~-''HELP [function_name];''-~ ||
- ||Show ||<99%>'''Show''' command will list the tables.[[BR]][[BR]]~-''SHOW 
tables;''-~ ||
+ ||Show ||<99%>'''Show''' command lists tables.[[BR]][[BR]]~-''SHOW 
tables;''-~ ||
- ||Describe ||'''Describe''' command will provides information about the 
columnfamilies in a table.[[BR]][[BR]]~-''DESC table_name;''-~ ||
+ ||Describe ||'''Describe''' command provides information about the 
columnfamilies in a table.[[BR]][[BR]]~-''DESC table_name;''-~ ||
- ||Create ||'''Create''' command will create a new 
table.[[BR]][[BR]]~-''CREATE 
table_name[[BR]]COLUMNFAMILIES('columnfamily_name1'[, 'columnfamily_name2', 
...])[[BR]][LIMIT=limitNumber_of_Version];''-~ ||
+ ||Create ||'''Create''' command creates a new table.[[BR]][[BR]]~-''CREATE 
table_name[[BR]]COLUMNFAMILIES('columnfamily_name1'[, 'columnfamily_name2', 
...])[[BR]][LIMIT=limitNumber_of_Version];''-~ ||
- ||Drop ||'''Drop''' command will droping columnfamilies in a table or 
tables.[[BR]][[BR]]~-''DROP table_name1[, table_name2, ...] or 
columnfamily_name1[, columnfamily_name2, ...];''-~ ||
+ ||Drop ||'''Drop''' command drops columnfamilies in a table or 
tables.[[BR]][[BR]]~-''DROP table_name1[, table_name2, ...] or 
columnfamily_name1[, columnfamily_name2, ...];''-~ ||
- ||Substitute || '''Substitute''' expression to [A~Z][[BR]][[BR]]~-''X = 
Matrix(table_name, columnfamily_name);''-~||
- ||Store ||'''STORE''' command will store results to specified table. 
[[BR]][[BR]]~-''A = Table('movieLog_table'); [[BR]]B = A.Selection('length' > 
100); [[BR]]STORE B TO X run_style;''-~ ||
  ||Exit ||<99%>'''Exit''' from the current shell 
script.[[BR]][[BR]]~-''EXIT;''-~ ||
  The following commands manipulate individual rows and cells directly.
  ||<bgcolor="#ececec">'''Command''' ||<bgcolor="#ececec">'''Explanation''' ||
- ||Insert ||<99%>'''Insert''' command will insert one row into the table with 
a value for specified column in the table.[[BR]][[BR]]~-''INSERT table_name 
('columnfamily_name1:column_key'[, 'columnfamily_name2:column_key', ...])[[BR]] 
VALUESVALUES ('entry1'[, 'entry2', ...])[[BR]]WHERE row='row_key';''-~ ||
+ ||Insert ||<99%>'''Insert''' command inserts one row into the table with a value for each specified column.[[BR]][[BR]]~-''INSERT table_name ('columnfamily_name1:column_key'[, 'columnfamily_name2:column_key', ...])[[BR]] VALUES ('entry1'[, 'entry2', ...])[[BR]]WHERE row='row_key';''-~ ||
- ||Set ||'''SET''' command will change the values. [[BR]][[BR]]~-''SET 
table_name[[BR]] VALUES('columnfamily_name:column_key','entry')[[BR]]WHERE 
row='row_key' AND time='Specified_Timestamp';''-~ ||
- ||Delete ||'''Delete''' command will delete specified rows in table. 
[[BR]][[BR]]~-''DELETE table_name[[BR]]WHERE row='row_key'[[BR]][AND 
column='columnfamily_name:column_key'];''-~ ||
+ ||Delete ||'''Delete''' command deletes specified rows in table. 
[[BR]][[BR]]~-''DELETE table_name[[BR]]WHERE row='row_key'[[BR]][AND 
column='columnfamily_name:column_key'];''-~ ||
- ||Select ||<99%>'''Select''' command will retrieves rows from a 
table.[[BR]][[BR]]~-''SELECT table_name[[BR]][WHERE row='row_key'][[BR]][AND 
column='columnfamily_name:column_key'];[[BR]][AND 
time='Specified_Timestamp'];[[BR]][LIMIT=Number_of_Version];''-~ ||
+ ||Select ||<99%>'''Select''' command retrieves rows from a table.[[BR]][[BR]]~-''SELECT table_name[[BR]][WHERE row='row_key'][[BR]][AND column='columnfamily_name:column_key'][[BR]][AND time='Specified_Timestamp'][[BR]][LIMIT=Number_of_Version];''-~ ||
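
The INSERT and SELECT forms in the table above can be illustrated with a small Python sketch that only assembles statement strings; it does not talk to HBase. The helper names (`quote`, `build_insert`, `build_select`) are hypothetical and not part of HBase Shell. The sketch assumes each statement ends with a single terminating semicolon and that values contain no single quotes, since this page does not describe an escaping rule.

```python
def quote(value):
    """Wrap a value in single quotes, as the syntax table does.

    Assumption: no escaping is applied, because the page does not
    say how a single quote inside a value should be written.
    """
    return "'" + str(value) + "'"


def build_insert(table, row_key, cells):
    """Compose an INSERT statement for one row.

    cells maps 'columnfamily_name:column_key' strings to entries;
    columns and values are emitted in the mapping's own order.
    """
    columns = ", ".join(quote(name) for name in cells)
    values = ", ".join(quote(entry) for entry in cells.values())
    return (
        f"INSERT {table} ({columns}) "
        f"VALUES ({values}) "
        f"WHERE row={quote(row_key)};"
    )


def build_select(table, row_key=None, column=None):
    """Compose a SELECT statement with the optional WHERE/AND parts."""
    if column is not None and row_key is None:
        # The documented form only shows column= after a row= clause.
        raise ValueError("column requires row_key")
    statement = f"SELECT {table}"
    if row_key is not None:
        statement += f" WHERE row={quote(row_key)}"
    if column is not None:
        statement += f" AND column={quote(column)}"
    return statement + ";"


if __name__ == "__main__":
    print(build_insert("movieLog_table", "Star Wars",
                       {"year:": "1977", "length:": "124"}))
    print(build_select("movieLog_table", "Star Wars", "year:"))
```

The printed statements could then be pasted at the `HBase >` prompt; the time and LIMIT clauses are left out to keep the sketch short.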
- 
- == Relational Operators ==
- 
- ||<bgcolor="#ececec">'''Operator''' ||<bgcolor="#ececec">'''Explanation''' ||
- ||Projection ||<99%>'''Projection''' of a relation ~+R+~, It makes a new 
relation as the set that is obtained when all tuples(rows) in ~+R+~ are 
restricted to the set 
{columnfamily,,1,,,...,columnfamily,,n,,}.[[BR]][[BR]]~-''A = 
Table('movieLog_table');[[BR]]B = A.Projection('year','length');''-~||
- ||Selection ||<99%>'''Selection''' of a relation ~+R+~, It makes a new 
relation as the set of specified tuples(rows) of the relation ~+R+~[[BR]]'''Set 
Operations''' : ~-''OR, AND, NOT''-~[[BR]][[BR]]~-''A = 
Table('movieLog_table');[[BR]]B = A.Selection('length' > 100);[[BR]]C = 
A.Selection('length' > 100 AND 'year' > 1979);''-~||
- ||Product ||<99%>'''Product''' of relations R and S, It makes a new relation 
as the set of all possible combinations of tuples of the two operation 
relations.[[BR]]'''NOTE''' that this is the most computationally expensive 
operator in the relational algebra.||
- ||Rename ||<99%>'''Rename''' r to x, The columnfamily names in the 
columnfamily-list replace the columnfamily names of the 
relation.[[BR]][[BR]]~-''A = Table('movieLog_table');[[BR]]B = 
A.Rename('length' = 'movieLength');''-~||
- ||Group ||<99%>'''Group''' tuples by value of an attribute and apply 
aggregate function independently to each group of tuples.[[BR]]'''Aggregate 
Functions''' : ~-''AVG( attribute ), SUM( attribute ), COUNT( attribute ), MIN( 
attribute ), MAX( attribute )''-~[[BR]][[BR]]~-''A = 
Table('movieLog_table);[[BR]]B = A.Group('studioName', MIN('year'));''-~||
- ||Sort ||<99%>'''Sort''' of tuples(rows) of R, ordered according to 
columnfamilies on columnfamily-list[[BR]][[BR]]~-''A = 
Table('movieLog_table');[[BR]]B = A.Sort('length', 'vote');''-~||
- 
- == Matrix Operators ==
- 
- 
- * matrix operator
- 
- ||<bgcolor="#ececec">'''Operator''' ||<bgcolor="#ececec">'''Explanation''' ||
- ||Addition ||<99%>... ||
- ||subtraction ||<99%>... ||
- ||multiplication ||<99%>... ||
- ||division ||<99%>... ||
- ||transpose ||<99%>interchanging rows and columns ||
- ||permutation ||<99%>... ||
- ||norms ||<99%>... ||
- 
- * decompositions
- 
- ||<bgcolor="#ececec">'''Operator''' ||<bgcolor="#ececec">'''Explanation''' ||
- ||LU ||<99%>... ||
- ||QR ||<99%>... ||
- ||Cholesky ||<99%>... ||
- ||SVD ||<99%>... ||
- ||Inverse ||<99%>interchanging rows and columns ||
- ||Pseudoinverse ||<99%>... ||
- ||Condition ||<99%>... ||
- ||Determinant ||<99%>... ||
- ||Rank ||<99%>... ||
- 
  
  ----
  = Example Of Hbase Shell Use =
- == Basic Usage ==
- 
- === Create the table in a HBase ===
+ == Create a table in HBase ==
  
  ~-''CREATE movieLog_table
  [[BR]]COLUMNFAMILIES('year', 'length', 'inColor', 'studioName', 'vote', 
'producer')
@@ -127, +58 @@

  [[BR]]COLUMNFAMILIES('biography', 'filmography', 'gender', 'birthDate')
  [[BR]]LIMIT=1;''-~ 
  
- === Insert data into a table ===
+ == Insert data into a table ==
  ~-''INSERT movieLog_table ('year:', 'length:', 'inColor:', 'studioName:', 
'vote:user_1', 'producer')
  [[BR]]VALUES ('1977', '124', 'true', 'Fox', '5', 'George Lucas')
  [[BR]]WHERE row='Star Wars';''-~ 
@@ -138, +69 @@

  [[BR]]WHERE row='Ewan Gordon Mc.Gregor';''-~ 
  
  
- === Show all data in a table ===
+ == Show all data in a table ==
  ~-''SELECT movieLog_table;''-~ 
  
  ||Row Key ||<-12>Column Families ||
@@ -161, +92 @@

  ||keanu reeves ||biography: ||blah~ ||filmography:Constantine ||starring 
||gender: ||male ||birthDate: ||September 2, 1964||
  || || || ||filmography:The Matrix Reloaded ||starring || || || || ||
  
- == Relation Operations ==
+ = HBase Shell Plans =
+ The intent is to add more support for non-interactive usage, as well as operators to support algebraic, relational, and matrix manipulations. See the [wiki:Hbase/ShellPlans  ShellPlans] page for discussion and description of future operators.
  
- === Projection ===
- 
- ~-''A = Table('movieLog_table');
- [[BR]]B = A.Projection('year','length');''-~
- 
- '''~+^π^+~'''~-title-~,~-year-~,~-length-~'''~+^(movieLog_table)^+~'''
- 
- ||<rowbgcolor="#ececec">title ||year ||length ||
- ||Star Wars ||1977 ||124 ||
- ||Mighty Ducks ||1991 ||104 ||
- ||Wayne's World ||1992 ||95 ||
- 
- 
- 
- === Selection ===
- 
- ~-''A = Table('movieLog_table');
- [[BR]]B = A.Selection('length' > 100);''-~
- 
- '''~+^σ^+~'''~-length>100-~'''~+^(movieLog_table)^+~'''
- 
- ||<rowbgcolor="#ececec">title ||year ||length ||inColor ||studioName 
||producer ||
- ||Star Wars ||1977 ||124 ||true ||Fox ||12345 ||
- ||Mighty Ducks ||1991 ||104 ||true ||Disney ||67890 ||
- 
- === Renaming ===
- 
- '''~+^ρS^+~'''~-columnfamily-list-~'''~+^(movieLog_table)^+~'''
- 
- === Groupping ===
- 
- '''~+^γ^+~'''~-columnfamily-list-~'''~+^(R)^+~'''
- 
- === Sorting ===
- 
- '''~+^τ^+~'''~-columnfamily-list-~'''~+^(R)^+~'''
- 
- === Example ===
- 
- ~-''A = Table('movieLog_table');
- [[BR]]B = A.Selection(length > 100 AND studioName = 'Fox');
- [[BR]]C = B.Projection('year');''-~
- 
- 
'''~+^π^+~'''~-title-~,~-year-~'''~+^(σ^+~'''~-length>100-~'''~+^(movieLog_table)∩σ^+~'''~-studioName='Fox'-~'''~+^(movieLog_table))^+~'''
- 
- ||<rowbgcolor="#ececec">title ||year ||
- ||Star Wars ||1977 ||
- 
- == Matrix Operations ==
- 
- Lets construct a abstract sparse row-by-column Map Matrix, orientation is row 
major.
- 
- ~-''A = doubleMatrix('movieLog_table','vote');''-~
- 
- ||<rowbgcolor="#ececec"> ||user_1 ||user_2 ||user_3 ||
- ||<bgcolor="#ececec">Star Wars || 5.0 || 2.0 ||   ||
- ||<bgcolor="#ececec">Mighty Ducks || 2.0 ||   || 4.0 ||
- ||<bgcolor="#ececec">Wayne's World ||   || 3.0 || 4.0 ||
- 
- ----
- = Matrix Extension Example On Hbase Shell =
- == Latent Semantic Analysis By Singular Value Decomposition ==
- '''Motivation'''
- Lexical matching at term level inaccurate (claimed)
- 
-   * Polysemy - words with number of ‘meanings’ - term matching returns 
irrelevant documents - impacts precision
-   * Synonomy - number of words with same ‘meaning’ - term matching misses 
relevant documents - impacts recall
- 
- LSA assumes that there exists a LATENT structure in word usage - obscured by 
variability in word choice 
- [[BR]]Analogous to signal + additive noise model in signal processing
- 
- 
- 
- == Scalable Collaborative Filtering With A Large User-By-Item Matrix ==
- 
- I will follow (Google Recommendation System) algorithms.
- 
- [http://www2007.org/papers/paper570.pdf]
- 
- 
- == Consistency Assessment Of Topological Relationship By Matrix-Union ==
- ..
- 
- ----
- = Performance Reports =
- ..
- 
