Dear Wiki user, You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.
The following page has been changed by udanax: http://wiki.apache.org/lucene-hadoop/Hbase/HbaseShell ------------------------------------------------------------------------------ ---- = Hbase Shell Introduction = - Hbase Shell is a query language shell for manipulating databases in Hadoop + Hbase. - [[BR]][:Hbase/HbaseShell/ShellPlans: Shell Plans] page for discussion and description of future operators. - == People Involved == + == Initial Contributor == * [:udanax:Edward Yoon] (R&D center, NHN corp.) -- Initial Contributor - * [:InchulSong: Inchul Song] (Database Lab, KAIST) - * [:JaesunHan:Jaesun Han] (NexR corp.) - * [:YonghoHa:Yongho Ha] (R&D center, TMAXSoft) ---- = How to Start a Shell = @@ -31, +26 @@ hql >}}} All commands are terminated with a semi-colon: e.g. Type 'help;' to see list of available commands. - - == Hbase Shell Configuration == - - ||<bgcolor="#ececec">'''Property''' ||<bgcolor="#ececec">'''Description''' || - ||hbaseshell.jline.bell.enabled ||<99%> If true, enable audible keyboard bells if an alert is required.|| - ||hbaseshell.formatter||<99%> Name of formatter class to use drawing result set tables. Default: org.apache.hadoop.hbase.shell.formatter.AsciiTableFormatter|| - - - == Application Access to Hbase Shell Server == - * [:Hbase/HbaseShell/PHP:PHP HQL] Connect. - * [http://shell.hadoop.co.kr/PHPClient.php HQL query test site] on 2 node cluster Hbase = Hbase Query Language Commands = @@ -91, +75 @@ ||JOIN ||<99%>Table '''JOIN''' operations, linking and extracting data from two different internal source.[[BR]][[BR]]~-''R = Table('table_name1');[[BR]]S = Table('table_name2');[[BR]]C = R.Join(condition_expression) and S;''-~ || ||Group ||<99%>'''Group''' tuples by value of an attribute and apply aggregate function independently to each group of tuples.[[BR]]'''Aggregate Functions''' : ~-''AVG(attribute), SUM(attribute), COUNT(attribute), MIN(attribute), MAX(attribute)''-~[[BR]][[BR]]~-''A = Table('table_name');[[BR]]B = A.Group(column-list);''-~ || - ---- - = Examples = - == Example Of Basic Query Command Uses == - === Create the table in a Hbase === - - {{{ - hql > help create; - CREATE Create tables - - Syntax: - CREATE TABLE table_name - column_family_definition [, column_family_definition] ... - - column_family_definition: - column_family_name - [MAX_VERSIONS=n] - [MAX_LENGTH=n] - [COMPRESSION=NONE|RECORD|BLOCK] - [IN_MEMORY] - [BLOOMFILTER=NONE|BLOOMFILTER|COUNTING_BLOOMFILTER|RETOUCHED_BLOOMFILTER VECTOR_SIZE=n NUM_HASH=n] - }}} - - '''CREATE TABLE''' enables you to create a new table with various options for each column family. - - * ~-'''MAX_VERSIONS''' makes a table keep only the recent n versions in a cell. Its default value is 3, i.e., only the most recent 3 versions of values are stored in a cell.-~ - * ~-'''MAX_LENGTH''' specifies the maximum size of a column value. By default, the maximum size is unlimited. It is limited only by Integer.MAX_VALUE.-~ - * ~-'''COMPRESSION''' specifies which compression technique to use for the column family. By default, Hbase does not compress any data.-~ - * ~-'''IN_MEMORY''' specifies whether to keep the values in the column family in-memory, or not. By default, values are not kept in-memory.-~ - * ~-'''BLOOMFILTER''' specifies which bloom filter to use for the column family. By default, none of the bloom filters is used. You can specify the options in two ways: with '''VECTOR_SIZE''' and '''NUM_HASH''', or with '''NUM_ENTRIES'''. '''VECTOR_SIZE''' specifies the number of elements in the vector, and '''NUM_HASH''' specifies the number of hash functions to use; With '''NUM_ENTRIES''', you specify only the approximated number of entries in the column, and '''VECTOR_SIZE''' and '''NUM_HASH''' are automatically determined.-~ - - {{{ - hql > CREATE TABLE movieLog_table ( - --> year, length, inColor, studioName, vote, producer, actor) - --> NUM_VERSIONS 10; - - hql > CREATE TABLE webtable ( - --> contents in_memory max_versions=10 compression=block, - --> anchor max_length=256 bloomfilter=counting_bloomfilter - --> vector_size=1000000 num_hash=4); - }}} - - === Select data from a table === - {{{ - hql > help select; - SELECT Select values from tables - - Syntax: - SELECT { column_name, [, column_name] ... | *} FROM table_name - [WHERE row='row_key' | STARTING FROM 'row-key'] - [NUM_VERSIONS = version_count] - [TIMESTAMP 'timestamp'] - [LIMIT = row_count] - [INTO FILE 'file_name'] - }}} - - '''SELECT''' retrieves a subset of data from the specified table. - - * ~-'''STARTING FROM''' returns all the rows starting from 'row-key'.-~ - * ~-'''NUM_VERSIONS''' retrieves only the recent n versions of values in a cell.-~ - * ~-'''TIMESTAMP''' returns only the values with the specified timestamp.-~ - * ~-'''LIMIT''' limits the number of rows to be returned.-~ - - {{{ - hql > SELECT studioName: FROM movieLog_table WHERE row = 'Star Wars'; - - +---------------------------------------------+ - | title studioName | - | =========================================== | - | Star Wars Fox Entertainment Group, Inc. | - +---------------------------------------------+ - - Successfully print out the selected data.(0.05 sec) - - hql > SELECT 'studioName:YoungGu Art, Corp.' FROM movieLog_table WHERE row = 'D-War'; - }}} - - === Insert data into a table === - {{{ - hql > help insert; - INSERT Insert values into tables - - Syntax: - INSERT INTO table_name - (colmn_name, ...) VALUES ('value', ...) - WHERE row='row_key' - - column_name: - column_family_name - | column_family_name:column_label_name - }}} - - '''INSERT''' inserts a set of values into a table. - - * ~-If the specified column already exists, the new value is stored as a new version.-~ - * ~-If '''TIMESTAMP''' is not specified for the value, the current time is used as its timestamp.-~ - - {{{ - hql > INSERT INTO movieLog_table (year:, length:, inColor:, studioName:, 'vote:user name', producer:, 'actor:hero') - --> VALUES ('1977', '124', 'true', 'Fox', '5', 'George Lucas', 'Mark Hamill') - --> WHERE row='Star Wars'; - }}} - - === Delete data in a table === - {{{ - hql > help delete; - DELETE Delete table data - - Syntax: - DELETE { column_name, [, column_name] ... | COLUMNFAMILIES(column_family[, column_family] ... | *} - FROM table_name - [WHERE row = 'row-key']; - }}} - - * Asterisk (*) will be delete the all in table. - * COLUMNFAMILIES(column_family list) will be delete the all columns in columnfamily. - * Specified column_name will be delete the specified column data. - - {{{ - hql > DELETE actor:hero FROM movieLog_table; - hql > DELETE actor:hero FROM movieLog_table WHERE row='Star Wars'; - hql > DELETE * FROM movieLog_table; - }}} - - === How to use external jars in Hbase Shell === - - {{{ - hql > HELP JAR; - JAR jarFile [mainClass] args...; - ... - - hql > JAR ./build/hadoop-examples.jar pi 10 10; - }}} - - == Example Of Relational Algebraic Query Command Uses == - - === Relational projection === - - * selects a subset of the columnfamilies of a relation - * Result = Ï ,,column_list,, (Relation) - {{{ - Relation - +--------------------------------------------------------------------+ - | title year length inColor studioName | - | ================================================================== | - | Star Wars 1977 124 true Fox Entertainment Group, Inc. | - | Mighty Ducks 1991 104 true Paramount Pictures Corp. | - +--------------------------------------------------------------------+ - - hql > Result = Relation.Projection('year','legnth'); - hql > save Result into table('result'); - - Result - +------------------------------+ - | title year length | - | ============================ | - | Star Wars 1977 124 | - | Mighty Ducks 1991 104 | - +------------------------------+ - }}} - === Relational selection === - * selects a subset of the rows in a relation that satisfy a selection condition - * Result = Ï ,,selection_condition,, (Relation) - {{{ - Relation - +--------------------------------------------------------------------+ - | title year length inColor studioName | - | ================================================================== | - | Star Wars 1977 124 true Fox Entertainment Group, Inc. | - | Mighty Ducks 1991 104 true Paramount Pictures Corp. | - +--------------------------------------------------------------------+ - - hql > Result = Relation.Selection(length > 100 and studioName = 'Fox Entertainment Group, Inc.'); - or Result = Relation.Selection(length > 100 - and studioName NOT IN ('Fox Entertainment Group, Inc.', 'Paramount Pictures Corp.')); - hql > save Result into table('result'); - - Result - +--------------------------------------------------------------------+ - | title year length inColor studioName | - | ================================================================== | - | Star Wars 1977 124 true Fox Entertainment Group, Inc. | - +--------------------------------------------------------------------+ - }}} - - === Relational group === - * Aggregation functions on collections of data values: average, minimum, maximum, sum, count. - * Group rows by value of an columnfamily and apply aggregate function independently to each group of rows. - * <Grouping columnfamilies> Æ ,,function_list,, (Relation) - - {{{ - Hbase > Group Relation by (studioName, SUM('vote:user')); - }}} - - === Relational join === - * The join of two relations R1(A,,1,, ,A,,2,, ,...,A,,n,,) and R2(B,,1,, ,B,,2,, ,...,B,,m,,) is a relation with degree k=n+m and attributes (A,,1,, ,A,,2,, ,...,A,,n,, , B,,1,, ,B,,2,, ,...,B,,m,,) that satisfy the join condition - * Result = R1 â·â ,,θ join_condition,, R2 - {{{ - R1 - +----------------------------------------------------------------------------------------------+ - | title producer actor studioName | - | ============================================================================================ | - | Star Wars George Lucas Fox Entertainment Group, Inc. | - | Mighty Ducks Blair Peters actor:hero <Charles Adler> Paramount Pictures Corp. | - | actor:cameo <Someone> | - | Bodyguard Mick Jackson actor:hero <Kevin Costner> Warner Home Video Inc. | - | actor:heroine <Whitney Houston> | - +----------------------------------------------------------------------------------------------+ - - R2 - +---------------------------------------------------------+ - | name occupation birth knownFor | - | ======================================================= | - | Whitney Houston Singer Aug 9, 1963 Diva | - +---------------------------------------------------------+ - - hql > R1 = table('movieLog_table'); - hql > R2 = table('personInfo_table'); - hql > Result = R1.join(R1.producer: = R2.ROW) and R2; - or Result = R1.join(R1.actor:hero = R2.Row) and R2; - or Result = R1.join(R1.actor:heroine = R2.Row and R1.studioName = 'Warner Home Video' and R2.occupation = 'Singer') and R2; - - Result - +--------------------------------------------------------------------------------------------------------------------+ - | title producer actor studioName occupation birth knownFor | - | ================================================================================================================== | - | Bodyguard Mick Jackson actor:heroine <Whitney Houston> Warner Home Video Inc. Singer Aug 9, 1963 Diva | - +--------------------------------------------------------------------------------------------------------------------+ - }}} - - * If we don't have an index for a domain in the join, we can still improve on the nested-loop join using sort join. - - {{{ - R1 - +----------------------------------------------------------------------------------------------+ - | title producer actor studioName | - | ============================================================================================ | - | Star Wars George Lucas Fox Entertainment Group, Inc. | - | Mighty Ducks Blair Peters actor:hero <Charles Adler> Paramount Pictures Corp. | - | actor:cameo <Someone> | - | The Bodyguard Mick Jackson actor:hero <Kevin Costner> Warner Home Video Inc. | - | actor:heroine <Whitney Houston> | - +----------------------------------------------------------------------------------------------+ - - R2 - +--------------------------------------------------------------------------------------------------+ - | Symbol corporation Market Management | - | ================================================================================================ | - | CSCO Cisco Systems, inc. Market:NASDAQ Management:CEO <John T. Chambers> | - | Fox Fox Entertainment Group, Inc. Market:NYSE Management:President <Peter F. Chernin> | - | Management:SEVP <David F. DeVoe> | - | ... | - +--------------------------------------------------------------------------------------------------+ - - hql > R1 = table('movieLog_table'); - hql > R2 = table('stockCompany_info'); - hql > result = R1.join(R1.studioName = R2.corporation) and R2; - }}} - - === Query mixture examples === - - '''(ex. 1)''' Search the subject(row) and the year of the movies which were produced by 'Fox' company and where running time is more than 100 minutes. - [[BR]]~-''Ï ,,year,, (Ï ,,length > 100,, (movieLog_table) â© Ï ,,studioName = 'Fox',, (movieLog_table))''-~ - - {{{ - Relation - +--------------------------------------------------------------------+ - | title year length inColor studioName | - | ================================================================== | - | Star Wars 1977 124 true Fox Entertainment Group, Inc. | - | Mighty Ducks 1991 104 true Paramount Pictures Corp. | - +--------------------------------------------------------------------+ - - hql > Proj = Relation.Projection('year'); - hql > Result = Proj.Selection(length > 100 AND studioName = 'Fox Entertainment Group, Inc.'); - hql > save Result into table('result'); - - Result - +----------------------+ - | title(row) year | - | ==================== | - | Star Wars 1977 | - +----------------------+ - }}} - - ---- - = Comments = - - Please add comments related to the project below. -