[ https://issues.apache.org/jira/browse/HIVE-1634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-1634:
------------------------------

    Attachment: HIVE-1634.D1581.1.patch

ashutoshc requested code review of "HIVE-1634 [jira] Allow access to Primitive 
types stored in binary format in HBase".
Reviewers: JIRA

  https://issues.apache.org/jira/browse/HIVE-1634

  Rebased the patch to the trunk. This patch adds binary storage support for 
HBase tables. That means that if you have existing HBase tables (that is, 
tables not written through Hive), you can now query them using the 
hbase-handler. Without this patch, you can only read HBase tables that were 
stored through Hive.

  Test Plan:
  3 new .q files containing the new tests:
  hbase_binary_external_table_queries.q
  hbase_binary_map_queries.q
  hbase_binary_storage_queries.q

  This addresses HIVE-1245 in part, for atomic or primitive types.

  The serde property "hbase.columns.storage.types" = ",b,b,b,b,b,b,b,b" is a 
specification of the storage option for the corresponding column in the serde 
property "hbase.columns.mapping". Allowed values are '' for table default, 's' 
for standard string storage, and 'b' for binary storage as would be obtained 
from o.a.h.hbase.util.Bytes. Map types for HBase column families use a 
colon-separated pair such as 's:b' for the key and value part specifiers 
respectively. See the test cases and queries for HBase handler for additional 
examples.
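
  The per-column specification described above can be sketched as a small 
parser. This is only an illustration: the class and method names are 
hypothetical and are not taken from the patch, and because this message shows 
both '' and '-' for the table default, the sketch accepts either.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the patch: splits a storage-types
// specification into one entry per mapped column.
class StorageTypesSketch {

    // Expands a single specifier into a readable name.
    static String expand(String spec) {
        switch (spec) {
            case "":
            case "-":
                return "default";   // fall back to the table-level default
            case "s":
                return "string";    // standard string storage
            case "b":
                return "binary";    // binary storage
            default:
                throw new IllegalArgumentException("unknown storage specifier: " + spec);
        }
    }

    // Parses e.g. ",b,s:b" into ["default", "binary", "string:binary"].
    static List<String> parse(String storageTypes) {
        List<String> result = new ArrayList<>();
        // A limit of -1 keeps empty entries, including trailing ones.
        for (String column : storageTypes.split(",", -1)) {
            int colon = column.indexOf(':');
            if (colon < 0) {
                result.add(expand(column));
            } else {
                // Map-typed column family: key specifier, then value specifier.
                result.add(expand(column.substring(0, colon)) + ":"
                        + expand(column.substring(colon + 1)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse(",b,b,b,b,b,,b,b"));
        System.out.println(parse("-,s:b"));
    }
}
```

  Each entry lines up by position with the corresponding entry in 
"hbase.columns.mapping", so both properties should list the same number of 
columns.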

  There is also a table property "hbase.table.default.storage.type" = "string" 
to specify a table level default storage type. The other valid specification is 
"binary". The table level default is overridden by a column level specification.
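
  The override rule can be written down in a few lines. This is a sketch with 
hypothetical names, not code from the patch. It assumes that an unset table 
default behaves like "string", which is consistent with the examples in this 
message but is not stated outright.

```java
// Hypothetical helper, not taken from the patch: decides how one column is read.
class StorageDefaultSketch {

    // columnSpec: "", "-", "s" or "b" from "hbase.columns.storage.types".
    // tableDefault: value of "hbase.table.default.storage.type", or null if unset.
    static boolean isBinary(String columnSpec, String tableDefault) {
        if ("b".equals(columnSpec)) {
            return true;   // column-level choice wins
        }
        if ("s".equals(columnSpec)) {
            return false;  // column-level choice wins
        }
        // No column-level choice: use the table default.
        // Assumption: an unset table default behaves like "string".
        return "binary".equals(tableDefault);
    }

    public static void main(String[] args) {
        System.out.println(isBinary("b", "string")); // column overrides table
        System.out.println(isBinary("", "binary"));  // table default applies
        System.out.println(isBinary("", null));      // nothing set anywhere
    }
}
```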

  This control is available for the boolean, tinyint, smallint, int, bigint, 
float, and double primitive types. The attached patch also relaxes the mapping 
of map types to HBase column families to allow any primitive type to be the map 
key.

  Attached is a program for creating a table and populating it in HBase. The 
external table in Hive can access the data as shown in the example below.

  hive> create external table TestHiveHBaseExternalTable
      > (key string, c_bool boolean, c_byte tinyint, c_short smallint,
      >  c_int int, c_long bigint, c_string string, c_float float, c_double double)
      >  stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
      >  with serdeproperties ("hbase.columns.mapping" = ":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double")
      >  tblproperties ("hbase.table.name" = "TestHiveHBaseExternalTable");
  OK
  Time taken: 0.691 seconds
  hive> select * from TestHiveHBaseExternalTable;
  OK
  key-1 NULL    NULL    NULL    NULL    NULL    Test-String     NULL    NULL
  Time taken: 0.346 seconds
  hive> drop table TestHiveHBaseExternalTable;
  OK
  Time taken: 0.139 seconds
  hive> create external table TestHiveHBaseExternalTable
      > (key string, c_bool boolean, c_byte tinyint, c_short smallint,
      >  c_int int, c_long bigint, c_string string, c_float float, c_double double)
      >  stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
      >  with serdeproperties (
      >  "hbase.columns.mapping" = ":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double",
      >  "hbase.columns.storage.types" = "-,b,b,b,b,b,b,b,b" )
      >  tblproperties (
      >  "hbase.table.name" = "TestHiveHBaseExternalTable",
      >  "hbase.table.default.storage.type" = "string");
  OK
  Time taken: 0.139 seconds
  hive> select * from TestHiveHBaseExternalTable;
  OK
  key-1 true    -128    -32768  -2147483648     -9223372036854775808    Test-String     -2.1793132E-11  2.01345E291
  Time taken: 0.151 seconds
  hive> drop table TestHiveHBaseExternalTable;
  OK
  Time taken: 0.154 seconds
  hive> create external table TestHiveHBaseExternalTable
      > (key string, c_bool boolean, c_byte tinyint, c_short smallint,
      >  c_int int, c_long bigint, c_string string, c_float float, c_double double)
      >  stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
      >  with serdeproperties (
      >  "hbase.columns.mapping" = ":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double",
      >  "hbase.columns.storage.types" = ",b,b,b,b,b,,b,b" )
      >  tblproperties ("hbase.table.name" = "TestHiveHBaseExternalTable");
  OK
  Time taken: 0.347 seconds
  hive> select * from TestHiveHBaseExternalTable;
  OK
  key-1 true    -128    -32768  -2147483648     -9223372036854775808    Test-String     -2.1793132E-11  2.01345E291
  Time taken: 0.245 seconds
  hive>
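
  The NULL columns in the first query, and the populated columns once the 
storage types are declared, presumably come down to how the cell bytes are 
interpreted. The sketch below uses only the JDK to mimic the fixed-width 
big-endian layout that HBase's Bytes.toBytes(int) produces; the class and 
method names are hypothetical, and the string reader is a simplified stand-in 
for the handler's text parsing rather than its actual code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not code from the patch.
class BinaryVsStringSketch {

    // Binary storage: 4 bytes, big-endian (ByteBuffer's default byte order).
    static byte[] binaryInt(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    // String storage: the decimal text of the value.
    static byte[] stringInt(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.UTF_8);
    }

    // Reads a cell as decimal text; returns null when it does not parse,
    // which is how a binary cell ends up looking like NULL.
    static Integer readAsString(byte[] cell) {
        try {
            return Integer.valueOf(new String(cell, StandardCharsets.UTF_8));
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Reads a cell as a 4-byte big-endian integer.
    static int readAsBinary(byte[] cell) {
        return ByteBuffer.wrap(cell).getInt();
    }

    public static void main(String[] args) {
        byte[] cell = binaryInt(Integer.MIN_VALUE);
        System.out.println(readAsString(cell)); // binary cell read as text
        System.out.println(readAsBinary(cell)); // binary cell read as binary
        System.out.println(readAsString(stringInt(Integer.MIN_VALUE)));
    }
}
```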

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D1581

AFFECTED FILES
  hbase-handler/src/test/results/hbase_binary_external_table_queries.q.out
  hbase-handler/src/test/results/hbase_binary_map_queries.q.out
  hbase-handler/src/test/results/hbase_binary_storage_queries.q.out
  hbase-handler/src/test/org/apache/hadoop/hive/hbase/HBaseTestSetup.java
  hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java
  hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestLazyHBaseObject.java
  hbase-handler/src/test/queries/hbase_binary_map_queries.q
  hbase-handler/src/test/queries/hbase_binary_storage_queries.q
  hbase-handler/src/test/queries/hbase_binary_external_table_queries.q
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStatsPublisher.java
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableOutputFormat.java
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseCellMap.java
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStatsAggregator.java
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyUtils.java
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyIntegerBinary.java
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyShortBinary.java
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyLongBinary.java
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyByteBinary.java
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFloatBinary.java
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyDoubleBinary.java
  serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBooleanBinary.java

> Allow access to Primitive types stored in binary format in HBase
> ----------------------------------------------------------------
>
>                 Key: HIVE-1634
>                 URL: https://issues.apache.org/jira/browse/HIVE-1634
>             Project: Hive
>          Issue Type: Improvement
>          Components: HBase Handler
>    Affects Versions: 0.7.0
>            Reporter: Basab Maulik
>            Assignee: Basab Maulik
>         Attachments: HIVE-1634.0.patch, HIVE-1634.1.patch, 
> HIVE-1634.D1581.1.patch, TestHiveHBaseExternalTable.java
>
>
> This addresses HIVE-1245 in part, for atomic or primitive types.
> The serde property "hbase.columns.storage.types" = "-,b,b,b,b,b,b,b,b" is a 
> specification of the storage option for the corresponding column in the serde 
> property "hbase.columns.mapping". Allowed values are '-' for table default, 
> 's' for standard string storage, and 'b' for binary storage as would be 
> obtained from o.a.h.hbase.util.Bytes. Map types for HBase column families 
> use a colon-separated pair such as 's:b' for the key and value part 
> specifiers respectively. See the test cases and queries for HBase handler for 
> additional examples.
> There is also a table property "hbase.table.default.storage.type" = "string" 
> to specify a table level default storage type. The other valid specification 
> is "binary". The table level default is overridden by a column level 
> specification.
> This control is available for the boolean, tinyint, smallint, int, bigint, 
> float, and double primitive types. The attached patch also relaxes the 
> mapping of map types to HBase column families to allow any primitive type to 
> be the map key.
> Attached is a program for creating a table and populating it in HBase. The 
> external table in Hive can access the data as shown in the example below.
> hive> create external table TestHiveHBaseExternalTable
>     > (key string, c_bool boolean, c_byte tinyint, c_short smallint,
>     >  c_int int, c_long bigint, c_string string, c_float float, c_double double)
>     >  stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>     >  with serdeproperties ("hbase.columns.mapping" = ":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double")
>     >  tblproperties ("hbase.table.name" = "TestHiveHBaseExternalTable");
> OK
> Time taken: 0.691 seconds
> hive> select * from TestHiveHBaseExternalTable;
> OK
> key-1 NULL    NULL    NULL    NULL    NULL    Test-String     NULL    NULL
> Time taken: 0.346 seconds
> hive> drop table TestHiveHBaseExternalTable;
> OK
> Time taken: 0.139 seconds
> hive> create external table TestHiveHBaseExternalTable
>     > (key string, c_bool boolean, c_byte tinyint, c_short smallint,
>     >  c_int int, c_long bigint, c_string string, c_float float, c_double double)
>     >  stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>     >  with serdeproperties (
>     >  "hbase.columns.mapping" = ":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double",
>     >  "hbase.columns.storage.types" = "-,b,b,b,b,b,b,b,b" )
>     >  tblproperties (
>     >  "hbase.table.name" = "TestHiveHBaseExternalTable",
>     >  "hbase.table.default.storage.type" = "string");
> OK
> Time taken: 0.139 seconds
> hive> select * from TestHiveHBaseExternalTable;
> OK
> key-1 true    -128    -32768  -2147483648     -9223372036854775808    Test-String     -2.1793132E-11  2.01345E291
> Time taken: 0.151 seconds
> hive> drop table TestHiveHBaseExternalTable;
> OK
> Time taken: 0.154 seconds
> hive> create external table TestHiveHBaseExternalTable
>     > (key string, c_bool boolean, c_byte tinyint, c_short smallint,
>     >  c_int int, c_long bigint, c_string string, c_float float, c_double double)
>     >  stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>     >  with serdeproperties (
>     >  "hbase.columns.mapping" = ":key,cf:boolean,cf:byte,cf:short,cf:int,cf:long,cf:string,cf:float,cf:double",
>     >  "hbase.columns.storage.types" = "-,b,b,b,b,b,-,b,b" )
>     >  tblproperties ("hbase.table.name" = "TestHiveHBaseExternalTable");
> OK
> Time taken: 0.347 seconds
> hive> select * from TestHiveHBaseExternalTable;
> OK
> key-1 true    -128    -32768  -2147483648     -9223372036854775808    Test-String     -2.1793132E-11  2.01345E291
> Time taken: 0.245 seconds
> hive> 
