[ 
https://issues.apache.org/jira/browse/HIVE-20481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

slim bouguerra updated HIVE-20481:
----------------------------------
    Description: 
Kafka records are keyed; in most cases the key is either null or used to route 
records to the same partition. This patch exposes the key as a binary column, 
{code}__key{code}.

The new table layout is as follows:
{code}
POSTHOOK: type: CREATETABLE
POSTHOOK: Output: database:default
POSTHOOK: Output: default@wiki_kafka_avro_table
PREHOOK: query: describe extended wiki_kafka_avro_table
PREHOOK: type: DESCTABLE
PREHOOK: Input: default@wiki_kafka_avro_table
POSTHOOK: query: describe extended wiki_kafka_avro_table
POSTHOOK: type: DESCTABLE
POSTHOOK: Input: default@wiki_kafka_avro_table
isrobot                 boolean                 from deserializer   
channel                 string                  from deserializer   
timestamp               string                  from deserializer   
flags                   string                  from deserializer   
isunpatrolled           boolean                 from deserializer   
page                    string                  from deserializer   
diffurl                 string                  from deserializer   
added                   bigint                  from deserializer   
comment                 string                  from deserializer   
commentlength           bigint                  from deserializer   
isnew                   boolean                 from deserializer   
isminor                 boolean                 from deserializer   
delta                   bigint                  from deserializer   
isanonymous             boolean                 from deserializer   
user                    string                  from deserializer   
deltabucket             double                  from deserializer   
deleted                 bigint                  from deserializer   
namespace               string                  from deserializer   
__key                   binary                  from deserializer   
__partition             int                     from deserializer   
__offset                bigint                  from deserializer   
__timestamp             bigint                  from deserializer   
__start_offset          bigint                  from deserializer   
__end_offset            bigint                  from deserializer  
{code}
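
As a rough sketch of how the new column might be queried (this query is not part of the patch, and it assumes the record keys are UTF-8 text, which the issue does not state), the binary key can be cast to a string and read alongside the other metadata columns:
{code:sql}
-- Hypothetical query: show the key next to its partition and offset.
-- Assumes keys are UTF-8 encoded; otherwise leave `__key` as binary.
SELECT CAST(`__key` AS STRING) AS key_text,
       `__partition`,
       `__offset`,
       `page`
FROM wiki_kafka_avro_table
WHERE `__key` IS NOT NULL   -- per the description, many records carry a null key
LIMIT 10;
{code}
The metadata columns are quoted with backticks because their names start with underscores.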

  was:
Kafka records are keyed; in most cases the key is either null or used to route 
records to the same partition. This patch exposes the key as a binary column, 
{code}__key{code}.



> Add the Kafka Key record as part of the row.
> --------------------------------------------
>
>                 Key: HIVE-20481
>                 URL: https://issues.apache.org/jira/browse/HIVE-20481
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: slim bouguerra
>            Assignee: slim bouguerra
>            Priority: Major
>             Fix For: 4.0.0
>
>         Attachments: HIVE-20481.patch
>
>
> Kafka records are keyed; in most cases the key is either null or used to route 
> records to the same partition. This patch exposes the key as a binary column, 
> {code}__key{code}.
> The new table layout is as follows:
> {code}
> POSTHOOK: type: CREATETABLE
> POSTHOOK: Output: database:default
> POSTHOOK: Output: default@wiki_kafka_avro_table
> PREHOOK: query: describe extended wiki_kafka_avro_table
> PREHOOK: type: DESCTABLE
> PREHOOK: Input: default@wiki_kafka_avro_table
> POSTHOOK: query: describe extended wiki_kafka_avro_table
> POSTHOOK: type: DESCTABLE
> POSTHOOK: Input: default@wiki_kafka_avro_table
> isrobot               boolean                 from deserializer   
> channel               string                  from deserializer   
> timestamp             string                  from deserializer   
> flags                 string                  from deserializer   
> isunpatrolled         boolean                 from deserializer   
> page                  string                  from deserializer   
> diffurl               string                  from deserializer   
> added                 bigint                  from deserializer   
> comment               string                  from deserializer   
> commentlength         bigint                  from deserializer   
> isnew                 boolean                 from deserializer   
> isminor               boolean                 from deserializer   
> delta                 bigint                  from deserializer   
> isanonymous           boolean                 from deserializer   
> user                  string                  from deserializer   
> deltabucket           double                  from deserializer   
> deleted               bigint                  from deserializer   
> namespace             string                  from deserializer   
> __key                 binary                  from deserializer   
> __partition           int                     from deserializer   
> __offset              bigint                  from deserializer   
> __timestamp           bigint                  from deserializer   
> __start_offset        bigint                  from deserializer   
> __end_offset          bigint                  from deserializer  
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
