[ https://issues.apache.org/jira/browse/FLINK-14567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974992#comment-16974992 ]

Kevin Zhang commented on FLINK-14567:
-------------------------------------

I basically agree with [~jackylau]. In the proposed scenario Flink may not be 
able to figure out whether concatenating two unique key fields with a specific 
separator preserves uniqueness. However, most of the time users understand 
their data and know which separator won't break the uniqueness, so we should 
let the users decide whether the query result can be inserted into the HBase 
sink. One solution may be to provide another concat_ws-like function with an 
additional parameter indicating whether to ignore the potential risk, and to 
document the reason properly.
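
For example, concat_ws('-', f0, f1) produces the same key for ('a-b', 'c') and 
('a', 'b-c') because the separator also appears in the data, so Flink alone 
cannot assume uniqueness, but a user who knows '-' never occurs in f0 or f1 
can. A rough sketch of the idea (the function name and the extra flag below are 
made up for illustration, not an existing Flink built-in):

{code:sql}
-- hypothetical concat_ws-like function: the first argument is the user's
-- assertion that the separator never appears in the concatenated fields,
-- so the result may be treated as a primary key
select concat_ws_unique(true, '-', f0, f1) as key, sum(f2)
from T1
group by f0, f1
{code}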

> Aggregate query with more than two group fields can't be written into HBase sink
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-14567
>                 URL: https://issues.apache.org/jira/browse/FLINK-14567
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / HBase, Table SQL / Legacy Planner, Table 
> SQL / Planner
>            Reporter: Jark Wu
>            Priority: Critical
>             Fix For: 1.10.0
>
>
> If we have an HBase table sink with a varchar rowkey (which is also the primary 
> key) and a bigint column, we want to write the result of the following query 
> into the sink in upsert mode. However, it fails during the primary key check 
> with the exception "UpsertStreamTableSink requires that Table has a full 
> primary keys if it is updated."
> {code:sql}
> select concat(f0, '-', f1) as key, sum(f2)
> from T1
> group by f0, f1
> {code}
> This happens in both the blink planner and the old planner. That is because if 
> the query works in update mode, there must be a primary key that can be 
> extracted and passed to {{UpsertStreamTableSink#setKeyFields}}. 
> That's why we wanted to derive a primary key for concat in FLINK-14539; 
> however, we found that the primary key is not preserved after concatenation. 
> For example, suppose the primary key is (f0, f1, f2), all of varchar type, and 
> we have two distinct records ('a', 'b', 'c') and ('ab', '', 'c'): the results 
> of concat(f0, f1, f2) are identical, so the concat result is no longer a 
> primary key.
> So here comes the question: how can we properly support the HBase sink for 
> such a use case?
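
The collision described above is easy to reproduce; a minimal illustration with 
an inline VALUES table (the values are the ones from the description, the query 
shape is just for demonstration):

{code:sql}
-- both records concatenate to the same string 'abc',
-- so concat(f0, f1, f2) cannot serve as a primary key
select concat(f0, f1, f2) as key
from (values ('a', 'b', 'c'), ('ab', '', 'c')) as T(f0, f1, f2)
-- returns 'abc' for both rows
{code}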



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
