[
https://issues.apache.org/jira/browse/IMPALA-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472689#comment-16472689
]
Greg Rahn commented on IMPALA-7015:
-----------------------------------
IIRC the current behavior was chosen to make the query run to completion
despite hitting errors. This is mainly because multi-row txns are not atomic.
For example, when doing a bulk insert containing duplicate keys, it would be
impossible to apply the command to all non-violating records unless the
duplicates were first removed from the source/input set. The current behavior
at least lets the command work on as many tuples as possible without adjusting
the input. I'm all for better error message propagation, but AFAIK this was
also a limitation of the current protocols, as mentioned in IMPALA-4416 and
IMPALA-1789. If there is a way to provide a better UX, I'm all for it.
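
The trade-off above can be illustrated with a toy sketch in Python. This is
not Impala or Kudu code: the dict standing in for a table and the names
`InsertResult` and `best_effort_insert` are hypothetical, chosen only for this
example. It shows a non-atomic bulk insert that applies every non-violating
row, records duplicate-key errors instead of aborting, and exposes those
errors so a caller could report something other than a plain OK.

```python
from dataclasses import dataclass, field


@dataclass
class InsertResult:
    """Outcome of a best-effort bulk insert (hypothetical type)."""
    inserted: int = 0
    errors: list = field(default_factory=list)

    @property
    def ok(self) -> bool:
        # True only when every row was applied without error.
        return not self.errors


def best_effort_insert(table: dict, rows) -> InsertResult:
    """Insert (key, value) rows one by one; a duplicate key is recorded
    as an error and skipped rather than failing the whole statement."""
    result = InsertResult()
    for key, value in rows:
        if key in table:
            result.errors.append(f"Key already present: {key!r}")
            continue
        table[key] = value
        result.inserted += 1
    return result


# Toy data: key 1 already exists, and key 2 appears twice in the input.
table = {1: "existing"}
result = best_effort_insert(table, [(1, "x"), (2, "b"), (3, "c"), (2, "dup")])
```

Here the two non-violating rows are applied and the two duplicates are
collected in `result.errors`, so `result.ok` is false even though the
statement ran to completion.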
> Insert into Kudu table returns with Status OK even if there are Kudu errors
> ---------------------------------------------------------------------------
>
> Key: IMPALA-7015
> URL: https://issues.apache.org/jira/browse/IMPALA-7015
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.12.0
> Reporter: Mostafa Mokhtar
> Priority: Major
> Attachments: Insert into kudu profile with errors.txt
>
>
> DML statements against Kudu tables return status OK even if there are Kudu
> errors.
> This behavior is misleading.
> {code}
> Summary:
> Session ID: 18430b000e5dd8dc:e3e5dadb4a15d4b4
> Session Type: BEESWAX
> Start Time: 2018-05-11 10:10:07.314218000
> End Time: 2018-05-11 10:10:07.434017000
> Query Type: DML
> Query State: FINISHED
> Query Status: OK
> Impala Version: impalad version 2.12.0-cdh5.15.0 RELEASE (build 2f9498d5c2f980aa7ff9505c56654c8e59e026ca)
> User: mmokhtar
> Connected User: mmokhtar
> Delegated User:
> Network Address: ::ffff:10.17.234.27:60760
> Default Db: tpcds_1000_kudu
> Sql Statement: insert into store_2 select * from store
> Coordinator: vd1317.foo:22000
> Query Options (set by configuration):
> Query Options (set by configuration and planner): MT_DOP=0
> Plan:
> {code}
> {code}
> Operator         #Hosts  Avg Time   Max Time  #Rows  Est. #Rows  Peak Mem  Est. Peak Mem  Detail
> -------------------------------------------------------------------------------------------------------------------------------------------------
> 02:PARTIAL SORT  5       909.030us  1.025ms   1.00K  1.00K       6.14 MB   4.00 MB
> 01:EXCHANGE      5       6.262ms    7.232ms   1.00K  1.00K       75.50 KB  0              KUDU(KuduPartition(tpcds_1000_kudu.store.s_store_sk))
> 00:SCAN KUDU     5       3.694ms    4.137ms   1.00K  1.00K       4.34 MB   0              tpcds_1000_kudu.store
> Errors: Key already present in Kudu table
> 'impala::tpcds_1000_kudu.store_2'. (1 of 1002 similar)
> {code}