[ 
https://issues.apache.org/jira/browse/IMPALA-7015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472689#comment-16472689
 ] 

Greg Rahn commented on IMPALA-7015:
-----------------------------------

IIRC the current behavior was chosen to make the query run to completion, 
despite hitting errors.  This is mainly done because of the lack of atomicity 
with multi-row txns.  For example, when doing a bulk insert containing 
duplicate keys, it would be impossible to have the command run for all 
non-violating records unless one removed the duplicates from the source/input 
set.  The current behavior at least lets the command work on as many tuples as 
possible without adjusting the input.  I'm all for better error message 
propagation, but AFAIK this was also a limitation of the current protocols, as 
mentioned in IMPALA-4416 and IMPALA-1789.  If there is a way to provide a 
better UX I'm all for it.
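
To illustrate the trade-off described above, here is a minimal sketch in 
plain Python.  It is not Impala or Kudu code: the names InsertResult and 
bulk_insert are hypothetical, and a dict stands in for a keyed table.  It 
only shows the general idea of applying each row independently, recording 
duplicate-key errors, and still finishing with an OK status.

```python
# Hypothetical sketch only; this is NOT how Impala or Kudu is implemented.
from dataclasses import dataclass, field


@dataclass
class InsertResult:
    status: str = "OK"  # stays OK even when per-row errors were recorded
    rows_inserted: int = 0
    errors: list = field(default_factory=list)


def bulk_insert(table: dict, rows: list) -> InsertResult:
    """Apply each (key, value) row on its own; a duplicate key is recorded
    as an error and skipped instead of aborting the whole statement."""
    result = InsertResult()
    for key, value in rows:
        if key in table:
            result.errors.append(f"Key already present: {key!r}")
            continue
        table[key] = value
        result.rows_inserted += 1
    return result


# A dict stands in for the target table; key 1 already exists.
store_2 = {1: "existing"}
result = bulk_insert(store_2, [(1, "dup"), (2, "new"), (3, "new")])
```

In this sketch the caller has to inspect result.errors to learn that a row 
was skipped, which is the same UX gap the issue describes.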

> Insert into Kudu table returns with Status OK even if there are Kudu errors
> ---------------------------------------------------------------------------
>
>                 Key: IMPALA-7015
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7015
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.12.0
>            Reporter: Mostafa Mokhtar
>            Priority: Major
>         Attachments: Insert into kudu profile with errors.txt
>
>
> DML statements against Kudu tables return status OK even if there are Kudu 
> errors.
> This behavior is misleading. 
> {code}
>   Summary:
>     Session ID: 18430b000e5dd8dc:e3e5dadb4a15d4b4
>     Session Type: BEESWAX
>     Start Time: 2018-05-11 10:10:07.314218000
>     End Time: 2018-05-11 10:10:07.434017000
>     Query Type: DML
>     Query State: FINISHED
>     Query Status: OK
>     Impala Version: impalad version 2.12.0-cdh5.15.0 RELEASE (build 2f9498d5c2f980aa7ff9505c56654c8e59e026ca)
>     User: mmokhtar
>     Connected User: mmokhtar
>     Delegated User: 
>     Network Address: ::ffff:10.17.234.27:60760
>     Default Db: tpcds_1000_kudu
>     Sql Statement: insert into store_2 select * from store
>     Coordinator: vd1317.foo:22000
>     Query Options (set by configuration): 
>     Query Options (set by configuration and planner): MT_DOP=0
>     Plan: 
> {code}
> {code}
> Operator          #Hosts   Avg Time  Max Time  #Rows  Est. #Rows  Peak Mem  Est. Peak Mem  Detail
> -------------------------------------------------------------------------------------------------------------------------------------------------
> 02:PARTIAL SORT        5  909.030us   1.025ms  1.00K       1.00K   6.14 MB        4.00 MB
> 01:EXCHANGE            5    6.262ms   7.232ms  1.00K       1.00K  75.50 KB              0  KUDU(KuduPartition(tpcds_1000_kudu.store.s_store_sk))
> 00:SCAN KUDU           5    3.694ms   4.137ms  1.00K       1.00K   4.34 MB              0  tpcds_1000_kudu.store
>     Errors: Key already present in Kudu table 'impala::tpcds_1000_kudu.store_2'. (1 of 1002 similar)
> {code}



