Dmytro Vitiuk created AMBARI-25576:
--------------------------------------

             Summary: Primary key duplication error during flushing alerts from 
alerts cache
                 Key: AMBARI-25576
                 URL: https://issues.apache.org/jira/browse/AMBARI-25576
             Project: Ambari
          Issue Type: Bug
          Components: ambari-server
    Affects Versions: 2.7.5
            Reporter: Dmytro Vitiuk
             Fix For: 2.7.6


Clusters with many hosts and alert caching enabled occasionally hit commit 
errors:
{code:java}
2020-10-09 19:53:14,444 ERROR [alert-event-bus-4] 
AmbariJpaLocalTxnInterceptor:180 - [DETAILED ERROR] Rollback reason: 
Local Exception Stack: 
Exception [EclipseLink-4002] (Eclipse Persistence Services - 
2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: java.sql.BatchUpdateException: Batch entry 1 INSERT INTO 
alert_history (alert_id, alert_instance, alert_label, alert_state, alert_text, 
alert_timestamp, cluster_id, component_name, host_name, service_name, 
alert_definition_id) VALUES (15363461, NULL, 'DataNode Web UI', 'OK', 'HTTP 200 
response in 0.000s', 1602286496756, 2, 'DATANODE', 'host1', 'HDFS', 53) was 
aborted: ERROR: duplicate key value violates unique constraint 
"pk_alert_history"
  Detail: Key (alert_id)=(15363461) already exists.  Call getNextException to 
see other errors in the batch.
Error Code: 0
Call: INSERT INTO alert_history (alert_id, alert_instance, alert_label, 
alert_state, alert_text, alert_timestamp, cluster_id, component_name, 
host_name, service_name, alert_definition_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, 
?, ?, ?)
        bind => [11 parameters bound]
{code}
This issue is infrequent, but each occurrence produces extensive logging. It 
can also cause other rare problems, so it should be fixed.


The cause is a shared cache that can be updated with a just-merged value before 
that value is actually committed to the DB. Another thread (from 
CachedAlertFlushService or AlertEventPublisher) can then try to merge the same, 
already merged entity. 
For example, suppose a new AlertHistoryEntity is created and set on an existing 
AlertCurrentEntity. The first thread starts a transaction, merges the current 
entity into the persistence context, saves the merged value to the cache, and is 
paused. A second thread then merges the entire contents of the cache, including 
the just-updated current entity. Now two transactions both believe they must 
update the current entity and create the new history entity, so one of them 
fails with a duplicate key error.
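
The interleaving described above can be reproduced in a small, single-threaded model. This is only a sketch under stated assumptions: AlertCacheRaceSketch, FakeHistoryTable, CurrentAlert and simulate are hypothetical names invented for illustration and are not Ambari classes, and the two "threads" are replayed as sequential steps so that the outcome is deterministic. The second scenario (publishing to the cache only after commit) is one possible ordering that avoids the duplicate in this model, not necessarily the fix applied for this issue.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of the race; not Ambari code. */
public class AlertCacheRaceSketch {

    /** Stand-in for the alert_history table, where alert_id is the primary key. */
    static class FakeHistoryTable {
        private final Set<Long> committedIds = new HashSet<>();

        /** Rejects a second insert of the same alert_id, like pk_alert_history. */
        void commitInsert(long alertId) {
            if (!committedIds.add(alertId)) {
                throw new IllegalStateException("duplicate key: alert_id=" + alertId);
            }
        }
    }

    /** Stand-in for a current alert that references a not-yet-persisted history row. */
    static class CurrentAlert {
        final long pendingHistoryId;

        CurrentAlert(long pendingHistoryId) {
            this.pendingHistoryId = pendingHistoryId;
        }
    }

    /** Replays the interleaving; returns true if a duplicate key error occurs. */
    static boolean simulate(boolean publishToCacheBeforeCommit) {
        FakeHistoryTable table = new FakeHistoryTable();
        Map<String, CurrentAlert> sharedCache = new HashMap<>();
        CurrentAlert merged = new CurrentAlert(15363461L);

        // Thread 1: has merged the entity inside its transaction, not yet committed.
        if (publishToCacheBeforeCommit) {
            sharedCache.put("host1/DATANODE", merged); // visible to other threads too early
        }

        // Thread 2 (flush) runs while thread 1 is paused: it persists every
        // pending history row it finds in the cache and commits.
        for (CurrentAlert cached : sharedCache.values()) {
            table.commitInsert(cached.pendingHistoryId);
        }

        // Thread 1 resumes and commits its own insert of the same history row.
        try {
            table.commitInsert(merged.pendingHistoryId);
        } catch (IllegalStateException duplicate) {
            return true;
        }

        // Alternative ordering: publish to the cache only after a successful commit.
        sharedCache.put("host1/DATANODE", merged);
        return false;
    }

    public static void main(String[] args) {
        System.out.println("publish before commit, duplicate: " + simulate(true));
        System.out.println("publish after commit, duplicate: " + simulate(false));
    }
}
```

In this model, simulate(true) ends with a duplicate key error because both steps try to insert the same alert_id, while simulate(false) does not, since the flush step never sees the uncommitted entity.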



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
