Dmytro Vitiuk created AMBARI-25576:
--------------------------------------
Summary: Primary key duplication error during flushing alerts from
alerts cache
Key: AMBARI-25576
URL: https://issues.apache.org/jira/browse/AMBARI-25576
Project: Ambari
Issue Type: Bug
Components: ambari-server
Affects Versions: 2.7.5
Reporter: Dmytro Vitiuk
Fix For: 2.7.6
Commits occasionally fail on clusters with many hosts when alert caching is
enabled:
{code:java}
2020-10-09 19:53:14,444 ERROR [alert-event-bus-4]
AmbariJpaLocalTxnInterceptor:180 - [DETAILED ERROR] Rollback reason:
Local Exception Stack:
Exception [EclipseLink-4002] (Eclipse Persistence Services -
2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: java.sql.BatchUpdateException: Batch entry 1 INSERT INTO
alert_history (alert_id, alert_instance, alert_label, alert_state, alert_text,
alert_timestamp, cluster_id, component_name, host_name, service_name,
alert_definition_id) VALUES (15363461, NULL, 'DataNode Web UI', 'OK', 'HTTP 200
response in 0.000s', 1602286496756, 2, 'DATANODE', 'host1', 'HDFS', 53) was
aborted: ERROR: duplicate key value violates unique constraint
"pk_alert_history"
Detail: Key (alert_id)=(15363461) already exists. Call getNextException to
see other errors in the batch.
Error Code: 0
Call: INSERT INTO alert_history (alert_id, alert_instance, alert_label,
alert_state, alert_text, alert_timestamp, cluster_id, component_name,
host_name, service_name, alert_definition_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?,
?, ?, ?)
bind => [11 parameters bound]
{code}
The issue is infrequent, but each occurrence produces extensive logging. It can
also cause other rare problems, so it should be fixed.
The cause is a shared cache that can be updated with a just-merged value before
that value is actually committed to the DB. Another thread (from
CachedAlertFlushService or AlertEventPublisher) can then try to merge the
already merged entity as well.
For example, suppose we create a new AlertHistoryEntity and set it on an
existing AlertCurrentEntity. The first thread starts a transaction, merges the
current entity into the persistence context, saves the merged value to the
cache, and pauses. A second thread then merges the entire content of the cache,
including the just-updated current entity. Now there are two transactions, and
both assume they must update the current entity and create the new history
entity. As a result, one of them fails with a duplicate key error.
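The interleaving described above can be illustrated with a small, deterministic
sketch. It is not Ambari code and not necessarily the fix applied for this
issue: FakeDb, the cache key, and both method names are hypothetical, and the
two "threads" are simulated as sequential steps so the outcome is reproducible.
The sketch only contrasts publishing a merged value to the shared cache before
the commit with publishing it after the commit.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical simulation; none of these names exist in the Ambari code base. */
class AlertCacheRaceSketch {

    /** Stand-in for alert_history: the set of committed primary keys. */
    static final class FakeDb {
        private final Set<Long> committed = new HashSet<>();

        boolean exists(long id) {
            return committed.contains(id);
        }

        /** Commits an INSERT; false models the duplicate key error. */
        boolean commitInsert(long id) {
            return committed.add(id);
        }
    }

    static final String KEY = "host1/DATANODE"; // assumed cache key
    static final long HISTORY_ID = 15363461L;   // id taken from the log above

    /** Flush: merge every cached entry, inserting rows the DB does not have. */
    static boolean flush(FakeDb db, Map<String, Long> cache) {
        boolean ok = true;
        for (long id : cache.values()) {
            if (!db.exists(id)) {
                ok &= db.commitInsert(id);
            }
        }
        return ok;
    }

    /** Thread 1 publishes to the cache before its transaction commits. */
    static boolean publishBeforeCommit() {
        FakeDb db = new FakeDb();
        Map<String, Long> cache = new HashMap<>();
        // Thread 1: merge decides on an INSERT, publishes, then pauses.
        boolean t1WillInsert = !db.exists(HISTORY_ID);
        cache.put(KEY, HISTORY_ID);
        // Thread 2: flushes the cache and commits the same row first.
        boolean t2Ok = flush(db, cache);
        // Thread 1 resumes and commits its own INSERT: duplicate key.
        boolean t1Ok = !t1WillInsert || db.commitInsert(HISTORY_ID);
        return t1Ok && t2Ok;
    }

    /** Thread 1 publishes to the cache only after its commit succeeded. */
    static boolean publishAfterCommit() {
        FakeDb db = new FakeDb();
        Map<String, Long> cache = new HashMap<>();
        boolean t1WillInsert = !db.exists(HISTORY_ID);
        // Thread 2 flushes while thread 1 is paused: nothing is cached yet.
        boolean t2Ok = flush(db, cache);
        // Thread 1 commits, then publishes the committed value.
        boolean t1Ok = !t1WillInsert || db.commitInsert(HISTORY_ID);
        cache.put(KEY, HISTORY_ID);
        // A later flush finds the row already committed and does not insert.
        t2Ok &= flush(db, cache);
        return t1Ok && t2Ok;
    }

    public static void main(String[] args) {
        System.out.println("publish before commit ok: " + publishBeforeCommit()); // false
        System.out.println("publish after commit ok:  " + publishAfterCommit());  // true
    }
}
```

In the first ordering both simulated transactions decide to insert the same
alert_id, so the later commit fails, which matches the pk_alert_history
violation in the log. Serializing merges per alert with a lock would be another
way to close the same window; which approach suits the real services is left
open here.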
--
This message was sent by Atlassian Jira
(v8.3.4#803005)