[
https://issues.apache.org/jira/browse/AMBARI-25576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmytro Grinenko updated AMBARI-25576:
-------------------------------------
Assignee: Dmytro Grinenko
Status: Patch Available (was: Open)
> Primary key duplication error during flushing alerts from alerts cache
> ----------------------------------------------------------------------
>
> Key: AMBARI-25576
> URL: https://issues.apache.org/jira/browse/AMBARI-25576
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server
> Affects Versions: 2.7.5
> Reporter: Dmytro Vitiuk
> Assignee: Dmytro Grinenko
> Priority: Major
> Fix For: 2.7.6
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Clusters with many hosts and alert caching enabled sometimes hit commit
> errors:
> {code:java}
> 2020-10-09 19:53:14,444 ERROR [alert-event-bus-4]
> AmbariJpaLocalTxnInterceptor:180 - [DETAILED ERROR] Rollback reason:
> Local Exception Stack:
> Exception [EclipseLink-4002] (Eclipse Persistence Services -
> 2.6.2.v20151217-774c696): org.eclipse.persistence.exceptions.DatabaseException
> Internal Exception: java.sql.BatchUpdateException: Batch entry 1 INSERT INTO
> alert_history (alert_id, alert_instance, alert_label, alert_state,
> alert_text, alert_timestamp, cluster_id, component_name, host_name,
> service_name, alert_definition_id) VALUES (15363461, NULL, 'DataNode Web UI',
> 'OK', 'HTTP 200 response in 0.000s', 1602286496756, 2, 'DATANODE', 'host1',
> 'HDFS', 53) was aborted: ERROR: duplicate key value violates unique
> constraint "pk_alert_history"
> Detail: Key (alert_id)=(15363461) already exists. Call getNextException to
> see other errors in the batch.
> Error Code: 0
> Call: INSERT INTO alert_history (alert_id, alert_instance, alert_label,
> alert_state, alert_text, alert_timestamp, cluster_id, component_name,
> host_name, service_name, alert_definition_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?,
> ?, ?, ?)
> bind => [11 parameters bound]
> {code}
> This does not happen often, but each occurrence produces extensive logging.
> The issue can also cause other rare problems, so it should be fixed.
> The cause is a shared cache that can be updated with a just-merged value
> before that value is actually committed to the DB. In that case another
> thread (from CachedAlertFlushService or AlertEventPublisher) can also try to
> merge the already merged entity.
> For example, we create a new AlertHistoryEntity and set it on an existing
> AlertCurrentEntity. A first thread starts a transaction, merges the current
> entity into the persistence context, saves the merged value to the cache, and
> pauses. A second thread then tries to merge the entire contents of the cache,
> which includes the just-updated current entity. Now two transactions both
> think they should update the current entity and create the new history
> entity. As a result, one of them fails with a duplicate key error.
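
The interleaving described above can be sketched in plain Java. This is only an illustration under stated assumptions, not Ambari code: HistoryTable, CurrentAlert and Tx are hypothetical stand-ins for the alert_history table, AlertCurrentEntity and a JPA transaction, and the two "threads" are run sequentially in the problematic order so that the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class AlertCacheRaceSketch {

    /** Hypothetical stand-in for the alert_history table; alert_id is the primary key. */
    static final class HistoryTable {
        private final Set<Long> primaryKeys = new HashSet<>();

        synchronized void insert(long alertId) {
            if (!primaryKeys.add(alertId)) {
                throw new IllegalStateException("duplicate key (alert_id)=(" + alertId + ")");
            }
        }
    }

    /** Hypothetical stand-in for a current alert that points at a not-yet-persisted history row. */
    static final class CurrentAlert {
        final long historyId;

        CurrentAlert(long historyId) {
            this.historyId = historyId;
        }
    }

    /** Hypothetical transaction: merge() queues an INSERT, commit() writes the queued rows. */
    static final class Tx {
        private final HistoryTable table;
        private final List<Long> pendingInserts = new ArrayList<>();

        Tx(HistoryTable table) {
            this.table = table;
        }

        void merge(CurrentAlert alert) {
            pendingInserts.add(alert.historyId);
        }

        void commit() {
            for (long id : pendingInserts) {
                table.insert(id);
            }
        }
    }

    /** Replays the interleaving; returns true if the second commit hits a duplicate key. */
    static boolean reproduceRace() {
        HistoryTable table = new HistoryTable();
        Map<String, CurrentAlert> sharedCache = new ConcurrentHashMap<>();

        // First thread: starts a transaction, merges the entity, and publishes it
        // to the shared cache BEFORE committing, then pauses.
        Tx first = new Tx(table);
        CurrentAlert alert = new CurrentAlert(15363461L);
        first.merge(alert);
        sharedCache.put("host1/DataNode Web UI", alert);

        // Second thread (for example the flush service): merges everything in the cache,
        // including the entry whose history row is still uncommitted.
        Tx second = new Tx(table);
        for (CurrentAlert cached : sharedCache.values()) {
            second.merge(cached);
        }

        // Both transactions now try to insert the same history row.
        first.commit();
        try {
            second.commit();
            return false;
        } catch (IllegalStateException duplicateKey) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("duplicate key on second commit: " + reproduceRace());
    }
}
```

In this sketch the failure goes away if the cache is updated only after the first transaction commits, or if merge-and-publish and the flush are serialized on a common lock. Which approach the actual patch takes is not stated in this message.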
--
This message was sent by Atlassian Jira
(v8.3.4#803005)