-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/75007/
-----------------------------------------------------------
(Updated June 25, 2024, 5:15 a.m.)


Review request for atlas, Ashutosh Mestry, Jayendra Parab, Madhan Neethiraj, and Radhika Kundam.


Changes
-------

Addressed review comment.


Bugs: ATLAS-4866
    https://issues.apache.org/jira/browse/ATLAS-4866


Repository: atlas


Description
-------

**Background:**
Atlas uses HBase as the store for its audit repository. After an import, Atlas stores an audit entity that contains the import information along with the GUIDs of all processed entities.

**Issue:**
When a large zipped export file is imported, the import returns the error below. Internally the import succeeds, but creating the audit entry fails.

*{"errorCode":"ATLAS-500-00-001","errorMessage":"org.janusgraph.core.JanusGraphException: Could not commit transaction due to exception during persistence","errorCause":"Could not commit transaction due to exception during persistence"}*

When the size of the audit entity is greater than the value of the "hbase.client.keyvalue.maxsize" property, audit entity creation fails with the following exception:

Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: org.apache.hadoop.hbase.DoNotRetryIOException: Cell[\x00\x00\x00\x00\x00\x00\x00\x00\x01\x05\xCC\xBB/l:\x00\x06\x18r\xB0\xBE\xFDH\xA00a11ed186467-ve0214-halxg-cloudera-com\xB2\x00\x00\x00\x00\x00\x0D\xB6Y/1715730740890001/Put/vlen=23826488/seqid=0] with size 23826581 exceeds limit of 10485760 bytes
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.checkCellSizeLimit(RSRpcServices.java:906)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:992)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:927)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:892)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2855)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45961)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:387)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:139)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:369)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:349)
: 1 time, servers with issues: ve0214.halxg.cloudera.com,22101,1715690875185
    at org.apache.hadoop.hbase.client.BatchErrors.makeException(BatchErrors.java:50)
    at org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.getErrors(AsyncRequestFutureImpl.java:1228)
    at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:434)
    at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:422)
    at org.janusgraph.diskstorage.hbase2.HTable2_0.batch(HTable2_0.java:51)

**Solution:**
To avoid this failure, the processed entity GUIDs are not stored when creating the ExportImportAuditEntry if the size of the entity exceeds the value (in bytes) of the following property:

atlas.hbase.client.keyvalue.maxsize


Diffs (updated)
-----

  repository/src/main/java/org/apache/atlas/repository/impexp/ExportImportAuditService.java 3afa17301


Diff: https://reviews.apache.org/r/75007/diff/2/

Changes: https://reviews.apache.org/r/75007/diff/1-2/


Testing
-------

Manually verified through the API below that the processed entity GUIDs are not stored when the size limit is exceeded:
/api/atlas/admin/expimp/audit?userName=admin&operation=IMPORT

Precommit: https://ci-builds.apache.org/job/Atlas/job/PreCommit-ATLAS-Build-Test/1649/console


Thanks,

Pinal Shah
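
For readers who have not opened the diff, the size check described under Solution can be pictured roughly as follows. This is only a minimal sketch and not the code in the patch: the class `AuditSizeGuardSketch`, its `AuditEntry` stand-in, the method names, and the fallback constant are all hypothetical, and it assumes the size is measured as the UTF-8 byte length of the serialized GUID list.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: every name here is hypothetical and not taken from the patch.
final class AuditSizeGuardSketch {
    // Assumed fallback, matching the 10485760-byte limit shown in the stack trace.
    static final long ASSUMED_MAX_SIZE_BYTES = 10L * 1024 * 1024;

    // Minimal stand-in for an audit entry; not the real ExportImportAuditEntry.
    static final class AuditEntry {
        final String operation;
        final String processedGuids; // empty when the GUID list was skipped

        AuditEntry(String operation, String processedGuids) {
            this.operation = operation;
            this.processedGuids = processedGuids;
        }
    }

    // Size of the serialized value in bytes, assuming UTF-8 encoding; null counts as 0.
    static long sizeInBytes(String value) {
        return value == null ? 0 : value.getBytes(StandardCharsets.UTF_8).length;
    }

    // Keeps the GUID list only when it fits within maxSizeBytes; otherwise stores
    // an empty string so the rest of the audit entry can still be persisted.
    static AuditEntry buildEntry(String operation, String processedGuidsJson, long maxSizeBytes) {
        boolean tooLarge = sizeInBytes(processedGuidsJson) > maxSizeBytes;
        return new AuditEntry(operation, tooLarge ? "" : processedGuidsJson);
    }
}
```

In this sketch a caller would pass the configured value of atlas.hbase.client.keyvalue.maxsize as `maxSizeBytes`, falling back to `ASSUMED_MAX_SIZE_BYTES` when the property is not set; how the real service reads that property is not shown here.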