imbajin commented on code in PR #725:
URL:
https://github.com/apache/hugegraph-toolchain/pull/725#discussion_r3077517643
##########
hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/load/FileUploadController.java:
##########
@@ -262,28 +269,51 @@ private void checkFileValid(int connId, int jobId,
JobManager jobEntity,
Ex.check(formatWhiteList.contains(format),
"load.upload.file.format.unsupported");
- long fileSize = file.getSize();
- long singleFileSizeLimit = this.config.get(
- HubbleOptions.UPLOAD_SINGLE_FILE_SIZE_LIMIT);
- Ex.check(fileSize <= singleFileSizeLimit,
- "load.upload.file.exceed-single-size",
- FileUtils.byteCountToDisplaySize(singleFileSizeLimit));
+ if (sourceFileSize != null) {
+ Ex.check(sourceFileSize > 0L,
+ "load.upload.file.cannot-be-empty");
+ long singleFileSizeLimit = this.config.get(
+ HubbleOptions.UPLOAD_SINGLE_FILE_SIZE_LIMIT);
+ Ex.check(sourceFileSize <= singleFileSizeLimit,
+ "load.upload.file.exceed-single-size",
+ FileUtils.byteCountToDisplaySize(singleFileSizeLimit));
+
+ long totalFileSizeLimit = this.config.get(
+ HubbleOptions.UPLOAD_TOTAL_FILE_SIZE_LIMIT);
+ long currentJobSize = jobEntity.getJobSize();
+ Ex.check(sourceFileSize + currentJobSize <= totalFileSizeLimit,
+ "load.upload.file.exceed-total-size",
+ FileUtils.byteCountToDisplaySize(totalFileSizeLimit));
+ }
// Check is there a file with the same name
FileMapping oldMapping = this.service.get(connId, jobId, fileName);
Ex.check(oldMapping == null ||
oldMapping.getFileStatus() == FileMappingStatus.UPLOADING,
"load.upload.file.existed", fileName);
+ }
+
+ private Long resolveSourceFileSize(MultipartFile file, Long fileSize,
+ int total, int index) {
+ if (fileSize != null) {
Review Comment:
‼️ The quota check now trusts the client-provided `size` value as the source
of truth. A crafted request can under-report that number and still upload a
much larger file, because we never reconcile `sourceFileSize` with the bytes
actually written/merged on the server before marking the upload successful.
Please validate against a server-derived size (or re-check the merged file
size) so the limit cannot be bypassed by tampering with the request parameter.
##########
hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/controller/load/FileUploadController.java:
##########
@@ -262,28 +269,51 @@ private void checkFileValid(int connId, int jobId,
JobManager jobEntity,
Ex.check(formatWhiteList.contains(format),
"load.upload.file.format.unsupported");
- long fileSize = file.getSize();
- long singleFileSizeLimit = this.config.get(
- HubbleOptions.UPLOAD_SINGLE_FILE_SIZE_LIMIT);
- Ex.check(fileSize <= singleFileSizeLimit,
- "load.upload.file.exceed-single-size",
- FileUtils.byteCountToDisplaySize(singleFileSizeLimit));
+ if (sourceFileSize != null) {
+ Ex.check(sourceFileSize > 0L,
+ "load.upload.file.cannot-be-empty");
+ long singleFileSizeLimit = this.config.get(
+ HubbleOptions.UPLOAD_SINGLE_FILE_SIZE_LIMIT);
+ Ex.check(sourceFileSize <= singleFileSizeLimit,
+ "load.upload.file.exceed-single-size",
+ FileUtils.byteCountToDisplaySize(singleFileSizeLimit));
+
+ long totalFileSizeLimit = this.config.get(
+ HubbleOptions.UPLOAD_TOTAL_FILE_SIZE_LIMIT);
+ long currentJobSize = jobEntity.getJobSize();
+ Ex.check(sourceFileSize + currentJobSize <= totalFileSizeLimit,
Review Comment:
‼️ `currentJobSize` only reflects completed files here, but Hubble uploads
up to 5 files in parallel. If the job already has, for example, 9 GB uploaded,
two concurrent 800 MB files can both pass this check against the same stale
value and leave the job above the 10 GB cap once both merges finish. Please
include in-progress uploads in the server-side reservation/check, or reserve
the size before accepting the first chunk so the limit cannot be bypassed by
concurrency.
##########
hugegraph-hubble/hubble-be/src/main/java/org/apache/hugegraph/service/load/JobManagerService.java:
##########
@@ -132,4 +139,59 @@ public void remove(int id) {
throw new InternalException("entity.delete.failed", id);
}
}
+
+ @Transactional(isolation = Isolation.READ_COMMITTED)
+ public void deleteJob(int id) {
+ JobManager job = this.get(id);
+ if (job == null) {
+ throw new ExternalException("job.manager.not-exist.id", id);
+ }
+
+ List<LoadTask> loadTasks = this.taskService.taskListByJob(id);
+ for (LoadTask loadTask : loadTasks) {
+ if (loadTask.getStatus().inRunning() ||
+ loadTask.getStatus() == LoadStatus.PAUSED) {
+ this.taskService.stop(loadTask.getId());
+ }
+ this.taskService.remove(loadTask.getId());
+ }
+
+ List<FileMapping> mappings = this.fileMappingService.listByJob(id);
+ for (FileMapping mapping : mappings) {
+ this.fileMappingService.remove(mapping.getId());
+ }
+ this.remove(id);
+ this.deleteDiskFilesAfterCommit(mappings);
+ }
+
+ private void deleteDiskFilesAfterCommit(List<FileMapping> mappings) {
+ if (mappings.isEmpty()) {
+ return;
+ }
+
+ List<FileMapping> copiedMappings = new ArrayList<>(mappings);
+ if (!TransactionSynchronizationManager.isSynchronizationActive()) {
+ this.deleteDiskFiles(copiedMappings);
+ return;
+ }
+
+ TransactionSynchronizationManager.registerSynchronization(
+ new TransactionSynchronization() {
+ @Override
+ public void afterCommit() {
+ deleteDiskFiles(copiedMappings);
+ }
+ });
+ }
+
+ private void deleteDiskFiles(List<FileMapping> mappings) {
+ for (FileMapping mapping : mappings) {
+ try {
+ this.fileMappingService.deleteDiskFile(mapping);
+ } catch (RuntimeException e) {
Review Comment:
⚠️ By the time we get here, all `FileMapping` rows are already deleted and
the transaction is committed. If `deleteDiskFile()` fails, we only log a
warning and the orphaned upload directory becomes effectively untraceable,
because there is no metadata left for a later retry/cleanup job to discover.
Since this PR is meant to make job deletion actually free the uploaded data,
please consider persisting a retryable cleanup record or surfacing the failure
instead of silently losing the only reference to the disk path.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]