Copilot commented on code in PR #710:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/710#discussion_r2724178632
##########
hugegraph-loader/src/main/java/org/apache/hugegraph/loader/reader/file/FileLineFetcher.java:
##########
@@ -199,6 +202,11 @@ public void skipOffset(Readable readable, long offset) {
try {
for (long i = 0L; i < offset; i++) {
+ // Fix NPE: check reader again inside loop
+ if (this.reader == null) {
+ throw new LoadException("Reader is null when skipping offset of file %s", readable);
+ }
Review Comment:
Contradictory null checking logic: Line 201 enforces that reader must not be
null with `E.checkState(this.reader != null, ...)`, but then lines 206-209 add
a null check inside the loop. If the reader can become null during iteration
(e.g., in a multi-threaded environment), the initial check at line 201 is
insufficient. If it cannot, then the null check inside the loop is unnecessary.
Consider either: 1) removing the check at line 201 and relying on the null
check in the loop, or 2) using synchronized access if reader can be modified by
other threads.
```suggestion
```
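If the reader really can be cleared concurrently (option 2 in the comment above), the fix is one lock around every access rather than repeated null checks. A minimal sketch of that pattern — the class and method names here are hypothetical, not the loader's actual code:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

public class SafeLineFetcher {

    private final Object lock = new Object();
    private BufferedReader reader;  // guarded by `lock`

    public SafeLineFetcher(Reader in) {
        this.reader = new BufferedReader(in);
    }

    public void skipOffset(long offset) {
        // One check under the lock covers the whole loop: no other thread
        // can null out `reader` while we hold `lock`.
        synchronized (this.lock) {
            if (this.reader == null) {
                throw new IllegalStateException("Reader is null when skipping offset");
            }
            try {
                for (long i = 0L; i < offset; i++) {
                    if (this.reader.readLine() == null) {
                        break;  // fewer lines than offset left; stop early
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public void closeReader() {
        synchronized (this.lock) {
            if (this.reader != null) {
                try {
                    this.reader.close();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
                this.reader = null;  // publish the closed state under the same lock
            }
        }
    }
}
```

With this shape, the in-loop null check becomes unnecessary: a concurrent `closeReader()` simply blocks until `skipOffset()` releases the lock.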
##########
hugegraph-loader/src/test/java/org/apache/hugegraph/loader/test/functional/HDFSLoadTest.java:
##########
@@ -66,6 +66,7 @@ public void testHDFSWithCoreSitePath() {
"-g", GRAPH,
"-h", SERVER,
"--batch-insert-threads", "2",
+ "--parser-threads", "4", // <--- 【修改点 1】增加并发解析线程数
Review Comment:
Comment contains Chinese text in a codebase that appears to use English for
comments. For consistency, consider translating to English, e.g., "Modification
point 1: Increase concurrent parser thread count".
##########
hugegraph-loader/src/main/java/org/apache/hugegraph/loader/reader/file/FileLineFetcher.java:
##########
@@ -158,6 +157,10 @@ public void closeReader() throws IOException {
@Override
public Line fetch() throws IOException {
while (true) {
+ // Fix NPE: check if reader is null before reading
+ if (this.reader == null) {
+ return null;
+ }
Review Comment:
PR title mentions "avoid NPE in FileLineFetcher close method" but no changes
were made to the `closeReader()` method (lines 151-155), which already has a
null check for `this.reader`. The actual changes are in `fetch()`,
`skipOffset()`, and `checkMatchHeader()` methods. Consider updating the PR
title to accurately reflect what was changed, e.g., "fix: add null checks in
FileLineFetcher fetch/skipOffset/checkMatchHeader methods".
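For reference, the pattern the review describes can be sketched as follows — a hypothetical standalone class, not the loader's real `FileLineFetcher`: `closeReader()` is already null-safe (and idempotent), while the new check in `fetch()` turns a would-be NPE into a `null` end-of-input sentinel.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

public class LineFetcherSketch {

    private BufferedReader reader;

    public LineFetcherSketch(Reader in) {
        this.reader = new BufferedReader(in);
    }

    public String fetch() {
        if (this.reader == null) {
            return null;  // reader already closed: signal end of input instead of NPE
        }
        try {
            return this.reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public void closeReader() {
        // The null check the review says closeReader() already has:
        // a second close (or a close before open) is a harmless no-op.
        if (this.reader != null) {
            try {
                this.reader.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            this.reader = null;
        }
    }
}
```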
##########
hugegraph-loader/src/test/java/org/apache/hugegraph/loader/test/functional/HDFSUtil.java:
##########
@@ -67,97 +58,62 @@ public Configuration config() {
return this.conf;
}
- private static Configuration loadConfiguration() {
- // Just use local hadoop with default config in test
- String fileName = "hdfs_with_core_site_path/core-site.xml";
- String confPath = Objects.requireNonNull(HDFSUtil.class.getClassLoader()
-                                                        .getResource(fileName)).getPath();
- Configuration conf = new Configuration();
- conf.addResource(new Path(confPath));
- return conf;
- }
-
@Override
- public void mkdirs(String dir) {
- Path path = new Path(this.storePath, dir);
+ public void mkdirs(String path) {
try {
- this.hdfs.mkdirs(path);
+ this.fs.mkdirs(new Path(this.storePath, path));
} catch (IOException e) {
- throw new RuntimeException(String.format("Failed to create directory '%s'", path), e);
+ throw new RuntimeException("Failed to mkdirs: " + path, e);
}
}
@Override
- public void write(String fileName, Charset charset,
- Compression compress, String... lines) {
+ public void write(String fileName, Charset charset, Compression compression, String... lines) {
Path path = new Path(this.storePath, fileName);
- checkPath(path);
-
- if (compress == Compression.NONE) {
- try (FSDataOutputStream fos = this.hdfs.append(path)) {
+ try (OutputStream os = this.fs.create(path, true)) {
+ if (compression == Compression.NONE) {
for (String line : lines) {
- fos.write(line.getBytes(charset));
- fos.write("\n".getBytes(charset));
+ os.write(line.getBytes(charset));
+ os.write("\n".getBytes(charset));
}
- fos.flush();
- } catch (IOException e) {
- throw new RuntimeException(String.format("Failed to write lines '%s' to path '%s'",
-                                          Arrays.asList(lines), path), e);
- }
- } else {
- try (FSDataOutputStream fos = this.hdfs.append(path)) {
- IOUtil.compress(fos, charset, compress, lines);
- } catch (IOException | CompressorException e) {
- throw new RuntimeException(String.format("Failed to write lines '%s' to file " +
-                                          "'%s' in '%s' compression format",
-                                          Arrays.asList(lines), path, compress), e);
+ } else {
+ IOUtil.compress(os, charset, compression, lines);
}
+ } catch (IOException | CompressorException e) {
+ throw new RuntimeException("Failed to write file: " + fileName, e);
}
}
@Override
public void copy(String srcPath, String destPath) {
try {
- FileUtil.copy(new File(srcPath), this.hdfs, new Path(destPath),
- false, this.conf);
+ // 通常测试场景是将本地文件上传到 HDFS
Review Comment:
Comment contains Chinese text in a codebase that appears to use English for
comments. For consistency, consider translating to English, e.g., "Usually in
test scenarios, upload local files to HDFS".
```suggestion
// Usually in test scenarios, upload local files to HDFS
```
##########
.editorconfig:
##########
@@ -24,7 +24,19 @@ insert_final_newline = true
max_line_length = 100
ij_wrap_on_typing = true
ij_visual_guides = 100
+indent_style = space
+indent_size = 4
+# ---------------------------------------------------------
+# 关键修改:防止编辑器破坏 Parquet 二进制文件
Review Comment:
Comment contains Chinese text. For consistency with the rest of the file and
international collaboration, consider using English, e.g., "Key change: Prevent
editor from corrupting Parquet binary files".
```suggestion
# Key change: Prevent editor from corrupting Parquet binary files
```
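One common way to express "do not normalize binary files" in EditorConfig is to `unset` the text-normalization properties for a glob matching the binary fixtures. A hypothetical fragment illustrating the idea (not the PR's actual hunk):

```editorconfig
# Exempt binary Parquet test fixtures from text normalization,
# so editors do not append newlines or strip bytes from them.
[*.parquet]
charset = unset
end_of_line = unset
insert_final_newline = unset
trim_trailing_whitespace = unset
```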
##########
hugegraph-loader/src/test/java/org/apache/hugegraph/loader/test/functional/HDFSLoadTest.java:
##########
@@ -95,6 +96,7 @@ public void testHDFSWithFilePrefix() {
"-g", GRAPH,
"-h", SERVER,
"--batch-insert-threads", "2",
+ "--parser-threads", "4", // <--- 【修改点 2】增加并发解析线程数
Review Comment:
Comment contains Chinese text in a codebase that appears to use English for
comments. For consistency, consider translating to English, e.g., "Modification
point 2: Increase concurrent parser thread count".
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]