[
https://issues.apache.org/jira/browse/OPENNLP-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17656472#comment-17656472
]
ASF GitHub Bot commented on OPENNLP-1440:
-----------------------------------------
mawiesne commented on code in PR #486:
URL: https://github.com/apache/opennlp/pull/486#discussion_r1065418797
##########
opennlp-tools/src/main/java/opennlp/tools/cmdline/dictionary/DictionaryBuilderTool.java:
##########
@@ -56,8 +58,8 @@ public void run(String[] args) {
CmdLineUtil.checkInputFile("dictionary input file", dictInFile);
CmdLineUtil.checkOutputFile("dictionary output file", dictOutFile);
- try (InputStreamReader in = new InputStreamReader(new FileInputStream(dictInFile), encoding);
- OutputStream out = new FileOutputStream(dictOutFile)) {
+ try (Reader in = new BufferedReader(new InputStreamReader(new FileInputStream(dictInFile), encoding));
+ OutputStream out = new FileOutputStream(dictOutFile)) {
Review Comment:
This PR/issue targets _read_ IO operations only. _Write_ operations are a
separate topic.
> Ensure files are read via buffered IO operations
> ------------------------------------------------
>
> Key: OPENNLP-1440
> URL: https://issues.apache.org/jira/browse/OPENNLP-1440
> Project: OpenNLP
> Issue Type: Improvement
> Components: Applications
> Affects Versions: 2.1.0
> Reporter: Martin Wiesner
> Assignee: Martin Wiesner
> Priority: Minor
> Fix For: 2.1.1
>
>
> Several classes in _opennlp.tools_ read files via
> {{FileInputStream}} *without* using buffered IO. Unbuffered IO
> can impose a (high) performance penalty, as the JVM has to make native
> (JNI) calls more often (resulting in more syscalls to the OS).
> We can avoid that by adapting existing classes to use {{BufferedInputStream}}
> or {{BufferedReader}} more consistently.
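
For illustration, the read-side pattern described in the issue could look roughly like the sketch below. It is not code from the PR: the class `BufferedReadSketch` and the helpers `countBytes` and `countChars` are hypothetical names, and the temp file in `main` is made up for the demo. It only shows a `FileInputStream` wrapped in a `BufferedInputStream` (bytes) or a `BufferedReader` (characters), so that single-element `read()` calls are served from an in-memory buffer.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of OpenNLP.
class BufferedReadSketch {

  // Counts bytes one at a time. Without the BufferedInputStream wrapper,
  // each read() on the FileInputStream would go to native code.
  public static long countBytes(Path file) throws IOException {
    try (InputStream in = new BufferedInputStream(new FileInputStream(file.toFile()))) {
      long count = 0;
      while (in.read() != -1) {
        count++;
      }
      return count;
    }
  }

  // Counts decoded characters, using the same wrapping as the diff above:
  // BufferedReader -> InputStreamReader -> FileInputStream.
  public static long countChars(Path file, Charset encoding) throws IOException {
    try (Reader in = new BufferedReader(
        new InputStreamReader(new FileInputStream(file.toFile()), encoding))) {
      long count = 0;
      while (in.read() != -1) {
        count++;
      }
      return count;
    }
  }

  public static void main(String[] args) throws IOException {
    // Assumption: a throwaway temp file stands in for a real dictionary file.
    Path tmp = Files.createTempFile("sketch", ".txt");
    try {
      Files.write(tmp, "alpha\nbeta\n".getBytes(StandardCharsets.UTF_8));
      System.out.println("bytes: " + countBytes(tmp));
      System.out.println("chars: " + countChars(tmp, StandardCharsets.UTF_8));
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```

Both helpers use try-with-resources, so closing the outermost wrapper also closes the underlying `FileInputStream`. How much this helps in practice depends on how the caller reads: code that already reads in large chunks gains less than code that reads byte by byte.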