[ https://issues.apache.org/jira/browse/OPENNLP-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17836992#comment-17836992 ]
ASF GitHub Bot commented on OPENNLP-589:
----------------------------------------

rzo1 commented on code in PR #596:
URL: https://github.com/apache/opennlp/pull/596#discussion_r1564855870

##########
opennlp-tools/src/main/java/opennlp/tools/ml/maxent/RealBasicEventStream.java:
##########
@@ -49,13 +55,14 @@ public Event read() throws IOException {
   }

   private Event createEvent(String obs) {
-    int lastSpace = obs.lastIndexOf(' ');
-    if (lastSpace == -1)
+    int si = obs.indexOf(' ');
+    if (si == 0)

Review Comment:
   Shouldn't that be `-1`? Otherwise, it might produce an index-out-of-bounds exception in the else branch, because nothing prevents the caller from inserting `bla` as an element in the event stream (although that does not comply with the text format).

> Text format of Events inconsistent across different implementations of EventStreamReaders
> -----------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-589
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-589
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Machine Learning
>   Affects Versions: maxent-3.0.3, 2.0.0, 2.1.0, 2.2.0, 2.3.0
>            Reporter: Marcin Junczys-Dowmunt
>            Assignee: Martin Wiesner
>            Priority: Minor
>             Fix For: 2.3.3
>
>
> BasicEventStream expects events to be written to text files as:
> context1 context2 context3 ... outcome
> FileEventStream expects events to be written to text files as:
> outcome context1 context2 context3 ...
> toString() of Event creates:
> outcome [context1 context2 context3 ...] (note the square brackets, which become part of the context predicates when splitting on spaces).
> This is highly confusing and took me some time to understand. I guess this should be unified?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
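
To illustrate the review comment on the `si == 0` check: `String.indexOf(' ')` returns `-1` when the line contains no space, so a check against `0` alone would let an input such as `bla` reach the `substring` call. The following is only a minimal sketch of that behaviour, not the code from the PR; the class `SplitGuardSketch` and the helper `firstToken` are hypothetical names introduced here, and rejecting a leading space as well (`si <= 0`) is an assumption about what a guard might want.

```java
final class SplitGuardSketch {

  // Hypothetical helper (not the OpenNLP implementation): returns the text
  // before the first space, or null if the line has no usable separator.
  static String firstToken(String obs) {
    int si = obs.indexOf(' ');
    // indexOf returns -1 when there is no space at all; 0 means the line
    // starts with a space, so the first token would be empty.
    if (si <= 0) {
      return null;
    }
    return obs.substring(0, si);
  }

  public static void main(String[] args) {
    // No space in the input: indexOf yields -1, not 0.
    System.out.println("bla".indexOf(' '));

    // Without a guard, substring(0, -1) throws.
    try {
      "bla".substring(0, "bla".indexOf(' '));
    } catch (StringIndexOutOfBoundsException e) {
      System.out.println("unguarded substring throws");
    }

    // With the guard, malformed input is rejected instead.
    System.out.println(firstToken("outcome context1 context2"));
    System.out.println(firstToken("bla"));
  }
}
```

Whether a malformed line should yield `null`, be skipped, or raise an error is a design decision for the PR; the sketch only shows why `-1` has to be covered by the check.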