Can you try using this other constructor? PlainTextByLineStream(FileChannel channel, Charset encoding)
I don't know if it is related, but internally we don't use the one that takes an InputStream.

Let me know what happens.

Thank you,
William

2013/11/21 Jörn Kottmann <[email protected]>

> Please post the exception with stack trace here.
>
> Jörn
>
> On 11/21/2013 07:53 AM, Walrus theCat wrote:
>
>> To update, when I create the stream as above
>> (PlainTextByLineStream(ByteArrayInputStream)) I get the "Stream not marked"
>> error when attempting to cross validate (but not when just evaluating on
>> the training data). When I, instead, create the PlainTextByLineStream on a
>> BufferedReader (see below), I get the error "Model not compatible with
>> name finder!" during training. The result is I can't cross validate,
>> something I really need to do.
>>
>> def linesToStream(lines:Array[String]) = {
>>   val charset = Charset.forName(CHARSET)
>>   val reader = new BufferedReader(new InputStreamReader(new
>>     ByteArrayInputStream(lines.mkString("\n").getBytes(CHARSET))))
>>   new NameSampleDataStream(
>>     new PlainTextByLineStream(
>>       reader))
>> }
>>
>> On Wed, Nov 20, 2013 at 5:42 PM, Walrus theCat <[email protected]> wrote:
>>
>>> Thanks for the reply, even though I was kind of rude. I'm using the API.
>>> The evaluator gives me suspiciously high metrics, and the cross validator
>>> fails out as mentioned.
>>>
>>> The code is in Scala:
>>>
>>> def linesToStream(lines:Array[String]) = {
>>>   val charset = Charset.forName(CHARSET)
>>>   new NameSampleDataStream(
>>>     new PlainTextByLineStream(
>>>       new ByteArrayInputStream(lines.mkString("\n").getBytes(CHARSET)), charset))
>>> }
>>>
>>> I train the model with the above:
>>> NameFinderME.train("en", entityName, linesToStream(lines),
>>>   TrainingParameters.defaultParams(),
>>>   null:Array[Byte], Collections.emptyMap[String, Object]());
>>>
>>> When it comes time to evaluate, I recreate the stream to try to circumvent
>>> these kinds of problems ("resetting" it also throws the same error):
>>>
>>> val crossValidator = new TokenNameFinderCrossValidator("en",
>>>   entityName, TrainingParameters.defaultParams(),
>>>   null:Array[Byte], Collections.emptyMap[String, Object](),
>>>   listener)
>>> crossValidator.evaluate(sampleStream, 10)
>>>
>>> Thanks
>>>
>>> On Wed, Nov 20, 2013 at 3:43 PM, William Colen <[email protected]> wrote:
>>>
>>>> Are you using the API or the command line tools? Can you send a code
>>>> snippet showing how you load the ObjectStream?
>>>>
>>>> 2013/11/20 Walrus theCat <[email protected]>
>>>>
>>>>> I'm getting "java.io.IOException: Stream not marked" when calling
>>>>> TokenNameFinderCrossValidator.evaluate with a NameSampleDataStream. This
>>>>> works when I use a TokenNameFinderEvaluator instead. I'm led to believe
>>>>> that .reset isn't called on the stream in the CrossValidator.
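
For reference, the message "Stream not marked" is the text that java.io.BufferedReader.reset() uses when mark() was never called on the reader. The sketch below reproduces that with the JDK alone, building the reader over in-memory lines much as linesToStream does. It does not touch OpenNLP, so whether this is the exact path hit inside the cross validator is an assumption. The names StreamNotMarkedDemo, readerFor and resetMessage are made up for the example.

```scala
import java.io.{BufferedReader, ByteArrayInputStream, IOException, InputStreamReader}
import java.nio.charset.StandardCharsets

object StreamNotMarkedDemo {
  // Build a reader over in-memory lines (hypothetical helper, UTF-8 assumed).
  def readerFor(lines: Array[String]): BufferedReader =
    new BufferedReader(new InputStreamReader(
      new ByteArrayInputStream(lines.mkString("\n").getBytes(StandardCharsets.UTF_8)),
      StandardCharsets.UTF_8))

  // Try to rewind the reader; return the exception message if reset() fails.
  def resetMessage(reader: BufferedReader): Option[String] =
    try { reader.reset(); None }
    catch { case e: IOException => Some(e.getMessage) }

  def main(args: Array[String]): Unit = {
    val lines = Array("first line", "second line")

    // Case 1: reset() without a prior mark() fails.
    val unmarked = readerFor(lines)
    unmarked.readLine()
    println(resetMessage(unmarked)) // Some(Stream not marked)

    // Case 2: mark() before reading lets reset() rewind to the start.
    val marked = readerFor(lines)
    marked.mark(1024) // read-ahead limit comfortably larger than the data
    marked.readLine()
    println(resetMessage(marked))   // None
    println(marked.readLine())      // first line
  }
}
```

If the same thing is happening here, a stream that can be reopened from the start (such as the FileChannel constructor suggested above) would avoid relying on mark/reset at all.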
