William,

Using the FileChannel constructor seems to have done away with that error.
I'll post back if anything goes awry, but it's nice to have at least worked
past that.
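
A minimal sketch of that approach, for reference. It assumes the lines are
first written to a temporary file so that a FileChannel can be opened on them.
The object name StreamSketch and the helpers linesToChannel, readAll and
resetUnmarked are hypothetical names used for illustration, not part of
OpenNLP. Only JDK classes are used, so the OpenNLP wiring appears as a comment.
The last helper shows where the "Stream not marked" message comes from in the
JDK: BufferedReader.reset() raises it when mark() was never called. Whether
that is exactly the code path the cross validator hits is an assumption, not
something confirmed in this thread.

```scala
import java.io.{BufferedReader, IOException, StringReader}
import java.nio.channels.{Channels, FileChannel}
import java.nio.charset.{Charset, StandardCharsets}
import java.nio.file.{Files, StandardOpenOption}

object StreamSketch {

  // Hypothetical helper: dump the lines into a temp file and open it read-only.
  def linesToChannel(lines: Array[String],
                     charset: Charset = StandardCharsets.UTF_8): FileChannel = {
    val tmp = Files.createTempFile("name-samples", ".txt")
    tmp.toFile.deleteOnExit()
    Files.write(tmp, lines.mkString("\n").getBytes(charset))
    FileChannel.open(tmp, StandardOpenOption.READ)
  }

  // With OpenNLP on the classpath, the channel would be wrapped using the
  // constructor William mentioned:
  //   new NameSampleDataStream(
  //     new PlainTextByLineStream(linesToChannel(lines), charset))

  // Rewind the channel to the start and read every line from it.
  def readAll(channel: FileChannel, charset: Charset): List[String] = {
    channel.position(0)
    val reader = new BufferedReader(Channels.newReader(channel, charset.name()))
    Iterator.continually(reader.readLine()).takeWhile(_ != null).toList
  }

  // Call reset() on a reader that was never marked; return the error message.
  def resetUnmarked(): Option[String] =
    try {
      new BufferedReader(new StringReader("x")).reset()
      None
    } catch {
      case e: IOException => Some(e.getMessage)
    }
}
```

A channel can be rewound with position(0) as often as needed, which is
presumably why this constructor tolerates the repeated passes that cross
validation makes over the data.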

Thanks


On Thu, Nov 21, 2013 at 9:36 AM, Walrus theCat <[email protected]>wrote:

> Thanks William,
>
> I'll give it a shot.  I really need to be able to work with String[]s, as
> concatenating them all into a new file and reading it back is not that
> scalable.  I'll  let you know how it works out.
>
>
> On Thu, Nov 21, 2013 at 5:12 AM, William Colen <[email protected]>wrote:
>
>> Can you try using this other constructor?
>>  PlainTextByLineStream(FileChannel channel, Charset encoding)
>>
>> I don't know if it is related, but internally we don't use the one that
>> takes an InputStream.
>>
>> Let me know what happens.
>>
>>
>> Thank you,
>>
>> William
>>
>>
>> 2013/11/21 Jörn Kottmann <[email protected]>
>>
>> > Please post the exception with stack trace here.
>> >
>> > Jörn
>> >
>> >
>> >
>> > On 11/21/2013 07:53 AM, Walrus theCat wrote:
>> >
>> >> To update, when I create the stream as above
>> >> (PlainTextByLineStream(ByteArrayInputStream)) I get the "Stream not
>> >> marked" error when attempting to cross validate (but not when just
>> >> evaluating on the training data).  When I, instead, create the
>> >> PlainTextByLineStream on a BufferedReader (see below), I get the error
>> >> "Model not compatible with name finder!" during training.  The result is
>> >> I can't cross validate, something I really need to do.
>> >>
>> >>
>> >>    def linesToStream(lines:Array[String]) = {
>> >>      val charset = Charset.forName(CHARSET)
>> >>      // Decode with the same charset that was used to encode the bytes
>> >>      val reader = new BufferedReader(new InputStreamReader(
>> >>          new ByteArrayInputStream(lines.mkString("\n").getBytes(charset)),
>> >>          charset))
>> >>      new NameSampleDataStream(
>> >>          new PlainTextByLineStream(
>> >>              reader))
>> >>    }
>> >>
>> >>
>> >> On Wed, Nov 20, 2013 at 5:42 PM, Walrus theCat <[email protected]> wrote:
>> >>
>> >>> Thanks for the reply, even though I was kind of rude.  I'm using the API.
>> >>> The evaluator gives me suspiciously high metrics, and the cross validator
>> >>> fails out as mentioned.
>> >>>
>> >>> The code is in Scala:
>> >>>
>> >>>    def linesToStream(lines:Array[String]) = {
>> >>>      val charset = Charset.forName(CHARSET)
>> >>>      new NameSampleDataStream(
>> >>>          new PlainTextByLineStream(
>> >>>              new ByteArrayInputStream(lines.mkString("\n").getBytes(CHARSET)),
>> >>>              charset))
>> >>>    }
>> >>>
>> >>> I train the model with the above:
>> >>>        NameFinderME.train("en", entityName, linesToStream(lines),
>> >>>              TrainingParameters.defaultParams(),
>> >>>              null:Array[Byte], Collections.emptyMap[String, Object]());
>> >>>
>> >>> When it comes time to evaluate, I recreate the stream to try to circumvent
>> >>> these kinds of problems ("resetting" it also throws the same error):
>> >>>
>> >>>      val crossValidator = new TokenNameFinderCrossValidator("en",
>> >>>              entityName, TrainingParameters.defaultParams(),
>> >>>              null:Array[Byte], Collections.emptyMap[String, Object](),
>> >>>              listener)
>> >>>      crossValidator.evaluate(sampleStream, 10)
>> >>>
>> >>> Thanks
>> >>>
>> >>>
>> >>>
>> >>> On Wed, Nov 20, 2013 at 3:43 PM, William Colen <[email protected]> wrote:
>> >>>
>> >>>> Are you using the API or the command line tools? Can you send a code
>> >>>> snippet showing how you load the ObjectStream?
>> >>>>
>> >>>>
>> >>>> 2013/11/20 Walrus theCat <[email protected]>
>> >>>>
>> >>>>> I'm getting "java.io.IOException: Stream not marked" when calling
>> >>>>> TokenNameFinderCrossValidator.evaluate with a NameSampleDataStream.
>> >>>>> This works when I use a TokenNameFinderEvaluator instead.  I'm led to
>> >>>>> believe that .reset isn't called on the stream in the CrossValidator.
>> >>>>>
>> >>>>>
>> >>>
>> >
>>
>
>
