I created TIKA-679 to continue this discussion.
https://issues.apache.org/jira/browse/TIKA-679

On Tue, Jun 21, 2011 at 4:17 PM, Nick Burch <[email protected]> wrote:

> On Mon, 20 Jun 2011, Troy Witthoeft wrote:
>
>> I made some changes and brought it in line with other Tika parser examples I
>> have seen. I've looked over IOUtils; however, I'm a bit rusty on my Java. By
>> rusty I mean inept.
>>
>
> If you want, open a new JIRA issue and upload a small sample CADKEY file along
> with your code so far. I'll be happy to take a look and tweak it slightly
> when I next have a minute.
>
>
>> Note: I found a simpler prefix that delineates the start of user text.
>> [0x01] [0x1F]
>>
>
> Unless we can figure out the file/record structure better, it might be
> safer to search for a longer sequence than that (e.g. all the 33s you
> mentioned in another email). 0x01 0x1F could potentially turn up elsewhere
> in a file, so we should aim for a more discriminating test.
>
> Nick
>
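
To make the point about marker length concrete, here is a minimal sketch of a
byte-sequence search. It is not Tika code: the class name MarkerSearch is
hypothetical, and the marker bytes are passed in by the caller because the real
longer sequence still has to be confirmed against a sample file.

```java
// Hypothetical helper, not part of Tika: finds a byte marker in a buffer.
final class MarkerSearch {
    private MarkerSearch() {}

    /**
     * Returns the index of the first occurrence of marker in data,
     * or -1 if the marker does not occur. An empty marker matches at 0.
     */
    static int indexOf(byte[] data, byte[] marker) {
        if (marker.length == 0) {
            return 0;
        }
        // Try every start position that leaves room for the whole marker.
        for (int i = 0; i <= data.length - marker.length; i++) {
            int j = 0;
            while (j < marker.length && data[i + j] == marker[j]) {
                j++;
            }
            if (j == marker.length) {
                return i;
            }
        }
        return -1;
    }
}
```

Calling MarkerSearch.indexOf(fileBytes, new byte[] {0x01, 0x1F}) returns the
first place those two bytes appear, which may not be the start of user text. A
longer marker array gives fewer accidental matches.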
