Does anyone know of an existing parser or library that can extract the
user-entered text from a Cadkey .prt file?

I don't think it should be that hard to implement. For instance, opening
a .prt file in a text editor shows that the user-entered text fields are
stored as ASCII(?) characters.
Here's an image of a .prt file open in Notepad: [http://i.imgur.com/CPTU0.png]
I'm not sure about the file encoding (UTF-8?), but it's these characters
that I would like to extract.
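
For what it's worth, here is a minimal sketch of that idea in plain Java: scan
the raw bytes and keep runs of printable ASCII, much like the Unix `strings`
tool does. The class name PrtStrings, the method extractAsciiRuns, and the
minimum run length of 4 are all assumptions for illustration. Nothing here
comes from a Cadkey format specification, and it will not help if the text
turns out to be stored in some other encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class PrtStrings {
    // Assumed threshold: shorter runs are treated as binary noise.
    static final int MIN_RUN = 4;

    // Returns every run of printable ASCII (0x20..0x7E) at least minRun bytes long.
    static List<String> extractAsciiRuns(byte[] data, int minRun) {
        List<String> runs = new ArrayList<>();
        int start = -1; // index where the current run began, or -1 if none
        for (int i = 0; i <= data.length; i++) {
            // Bytes >= 0x80 are negative in Java, so they fail the lower bound.
            boolean printable = i < data.length && data[i] >= 0x20 && data[i] <= 0x7E;
            if (printable) {
                if (start < 0) {
                    start = i;
                }
            } else if (start >= 0) {
                if (i - start >= minRun) {
                    runs.add(new String(data, start, i - start, StandardCharsets.US_ASCII));
                }
                start = -1;
            }
        }
        return runs;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PrtStrings <file.prt>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        for (String run : extractAsciiRuns(data, MIN_RUN)) {
            System.out.println(run);
        }
    }
}
```

This reads the whole file into memory, which is probably fine for a first
experiment. The output will likely include non-user strings (layer names,
internal identifiers), so some filtering would still be needed.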

I've played around with writing a few parsers, registering them in
org.apache.tika.parser.Parser and adding their types to tika-mimetypes.xml.
Currently I have a dummy PRTparser that functions inside the tika-app.
It seems to have accurate magic MIME-type detection:
    <match value="0M3C" type="string" offset="8"/>
but it only outputs dummy data at the moment.
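
As a sanity check outside of Tika, the same magic rule can be sketched in
plain Java: compare the four bytes at offset 8 against "0M3C". The class name
PrtMagic and the method looksLikePrt are hypothetical. This only mirrors the
rule quoted above and is not how Tika implements its detection.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PrtMagic {
    // Values taken from the match rule above: string "0M3C" at offset 8.
    static final byte[] MAGIC = "0M3C".getBytes(StandardCharsets.US_ASCII);
    static final int OFFSET = 8;

    // True if the header is long enough and carries the magic at the offset.
    static boolean looksLikePrt(byte[] header) {
        if (header.length < OFFSET + MAGIC.length) {
            return false;
        }
        byte[] window = Arrays.copyOfRange(header, OFFSET, OFFSET + MAGIC.length);
        return Arrays.equals(window, MAGIC);
    }
}
```

A check like this could guard a byte-scanning extractor so that it only runs
on files that actually carry the magic string.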

Can someone point me in the right direction in order to create a parser for
these files?
