Greetings.  I am afraid I owe you an apology -- I went to make some mods to 
Tika for the app we are working on, and that got me into the code for 
text/plain translation to xhtml.  For some reason I thought the translation 
of special characters wasn't being done -- I could have sworn it didn't work 
before -- but I find out now that my examples work after all.  Mea culpa.

The only good thing that came of this exercise was just that -- it was a 
good exercise to climb around the Java hierarchy and get a feel for the way 
Tika is organized, as well as for getting some practice with Java, which is 
a new language for me.

This leaves just one change to Tika that I wonder about, as it might be more 
appropriate to put it in the app itself rather than in Tika.
Our app will be an editor/transcriber tool for producing braille from print 
books or other files.  The leader of this project wants newlines to be 
handled as follows: 2 consecutive newlines are to generate a <p> paragraph 
marker.
In addition, he is concerned about the handling of carriage return/newline 
(CR LF) sequences and how they should affect the flow.  I still need to pin 
him down on exactly what should happen.
Anyway, this needs to be specified before I can do anything with it, but the 
problem does affect Tika if I use it for text/plain files, since by the 
time the text gets to the user it will already have been rendered from the 
xhtml.
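
To make the newline rule concrete, here is a rough sketch in plain Java of 
what I have in mind.  It is not Tika code and uses no Tika classes; the 
class and method names are made up for illustration.  Since the CR LF 
behavior has not been specified yet, the sketch simply assumes that CR LF 
and a lone CR count the same as a newline, and that a single newline inside 
a paragraph becomes a space.  Both assumptions may well change.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Tika.
class ParagraphSketch {

    // Split plain text into paragraphs at runs of two or more newlines.
    static List<String> splitParagraphs(String text) {
        // Assumption: CR LF and lone CR are treated the same as LF.
        String normalized = text.replace("\r\n", "\n").replace('\r', '\n');
        List<String> paragraphs = new ArrayList<>();
        for (String chunk : normalized.split("\n{2,}")) {
            // Assumption: a single newline inside a paragraph becomes a space.
            String joined = chunk.replace('\n', ' ').trim();
            if (!joined.isEmpty()) {
                paragraphs.add(joined);
            }
        }
        return paragraphs;
    }

    // Escape the characters that are special in xhtml text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Wrap each paragraph in a <p> element, one per line.
    static String toXhtml(String text) {
        StringBuilder out = new StringBuilder();
        for (String p : splitParagraphs(text)) {
            out.append("<p>").append(escape(p)).append("</p>\n");
        }
        return out.toString();
    }
}
```

Something like this could live in the app, applied to the raw text before 
anything else sees it, or it could be folded into a text/plain handler 
later once the rules are settled.
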
I will be discussing this issue with the group and if I need to post again I 
will definitely try to be more prepared...:-/
Thanks for the comments.
--le
