[ 
https://issues.apache.org/jira/browse/TIKA-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018931#comment-14018931
 ] 

Tim Allison commented on TIKA-1323:
-----------------------------------

Hi Sergey,

  For TIKA-1302, I'd like to use tika-server, and I'd like to be able to record 
exceptions at a per-file level so that we can say, e.g., with Tika 1.5 we had 
515 exceptions on docx files, but with Tika-1.6-SNAPSHOT we had 1025, or 
something similar.  I'd also like to be able to say: we had an exception on 
file 12345.docx with Tika 1.5, but we're not getting an exception with 
Tika-1.6-SNAPSHOT.  We can do that now with tika-server on the client side.  If 
my client receives a 422 or 500, I know that something went wrong, and I can 
log it.
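
As a rough sketch of that client-side bookkeeping (not actual tika-server or 
TIKA-1302 code; the class and method names are made up for illustration, and 
the HTTP call itself is left out), something like this would cover both 
comparisons.  It assumes the client records one status code per file per run:

```java
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper: one instance per Tika version under evaluation.
final class ExceptionTally {
    // file name -> HTTP status the client received for that file
    private final Map<String, Integer> statusByFile = new TreeMap<>();

    void record(String fileName, int httpStatus) {
        statusByFile.put(fileName, httpStatus);
    }

    // 422 and 500 are the codes the client treats as "something went wrong".
    static boolean isFailure(int status) {
        return status == 422 || status == 500;
    }

    static String extension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // e.g. {docx=515} for one version vs. {docx=1025} for another
    Map<String, Integer> failuresByExtension() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> e : statusByFile.entrySet()) {
            if (isFailure(e.getValue())) {
                counts.merge(extension(e.getKey()), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Files that failed in this run but succeeded in the newer run.
    Set<String> fixedIn(ExceptionTally newer) {
        Set<String> fixed = new TreeSet<>();
        for (Map.Entry<String, Integer> e : statusByFile.entrySet()) {
            Integer newStatus = newer.statusByFile.get(e.getKey());
            if (isFailure(e.getValue()) && newStatus != null && !isFailure(newStatus)) {
                fixed.add(e.getKey());
            }
        }
        return fixed;
    }
}
```

The client would call record() once per response and compare the two tallies 
after both runs have finished.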

However, what I'd also like to be able to do is identify the frequency of 
stacktrace elements so that we can sort the most frequent exceptions per 
document type.  To do this, we need to be able to record the stacktrace, and 
I'd also like to be able to link the stacktrace back to the document that 
caused the problem. 
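
To make the idea concrete, here is a minimal sketch of the frequency count, 
assuming the evaluation code already has the stacktrace text for each failing 
file (however it gets there).  The class name is hypothetical, and the parsing 
simply treats every line beginning with "at " as a stack frame:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Tika.
final class StackFrameFrequency {
    // Input: file name -> stacktrace text for that file.
    // Output: frames ordered by how many files hit them, most frequent first,
    // each paired with the files whose stacktrace contained that frame.
    static List<Map.Entry<String, SortedSet<String>>> rank(Map<String, String> traceByFile) {
        Map<String, SortedSet<String>> filesByFrame = new TreeMap<>();
        for (Map.Entry<String, String> e : traceByFile.entrySet()) {
            for (String line : e.getValue().split("\\R")) {
                String frame = line.trim();
                if (frame.startsWith("at ")) {
                    // A set per frame, so a frame repeated within one trace
                    // still counts that file only once.
                    filesByFrame.computeIfAbsent(frame, k -> new TreeSet<>()).add(e.getKey());
                }
            }
        }
        List<Map.Entry<String, SortedSet<String>>> ranked =
                new ArrayList<>(filesByFrame.entrySet());
        // Stable sort: ties stay in alphabetical frame order from the TreeMap.
        ranked.sort((a, b) -> Integer.compare(b.getValue().size(), a.getValue().size()));
        return ranked;
    }
}
```

Running rank() separately over the docx files, the pdf files, and so on would 
give the per-document-type ordering, and the file sets link each frame back to 
the documents that triggered it.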

If I run Tika directly via Java code (what I've been doing), I can easily catch 
the exceptions and log the information on a per-file basis.  So, my preference 
(plan A) would be to have tika-server return the stacktrace as the body content 
for exceptions.  We can parameterize this functionality on the command line, of 
course.  The other option (plan B) would be to pass the file name to 
tika-server, and have tika-server log the file name in conjunction with the 
stacktrace, but that is not as appealing to me.  The third option, of course, 
is to set up a different service for evaluation, but I'd much prefer to use our 
base code as much as possible.
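
For plan A, the core of it could be as small as the following sketch.  This is 
not existing tika-server code: the class name and the includeStackTrace flag 
are assumptions standing in for whatever command-line switch we would add, and 
the wiring into the JAX-RS response is left out:

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical helper: builds the body for an error response.
final class ErrorBody {
    // Returns the full stacktrace text when the flag is on, and an empty
    // body otherwise.
    static String forException(Throwable t, boolean includeStackTrace) {
        if (!includeStackTrace) {
            return "";
        }
        StringWriter buffer = new StringWriter();
        try (PrintWriter out = new PrintWriter(buffer)) {
            t.printStackTrace(out);
        }
        return buffer.toString();
    }
}
```

The status code would stay as it is today; only the body changes when the flag 
is set.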

So, is plan A reasonable?



> Improve logging in JAX-RS server
> --------------------------------
>
>                 Key: TIKA-1323
>                 URL: https://issues.apache.org/jira/browse/TIKA-1323
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>
> I'd like to use tika-server for TIKA-1302.  As part of that, I'd like to 
> record exception stacktraces per document.  I see two options: transmit the 
> info back to the client (assuming a doc didn't bring the server down :) ) 
> along with the current error code or log the document id and stacktrace via 
> the server.  Given my current design thoughts, I'd prefer the first option.
> Any objections or recommendations?



--
This message was sent by Atlassian JIRA
(v6.2#6252)
