Re: Fixing the &lt;title/&gt; problem of TIKA-895 and TIKA-914</span></a></span> </h1> <p class="darkgray font13"> <span class="sender pipe"><a href="/search?l=dev@tika.apache.org&q=from:%22John+M%22" rel="nofollow"><span itemprop="author" itemscope itemtype="http://schema.org/Person"><span itemprop="name">John M</span></span></a></span> <span class="date"><a href="/search?l=dev@tika.apache.org&q=date:20120716" rel="nofollow">Mon, 16 Jul 2012 06:38:49 -0700</a></span> </p> </div> <div itemprop="articleBody" class="msgBody"> <!--X-Body-of-Message--> <pre>I'm using the command line and the tika-app jar, as are the creators of
TIKA-895 and TIKA-914, so my setup is just Java 6 with a recent Tika
build and nothing more. I got the JDK from Oracle's website.

Do you know whether the problem is absent in an older version of Tika,
or with a different version of Java?</pre><pre>
John

On Mon, Jul 16, 2012 at 7:34 AM, Jukka Zitting &lt;jukka.zitt...@gmail.com&gt; wrote:
> Hi,
>
> On Sun, Jul 15, 2012 at 5:29 PM, John M &lt;jfm.apa...@gmail.com&gt; wrote:
>> So, is it a bug in the SAX library: that the line
>> "super.characters(new char[0], 0, 0);" in the XHTMLContentHandler
>> should work (but doesn't)?
>
> Yes, or the SAX library you're using could treat that as a feature
> (automatically ignoring empty content).
>
> What's the SAX library you're using to serialize the output from Tika?
> You may also want to try the ToXMLContentHandler class in o.a.t.sax.
> It can serialize SAX events and doesn't suffer from this problem.
>
> BR,
>
> Jukka Zitting
</pre> </div> </div> </div> </body> </html>