Re: Checkstyle linelength

2021-04-14 Thread Tim Allison
Done. 120 column CRT is in the mail... I should be good... lol Seriously, though, let me know what else needs to be changed there. On Wed, Apr 14, 2021 at 8:00 PM Tim Allison wrote: > Wait, monitors?! And what, may I ask, is so wrong with punchcards[1]...I > had just figured out how to modify

Re: Checkstyle linelength

2021-04-14 Thread Tim Allison
Wait, monitors?! And what, may I ask, is so wrong with punchcards[1]...I had just figured out how to modify my punchcard reader to get 100. Kidding, will bump to 120 tomorrow. [1] https://www.emacswiki.org/emacs/EightyColumnRule On Wed, Apr 14, 2021 at 6:10 PM Subhajit Das wrote: > Yes 120

Re: Checkstyle linelength

2021-04-14 Thread Subhajit Das
Yes, 120 would be more reasonable. And it should be integrated in branch_1x as well, to keep consistency. On Apr 15 2021, at 2:22 am, Peter Kronenberg wrote: > > My personal opinion is that the checkstyle line length restriction of 100 > characters is too short. Can we increase this to maybe 120

Checkstyle linelength

2021-04-14 Thread Peter Kronenberg
My personal opinion is that the checkstyle line length restriction of 100 characters is too short. Can we increase this to maybe 120 at least? 100 characters was good 20 years ago when we were all on 15" monitors. Peter Kronenberg | Senior AI Analytic ENGINEER C: 703.887.5623 [Torch

Re: Parsing XML

2021-04-14 Thread Tim Allison
We don't currently support that, but it should be fairly straightforward to drop a custom xml parser in the framework. On Wed, Apr 14, 2021 at 12:59 PM Peter Kronenberg wrote: > When parsing an XML file, is there any configuration where I can pass an > Xpath, for example, to only get the pieces

Tika Server Resource method priority/order

2021-04-14 Thread Subhajit Das
Hi, In Tika server, if a request is made against a REST resource and it matches multiple resource handler methods (like getXML and getText), is there a predefined order for resolving the method? A warning is always printed in such a case. Practically I have seen uncertainty in

Parsing XML

2021-04-14 Thread Peter Kronenberg
When parsing an XML file, is there any configuration where I can pass an Xpath, for example, to only get the pieces that I want? Peter Peter Kronenberg | Senior AI Analytic ENGINEER C: 703.887.5623 [Torch AI] 4303 W. 119th St., Leawood, KS 66209
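
For illustration only, here is a minimal sketch of the kind of XPath filtering the question describes, using just the JDK's built-in javax.xml.xpath API. It is not a Tika feature or a Tika parser; the class name XPathExtractSketch, the method extract, and the sample document are all hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Tika: pulls the text of every node
// matched by an XPath expression out of an XML string.
class XPathExtractSketch {

    static List<String> extract(String xml, String expression)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Evaluate the expression against the document and ask for a node set.
        NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);

        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            texts.add(nodes.item(i).getTextContent());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document for the sketch.
        String xml = "<doc><title>First</title><body>skip</body>"
                + "<title>Second</title></doc>";
        System.out.println(extract(xml, "/doc/title")); // prints [First, Second]
    }
}
```

This sketch parses with the JDK's default settings; for untrusted XML the underlying parser would need to be hardened (for example against external entities) before use.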

RE: Parsing PDF file - setting threshold of unmapped characters

2021-04-14 Thread Peter Kronenberg
The numbers I suggested were actually from Tim a week or 2 ago. Of course, the idea is to allow the user to adjust them, so if the default numbers don't work for a particular scenario, they can be changed. It sounds like the best solution for the best vs fast discussion is to just make it an

RE: Parsing PDF file - setting threshold of unmapped characters

2021-04-14 Thread Nick Burch
On Wed, 14 Apr 2021, Peter Kronenberg wrote: > Anyone have any thoughts on this? I think both an absolute and a percentage would be good, but I don't have enough experience to comment on your suggested numbers for those two thresholds, sorry! Your idea on best vs fast touches on much older
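
As a rough sketch of what "both an absolute and a percentage" threshold could look like, assuming the two limits are user-adjustable and that exceeding either one counts as too many unmapped characters. The class, method, and parameter names are hypothetical, the either-or combination is an assumption, and none of this is taken from Tika's code or from the numbers proposed in the thread.

```java
// Hypothetical sketch, not Tika's implementation: decides whether a page has
// too many unmapped characters, given two user-adjustable limits.
class UnmappedThresholdSketch {

    static boolean exceedsThreshold(int unmappedChars, int totalChars,
                                    int maxUnmappedCount,
                                    double maxUnmappedFraction) {
        if (totalChars <= 0) {
            return false; // nothing extracted, nothing to judge
        }
        // Absolute limit: more unmapped characters than allowed outright.
        boolean overAbsolute = unmappedChars > maxUnmappedCount;
        // Relative limit: unmapped characters as a share of all characters.
        boolean overFraction =
                (double) unmappedChars / totalChars > maxUnmappedFraction;
        // Assumption: tripping either limit is enough.
        return overAbsolute || overFraction;
    }
}
```

With both limits exposed as parameters, a caller whose documents do not suit the defaults can change either one independently.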

RE: Parsing PDF file - setting threshold of unmapped characters

2021-04-14 Thread Peter Kronenberg
Anyone have any thoughts on this? Peter Kronenberg | Senior AI Analytic ENGINEER C: 703.887.5623 [Torch AI] 4303 W. 119th St., Leawood, KS 66209 WWW.TORCH.AI From: Peter Kronenberg Sent: Sunday, April 11, 2021 9:21 PM To: user@tika.apache.org;