[jira] [Updated] (TIKA-1048) XMLParser should add whitespace between elements

2012-12-20 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated TIKA-1048: - Attachment: TIKA-1048.patch Patch w/ failing test ... I'm not sure where/how to best fix

Re: [jira] [Updated] (TIKA-1048) XMLParser should add whitespace between elements

2012-12-20 Thread Oleg Tikhonov
Hi Mike, Maybe consider using UIMA (the rule engine)? BR, Oleg On Thu, Dec 20, 2012 at 1:05 PM, Michael McCandless (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/TIKA-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael

Re: [jira] [Updated] (TIKA-1048) XMLParser should add whitespace between elements

2012-12-20 Thread Michael McCandless
Hi Oleg, UIMA could be useful for extracting text from XML (I'm not familiar enough with it...), but I think we should still fix Tika's own XML extraction. Mike McCandless http://blog.mikemccandless.com On Thu, Dec 20, 2012 at 6:14 AM, Oleg Tikhonov o...@apache.org wrote: Hi Mike, Maybe

Re: [jira] [Updated] (TIKA-1048) XMLParser should add whitespace between elements

2012-12-20 Thread Mattmann, Chris A (388J)
+1... Cheers, Chris On 12/20/12 4:23 AM, Michael McCandless luc...@mikemccandless.com wrote: Hi Oleg, UIMA could be useful for extracting text from XML (I'm not familiar enough with it...), but I think we should still fix Tika's own XML extraction. Mike McCandless