[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310271#comment-16310271 ]

Tim Allison commented on SOLR-11701:

Finally back to keyboard. Doh, and thank you!!!

> Upgrade to Tika 1.17 when available
> ---
>
> Key: SOLR-11701
> URL: https://issues.apache.org/jira/browse/SOLR-11701
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Tim Allison
> Assignee: Erick Erickson
> Fix For: 7.3
>
> Attachments: SOLR-11701.patch, SOLR-11701.patch
>
>
> Kicking off release process for Tika 1.17 in the next few days. Please let
> us know if you have any requests.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16303004#comment-16303004 ]

Erick Erickson commented on SOLR-11701:
---

Sorry, forgot to include the comment to close the PR.
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16303002#comment-16303002 ]

ASF subversion and git services commented on SOLR-11701:

Commit c548002569f8bd94c6a8386edc85fdcdc55accaf in lucene-solr's branch refs/heads/branch_7x from [~erickerickson]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c548002 ]

SOLR-11701: Upgrade to Tika 1.17 when available

(cherry picked from commit 7e321d7)
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302993#comment-16302993 ]

ASF subversion and git services commented on SOLR-11701:

Commit 7e321d70df302738358266dfcee892dac79d1c0d in lucene-solr's branch refs/heads/master from [~erickerickson]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7e321d7 ]

SOLR-11701: Upgrade to Tika 1.17 when available
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302685#comment-16302685 ]

Erick Erickson commented on SOLR-11701:
---

OK, I'm having a horrible time merging this. Did you purposely remove lucene/analysis/opennlp/ivy.xml? It's gone from your branch but present in master, _and_ present in the master in your repo.

Or should I assume that anything having to do with opennlp should be left just as it is on master?
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295483#comment-16295483 ]

Tim Allison commented on SOLR-11701:

Sounds good. _Thank you_!

On the git conflict, y, that was caused by the recent addition of opennlp. I've updated the PR, but there are, of course, already new conflicts! :) Let me know if I can do anything to help with that.

On the 401, I'm not sure why that was happening...I'll take a look.

On the unused imports, ugh. Thank you.
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295437#comment-16295437 ]

Erick Erickson commented on SOLR-11701:
---

Tim:

I don't see a problem here then; I was mostly worried that this was an inexplicable problem that would pop out in other places. We'll upgrade slf4j sometime anyway, so it seems to me that just adding the annotation is an OK fix. At most, some of the @Slow or @Nightly or @Weekly tests will error out and need a similar annotation, but we're at the beginning of a new release cycle so there's time for those to flush out.

Any kind of blanket "don't report the full stack trace" seems like a bad idea since the full trace is so often necessary for analysis.

If you find out anything that's germane, let me know; otherwise I'll just annotate and commit (probably this evening).

Thanks for tracking this down!
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295411#comment-16295411 ]

Tim Allison commented on SOLR-11701:

Back to keyboard. You're right in all of the above.

When we bump slf4j from 1.7.7 to 1.7.24, its behavior changes to print out the full stacktrace instead of just the message.

In org.slf4j.helpers.MessageFormatter in 1.7.7, the exception is counted as one of the members of {{argArray}}, and because of the following snippet, the {{throwableCandidate}} is nulled out in the returned {{FormattingTuple}}:

{noformat}
if (L < argArray.length - 1) {
    return new FormattingTuple(sbuf.toString(), argArray, throwableCandidate);
} else {
    return new FormattingTuple(sbuf.toString(), argArray, (Throwable)null);
}
{noformat}

In 1.7.24, there's an added bit of logic before we get to that location that removes the exception from {{argArray}} so that it can't get swept into the message:

{noformat}
Object[] args = argArray;
if (throwableCandidate != null) {
    args = trimmedCopy(argArray);
}
{noformat}

I have in the back of my mind that there was a reason we upgraded slf4j in Tika. I'll look through our git history to see when/why and if we need to do it for the Solr integration.
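The difference described above can be illustrated with a small, self-contained model. This is a sketch only and is not the slf4j source: the class `ThrowableHandlingSketch`, its `Result` type, and the methods `oldStyle` and `newStyle` are hypothetical names, and the logic assumes the behavior is exactly as the comment describes it (in the older version a trailing exception that fills a `{}` placeholder is no longer reported as a throwable; in the newer version it is removed from the arguments first and always kept as the throwable).

```java
import java.util.Arrays;

/** Simplified model of the two behaviors; NOT the slf4j implementation. */
final class ThrowableHandlingSketch {

    /** Rendered message plus the throwable (if any) handed on to the appender. */
    static final class Result {
        final String message;
        final Throwable throwable;

        Result(String message, Throwable throwable) {
            this.message = message;
            this.throwable = throwable;
        }
    }

    /** Replaces "{}" placeholders left to right; unfilled placeholders stay as-is. */
    static String substitute(String pattern, Object[] args) {
        StringBuilder sb = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = pattern.indexOf("{}", from);
            if (at < 0) {
                break; // more arguments than placeholders
            }
            sb.append(pattern, from, at).append(arg);
            from = at + 2;
        }
        return sb.append(pattern.substring(from)).toString();
    }

    /** Counts "{}" placeholders in the pattern. */
    static int countPlaceholders(String pattern) {
        int count = 0;
        for (int i = pattern.indexOf("{}"); i >= 0; i = pattern.indexOf("{}", i + 2)) {
            count++;
        }
        return count;
    }

    /** Returns the last argument if it is a Throwable, else null. */
    static Throwable trailingThrowable(Object[] args) {
        if (args.length == 0) {
            return null;
        }
        Object last = args[args.length - 1];
        return last instanceof Throwable ? (Throwable) last : null;
    }

    /** Models the 1.7.7 behavior as described: a consumed exception is nulled out. */
    static Result oldStyle(String pattern, Object... args) {
        Throwable t = trailingThrowable(args);
        boolean consumed = countPlaceholders(pattern) >= args.length;
        return new Result(substitute(pattern, args), consumed ? null : t);
    }

    /** Models the 1.7.24 behavior as described: the exception is trimmed off first. */
    static Result newStyle(String pattern, Object... args) {
        Throwable t = trailingThrowable(args);
        Object[] trimmed = (t == null) ? args : Arrays.copyOf(args, args.length - 1);
        return new Result(substitute(pattern, trimmed), t);
    }
}
```

Under this model, a pattern with two placeholders and the arguments `(host, e)` yields a one-line message containing `e.toString()` in the old style, and a message with a literal `{}` plus an attached throwable (hence a full stack trace) in the new style, which matches the two log outputs quoted elsewhere in this thread.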
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16294253#comment-16294253 ]

Erick Erickson commented on SOLR-11701:
---

My _guess_ is it's log4j. The line producing the different outputs is in ImplicitSnitch.getHostIp() (see below).

What I'm not sure of at all is whether there's really a problem. The total size of the output files produced by *ant test* in the two cases differed by 7K (out of 930K or so), so apparently it's just this one log message. Why it would be different is a total mystery. And it can be fixed by the *@TestRuleLimitSysouts.Limit(bytes=2)* annotation.

It'd be nice to know why they differ, but perhaps not essential. Anyway, we can discuss as the week progresses.

public String getHostIp(String host) {
  try {
    InetAddress address = InetAddress.getByName(host);
    return address.getHostAddress();
  } catch (Exception e) {
    // This is the line whose output differs between the two runs:
    log.warn("Failed to get IP address from host [{}], with exception [{}] ", host, e);
    return null;
  }
}
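One way to make such a log line independent of how a logging facade treats a trailing exception argument is to format the exception's string form into the message explicitly. The following is a minimal sketch under stated assumptions, not the Solr code: `HostIpSketch` is a hypothetical class name, and it uses `java.util.logging` with `String.format` instead of slf4j purely so that the example is self-contained.

```java
import java.net.InetAddress;
import java.util.logging.Logger;

/** Hypothetical stand-alone variant of the getHostIp() method quoted above. */
final class HostIpSketch {
    private static final Logger LOG = Logger.getLogger(HostIpSketch.class.getName());

    /** Returns the IP address for the host, or null if it cannot be resolved. */
    static String getHostIp(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (Exception e) {
            // %s on the exception uses e.toString(): class name and message only,
            // so no stack trace is printed regardless of the logging backend.
            LOG.warning(String.format(
                "Failed to get IP address from host [%s], with exception [%s]", host, e));
            return null;
        }
    }
}
```

For example, `HostIpSketch.getHostIp("[0")` returns null and logs a single line, because a host string that starts with `[` but is not a bracketed IPv6 literal is rejected before any name lookup. Whether a one-line message or a full stack trace is preferable is a separate question that the thread discusses.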
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16294225#comment-16294225 ]

Tim Allison commented on SOLR-11701:

Ugh. I’m still without keyboard. Can you tell which dependency is now adding more stuff? Will take a look tomorrow. Thank you for making it easy for me to replicate.
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293984#comment-16293984 ]

Erick Erickson commented on SOLR-11701:
---

OK, I see what's happening here, but I'm not quite sure why or what the proper fix would be. A lot more data is being dumped out as a result of this patch; see #WITHOUT THIS PULL and #WITH THIS PULL below. That results in the framework failing the test because it's too verbose (see #ERROR).

This test illustrates the failure, but be aware that if you run just that test it succeeds (in IntelliJ); you have to run the whole suite.

*ImplicitSnitchTest.testGetTags_with_correct_ipv6_format_ip_returns_nothing()*

Doing what the error says and adding @TestRuleLimitSysouts.Limit(bytes=2) causes the test to succeed. I took a quick look at the size of the output file from a test run and there's about a 7K difference with or without this patch (and bumping the limit). Is that sufficient?

*#ERROR*
java.lang.AssertionError: The test or suite printed 9268 bytes to stdout and stderr, even though the limit was set to 8192 bytes. Increase the limit with @Limit, ignore it completely with @SuppressSysoutChecks or run with -Dtests.verbose=true
    at __randomizedtesting.SeedInfo.seed([75A0F6DC6F8B91D6]:0)
    at org.apache.lucene.util.TestRuleLimitSysouts.afterIfSuccessful(TestRuleLimitSysouts.java:211)
    at com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterIfSuccessful(TestRuleAdapter.java:36)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:37)

*#WITHOUT THIS PULL*
Connected to the target VM, address: '127.0.0.1:51314', transport: 'socket'
0 WARN (TEST-ImplicitSnitchTest.testGetTags_with_correct_ipv6_format_ip_returns_nothing-seed#[F055DBD86566DD29]) [] o.a.s.c.c.r.ImplicitSnitch Failed to get IP address from host [[0], with exception [java.net.UnknownHostException: [0: invalid IPv6 address]
4 WARN (TEST-ImplicitSnitchTest.testGetTags_with_correct_ipv6_format_ip_returns_nothing-seed#[F055DBD86566DD29]) [] o.a.s.c.c.r.ImplicitSnitch Failed to match host IP address from node URL [[0:0:0:0:0:0:0:1]:8983_solr] using regex [(?:https?://)?([^:]+):(\d+)]
Disconnected from the target VM, address: '127.0.0.1:51314', transport: 'socket'

*#WITH THIS PULL (abbreviated)*
objc[44364]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home/bin/java (0x1015234c0) and /Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home/jre/lib/libinstrument.dylib (0x1015af4e0). One of the two will be used. Which one is undefined.
1 WARN (TEST-ImplicitSnitchTest.testGetTags_with_correct_ipv6_format_ip_returns_nothing-seed#[75A0F6DC6F8B91D6]) [] o.a.s.c.c.r.ImplicitSnitch Failed to get IP address from host [[0], with exception [{}]
java.net.UnknownHostException: [0: invalid IPv6 address
    at java.net.InetAddress.getAllByName(InetAddress.java:1146)
    at java.net.InetAddress.getAllByName(InetAddress.java:1126)
    at java.net.InetAddress.getByName(InetAddress.java:1076)
    at org.apache.solr.common.cloud.rule.ImplicitSnitch.getHostIp(ImplicitSnitch.java:182)
    at org.apache.solr.common.cloud.rule.ImplicitSnitch.getIpFragments(ImplicitSnitch.java:169)
    at org.apache.solr.common.cloud.rule.ImplicitSnitch.addIpTags(ImplicitSnitch.java:145)
    at org.apache.solr.common.cloud.rule.ImplicitSnitch.getTags(ImplicitSnitch.java:73)
    at org.apache.solr.cloud.rule.ImplicitSnitchTest.testGetTags_with_correct_ipv6_format_ip_returns_nothing(ImplicitSnitchTest.java:117)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
    at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
    at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
    at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
    at
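The failure above comes from a rule that counts the bytes a test writes to stdout and stderr and fails the run when a limit is exceeded. The general idea can be sketched in a few lines. This is an illustration only, not how the Lucene test framework implements `TestRuleLimitSysouts`: the class `SysoutLimitSketch`, the nested `CountingStream`, and the method `runWithLimit` are hypothetical names, and the sketch watches only `System.out`.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

/** Hypothetical sketch of a byte limit on console output. */
final class SysoutLimitSketch {

    /** Passes bytes through to a delegate stream while counting them. */
    static final class CountingStream extends OutputStream {
        private final OutputStream delegate;
        private long count;

        CountingStream(OutputStream delegate) {
            this.delegate = delegate;
        }

        @Override
        public void write(int b) throws IOException {
            delegate.write(b);
            count++;
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            delegate.write(b, off, len);
            count += len;
        }

        long count() {
            return count;
        }
    }

    /**
     * Runs the body with System.out routed through a counter and returns the
     * number of bytes printed; throws AssertionError if the limit is exceeded.
     */
    static long runWithLimit(long limitBytes, Runnable body) {
        PrintStream original = System.out;
        CountingStream counter = new CountingStream(original);
        PrintStream counting = new PrintStream(counter, true);
        System.setOut(counting);
        try {
            body.run();
        } finally {
            counting.flush();
            System.setOut(original); // always restore the real stream
        }
        if (counter.count() > limitBytes) {
            throw new AssertionError("printed " + counter.count()
                + " bytes to stdout, even though the limit was " + limitBytes + " bytes");
        }
        return counter.count();
    }
}
```

With a check like this, a single log call that starts printing a full stack trace instead of a one-line message can push an otherwise unchanged test over the limit, which is consistent with the 9268 versus 8192 bytes reported in the error.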
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293930#comment-16293930 ]

Tim Allison commented on SOLR-11701:

Away from tools now. Will look on Monday. Thank you!
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293525#comment-16293525 ]

Tim Allison commented on SOLR-11701:

Y. Thank you!
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293441#comment-16293441 ]

Erick Erickson commented on SOLR-11701:
---

Double checking since I'm a bit "git challenged". That PR has 67 files and 8 commits changed, although most of the file count comes from checksums. Does this sound right?

The commit history mentions SOLR-11701 and SOLR-11622, but not SOLR-11693; I'm guessing that 11693 is fixed too.
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293243#comment-16293243 ]

Tim Allison commented on SOLR-11701:

K. I turned off the warnings with [d25349d|https://github.com/apache/lucene-solr/pull/291/commits/d25349dba44f8774683863092104fad8ea05c75d], and I reran the integration tests. That _should_ be ready to go.
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292713#comment-16292713 ]

Erick Erickson commented on SOLR-11701:
---

Tim:

Can't get to JIRA right now (seems to have been happening a lot lately). But go ahead and update the PR, I haven't started working on this yet.

Erick
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292681#comment-16292681 ]

Tim Allison commented on SOLR-11701:

One more change... I'd like to turn off the missing jar warnings as the default in Solr. Update to PR coming soon, unless that should be a different issue.
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291955#comment-16291955 ]

Tim Allison commented on SOLR-11701:

Yes, and please. Thank you!
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291949#comment-16291949 ]

Erick Erickson commented on SOLR-11701:
---

OK, probably tomorrow or this weekend. This is PR #291, right?

Plus, should I close the other two linked JIRAs too?
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291935#comment-16291935 ]

Tim Allison commented on SOLR-11701:

I merged [~kramachand...@commvault.com]'s mods and made a few updates for Tika 1.17. I ran an integration test against 643 files in Apache Tika's unit test docs, and I got the same # of documents indexed in Solr as tika-app.jar parsed without exceptions.

{noformat}
public static void main(String[] args) throws Exception {
    Path extracts = Paths.get("C:\\data\\tika_unit_tests_extracts");
    SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/fileupload_passt/").build();
    for (File f : extracts.toFile().listFiles()) {
        try (Reader r = Files.newBufferedReader(f.toPath(), StandardCharsets.UTF_8)) {
            List<Metadata> metadataList = JsonMetadataList.fromJson(r);
            // A null value here means tika-app parsed the file without a runtime exception.
            String ex = metadataList.get(0).get(TikaCoreProperties.TIKA_META_EXCEPTION_PREFIX + "runtime");
            if (ex == null) {
                // Each cleanly parsed file should appear exactly once in Solr.
                SolrQuery q = new SolrQuery("id: "+f.getName().replace(".json", ""));
                QueryResponse response = client.query(q);
                SolrDocumentList results = response.getResults();
                if (results.getNumFound() != 1) {
                    System.err.println(f.getName() + " " + results.getNumFound());
                }
            }
        }
    }
}
{noformat}

I did the usual dance:

{noformat}
ant clean-jars jar-checksums
ant precommit
{noformat}

[~erickerickson], this _should_ be good to go.
[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available
[ https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291932#comment-16291932 ]

ASF GitHub Bot commented on SOLR-11701:
---

GitHub user tballison opened a pull request:

    https://github.com/apache/lucene-solr/pull/291

    jira/solr-11701

    SOLR-11701 upgrade to Tika 1.17

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tballison/lucene-solr jira/solr-11701

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/291.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #291

commit c5d4e37de782b2491b3e71cfbb004e5022b55f6b
Author: Karthik Ramachandran
Date: 2017-11-14T00:21:44Z

    SOLR-11622: Fix mime4j library dependency for Tika

commit 40b246b12e8fc6455e023d9d60b8edcfab9b184e
Author: Karthik Ramachandran
Date: 2017-12-01T22:12:15Z

    Merge remote-tracking branch 'upstream/master' into jira/solr-11622

commit 21f2ab483f356fad9b89233e544457a07540afd1
Author: Karthik Ramachandran
Date: 2017-12-03T03:50:01Z

    SOLR-11622: Fix bundled mime4j library not sufficient for Tika requirement

commit a40ca80ed7036732a332f5508589ae32eb4b
Author: Karthik Ramachandran
Date: 2017-12-04T15:33:18Z

    Merge remote-tracking branch 'upstream/master' into jira/solr-11622

commit a0d6fba8c2e85565a02a8565882a627fa7ceccc4
Author: Karthik Ramachandran
Date: 2017-12-14T16:24:45Z

    Merge remote-tracking branch 'upstream/master' into jira/SOLR-11622

commit c2c885f8a2e2c49fab6f737b13f0ff9a1346714c
Author: Karthik Ramachandran
Date: 2017-12-14T20:45:09Z

    SOLR-11622: Fix mime4j library dependency for Tika

commit e834693a31d0b410a7e0205e1eecda55206a44fa
Author: tballison
Date: 2017-12-15T02:20:51Z

    SOLR-11701 - upgrade to Tika 1.17