[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2018-01-03 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310271#comment-16310271
 ] 

Tim Allison commented on SOLR-11701:


Finally back to keyboard. Doh, and thank you!!!

> Upgrade to Tika 1.17 when available
> ---
>
> Key: SOLR-11701
> URL: https://issues.apache.org/jira/browse/SOLR-11701
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Erick Erickson
> Fix For: 7.3
>
> Attachments: SOLR-11701.patch, SOLR-11701.patch
>
>
> Kicking off release process for Tika 1.17 in the next few days.  Please let 
> us know if you have any requests.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-24 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16303004#comment-16303004
 ] 

Erick Erickson commented on SOLR-11701:
---

Sorry, forgot to include the comment to close the PR.





[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16303002#comment-16303002
 ] 

ASF subversion and git services commented on SOLR-11701:


Commit c548002569f8bd94c6a8386edc85fdcdc55accaf in lucene-solr's branch 
refs/heads/branch_7x from [~erickerickson]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c548002 ]

SOLR-11701: Upgrade to Tika 1.17 when available

(cherry picked from commit 7e321d7)





[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302993#comment-16302993
 ] 

ASF subversion and git services commented on SOLR-11701:


Commit 7e321d70df302738358266dfcee892dac79d1c0d in lucene-solr's branch 
refs/heads/master from [~erickerickson]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7e321d7 ]

SOLR-11701: Upgrade to Tika 1.17 when available





[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-23 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302685#comment-16302685
 ] 

Erick Erickson commented on SOLR-11701:
---

OK, I'm having a horrible time merging this. 

Did you purposely remove lucene/analysis/opennlp/ivy.xml? 

It's gone from your branch but still present in master, _and_ present in the 
master of your own repo.

Or should I assume that anything having to do with opennlp is something that 
should be just as it is on master?




[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-18 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295483#comment-16295483
 ] 

Tim Allison commented on SOLR-11701:


Sounds good.  _Thank you_!

On the git conflict, y, that was caused by the recent addition of opennlp.  
I've updated the PR, but there are, of course, already new conflicts! :)  Let 
me know if I can do anything to help with that. 

On the 401, I'm not sure why that was happening...I'll take a look.

On the unused imports, ugh.  Thank you.




[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-18 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295437#comment-16295437
 ] 

Erick Erickson commented on SOLR-11701:
---

Tim:

I don't see a problem here then; I was mostly worried that this was an 
inexplicable problem that would pop up in other places. We'll upgrade slf4j 
sometime anyway, so it seems to me that just adding the annotation is an OK fix.

At most, some of the @Slow or @Nightly or @Weekly tests will error out and need 
a similar annotation, but we're in the beginning of a new release cycle so 
there's time for those to flush out.

Any kind of blanket "don't report the full stack trace" seems like a bad idea 
since that's often so necessary to analysis.

If you find out anything that's germane, let me know; otherwise I'll just 
annotate and commit (probably this evening).

Thanks for tracking this down!




[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-18 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16295411#comment-16295411
 ] 

Tim Allison commented on SOLR-11701:


Back to keyboard.  You're right in all of the above. When we bump slf4j from 
1.7.7 to 1.7.24, its behavior changes to print out the full stacktrace instead 
of just the message.

In org.slf4j.helpers.MessageFormatter in 1.7.7, the exception is counted as one 
of the members of {{argArray}}, and because of the following snippet, the 
{{throwableCandidate}} is nulled out in the returned {{FormattingTuple}}
{noformat}
if (L < argArray.length - 1) {
    return new FormattingTuple(sbuf.toString(), argArray, throwableCandidate);
} else {
    return new FormattingTuple(sbuf.toString(), argArray, (Throwable)null);
}
{noformat}

In 1.7.24, there's an added bit of logic before we get to that location that 
removes the exception from {{argArray}} so that it can't get swept into the 
message.
{noformat}
Object[] args = argArray;
if (throwableCandidate != null) {
    args = trimmedCopy(argArray);
}
{noformat}
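The difference described above can be sketched with a small stand-alone model. This is a simplified illustration only, not slf4j's actual code: the class name `ThrowableFormattingSketch`, the `Result` holder, and the `trimFirst` flag are all hypothetical names invented for the example. With `trimFirst=false` a trailing exception may be consumed by a `{}` placeholder (the 1.7.7-style behavior), and with `trimFirst=true` it is removed from the arguments first and handed on separately (the 1.7.24-style behavior).

```java
import java.util.Arrays;

/** Simplified model of the two formatting behaviors; NOT slf4j's actual code. */
public class ThrowableFormattingSketch {

    /** Formatted message plus the throwable (if any) passed on to the logging backend. */
    public static final class Result {
        public final String message;
        public final Throwable throwable;

        Result(String message, Throwable throwable) {
            this.message = message;
            this.throwable = throwable;
        }
    }

    /**
     * Substitutes "{}" placeholders left to right.
     * trimFirst=false models the older behavior, trimFirst=true the newer one.
     */
    public static Result format(String pattern, Object[] argArray, boolean trimFirst) {
        // A trailing Throwable argument is the candidate for stack-trace logging.
        Throwable candidate = null;
        if (argArray.length > 0 && argArray[argArray.length - 1] instanceof Throwable) {
            candidate = (Throwable) argArray[argArray.length - 1];
        }
        Object[] args = argArray;
        if (trimFirst && candidate != null) {
            // Newer behavior: the trailing exception can never fill a placeholder.
            args = Arrays.copyOf(argArray, argArray.length - 1);
        }
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        int used = 0;
        while (used < args.length) {
            int idx = pattern.indexOf("{}", pos);
            if (idx < 0) {
                break; // more arguments than placeholders
            }
            sb.append(pattern, pos, idx).append(args[used++]);
            pos = idx + 2;
        }
        sb.append(pattern.substring(pos));
        if (trimFirst) {
            return new Result(sb.toString(), candidate);
        }
        // Older behavior: the throwable survives only if no placeholder consumed it.
        boolean consumed = candidate != null && used == argArray.length;
        return new Result(sb.toString(), consumed ? null : candidate);
    }

    public static void main(String[] args) {
        Exception e = new java.net.UnknownHostException("[0: invalid IPv6 address");
        String pattern = "Failed to get IP address from host [{}], with exception [{}]";
        Result older = format(pattern, new Object[] {"[0", e}, false);
        Result newer = format(pattern, new Object[] {"[0", e}, true);
        System.out.println(older.message + " -> throwable=" + older.throwable);
        System.out.println(newer.message + " -> throwable=" + newer.throwable);
    }
}
```

In this model the older mode folds the exception's `toString()` into the message and reports no throwable, while the newer mode leaves the second `{}` unfilled and keeps the throwable, which is the shape of the two log outputs quoted elsewhere in this thread.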

I have in the back of my mind that there was a reason we upgraded slf4j in 
Tika.  I'll look through our git history to see when/why and if we need to do 
it for the Solr integration.





[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-17 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16294253#comment-16294253
 ] 

Erick Erickson commented on SOLR-11701:
---

My _guess_ is it's log4j. The line producing the different outputs is in 
ImplicitSnitch.getHostIp() (see below).

What I'm not sure of at all is whether there's really a problem. The total size 
of the output files in the two cases  produced from *ant test* differed by 7K 
(out of 930K or so), so apparently it's just this one log message. Why it would 
be different is a total mystery. And it can be fixed by the 
*@TestRuleLimitSysouts.Limit(bytes=2)*  annotation. It'd be nice to know 
why they differ, but perhaps not essential.

Anyway, we can discuss as the week progresses.

  public String getHostIp(String host) {
    try {
      InetAddress address = InetAddress.getByName(host);
      return address.getHostAddress();
    } catch (Exception e) {
      // This is the line whose output differs between the two runs.
      log.warn("Failed to get IP address from host [{}], with exception [{}] ", host, e);
      return null;
    }
  }
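As a rough, stand-alone illustration of what triggers that warning, the sketch below reproduces the shape of the method with `System.err` standing in for the logger. The class name `HostIpSketch` and the use of string concatenation are assumptions made for the example and are not Solr's code. The host value `[0` is the one that appears in the test output in this thread. `InetAddress.getByName` rejects it as a malformed IPv6 literal, so the catch block runs and the method returns null.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Stand-alone sketch; System.err replaces the logger used in the real method. */
public class HostIpSketch {

    public static String getHostIp(String host) {
        try {
            InetAddress address = InetAddress.getByName(host);
            return address.getHostAddress();
        } catch (UnknownHostException e) {
            // Concatenating e uses e.toString(): a single line, no stack trace.
            System.err.println("Failed to get IP address from host [" + host
                + "], with exception [" + e + "]");
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(getHostIp("[0"));        // malformed IPv6 literal
        System.out.println(getHostIp("127.0.0.1")); // well-formed IPv4 literal
    }
}
```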





[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-17 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16294225#comment-16294225
 ] 

Tim Allison commented on SOLR-11701:


Ugh. I’m still without keyboard. Can you tell which dependency is now adding 
more stuff? Will take a look tomorrow. Thank you for making it easy for me to 
replicate.




[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-16 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293984#comment-16293984
 ] 

Erick Erickson commented on SOLR-11701:
---

OK, I see what's happening here, but not quite sure why or what the proper fix 
would be.

There is a lot more data being dumped out as a result of this patch; see 
#WITHOUT THIS PULL and #WITH THIS PULL below. That results in the framework 
failing the test because it's too verbose (see #ERROR).

This test illustrates the failure, but be aware that if you run just that 
test on its own it succeeds (in IntelliJ); you have to run the whole suite.
*ImplicitSnitchTest.testGetTags_with_correct_ipv6_format_ip_returns_nothing()*

Doing what the error says and adding @TestRuleLimitSysouts.Limit(bytes=2) 
causes the test to succeed. I took a quick look at the size of the output file 
from a test run and there's about a 7K difference with or without this patch 
(and bumping the limit). Is that sufficient?
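The kind of check that fails here can be modelled roughly as follows. This is a simplified sketch using only the JDK, not the Lucene test framework's implementation: `SysoutLimitSketch`, `countStdoutBytes`, and `checkLimit` are hypothetical names, and the real rule also counts stderr. The numbers in `main` are the 9268 bytes and 8192-byte limit from the error message below.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

/** Simplified model of a per-suite stdout byte limit; NOT the framework's code. */
public class SysoutLimitSketch {

    /** Runs the task with System.out redirected and returns how many bytes it printed. */
    public static long countStdoutBytes(Runnable task) {
        PrintStream original = System.out;
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        PrintStream replacement = new PrintStream(captured, true);
        System.setOut(replacement);
        try {
            task.run();
        } finally {
            replacement.flush();
            System.setOut(original); // always restore the real stream
        }
        return captured.size();
    }

    /** Fails when the printed byte count exceeds the configured limit. */
    public static void checkLimit(long printed, long limit) {
        if (printed > limit) {
            throw new AssertionError("The test or suite printed " + printed
                + " bytes, even though the limit was set to " + limit + " bytes.");
        }
    }

    public static void main(String[] args) {
        long printed = countStdoutBytes(() -> System.out.print("hello"));
        checkLimit(printed, 8192); // passes: only a few bytes were printed
        try {
            checkLimit(9268, 8192); // the sizes reported in this thread
        } catch (AssertionError expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```

Because the count is cumulative in this model, one extra stack trace is enough to push an otherwise quiet run over the threshold, which fits the observation that the test fails only when the whole suite runs.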

*#ERROR*

java.lang.AssertionError: The test or suite printed 9268 bytes to stdout and 
stderr, even though the limit was set to 8192 bytes. Increase the limit with 
@Limit, ignore it completely with @SuppressSysoutChecks or run with 
-Dtests.verbose=true
at __randomizedtesting.SeedInfo.seed([75A0F6DC6F8B91D6]:0)
at org.apache.lucene.util.TestRuleLimitSysouts.afterIfSuccessful(TestRuleLimitSysouts.java:211)
at com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterIfSuccessful(TestRuleAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:37)


*#WITHOUT THIS PULL*
Connected to the target VM, address: '127.0.0.1:51314', transport: 'socket'
0WARN  (TEST-ImplicitSnitchTest.testGetTags_with_correct_ipv6_format_ip_returns_nothing-seed#[F055DBD86566DD29]) [] o.a.s.c.c.r.ImplicitSnitch Failed to get IP address from host [[0], with exception [java.net.UnknownHostException: [0: invalid IPv6 address] 
4WARN  (TEST-ImplicitSnitchTest.testGetTags_with_correct_ipv6_format_ip_returns_nothing-seed#[F055DBD86566DD29]) [] o.a.s.c.c.r.ImplicitSnitch Failed to match host IP address from node URL [[0:0:0:0:0:0:0:1]:8983_solr] using regex [(?:https?://)?([^:]+):(\d+)]
Disconnected from the target VM, address: '127.0.0.1:51314', transport: 'socket'


*#WITH THIS PULL (abbreviated)*
objc[44364]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home/bin/java (0x1015234c0) and /Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home/jre/lib/libinstrument.dylib (0x1015af4e0). One of the two will be used. Which one is undefined.
1WARN  (TEST-ImplicitSnitchTest.testGetTags_with_correct_ipv6_format_ip_returns_nothing-seed#[75A0F6DC6F8B91D6]) [] o.a.s.c.c.r.ImplicitSnitch Failed to get IP address from host [[0], with exception [{}] 
java.net.UnknownHostException: [0: invalid IPv6 address
at java.net.InetAddress.getAllByName(InetAddress.java:1146)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at java.net.InetAddress.getByName(InetAddress.java:1076)
at org.apache.solr.common.cloud.rule.ImplicitSnitch.getHostIp(ImplicitSnitch.java:182)
at org.apache.solr.common.cloud.rule.ImplicitSnitch.getIpFragments(ImplicitSnitch.java:169)
at org.apache.solr.common.cloud.rule.ImplicitSnitch.addIpTags(ImplicitSnitch.java:145)
at org.apache.solr.common.cloud.rule.ImplicitSnitch.getTags(ImplicitSnitch.java:73)
at org.apache.solr.cloud.rule.ImplicitSnitchTest.testGetTags_with_correct_ipv6_format_ip_returns_nothing(ImplicitSnitchTest.java:117)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 

[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-16 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293930#comment-16293930
 ] 

Tim Allison commented on SOLR-11701:


Away from tools now. Will look on Monday. Thank you!




[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-15 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293525#comment-16293525
 ] 

Tim Allison commented on SOLR-11701:


Y. Thank you!




[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-15 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293441#comment-16293441
 ] 

Erick Erickson commented on SOLR-11701:
---

Double checking since I'm a bit "git challenged". That PR has 8 commits and 67 
changed files, although most of the file count comes from checksums. Does 
this sound right?

The commit history mentions SOLR-11701 and SOLR-11622, but not SOLR-11693; I'm 
guessing that 11693 is fixed too.




[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-15 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293243#comment-16293243
 ] 

Tim Allison commented on SOLR-11701:


K.  I turned off the warnings with 
[d25349d|https://github.com/apache/lucene-solr/pull/291/commits/d25349dba44f8774683863092104fad8ea05c75d],
 and I reran the integration tests. That _should_ be ready to go.




[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-15 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292713#comment-16292713
 ] 

Erick Erickson commented on SOLR-11701:
---

Tim:

Can't get to JIRA right now (seems to have been happening a lot
lately). But go ahead and update the PR, I haven't started working on
this yet.

Erick






[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-15 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292681#comment-16292681
 ] 

Tim Allison commented on SOLR-11701:


One more change... I'd like to turn off the missing jar warnings as the default 
in Solr.  Update to PR coming soon, unless that should be a different issue.




[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-14 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291955#comment-16291955
 ] 

Tim Allison commented on SOLR-11701:


Yes, and please.  Thank you!




[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-14 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291949#comment-16291949
 ] 

Erick Erickson commented on SOLR-11701:
---

OK, probably tomorrow or this weekend. This is PR #291, right? And should I 
also close the other two linked JIRAs?




[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-14 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291935#comment-16291935
 ] 

Tim Allison commented on SOLR-11701:


I merged [~kramachand...@commvault.com]'s mods and made a few updates for Tika 
1.17.

I ran an integration test against 643 files in Apache Tika's unit test docs, 
and I got the same # of documents indexed in Solr as tika-app.jar parsed 
without exceptions.

{noformat}
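import java.io.File;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocumentList;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.metadata.TikaCoreProperties;
import org.apache.tika.metadata.serialization.JsonMetadataList;

public static void main(String[] args) throws Exception {
    // Directory holding the JSON extracts.
    Path extracts = Paths.get("C:\\data\\tika_unit_tests_extracts");
    SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/fileupload_passt/").build();
    for (File f : extracts.toFile().listFiles()) {
        try (Reader r = Files.newBufferedReader(f.toPath(), StandardCharsets.UTF_8)) {
            List<Metadata> metadataList = JsonMetadataList.fromJson(r);
            // No recorded runtime exception means the file was parsed cleanly.
            String ex = metadataList.get(0).get(TikaCoreProperties.TIKA_META_EXCEPTION_PREFIX + "runtime");
            if (ex == null) {
                // Each cleanly parsed file should be found exactly once in Solr.
                SolrQuery q = new SolrQuery("id: " + f.getName().replace(".json", ""));
                QueryResponse response = client.query(q);
                SolrDocumentList results = response.getResults();
                if (results.getNumFound() != 1) {
                    System.err.println(f.getName() + " " + results.getNumFound());
                }
            }
        }
    }
}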
{noformat}

I did the usual dance:
{noformat}
ant clean-jars jar-checksums
ant precommit
{noformat}

[~erickerickson], this _should_ be good to go.  





[jira] [Commented] (SOLR-11701) Upgrade to Tika 1.17 when available

2017-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16291932#comment-16291932
 ] 

ASF GitHub Bot commented on SOLR-11701:
---

GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/291

jira/solr-11701

SOLR-11701 upgrade to Tika 1.17

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr jira/solr-11701

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/291.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #291


commit c5d4e37de782b2491b3e71cfbb004e5022b55f6b
Author: Karthik Ramachandran 
Date:   2017-11-14T00:21:44Z

SOLR-11622: Fix mime4j library dependency for Tika

commit 40b246b12e8fc6455e023d9d60b8edcfab9b184e
Author: Karthik Ramachandran 
Date:   2017-12-01T22:12:15Z

Merge remote-tracking branch 'upstream/master' into jira/solr-11622

commit 21f2ab483f356fad9b89233e544457a07540afd1
Author: Karthik Ramachandran 
Date:   2017-12-03T03:50:01Z

SOLR-11622: Fix bundled mime4j library not sufficient for Tika requirement

commit a40ca80ed7036732a332f5508589ae32eb4b
Author: Karthik Ramachandran 
Date:   2017-12-04T15:33:18Z

Merge remote-tracking branch 'upstream/master' into jira/solr-11622

commit a0d6fba8c2e85565a02a8565882a627fa7ceccc4
Author: Karthik Ramachandran 
Date:   2017-12-14T16:24:45Z

Merge remote-tracking branch 'upstream/master' into jira/SOLR-11622

commit c2c885f8a2e2c49fab6f737b13f0ff9a1346714c
Author: Karthik Ramachandran 
Date:   2017-12-14T20:45:09Z

SOLR-11622: Fix mime4j library dependency for Tika

commit e834693a31d0b410a7e0205e1eecda55206a44fa
Author: tballison 
Date:   2017-12-15T02:20:51Z

SOLR-11701 - upgrade to Tika 1.17



