[GitHub] lucene-solr pull request #468: jira/SOLR-12423

2018-10-11 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/468

jira/SOLR-12423

Upgrade to Tika 1.19.1, first draft

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr jira/SOLR-12423

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/468.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #468


commit e6c9a9f3f209b3b45bfc57963ce4df3a7b7946fb
Author: tallison 
Date:   2018-10-11T15:01:05Z

SOLR-12423 upgrade to Tika 1.19.1, first commit

commit 4fcc28ee35f28d6e1806f3c23824d9f86cc9ec2b
Author: TALLISON 
Date:   2018-10-11T17:28:08Z

Merge branch 'master' of https://github.com/apache/lucene-solr into 
jira/SOLR-12423




---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[GitHub] lucene-solr issue #328: SOLR-12034

2018-10-02 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/328
  
Thank you, @romseygeek , for thinking of this PR.  I'm closing it because I 
don't want to wreck the API of CustomAnalyzer.Builder().


---




[GitHub] lucene-solr pull request #328: SOLR-12034

2018-10-02 Thread tballison
Github user tballison closed the pull request at:

https://github.com/apache/lucene-solr/pull/328


---




[GitHub] lucene-solr issue #328: SOLR-12034

2018-10-01 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/328
  
Wow...long time since I've visited this code.  Now I think I recall...the 
ugliness that I don't like imposing on the CustomAnalyzer's API is that it 
holds its own ResourceLoader and applies it when the user calls, e.g. 
`withTokenizer(class/classname, params)`, 
`add(Token|Char)Filter(class/classname, params)`.  

In Solr, the charfilter, tokenizer, and tokenfilter factories are fully built, 
one by one, with resources loaded by `FieldTypePluginLoader`'s `loader` (a 
`SolrResourceLoader`) in `readAnalyzer(Node node)`...I think (???), 
and _then_ they are added to the `CustomAnalyzer`.

I also see in `ManagedIndexSchema`, that there's `postReadInform()` which 
calls `informResourceLoaderAwareObjectsInChain`, which then loads the resources.

So, when I break the API in CustomAnalyzer and make public a method such as 
`addTokenFilter(TokenFilterFactory factory)`, there's an unused private 
variable `ResourceLoader loader`, which feels ugly...a user could both specify 
a resource loader in `Builder`'s initializer and then pass in fully loaded 
components that would bypass that resource loader.  This smells bad to me...

Any recommendations?
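
To make the concern above concrete, here is a minimal, self-contained sketch. It does not use the real Lucene classes: `SketchResourceLoader`, `SketchFilterFactory`, and `SketchAnalyzerBuilder` are hypothetical stand-ins that only model who supplies the resource loader. It assumes a builder that holds its own loader and an extra overload that accepts an already-built factory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a resource loader; not the Lucene interface.
interface SketchResourceLoader {
    String name();
}

// Hypothetical stand-in for a token filter factory.
class SketchFilterFactory {
    final String filterName;
    String loadedBy = null; // name of the loader that initialized this factory, if any

    SketchFilterFactory(String filterName) {
        this.filterName = filterName;
    }

    // Models a "load your resources with this loader" callback.
    void inform(SketchResourceLoader loader) {
        this.loadedBy = loader.name();
    }
}

// Hypothetical stand-in for an analyzer builder that owns a loader.
class SketchAnalyzerBuilder {
    private final SketchResourceLoader loader;
    private final List<SketchFilterFactory> filters = new ArrayList<>();

    SketchAnalyzerBuilder(SketchResourceLoader loader) {
        this.loader = loader;
    }

    // Name-based style: the builder creates the factory and applies its own loader.
    SketchAnalyzerBuilder addTokenFilter(String filterName) {
        SketchFilterFactory factory = new SketchFilterFactory(filterName);
        factory.inform(loader);
        filters.add(factory);
        return this;
    }

    // Factory-based style: the caller passes a fully built factory,
    // so the builder's own loader is never consulted for it.
    SketchAnalyzerBuilder addTokenFilter(SketchFilterFactory prebuilt) {
        filters.add(prebuilt);
        return this;
    }

    // Reports "filterName:loaderName" for every filter in the chain.
    List<String> loadersUsed() {
        List<String> names = new ArrayList<>();
        for (SketchFilterFactory f : filters) {
            names.add(f.filterName + ":" + f.loadedBy);
        }
        return names;
    }
}
```

In this sketch, mixing the two `addTokenFilter` styles on one builder yields a chain whose components were initialized by different loaders, which is the inconsistency described above.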


---




[GitHub] lucene-solr issue #328: SOLR-12034

2018-10-01 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/328
  
@romseygeek , y, happy to fix/update this.  I'll take a look later today.

Part of the reason I gave up on this is that I didn't like the changes I 
had to make at the Lucene level.  It felt like I was screwing up the elegant 
Lucene-level API.  Any recommendations?

Also, @dsmiley recommended I move the Lucene-level modifications to another 
issue.  Are you ok if these go in as one, or should I open up a separate PR for 
the Lucene-level mods?


---




[GitHub] lucene-solr pull request #419: SOLR-12551 - upgrade to Tika 1.18, first draf...

2018-07-13 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/419

SOLR-12551 - upgrade to Tika 1.18, first draft



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr jira/SOLR-12551

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/419.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #419


commit 9dfa7a4419e00892f51ab925f3e33c135463eec9
Author: TALLISON 
Date:   2018-07-13T21:12:51Z

SOLR-12551 - upgrade to Tika 1.18, first draft




---




[GitHub] lucene-solr pull request #418: SOLR-12423 - upgrade to Tika 1.18, first draf...

2018-07-13 Thread tballison
Github user tballison closed the pull request at:

https://github.com/apache/lucene-solr/pull/418


---




[GitHub] lucene-solr pull request #418: SOLR-12423 - upgrade to Tika 1.18, first draf...

2018-07-13 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/418

SOLR-12423 - upgrade to Tika 1.18, first draft



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr jira/SOLR-12423

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/418.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #418


commit 2577afbf6c2f64f5ac1052a80954973a12f22c92
Author: TALLISON 
Date:   2018-07-13T21:12:51Z

SOLR-12423 - upgrade to Tika 1.18, first draft




---




[GitHub] lucene-solr pull request #332: LUCENE-8186

2018-03-05 Thread tballison
Github user tballison closed the pull request at:

https://github.com/apache/lucene-solr/pull/332


---




[GitHub] lucene-solr issue #332: LUCENE-8186

2018-03-05 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/332
  
Patch already attached...please ignore...


---




[GitHub] lucene-solr pull request #332: LUCENE-8186

2018-03-05 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/332

LUCENE-8186

check for multitermaware tokenizer in CustomAnalyzer's normalize().

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr jira/LUCENE-8186

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/332.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #332


commit d4e71d81dcc852ce156a733d7d9f5f9359bf6c95
Author: tballison <tallison@...>
Date:   2018-03-05T19:31:02Z

LUCENE-8186 -- check for multitermaware tokenizer in CustomAnalyzer in 
normalize().




---




[GitHub] lucene-solr pull request #329: SOLR-12035

2018-02-26 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/329

SOLR-12035

don't forget to copy charfilters into nostopanalyzer

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr jira/SOLR-12035

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/329.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #329


commit c024cd87facde89344941a114b6231321b2b68ea
Author: tballison <tallison@...>
Date:   2018-02-26T18:14:33Z

SOLR-12035 -- don't forget to copy charfilters into nostopanalyzer




---




[GitHub] lucene-solr pull request #328: SOLR-12034

2018-02-26 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/328

SOLR-12034

First draft of SOLR-12034 -- not ready for committing. Some non-flaky tests 
are now failing.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr jira/SOLR-12034

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/328.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #328


commit a118f5f9a0e3206d87e62394924d18bbf3b94fd3
Author: tballison <tallison@...>
Date:   2018-02-26T16:27:47Z

SOLR-12034 -- first pass




---




[GitHub] lucene-solr pull request #322: SOLR-11976 - fix bug in TokenizerChain

2018-02-12 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/322

SOLR-11976 - fix bug in TokenizerChain

Currently overwrites tokenfilters rather than chaining.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr jira/SOLR-11976

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/322.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #322


commit c97b4e45ace385cd91583c1643d037eb484f
Author: tballison <tallison@...>
Date:   2018-02-12T21:11:04Z

SOLR-11976 - fix bug in TokenizerChain's normalize() that overwrites
filters rather than chaining.
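
The difference between overwriting and chaining can be illustrated with a small self-contained sketch. This is not the TokenizerChain code: the class `NormalizeChainSketch` and its string-based "filters" are hypothetical stand-ins, and the buggy variant is only an assumption about the general shape of such a bug.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

class NormalizeChainSketch {
    // Buggy pattern: every filter is applied to the ORIGINAL input,
    // so the result of each earlier filter is overwritten by the next one.
    static String normalizeOverwriting(String input, List<UnaryOperator<String>> filters) {
        String result = input;
        for (UnaryOperator<String> filter : filters) {
            result = filter.apply(input);
        }
        return result;
    }

    // Fixed pattern: every filter is applied to the output of the previous one.
    static String normalizeChaining(String input, List<UnaryOperator<String>> filters) {
        String result = input;
        for (UnaryOperator<String> filter : filters) {
            result = filter.apply(result);
        }
        return result;
    }

    // Two stand-in filters: lowercase, then replace hyphens with spaces.
    static List<UnaryOperator<String>> exampleFilters() {
        List<UnaryOperator<String>> filters = new ArrayList<>();
        filters.add(s -> s.toLowerCase(Locale.ROOT));
        filters.add(s -> s.replace('-', ' '));
        return filters;
    }
}
```

With the example filters and the input `Wi-Fi`, the overwriting variant only reflects the last filter, while the chaining variant reflects both.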




---




[GitHub] lucene-solr pull request #291: jira/solr-11701

2017-12-14 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/291

jira/solr-11701

SOLR-11701 upgrade to Tika 1.17

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr jira/solr-11701

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/291.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #291


commit c5d4e37de782b2491b3e71cfbb004e5022b55f6b
Author: Karthik Ramachandran <kramachand...@commvault.com>
Date:   2017-11-14T00:21:44Z

SOLR-11622: Fix mime4j library dependency for Tika

commit 40b246b12e8fc6455e023d9d60b8edcfab9b184e
Author: Karthik Ramachandran <kramachand...@commvault.com>
Date:   2017-12-01T22:12:15Z

Merge remote-tracking branch 'upstream/master' into jira/solr-11622

commit 21f2ab483f356fad9b89233e544457a07540afd1
Author: Karthik Ramachandran <kramachand...@commvault.com>
Date:   2017-12-03T03:50:01Z

SOLR-11622: Fix bundled mime4j library not sufficient for Tika requirement

commit a40ca80ed7036732a332f5508589ae32eb4b
Author: Karthik Ramachandran <kramachand...@commvault.com>
Date:   2017-12-04T15:33:18Z

Merge remote-tracking branch 'upstream/master' into jira/solr-11622

commit a0d6fba8c2e85565a02a8565882a627fa7ceccc4
Author: Karthik Ramachandran <kramachand...@commvault.com>
Date:   2017-12-14T16:24:45Z

Merge remote-tracking branch 'upstream/master' into jira/SOLR-11622

commit c2c885f8a2e2c49fab6f737b13f0ff9a1346714c
Author: Karthik Ramachandran <kramachand...@commvault.com>
Date:   2017-12-14T20:45:09Z

SOLR-11622: Fix mime4j library dependency for Tika

commit e834693a31d0b410a7e0205e1eecda55206a44fa
Author: tballison <talli...@mitre.org>
Date:   2017-12-15T02:20:51Z

SOLR-11701 - upgrade to Tika 1.17




---




[GitHub] lucene-solr issue #282: SOLR-11622: Fix mime4j library dependency for Tika

2017-12-05 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/282
  
let me know if I should offer a comprehensive PR including your work on my 
own, or if my "notes" on your other PR are sufficient.  Thank you!


---




[GitHub] lucene-solr issue #273: SOLR-11622: Fix mime4j library dependency for Tika

2017-12-04 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/273
  

[SOLR-11622-tallison.diff.txt](https://github.com/apache/lucene-solr/files/1528516/SOLR-11622-tallison.diff.txt)

This is the full `git diff 83753d0..d2f40af > SOLR-11622-tallison.diff`

Had to add jackcess-encrypt for msaccess, bcpkix for psd files, and 
rome-utils for atom/rss.


---




[GitHub] lucene-solr issue #273: SOLR-11622: Fix mime4j library dependency for Tika

2017-12-04 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/273
  
This was the patch I had before running ant precommit, etc., which is still 
running.  You'll need to run `ant clean-jars jar-checksums` and do the 
git add/rm before this will work.


[SOLR-11622_tallison.patch.txt](https://github.com/apache/lucene-solr/files/1528484/SOLR-11622_tallison.patch.txt)



---




[GitHub] lucene-solr pull request #259: SOLR-10335

2017-10-05 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/259

SOLR-10335

Upgrade to Tika 1.16

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr SOLR-10335

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/259.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #259


commit a5c4777314ca26ad7b6e33060c7a5132a80a1827
Author: tballison <talli...@mitre.org>
Date:   2017-10-05T16:22:48Z

SOLR-10335 -- Upgrade to Tika 1.16

commit c501e8139c569e703c0c2de80173a89ab7fc1c8a
Author: tballison <talli...@mitre.org>
Date:   2017-10-05T16:26:56Z

SOLR-10335 -- Upgrade to Tika 1.16 -- add collections4 sha1 and 
license/notice info

commit 4c7ff73c98169c837a8617fcfb8ea1789df29473
Author: tballison <talli...@mitre.org>
Date:   2017-10-05T16:51:03Z

Merge remote-tracking branch 'upstream/master' into SOLR-10335




---




[GitHub] lucene-solr pull request #172: SOLR-9552

2017-03-21 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/172

SOLR-9552

Upgrade to Tika 1.14

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr SOLR-9552

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/172.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #172


commit 1f42a1553b51bfff6a07683853a2ca7be5164c18
Author: tballison <talli...@mitre.org>
Date:   2017-03-21T16:22:01Z

SOLR-9552 - upgrade Tika to 1.14, first pass




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---




[GitHub] lucene-solr pull request #82: First draft of LUCENE-5317

2016-09-23 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/82

First draft of LUCENE-5317

First draft of LUCENE-5317

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr LUCENE-5317

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/82.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #82


commit ea9fd7fdd4d94fd498f0188b9aab0c8cf48c7295
Author: tballison <talli...@mitre.org>
Date:   2016-09-23T19:19:22Z

Rough draft of LUCENE-5317.

commit 632c00980d1f7257b15b5dfde445168940dd423c
Author: tballison <talli...@mitre.org>
Date:   2016-09-23T19:20:36Z

Merge remote-tracking branch 'upstream/master' into LUCENE-5317




---



[GitHub] lucene-solr pull request #75: LUCENE-7434, first draft

2016-09-01 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/75

LUCENE-7434, first draft

LUCENE-7434, first draft

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/75.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #75


commit c37f1e0d66f1f28a5c83033d9496cc33c55f265e
Author: tballison <talli...@mitre.org>
Date:   2016-09-01T19:33:55Z

LUCENE-7434, first draft




---



[GitHub] lucene-solr issue #44: SOLR-8981

2016-06-17 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/44
  
> I also only have Windows :)

How can you live with the failed builds?!?  I wanted to help with 
[morphlines](https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201606.mbox/%3CCY1PR09MB1115F9A08E97879D959D3CDCC7570%40CY1PR09MB1115.namprd09.prod.outlook.com%3E),
 but I can't easily do much...


---



[GitHub] lucene-solr issue #44: SOLR-8981

2016-06-17 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/44
  
If we leave out updating bouncycastle, I'm fairly confident that users will 
run into problems at run time if they try to decrypt MSAccess and probably PDF 
and doc.

We had a binary incompatibility between 1.52 and 1.54 with Jackcess: 
https://sourceforge.net/p/jackcessencrypt/feature-requests/2/

IIRC, the exception was thrown on any encrypted MSAccess file, not just 
those for which the user had a password.

I see two options: 

1) upgrade bouncycastle and hope we don't break other parts of Solr
2) announce decryption of Jackcess/POI/PDFBox as unsupported




---



[GitHub] lucene-solr issue #44: SOLR-8981

2016-06-17 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/44
  
There will likely be some conflicts with bouncy castle.  

Tika 1.13:
bcmail-jdk15on  1.54
bcprov-jdk15on  1.54

vs. Solr:
org.bouncycastle.version = 1.45
/org.bouncycastle/bcmail-jdk15 = ${org.bouncycastle.version}
/org.bouncycastle/bcprov-jdk15 = ${org.bouncycastle.version}
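
As a rough illustration of the gap between the two version numbers listed above, the sketch below compares dotted numeric versions. The class `VersionCompareSketch` is hypothetical and is not part of Solr's or Tika's build tooling; it only assumes purely numeric, dot-separated version strings.

```java
class VersionCompareSketch {
    // Compares dotted numeric versions segment by segment.
    // Missing trailing segments are treated as 0, so "1.45" equals "1.45.0".
    static int compare(String a, String b) {
        String[] partsA = a.split("\\.");
        String[] partsB = b.split("\\.");
        int n = Math.max(partsA.length, partsB.length);
        for (int i = 0; i < n; i++) {
            int x = i < partsA.length ? Integer.parseInt(partsA[i]) : 0;
            int y = i < partsB.length ? Integer.parseInt(partsB[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }
}
```

Here `compare("1.54", "1.45")` is positive, meaning the Tika-side version is newer than the Solr-side one. Note that the artifact names also differ (`jdk15on` vs. `jdk15`), which a version comparison alone does not capture.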



---



[GitHub] lucene-solr issue #44: SOLR-8981

2016-06-17 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/44
  
WebP is an image format.
Jackcess encrypt is the library that allows users to decrypt MSAccess files.

Please give it a go with Java 9.  I can't easily test the morphlines stuff 
on my main dev box (Windows ... :( ).


---



[GitHub] lucene-solr issue #44: SOLR-8981

2016-06-17 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/44
  
Our bug introduced in TIKA-995.


---



[GitHub] lucene-solr issue #44: SOLR-8981

2016-06-17 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/44
  
Not willing to point fingers... :)

I'd like to track down the change in our history between 1.7 and 1.13 so 
that I actually understand what happened.


---



[GitHub] lucene-solr issue #44: SOLR-8981

2016-06-17 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/44
  
The XHTMLContentHandler adds  and .  In out-of-the-box Tika 
with the DefaultHtmlMapper, "body" tags are not in the list of "SAFE_ELEMENTS", 
which means that the html's "body" tag is never passed through...so we don't 
see the doubling in Tika.

The solution is to suppress the body tag in Solr's 
MostlyPassthroughHtmlMapper.
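
A minimal sketch of that idea follows. It is not Solr's actual `MostlyPassthroughHtmlMapper`: the class `PassthroughMapperSketch` is a hypothetical stand-in, and it assumes the mapper convention that returning `null` for an element name means "do not pass this element through".

```java
import java.util.Locale;

// Hypothetical stand-in for a mostly-passthrough HTML element mapper.
class PassthroughMapperSketch {
    // Returns the element name to emit, or null to suppress the element.
    static String mapSafeElement(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        if ("body".equals(lower)) {
            // The content handler already emits its own body element,
            // so passing the document's body through would double it.
            return null;
        }
        // Every other element is passed through unchanged (lowercased).
        return lower;
    }
}
```

With this mapping, `mapSafeElement("BODY")` yields `null` while an element such as `DIV` still maps to `div`, so only the duplicate body is dropped.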


---



[GitHub] lucene-solr issue #44: SOLR-8981

2016-06-17 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/44
  
Just found it.  Confirming that fix doesn't break anything else.




---



[GitHub] lucene-solr issue #44: SOLR-8981

2016-06-17 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/44
  
No, it is a self-contained test with a test file. +1 on local and _only_ 
local.


---



[GitHub] lucene-solr issue #44: SOLR-8981

2016-06-17 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/44
  
argh...

will take a look.  The test passed if you assumed that the html had two 
bodies, but that's crazy...


---



[GitHub] lucene-solr pull request #44: SOLR-8981

2016-06-17 Thread tballison
GitHub user tballison reopened a pull request:

https://github.com/apache/lucene-solr/pull/44

SOLR-8981

SOLR-8981 upgrade to Tika 1.13

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr SOLR-8981

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/44.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #44


commit ba0e71703464849198b384aa6e92962db8a04b51
Author: tballison <talli...@mitre.org>
Date:   2016-06-16T16:56:45Z

SOLR-8981 upgrade to Tika 1.13

commit 1706b92790011f3ec5a85915adad3834e87d8970
Author: tballison <talli...@mitre.org>
Date:   2016-06-16T19:36:52Z

SOLR-8981 clean up license and sha1 info

commit 31c091b4856081f2d1b302499a436e5953779e5e
Author: tballison <talli...@mitre.org>
Date:   2016-06-17T13:47:53Z

SOLR-8981 clean up new lines, upgrade isoparser, add notice in CHANGES.txt




---



[GitHub] lucene-solr pull request #44: SOLR-8981

2016-06-17 Thread tballison
Github user tballison closed the pull request at:

https://github.com/apache/lucene-solr/pull/44


---



[GitHub] lucene-solr issue #44: SOLR-8981

2016-06-17 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/44
  
Y, I did run the extraction tests.  That was the error we were getting 
initially, but it disappeared (without explanation) on my most recent 
integration attempt.


---



[GitHub] lucene-solr issue #44: SOLR-8981

2016-06-17 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/44
  
Git (well, it was my fault, don't get me wrong) added the \r\n somehow.  I 
had turned off autocrlf earlier.

> C:\...>git config --get core.autocrlf
input

I realized I forgot to update the isoparser, and I cleaned up the Jackcess 
notice.

Let me know how this looks now.


---



[GitHub] lucene-solr issue #44: SOLR-8981

2016-06-16 Thread tballison
Github user tballison commented on the issue:

https://github.com/apache/lucene-solr/pull/44
  
I think I got it...  ant precommit worked in Linux with these 
modifications.  I kept getting hangs with ant jar-checksums in Windows.





[GitHub] lucene-solr pull request #44: SOLR-8981

2016-06-16 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/44

SOLR-8981

SOLR-8981 upgrade to Tika 1.13

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr SOLR-8981

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/44.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #44


commit ba0e71703464849198b384aa6e92962db8a04b51
Author: tballison <talli...@mitre.org>
Date:   2016-06-16T16:56:45Z

SOLR-8981 upgrade to Tika 1.13







[GitHub] lucene-solr pull request: Lucene5205

2014-07-24 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/68

Lucene5205

LUCENE-5205
1) merge from trunk
2) roll in March 10, 2014 LUCENE-5205 patch for improved stopword handling
3) roll in SOLR-5410

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr lucene5205

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/68.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #68


commit 3687d27902c3d993291a9f169f1c4a338c417327
Author: Uwe Schindler uschind...@apache.org
Date:   2014-06-11T17:50:45Z

SOLR-5940: post.jar reports back detailed error in case of error responses

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1601970 
13f79535-47bb-0310-9956-ffa450edef68

commit 4f2da71473619def348518402cc567f429047cc0
Author: Joel Bernstein jbern...@apache.org
Date:   2014-06-11T19:35:19Z

 SOLR-6150: Improving AnalyticsMergeStrategyTest

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1601997 
13f79535-47bb-0310-9956-ffa450edef68

commit 109c4c47679a193ac3ca3a4a449d759dbad59725
Author: shalin Shekhar Mangar sha...@apache.org
Date:   2014-06-12T11:18:33Z

SOLR-6056: Don't publish recovery state until recovery runs to avoid 
overwhelming the overseer state queue

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1602123 
13f79535-47bb-0310-9956-ffa450edef68

commit d553138492454798b9abeff7e610f0e8f3ddfb8b
Author: Michael McCandless mikemcc...@apache.org
Date:   2014-06-12T11:54:20Z

fix typo

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1602131 
13f79535-47bb-0310-9956-ffa450edef68

commit 4dd3197621324234e77e741fd843c4d76df07719
Author: Noble Paul no...@apache.org
Date:   2014-06-12T12:18:21Z

SOLR-6048 the assert was not really failing the test

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1602138 
13f79535-47bb-0310-9956-ffa450edef68

commit 2cdb0941446628663849f56ffbe4b42c62d00e0c
Author: Shai Erera sh...@apache.org
Date:   2014-06-12T12:26:20Z

add comments to clarify code

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1602140 
13f79535-47bb-0310-9956-ffa450edef68

commit d1274853919c1c9867e8e71117ff1303b6cc8816
Author: shalin Shekhar Mangar sha...@apache.org
Date:   2014-06-12T15:45:08Z

Fix typo, rf is actually 3 in code

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1602210 
13f79535-47bb-0310-9956-ffa450edef68

commit 0b9f7edd3109467052137004d36abb7f793e5835
Author: Robert Muir rm...@apache.org
Date:   2014-06-12T19:40:36Z

LUCENE-5748: Add SORTED_NUMERIC docvalues type

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1602277 
13f79535-47bb-0310-9956-ffa450edef68

commit e2f2c2fdaa77b4c17f6922fb9c5e25b02563855a
Author: Uwe Schindler uschind...@apache.org
Date:   2014-06-13T08:54:20Z

LUCENE-5754: Allow $ as part of variable and function names in 
expressions module

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1602344 
13f79535-47bb-0310-9956-ffa450edef68

commit 40137f9162350a6281e0d3fba99898fd66be28b2
Author: Adrien Grand jpou...@apache.org
Date:   2014-06-13T11:39:43Z

LUCENE-5695: DocIdSet implements Accountable.


git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1602387 
13f79535-47bb-0310-9956-ffa450edef68

commit ccf0a812d1644e70b33157d5c33b34e78889f327
Author: Simon Willnauer sim...@apache.org
Date:   2014-06-13T11:41:19Z

LUCENE-5756: Implement Accountable from IndexWriter

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1602388 
13f79535-47bb-0310-9956-ffa450edef68

commit 0114c4e7292aa261996688b4f0813622d3ff99b3
Author: Simon Willnauer sim...@apache.org
Date:   2014-06-13T11:49:54Z

Add Import Layout Table to idea codestyle

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1602389 
13f79535-47bb-0310-9956-ffa450edef68

commit 0e92dc55e6293c26c020550742e2272547589df7
Author: Robert Muir rm...@apache.org
Date:   2014-06-13T20:41:17Z

LUCENE-5757: move RamUsageEstimator reflector to test-framework

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1602515 
13f79535-47bb-0310-9956-ffa450edef68

commit 912e74424411c9055371924f403c0f66535c3066
Author: Chris M. Hostetter hoss...@apache.org
Date:   2014-06-13T21:15:50Z

SOLR-5426: Fixed a bug in ReverseWildCardFilter that could cause 
InvalidTokenOffsetsException when highlighting

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1602525 
13f79535-47bb-0310-9956-ffa450edef68

commit e9cb1382808cdd8f04dd837ce7fc473ed1e4a0b2
Author: Robert Muir rm...@apache.org
Date:   2014-06-13T21:55:20Z



[GitHub] lucene-solr pull request: LUCENE-5839: Fix regex in AnalyzingQuery...

2014-07-21 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/67

LUCENE-5839: Fix regex in AnalyzingQueryParser

LUCENE-5839: Fix regex in AnalyzingQueryParser

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/67.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #67


commit 1eac4382dd1ee7a4319096499335d7f7f28f526a
Author: tballison talli...@mitre.org
Date:   2014-07-21T13:22:38Z

Fix regex in AnalyzingQueryParser







[GitHub] lucene-solr pull request: Lucene5205

2014-07-21 Thread tballison
Github user tballison closed the pull request at:

https://github.com/apache/lucene-solr/pull/64





[GitHub] lucene-solr pull request: Lucene5205

2014-07-18 Thread tballison
GitHub user tballison opened a pull request:

https://github.com/apache/lucene-solr/pull/64

Lucene5205

First attempt at pull request for merge from trunk on Lucene5205.  Let's 
see how much gitiocy this displays...

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tballison/lucene-solr lucene5205

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/64.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #64


commit 0ce0716d34ae4eca99ed28d0490b033d5f92e245
Author: Anshum Gupta ans...@apache.org
Date:   2014-06-04T23:02:56Z

SOLR-6123: Make CLUSTERSTATE Api unblocked and non-blocking always

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1600535 
13f79535-47bb-0310-9956-ffa450edef68

commit 248272fc043ef0f0fc92d1dbee71cf81eb1357f7
Author: David Wayne Smiley dsmi...@apache.org
Date:   2014-06-05T01:43:12Z

LUCENE-5648: DateRangePrefixTree and NumberRangePrefixTreeStrategy

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1600555 
13f79535-47bb-0310-9956-ffa450edef68

commit 214a6ca80d59e42eff675f9fb420a5c3371e219a
Author: David Wayne Smiley dsmi...@apache.org
Date:   2014-06-05T02:04:36Z

SOLR-6103: Add QParser arg to AbstractSpatialFieldType.parseSpatialArgs(). 
Make getQueryFromSpatialArgs protected, not private.

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1600556 
13f79535-47bb-0310-9956-ffa450edef68

commit c2ee32ad5c359e89881fbc74873d784bd20a3ad3
Author: David Wayne Smiley dsmi...@apache.org
Date:   2014-06-05T02:05:27Z

SOLR-6103: DateRangeField

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1600557 
13f79535-47bb-0310-9956-ffa450edef68

commit 53a526b86aba0e2aeebbf39228815b190416e0c1
Author: Robert Muir rm...@apache.org
Date:   2014-06-05T02:29:33Z

LUCENE-5648: unbreak ant test

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1600560 
13f79535-47bb-0310-9956-ffa450edef68

commit 762d7905c9176c7297f7cd58ead93b627d485761
Author: Michael McCandless mikemcc...@apache.org
Date:   2014-06-05T09:32:28Z

LUCENE-5737: disable this test for now

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1600575 
13f79535-47bb-0310-9956-ffa450edef68

commit bc9398af5584ae6ec6bd16343e1c23642c084862
Author: Adrien Grand jpou...@apache.org
Date:   2014-06-05T12:17:22Z

LUCENE-5733: Remove PackedInts.Reader.(has|get)Array and move 
getBitsPerValue to PackedInts.Mutable.


git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1600626 
13f79535-47bb-0310-9956-ffa450edef68

commit f5dfac1333a92889f790f72fef78e84f57a74119
Author: Robert Muir rm...@apache.org
Date:   2014-06-05T15:54:49Z

LUCENE-5703: BinaryDocValues producers don't allocate or copy bytes on each 
access anymore

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1600688 
13f79535-47bb-0310-9956-ffa450edef68

commit f435a436979165ba13dd9e1961a26e91492b95df
Author: Adrien Grand jpou...@apache.org
Date:   2014-06-05T16:30:04Z

LUCENE-5721: Monotonic compression doesn't use zig-zag encoding anymore.


git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1600694 
13f79535-47bb-0310-9956-ffa450edef68

commit 9c49165f5d9db99d146b6df3530111b5f4d54cb5
Author: Robert Muir rm...@apache.org
Date:   2014-06-05T18:07:15Z

LUCENE-5703: fix safety bug for FC's BINARY too

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1600716 
13f79535-47bb-0310-9956-ffa450edef68

commit da61b4d251cf23fbf790dd655da078b23e8f4343
Author: Joel Bernstein jbern...@apache.org
Date:   2014-06-05T18:28:30Z

SOLR-6088: Add query re-ranking with the ReRankingQParserPlugin

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1600720 
13f79535-47bb-0310-9956-ffa450edef68

commit 41590df9f9f1c9ec35f5cb4bbc7c62a658cb4c34
Author: Joel Bernstein jbern...@apache.org
Date:   2014-06-05T21:05:25Z

SOLR-6088: Add query re-ranking with the ReRankingQParserPlugin

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1600765 
13f79535-47bb-0310-9956-ffa450edef68

commit 59e6878defb9e73e0402216ce36a81142ed03a29
Author: Simon Willnauer sim...@apache.org
Date:   2014-06-06T08:55:34Z

LUCENE-5738: Ensure NativeFSLock prevents opening the file channel twice if 
lock is held

git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/trunk@1600827 
13f79535-47bb-0310-9956-ffa450edef68

commit 64eab06719508223864ae68746c09d6e726098e4
Author: Chris M. Hostetter hoss...@apache.org
Date:   2014-06-06T22:44:02Z

SOLR-5285: Added a new [child ...] DocTransformer for optionally including 
Block-Join descendant documents inline in the results of a search

git-svn-id: https