[tika] branch master updated: As per RFC2361, the official mimetype for WAV is audio/vnd.wav

2017-06-29 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new bada130 As per RFC2361, the official mimetype

[tika] branch master updated: TIKA-2388 OpenOffice database files have application/vnd.oasis.opendocument.base as their embedded mimetype, so make that the canonical one

2017-06-08 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new 916e5ed TIKA-2388 OpenOffice database files have

[tika] branch master updated: TIKA-2372 Test DMG file

2017-05-18 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new 5a964e5 TIKA-2372 Test DMG file 5a964e5

svn commit: r1795357 - /tika/site/src/site/apt/1.14/formats.apt

2017-05-16 Thread nick
Author: nick Date: Tue May 16 19:11:53 2017 New Revision: 1795357 URL: http://svn.apache.org/viewvc?rev=1795357&view=rev Log: Version number fix Modified: tika/site/src/site/apt/1.14/formats.apt Modified: tika/site/src/site/apt/1.14/formats.apt URL: http://svn.apache.org/viewvc/tika/site/src

svn commit: r1794436 - /tika/site/publish/1.15/formats.html

2017-05-08 Thread nick
Author: nick Date: Mon May 8 18:13:50 2017 New Revision: 1794436 URL: http://svn.apache.org/viewvc?rev=1794436&view=rev Log: Republish Modified: tika/site/publish/1.15/formats.html Modified: tika/site/publish/1.15/formats.html URL: http://svn.apache.org/viewvc/tika/site/publish/1.15

svn commit: r1794435 - /tika/site/src/site/apt/1.15/formats.apt

2017-05-08 Thread nick
Author: nick Date: Mon May 8 18:13:15 2017 New Revision: 1794435 URL: http://svn.apache.org/viewvc?rev=1794435&view=rev Log: EMF, and mention the existence of the NER, NLP and OR parsers Modified: tika/site/src/site/apt/1.15/formats.apt Modified: tika/site/src/site/apt/1.15/formats.apt URL

svn commit: r1794427 - /tika/site/src/site/apt/1.15/formats.apt

2017-05-08 Thread nick
Author: nick Date: Mon May 8 17:46:33 2017 New Revision: 1794427 URL: http://svn.apache.org/viewvc?rev=1794427&view=rev Log: Document support for additional Microsoft Office formats Modified: tika/site/src/site/apt/1.15/formats.apt Modified: tika/site/src/site/apt/1.15/formats.apt URL: http

svn commit: r1794421 - /tika/site/src/site/apt/1.15/formats.apt

2017-05-08 Thread nick
Author: nick Date: Mon May 8 17:41:31 2017 New Revision: 1794421 URL: http://svn.apache.org/viewvc?rev=1794421&view=rev Log: Document support for TSD, WMF and WordPerfect Modified: tika/site/src/site/apt/1.15/formats.apt Modified: tika/site/src/site/apt/1.15/formats.apt URL: http

svn commit: r1794419 - /tika/site/src/site/apt/1.15/formats.apt

2017-05-08 Thread nick
Author: nick Date: Mon May 8 17:31:09 2017 New Revision: 1794419 URL: http://svn.apache.org/viewvc?rev=1794419&view=rev Log: Compress supported formats update, and other new packaging support Modified: tika/site/src/site/apt/1.15/formats.apt Modified: tika/site/src/site/apt/1.15/formats.apt URL

svn commit: r1794409 - in /tika/site/publish/1.15: ./ examples.html formats.html

2017-05-08 Thread nick
Author: nick Date: Mon May 8 16:36:55 2017 New Revision: 1794409 URL: http://svn.apache.org/viewvc?rev=1794409&view=rev Log: Publish WiP 1.15 pages Added: tika/site/publish/1.15/ tika/site/publish/1.15/examples.html tika/site/publish/1.15/formats.html Added: tika/site/publish/1.15

svn commit: r1794408 - in /tika/site: publish/contribute.html src/site/apt/contribute.apt.vm

2017-05-08 Thread nick
Author: nick Date: Mon May 8 16:36:15 2017 New Revision: 1794408 URL: http://svn.apache.org/viewvc?rev=1794408&view=rev Log: Update the Git URL, and put Github as the default now Modified: tika/site/publish/contribute.html tika/site/src/site/apt/contribute.apt.vm Modified: tika/site/publish

svn commit: r1794407 [1/2] - in /tika/site: publish/0.10/ publish/0.7/ publish/0.8/ publish/0.9/ publish/1.0/ publish/1.1/ publish/1.10/ publish/1.11/ publish/1.12/ publish/1.13/ publish/1.14/ publish

2017-05-08 Thread nick
Author: nick Date: Mon May 8 16:32:46 2017 New Revision: 1794407 URL: http://svn.apache.org/viewvc?rev=1794407&view=rev Log: Update the Git URL Modified: tika/site/publish/0.10/parser_guide.html tika/site/publish/0.7/parser_guide.html tika/site/publish/0.8/parser_guide.html tika/site

svn commit: r1794407 [2/2] - in /tika/site: publish/0.10/ publish/0.7/ publish/0.8/ publish/0.9/ publish/1.0/ publish/1.1/ publish/1.10/ publish/1.11/ publish/1.12/ publish/1.13/ publish/1.14/ publish

2017-05-08 Thread nick
Modified: tika/site/src/site/apt/1.4/parser_guide.apt URL: http://svn.apache.org/viewvc/tika/site/src/site/apt/1.4/parser_guide.apt?rev=1794407&r1=1794406&r2=1794407&view=diff == --- tika/site/src/site/apt/1.4/parser_guide.apt

[tika] 01/02: TIKA-2346 Add OfficeParserConfig support to control extraction from shapes from non-shape-based formats

2017-04-27 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit aa4954fb44f707779693faea785acc219739ccd5 Author: Nick Burch <n...@gagravarr.org> AuthorDate: Thu Apr 27 17:58:35 2017

[tika] 02/02: TIKA-2346 OfficeParserConfig control extraction from shapes from DOCX

2017-04-27 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 0876aa909ffb77dfbd384ebe2f5de0a873ab489a Author: Nick Burch <n...@gagravarr.org> AuthorDate: Thu Apr 27 18:02:06 2017

[tika] 01/04: TIKA-2345 Tika Config Serialisation of EncodingDetector details

2017-04-27 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 4d3a43c1682ca88f8c5a88ea1b34cd6fb105f997 Author: Nick Burch <n...@gagravarr.org> AuthorDate: Thu Apr 27 15:31:16 2017

[tika] branch master updated (11ad0fd -> d77fb59)

2017-04-27 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from 11ad0fd TIKA-2039 -- extra unit test... ensure standard handling of exception in embedded file new 4d3a43c

[tika] 02/04: TIKA-2345 Test for Tika Config Serialisation of EncodingDetector

2017-04-27 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 86e821b6d47b32e0d00dbaf95d84cff5aa5570b5 Author: Nick Burch <n...@gagravarr.org> AuthorDate: Thu Apr 27 15:37:16 2017

[tika] 04/04: Merge branch 'master' of https://github.com/apache/tika

2017-04-27 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit d77fb59a54b40d1f850a89053bebd87409b2b05c Merge: 27f3b3d 11ad0fd Author: Nick Burch <n...@gagravarr.org> AuthorDate: T

[tika] branch 2.x updated: Merge changelog update

2017-03-23 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch 2.x in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/2.x by this push: new f87948d Merge changelog update f87948d is described

[tika] branch master updated: Changelog update

2017-03-23 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git The following commit(s) were added to refs/heads/master by this push: new 8d31ab6 Changelog update 8d31ab6 is described

[tika] 02/04: TIKA-1772 More test WebVTT files - no text header, and a custom one

2017-03-22 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch 2.x in repository https://gitbox.apache.org/repos/asf/tika.git commit e34498bbefc87038511e3077a80cf71d8c8fc98b Author: Nick Burch <n...@gagravarr.org> AuthorDate: Wed Mar 22 23:26:28 2017

[tika] 03/04: Merge 3c02c4b to the new 2.x test documents area

2017-03-22 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch 2.x in repository https://gitbox.apache.org/repos/asf/tika.git commit d12c87b6d65f36e4e2d8cfbc766bbe4aa35fe5c9 Author: Nick Burch <n...@gagravarr.org> AuthorDate: Thu Mar 23 00:15:03 2017

[tika] branch 2.x updated (e3fead4 -> 78c31eb)

2017-03-22 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch 2.x in repository https://gitbox.apache.org/repos/asf/tika.git. from e3fead4 TIKA-2307 -- include finer grained supported types so that users can control includes/excludes with decorator

[tika] 01/04: TIKA-1772 More WebVTT magic - for cases with no header, and with custom headers

2017-03-22 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch 2.x in repository https://gitbox.apache.org/repos/asf/tika.git commit 2df5c536bd3a37660dab5914a385097ed1f39560 Author: Nick Burch <n...@gagravarr.org> AuthorDate: Wed Mar 22 23:25:52 2017

[tika] 04/04: TIKA-1772 More WebVTT unit tests

2017-03-22 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch 2.x in repository https://gitbox.apache.org/repos/asf/tika.git commit 78c31eb614231d8f75f87a09fa0d3eef9e3010ba Author: Nick Burch <n...@gagravarr.org> AuthorDate: Wed Mar 22 23:27:05 2017

[tika] branch master updated (256a281 -> 40647ea)

2017-03-22 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/tika.git. from 256a281 TIKA-2212 ooxml parser should use finer-grained media types so that they can be filtered by users

[tika] 02/03: TIKA-1772 More test WebVTT files - no text header, and a custom one

2017-03-22 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 3c02c4b2abf10ce9745734d8eff2b7a1f5bf1765 Author: Nick Burch <n...@gagravarr.org> AuthorDate: Wed Mar 22 23:26:28 2017

[tika] 03/03: TIKA-1772 More WebVTT unit tests

2017-03-22 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit 40647ea4e929683ae41422bdd3144cf84f24d0e0 Author: Nick Burch <n...@gagravarr.org> AuthorDate: Wed Mar 22 23:27:05 2017

[tika] 01/03: TIKA-1772 More WebVTT magic - for cases with no header, and with custom headers

2017-03-22 Thread nick
This is an automated email from the ASF dual-hosted git repository. nick pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/tika.git commit bb82205eece0eb68edee7d7ac24f63cf3934198f Author: Nick Burch <n...@gagravarr.org> AuthorDate: Wed Mar 22 23:25:52 2017

[3/5] tika git commit: TIKA-2255 Magic for older sas data files

2017-01-30 Thread nick
Branch: refs/heads/2.x Commit: a79de0ccfa20feeed8e8c805f320e3f2d9077829 Parents: 4d8feae Author: Nick Burch <n...@gagravarr.org> Authored: Mon Jan 30 08:39:43 2017 + Committer: Nick Burch <n...@gagravarr.org> Committed: Mon Jan 30 08:50:2

[2/5] tika git commit: Move to Tika 2.x location

2017-01-30 Thread nick
/heads/2.x Commit: 4d8feaee50968e2d30804dd24e11fd5723f99f6e Parents: 6287b75 Author: Nick Burch <n...@gagravarr.org> Authored: Mon Jan 30 08:50:17 2017 + Committer: Nick Burch <n...@gagravarr.org> Committed: Mon Jan 30 08:50:1

[4/5] tika git commit: TIKA-2255 Mime detection unit tests for SAS files

2017-01-30 Thread nick
/534a5259 Branch: refs/heads/2.x Commit: 534a525980a8232bb4dcf567a351828c8510ea5e Parents: a79de0c Author: Nick Burch <n...@gagravarr.org> Authored: Mon Jan 30 08:39:59 2017 + Committer: Nick Burch <n...@gagravarr.org> Committed: Mon Jan 30 08:52:0

[1/5] tika git commit: TIKA-2255 Test SAS files

2017-01-30 Thread nick
iff: http://git-wip-us.apache.org/repos/asf/tika/diff/6287b75b Branch: refs/heads/2.x Commit: 6287b75b5aaca14083f9f222b8881b920dcbf04e Parents: 3df8ce8 Author: Nick Burch <n...@gagravarr.org> Authored: Mon Jan 30 08:35:33 2017 + Committer: Nick Burch <n...@gagravarr.org> Committed

[2/3] tika git commit: TIKA-2255 Magic for older sas data files

2017-01-30 Thread nick
Branch: refs/heads/master Commit: c5130ec612ec3c46cd4a7d86153b4c6e4b8cf57f Parents: 2735942 Author: Nick Burch <n...@gagravarr.org> Authored: Mon Jan 30 08:39:43 2017 + Committer: Nick Burch <n...@gagravarr.org> Committed: Mon Jan 30 08:39:4

[2/3] tika git commit: TIKA-2250 As of RFC7903, the official mime type for WMF is now an image one and without the x- prefix

2017-01-23 Thread nick
Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/6668d78f Branch: refs/heads/2.x Commit: 6668d78fa73e050a2e36bf5bc57106c0640bff6b Parents: 58d56c3 Author: Nick Burch <n...@gagravarr.org> Authored: Mon Jan 23 18:27:02 2017 + Committer: Nick Burch <n...@gagravarr.org> Committed: M

[3/3] tika git commit: TIKA-2250 As of RFC7903, the official mime type for EMF is now an image one and without the x- prefix

2017-01-23 Thread nick
Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/bd667acd Branch: refs/heads/2.x Commit: bd667acde6a48e118574129d79dfacb1c3c2db25 Parents: 6668d78 Author: Nick Burch <n...@gagravarr.org> Authored: Mon Jan 23 18:31:49 2017 + Committer: Nick Burch <n...@gagravarr.org> Committed: M

[1/3] tika git commit: TIKA-2250 As of RFC7903, the official mime type for BMP is now the one without the x- prefix

2017-01-23 Thread nick
c33 Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/58d56c33 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/58d56c33 Branch: refs/heads/2.x Commit: 58d56c33fc7d103e0c6875aa63f3377eaf8b7ae4 Parents: 7882817 Author: Nick Burch <n...@gagravarr.org> Authored: Mon Jan 23 18:20:44 2

[2/3] tika git commit: TIKA-2250 As of RFC7903, the official mime type for WMF is now an image one and without the x- prefix

2017-01-23 Thread nick
Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/e6c0082e Branch: refs/heads/master Commit: e6c0082e41143a01f0bf646a8a8b6c06a85ca239 Parents: 847156a Author: Nick Burch <n...@gagravarr.org> Authored: Mon Jan 23 18:27:02 2017 + Committer: Nick Burch <n...@gagravarr.org> Committed

[1/3] tika git commit: TIKA-2250 As of RFC7903, the official mime type for BMP is now the one without the x- prefix

2017-01-23 Thread nick
mit/847156ac Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/847156ac Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/847156ac Branch: refs/heads/master Commit: 847156ac0f5fa7d4cc06964198359cf594b66d50 Parents: 4cc15e2 Author: Nick Burch <n...@gagravarr.org> Authored: Mon Jan 23 18:2

[3/3] tika git commit: TIKA-2250 As of RFC7903, the official mime type for EMF is now an image one and without the x- prefix

2017-01-23 Thread nick
Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/90bf4f6e Branch: refs/heads/master Commit: 90bf4f6e4c645240b36ded6973eb64961312fc0a Parents: e6c0082 Author: Nick Burch <n...@gagravarr.org> Authored: Mon Jan 23 18:31:49 2017 + Committer: Nick Burch <n...@gagravarr.org> Committed

tika git commit: TIKA-2241 Add new config dumping option STATIC_FULL which lists all supported+active mime types for parsers

2017-01-17 Thread nick
ika/commit/45a9b77d Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/45a9b77d Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/45a9b77d Branch: refs/heads/2.x Commit: 45a9b77d614ba005be43cf0c45321fe4071a425a Parents: dd70fd3 Author: Nick Burch <n...@gagravarr.org> Authored: Tue Jan

tika git commit: TIKA-2241 Add new config dumping option STATIC_FULL which lists all supported+active mime types for parsers

2017-01-17 Thread nick
asf/tika/commit/320a1f1e Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/320a1f1e Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/320a1f1e Branch: refs/heads/master Commit: 320a1f1ede36cf1f62f6f2b8cab468cd78094606 Parents: 25aa2be Author: Nick Burch <n...@gagravarr.org> Authored: T

[1/6] tika git commit: TIKA-2224 Mime magic for OneNote

2016-12-22 Thread nick
ree/bb76d986 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/bb76d986 Branch: refs/heads/2.x Commit: bb76d986a81e720e8f2992d8e07eb1e462337ab7 Parents: d8fa3c2 Author: Nick Burch <n...@gagravarr.org> Authored: Thu Dec 22 01:11:06 2016 + Committer: Nick Burch <n...@gagravarr.org> Com

[5/6] tika git commit: Manually merge changelog

2016-12-22 Thread nick
/heads/2.x Commit: 785e4741332bbe8bad3dca1379e0083f3dc1bbbc Parents: cdb6456 Author: Nick Burch <n...@gagravarr.org> Authored: Fri Dec 23 03:22:13 2016 + Committer: Nick Burch <n...@gagravarr.org> Committed: Fri Dec 23 03:22:1

[3/6] tika git commit: TIKA-2224 We now differ from HTTPD on onenote formats, as we have subtypes they lack

2016-12-22 Thread nick
-us.apache.org/repos/asf/tika/diff/71584b2d Branch: refs/heads/2.x Commit: 71584b2deb5eda2742fb362b279740e8b6fed15d Parents: db21ee1 Author: Nick Burch <n...@gagravarr.org> Authored: Thu Dec 22 01:27:10 2016 + Committer: Nick Burch <n...@gagravarr.org> Committed: Fri Dec 23 03:21:1

[2/6] tika git commit: TIKA-2224 Mime sub-entry for .onepkg, a cab file holding other onenote files

2016-12-22 Thread nick
-us.apache.org/repos/asf/tika/diff/db21ee15 Branch: refs/heads/2.x Commit: db21ee158a0a576c8012c8424ba1424626adc77a Parents: bb76d98 Author: Nick Burch <n...@gagravarr.org> Authored: Thu Dec 22 01:21:33 2016 + Committer: Nick Burch <n...@gagravarr.org> Committed: Fri Dec 23 03:16:3

[4/6] tika git commit: TIKA-2224 Test OneNote file from Krishnan Narayan plus unit test

2016-12-22 Thread nick
/asf/tika/diff/cdb6456b Branch: refs/heads/2.x Commit: cdb6456bbf1317e20f1fd11b2a9bcd1fc2282b2d Parents: 71584b2 Author: Nick Burch <n...@gagravarr.org> Authored: Fri Dec 23 03:12:48 2016 + Committer: Nick Burch <n...@gagravarr.org> Committed: Fri Dec 23 03:21:3

[6/6] tika git commit: Move new test file to the 2.x location

2016-12-22 Thread nick
Branch: refs/heads/2.x Commit: 4e3534da0d1ebb2ab970262bdb610cbf2e464f32 Parents: 785e474 Author: Nick Burch <n...@gagravarr.org> Authored: Fri Dec 23 03:25:03 2016 + Committer: Nick Burch <n...@gagravarr.org> Committed: Fri Dec 23 03:25:0

[4/5] tika git commit: TIKA-2224 We now differ from HTTPD on onenote formats, as we have subtypes they lack

2016-12-21 Thread nick
-us.apache.org/repos/asf/tika/diff/9546bd31 Branch: refs/heads/master Commit: 9546bd31953a10704e54fd40ebac68b2138e3aa2 Parents: 135f326 Author: Nick Burch <n...@gagravarr.org> Authored: Thu Dec 22 01:27:10 2016 + Committer: Nick Burch <n...@gagravarr.org> Committed: Thu Dec 22 0

[5/5] tika git commit: Merge

2016-12-21 Thread nick
: aa448a3b7e61f9a46efd1bf3f2ac72e6a3852d8f Parents: 9546bd3 d011d70 Author: Nick Burch <n...@gagravarr.org> Authored: Thu Dec 22 01:31:25 2016 + Committer: Nick Burch <n...@gagravarr.org> Committed: Thu Dec 22 01:31:2

[1/5] tika git commit: TIKA-2224 Mime magic for OneNote

2016-12-21 Thread nick
ree/df14f78e Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/df14f78e Branch: refs/heads/master Commit: df14f78e46feeae16cb6cbd2cb40c44ce497d53e Parents: aa2407a Author: Nick Burch <n...@gagravarr.org> Authored: Thu Dec 22 01:11:06 2016 + Committer: Nick Burch <n...@gagravarr.or

[2/5] tika git commit: TIKA-2224 Mime sub-entry for .onepkg, a cab file holding other onenote files

2016-12-21 Thread nick
-us.apache.org/repos/asf/tika/diff/009c143a Branch: refs/heads/master Commit: 009c143aedb95e356b2835b17fd66e9f7aec43d0 Parents: df14f78 Author: Nick Burch <n...@gagravarr.org> Authored: Thu Dec 22 01:21:33 2016 + Committer: Nick Burch <n...@gagravarr.org> Committed: Thu Dec 22 01:21:3

[3/5] tika git commit: Changelog

2016-12-21 Thread nick
Commit: 135f3265b8b898924d7c1938f5b16c274b916c66 Parents: 009c143 Author: Nick Burch <n...@gagravarr.org> Authored: Thu Dec 22 01:23:37 2016 + Committer: Nick Burch <n...@gagravarr.org> Committed: Thu Dec 22 01:23:3

svn commit: r1772249 - /tika/site/publish/doap.rdf

2016-12-01 Thread nick
Author: nick Date: Thu Dec 1 18:55:09 2016 New Revision: 1772249 URL: http://svn.apache.org/viewvc?rev=1772249&view=rev Log: Publish new DOAP Modified: tika/site/publish/doap.rdf Modified: tika/site/publish/doap.rdf URL: http://svn.apache.org/viewvc/tika/site/publish/doap.rdf?rev=1772249

svn commit: r1772248 - /tika/site/src/site/resources/doap.rdf

2016-12-01 Thread nick
Author: nick Date: Thu Dec 1 18:52:50 2016 New Revision: 1772248 URL: http://svn.apache.org/viewvc?rev=1772248&view=rev Log: Add our twitter details Modified: tika/site/src/site/resources/doap.rdf Modified: tika/site/src/site/resources/doap.rdf URL: http://svn.apache.org/viewvc/tika/site/src

svn commit: r1769266 - in /tika/site/src/site/apt/1.15: ./ examples.apt formats.apt

2016-11-11 Thread nick
Author: nick Date: Fri Nov 11 09:31:29 2016 New Revision: 1769266 URL: http://svn.apache.org/viewvc?rev=1769266&view=rev Log: Start tracking examples and formats for 1.15 Added: tika/site/src/site/apt/1.15/ tika/site/src/site/apt/1.15/examples.apt (props changed) - copied unchanged

tika git commit: Mimetype for SAS Xport (XPT) files

2016-11-10 Thread nick
ree/d647a234 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/d647a234 Branch: refs/heads/master Commit: d647a234783bbc3f83d3691a1e1d5788c438d55d Parents: 2e325cb Author: Nick Burch <n...@gagravarr.org> Authored: Thu Nov 10 18:14:55 2016 + Committer: Nick Burch <n...@gagravarr.or

tika git commit: Tesseract may see the t in haystack as a ! some times...

2016-10-05 Thread nick
org/repos/asf/tika/tree/1ec8c094 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/1ec8c094 Branch: refs/heads/2.x Commit: 1ec8c0947575729975601d543f9a5b08ca3c7269 Parents: 1ab6c81 Author: Nick Burch <n...@gagravarr.org> Authored: Wed Jun 22 09:33:41 2016 +0100 Committer: Nick

[2/3] tika git commit: TIKA-2064 Test Stata DTA files from Michael Stepner, plus detection unit test

2016-09-13 Thread nick
-us.apache.org/repos/asf/tika/diff/e58ade38 Branch: refs/heads/2.x Commit: e58ade381a3e4285eb81d55fb250611e82adbef7 Parents: 443a21e Author: Nick Burch <n...@gagravarr.org> Authored: Tue Sep 13 20:41:41 2016 +0100 Committer: Nick Burch <n...@gagravarr.org> Committed: Tue Sep 13 20:48:1

[1/3] tika git commit: TIKA-2064 Mime types, with magic, for mostly-xml Stata DTA files. (Awaiting suitably licensed file for testing)

2016-09-13 Thread nick
asf/tika/commit/443a21e3 Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/443a21e3 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/443a21e3 Branch: refs/heads/2.x Commit: 443a21e3fb564df9bb1c52f6533bd5da6f5cfcc8 Parents: 4636f95 Author: Nick Burch <n...@gagravarr.org> Authored: T

[1/2] tika git commit: TIKA-2064 Test Stata DTA files from Michael Stepner, plus detection unit test

2016-09-13 Thread nick
ttp://git-wip-us.apache.org/repos/asf/tika/tree/fe0c Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/fe0c Branch: refs/heads/master Commit: fe0ce2e1db633bdcf49bd7b24941374f2033 Parents: 3c0abc8 Author: Nick Burch <n...@gagravarr.org> Authored: Tue Sep 13 20:41:41 2016 +0100

[2/2] tika git commit: Changelog update

2016-09-13 Thread nick
/master Commit: 9130bbc1fa6d69419b2ad294917260d6b1cced08 Parents: fe0 Author: Nick Burch <n...@gagravarr.org> Authored: Tue Sep 13 20:42:13 2016 +0100 Committer: Nick Burch <n...@gagravarr.org> Committed: Tue Sep 13 20:42:1

[3/5] tika git commit: TIKA-2042 MBOX magic and detection unit test

2016-07-26 Thread nick
/65cc9bce Branch: refs/heads/2.x Commit: 65cc9bcecdc6b86294a88f3b2b6b26017f356ae5 Parents: 31374a3 Author: Nick Burch <n...@gagravarr.org> Authored: Tue Jul 26 11:36:29 2016 +0100 Committer: Nick Burch <n...@gagravarr.org> Committed: Tue Jul 26 12:06:5

[4/5] tika git commit: Changelog update

2016-07-26 Thread nick
Commit: 53310facc3e9514904d85f3d1e0f1ae6ed57d5cb Parents: 65cc9bc Author: Nick Burch <n...@gagravarr.org> Authored: Tue Jul 26 11:38:14 2016 +0100 Committer: Nick Burch <n...@gagravarr.org> Committed: Tue Jul 26 12:06:5

[2/5] tika git commit: TIKA-2037 RFC822Parser should wrap the James InputStream of embedded resources to avoid problems with downstream detection or extraction

2016-07-26 Thread nick
-us.apache.org/repos/asf/tika/tree/31374a39 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/31374a39 Branch: refs/heads/2.x Commit: 31374a39bae03bfc260f73662c133467637193f1 Parents: d6ce10b Author: Nick Burch <n...@gagravarr.org> Authored: Wed Jul 20 18:15:25 2016 +0100 Committer: Nick Bu

[1/5] tika git commit: Email with attachment for testing extraction issues

2016-07-26 Thread nick
pos/asf/tika/tree/d6ce10b4 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/d6ce10b4 Branch: refs/heads/2.x Commit: d6ce10b4140b63797726e7881039c6fd09df3cbe Parents: 8b951a4 Author: Nick Burch <n...@gagravarr.org> Authored: Wed Jul 20 17:38:45 2016 +0100 Committer: Nick Burch <n...@g

[5/5] tika git commit: TIKA-2037 Merge fixes for 2.x

2016-07-26 Thread nick
: refs/heads/2.x Commit: f89887d2fbaa3949c398095b37322208a3fd4c7a Parents: 53310fa Author: Nick Burch <n...@gagravarr.org> Authored: Tue Jul 26 12:18:00 2016 +0100 Committer: Nick Burch <n...@gagravarr.org> Committed: Tue Jul 26 12:18:0

tika git commit: Changelog update

2016-07-26 Thread nick
iff: http://git-wip-us.apache.org/repos/asf/tika/diff/53c461a0 Branch: refs/heads/master Commit: 53c461a036eb6afccb20cd3c56d1b0b347a24ded Parents: 72d2d88 Author: Nick Burch <n...@gagravarr.org> Authored: Tue Jul 26 11:38:14 2016 +0100 Committer: Nick Burch <n...@gagravarr.org> Committed: T

tika git commit: TIKA-2042 MBOX magic and detection unit test

2016-07-26 Thread nick
asf/tika/tree/72d2d88b Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/72d2d88b Branch: refs/heads/master Commit: 72d2d88b381ba75942ae791042ef54af33ee1f38 Parents: f00ab04 Author: Nick Burch <n...@gagravarr.org> Authored: Tue Jul 26 11:36:29 2016 +0100 Committer: Nick Burch <n...@g

tika git commit: TIKA-2037 RFC822Parser should wrap the James InputStream of embedded resources to avoid problems with downstream detection or extraction

2016-07-20 Thread nick
git-wip-us.apache.org/repos/asf/tika/commit/952fb54e Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/952fb54e Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/952fb54e Branch: refs/heads/master Commit: 952fb54ed78a2fba07db4653cc674f5641211031 Parents: 3ecdc0c Author: Nick Burch

tika git commit: Email with attachment for testing extraction issues

2016-07-20 Thread nick
org/repos/asf/tika/tree/3ecdc0cb Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/3ecdc0cb Branch: refs/heads/master Commit: 3ecdc0cb0b689b62bb2dd26dfdbc9a52811a9604 Parents: ff187a0 Author: Nick Burch <n...@gagravarr.org> Authored: Wed Jul 20 17:38:45 2016 +0100 Committer: Nick

tika git commit: Detection magic for POI-generated OOXML files, which have _rels before content type, plus test

2016-06-23 Thread nick
mit/52ea9ba7 Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/52ea9ba7 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/52ea9ba7 Branch: refs/heads/master Commit: 52ea9ba7c2e3c99e7a2d4fb38875caa996438857 Parents: d6981ad Author: Nick Burch <n...@gagravarr.org> Authored: Thu Jun 23

[2/2] tika git commit: Upgrade Commons Compress to 1.12 (supports progress on TIKA-1358)

2016-06-22 Thread nick
/asf/tika/diff/a46ffacf Branch: refs/heads/master Commit: a46ffacf10a783187042a4c940ca62549b4bea03 Parents: 5101023 Author: Nick Burch <n...@gagravarr.org> Authored: Wed Jun 22 09:37:36 2016 +0100 Committer: Nick Burch <n...@gagravarr.org> Committed: Wed Jun 22 09:37:3

svn commit: r1748423 - in /tika/site: publish/mail-lists.html src/site/apt/mail-lists.apt

2016-06-14 Thread nick
Author: nick Date: Tue Jun 14 14:25:01 2016 New Revision: 1748423 URL: http://svn.apache.org/viewvc?rev=1748423&view=rev Log: Link to the new PonyMail lists.apache.org interface by default - more user friendly and helpful Modified: tika/site/publish/mail-lists.html tika/site/src/site/apt/mail

svn commit: r1745868 - in /tika/site/publish: 1.10/configuring.html 1.11/configuring.html 1.12/configuring.html 1.13/configuring.html

2016-05-28 Thread nick
Author: nick Date: Sat May 28 11:41:48 2016 New Revision: 1745868 URL: http://svn.apache.org/viewvc?rev=1745868&view=rev Log: Republish Modified: tika/site/publish/1.10/configuring.html tika/site/publish/1.11/configuring.html tika/site/publish/1.12/configuring.html tika/site/publish

svn commit: r1745867 - in /tika/site/src/site/apt: 1.10/configuring.apt 1.11/configuring.apt 1.12/configuring.apt 1.13/configuring.apt

2016-05-28 Thread nick
Author: nick Date: Sat May 28 11:41:07 2016 New Revision: 1745867 URL: http://svn.apache.org/viewvc?rev=1745867&view=rev Log: Correct the markup for pre-formatted text, to avoid key words getting lost in the configuring section TIKA-1989 Modified: tika/site/src/site/apt/1.10/configuring.apt

svn commit: r1744087 - in /tika/site/src/site/apt/1.14: ./ examples.apt formats.apt

2016-05-16 Thread nick
Author: nick Date: Mon May 16 16:34:06 2016 New Revision: 1744087 URL: http://svn.apache.org/viewvc?rev=1744087&view=rev Log: Start tracking the formats and examples for 1.14, as we develop Added: tika/site/src/site/apt/1.14/ tika/site/src/site/apt/1.14/examples.apt - copied unchanged

[1/2] tika git commit: TIKA-1928 More tests for files with # in them

2016-05-15 Thread nick
asf/tika/tree/398e73de Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/398e73de Branch: refs/heads/2.x Commit: 398e73dee3e6ece5a0af376abe954b75bce0cfda Parents: a882a32 Author: Nick Burch <n...@gagravarr.org> Authored: Mon May 16 01:04:03 2016 +0100 Committer: Nick Burch <n...@g

tika git commit: TIKA-1928 Fix detection for filenames containing a #, avoid mis-detecting that part as a page anchor

2016-05-15 Thread nick
ika/commit/b2821d92 Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/b2821d92 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/b2821d92 Branch: refs/heads/master Commit: b2821d921ac4cfd3be468e8bea9123f5cb627cbf Parents: bc0b1f7 Author: Nick Burch <n...@gagravarr.org> Authored: Mon May

svn commit: r1743051 - /tika/site/src/site/apt/1.13/formats.apt

2016-05-09 Thread nick
Author: nick Date: Mon May 9 21:03:58 2016 New Revision: 1743051 URL: http://svn.apache.org/viewvc?rev=1743051&view=rev Log: Update for 1.13 formats Modified: tika/site/src/site/apt/1.13/formats.apt Modified: tika/site/src/site/apt/1.13/formats.apt URL: http://svn.apache.org/viewvc/tika/site

tika git commit: TIKA-1966 Converted versions of test iWorks files from latest iWorks for iPad

2016-05-04 Thread nick
ttp://git-wip-us.apache.org/repos/asf/tika/tree/c93ff3e1 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/c93ff3e1 Branch: refs/heads/master Commit: c93ff3e1cc8d1af1e925a3911949405d0084a507 Parents: 7f486e6 Author: Nick Burch <n...@gagravarr.org> Authored: Wed May 4 14:11:52 2016 +0100

[4/5] tika git commit: Whitespace

2016-05-02 Thread nick
Commit: fba811b830a2105095efcdc267b5843c29932e7e Parents: 567f7f7 Author: Nick Burch <n...@gagravarr.org> Authored: Mon May 2 21:09:33 2016 +0100 Committer: Nick Burch <n...@gagravarr.org> Committed: Mon May 2 21:09:3

[5/5] tika git commit: Comment updates

2016-05-02 Thread nick
Commit: 7f486e693bb28448811245e71deeb3fb561332b2 Parents: fba811b Author: Nick Burch <n...@gagravarr.org> Authored: Mon May 2 21:09:48 2016 +0100 Committer: Nick Burch <n...@gagravarr.org> Committed: Mon May 2 21:09:4

svn commit: r1741802 - /tika/site/src/site/apt/index.apt.vm

2016-04-30 Thread nick
Author: nick Date: Sat Apr 30 19:46:05 2016 New Revision: 1741802 URL: http://svn.apache.org/viewvc?rev=1741802&view=rev Log: Fix link Modified: tika/site/src/site/apt/index.apt.vm Modified: tika/site/src/site/apt/index.apt.vm URL: http://svn.apache.org/viewvc/tika/site/src/site/apt

[2/2] tika git commit: Remove erroneous backslashes before already-escaped < entries in vnd.mif mime magic, plus unit tests. Thanks to Steffen Netz in TIKA-1898 for help with this

2016-03-23 Thread nick
ttp://git-wip-us.apache.org/repos/asf/tika/tree/c94236a8 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/c94236a8 Branch: refs/heads/master Commit: c94236a8365b9ab1491d98c31819a85b065a9fe1 Parents: 959c9ad Author: Nick Burch <n...@gagravarr.org> Authored: Wed Mar 23 23:31:12 2016 +

[1/2] tika git commit: Test vnd.mif file from Steffen Netz from TIKA-1898

2016-03-23 Thread nick
org/repos/asf/tika/tree/959c9ad2 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/959c9ad2 Branch: refs/heads/master Commit: 959c9ad2843ed8d21cb3e8c60e5ac614e2f7ab99 Parents: 9ebf066 Author: Nick Burch <n...@gagravarr.org> Authored: Wed Mar 23 23:23:15 2016 + Committer: Nick

tika git commit: Magic for Mobipocket Ebook and ESRI Shapefiles from TIKA-1892 from Suman Kashyap

2016-03-06 Thread nick
ree: http://git-wip-us.apache.org/repos/asf/tika/tree/74e71ebd Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/74e71ebd Branch: refs/heads/master Commit: 74e71ebd871172c3473719d0814400f69d4c8913 Parents: f7d3097 Author: Nick Burch <n...@gagravarr.org> Authored: Sun Mar 6 14:42:24 2

[2/2] tika git commit: TIKA-1890 Mime magic for CAB files, and unit tests for detection

2016-03-06 Thread nick
/asf/tika/diff/f7d3097f Branch: refs/heads/master Commit: f7d3097fb6581d989195b51bb2bc4302ad9bf24a Parents: b878281 Author: Nick Burch <n...@gagravarr.org> Authored: Sun Mar 6 14:33:54 2016 + Committer: Nick Burch <n...@gagravarr.org> Committed: Sun Mar 6 14:33:5

tika git commit: Better express the MP4/QuickTime relationship in our mime type hierarchy

2016-03-06 Thread nick
git-wip-us.apache.org/repos/asf/tika/tree/bee1a87d Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/bee1a87d Branch: refs/heads/master Commit: bee1a87d7d9ad3a1c5f45cf65082b9505dbe9fc0 Parents: 1b7009d Author: Nick Burch <n...@gagravarr.org> Authored: Sun Mar 6 13:02:24 2016 + Commit

[2/2] tika git commit: TIKA-1878 - Upgrade Apache SIS to 0.6. This closes #79 from github

2016-03-06 Thread nick
/asf/tika/diff/1b7009d0 Branch: refs/heads/master Commit: 1b7009d0b122b8177ad8cc3aa5f835c6649fbeb1 Parents: 963a916 Author: Nick Burch <n...@gagravarr.org> Authored: Sun Mar 6 12:47:20 2016 + Committer: Nick Burch <n...@gagravarr.org> Committed: Sun Mar 6 12:47:2

[2/4] tika git commit: Merge branch 'TIKA-1877' of https://github.com/prasadns14/tika

2016-03-06 Thread nick
/asf/tika/diff/e1050887 Branch: refs/heads/master Commit: e105088729ef12acbda1fe4e71f462f84d98f981 Parents: 355a7d1 602d237 Author: Nick Burch <n...@gagravarr.org> Authored: Sun Mar 6 12:24:35 2016 + Committer: Nick Burch <n...@gagravarr.org> Committed: Sun Mar 6 12:24:3

[1/4] tika git commit: TIKA-1870 refactor RichTextContentHandler into tika-core from tika-server so users if needing it don't need to depend upon tika-server

2016-02-25 Thread nick
Repository: tika Updated Branches: refs/heads/master 0c030081b -> ed762b702 TIKA-1870 refactor RichTextContentHandler into tika-core from tika-server so users if needing it don't need to depend upon tika-server Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit:

svn commit: r1731229 - in /tika/site/publish: 1.12/formats.html 1.13/ 1.13/examples.html 1.13/formats.html

2016-02-19 Thread nick
Author: nick Date: Fri Feb 19 11:51:32 2016 New Revision: 1731229 URL: http://svn.apache.org/viewvc?rev=1731229&view=rev Log: Publish r1731228 changes Added: tika/site/publish/1.13/ tika/site/publish/1.13/examples.html tika/site/publish/1.13/formats.html Modified: tika/site/publish

svn commit: r1731228 - in /tika/site/src/site/apt: 1.12/formats.apt 1.13/ 1.13/formats.apt

2016-02-19 Thread nick
Author: nick Date: Fri Feb 19 11:50:48 2016 New Revision: 1731228 URL: http://svn.apache.org/viewvc?rev=1731228&view=rev Log: List the details of the parsers included in 1.12, and start the 1.13 pages which need updates during the development phase Added: tika/site/src/site/apt/1.13

svn commit: r1731227 - in /tika/site: publish/1.12/formats.html src/site/apt/1.12/formats.apt

2016-02-19 Thread nick
Author: nick Date: Fri Feb 19 11:48:42 2016 New Revision: 1731227 URL: http://svn.apache.org/viewvc?rev=1731227&view=rev Log: Add something on new parsers for 1.12 Modified: tika/site/publish/1.12/formats.html tika/site/src/site/apt/1.12/formats.apt Modified: tika/site/publish/1.12

tika git commit: Briefly describe the parser, and link to the wiki for more details

2016-02-19 Thread nick
wip-us.apache.org/repos/asf/tika/tree/28b9a666 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/28b9a666 Branch: refs/heads/master Commit: 28b9a6667c6618f9f326f715588515aaa3f6eb89 Parents: fc801d1 Author: Nick Burch <n...@gagravarr.org> Authored: Fri Feb 19 11:45:49 2016 + Committer: N

tika git commit: TIKA-1856 Upgrade the Ogg dependency for the truncated files fix

2016-02-17 Thread nick
wip-us.apache.org/repos/asf/tika/tree/2eb49a72 Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/2eb49a72 Branch: refs/heads/master Commit: 2eb49a721b77edf23c3588326c8d480332d79722 Parents: 542bebc Author: Nick Burch <n...@gagravarr.org> Authored: Wed Feb 17 18:50:19 2016 + Committer: N

[3/4] tika git commit: Test PKCS7 Signature files produced by CADES, from Alessandro De Angelis TIKA-1821

2016-02-03 Thread nick
-us.apache.org/repos/asf/tika/diff/57ae2c5e Branch: refs/heads/master Commit: 57ae2c5e0be62fbdc3321ebd89f37c81af190ad1 Parents: 6ac99bf Author: Nick Burch <n...@gagravarr.org> Authored: Wed Feb 3 13:38:10 2016 + Committer: Nick Burch <n...@gagravarr.org> Committed: Wed Feb 3 13:38:1

[1/4] tika git commit: Test PKCS7 Signature files produced by CADES, from Alessandro De Angelis TIKA-1821

2016-02-03 Thread nick
Repository: tika Updated Branches: refs/heads/master 6ac99bf41 -> 046e43f81 http://git-wip-us.apache.org/repos/asf/tika/blob/57ae2c5e/tika-parsers/src/test/resources/test-documents/testPKCS17Sig.xml.p7m -- diff --git
