[
https://issues.apache.org/jira/browse/TIKA-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mujahid Ateeb Khan updated TIKA-2365:
-
Description: I'm working with Birt and tika, birt uses a jar called
org.apache.batik.pdf
[
https://issues.apache.org/jira/browse/TIKA-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013488#comment-16013488
]
Mujahid Ateeb Khan edited comment on TIKA-2362 at 5/17/17 4:17 AM:
---
Yes I
[
https://issues.apache.org/jira/browse/TIKA-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013488#comment-16013488
]
Mujahid Ateeb Khan commented on TIKA-2362:
--
Yes I tried that method using XHTML handler but some
[
https://issues.apache.org/jira/browse/TIKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013472#comment-16013472
]
Thamme Gowda commented on TIKA-2360:
Sorry, I am late to the discussion.
1. (y) to turn it OFF. I had
[
https://issues.apache.org/jira/browse/TIKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013087#comment-16013087
]
Tim Allison commented on TIKA-2360:
---
> Thanks Tim, appreciate it.
Of course! I'm sorry for moving out on
[
https://issues.apache.org/jira/browse/TIKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013042#comment-16013042
]
Chris A. Mattmann commented on TIKA-2360:
-
Thanks Tim, appreciate it. I think at the end of the day
[
https://issues.apache.org/jira/browse/TIKA-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013002#comment-16013002
]
Hudson commented on TIKA-2367:
--
FAILURE: Integrated in Jenkins build Tika-trunk #1268 (See
I reran the eval with some updates, including rc1 of PDFBox 2.0.6, which is now
integrated.
http://162.242.228.174/reports/reports_tika_20170515.tar.gz
I need to do some more digging on attachments -- hit max limit. The decrease
in attachments from the few docs I reviewed is explained by
[
https://issues.apache.org/jira/browse/TIKA-2367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2367.
---
Resolution: Fixed
Fix Version/s: 1.15
> Avoid npe in wmf
>
>
>
[
https://issues.apache.org/jira/browse/TIKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison reopened TIKA-2360:
---
Reopening to discuss
> Handle SentimentParser resource failure more robustly
>
[
https://issues.apache.org/jira/browse/TIKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012940#comment-16012940
]
Tim Allison commented on TIKA-2360:
---
Doh. Sorry. Should I revert anything?
> Handle SentimentParser
[
https://issues.apache.org/jira/browse/TIKA-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2364.
---
Resolution: Fixed
Left a few in CLIs and in tika-core
> Clean up printstacktrace
>
[
https://issues.apache.org/jira/browse/TIKA-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2364:
--
Fix Version/s: 1.15
> Clean up printstacktrace
>
>
> Key:
[
https://issues.apache.org/jira/browse/TIKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012930#comment-16012930
]
Chris A. Mattmann commented on TIKA-2360:
-
Tim, I didn't get a chance at all to comment on this
Tim Allison created TIKA-2367:
-
Summary: Avoid npe in wmf
Key: TIKA-2367
URL: https://issues.apache.org/jira/browse/TIKA-2367
Project: Tika
Issue Type: Improvement
Reporter: Tim
Quick slide on camel-tika.
https://docs.google.com/presentation/d/1OUORiDwB4d0FkLZ0HIlQDLE30vvTniawdyzhQmLj1xE/edit?usp=sharing
On 5/16/2017 10:31 AM, Nick Burch wrote:
> On Tue, 16 May 2017, Eric Pugh wrote:
>> It was great to read through
>>
On Tue, 16 May 2017, Eric Pugh wrote:
It was great to read through
http://events.linuxfoundation.org/sites/events/files/slides/WhatsNewWithApacheTika_1.pdf…
Wow there is a lot in Tika.
And I think that might be the one challenge with the talk structure,
there is SOO much information.
The
Nick,
It was great to read through
http://events.linuxfoundation.org/sites/events/files/slides/WhatsNewWithApacheTika_1.pdf…
Wow there is a lot in Tika.
And I think that might be the one challenge with the talk structure, there is
SOO much information.
I think I’d like to see “How does
[
https://issues.apache.org/jira/browse/TIKA-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012555#comment-16012555
]
Hudson commented on TIKA-2364:
--
SUCCESS: Integrated in Jenkins build Tika-trunk #1267 (See
[
https://issues.apache.org/jira/browse/TIKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012557#comment-16012557
]
Hudson commented on TIKA-2360:
--
SUCCESS: Integrated in Jenkins build Tika-trunk #1267 (See
Nick,
Here are some pointers:
1. Image recognition using Tensorflow:
https://wiki.apache.org/tika/TikaAndVision; Link to Paper:
https://memex.jpl.nasa.gov/MFSEC17.pdf
2. Image Recognition using Deeplearning4j -
https://wiki.apache.org/tika/TikaAndVisionDL4J
3. Sentiment Analysis using OpenNLP:
Zachary Lee Jones created TIKA-2366:
---
Summary: Add image cropping functionality to TesseractOCRParser
Key: TIKA-2366
URL: https://issues.apache.org/jira/browse/TIKA-2366
Project: Tika
Yep, literally take a look at the Tika wiki – there are examples aplenty and even
screenshots. Further, if you look at the MEMEX site under our new publications
section, there are a few examples (like the ICMR paper on forensics) that show it
in action.
IIRC, image and video labeling basic support was added (Chris & Thamme
could you elaborate on that, please), TSD (TIKA-2309, time stamped data
envelope format) support, slf4j migration (ongoing on 2.x branch).
Tue, 16 May 2017 at 16:06, Allison, Timothy B. :
> Doh! Sorry
Doh! Sorry for the delay...might add configuration of EncodingDetectors, but
that's probably too far into the weeds?
-----Original Message-----
From: Nick Burch [mailto:n...@apache.org]
Sent: Sunday, May 14, 2017 11:34 AM
To: dev@tika.apache.org
Subject: Tika talk next week - help needed!
Hi
[
https://issues.apache.org/jira/browse/TIKA-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mujahid Ateeb Khan updated TIKA-2365:
-
Description: I'm working with Birt and tika birt uses a jar called
org.apache.batik.pdf it
Mujahid Ateeb Khan created TIKA-2365:
Summary: Signer's Information doesn't match issue
Key: TIKA-2365
URL: https://issues.apache.org/jira/browse/TIKA-2365
Project: Tika
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TIKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012279#comment-16012279
]
Hudson commented on TIKA-2360:
--
UNSTABLE: Integrated in Jenkins build Tika-trunk #1266 (See
[
https://issues.apache.org/jira/browse/TIKA-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012278#comment-16012278
]
Hudson commented on TIKA-2363:
--
UNSTABLE: Integrated in Jenkins build Tika-trunk #1266 (See
[
https://issues.apache.org/jira/browse/TIKA-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012245#comment-16012245
]
Thejan Wijesinghe commented on TIKA-2362:
-
Can't we use regular expressions to detect headers &
[
https://issues.apache.org/jira/browse/TIKA-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012244#comment-16012244
]
Mujahid Ateeb Khan commented on TIKA-2362:
--
Is there any alternate way to skip headers and footers
Tim Allison created TIKA-2364:
-
Summary: Clean up printstacktrace
Key: TIKA-2364
URL: https://issues.apache.org/jira/browse/TIKA-2364
Project: Tika
Issue Type: Improvement
Reporter:
[
https://issues.apache.org/jira/browse/TIKA-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012238#comment-16012238
]
Tim Allison commented on TIKA-2362:
---
There isn't, and it shouldn't be hard to add. Prob won't make it
[
https://issues.apache.org/jira/browse/TIKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012235#comment-16012235
]
Tim Allison edited comment on TIKA-2360 at 5/16/17 12:20 PM:
-
I removed the
[
https://issues.apache.org/jira/browse/TIKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2360:
--
Fix Version/s: 1.15
> Handle SentimentParser resource failure more robustly
>
[
https://issues.apache.org/jira/browse/TIKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2360.
---
Resolution: Fixed
I removed the SentimentParser from SPI, removed glob detection for .sent, and
made
Tim Allison created TIKA-2363:
-
Summary: Skip image recognition test if network call fails
Key: TIKA-2363
URL: https://issues.apache.org/jira/browse/TIKA-2363
Project: Tika
Issue Type:
[
https://issues.apache.org/jira/browse/TIKA-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16012202#comment-16012202
]
Hudson commented on TIKA-2361:
--
SUCCESS: Integrated in Jenkins build Tika-trunk #1265 (See
Mujahid Ateeb Khan created TIKA-2362:
Summary: Skipping Header and Footer data from documents
Key: TIKA-2362
URL: https://issues.apache.org/jira/browse/TIKA-2362
Project: Tika
Issue
[
https://issues.apache.org/jira/browse/TIKA-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2361.
---
Resolution: Fixed
Fix Version/s: 1.15
> Upgrade to PDFBox 2.0.6
> ---
>
>
[
https://issues.apache.org/jira/browse/TIKA-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011799#comment-16011799
]
ASF GitHub Bot commented on TIKA-2298:
--
asmehra95 commented on issue #159: Creation of TIKA-2298