Yes, there is a big reason: with tika-dl you don’t need an external
server running. And of course you can statically analyze the code (with the
other solution you have to mix languages to do that), etc.
So yes, we should keep them both…
From: Tim
This is very helpful. Thank you! Is there any use in having the tika-dl
module if our more modern approach is REST + Docker? The upkeep in tika-dl
is nontrivial.
On Fri, Jul 6, 2018 at 6:15 PM Chris Mattmann wrote:
> Tim,
>
> Thanks. There are multiple modes of integrating deep learning
Tim,
Thanks. There are multiple modes of integrating deep learning with Tika:
The original mode: uses Thamme’s work on exposing TensorFlow over REST,
with Docker, to provide a REST service that lets Tika run TensorFlow
DL models. We initially did Inception_v3, and a model by Madhav Sharan
[
https://issues.apache.org/jira/browse/TIKA-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2680:
--
Attachment: main_email_in_outlook.jpg
> Email attachments to an email are not extracted
>
[
https://issues.apache.org/jira/browse/TIKA-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535412#comment-16535412
]
Tim Allison commented on TIKA-2680:
---
Given that Outlook appears to treat this as an attachment, are you
[
https://issues.apache.org/jira/browse/TIKA-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2680:
--
Attachment: (was: main_email_in_outlook.jpg)
> Email attachments to an email are not extracted
>
[
https://issues.apache.org/jira/browse/TIKA-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2680:
--
Attachment: main_email_in_outlook.jpg
> Email attachments to an email are not extracted
>
On Twitter, Chris, Thamme, Thejan, and I are working with some
deeplearning4j devs to help us upgrade to deeplearning4j 1.0.0-BETA
(TIKA-2672).
I initially requested help from Thejan (and Thamme :D) for this because we
were getting an initialization exception after the upgrade in tika-dl's
[
https://issues.apache.org/jira/browse/TIKA-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535351#comment-16535351
]
Yury Kats edited comment on TIKA-2680 at 7/6/18 9:07 PM:
-
Indeed, the first
[
https://issues.apache.org/jira/browse/TIKA-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535351#comment-16535351
]
Yury Kats commented on TIKA-2680:
-
Indeed, the first embedded rfc822 is not an attachment. I believe this
[
https://issues.apache.org/jira/browse/TIKA-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535339#comment-16535339
]
Tim Allison commented on TIKA-2680:
---
Something like this?
{noformat}
multipart/mixed (uses
[
https://issues.apache.org/jira/browse/TIKA-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535296#comment-16535296
]
Yury Kats commented on TIKA-2685:
-
Yes, correct, this is governed by RFC 3462, sorry I didn't mention this
[
https://issues.apache.org/jira/browse/TIKA-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535288#comment-16535288
]
Tim Allison commented on TIKA-2685:
---
https://tools.ietf.org/html/rfc3462 page 2 describes exactly
[
https://issues.apache.org/jira/browse/TIKA-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535271#comment-16535271
]
Yury Kats edited comment on TIKA-2685 at 7/6/18 8:03 PM:
-
delivery-status and
[
https://issues.apache.org/jira/browse/TIKA-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535275#comment-16535275
]
Tim Allison commented on TIKA-2685:
---
I think I agree...the first rfc822 (multipart/report) has three
[
https://issues.apache.org/jira/browse/TIKA-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535271#comment-16535271
]
Yury Kats edited comment on TIKA-2685 at 7/6/18 8:02 PM:
-
delivery-status and
[
https://issues.apache.org/jira/browse/TIKA-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535267#comment-16535267
]
Tim Allison edited comment on TIKA-2685 at 7/6/18 8:00 PM:
---
Is this your
[
https://issues.apache.org/jira/browse/TIKA-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535271#comment-16535271
]
Yury Kats commented on TIKA-2685:
-
delivery-status and message/rfc822 are inside multipart/report
> Email
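For readers following the thread, here is a minimal sketch of the layout being described: a multipart/report whose immediate parts are a human-readable notice, a message/delivery-status block, and the original mail as message/rfc822. It uses Python's standard `email` package, not Tika or mime4j, and the addresses, boundary, and message text are made up for illustration only.

```python
from email import message_from_string, policy

# Hypothetical bounce message; every address and value here is invented.
RAW = """\
From: mailer-daemon@example.org
To: sender@example.org
Subject: Undeliverable: hello
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status; boundary="b1"

--b1
Content-Type: text/plain

Your message could not be delivered.
--b1
Content-Type: message/delivery-status

Reporting-MTA: dns; mail.example.org

Final-Recipient: rfc822; nobody@example.org
Action: failed
Status: 5.1.1

--b1
Content-Type: message/rfc822

From: sender@example.org
To: nobody@example.org
Subject: hello
MIME-Version: 1.0
Content-Type: text/plain

Original body.
--b1--
"""


def parse(raw):
    """Parse raw message text into an EmailMessage tree."""
    return message_from_string(raw, policy=policy.default)


def top_level_types(raw):
    """Content types of the immediate children of the multipart/report."""
    return [part.get_content_type() for part in parse(raw).iter_parts()]


def embedded_subject(raw):
    """Subject of the mail embedded in the message/rfc822 part, if any."""
    for part in parse(raw).iter_parts():
        if part.get_content_type() == "message/rfc822":
            # A message/rfc822 part carries exactly one nested message.
            return part.get_payload(0)["Subject"]
    return None


if __name__ == "__main__":
    print(top_level_types(RAW))
    print(embedded_subject(RAW))
```

The point of the sketch is only that the embedded mail sits one level down, inside the report, rather than being a sibling attachment of the outer message.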
[
https://issues.apache.org/jira/browse/TIKA-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535267#comment-16535267
]
Tim Allison commented on TIKA-2685:
---
Is this your understanding of the structure?
{noformat}
[
https://issues.apache.org/jira/browse/TIKA-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535147#comment-16535147
]
Hudson commented on TIKA-2673:
--
SUCCESS: Integrated in Jenkins build tika-branch-1x #56 (See
[
https://issues.apache.org/jira/browse/TIKA-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535145#comment-16535145
]
Hudson commented on TIKA-2673:
--
SUCCESS: Integrated in Jenkins build Tika-trunk #1517 (See
[
https://issues.apache.org/jira/browse/TIKA-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535135#comment-16535135
]
Yury Kats commented on TIKA-2685:
-
For my own immediate needs, I modified MimeStreamParser to call
[
https://issues.apache.org/jira/browse/TIKA-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535130#comment-16535130
]
Hudson commented on TIKA-2673:
--
UNSTABLE: Integrated in Jenkins build tika-2.x-windows #282 (See
[
https://issues.apache.org/jira/browse/TIKA-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535100#comment-16535100
]
Tim Allison commented on TIKA-2685:
---
[~yurykats], thank you for identifying this problem and TIKA-2680
[
https://issues.apache.org/jira/browse/TIKA-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison reassigned TIKA-2685:
-
Assignee: Tim Allison
> Email attached to an undeliverable email report are not extracted
>
[
https://issues.apache.org/jira/browse/TIKA-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16535041#comment-16535041
]
Tim Allison commented on TIKA-2673:
---
I've added this to both 'master' and 'branch_1x'. Let me know if
Once tika-dl works again with Inception v4, I’m good ☺
I’m working on adding some more models to tika-dl and other things
but those can come after 1.19.
Cheers,
Chris
From: Tim Allison
Reply-To: "dev@tika.apache.org"
Date: Friday, July 6, 2018 at 8:40 AM
To:
All,
We've made quite a few improvements. What would you think of starting the
release process in a couple of weeks...say, July 23ish?
I'd like to complete the dl4j upgrade and update some of our dependencies
so that we can at least build with Java 11.
Any blockers or other things people
[
https://issues.apache.org/jira/browse/TIKA-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534994#comment-16534994
]
Tim Allison commented on TIKA-2672:
---
Fantastic! Thank you [~ThejanWijesinghe]!
> Upgrade dl4j to
[
https://issues.apache.org/jira/browse/TIKA-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534994#comment-16534994
]
Tim Allison edited comment on TIKA-2672 at 7/6/18 3:30 PM:
---
Fantastic! Thank
[
https://issues.apache.org/jira/browse/TIKA-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534992#comment-16534992
]
Tim Allison commented on TIKA-2673:
---
[~gbouchar], thank you for contributing this! I won't have time to
[
https://issues.apache.org/jira/browse/TIKA-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534990#comment-16534990
]
Hudson commented on TIKA-2675:
--
SUCCESS: Integrated in Jenkins build tika-branch-1x #55 (See
[
https://issues.apache.org/jira/browse/TIKA-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534984#comment-16534984
]
Chris A. Mattmann commented on TIKA-2672:
-
GREAT WORK [~ThejanWijesinghe] thanks my guy
> Upgrade
[
https://issues.apache.org/jira/browse/TIKA-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534966#comment-16534966
]
Thejan Wijesinghe commented on TIKA-2672:
-
[~talli...@apache.org] sorry for the delay, so the dl4j
[
https://issues.apache.org/jira/browse/TIKA-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534858#comment-16534858
]
Hudson commented on TIKA-2675:
--
SUCCESS: Integrated in Jenkins build Tika-trunk #1516 (See
[
https://issues.apache.org/jira/browse/TIKA-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2675.
---
Resolution: Fixed
Fix Version/s: 2.0.0
1.19
Thank you [~wastl-nagel]!
>
[
https://issues.apache.org/jira/browse/TIKA-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534789#comment-16534789
]
ASF GitHub Bot commented on TIKA-2675:
--
tballison closed pull request #240: TIKA-2675
[
https://issues.apache.org/jira/browse/TIKA-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534784#comment-16534784
]
Tim Allison commented on TIKA-874:
--
See TIKA-2684 for how to configure GDAL to parse FITS...many thanks to
[
https://issues.apache.org/jira/browse/TIKA-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2684.
---
Resolution: Not A Problem
Not a Tika problem technically, but definitely an area for us to improve
[
https://issues.apache.org/jira/browse/TIKA-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534780#comment-16534780
]
Tim Allison commented on TIKA-2684:
---
W00t! Thank you [~chrismattmann].
[~sborda], I updated our
[
https://issues.apache.org/jira/browse/TIKA-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16534780#comment-16534780
]
Tim Allison edited comment on TIKA-2684 at 7/6/18 12:46 PM:
W00t! Thank you