[
https://issues.apache.org/jira/browse/TIKA-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17708146#comment-17708146
]
Chris Mattmann commented on TIKA-4009:
--
ugh, one more time, not `geo.topic`, instead `geo/topic`
[
https://issues.apache.org/jira/browse/TIKA-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17708144#comment-17708144
]
Chris Mattmann commented on TIKA-4009:
--
Forgot the config file, fixed in main:
{noformat}
(base
[
https://issues.apache.org/jira/browse/TIKA-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Mattmann resolved TIKA-4009.
--
Resolution: Fixed
Fixed:
{noformat}
(base) mattmann@proscuitto:~/git/tika$ git commit -m
[
https://issues.apache.org/jira/browse/TIKA-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17708070#comment-17708070
]
Chris Mattmann commented on TIKA-4009:
--
OK, I have a patch and commit forthcoming but it's fixed
Chris Mattmann created TIKA-4009:
Summary: GeoTopic Parser package changed incorrectly from
o.a.t.parser.geo from o.a.t.parser.geo.topic
Key: TIKA-4009
URL: https://issues.apache.org/jira/browse/TIKA-4009
[
https://issues.apache.org/jira/browse/TIKA-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Mattmann reassigned TIKA-4009:
Assignee: Chris Mattmann
> GeoTopic Parser package changed incorrectly f
[
https://issues.apache.org/jira/browse/TIKA-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Mattmann updated TIKA-3439:
-
Issue Type: New Feature (was: Bug)
> Create new TensorFlow2 backed Tika NLP doc
[
https://issues.apache.org/jira/browse/TIKA-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Mattmann reassigned TIKA-3439:
Assignee: Chris Mattmann
> Create new TensorFlow2 backed Tika NLP doc
Chris Mattmann created TIKA-3439:
Summary: Create new TensorFlow2 backed Tika NLP docker for
SentimentAnalysis
Key: TIKA-3439
URL: https://issues.apache.org/jira/browse/TIKA-3439
Project: Tika
Hannah, I am pushing your question upstream to the dev@tika list. I think what
you need is for them to look
at your config file, which I’ve pasted again below, and then see if it
looks ok. Then in Tika Python you need
to give it this config file before your server starts up or outside of
[
https://issues.apache.org/jira/browse/TIKA-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17338675#comment-17338675
]
Chris Mattmann commented on TIKA-94:
[~lewismc] congratulations! What an accomplishment!
> Spe
[
https://issues.apache.org/jira/browse/TIKA-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Mattmann resolved TIKA-3329.
--
Resolution: Fixed
Merged into main! Thanks [~thammegowda]!
{noformat}
(base) mattmann
[
https://issues.apache.org/jira/browse/TIKA-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Mattmann updated TIKA-3329:
-
Fix Version/s: 2.0.0
> RTG Translator with many-to-eng translat
[
https://issues.apache.org/jira/browse/TIKA-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Mattmann updated TIKA-3329:
-
Labels: memex (was: )
> RTG Translator with many-to-eng translat
[
https://issues.apache.org/jira/browse/TIKA-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Mattmann reassigned TIKA-3329:
Assignee: Chris Mattmann (was: Thamme Gowda)
> RTG Translator with many-to-
Hi Manish, I think you should ask this one upstream on the Tika Dev lists. I’ve
cc’ed them for you.
From: manish mathur
Date: Monday, March 15, 2021 at 4:41 AM
To:
Subject: Re: Python-tika: issues related to memory consumption
Hi Chris,
I am using python-tika library to
l.com"
Subject: Help in tika-python
Hello Chris Mattmann,
I installed your library, it works perfectly. I wonder if it is possible to find
the position (bounding boxes) of the texts and images in ppt files,
and to discover which page of the slides the texts come from.
Thanks
Nilton
Copying the Tika dev list where I think you will find the help you are looking
for
From: Mariusz G
Date: Wednesday, December 16, 2020 at 7:04 AM
To: "Mattmann, Chris A (US 1740)"
Subject: [EXTERNAL] Tika - problem with Polish encoding
Hello Sir,
I'm writing to you because I
Welcome Peter!
From: Peter Lee
Reply-To:
Date: Wednesday, November 25, 2020 at 6:08 PM
To: "dev@tika.apache.org" , "talli...@apache.org"
Cc: "u...@tika.apache.org"
Subject: Re: [ANNOUNCE] Welcome Peter Lee as Tika PMC member and committer
Many thanks to you, Tim. :)
Hi,
Christian thank you for reaching out. I am copying dev@tika.apache.org as
I think your question is best directed there since tika python is downstream
of the processing that happens there.
Best of luck!
Cheers
Chris
From: Christian Faggionato
Date: Tuesday, November 24, 2020 at
Thanks for reaching out Aditya and for using Tika Python. This issue is
best solved upstream in dev@tika.apache.org so I am copying that list
and making it the reply to.
The issue likely lies in the PDFBox algorithm. There are PDFBox folks on
this list. They can help you. Hopefully there is a
Haha I’m down and supportive!
Time’s TIME FOR 2.x
From: Tim Allison
Reply-To: "dev@tika.apache.org" , "Allison, Tim (US
174B-Affiliate)"
Date: Friday, August 14, 2020 at 6:06 AM
To: ""
Subject: [EXTERNAL] Tika 2.0 modularization
All,
I _think_ I might have some time to
[
https://issues.apache.org/jira/browse/TIKA-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17140963#comment-17140963
]
Chris Mattmann commented on TIKA-3119:
--
[~agibsonccc] can you help see above?
> General upgra
How about just development?
We use that on OODT … though we have a master too that needs to get
removed …
From: Tim Allison
Reply-To: "dev@tika.apache.org" , "Allison, Tim (US
1740-Affiliate)"
Date: Tuesday, June 16, 2020 at 10:31 AM
To: ""
Subject: [EXTERNAL] renaming master?
[
https://issues.apache.org/jira/browse/TIKA-3093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17091708#comment-17091708
]
Chris Mattmann commented on TIKA-3093:
--
yea we have lots of pipelines with OODT and Tika that does
Yes, some of us have been developing an Elastic scaling stack for Tika server…
That does just that with AWS. Don’t have it ready to push upstream yet.
Cheers,
Chris
From: Eric Pugh
Reply-To: "dev@tika.apache.org"
Date: Thursday, April 16, 2020 at 7:09 AM
To: "dev@tika.apache.org"
[
https://issues.apache.org/jira/browse/TIKA-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076659#comment-17076659
]
Chris Mattmann commented on TIKA-2368:
--
I have a TensorFlow version of Sentiment Analysis based
ated.
Cheers,
Oleg
On Wed, Mar 18, 2020 at 4:35 PM Chris Mattmann wrote:
So I was able to get past my issues with Tesseract by reinstalling the
latest version with Brew.
I have a new issue!
I’ve tried in JDK12 and JDK13 to build tika-dl, but it keeps failing:
Date: Wednesday, March 18, 2020 at 2:35 AM
To: "dev@tika.apache.org"
Subject: [EXTERNAL] Re: JDK 12 build issues
Haven’t tried...we should add java 12-14 to Jenkins.
Wait, are we up to 18 yet...
Will look into it...
On Tue, Mar 17, 2020 at 10:07 PM Chris Mattmann wro
Hey Tim et al.,
Do the tests fail for you with Java 12?
[INFO] Running org.apache.tika.parser.pkg.GzipParserTest
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.397 s
- in org.apache.tika.parser.pkg.GzipParserTest
[INFO] Running
Thanks. Please make sure dev@tika.apache.org is where you are addressing
these questions to.
From: Max Franklin
Date: Monday, February 10, 2020 at 10:59 AM
To: Chris Mattmann
Subject: Re: [EXTERNAL] question about Tika
Hi Chris,
The Tika Server seems to work okay for me
Max, does Tika Server work OK for you? Is there a different behavior with Tika
Python than simply posting the PDF to Tika server? Try first and then I am
redirecting
you to the Tika dev list for help.
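A minimal sketch of the "post the PDF straight to Tika Server" check, for comparison against what Tika Python returns. It assumes a Tika Server is already running on its default address, http://localhost:9998; the helper names build_tika_request and extract_text are made up for illustration.

```python
import urllib.request


def build_tika_request(pdf_bytes, server="http://localhost:9998"):
    """Build a PUT request that sends raw document bytes to the /tika endpoint."""
    return urllib.request.Request(
        server + "/tika",
        data=pdf_bytes,
        method="PUT",
        headers={"Accept": "text/plain"},  # ask the server for plain text
    )


def extract_text(pdf_bytes, server="http://localhost:9998"):
    """Send the document and return the extracted text (needs a live server)."""
    with urllib.request.urlopen(build_tika_request(pdf_bytes, server)) as resp:
        return resp.read().decode("utf-8")
```

If the text returned this way matches what Tika Python gives for the same file, the behavior comes from the server side rather than from the Python client.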
Thanks,
Chris
From: Max Franklin
Date: Monday, February 10, 2020 at 9:37 AM
OK can you please post an issue http://issues.apache.org/jira/browse/TIKA and
attach your
document and specific error? Thanks!
From: "Gowda,Sumanth"
Date: Wednesday, January 8, 2020 at 9:36 PM
To: Chris Mattmann
Subject: RE: [EXTERNAL] Regarding unicodeencode Error
T
0>>
>>> >
>>> > And a WIP progress PR is at https://github.com/apache/tika/pull/305
>>> >
&
Thanks for bringing this conversation up Eric.
Historically if you look over the last 5 years, I think what you are asking
below has sort of already become the de facto
truth. Most people are in fact using Tika server, whether they are individual
devs, govvies, commercial folk and the like.
aking
the existing Dockerfile that LogicalSpark has published.
I don’t know how other projects at ASF handle the image publishing.
On Nov 20, 2019, at 7:02 PM, Chris Mattmann wrote:
Nick, TBH, I don’t get it. If we ship the “Dockerfile” we are simply shipping
text file,
code. Under a l
Nick, TBH, I don’t get it. If we ship the “Dockerfile” we are simply shipping
text file,
code. Under a license. If we create a “docker image” and then publish it to the
ASF
hub then I agree with you.
My suggestion and my interpretation of Tim’s is to ship a standard
“Dockerfile”. Do you
+1 ship it
From: Tim Allison
Reply-To: "dev@tika.apache.org" , "Allison, Timothy B (US
1760-Affiliate)"
Date: Wednesday, November 20, 2019 at 9:07 AM
To: ""
Subject: [EXTERNAL] Tika 1.23?
All,
I've abandoned hope of getting the contenthandler factory configuration
stuff into
Hi Aswathi,
Please check with dev@tika.apache.org.
Cheers,
Chris
From: Aswathi Nambiar
Date: Wednesday, November 13, 2019 at 7:39 AM
To: "Mattmann, Chris A (US 1760)"
Subject: [EXTERNAL] How to set the page segmentation for TIKA python
Hi Chris,
I am using Apache
Hi Jay, yes, I believe so. Tika Python is just a thin client to Tika Server and
it
provides this functionality. CC’ing dev@tika
From: Jay Chuk
Date: Tuesday, October 15, 2019 at 3:47 PM
To: "Mattmann, Chris A (US 1761)"
Subject: [EXTERNAL] Extracting font information from xml
Hi
When you do a parse, do this:
from tika import parser
parsed = parser.from_file('/path/to/file', xmlContent=True)
xmlContent = parsed["content"]
print(xmlContent)
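
Once the XHTML string is in hand, it can be walked with the standard library. The following is only a sketch: it assumes the content is well-formed XHTML with one div of class "page" per page, and the sample string is a hypothetical stand-in for parsed["content"]. Whether font details appear in the markup depends on the parser upstream, so this only shows how to get at the elements.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def split_pages(xhtml):
    """Return the whitespace-normalized text of each <div class="page">."""
    root = ET.fromstring(xhtml)
    pages = []
    for div in root.iter(XHTML_NS + "div"):
        if div.get("class") == "page":
            pages.append(" ".join("".join(div.itertext()).split()))
    return pages


# Hypothetical sample standing in for parsed["content"]
sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<div class="page"><p>First page</p></div>'
    '<div class="page"><p>Second page</p></div>'
    '</body></html>'
)
print(split_pages(sample))
```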
G’luck!
Cheers
Chris
From: Jay Chuk
Date: Tuesday, October 15, 2019 at 3:54 PM
To: Chris Mattmann
Cc
I was able to compress the files into a single zip file and extract it. This worked,
but the extracted text was saved in a single file. I need the text to be
saved in individual files so I can use them as input to another program.
What is the best method to go about this?
Thank
Victor, please send your email to dev@tika.apache.org, which I’ve CC’ed…
From: Victor Olaiya
Date: Tuesday, August 6, 2019 at 1:37 PM
To: "Mattmann, Chris A (US 1761)"
Subject: [EXTERNAL] TIKA
Hello Chris,
I am building an information retrieval system and I need Apache Tika to
I’ve also got some new stuff I’m getting ready to contribute, in the following
ML/Deep Learning
areas:
Some Basic models using Tensorflow stable 1.13
CIFAR-10 image classifier using a CNN ~86% accuracy – obviously different
than Inception-v3/v4 and VGG-16 which we currently have available,
Looks good…
From: Oleg Tikhonov
Reply-To: "dev@tika.apache.org"
Date: Tuesday, June 25, 2019 at 7:57 AM
To: "dev@tika.apache.org"
Subject: [EXTERNAL] Re: Tika 1.22?
Would be great!!!
Cheers,
Oleg
On Tue, Jun 25, 2019, 17:45 Tim Allison wrote:
All,
The vote for the
ling to confirm
that my commit/fix is sane, I'd appreciate it. Thank you!!!
Cheers,
Tim
On Wed, May 8, 2019 at 11:32 AM Chris Mattmann
wrote:
Thejan, Thamme any ideas?
From: Tim Allison
Reply-
On Wed, May 8, 2019 at 11:32 AM Chris Mattmann wrote:
Thejan, Thamme any ideas?
From: Tim Allison
Reply-To: "dev@tika.apache.org"
Date: Wednesday, May 8, 2019 at 7:50 AM
To: "dev@tika.apache.org"
Subject: [EXTERNAL] Re: DL4JVGG16NetTes
I will test this out
From: Tim Allison
Reply-To: "dev@tika.apache.org"
Date: Wednesday, May 8, 2019 at 6:58 AM
To: "dev@tika.apache.org"
Subject: [EXTERNAL] DL4JVGG16NetTest failures
All,
Apologies for the broken builds...I'm not able to reproduce this
test failure on my mac
Thejan, Thamme any ideas?
From: Tim Allison
Reply-To: "dev@tika.apache.org"
Date: Wednesday, May 8, 2019 at 7:50 AM
To: "dev@tika.apache.org"
Subject: [EXTERNAL] Re: DL4JVGG16NetTest failures
Any recommendations?
java.lang.IllegalStateException: Number of indices (got 2) must
Hi,
This would be a good question to ask on the dev@tika.a.o list so I’m CC’ing
them.
Cheers,
Chris
From: Djari Imene
Date: Friday, April 26, 2019 at 9:45 AM
To: "Mattmann, Chris A (1761)"
Subject: [EXTERNAL] Tika script
Good evening sir I am writing you to request more
+1 from me!
From: Konstantin Gribov
Reply-To: "dev@tika.apache.org"
Date: Thursday, March 21, 2019 at 10:02 AM
To: "dev@tika.apache.org"
Subject: [EXTERNAL] Wiki migration
Hi, folks
What do you think about starting wiki migration (from moin to confluence)?
I can try it via
Roll forward! Yay!
From: Tim Allison
Reply-To: "dev@tika.apache.org"
Date: Thursday, December 13, 2018 at 7:02 AM
To: "dev@tika.apache.org"
Subject: Re: 1.20?
Reports are here:
http://162.242.228.174/reports/tika_1_20-pre-rc1.zip
I'm going to revert the mp4 parser, and
Love it and I can align tika-python with that too ☺
From: Tim Allison
Reply-To: "dev@tika.apache.org"
Date: Tuesday, November 20, 2018 at 3:04 PM
To: "dev@tika.apache.org"
Subject: 1.20?
All,
POI 4.0.1 will be out shortly with some important bug fixes. What would
you all
+1 from me please update the wiki once you do
From: Tim Allison
Reply-To: "dev@tika.apache.org"
Date: Wednesday, September 26, 2018 at 5:47 AM
To: "dev@tika.apache.org"
Cc: Craig Russell
Subject: Re: ***UNCHECKED*** Fwd: MODERATE for annou...@apache.org
All,
It is ok to
Sounds great!
From: Tim Allison
Reply-To: "dev@tika.apache.org"
Date: Tuesday, September 25, 2018 at 9:40 AM
To: "dev@tika.apache.org"
Subject: Re: 1.19.1?
Given the mp3 issue and some other items, let's go with 1.19.1 rc1
today or tomorrow?
On Mon, Sep 24, 2018 at 3:07 PM Nick
Let’s roll it….
From: Tim Allison
Reply-To: "dev@tika.apache.org"
Date: Wednesday, September 19, 2018 at 12:14 PM
To: "dev@tika.apache.org"
Subject: 1.19.1?
The mp3 regression is bad. In hindsight, the Tika-eval reports were fairly
clear on this but I did some self-hand-waving to
From: KamilD
Date: Tuesday, July 31, 2018 at 11:37 PM
To: "dev-ow...@tika.apache.org"
Subject: Tika DjVu?
Hello,
I'm trying to use Tika for DjVu but there is a problem.
When using app version 1.14 I get an empty result, but in version 1.18 I get:
C:\Users\>java -jar
ach is REST + Docker? The upkeep in tika-dl
is nontrivial.
On Fri, Jul 6, 2018 at 6:15 PM Chris Mattmann wrote:
Tim,
Thanks. There are multiple modes of integrating deep learning with Tika:
The original mode: uses Thamme’s work on REST exposing Tensorflow
and Docker to pr
Tim,
Thanks. There are multiple modes of integrating deep learning with Tika:
The original mode: uses Thamme’s work on REST exposing Tensorflow
and Docker to provide a REST Service to Tika to allow for running Tensorflow
DL models. We initially did Inception_v3, and a model by Madhav Sharan
Once tika-dl works again with Inception v4, I’m good ☺
I’m working on adding some more models to tika-dl and other things
but those can come after 1.19.
Cheers,
Chris
From: Tim Allison
Reply-To: "dev@tika.apache.org"
Date: Friday, July 6, 2018 at 8:40 AM
To:
ctly on my
Windows and Linux setups.
Cheers,
Dave
On Thu, 24 May 2018, 17:09 Chris Mattmann, <mattm...@apache.org> wrote:
Tim,
Are you seeing this?
Results :
Failed tests:
PDFParserTest.testEmbeddedDocsWithOCROnly:1250->TikaTest.assertConta
Tim,
Are you seeing this?
Results :
Failed tests:
PDFParserTest.testEmbeddedDocsWithOCROnly:1250->TikaTest.assertContains:103
pdf_haystack not found in:
http://www.w3.org/1999/xhtml;>
Welcome to Thejan Wijesinghe who has joined as a new Tika PMC member and
committer!
Please say a bit about yourself…thanks!
Cheers,
Chris
Awesomeness
From: "Allison, Timothy B."
Reply-To: "dev@tika.apache.org"
Date: Friday, April 6, 2018 at 11:30 AM
To: "dev@tika.apache.org"
Subject: rfc822 updates and 1.18
All,
I made two updates to our handling of
+1
From: Nick Burch
Reply-To: "dev@tika.apache.org"
Date: Wednesday, March 28, 2018 at 8:01 AM
To: "dev@tika.apache.org"
Subject: Re: message/news; charset=windows-1252 -> message/rfc822
On Wed, 28 Mar 2018, Allison,
Hey Folks,
Just found this R-Tika API binding:
https://ropensci.github.io/rtika/articles/rtika_introduction.html
Very cool! Updated the wiki with it.
Cheers,
Chris
Completely agree, awesome job Nick.
I will definitely try this week as well.
Thank you!
Sincerely,
Chris
On 3/18/18, 2:47 PM, "David Meikle" wrote:
Nice one Nick! Will take a look this week.
Cheers,
Dave
On 14 March 2018 at 17:38, Nick Burch
Sounds good to me thanks Tim. Happy to line it up with PDF Box 2.0.9
On 3/7/18, 1:16 PM, "Allison, Timothy B." wrote:
All,
I think I've made the updates that I wanted to make sure got in to 1.18.
It looks like PDFBox is going to start their release cycle
Same: makes perfect sense to me and let's do it. (I finally updated Tika
Python downstream to be based on Tika 1.16; I guess I should get it based on
1.17 soon too:
https://github.com/chrismattmann/tika-python/blob/master/tika/__init__.py#L17)
Cheers,
Chris
On 3/1/18, 5:16 AM,
No clue - Radhia - perhaps you can enlighten everyone..?
On 2/23/18, 6:45 AM, "Allison, Timothy B." <talli...@mitre.org> wrote:
Um, no, that's not great. What's wrong with our current version?
-Original Message-
From: Chris Mattmann [mailto:mat
Great to hear!
From: radhia bezzine <bezzinerad...@gmail.com>
Date: Thursday, February 22, 2018 at 12:28 PM
To: Chris Mattmann <mattm...@apache.org>
Subject: Re: RE : Re: Issue with apache Tika
Hi Chris !
I fixed the issue ! it was not so complicated ! a proble
Try UTF-8 encoding the URLs or the parameters themselves. If you are using
Tika-Python, then use the Python
encode library…
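
A minimal sketch of that suggestion using Python's standard urllib.parse module, which percent-encodes as UTF-8 by default. The helper names encode_path and encode_params are made up for illustration and are not part of Tika-Python.

```python
from urllib.parse import quote, urlencode


def encode_path(path):
    """Percent-encode a URL path as UTF-8, leaving '/' separators intact."""
    return quote(path, safe="/")


def encode_params(params):
    """Percent-encode a dict of query parameters as UTF-8."""
    return urlencode(params)


print(encode_path("/docs/café.pdf"))
print(encode_params({"name": "café"}))
```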
Cheers,
Chris
From: radhia bezzine
Date: Thursday, February 22, 2018 at 6:03 AM
To: "Mattmann, Chris A (1761)"
Added! https://wiki.apache.org/tika/ContributorsGroup
Feel free to edit the page
From: Prerana Teligi Harapanahalli Math
Date: Thursday, February 15, 2018 at 8:35 PM
To: "dev@tika.apache.org" , "Mattmann, Chris A (1761)"
eate an optional setinputstreamfactory() method in TikaInputStream, so the
user can implement an InputStreamFactory interface with a getInputStream
method, if he does not want to pay a performance hit with temp files for
everything.
Luis
Em 5 de fev de 2018 4:52 PM, "C
induce
overhead, but as a start, why not?
In short just run through the stream 2x
++++++
Chris Mattmann, Ph.D.
Associate Chief Technology and Innovation Officer, OCIO Manager, Advanced
IT Research and Open
2 Jan 2018, Nick Burch wrote:
> On Thu, 26 Oct 2017, Chris Mattmann wrote:
>> On collision, the precedence order defines what key takes precedence and
>> _overwrites_ the other. Overwrite is but one option (you could save
*all*
>> the values it’s a multi-val
to OSSRH and synced
On 2/5/18, 9:01 AM, "Chris Mattmann" <mattm...@apache.org> wrote:
Hmmm...the problem here is that Sonatype won't let us publish to Central
with
the below. It's not even an ASF policy thing - it's a Sonatype thing
On 2/5/18, 5:55 AM, &qu
Hmmm...the problem here is that Sonatype won't let us publish to Central with
the below. It's not even an ASF policy thing - it's a Sonatype thing
On 2/5/18, 5:55 AM, "Allison, Timothy B." wrote:
Sorry for the duplication, but I wanted to check on this and didn't want
c1 and two repos in nexus?!
Do we expect only the src to be in nexus, not the jar artifacts (with sigs
and digests) for app, server, eval?
-Original Message-
From: Chris Mattmann [mailto:mattm...@apache.org]
Sent: Friday, December 8, 2017 5:07 PM
To: dev
Hey Tim, probably just upload errors on the first one and so it tried again. No
worries. Drop and close
the first, and just use the 2nd.
Cheers,
Chris
On 12/8/17, 12:05 PM, "Allison, Timothy B." wrote:
Not sure what happened, but two repos were created in Nexus:
eers,
> Dave
>
>
>
> On 3 November 2017 at 15:19, Mattmann, Chris A (3010) <
> chris.a.mattm...@jpl.nasa.gov> wrote:
>
> > Let’s make it so (
> >
> >
> +
On Thu, 26 Oct 2017, Chris Mattmann wrote:
> My general approach to conflicting metadata is simply to define
> precedence orders.
>
> For example here is one documented from OODT:
>
>
https://cwiki.apache.org/confluence/display/OODT/Understa
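
The precedence-order idea can be sketched in a few lines. This is an illustration only, not OODT's or Tika's implementation; the source names and the functions merge_metadata and merge_keep_all are hypothetical.

```python
def merge_metadata(sources, precedence):
    """Merge per-source metadata; earlier names in `precedence` win on collision."""
    merged = {}
    # Apply the lowest-precedence source first so higher ones overwrite it.
    for name in reversed(precedence):
        merged.update(sources.get(name, {}))
    return merged


def merge_keep_all(sources, precedence):
    """Alternative to overwriting: keep every value, ordered by precedence."""
    merged = {}
    for name in precedence:
        for key, value in sources.get(name, {}).items():
            merged.setdefault(key, []).append(value)
    return merged


# Hypothetical metadata from two extraction sources
sources = {
    "pdf": {"title": "A", "author": "X"},
    "xmp": {"title": "B"},
}
print(merge_metadata(sources, ["xmp", "pdf"]))
print(merge_keep_all(sources, ["xmp", "pdf"]))
```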
maybe in tika-config.xml
would be a fine
start.
On 10/26/17, 9:14 AM, "Nick Burch" <apa...@gagravarr.org> wrote:
On Thu, 26 Oct 2017, Chris Mattmann wrote:
> Why don’t we just store N copies of the stream, and parse it twice?
I'm not sure that's the chal
Why don’t we just store N copies of the stream, and parse it twice?
Of course that’s the ugly way, but currently the way I’ve hacked this in all of
my projects is simply to call Tika N times OUTSIDE of Tika. Why don’t we just
use
that as the weakest baseline and work backwards from there?
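
As a rough sketch of that baseline: buffer the stream once, then hand each parser its own fresh copy. The parser callables here are hypothetical stand-ins for separate calls into Tika, and the name parse_n_times is made up for illustration.

```python
import io


def parse_n_times(stream, parsers):
    """Read the stream once, then give each parser an independent copy."""
    data = stream.read()  # the whole document is held in memory
    return [parser(io.BytesIO(data)) for parser in parsers]


# Hypothetical parsers standing in for separate Tika invocations
def to_upper(s):
    return s.read().decode("utf-8").upper()


def byte_count(s):
    return len(s.read())


print(parse_n_times(io.BytesIO(b"hello"), [to_upper, byte_count]))
```

The cost is memory proportional to the document size, which is the overhead the thread is weighing against temp files.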
Chris
This makes sense to me, +1 Giuseppe!
On 10/24/17, 6:12 PM, "Giuseppe Totaro" wrote:
Hi folks,
I am developing the proposed solutions within tika-server for enabling
specific ContentHandlers. Basically, I am working to provide the ability of
giving
I saw this Tyler, and it’s awesome. I forked it already though I’m not a Go
programmer thank you
for increasing the community here (
CC’ing Jim Jag who I know has done some Go programming, Jim spread the word ;)
Cheers,
Chris
On 10/6/17, 10:12 AM, "Tyler Bui-Palsulich"
ssing TikaConfig is needed
anyway, having a way to specify a handler there can be handy too...
Cheers, Sergey
On 28/09/17 22:17, Chris Mattmann wrote:
> I am +1 for this. Option #2 sounds like a slick way to handle this for me
that would
> remain back compat
I am +1 for this. Option #2 sounds like a slick way to handle this for me that
would
remain back compat with tika-python which is of strong interest to me.
Cheers,
Chris
On 9/28/17, 1:35 PM, "Giuseppe Totaro" wrote:
Hi folks,
if I am not wrong, currently
[dropping Beam on this]
Tim, another thing is that you can finally download the TREC-DD Polar data
either
from the NSF Arctic Data Center (70GB zip), or from Amazon S3, as described
here:
http://github.com/chrismattmann/trec-dd-polar/
In case we want to use as part of our regression.
Hi all,
One other thing is that Tika extracts metadata, and language information in
which order
doesn’t matter (Keys can be out of order).
Would this be useful?
Cheers,
Chris
On 9/21/17, 2:10 PM, "Sergey Beryozkin" wrote:
Hi Eugene
Thank you, very
te a new
> instance of TikaIO pipeline, and point it to the new temp folder where a
> new batch of files has been dropped to.
>
> Thanks, Sergey
> On 11/09/17 22:41, Mattmann, Chris A (3010) wrote:
>> Amazing work, thank you Sergey!!
>>
&
ranch is so I defer to Tim on the risk of going with #1.
- Bob
On 9/11/2017 5:15 PM, Chris Mattmann wrote:
> +1000
>
>
>
> On 9/11/17, 12:03 PM, "Allison, Timothy B." <talli...@mitre.org> wrote:
>
> Y, wel
+1000
On 9/11/17, 12:03 PM, "Allison, Timothy B." wrote:
Y, well, I didn't say _which_ September...
Given my limited availability to work on this in Sept and POI's decision to
move to Java 1.8, I propose releasing Tika 1.17 after the release of POI 3.17
and
Welcome Madhav!
Cheers,
Chris
On 8/31/17, 12:29 PM, "loo...@gmail.com on behalf of Dave Meikle"
wrote:
Hello Everyone,
Please join me in welcoming Madhav Sharan as a PMC Member and Committer to
the project!
From: Deepanshu Bhardwaj
Date: Tuesday, August 8, 2017 at 2:53 AM
To: "dev-ow...@tika.apache.org"
Subject: Query related to Apache Tika dependencies
Hi Team,
I need one help. I need to know the list of libraries
+1 from me SIGS and CHECKSUMS look good.
Thanks Tim!
Cheers,
Chris
LMC-053601:apache-tika-1.16-rc1 mattmann$ for type in "" \-app \-eval \-server;
do $HOME/bin/stage_apache_rc tika$type 1.16
https://dist.apache.org/repos/dist/dev/tika/; done
% Total% Received % Xferd Average Speed
thy B." <talli...@mitre.org> wrote:
Thank you, Chris!
Now, how do I bulk move open 1.16->1.17 on JIRA?
-Original Message-----
From: Chris Mattmann [mailto:mattm...@apache.org]
Sent: Friday, July 7, 2017 11:39 AM
To: dev@tika.apache.org
Sure
On 7/7/17, 7:57 AM, "Allison, Timothy B." <talli...@mitre.org> wrote:
I'll leave the moving to a new module to you?
-Original Message-
From: Chris Mattmann [mailto:mattm...@apache.org]
Sent: Friday, July 7, 2017 10:32 AM
To: dev@tika.
Great Tim thanks!
On 7/7/17, 7:28 AM, "talli...@apache.org" wrote:
This is an automated email from the ASF dual-hosted git repository.
tallison pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/tika.git
The
10) [mailto:chris.a.mattm...@jpl.nasa.gov]
> Sent: Monday, July 3, 2017 2:24 PM
> To: dev@tika.apache.org
> Subject: Re: Tika 1.15.1? -> 1.16
>
> Hey Tim, if I don’t get it done by today, push 1.16 and we’ll put Age
> Detection in 1.17.
>
> +