+1, positively.
On Mon, Aug 2, 2010 at 8:33 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi Tika community,
Jukka Zitting and I are working on the Tika in Action book [1]. How would
everyone feel about us posting a link to it on the Tika website [2]?
If so, I'll
Hi Ken,
I used Nutch's LanguageProfiler to produce the language profile.
You can find more about this here:
http://www.ibm.com/developerworks/opensource/tutorials/os-apache-tika/authors.html
(It's not self-promotion!)
Download the sources; using the ant task you'll be able to create lang
These are the situations I can think of where you would want to
implement a custom classloader:
1. You need a different hierarchy for loading classes, as in OSGi for instance.
Hollywood principle if you like.
2. When you need to run different versions of classes or jars. For example,
you want to
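A minimal sketch of the second situation, assuming a child-first delegation order. ChildFirstClassLoader is a hypothetical name, not part of the JDK or of Tika; two such loaders, each pointed at a different jar, could serve two versions of the same class side by side.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical child-first loader: it searches its own URLs before asking the
// parent, which is the reverse of the default parent-first delegation.
final class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (name.startsWith("java.")) {
                    // Core classes always come from the parent chain.
                    c = super.loadClass(name, false);
                } else {
                    try {
                        c = findClass(name);              // own jars first
                    } catch (ClassNotFoundException e) {
                        c = super.loadClass(name, false); // then fall back to the parent
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

The java.* guard is a design choice in this sketch: the JVM refuses to define core classes from application loaders, so those are always delegated.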
Why not use:
http://felix.apache.org/site/apache-felix-http-service.html
On Tue, Feb 8, 2011 at 5:06 PM, Chris A. Mattmann (JIRA) j...@apache.org wrote:
[
Hello Tran Nam Quang,
It uses the CHMLIB C library, i.e. JNI. In my experience, it works
on a limited number of OSes. It does not work on Solaris or AIX.
A really good library, with the limitations mentioned above, is
http://sevenzipjbind.sourceforge.net/ and it is also LGPL (I would say, the best
Sami,
Chris and I did that some time ago for a developerWorks tutorial; the
clean code exists, although it may be out of date.
I wonder, is it a good idea to use Nutch code inside Tika? Maybe the Nutch
guys could extend it as an independent module?
On Thu, Apr 14, 2011 at 3:01 PM, Sami Siren (JIRA)
Hi Chris,
I've applied the patch to
tika-parsers/src/main/java/org/apache/tika/parser/chm, and also added 3 CHM
files to tika-parsers\src\test\resources\test-documents along with the tests.
BR,
Oleg
On Sun, Jun 5, 2011 at 1:32 AM, Chris A. Mattmann (JIRA) j...@apache.org wrote:
[
Thank you Chris and Jukka!
I tried to keep the KISS principle, but couldn't.
On Tue, Jun 7, 2011 at 6:49 PM, Chris A. Mattmann (JIRA) j...@apache.org wrote:
[
https://issues.apache.org/jira/browse/TIKA-245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
Chris A.
Hello all,
As you may know, Oracle announced that Java 5 SE has been EOL (End Of Life) since 2009.
However, we are still supporting Java 5 SE. What is the rationale behind
this? Why do we encourage our customers not to upgrade to a more modern
version of Java?
Developing new products/features we cannot
Chris, Nick,
I've attached the patch; I hope it will now work/compile.
BR,
Oleg
On Wed, Jun 8, 2011 at 6:09 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi Nick,
Yep, didn't even catch it on commit. Oleg emailed me offlist (which I asked
him to bring onlist) and caught
Hi Jukka,
no problem at all.
I'll reformat and commit tomorrow then.
BR,
Oleg
On Wed, Jun 8, 2011 at 9:56 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
Hi Oleg,
On Wed, Jun 8, 2011 at 8:20 PM, Oleg Tikhonov o...@apache.org wrote:
I've attached the patch; I hope it will now work/compile
Jukka,
Committed revision 1133955.
BR,
Oleg
On Thu, Jun 9, 2011 at 11:52 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
Hi,
On Thu, Jun 9, 2011 at 4:20 AM, Apache Jenkins Server
jenk...@builds.apache.org wrote:
cause : Too many unapproved licenses: 4
The following files in
Good evening,
Which files fail the RAT scan?
Thanks in advance,
Oleg
On Mon, Jul 18, 2011 at 10:55 PM, Apache Jenkins Server
jenk...@builds.apache.org wrote:
See https://builds.apache.org/job/Tika-trunk/580/changes
Changes:
[kkrugler] Add quick test to validate that
FYI,
On Fri, Jul 29, 2011 at 12:13 AM, Uwe Schindler uschind...@apache.org wrote:
Hello Apache Lucene Apache Solr users,
Hello users of other Java-based Apache projects,
Oracle released Java 7 today. Unfortunately it contains hotspot compiler
optimizations, which miscompile some loops. This
Hey,
and welcome to Tika.
If you use Eclipse, you'd better install an Eclipse plug-in:
http://m2eclipse.sonatype.org/sites/m2e
Once the plug-in is downloaded and installed, your next step could be importing
the Tika project like this: '*File* - *Import* - *Existing Maven Project*'
...
However, if
I'm in favor, +1.
On Fri, Aug 26, 2011 at 1:22 PM, Jukka Zitting (JIRA) j...@apache.org wrote:
Automatic checks against backwards-incompatible API changes
---
Key: TIKA-699
URL:
Hi Mike! Congrats!
I worked on the OmniFind edition at IBM Jerusalem :-) ... I heard about you
from my colleagues (Josemina, Yariv), and now I've met you here! Welcome!
On Mon, Aug 29, 2011 at 6:14 PM, Michael McCandless
luc...@mikemccandless.com wrote:
Thanks Chris!
Here's a quick intro:
I
Yes, it's resolved; the status needs to be changed.
2011/9/18 Jan Høydahl (JIRA) j...@apache.org
[
https://issues.apache.org/jira/browse/TIKA-546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107290#comment-13107290]
Jan Høydahl commented on
In favor of releasing Tika 0.10, +1
On Mon, Sep 26, 2011 at 9:50 AM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
Hi Folks,
A first release candidate for the Tika 0.10 release is available at:
http://people.apache.org/~mattmann/apache-tika-0.10/rc1/
The release
(DjVu) format
Key: TIKA-513
URL: https://issues.apache.org/jira/browse/TIKA-513
Project: Tika
Issue Type: New Feature
Components: parser
Reporter: Oleg Tikhonov
It might
Hi Tran Nam Quang,
Currently our CHM extractor skips all entities that are not HTML.
It would be great if you could write a list of desired entities to be
extracted. In addition, if you can, please attach the CHM files you're
working with.
BR,
Oleg
On Sat, Oct 22, 2011 at 8:08 AM, Tran Nam
Hi Ahmad,
I hope you built pdfbox using Maven, i.e. by running mvn clean install. If
so, the new pdfbox jar file is located in the .m2 local repository.
In addition, please find the pom.xml under ../tika-parsers and change the
following:
<dependency>
<groupId>org.apache.pdfbox</groupId>
, Oleg Tikhonov o...@apache.org wrote:
Hi Ahmad,
I hope you built pdfbox using Maven, i.e. by running mvn clean install. If
so, the new pdfbox jar file is located in the .m2 local repository.
In addition, please find the pom.xml under ../tika-parsers and change the
following:
<dependency>
For Chinese we need to create/get two profiles: Chinese Traditional and
Chinese Simplified.
Oleg
On Thu, Feb 2, 2012 at 6:13 AM, James Sullivan (Commented) (JIRA)
j...@apache.org wrote:
[
Here is my +1, this time tested only on Windows 7 x86-64 PE.
BR,
Oleg
On Thu, Mar 8, 2012 at 5:11 PM, Alex Ott alex...@gmail.com wrote:
+1
unpacked sources, compiled, tests passed. compiled tika-app works
correctly.
separately downloaded tika-app-1.1.jar also works correctly for me
The
Hi,
here is my +1.
Kind regards,
Oleg
On Thu, Jul 12, 2012 at 2:48 AM, Jukka Zitting jukka.zitt...@gmail.com wrote:
Hi,
On Wed, Jul 11, 2012 at 4:27 PM, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
On Jul 11, 2012, at 6:43 AM, Michael McCandless wrote:
Why are there
Hi Guys,
+1 for the graduation. Keep going !
KR,
Oleg
On Mon, Aug 6, 2012 at 11:44 PM, Dave Meikle loo...@gmail.com wrote:
Hi,
On 3 Aug 2012, at 18:50, Mattmann, Chris A (388J)
chris.a.mattm...@jpl.nasa.gov wrote:
...
I'm now going to call for a community VOTE (before heading to the
Hey,
I tried to look up the distribution but could not find the sources;
in binaries they provide only the Nokia distribution.
It would be nice if you could play with it and share your impressions.
BR,
Oleg
On Wed, Nov 7, 2012 at 2:52 AM, Pei Chen (JIRA) j...@apache.org wrote:
[
Hi David,
in the same folder, say /home/tika/, where you run 'mvn clean
install', just run the following command:
mvn dependency:list
It will print out all the jars which the project depends on.
Hope it helps.
On Wed, Dec 12, 2012 at 3:35 PM, David Morana (JIRA) j...@apache.org wrote:
David, is it failing on a particular file, or always, no matter what?
POI hints that there is an illegal offset, which is probably the cause of the
error.
--Oleg
On Wed, Dec 12, 2012 at 4:31 PM, David Morana (JIRA) j...@apache.org wrote:
[
Hi Mike,
Maybe consider using UIMA (the rule engine)?
BR,
Oleg
On Thu, Dec 20, 2012 at 1:05 PM, Michael McCandless (JIRA)
j...@apache.org wrote:
[
https://issues.apache.org/jira/browse/TIKA-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
Michael
I've tried, without success. There is more to it than it seems. JavaOCR is not an
option in its current state. A temporary solution could be a wrapper around Tesseract;
however, making Tesseract work on multiple platforms is still quite
difficult.
Best regards,
Oleg
On Fri, Jan 4, 2013 at 3:46 PM, Maciej
From the DjVu (particular case) point of view, a possible flow is as follows:
1. Extract images
2. For each image extract text using OCR
2.1 Detect language
2.2 Detect font type
.
So language and font type may be used for providing metadata.
I think it should be as seamless as possible.
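The flow above can be sketched as a small pipeline. This is only an illustration under stated assumptions: OcrFlowSketch, ImageResult and the three function parameters are hypothetical placeholders, not Tika or OCR-library APIs, and image extraction (step 1) is assumed to have happened already.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: none of these types come from Tika or an OCR library.
final class OcrFlowSketch {

    /** Text recovered from one image, plus what was detected for it. */
    static final class ImageResult {
        final String text;
        final String language;
        final String fontType;

        ImageResult(String text, String language, String fontType) {
            this.text = text;
            this.language = language;
            this.fontType = fontType;
        }
    }

    static List<ImageResult> process(List<byte[]> images,                 // output of step 1
                                     Function<byte[], String> ocr,
                                     Function<String, String> detectLanguage,
                                     Function<byte[], String> detectFontType) {
        List<ImageResult> results = new ArrayList<>();
        for (byte[] image : images) {
            String text = ocr.apply(image);                 // step 2
            String language = detectLanguage.apply(text);   // step 2.1
            String fontType = detectFontType.apply(image);  // step 2.2
            results.add(new ImageResult(text, language, fontType));
        }
        return results;
    }
}
```

The language and font type carried by each ImageResult are what would end up as metadata; how they map onto real metadata keys is left open here.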
Hey Dave,
I could not test on systems other than Windows 7 x64. All tests passed
successfully!
[x] +1 Release this package as Apache Tika 1.3
BR,
Oleg
On Sat, Jan 19, 2013 at 6:30 AM, Dave Meikle loo...@gmail.com wrote:
http://svn.apache.org/repos/asf/tika/tags/tika-1.3/
Back to the future. Aha moment !!!
Here is my +1.
According to Oracle: "In February 2011 Oracle announced the End of Public
Updates for their Java SE 6 products for July 2012. In February 2012 Oracle
extended the End of Public Updates for 4 months, to November 2012."
Oleg
On Fri, Feb 8, 2013
Tika's CHM support has its limitations. Can you provide such file(s) for
further investigation?
BR,
Oleg
On Wed, Mar 6, 2013 at 1:10 AM, Tejas Patil (JIRA) j...@apache.org wrote:
[
In favor,
[x] +1 Release this package as Apache Tika 1.4.
Tested on Linux ubuntu 3.8.0-23-generic x64.
Maybe we have to update some dependencies.
I also ran code coverage using the Maven plug-in, Cobertura.
BR,
Oleg
Here is a link to the code coverage report and dependency updates (available
@dev).
I tried to send some comments about the release candidate, but got a
delivery failure error. Am I off the list?
BR,
Oleg
On Sun, Jun 16, 2013 at 9:07 PM, Chris Mattmann mattm...@apache.org wrote:
Ouch, just saw this. Oliver, I'm happy to commit the updated patch
to the trunk but do you
Hey,
All tests passed on the following platforms:
1. Linux ubuntu 3.8.0-25-generic x86_64 Ubuntu 13.04
2. Microsoft Windows 7 Enterprise, x64-based PC
Please have a look:
https://drive.google.com/?tab=mo&authuser=0#folders/0B_DmgPkneiMgOFg2ZXBsOTZkRHc
There are two files, one of them contains list
Hi, can you attach the problematic file?
Thanks.
On Tue, Jul 23, 2013 at 4:46 PM, Hong-Thai Nguyen (JIRA) j...@apache.org wrote:
[
https://issues.apache.org/jira/browse/TIKA-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
Hong-Thai Nguyen updated TIKA-1152:
Thanks !
BR,
Oleg
On Mon, Jul 29, 2013 at 4:47 PM, Hong-Thai Nguyen (JIRA) j...@apache.org wrote:
[
https://issues.apache.org/jira/browse/TIKA-1152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716538#comment-13716538]
Hong-Thai Nguyen edited
Hi, Vasily,
Welcome aboard !
Just keep in mind that Tika is written in Java, so it can run on any JVM that
supports it.
For starters please refer to: http://tika.apache.org/1.4/gettingstarted.html
Generally, Tika supports extraction from most known file types, including PDFs.
Apache Tika is Apache Software
Hi,
if you mean how to import the Tika project, here are the steps:
1. In Eclipse -- File -- Import ...
2. Choose Existing Maven Project, click Next;
3. Point to the Tika project by clicking the Browse button, say tika-core
4. Next, click Finish.
That's it.
Hope it helps.
BR,
Oleg
On Fri, Sep
Hi Animesh,
my wild guess is that the N-gram profile for Chinese wasn't trained very
well. Try recreating the Chinese language profile.
Have a look here:
http://www.ibm.com/developerworks/opensource/tutorials/os-apache-tika/section6.html
Hope it helps.
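To make the idea of an N-gram profile concrete, here is a minimal sketch that counts character n-grams in a training text. NgramProfileSketch is a hypothetical class, not Tika's own language profile code; it works on Java chars rather than code points, which is a simplification.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of building an n-gram frequency profile from training text.
final class NgramProfileSketch {

    /** Counts every run of n consecutive characters in the given text. */
    static Map<String, Integer> profile(String text, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + n <= text.length(); i++) {
            counts.merge(text.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }
}
```

A profile built from too little or unrepresentative text gives skewed counts, which is one way a language ends up poorly trained.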
On Sat, Oct 26, 2013 at 8:48 PM, Chris Mattmann
This one is better
https://issues.apache.org/jira/browse/TIKA-546
On Sat, Oct 26, 2013 at 10:05 PM, Oleg Tikhonov o...@apache.org wrote:
Hi Animesh,
my wild guess is that the N-gram profile for Chinese wasn't trained very
well. Try recreating the Chinese language profile.
Have a look here:
http
I think we must. +1 for such an improvement.
BR,
Oleg
On Mon, Dec 2, 2013 at 4:17 PM, Hong-Thai Nguyen
hong-thai.ngu...@polyspot.com wrote:
Hi all,
NonSequentialPDFParser may increase 45% parsing performance on PDF
extraction. Should we integrate in Tika ?
Hi Ken,
not at all. +1 - go for it!
BR,
Oleg
On Sun, Dec 15, 2013 at 1:39 AM, Ken Krugler kkrugler_li...@transpac.com wrote:
Hi all,
See https://issues.apache.org/jira/browse/TIKA-1209
Any objections to switching to JUnit 4.11?
-- Ken
--
Ken Krugler
+1
Hi Frank,
It's not so easy, especially with a dependency on native libraries.
It also depends on trained profiles, languages and fonts.
The questions are: which platforms do we want to support, and which
languages and fonts?
BR,
Oleg
On Tue, Dec 24, 2013 at 9:48 AM, frank (JIRA) j...@apache.org
Hi David,
[x] +1 Release this package as Apache Tika 1.5
Thanks!
BR,
Oleg
On Wed, Feb 5, 2014 at 3:59 AM, David Meikle loo...@gmail.com wrote:
Hi Guys,
A candidate for the Tika 1.5 release is now available at:
http://people.apache.org/~dmeikle/tika-1.5-rc1/
The release candidate is a
Hi Grant,
what you're doing seems great.
I've checked Tess4J (http://tess4j.sourceforge.net/); it is released and
distributed under the Apache License,
v2.0 (http://www.apache.org/licenses/LICENSE-2.0.html).
Hope it helps.
BR,
Oleg
On Sat, Feb 8, 2014 at 1:14 PM, Grant Ingersoll (JIRA)
Hi,
There is another code coverage Maven plug-in, called Cobertura.
If you run *mvn clean install cobertura:cobertura* there is no need to put it in the
pom.
Hope it helps.
On Sat, Feb 8, 2014 at 10:17 PM, Grant Ingersoll (JIRA) j...@apache.org wrote:
[
@Timo,
On the other hand, this Parser can serve as a Composite for more
complicated parsers.
For example, with DjVu you can extract the images and parse them one by one,
and afterwards just append the extracted text.
BR,
Oleg
On Mon, Feb 10, 2014 at 11:09 AM, Timo Boehme (JIRA) j...@apache.org wrote:
Hi Mike!
Sounds great! Thanks.
Oleg
On Wed, Mar 5, 2014 at 6:47 PM, Michael McCandless
luc...@mikemccandless.com wrote:
Team,
If you want to search for Tika Jira issues, I just added Tika coverage
into the Lucene dog food server we use for finding Lucene/Solr
issues at
Hi Rupert,
agree about
javax.servlet;resolution:=optional,
javax.servlet.http;resolution:=optional,
Will check it out tomorrow.
Thanks !!!
On Mon, Apr 28, 2014 at 4:44 PM, Rupert Westenthaler (JIRA) j...@apache.org
wrote:
[
No problem. Will test it.
On Tue, Apr 29, 2014 at 3:43 PM, Rupert Westenthaler (JIRA) j...@apache.org
wrote:
[
https://issues.apache.org/jira/browse/TIKA-1276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984251#comment-13984251]
Rupert
Guys,
Tesseract is itself a project written in C/C++ and has to be
compiled separately for each platform.
Personally, I would make it a requirement for those who want to work with
Tesseract. I'm not sure that putting Tesseract in the sources is the right way to
go.
How good Tesseract is - depends
Hi,
Please have a look at the code provided:
[code]
Parser parser = new AutoDetectParser(); // Should auto-detect!
ContentHandler handler = new BodyContentHandler();
Metadata metadata = new Metadata();
InputStream stream = ZipParserTest.class.getResourceAsStream(
[x] +1 Release this package as Apache Tika 1.6.
Tested on the following systems:
1. Microsoft Windows 7 Enterprise, SP 1, x64-based PC
2. Linux ubuntu 3.11.0-24-generic #42-Ubuntu SMP x86_64 GNU/Linux
Thanks,
Oleg
On Mon, Jul 28, 2014 at 7:22 AM, Mattmann, Chris A (3980)
Hi,
does the content contain only one language, or is it mixed?
If the text contains a single language, then something seems wrong in
our language profiles. If it is mixed, then it's kind of OK: the first language
detected will be the answer.
What is the size of the content? One word or a bunch of text? Basically to
Hi, I can try this.
What is the trunk?
Thanks,
Oleg
On Tue, Oct 21, 2014 at 6:21 AM, Mattmann, Chris A (3980)
chris.a.mattm...@jpl.nasa.gov wrote:
Hmm any idea why this is failing on Windows? Tyler P. and
I were talking the other day - maybe we shouldn't run the
tests from TIKA-1422
/
++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
-Original Message-
From: Oleg Tikhonov olegtikho...@gmail.com
Reply-To: dev
Please give the newest patch a try.
Cheers,
Oleg
On Tue, Oct 21, 2014 at 9:08 AM, Oleg Tikhonov olegtikho...@gmail.com
wrote:
Taken. Thanks. in progress ...
On Tue, Oct 21, 2014 at 8:54 AM, Mattmann, Chris A (3980)
chris.a.mattm...@jpl.nasa.gov wrote:
Trunk is the current checkout/branch
, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
-Original Message-
From: Oleg Tikhonov o...@apache.org
Reply-To: dev@tika.apache.org dev@tika.apache.org
Date: Monday, October
AM, Oleg Tikhonov olegtikho...@gmail.com
wrote:
Sorry!!!
On Tue, Oct 21, 2014 at 9:37 AM, Mattmann, Chris A (3980)
chris.a.mattm...@jpl.nasa.gov wrote:
Thanks Oleg, will try tomorrow for me Los angeles time
Hi,
Just a guess. Did you check the permissions? Does it have executable
permission?
Br,
Oleg
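The permission question can also be answered from Java before handing a path to any parser. This is a minimal sketch using only java.nio.file; ExecutableCheckSketch and the example path are hypothetical and not part of Tika.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: reports whether a configured binary exists and can be run.
final class ExecutableCheckSketch {

    static boolean canRun(String path) {
        Path p = Paths.get(path);
        // Both conditions matter: a directory is often "executable" too.
        return Files.isRegularFile(p) && Files.isExecutable(p);
    }
}
```

For example, canRun("/usr/local/bin/tesseract") would be false if the file is missing or lacks the execute bit for the current user; the path here is only an assumed location.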
On 6 Feb 2015 12:15, Sean Zhao (JIRA) j...@apache.org wrote:
Sean Zhao created TIKA-1543:
---
Summary: TesseractOCRParser.setTesseractPath() doesn't work
+1 for 1.8 release.
On 29 Mar 2015 02:04, Konstantin Gribov gros...@gmail.com wrote:
Also, I think, we should resolve TIKA-1575 (upgrade to pdfbox 1.8.9) since
pdfbox 1.8.8 hangs on some pdf forms.
--
Best regards,
Konstantin Gribov
Sat, 28 March 2015 at 23:22, Konstantin Gribov
Hi,
Just for the record ...
It can happen if a file contains content written in at least two
different languages. For instance, the first half of the file is, say, in German
and the second half is, say, in French. In such a case detection would be
faulty.
Br,
Oleg
On 3 Mar 2015 04:03, Tyler Palsulich
,
What do you mean, the detection is faulty? What is the expected result in
that case?
Thanks,
Tyler
On Mar 3, 2015 1:10 AM, Oleg Tikhonov o...@apache.org wrote:
Hi,
Just for the record ...
It can happen if a file contains content written in at least two
different languages
Hi Chris,
just to confirm:
[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Tika parent . SUCCESS [
9.268 s]
[INFO] Apache Tika core ... SUCCESS [
25.823 s]
Hi there,
+1 for dropping.
On 30 Jan 2015 05:05, Tyler Palsulich tpalsul...@gmail.com wrote:
+1
Tyler
On Jan 29, 2015 9:52 PM, Mattmann, Chris A (3980)
chris.a.mattm...@jpl.nasa.gov wrote:
+1 move to 1.7
Sent from my iPhone
On Jan 29, 2015, at 5:04 PM, Allison, Timothy B.
Hi Tim,
Having looked at CC, a couple of ideas crossed my mind. I think it's cool.
+1.
BR,
Oleg
On 3 Apr 2015 17:29, Allison, Timothy B. talli...@mitre.org wrote:
All,
What do you think?
https://groups.google.com/forum/#!topic/common-crawl/Cv21VRQjGN0
On Friday, April 3, 2015 at 8:23:11
Hi Tyler,
good job, indeed !!!
[x] +1 Release this package as Apache Tika 1.8
On Wed, Apr 15, 2015 at 8:22 AM, Mattmann, Chris A (3980)
chris.a.mattm...@jpl.nasa.gov wrote:
Thanks Tyler! +1 from me:
SIGS, checksums check out:
[chipotle:~/tmp/apache-tika-1.8-rc2] mattmann%
Hi,
[x] +1 Release this package as Apache Tika 1.8.
Tested on: Ubuntu 14.10, x86_64. Java 1.7 (Oracle)
Don't we want to update the following dependencies:
biz.aQute:bndlib ... 1.43.0 -> 2.0.0.20130123-133441
org.apache.felix:org.apache.felix.scr.annotations ... 1.6.0 -> 1.9.10
Hi,
All basic tests passed.
java version 1.7.0_75
Java(TM) SE Runtime Environment (build 1.7.0_75-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.75-b04, mixed mode)
Linux/Ubuntu x86_64
Superb !!!
[x] +1 Release this package as Apache Tika 1.9
Thanks,
Oleg
On Tue, Jun 9, 2015 at 2:12 PM,
Wow !!! Amazing.
How does it perform?
BR,
Oleg
On Thu, Aug 20, 2015 at 9:48 AM, Mattmann, Chris A (3980)
chris.a.mattm...@jpl.nasa.gov wrote:
Just saw this online:
http://www.informationweek.com/software/enterprise-applications/goldman-sachs-puts-elasticsearch-to-work/d/d-id/1321778
Thanks!
+1
BR,
Oleg
On Tue, Aug 4, 2015 at 5:37 AM, Mattmann, Chris A (3980)
chris.a.mattm...@jpl.nasa.gov wrote:
+1
++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA
Hi, thanks for doing that !!!
+1 for the release.
Ran on Kubuntu 15 x64. All basic tests passed.
BR,
Oleg
On Tue, Aug 4, 2015 at 6:17 AM, Mattmann, Chris A (3980)
chris.a.mattm...@jpl.nasa.gov wrote:
+1 from me, great work Dave SIGS and CHECKSUMS are sound:
+1 !!!
My two cents.
Please also add ability to change/retrain/tote language profiles.
Thanks !!!
BR,
Oleg
On Wed, Jul 29, 2015 at 3:59 AM, Mattmann, Chris A (3980)
chris.a.mattm...@jpl.nasa.gov wrote:
Cool. Well with this one I found, along with language-detector,
along with Ramirez and the
Hi guys, all looks fine on a basic setup on x86_64 Ubuntu; however, I got the
following:
Running org.apache.tika.parser.journal.JournalParserTest
25 Oct 2015 10:45:53 WARN PhaseInterceptorChain - Interceptor for {http://localhost:8080/grobid}WebClient has thrown exception, unwinding now
Hi Ken,
I would choose the last option you mentioned.
-- Oleg
On Sat, Aug 29, 2015 at 7:58 PM, Ken Krugler kkrugler_li...@transpac.com
wrote:
Hi all,
As part of integrating language-detector into Tika (see TIKA-1723), I
noticed TIKA-546 (Add ability to create language profiles to
Good intro. Welcome aboard.
Oleg
On 17 Sep 2015 03:05, "David Meikle" wrote:
> Hello All,
>
> Please welcome Bob Paulin as he joins us as the latest Tika committer and
> PMC Member.
>
> Bob, please feel free to say a bit about yourself as an introduction to
> the group.
>
>
+1.
There are a bunch of add-ons, for instance git flow.
On Wed, Nov 18, 2015 at 7:15 PM, Mattmann, Chris A (3980) <
chris.a.mattm...@jpl.nasa.gov> wrote:
> Hey Nick,
>
> Git has something similar to svn:externals:
>
> http://stackoverflow.com/questions/571232/svnexternals-equivalent-in-git
>
Hi Chris,
thanks for doing it.
Yesterday I successfully built Tika using mvn clean install.
All tests passed. Platform: x86_64 Kubuntu with Oracle Java 8. Nothing
special was run.
[x] +1 Release this package as Apache Tika 1.12
Best regards,
Oleg
On Mon, Jan 25, 2016 at 9:58 PM,
Hi Luis,
Here is what I did:
git clone https://git-wip-us.apache.org/repos/asf/tika.git
git branch
* master
gdalinfo --version
GDAL 1.11.3, released 2015/09/16
mvn clean install -U
Tests run: 3, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 42.59 sec -
in
Hi,
+1 for release.
Built on Ubuntu 16.04 and CentOS 7.0 x86_64.
All tests passed. Java 8.
BR,
Oleg
On Thu, Oct 20, 2016 at 5:54 PM, Julien Nioche <
lists.digitalpeb...@gmail.com> wrote:
> Hi Tim
>
> I had exiftool installed indeed, so that might explain it. All tests now
> pass. Will have
+1 for the release.
On Mon, Apr 17, 2017 at 8:39 PM, David Meikle wrote:
> +1 from me too.
>
> Cheers,
> Dave
>
> On 13 April 2017 at 13:08, Konstantin Gribov wrote:
>
> > Preliminary +1 from me, I'll take a closer look this weekend
> >
> > Thu, 13 Apr 2017,
[x] +1 Release this package as Apache Tika 1.16
Basic tests and build on Ubuntu 17.04 + Java 8 (Oracle).
Thanks,
Oleg
On Wed, Jul 12, 2017 at 11:03 AM, Dave Meikle wrote:
> On 8 July 2017 at 03:40, Tim Allison wrote:
>
> >
> > A candidate for the Tika
Hi guys,
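Here is what is wrong ...
<parent>
<groupId>org.apache.tika</groupId>
<artifactId>tika-parent</artifactId>
<version>1.16-SNAPSHOT</version>
<relativePath>tika-parent/pom.xml</relativePath>
</parent>
If you clone the project, the top-level pom contains this.
The fix is to change 1.16-SNAPSHOT to 1.15
What I did was: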
git clone https://github.com/apache/tika.git
Any
Also, I put @Ignore on
./tika-dl/src/test/java/org/apache/tika/dl/imagerec/DL4JInceptionV3NetTest.java
because I do not have any DL installed on my computer.
On Tue, May 23, 2017 at 11:00 PM, Oleg Tikhonov <o...@apache.org> wrote:
> Hi guys,
> Here is what is wrong ...
>
> <groupId>org.apache.tika</groupId>
[x] +1 Release this package as Apache Tika 1.15
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 19:41 min
[INFO] Finished at:
@gmail.com] On Behalf Of
> Oleg Tikhonov
> Sent: Tuesday, May 23, 2017 4:33 PM
> To: dev@tika.apache.org
> Subject: Re: [VOTE] Release Apache Tika 1.15 Candidate #1
>
> Also put
> ./tika-dl/src/test/java/org/apache/tika/dl/imagerec/
> DL4JInceptionV3NetTest.java
> @Ignore b
Guys, I can help with Tika dockerization. Let's just design/plan what we're
going to do.
On Thu, Jun 1, 2017 at 4:02 PM, Eric Pugh
wrote:
> As the Tika project starts embracing more non Java tools (I’m thinking of
> Tesseract for example), dockerizing your Tika setup
Hi Tim,
definitely would be helpful !
+1
Thanks,
Oleg
On Tue, May 22, 2018 at 3:38 PM, Tim Allison (JIRA) wrote:
> Tim Allison created TIKA-2647:
> -
>
> Summary: Create a "security" page on our website
> Key:
With this approach, it is probably the only way ...
What is the typical tika-server environment? Stand-alone, distributed ... like replicas
in a cluster?
Is there a time limit for recovery? How do we know which point to
start processing from?
Do we mark documents that have been processed?
For example, if
Hi Tim,
What if the watcher thread fails/gets stuck, etc.?
On Thu, Sep 6, 2018 at 3:27 PM Tim Allison (JIRA) wrote:
> Tim Allison created TIKA-2725:
> -
>
> Summary: Make tika-server robust against ooms/infinite
> loops/memory leaks
>
Ideally, tika-server is dockerized and runs on Swarm as a service. In
addition, it has a health check mechanism, say something ... like an HTTP GET
request with return code 200. Docker will run this health check periodically and, if
it fails, restart tika-server.
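The probe side of such a health check can be sketched in a few lines. HealthCheckSketch is a hypothetical helper, not something tika-server ships; the URL passed in is up to the deployment and is an assumption in the usage note.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical probe: healthy means "HTTP GET answered with status 200 in time".
final class HealthCheckSketch {

    static boolean isHealthy(String url, int timeoutMillis) {
        try {
            HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            try {
                return conn.getResponseCode() == 200;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            // Refused connections and timeouts both count as unhealthy.
            return false;
        }
    }
}
```

A call such as isHealthy("http://localhost:9998/", 2000) assumes the server listens on that host and port. The read timeout matters as much as the status code, since a stuck server may accept the connection and never answer.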
However, we are far from that. Two ways to go, fmpov
Yep, seems to be the best match... unblocked execution.
On Thu, Sep 6, 2018, 23:47 Tim Allison (JIRA) wrote:
>
> [
> https://issues.apache.org/jira/browse/TIKA-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606373#comment-16606373
> ]
>
> Tim Allison commented on
[+] Release this package as Apache Tika 1.18
[INFO] Apache Tika parent . SUCCESS [
12.379 s]
[INFO] Apache Tika core ... SUCCESS [
55.650 s]
[INFO] Apache Tika parsers SUCCESS [05:55
min]
[INFO]
Hi,
thanks a lot.
[x] +1 Release this package as Apache Tika 1.18
I even did a security scan:
mvn org.owasp:dependency-check-maven:3.1.2:check
Report is attached.
Best regards,
Oleg
On Sat, Apr 21, 2018 at 12:54 AM, talli...@apache.org
wrote:
> All,
> A candidate for the