Re: Stack Overflow Question

2014-06-30 Thread Oleg Tikhonov
Hi, Please have a look at provided code: [code] Parser parser = new AutoDetectParser(); // Should auto-detect! ContentHandler handler = new BodyContentHandler(); Metadata metadata = new Metadata(); InputStream stream = ZipParserTest.class.getResourceAsStream(

Re: [VOTE] Apache Tika 1.6 release candidate #1

2014-07-27 Thread Oleg Tikhonov
[x] +1 Release this package as Apache Tika 1.6. Tested on the following systems: 1. Microsoft Windows 7 Enterprise, SP 1, x64-based PC 2. Linux ubuntu 3.11.0-24-generic #42-Ubuntu SMP x86_64 GNU/Linux Thanks, Oleg On Mon, Jul 28, 2014 at 7:22 AM, Mattmann, Chris A (3980) < chris.a.mattm...@jpl

Re: [jira] [Created] (TIKA-1405) German content detected as French

2014-08-30 Thread Oleg Tikhonov
Hi, does the content contain only one language, or is it mixed? If the text contains a "single" language, then something seems strange in our language profiles. If it is mixed, then it is kind of OK: the first language detected will be the answer. What is the size of the content? One word or a "bunch" of text? Basically to d

Re: [VOTE] Release Apache Tika 1.6 RC #2

2014-09-02 Thread Oleg Tikhonov
Hi, [x] +1 Release this package as Apache Tika 1.6. Tested on Windows 7 Home Premium, AMD64 Family 16 Model 6 Stepping 3 AuthenticAMD Thanks, Oleg On Mon, Sep 1, 2014 at 8:16 AM, Mattmann, Chris A (3980) < chris.a.mattm...@jpl.nasa.gov> wrote: > Hi Folks, > > A candidate for the Tika 1.6 releas

Re: [jira] [Updated] (TIKA-1430) CHM parser gets faulty text (fix found)

2014-09-28 Thread Oleg Tikhonov
Awesome. No one complained because CHM is not as popular as PDF, for instance. In any case, thanks for fixing. On Sun, Sep 28, 2014 at 11:35 AM, Bin Hawking (JIRA) wrote: > > [ > https://issues.apache.org/jira/browse/TIKA-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tab

Re: 1.7 release?

2014-10-20 Thread Oleg Tikhonov
Hi, I can try this on. What is a trunk? Thanks, Oleg On Tue, Oct 21, 2014 at 6:21 AM, Mattmann, Chris A (3980) < chris.a.mattm...@jpl.nasa.gov> wrote: > Hmm any idea why this is failing on Windows? Tyler P. and > I were talking the other day - maybe we shouldn't run the > tests from TIKA-1422 u

Re: 1.7 release?

2014-10-20 Thread Oleg Tikhonov
ov > WWW: http://sunset.usc.edu/~mattmann/ > ++ > Adjunct Associate Professor, Computer Science Department > University of Southern California, Los Angeles, CA 90089 USA > ++++++ > > >

Re: 1.7 release?

2014-10-20 Thread Oleg Tikhonov
Please take a try with newest patch. Cheers, Oleg On Tue, Oct 21, 2014 at 9:08 AM, Oleg Tikhonov wrote: > Taken. Thanks. in progress ... > > On Tue, Oct 21, 2014 at 8:54 AM, Mattmann, Chris A (3980) < > chris.a.mattm...@jpl.nasa.gov> wrote: > >> Trunk is the current

Re: 1.7 release?

2014-10-20 Thread Oleg Tikhonov
+++ > Adjunct Associate Professor, Computer Science Department > University of Southern California, Los Angeles, CA 90089 USA > ++ > > > > > > > -Original Message- > From: Oleg

Re: 1.7 release?

2014-10-24 Thread Oleg Tikhonov
+++ > > > Adjunct Associate Professor, Computer Science Department > > > University of Southern California, Los Angeles, CA 90089 USA > > > ++ > > > > > > > > > > > > > > > > > > > > > -Original Me

Re: TIKA-1423 Build a parser to extract data from GRIB formats not good with Java 6

2015-01-30 Thread Oleg Tikhonov
Hi there, +1 for dropping. On 30 Jan 2015 05:05, "Tyler Palsulich" wrote: > +1 > > Tyler > On Jan 29, 2015 9:52 PM, "Mattmann, Chris A (3980)" < > chris.a.mattm...@jpl.nasa.gov> wrote: > > > +1 move to 1.7 > > > > Sent from my iPhone > > > > > On Jan 29, 2015, at 5:04 PM, Allison, Timothy B. >

Re: [jira] [Created] (TIKA-1543) TesseractOCRParser.setTesseractPath() doesn't work on Linux

2015-02-06 Thread Oleg Tikhonov
Hi, Just one guess: did you check the permissions? Does it have executable permission? Br, Oleg On 6 Feb 2015 12:15, "Sean Zhao (JIRA)" wrote: > Sean Zhao created TIKA-1543: > --- > > Summary: TesseractOCRParser.setTesseractPath() doesn't work > on Linux >

Re: [jira] [Closed] (TIKA-993) Language Detection Fault

2015-03-02 Thread Oleg Tikhonov
Hi, Just for the record ... It can happen if a file contains content that is written in at least two different languages. For instance, the first half of the file is, say, German and the second half is, say, French. In such a case detection would be faulty. Br, Oleg On 3 Mar 2015 04:03, "Tyler Palsulich
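
The mixed-language point above can be illustrated with a toy sketch. This is not Tika's language identifier: the class name `MixedLanguageSketch`, the tiny stop-word profiles, and the chunk size are all hypothetical, chosen only to show why a single whole-document answer can hide a second language, while guessing per chunk exposes it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy illustration only: NOT Tika's language detection algorithm.
class MixedLanguageSketch {

    // Hypothetical, deliberately tiny stop-word "profiles".
    private static final Map<String, Set<String>> PROFILES = new LinkedHashMap<>();
    static {
        PROFILES.put("de", new HashSet<>(Arrays.asList("der", "die", "das", "und", "ist", "nicht", "ein")));
        PROFILES.put("fr", new HashSet<>(Arrays.asList("le", "la", "les", "et", "est", "pas", "un")));
    }

    // Returns the profile with the most stop-word hits; on a tie the first profile wins.
    static String guess(String text) {
        String best = "unknown";
        int bestHits = 0;
        String[] words = text.toLowerCase(Locale.ROOT).split("\\P{L}+");
        for (Map.Entry<String, Set<String>> profile : PROFILES.entrySet()) {
            int hits = 0;
            for (String word : words) {
                if (profile.getValue().contains(word)) {
                    hits++;
                }
            }
            if (hits > bestHits) {
                bestHits = hits;
                best = profile.getKey();
            }
        }
        return best;
    }

    // Splits the text into chunks of chunkWords words and guesses each chunk separately.
    static List<String> guessPerChunk(String text, int chunkWords) {
        String[] words = text.trim().split("\\s+");
        List<String> guesses = new ArrayList<>();
        for (int i = 0; i < words.length; i += chunkWords) {
            int end = Math.min(i + chunkWords, words.length);
            guesses.add(guess(String.join(" ", Arrays.copyOfRange(words, i, end))));
        }
        return guesses;
    }

    public static void main(String[] args) {
        String german = "Der Hund ist nicht gross und das Haus ist alt";
        String french = "Le chat est petit et la maison est pas loin";
        String mixed = german + " " + french;
        System.out.println("whole document: " + guess(mixed));
        System.out.println("per chunk:      " + guessPerChunk(mixed, 10));
    }
}
```

With these sample sentences the whole-document guess yields one label only, whereas the per-chunk guesses report German for the first half and French for the second. A real detector would use proper language profiles; the chunking idea is the only thing this sketch is meant to convey.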

Re: [jira] [Closed] (TIKA-993) Language Detection Fault

2015-03-03 Thread Oleg Tikhonov
> Hi, > > What do you mean, the detection is faulty? What is the expected result in > that case? > > Thanks, > Tyler > On Mar 3, 2015 1:10 AM, "Oleg Tikhonov" wrote: > > > Hi, > > Just for the record ... > > It can happen if a file contains cont

Re: trunk test failure

2015-03-26 Thread Oleg Tikhonov
Hi Chris, just to confirm: [INFO] [INFO] Reactor Summary: [INFO] [INFO] Apache Tika parent . SUCCESS [ 9.268 s] [INFO] Apache Tika core ... SUCCESS [ 25.823 s] [

Re: [DISCUSS] Tika 1.8 or 1.7.1

2015-03-29 Thread Oleg Tikhonov
+1 for 1.8 release. On 29 Mar 2015 02:04, "Konstantin Gribov" wrote: > Also, I think, we should resolve TIKA-1575 (upgrade to pdfbox 1.8.9) since > pdfbox 1.8.8 hangs on some pdf forms. > > -- > Best regards, > Konstantin Gribov > > сб, 28 марта 2015 г. в 23:22, Konstantin Gribov : > > > +1 to re

Re: FW: Any interest in running Apache Tika as part of CommonCrawl?

2015-04-03 Thread Oleg Tikhonov
Hi Tim, Having looked at CC, a couple of ideas crossed my mind. I think it's cool. +1. BR, Oleg On 3 Apr 2015 17:29, "Allison, Timothy B." wrote: > All, > > What do you think? > > > https://groups.google.com/forum/#!topic/common-crawl/Cv21VRQjGN0 > > > On Friday, April 3, 2015 at 8:23:11 AM UTC-

Re: [VOTE] Release Apache Tika 1.8 Candidate #1

2015-04-07 Thread Oleg Tikhonov
Hi, [x] +1 Release this package as Apache Tika 1.8. Tested on: Ubuntu 14.10, x86_64. Java 1.7 (Oracle) Don't we want to update the following dependencies: biz.aQute:bndlib . 1.43.0 -> 2.0.0.20130123-133441 org.apache.felix:org.apache.felix.scr.annotations 1.6.0 -> 1.9.10 o

Re: [VOTE] Apache Tika 1.8 Release Candidate #2

2015-04-15 Thread Oleg Tikhonov
Hi Tyler, good job, indeed !!! [x] +1 Release this package as Apache Tika 1.8 On Wed, Apr 15, 2015 at 8:22 AM, Mattmann, Chris A (3980) < chris.a.mattm...@jpl.nasa.gov> wrote: > Thanks Tyler! +1 from me: > > SIGS, checksums check out: > > > [chipotle:~/tmp/apache-tika-1.8-rc2] mattmann% $HOME/b

Re: [VOTE] Release Apache Tika 1.9 Candidate #2

2015-06-09 Thread Oleg Tikhonov
Hi, All basic tests are passed. java version "1.7.0_75" Java(TM) SE Runtime Environment (build 1.7.0_75-b13) Java HotSpot(TM) 64-Bit Server VM (build 24.75-b04, mixed mode) Linux/Ubuntu x86_64 Superb !!! [x] +1 Release this package as Apache Tika 1.9 Thanks, Oleg On Tue, Jun 9, 2015 at 2:12 PM,

Re: Bayesian N-Gram Language Detection

2015-07-29 Thread Oleg Tikhonov
+1 !!! My two cents. Please also add ability to change/retrain/tote language profiles. Thanks !!! BR, Oleg On Wed, Jul 29, 2015 at 3:59 AM, Mattmann, Chris A (3980) < chris.a.mattm...@jpl.nasa.gov> wrote: > Cool. Well with this one I found, along with language-detector, > along with Ramirez and

Re: release Tika 1.10?

2015-08-04 Thread Oleg Tikhonov
Thanks! +1 BR, Oleg On Tue, Aug 4, 2015 at 5:37 AM, Mattmann, Chris A (3980) < chris.a.mattm...@jpl.nasa.gov> wrote: > +1 > ++ > Chris Mattmann, Ph.D. > Chief Architect > Instrument Software and Science Data Systems Section (398) >

Re: [VOTE] Apache Tika 1.10 Release Candidate #1

2015-08-04 Thread Oleg Tikhonov
Hi, thanks for doing that !!! +1 for the release. Ran on Kubuntu 15 x64. All basic tests are passed. BR, Oleg On Tue, Aug 4, 2015 at 6:17 AM, Mattmann, Chris A (3980) < chris.a.mattm...@jpl.nasa.gov> wrote: > +1 from me, great work Dave SIGS and CHECKSUMS are sound: > > [chipotle:~/tmp/tika-1.10

Re: Apache Tika: In use at Goldman Sachs

2015-08-20 Thread Oleg Tikhonov
Wow !!! Amazing. How does it perform? BR, Oleg On Thu, Aug 20, 2015 at 9:48 AM, Mattmann, Chris A (3980) < chris.a.mattm...@jpl.nasa.gov> wrote: > Just saw this online: > > http://www.informationweek.com/software/enterprise-applications/goldman-sac > hs-puts-elasticsearch-to-work/d/d-id/1321778

Re: Remove support for building language identifier profiles?

2015-08-29 Thread Oleg Tikhonov
Hi Ken, I would choose the last option you mentioned. -- Oleg On Sat, Aug 29, 2015 at 7:58 PM, Ken Krugler wrote: > Hi all, > > As part of integrating language-detector into Tika (see TIKA-1723), I > noticed TIKA-546 ("Add ability to create language profiles to tika-app") > > If we switch

Re: [ANNOUNCE] Welcome Bob Paulin as Tika Committer + PMC Member

2015-09-17 Thread Oleg Tikhonov
Good intro. Welcome aboard. Oleg On 17 Sep 2015 03:05, "David Meikle" wrote: > Hello All, > > Please welcome Bob Paulin as he joins us as the latest Tika committer and > PMC Member. > > Bob, please feel free to say a bit about yourself as an introduction to > the group. > > Welcome aboard, > Dav

Re: [VOTE] Apache Tika 1.11 Release Candidate #1

2015-10-25 Thread Oleg Tikhonov
Hi guys, all looks fine on basic set up in x86_64 Ubuntu, however I got the following: Running org.apache.tika.parser.journal.JournalParserTest 25 Oct 2015 10:45:53 WARN PhaseInterceptorChain - Interceptor for { http://localhost:8080/grobid}WebClient has thrown exception, unwinding now org.apache.

Re: [DISCUSS] Moving to Git

2015-11-19 Thread Oleg Tikhonov
+1. There is a bunch of add-ons. For instance - git flow. On Wed, Nov 18, 2015 at 7:15 PM, Mattmann, Chris A (3980) < chris.a.mattm...@jpl.nasa.gov> wrote: > Hey Nick, > > Git has something similar to svn:externals: > > http://stackoverflow.com/questions/571232/svnexternals-equivalent-in-git > >

Re: [VOTE] Apache Tika 1.12 Release Candidate #1

2016-01-28 Thread Oleg Tikhonov
Hi Chris, thanks for doing it. Yesterday I successfully built Tika using mvn clean install. All tests passed. Platform: x86_64 Kubuntu with Oracle Java 8. Nothing special was run. [x] +1 Release this package as Apache Tika 1.12 Best regards, Oleg On Mon, Jan 25, 2016 at 9:58 PM, Mattmann,

Re: [VOTE] Apache Tika 1.14 Release Candidate #1

2016-10-20 Thread Oleg Tikhonov
Hi, +1 for release. Built on Ubuntu 16.04 and CentOS 7.0 x86_64. All tests are passed. Java 8. BR, Oleg On Thu, Oct 20, 2016 at 5:54 PM, Julien Nioche < lists.digitalpeb...@gmail.com> wrote: > Hi Tim > > I had exiftool installed indeed, so that might explain it. All tests now > pass. Will have

Re: Master Build Failing

2016-10-25 Thread Oleg Tikhonov
Hi Luis, Here is what I did: git clone https://git-wip-us.apache.org/repos/asf/tika.git git branch * master gdalinfo --version GDAL 1.11.3, released 2015/09/16 mvn clean install -U Tests run: 3, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 42.59 sec - in org.apache.tika.parser.gdal.TestGDALPa

Re: [VOTE] Release Apache Tika 1.20 Candidate #1

2018-12-22 Thread Oleg Tikhonov
All basic staff passed. +1. Oleg On Fri, Dec 21, 2018, 22:02 Ken Krugler Hi Tim, > > Thanks for rolling the release. > > Built & validated on Mac OS X 10.12 > > Updated flink-crawler, all tests pass. > > So here’s my +1 > > — Ken > > > > On Dec 17, 2018, at 6:14 PM, Tim Allison wrote: > > > > A

Re: [VOTE] Release Apache Tika 1.20 Candidate #1

2018-12-22 Thread Oleg Tikhonov
*stuff On Sat, Dec 22, 2018, 11:01 Oleg Tikhonov All basic staff passed. > +1. > Oleg > > On Fri, Dec 21, 2018, 22:02 Ken Krugler wrote: > >> Hi Tim, >> >> Thanks for rolling the release. >> >> Built & validated on Mac OS X 10.12 >> >

Re: Tika 1.21?

2019-04-08 Thread Oleg Tikhonov
Great! +1. Thanks, Oleg On Mon, Apr 8, 2019, 21:11 Tim Allison wrote: > All, > PDFBox will be out in a few days, and POI should be out soon as > well. I _think_ I'd like to get in a first draft of "auto" mode for > OCR'ing PDFs (TIKA-2749), but other than that, I'd be willing to run a > relea

Re: Tika 1.21?

2019-04-22 Thread Oleg Tikhonov
ld start the > regression tests now (well, tomorrowish), though, unless anyone has > anything they want to get in...I'm happy to wait, though, till next > week to start the regression tests. > WDYT? > >Cheers, > > Tim > > On Mon, Apr 8, 2019 a

Re: [VOTE] Release Apache Tika 1.21 Candidate #1

2019-05-14 Thread Oleg Tikhonov
Hi all, [x] +1 Release this package as Apache Tika 1.21 I've run just the basic stuff, mvn clean install (Ubuntu x86, java 8). Seems to be good. Thanks, Oleg On Mon, May 13, 2019 at 8:33 PM Tim Allison wrote: > A candidate for the Tika 1.21 release is available at: > > https://dist.apache.org/r

Re: [VOTE] Release Apache Tika 1.21 Candidate #1

2019-05-14 Thread Oleg Tikhonov
:-) I'm good with any option. RC1 seems to be good from my point of view. Cheers, Oleg On Tue, May 14, 2019 at 3:56 PM Tim Allison wrote: > All, > I'm happy to close rc1 and respin an rc2 after Oleg's findings > (TIKA-2871 and TIKA-2872)...many thanks, Oleg! I'm also happy to > proceed with r

Re: [VOTE] Release Apache Tika 1.21 Candidate #2

2019-05-15 Thread Oleg Tikhonov
Here is my +1. Thanks, Tim! On Wed, May 15, 2019 at 5:16 AM Tim Allison wrote: > A candidate for the Tika 1.21 release is available at: > > https://dist.apache.org/repos/dist/dev/tika/ > > The release candidate is a zip archive of the sources in: > https://github.com/apache/tika/tree/1.21-r

Re: [jira] [Commented] (TIKA-2878) Update dependencies for 1.21.1 or 1.22

2019-05-20 Thread Oleg Tikhonov
Today I've also used a master branch and got the same result. On Mon, May 20, 2019 at 8:59 PM Tim Allison (JIRA) wrote: > > [ > https://issues.apache.org/jira/browse/TIKA-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844167#comment-16844167 >

Re: Tika 1.22?

2019-06-25 Thread Oleg Tikhonov
Would be great!!! Cheers, Oleg On Tue, Jun 25, 2019, 17:45 Tim Allison wrote: > All, > The vote for the next version of PDFBox is under way. I think we've > had a number of useful upgrades since our last release. Any > objections to starting the release process for Tika 1.22 a week or so > a

Re: 1.22?

2019-07-15 Thread Oleg Tikhonov
+1 On Mon, Jul 15, 2019 at 2:41 PM Tim Allison wrote: > Anyone have anything they want to get into 1.22? If not, I’ll kick off the > regression tests shortly. > > Cheers, > Tim >

Re: [VOTE] Release Apache Tika 1.22 Candidate #4

2019-07-30 Thread Oleg Tikhonov
Hi Tim, thanks for the release !!! Here is my +1, tested on Ubuntu 18.04.2 LTS, x_86 arc. Best wishes, Oleg On Mon, Jul 29, 2019 at 8:50 PM Tim Allison wrote: > A candidate for the Tika 1.22 release is available at: > > https://dist.apache.org/repos/dist/dev/tika/ > > > The release candidate

Re: [ANNOUNCE] Welcome Tilman Hausherr as Tika PMC member and committer

2019-10-04 Thread Oleg Tikhonov
Welcome aboard, Tilman!!! Best regards, Oleg On Fri, Oct 4, 2019 at 5:37 PM Tilman Hausherr wrote: > Am 04.10.2019 um 16:19 schrieb Tim Allison: > > All, > > > > The Tika PMC has elected to add Tilman Hausherr to our ranks. Tilman, > > please feel free to introduce yourself, and welcome aboar

Re: [EXTERNAL] Docker image along with 1.23?

2019-11-21 Thread Oleg Tikhonov
My question is more pragmatic. What do we put inside the Dockerfile? Which image will it be based on (say, Ubuntu)? What will the entrypoint contain? Tika Server? Should we "install" Tesseract? Anything more? Thanks, Oleg On Thu, Nov 21, 2019 at 4:46 AM Chris Mattmann wrote: > Yeah producing

Re: [VOTE] Release Apache Tika 1.23 Candidate #1

2019-11-29 Thread Oleg Tikhonov
Hi, here is my +1. All tests passed on Ubuntu 19.04. Thanks Tim! Best Regards, Oleg On Thu, Nov 28, 2019, 15:39 Markus Jelsma wrote: > +1! > > All tests pass and i can seamlessly update our internal software to 1.23. > > Thanks! > > -Original message- > > From:Tim Allison > > Sent:

Re: [VOTE] Release Apache Tika 1.23 Candidate #2

2019-12-03 Thread Oleg Tikhonov
[x] +1 Release this package as Apache Tika 1.23 Thanks, Oleg On Tue, Dec 3, 2019 at 5:15 AM Tim Allison wrote: > A candidate for the Tika 1.23 release is available at: > https://dist.apache.org/repos/dist/dev/tika/ > > The release candidate is a zip archive of the sources in: > https://gith

Re: 1.24?

2020-02-05 Thread Oleg Tikhonov
>> Should we wait for the next version of PDFBox? May be it's worth waiting >> what would you think of the week of the 23rd/ first week of March? Sounds good. BR, Oleg On Wed, Feb 5, 2020 at 4:41 PM Tim Allison wrote: > All, > > The new version of POI will be out soon. I have a couple of mor

Re: [EXTERNAL] Re: JDK 12 build issues

2020-03-18 Thread Oleg Tikhonov
Hi Chris, I'm currently trying to build an env with java 12/13 ... in order to try your setup. What java version are you using? OpenJDK or Oracle? Once upon a time there was a bug in OpenJDK https://bugs.openjdk.java.net/browse/JDK-8131146 But it seems to be ok in recent releases. Keep you updated. Chee

Re: 1.24.1?

2020-04-15 Thread Oleg Tikhonov
+1. Seems ok to me. Thanks, Oleg On Wed, Apr 15, 2020, 00:18 Tim Allison wrote: > I fixed the hwp5 multithreading problem. > > I looked into tar files, and the handful I reviewed had a "skip the rest of > the final block with x bytes", but there weren't actually x bytes. This > didn't harm extr

Re: [VOTE] Release Apache Tika 1.24.1 Candidate #1

2020-04-18 Thread Oleg Tikhonov
Hi Tim, Thanks for doing this! I've run all the basic stuff on Ubuntu 18 with Java 8. All tests passed. Here is my +1. BR, Oleg On Sat, Apr 18, 2020 at 12:38 AM Tim Allison wrote: > A candidate for the Tika 1.24.1 release is available at: > https://dist.apache.org/repos/dist/dev/tika/ > > T

Re: renaming master?

2020-06-16 Thread Oleg Tikhonov
Hi Tim, for me, "main" makes more sense. But, no objection to any other option! Thanks, Oleg On Tue, Jun 16, 2020 at 8:31 PM Tim Allison wrote: > All, > > As you may have seen, there's a movement to rename the "master" branch to > "main" or "trunk" (at least in the U.S.)[1][2]. Github is doi

Re: [EXTERNAL] Tika 2.0 modularization

2020-08-18 Thread Oleg Tikhonov
Hi Tim, looks awesome. Somehow I did not find a couple of parsers, probably because of on-going work ... In addition, I was thinking about "getting rid of" Maven. If we are going to make Tika more modern, maybe Gradle can do the trick? Do we plan to add new Java "goodies" like lambdas, foreign

Re: [VOTE] Release Apache Tika 1.25 Candidate #2

2020-11-27 Thread Oleg Tikhonov
Here is my +1. Did basic stuff. Seems ok. Thanks! On Thu, Nov 26, 2020, 01:15 Ken Krugler wrote: > +1 > > Thanks Tim. > > — Ken > > > On Nov 25, 2020, at 4:20 AM, Tim Allison wrote: > > > > A candidate for the Tika 1.25 release is available at: > > https://dist.apache.org/repos/dist/dev/tika/

Re: [VOTE] Release Apache Tika 2.0.0-ALPHA Candidate #1

2021-01-15 Thread Oleg Tikhonov
+1. Good job! On Thu, Jan 14, 2021 at 8:44 PM Tilman Hausherr wrote: > +1 > > Tilman > > Am 14.01.2021 um 02:19 schrieb Tim Allison: > > All, > > > > A candidate for the Tika 2.0.0-ALPHA release is available at: > >https://dist.apache.org/repos/dist/dev/tika/ > > > > The release candidate i

Re: [VOTE] Release Apache Tika 1.26 Candidate #1

2021-03-25 Thread Oleg Tikhonov
[INFO] [INFO] Reactor Summary for Apache Tika 1.26: [INFO] [INFO] Apache Tika parent . SUCCESS [ 40.841 s] [INFO] Apache Tika core ... SUCCESS [01:08 min] [INFO]

Re: [DISCUSS] Contribution guide & style enforcement

2017-03-30 Thread Oleg Tikhonov
Definitely true, +1 On Wed, Mar 29, 2017 at 9:19 PM, Allison, Timothy B. wrote: > +1 Y, thank you! > > -Original Message- > From: Ken Krugler [mailto:kkrugler_li...@transpac.com] > Sent: Wednesday, March 29, 2017 2:07 PM > To: dev@tika.apache.org > Subject: Re: [DISCUSS] Contribution gu

Re: 1.15?

2017-04-17 Thread Oleg Tikhonov
+1 for the release. On Mon, Apr 17, 2017 at 8:39 PM, David Meikle wrote: > +1 from me too. > > Cheers, > Dave > > On 13 April 2017 at 13:08, Konstantin Gribov wrote: > > > Preliminary +1 from me, I'll the a closer look this weekend > > > > чт, 13 апр. 2017, 0:00 Allison, Timothy B. : > > > > >

Re: [VOTE] Release Apache Tika 1.15 Candidate #1

2017-05-23 Thread Oleg Tikhonov
Hi guys, Here is what is wrong ... org.apache.tika tika-parent 1.16-SNAPSHOT tika-parent/pom.xml If you clone the project, the upper level pom contains this. The fix is to change 1.16-SNAPSHOT to 1.15 What I did was: git clone https://github.com/apache/tika.git Any suggestions

Re: [VOTE] Release Apache Tika 1.15 Candidate #1

2017-05-23 Thread Oleg Tikhonov
Also put ./tika-dl/src/test/java/org/apache/tika/dl/imagerec/DL4JInceptionV3NetTest.java @Ignore because I do not have any DL installed on my comp. On Tue, May 23, 2017 at 11:00 PM, Oleg Tikhonov wrote: > Hi guys, > Here is wrong ... > > org.apache.tika > tika-pa

Re: [VOTE] Release Apache Tika 1.15 Candidate #2

2017-05-24 Thread Oleg Tikhonov
[x] +1 Release this package as Apache Tika 1.15 [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 19:41 min [INFO] Finished at: 2017-05-24T22:22:17+

Re: [VOTE] Release Apache Tika 1.15 Candidate #1

2017-05-24 Thread Oleg Tikhonov
Cannot reproduce after having done some workarounds ... On Wed, May 24, 2017 at 3:05 AM, Allison, Timothy B. wrote: > Hi Oleg, > What's your error on that unit test? > > -Original Message- > From: olegtikho...@gmail.com [mailto:olegtikho...@gmail.com] On Behal

Re: experiences with Tika in Docker

2017-06-02 Thread Oleg Tikhonov
Guys, I can help with Tika dockerization. Let's just design/plan what we are going to do. On Thu, Jun 1, 2017 at 4:02 PM, Eric Pugh wrote: > As the Tika project starts embracing more non Java tools (I’m thinking of > Tesseract for example), dockerizing your Tika setup becomes more and more > valuable. >

Re: [VOTE] Release Apache Tika 1.16 Candidate #1

2017-07-12 Thread Oleg Tikhonov
[x]+1 Release this package as Apache Tika 1.16 Basic tests and build on Ubuntu 17.04 + Java 8 (Oracle). Thanks, Oleg On Wed, Jul 12, 2017 at 11:03 AM, Dave Meikle wrote: > On 8 July 2017 at 03:40, Tim Allison wrote: > > > > > A candidate for the Tika 1.16 release is available at: > > https://

tsdb extraction

2018-03-28 Thread Oleg Tikhonov
Hi guys, I am wondering if we have a parser which can deal with time series, like InfluxDB or Prometheus? Maybe you know of such a "work in progress"; that would also be good. Thanks in advance, Oleg

Re: tsdb extraction

2018-03-29 Thread Oleg Tikhonov
ok. time to read the spec :-) On Thu, Mar 29, 2018 at 4:02 PM, Allison, Timothy B. wrote: > Sorry...not aware of anything... > > -Original Message- > From: olegtikho...@gmail.com [mailto:olegtikho...@gmail.com] On Behalf Of > Oleg Tikhonov > Sent: Thursday, March 29,

Re: [VOTE] Release Apache Tika 1.18 Candidate #1

2018-04-11 Thread Oleg Tikhonov
[+] Release this package as Apache Tika 1.18 [INFO] Apache Tika parent . SUCCESS [ 12.379 s] [INFO] Apache Tika core ... SUCCESS [ 55.650 s] [INFO] Apache Tika parsers SUCCESS [05:55 min] [INFO] Apache

Re: [VOTE] Release Apache Tika 1.18 Candidate #3

2018-04-22 Thread Oleg Tikhonov
Hi, thanks a lot. [x] +1 Release this package as Apache Tika 1.18 Even did a security scan: mvn org.owasp:dependency-check-maven:3.1.2:check Report is attached. Best regards, Oleg On Sat, Apr 21, 2018 at 12:54 AM, talli...@apache.org wrote: > All, > A candidate for the Tika 1.18 release is a

Re: [jira] [Created] (TIKA-2647) Create a "security" page on our website

2018-05-22 Thread Oleg Tikhonov
Hi Tim, definitely would be helpful ! +1 Thanks, Oleg On Tue, May 22, 2018 at 3:38 PM, Tim Allison (JIRA) wrote: > Tim Allison created TIKA-2647: > - > > Summary: Create a "security" page on our website > Key: TIKA-2647 >

Re: [jira] [Created] (TIKA-2725) Make tika-server robust against ooms/infinite loops/memory leaks

2018-09-06 Thread Oleg Tikhonov
Hi Tim, What if the watcher thread fails, gets stuck, etc.? On Thu, Sep 6, 2018 at 3:27 PM Tim Allison (JIRA) wrote: > Tim Allison created TIKA-2725: > - > > Summary: Make tika-server robust against ooms/infinite > loops/memory leaks > Key

Re: [jira] [Commented] (TIKA-2725) Make tika-server robust against ooms/infinite loops/memory leaks

2018-09-06 Thread Oleg Tikhonov
In this approach, it is probably the only way ... What is tika-server's typical env? Stand-alone, distributed ... like replicas in a cluster? Is there a time limit for recovery? How do we know what point to start processing from? Do we mark documents which were processed? For example, if tika-

Re: [jira] [Commented] (TIKA-2725) Make tika-server robust against ooms/infinite loops/memory leaks

2018-09-06 Thread Oleg Tikhonov
Ideally, tika server is dockerized and runs on swarm as a service. In addition, it has a healthcheck mechanism, say something ... like an HTTP GET request with return code 200. Docker will run this hc periodically, and if it fails, will restart tika server. However, we are far away. Two ways to go, fmpov
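
The health check idea above (an HTTP GET that counts as healthy only on return code 200) can be sketched as a small probe. This is a minimal sketch, not part of tika-server: the class name `HealthProbeSketch`, the default URL `http://localhost:9998/`, and the 2-second timeout are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical probe: reports healthy only if an HTTP GET returns status 200.
class HealthProbeSketch {

    static boolean isHealthy(String url, int timeoutMillis) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            try {
                return conn.getResponseCode() == 200;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            // Connection refused, timeout, or malformed URL: treat as unhealthy.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed endpoint; pass the real server URL as the first argument.
        String url = args.length > 0 ? args[0] : "http://localhost:9998/";
        // Exit code 0 means healthy, 1 means unhealthy.
        System.exit(isHealthy(url, 2000) ? 0 : 1);
    }
}
```

A probe like this could be invoked periodically by whatever supervises the server; the exit codes follow the usual convention of 0 for success and non-zero for failure. It does not answer the open question in the thread of what happens when the supervising side itself gets stuck.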

Re: [jira] [Commented] (TIKA-2725) Make tika-server robust against ooms/infinite loops/memory leaks

2018-09-07 Thread Oleg Tikhonov
Yep, seems to be best match... unblocked execution. On Thu, Sep 6, 2018, 23:47 Tim Allison (JIRA) wrote: > > [ > https://issues.apache.org/jira/browse/TIKA-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16606373#comment-16606373 > ] > > Tim Allis

Re: [VOTE] Release Apache Tika 1.19 Candidate #1

2018-09-17 Thread Oleg Tikhonov
Hi Tim, thanks ! [INFO] Apache Tika parent . SUCCESS [ 5.138 s] [INFO] Apache Tika core ... SUCCESS [ 58.722 s] [INFO] Apache Tika parsers SUCCESS [04:20 min] [INFO] Apache Tika XMP ...

Re: [jira] [Created] (TIKA-2730) parseToString fails for a simple mp3

2018-09-19 Thread Oleg Tikhonov
Hi, It would be great if you could attach such a file. Or does it fail on any file? On Wed, Sep 19, 2018, 13:13 Boris Petrov (JIRA) wrote: > Boris Petrov created TIKA-2730: > -- > > Summary: parseToString fails for a simple mp3 > Key: T

Re: Release Announcement: General Availability of JDK 11

2018-09-26 Thread Oleg Tikhonov
Good news!!! On Thu, Sep 27, 2018, 00:06 Tim Allison wrote: > +1 successful build > On Wed, Sep 26, 2018 at 5:20 AM Rory O'Donnell > wrote: > > > > Hi Tim, > > > > *1) Release Announcement: General Availability of JDK 11 * > > > > * JDK 11, the reference implementation of Java 11 and the firs

Re: [VOTE] Release Apache Tika 1.19.1 Candidate #2

2018-10-09 Thread Oleg Tikhonov
sorry. +1 On Tue, Oct 9, 2018 at 7:26 PM Tim Allison wrote: > Thank you, Dave! > > Fellow devs, would anyone else have a chance to vote? We need a third > for the release. Thank you! > On Mon, Oct 8, 2018 at 4:36 AM wrote: > > > > Hello, > > > > On Thu, 4 Oct 2018 at 23:03, Tim Allison wrote

Fwd: DIH for TikaEntityProcessor

2018-10-12 Thread Oleg Tikhonov
-- Forwarded message - From: Martin Frank Hansen (MHQ) Date: Wed, Oct 10, 2018, 11:15 Subject: DIH for TikaEntityProcessor To: solr-u...@lucene.apache.org Hi, I am trying to read documents from a file system into Solr, using dataimporthandler but keep getting the following er

Re: [VOTE] Accept tika-helm source code into the Apache Tika project

2021-04-09 Thread Oleg Tikhonov
Great! +1 On Fri, Apr 9, 2021, 06:10 Lewis John McGibbney wrote: > Hi dev@, > > I am opening this VOTE with the goal of donating the tika-helm source code > [0] into the Apache Tika project. > Tika-helm is a Helm chart [1] to deploy Apache Tika on Kubernetes (K8s) > [2]. More specifically the c

Re: Release 1.27?

2021-04-28 Thread Oleg Tikhonov
+1 On Wed, Apr 28, 2021, 19:22 Tim Allison wrote: > All, > > There have been a number of key fixes in 1.x and some security fixes > in some of our dependencies. Any objections to starting the release > process for 1.27 in the next few weeks? Any blockers we need to fix > for 1.27? > >

Re: 2.0.0-BETA?

2021-05-11 Thread Oleg Tikhonov
Hi Tim, Thanks for the effort! +1. BR, Oleg On Tue, May 11, 2021, 16:51 Tim Allison wrote: > All, > What would you say to a beta release towards the end of this > week/beginning of next? > > Cheers, > > Tim >

Re: [VOTE] Release Apache Tika 2.0.0-BETA Candidate #1

2021-05-20 Thread Oleg Tikhonov
Hi Tim, My +1. Ubuntu 20, basic stuff. Java 11. Best regards, Oleg > On 19 May 2021, at 18:29, Tim Allison wrote: > > All, > > A candidate for the Tika 2.0.0-BETA release is available at: > https://dist.apache.org/repos/dist/dev/tika/ > > The release candidate is a zip archive of the sources

Re: [VOTE] Release Apache Tika 1.27 Candidate #1

2021-07-02 Thread Oleg Tikhonov
[x] +1 Release this package as Apache Tika 1.27 > On 2 Jul 2021, at 21:21, Tilman Hausherr wrote: > > +1 > > Tilman > > Am 30.06.2021 um 22:03 schrieb Tim Allison: >> A candidate for the Tika 1.27 release is available at: >> https://dist.apache.org/repos/dist/dev/tika/1.27 >> >> The KEYS f

Re: [VOTE] Release Apache Tika 2.0.0 Candidate #1

2021-07-18 Thread Oleg Tikhonov
+1 Thanks, Oleg > On 19 Jul 2021, at 4:04, Dave Meikle wrote: > > +1 > > Cheers, > Dave > > On Wed, 14 Jul 2021 at 19:16, Tim Allison wrote: > >> All, >> A candidate for the Tika 2.0.0 release is available >> at: >> https://dist.apache.org/repos/dist

Re: [DISCUSS] Support Elasticsearch in the tika-pipes module?

2021-07-26 Thread Oleg Tikhonov
Hi Tim, I would prefer to cut our support for licenses outside the Apache realm. Thanks, Oleg On Tue, Jul 27, 2021, 00:08 Tim Allison wrote: > All, > > As you may have heard, Amazon forked the last Apache licensed > version of Elasticsearch and is now releasing it as pure ASL 2.0 under > the name "Open

Re: [VOTE] Release Apache Tika 2.1.0 Candidate #2

2021-08-23 Thread Oleg Tikhonov
+1 basic stuff, ubuntu 20.04, java 11 Thanks, Oleg On Mon, Aug 23, 2021, 20:58 Konstantin Gribov wrote: > Hi, Tim. > > SHA512 and gpg signatures are fine, build succeeds on Linux/OpenJDK11 > except Tesseract issue (same as before, 4.1.1 extracts "Page?2" instead of > "Page 2" in multipage test).

Re: [VOTE] Release Apache Tika 2.2.0 Candidate #1

2021-12-14 Thread Oleg Tikhonov
+1 > On 15 Dec 2021, at 0:01, Tim Allison wrote: > > +1 > > On Tue, Dec 14, 2021 at 4:31 PM Lewis John McGibbney > wrote: > >> I'll submit a PR for the README but I think it's also worthwile to augment >> the release management guide so that the message to review the release >> candidate inc

Re: [VOTE] Release Apache Tika 2.2.1 Candidate #3

2021-12-20 Thread Oleg Tikhonov
Hi, [x] +1 Release this package as Apache Tika 2.2.1 mvn clean install -U *OK* OS and arch: Linux oleg-vb 5.11.0-41-generic #45~20.04.1-Ubuntu SMP Wed Nov 10 10:20:10 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux Java version: openjdk version "1.8.0_312" OpenJDK Runtime Environment (build 1.8.0_312-8

Re: [VOTE] Release Apache Tika 1.28 Candidate #3

2021-12-21 Thread Oleg Tikhonov
Hi, [x] +1 Release this package as Apache Tika 1.28 mvn clean install -U OK *OS and arch*: Linux oleg-vb 5.11.0-41-generic #45~20.04.1-Ubuntu SMP Wed Nov 10 10:20:10 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux *Java version*: openjdk version "1.8.0_312" OpenJDK Runtime Environment (build 1.8.0_312-8

Re: [VOTE] Release Apache Tika 2.3.0 Candidate 1

2022-02-06 Thread Oleg Tikhonov
Hi, Linux Ubuntu 20.04, java 11. +1 Thanks, Oleg On Sun, Feb 6, 2022, 22:05 Konstantin Gribov wrote: > Hi, folks. > > SHA512 checksums and GPG signatures are fine. > > Built successfully on ArchLinux, OpenJDK 17 & 11 (Temurin-17.0.1+12 & > Temurin-11.0.13+8), Tesseract 5.0.1-2, Leptonica 1.82.0-

Re: [VOTE] Release Apache Tika 1.28.1 Candidate #1

2022-02-10 Thread Oleg Tikhonov
+1 , ubuntu 20.04, open jdk 11. Thanks, Oleg On Fri, Feb 11, 2022, 04:34 David Meikle wrote: > Hello, > > On Tue, 8 Feb 2022 at 18:22, Tim Allison wrote: > > > A candidate for the Tika 1.28.1 release is available at: > > https://dist.apache.org/repos/dist/dev/tika/1.28.1 > > > > The release c

Re: [VOTE] Release Apache Tika 1.28.2 Candidate #2

2022-04-29 Thread Oleg Tikhonov
Hi, +1. Basic stuff, linux ubuntu 20, x86, java 11. Thanks. On Thu, Apr 28, 2022, 20:23 Tilman Hausherr wrote: > +1 > > Tilman > > Am 28.04.2022 um 16:54 schrieb Tim Allison: > > A candidate for the Tika 1.28.2 release is available at: > >https://dist.apache.org/repos/dist/dev/tika/1.28.2 >

Re: [VOTE] Release Apache Tika 2.4.0 Candidate #1

2022-04-29 Thread Oleg Tikhonov
Hi, +1, Ubuntu 20, x86, Java 11. Thanks! > On 29 Apr 2022, at 2:23, Tim Allison wrote: > > A candidate for the Tika 2.4.0 release is available at: > https://dist.apache.org/repos/dist/dev/tika/2.4.0 > > The release candidate is a zip archive of the sources in: > https://github.com/apache/tika

Re: next release: 1.28.3?

2022-05-18 Thread Oleg Tikhonov
Good idea! +1. Cheers, Oleg On Wed, May 18, 2022, 17:11 Tim Allison wrote: > All, > I propose kicking off a release for 1.28.3 early next week. I've updated > some dependencies. What do you think? > > Best, > > Tim >

Re: [VOTE] Release Apache Tika 1.28.3 Candidate #1

2022-05-26 Thread Oleg Tikhonov
Hi, Here is +1, ubuntu, java 11, x86_64. Thanks, Oleg On Thu, May 26, 2022, 11:04 Tilman Hausherr wrote: > +1 > > Tilman > > Am 23.05.2022 um 20:38 schrieb Tim Allison: > > I'm indifferent but lean slightly towards going forward as is. > > > > If anyone has a hesitation, I'm happy to revert the

Re: [VOTE] Release Apache Tika 2.4.1 Candidate #1

2022-06-15 Thread Oleg Tikhonov
[x] +1 Release this package as Apache Tika 2.4.1 [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 23:55 min [INFO] Finished at: 2022-06-15T12:29:4

Re: [VOTE] Release Apache Tika 1.28.4 Candidate #1

2022-06-16 Thread Oleg Tikhonov
Hey, [x] +1 Release this package as Apache Tika 1.28.4 Java 8, ubuntu 20, basic stuff. Thanks, Oleg On Thu, Jun 16, 2022, 17:42 Konstantin Gribov wrote: > Built successfully on ArchLinux, OpenJDK 11 & 17 (Temurin-11.0.15+10 & > 17.0.3+7) w/ Tesseract 5.1.0, Leptonica 1.82. > The issue with the

Re: [VOTE] Release Apache Tika 2.5.0 Candidate #1

2022-09-30 Thread Oleg Tikhonov
Ubuntu 20.04, java sdk 11, +1 Thanks On Fri, Sep 30, 2022, 21:33 Tilman Hausherr wrote: > +1 > > builds on windows 10, oracle jdk1.8.0_341 > > Tilman > > On 30.09.2022 16:12, Tim Allison wrote: > > A candidate for the Tika 2.5.0 release is available at: > > https://dist.apache.org/repos/dist/de

Re: Possibly speeding up tests with Gradle - anyone interested?

2022-10-05 Thread Oleg Tikhonov
Hi Nick, Honestly, I am trying to port our project to Gradle, but it is not going well. It is a good idea. If some folks can help, we can do it together. +1 Cheers, Oleg On Wed, Oct 5, 2022, 22:05 Nick Burch wrote: > Hi All > > At ApacheCon this week, a Bob and myself ended up chatting with the folks >

Re: [VOTE] Release Apache Tika 2.6.0 Candidate #1

2022-11-04 Thread Oleg Tikhonov
[x] +1 Release this package as Apache Tika 2.6.0 [INFO] [INFO] BUILD SUCCESS [INFO] [INFO] Total time: 38:54 min [INFO] Finished at: 2022-11-04T23:47:1

Re: [VOTE] Release Apache Tika 2.7.0 Candidate #1

2023-02-02 Thread Oleg Tikhonov
Hey, +1 Ubuntu, jdk 8 (Oracle). Thanks, Oleg On Fri, Feb 3, 2023 at 6:09 AM Tilman Hausherr wrote: > +1 > > builds on german W10 with jdk8 > > Tilman > > On 31.01.2023 20:13, Tim Allison wrote: > > A candidate for the Tika 2.7.0 release is available at: > > https://dist.apache.org/repos/dist/de
