Re: Solr 6.4. Can't index MS Visio vsdx files

2017-07-03 Thread Gytis Mikuciunas
lso, include commons-collections4 (which > is new in POI w Tika 1.14). (I assume you have already added curvesapi?) > > -Original Message- > From: Gytis Mikuciunas [mailto:gyt...@gmail.com] > Sent: Saturday, June 3, 2017 5:39 AM > To: solr-user@lucene.apache.org > Subject: R

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-06-03 Thread Gytis Mikuciunas
twn PDFBox 2.0.5 and 2.0.6-SNAPSHOT on ~500k pdfs, see: http://162.242.228.174/reports/reports_pdfbox_2_0_6.tar.gz -----Original Message----- From: Gytis Mikuciunas [mailto:gyt...@gmail.com] Sent: Tuesday, May 9, 2017 7:17 AM To: solr-user@lucene.apache.org Subject: Re: Solr 6.4. Can't index MS

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-05-09 Thread Gytis Mikuciunas
ere: https://builds.apache.org/ > > Please ask on the POI or Tika users lists for how to get the latest/latest > running, and thank you, again, for opening the issue on POI's Bugzilla. > > Best, > > Tim > > -Original Message- > From: Gytis Mikuciunas [mailto

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-04-11 Thread Gytis Mikuciunas
n issue in POI, or maybe we should catch this > special example at the Tika level? > > For "Caused by: java.lang.ArrayIndexOutOfBoundsException:", the POI team > _might_ be able to modify the parser to ignore a stream if there's an > exception, but that's often a sign tha

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-04-11 Thread Gytis Mikuciunas
Thanks for your responses. Is there any way to ignore parsing errors and continue indexing? Right now Solr/Tika stops parsing the whole document if it hits any exception. On Apr 11, 2017 19:51, "Allison, Timothy B." wrote: > You might want to drop a note to the dev

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-04-11 Thread Gytis Mikuciunas
common.SolrException", "root-error-class", "java.lang.ArrayIndexOutOfBoundsException" ] } } Regards, Gytis On Mon, Feb 6, 2017 at 6:54 PM, Allison, Timothy B. <talli...@mitre.org> wrote: > Shouldn't have taken you that much effort.

Re: how to get modified field data if it doesn't exist in meta

2017-02-13 Thread Gytis Mikuciunas
blob/releases/ > lucene-solr/6.4.0/solr/core/src/java/org/apache/solr/update/processor/ > TemplateUpdateProcessorFactory.java > > Alternatively, you could implement your URP in Javascript, but I am > not sure that has an API to check file dates. > > Regards, >Alex. > > http://www.

Re: how to get modified field data if it doesn't exist in meta

2017-02-12 Thread Gytis Mikuciunas
the date should work. Regards, Alex On 10 Feb 2017 2:39 AM, "Gytis Mikuciunas" <gyt...@gmail.com> wrote: Hi, We have started to use Solr to index our documents (vsd, vsdx, xls, xlsx, doc, docx, pdf, txt). A modified date value is needed for each file. MS Office's file

Re: how to get modified field data if it doesn't exist in meta

2017-02-12 Thread Gytis Mikuciunas
> On Fri, Feb 10, 2017 at 4:59 AM, Alexandre Rafalovitch > <arafa...@gmail.com> wrote: > > Custom update request processor that looks up a file from the name and > gets > > the date should work. > > > > Regards, > > Alex > > > > On 10 Feb 2017 2

how to get modified field data if it doesn't exist in meta

2017-02-09 Thread Gytis Mikuciunas
Hi, We have started to use Solr to index our documents (vsd, vsdx, xls, xlsx, doc, docx, pdf, txt). A modified date value is needed for each file. MS Office files and PDFs have this value. The problem is with txt files, as they don't have this value in their metadata. Is there any possibility to get it

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-02-06 Thread Gytis Mikuciunas
virtuald/curvesapi/1.03 > > See also [1] > > [1] http://apache-poi.1045710.n5.nabble.com/support-for- > reading-Microsoft-Visio-2013-vsdx-format-td5721500.html > > -----Original Message- > From: Gytis Mikuciunas [mailto:gyt...@gmail.com] > Sent: Monday, February 6, 2

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-02-06 Thread Gytis Mikuciunas
> > with its installation. So, I think you are hitting Tika where it > > manages to figure out what type of content you have, but does not have > > (Apache POI - another O/S project) library installed. > > > > What you need to do is to get the additional jar from

Re: Solr 6.4. Can't index MS Visio vsdx files

2017-02-05 Thread Gytis Mikuciunas
/lucene-solr/blob/releases/ > lucene-solr/6.4.0/solr/CHANGES.txt > and it is Tika 1.13 > > Hope it helps, >Alex. > > http://www.solr-start.com/ - Resources for Solr users, new and experienced > > > On 3 February 2017 at 05:57, Gytis Mikuciunas <gyt...@g

RE: Solr 6.4. Can't index MS Visio vsdx files

2017-02-03 Thread Gytis Mikuciunas
GES.txt > and it is Tika 1.13 > > Hope it helps, >Alex. > > http://www.solr-start.com/ - Resources for Solr users, new and experienced > > > On 3 February 2017 at 05:57, Gytis Mikuciunas <gyt...@gmail.com> wrote: > > Hi, > > > > > >

Solr 6.4. Can't index MS Visio vsdx files

2017-02-03 Thread Gytis Mikuciunas
Hi, I'm using a single-core Solr 6.4 instance on Windows Server (Windows Server 2012 R2 Standard), Java v8 (build 1.8.0_121-b13). Everything works more or less OK, except indexing MS Visio vsdx files. Every time it throws an error (no matter whether it tries to index a vsdx file or, for example, a docx with