Also, include commons-collections4 (which
> is new in POI with Tika 1.14). (I assume you have already added curvesapi?)
>
> -----Original Message-----
> From: Gytis Mikuciunas [mailto:gyt...@gmail.com]
> Sent: Saturday, June 3, 2017 5:39 AM
> To: solr-user@lucene.apache.org
> Subject: R
twn PDFBox 2.0.5 and
2.0.6-SNAPSHOT on ~500k pdfs, see:
http://162.242.228.174/reports/reports_pdfbox_2_0_6.tar.gz
-----Original Message-----
From: Gytis Mikuciunas [mailto:gyt...@gmail.com]
Sent: Tuesday, May 9, 2017 7:17 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 6.4. Can't index MS
ere: https://builds.apache.org/
>
> Please ask on the POI or Tika users lists for how to get the latest/latest
> running, and thank you, again, for opening the issue on POI's Bugzilla.
>
> Best,
>
> Tim
>
> -----Original Message-----
> From: Gytis Mikuciunas [mailto
n issue in POI, or maybe we should catch this
> special example at the Tika level?
>
> For "Caused by: java.lang.ArrayIndexOutOfBoundsException:", the POI team
> _might_ be able to modify the parser to ignore a stream if there's an
> exception, but that's often a sign tha
Thanks for your responses.
Is there any way to ignore parsing errors and continue indexing?
Right now Solr/Tika stops parsing the whole document if it hits any exception.
On Apr 11, 2017 19:51, "Allison, Timothy B." wrote:
> You might want to drop a note to the dev
common.SolrException",
"root-error-class",
"java.lang.ArrayIndexOutOfBoundsException"
]
}
}
Regards,
Gytis
On Mon, Feb 6, 2017 at 6:54 PM, Allison, Timothy B. <talli...@mitre.org>
wrote:
> Shouldn't have taken you that much effort.
blob/releases/
> lucene-solr/6.4.0/solr/core/src/java/org/apache/solr/update/processor/
> TemplateUpdateProcessorFactory.java
>
> Alternatively, you could implement your URP in Javascript, but I am
> not sure that has an API to check file dates.
>
> Regards,
> Alex.
>
> http://www.
the date should work.
Regards,
Alex
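
The suggestion in this reply (a custom update request processor that finds the file by its name and reads the date from the file system) could be sketched roughly as below. This is only an illustration using the plain JDK: the class name FileModifiedLookup and the method lastModifiedIso are hypothetical, and wiring the result into a Solr update request processor (for example, setting a date field on the document being added) is assumed and not shown.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.FileTime;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Solr or Tika.
public class FileModifiedLookup {

    /**
     * Returns the file system's last-modified time for the given file as an
     * ISO-8601 UTC string (for example 2017-02-10T02:39:00Z), or null if the
     * path does not point to a regular file.
     */
    public static String lastModifiedIso(Path file) throws IOException {
        if (!Files.isRegularFile(file)) {
            return null; // nothing to look up; the caller decides what to do
        }
        FileTime modified = Files.getLastModifiedTime(file);
        // Drop sub-second precision so the value has a stable, simple form.
        return modified.toInstant().truncatedTo(ChronoUnit.SECONDS).toString();
    }

    public static void main(String[] args) throws IOException {
        // Demonstration on a temporary txt file with a known timestamp.
        Path sample = Files.createTempFile("sample", ".txt");
        Files.setLastModifiedTime(sample,
                FileTime.from(Instant.parse("2017-02-10T02:39:00Z")));
        System.out.println(lastModifiedIso(sample));
        Files.delete(sample);
    }
}
```

This relies on the indexing process being able to see the original file under the same path. If documents are uploaded from another machine, the date would have to be read on the client side and sent along with the document instead.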
On 10 Feb 2017 2:39 AM, "Gytis Mikuciunas" <gyt...@gmail.com> wrote:
Hi,
We have started to use Solr for indexing our documents (vsd, vsdx,
xls, xlsx, doc, docx, pdf, txt).
A modified date value is needed for each file. MS Office's file
> On Fri, Feb 10, 2017 at 4:59 AM, Alexandre Rafalovitch
> <arafa...@gmail.com> wrote:
> > Custom update request processor that looks up a file from the name and gets
> > the date should work.
> >
> > Regards,
> > Alex
> >
> > On 10 Feb 2017 2
Hi,
We have started to use Solr for indexing our documents (vsd, vsdx,
xls, xlsx, doc, docx, pdf, txt).
A modified date value is needed for each file. MS Office files and PDFs have
this value.
The problem is with txt files, as they don't have this value in their metadata.
Is there any possibility to get it
virtuald/curvesapi/1.03
>
> See also [1]
>
> [1] http://apache-poi.1045710.n5.nabble.com/support-for-reading-Microsoft-Visio-2013-vsdx-format-td5721500.html
>
> -----Original Message-----
> From: Gytis Mikuciunas [mailto:gyt...@gmail.com]
> Sent: Monday, February 6, 2
> > with its installation. So, I think you are hitting Tika where it
> > manages to figure out what type of content you have, but does not have
> > (Apache POI - another O/S project) library installed.
> >
> > What you need to do is to get the additional jar from
/lucene-solr/blob/releases/
> lucene-solr/6.4.0/solr/CHANGES.txt
> and it is Tika 1.13
>
> Hope it helps,
> Alex.
>
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 3 February 2017 at 05:57, Gytis Mikuciunas <gyt...@gmail.com> wrote:
> > Hi,
> >
> >
> >
Hi,
I'm using a single-core Solr 6.4 instance on Windows Server (Windows Server
2012 R2 Standard),
Java 8 (build 1.8.0_121-b13).
Everything works more or less OK, except indexing of MS Visio vsdx files.
Every time it throws an error (no matter whether it tries to index a vsdx file or,
for example, a docx with