>
>
> Lautrupparken 40-42, DK-2750 Ballerup
> E-mail m...@kmd.dk Web www.kmd.dk
> Mobile +4525571418
>
>
>
> *From:* Martin Frank Hansen (MHQ)
> *Sent:* 10 October 2018 10:15
> *To:* solr-user
> *Subject:* DIH for TikaEntityProcessor
>
schema).
> I used the default config, and Solr version 7.5.0; I was able to
> import the data just fine (I also tested with .*DOC). Is there any
> other information you can provide that can help me reproduce this error?
>
>
>
>
> On Fri, Oct 12, 2018 at 4:11 PM Martin Frank Hanse
u want configuration over custom code, you
> could look at something like Apache NiFi. It can push data into Solr.
> Obviously it is a bigger solution, but it is correspondingly more
> robust too.
>
> Regards,
>Alex.
> On Sun, 21 Oct 2018 at 11:07, Martin Frank Hansen (MHQ)
> wrote:
> >
> > Hi Alexandre,
> >
> > Thanks for your reply.
> >
> > Yes right now it is just for testing the possibilities of Solr and
> Tes
1:07, Martin Frank Hansen (MHQ) wrote:
>
> Hi Alexandre,
>
> Thanks for your reply.
>
> Yes right now it is just for testing the possibilities of Solr and Tesseract.
>
> I will take a look at the Tika documentation to see if I can make it work.
>
> You said that DIH are no
not sure you can pass parseContext that way and DIH is also not
recommended for production.
I hope this helps,
Alex.
On Sun, 21 Oct 2018 at 09:24, Martin Frank Hansen (MHQ) wrote:
> Hi again,
>
>
>
> Is there anyone who has some experience of using Tesseract’s OCR
>
2750 Ballerup
E-mail m...@kmd.dk Web www.kmd.dk
Mobile +4525571418
From: Martin Frank Hansen (MHQ)
Sent: 18 October 2018 13:30
To: solr-user@lucene.apache.org
Subject: Tesseract language
Hi,
I have been trying to use Tesseract through the data-import-handler in Solr and
it actually works very well – with English. As the documents are in Danish, I
need to change the language setting in Tesseract to Danish as well. Is that
possible from Solr?
I was using the
md.dk<http://www.kmd.dk/>
Mobile +4525571418
From: Martin Frank Hansen (MHQ)
Sent: 10 October 2018 10:15
To: solr-user
Subject: DIH for TikaEntityProcessor
Hi,
I am trying to read documents from a file system into Solr, using
dataimporthandler but keep getting the following errors:
ut.println(handler.toString());
}
Hope that someone can help here.
-----Original Message-----
From: Martin Frank Hansen (MHQ)
Sent: 22 October 2018 07:58
To: solr-user@lucene.apache.org
Subject: RE: Tesseract language
Hi Erick,
Thanks for the help! I will take a look at it.
Martin Frank Hansen, S
attachment exceptions.
On Fri, Oct 26, 2018 at 6:25 AM Martin Frank Hansen (MHQ)
wrote:
> Hi again,
>
> Never mind, I managed to get the content of the msg-files as well,
> using the following link as inspiration:
> https://wiki.apache.org/tika/RecursiveMetadata
>
> But thanks ag
27, 2018 at 12:39 AM Martin Frank Hansen (MHQ)
>
> wrote:
>
> > Hi Rohan,
> >
> > Thanks for your reply, are you using tess4j with Tika or on its own?
> > I will take a look at tess4j if I can't make it work with Tika alone.
> >
> > Best regards
>
rging data from different sources
>
> Maybe
> https://lucene.apache.org/solr/guide/7_5/update-request-processors.html#atomicupdateprocessorfactory
>
> Regards,
> Alex
>
> On Tue, Oct 30, 2018, 7:57 AM Martin Frank Hansen (MHQ), wrote:
>
> > Hi,
> >
> > I
Hi,
I am trying to merge files from different sources and with different content
(except for one key field). How can this be done in Solr?
An example could be:
Document 1
001 Unique id for
Document 1
test-123
…
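One way to approach the merge described above is to combine the records on the shared key field before sending them to Solr. The following is only a minimal sketch under stated assumptions: the field names (`id`, `title`, `filename`) and the sample records are hypothetical, and this is a client-side merge, not the atomic-update approach suggested in the replies.

```python
from collections import defaultdict


def merge_by_key(sources, key="id"):
    """Merge records from several sources into one dict per key value.

    `sources` is a list of record lists; `key` is the shared key field
    (the name "id" is an assumption, not taken from the thread).
    """
    merged = defaultdict(dict)
    for records in sources:
        for record in records:
            # Later sources overwrite earlier ones on conflicting field names.
            merged[record[key]].update(record)
    return list(merged.values())


# Hypothetical sample data: two sources that share only the key field.
case_meta = [{"id": "001", "title": "test-123"}]
file_meta = [
    {"id": "001", "filename": "a.pdf"},
    {"id": "002", "filename": "b.pdf"},
]

docs = merge_by_key([case_meta, file_meta])
for doc in docs:
    print(doc)
```

Each resulting dict could then be indexed as a single Solr document. If the sources arrive at different times, partial (atomic) updates on the key field are the alternative.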
. October 2018 13:16
To: solr-user
Subject: Re: Merging data from different sources
Maybe
https://lucene.apache.org/solr/guide/7_5/update-request-processors.html#atomicupdateprocessorfactory
Regards,
Alex
On Tue, Oct 30, 2018, 7:57 AM Martin Frank Hansen (MHQ), wrote:
> Hi,
>
> I
, Oct 26, 2018 at 12:31 PM Martin Frank Hansen (MHQ)
wrote:
> Hi Tim,
>
> You were right.
>
> When I called `tesseract testing/eurotext.png testing/eurotext-dan -l
> dan`, I got an error message so I downloaded "dan.traineddata" and
> added it to the Tesseract-OCR
azy and just execute it in
> IntelliJ for development and have forgotten to set my classpath on
> _numerous_ occasions when running it from a command line ;)
>
> Best,
> Erick
>
> On Thu, Oct 25, 2018 at 2:55 AM Martin Frank Hansen (MHQ)
> wrote:
> >
> > Hi,
>
able to specify "dan"
with your code above.
On Fri, Oct 26, 2018 at 10:49 AM Martin Frank Hansen (MHQ) wrote:
>
> Hi again,
>
> Now I moved the OCR part to Tika, but I still can't make it work with Danish.
> It works when using default language settings and it seems like Tika
to Solr
If you’re processing actual msg (not eml), you’ll also need poi and
poi-scratchpad and their dependencies, but then those msgs could have
attachments, at which point, you may as well just add tika-app. :D
On Thu, Oct 25, 2018 at 2:46 PM Martin Frank Hansen (MHQ)
wrote:
> Hi Erick and
Hi again,
Never mind, I managed to get the content of the msg-files as well, using the
following link as inspiration: https://wiki.apache.org/tika/RecursiveMetadata
But thanks again for all your help!
-----Original Message-----
From: Martin Frank Hansen (MHQ)
Sent: 26 October 2018 10:14
Hi,
I am trying to read the content of msg-files using Tika and index these in Solr,
however I am having some problems with the OfficeParser(). I keep getting the
error java.lang.NoClassDefFoundError for the OfficeParser, even though both
tika-core and tika-parsers are included in the build path.
Hi,
I am trying to add metadata and files to Solr, but am experiencing some
problems.
The data is divided in two: cases and files. For each case the metadata is
given in an XML document, while the metadata for the files is given in another XML
document, and the actual files are kept in yet
Hi,
I am trying to read documents from a file system into Solr, using
dataimporthandler but keep getting the following errors:
Exception while processing: files document :
null:org.apache.solr.handler.dataimport.DataImportHandlerException:
Hi,
I am having some problems getting the data-import-handler in Solr to work. I
have tried a lot of things but I simply get no response from Solr, not even an
error.
When calling the API:
http://localhost:8983/solr/nh/dataimport?command=full-import
{
"responseHeader":{
"status":0,
t;:"0:0:0.136"}}
Seems like it is not even trying to read the data.
Martin Frank Hansen
-----Original Message-----
From: Jan Høydahl
Sent: 2 October 2018 17:46
To: solr-user@lucene.apache.org
Subject: Re: data-import-handler for solr-7.5.0
> url="C:/Users/z6mhq/Desktop/data_import/nh_test.xml"
>
> Have you tried url="C:\\Users\\z6mhq/Desktop\\data_import\\nh_test.xml" ?
>
> --
> Jan Høydahl, search solution architect Cominvent AS -
> www.cominvent.com
>
> > On 2 Oct 2018, at 17:15, Martin Frank Hansen (MHQ) wrote:
> >
/master/configsets/pets-final/pets-data-config.xml).
Regards,
Alex.
On Tue, 2 Oct 2018 at 12:46, Martin Frank Hansen (MHQ) wrote:
>
> Thanks for the info, the UI looks interesting... It does read the data-config
> correctly, so the problem is probably in this file.
>
> Martin Frank
example that
ships with the DIH example set. Specifically, look at the commonField parameter;
it may be useful for you:
https://lucene.apache.org/solr/guide/7_4/uploading-structured-data-store-data-with-the-data-import-handler.html
Regards,
Alex.
On Sun, 7 Oct 2018 at 13:23, Martin Frank Hansen (MHQ) wro
Hi,
I am having some difficulties adding data from different levels of an XML
document.
The XML can be as simple as this:
2165432
5
10
The data-config-file looks like this.
The result is the following:
{
, and the burden of building and running
a separate app will probably be worth it.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
> On 16 Nov 2018, at 12:24, Martin Frank Hansen (MHQ) wrote:
>
> Hi,
>
> I am trying to add meta data and files to Solr, but are ex
ink there are some pictures which are not being sent through in
> the email.
>
> Could you send the query you are using, and tell us which version of
> Solr you are running?
>
> Regards,
> Edwin
>
>> On Mon, 25 Feb 2019 at 20:54, Martin Frank Hansen (MHQ) wrote:
>>
Sorry forgot to mention that we are using Solr 7.5.
-----Original Message-----
From: Martin Frank Hansen (MHQ)
Sent: 26 February 2019 07:43
To: solr-user@lucene.apache.org
Subject: RE: MLT and facetting
Hi Edwin,
Thanks for your response.
Yes you are right
On Mon, 25 Feb 2019 at 20:54, Martin Frank Hansen (MHQ) wrote:
> Hi,
>
>
>
> I am trying to combine the mlt functionality with facets, but Solr
> throws
> org.apache.solr.common.SolrException: "Unable to compute facet
> ranges, facet context is not set".
in solrconfig.xml?
Regards,
Edwin
On Tue, 26 Feb 2019 at 14:43, Martin Frank Hansen (MHQ) wrote:
> Hi Edwin,
>
> Thanks for your response.
>
> Yes you are right. It was simply the search parameters from Solr.
>
> The query looks like this:
>
> http://
> .../solr/.../mlt?df
.
Regards,
Edwin
On Thu, 28 Feb 2019 at 14:51, Martin Frank Hansen (MHQ) wrote:
> Hi Edwin,
>
> Ok that is nice to know. Do you know when this bug will get fixed?
>
> By ordering I mean that MLT scores the documents according to its
> similarity function (believe it is cosine
according to the number of
> occurrences. But I'm not sure how it will affect the MLT score or how
> it will be output when combined, as it is not working
> currently and there is no way to test.
>
> Regards,
> Edwin
>
>
> On Thu, 28 Feb 2019 at 14:51, Martin Frank Hansen (MHQ) wrote:
>
>> Hi Edwin,
>>
>> Ok that is nice to know. Do you know when this bug will get fi
the same problem in Solr 7.7 if I turn on faceting in /mlt
requestHandler.
Found this issue in the JIRA:
https://issues.apache.org/jira/browse/SOLR-7883
Seems like it is a bug in Solr and it has not been resolved yet.
Regards,
Edwin
On Tue, 26 Feb 2019 at 21:03, Martin Frank Hansen (MHQ) wrote:
>
Hi,
I am trying to combine the mlt functionality with facets, but Solr throws
org.apache.solr.common.SolrException: "Unable to compute facet ranges, facet
context is not set".
What I am trying to do is quite simple, find similar documents using mlt and
group these using the facet parameter.
before, so I'm not sure how it works.
For the ordering of the documents, do you mean to sort them according to the
criteria that you want?
Regards,
Edwin
On Wed, 27 Feb 2019 at 14:43, Martin Frank Hansen (MHQ) wrote:
> Hi Edwin,
>
> Thanks for your response. Are you sure it
Hi,
I hope someone can help me. I am trying to make an incremental update to one
document using the API, but cannot make it work. I have tried a lot of things,
and all I actually want is to increment the value of the field “clicks” by one.
I have something like this:
uot;docid","clicks":{“inc”:"1"}}]
In an /update?commit=true
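For reference, the body of such an atomic update has to be valid JSON, which means plain ASCII double quotes; typographic quotes (“ ”) are not accepted by JSON parsers. The sketch below only builds the payload, under stated assumptions: the unique key field is assumed to be called `id`, `docid` is a placeholder value, and the helper name is hypothetical. The `inc` modifier itself is the one described in Solr's "Updating Parts of Documents" guide.

```python
import json


def build_increment_update(doc_id, field, amount=1, key_field="id"):
    """Return the JSON body for an atomic 'inc' update of one document.

    key_field="id" is an assumption; use the uniqueKey of your schema.
    """
    return json.dumps([{key_field: doc_id, field: {"inc": amount}}])


payload = build_increment_update("docid", "clicks")
print(payload)
```

The resulting string would be sent as the raw request body of a POST to the `/update?commit=true` handler with `Content-Type: application/json`.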
Best regards
Thierry
See documentation here
https://lucene.apache.org/solr/guide/6_6/updating-parts-of-documents.html
> On 19 Mar 2019, at 08:14, Martin Frank Hansen (MHQ) wrote:
>
> Hi,
>
>
;,"clicks":{“inc”:"1"}}] in the raw body
hence using curl or another tool that lets you set the raw body, such as Postman.
Best regards
Thierry
> On 19 Mar 2019, at 08:59, Martin Frank Hansen (MHQ) wrote:
>
> Hi Thierry,
>
> Do you mean something like this?
>
> http://loc
without
highlighting.
> On 21.03.2019 at 17:05, Martin Frank Hansen (MHQ) wrote:
>
> Hi,
>
> I am wondering how highlighting in Solr performs when the number
> of documents gets large.
>
> Right now we have about 1 TB of data in all sorts of file types an
Hi,
I am wondering how highlighting in Solr performs when the number of
documents gets large.
Right now we have about 1 TB of data in all sorts of file types, and I was
wondering how storing these documents within Solr (for highlighting purposes)
will affect performance.
Is it
Hi,
I am trying to create an index on a small Linux server running Solr-7.5.0, but
keep running into problems.
When I try to index a file-folder of roughly 18 GB (18000 files) I get the
following error from the server:
java.lang.OutOfMemoryError: unable to create new native thread.
From the
. SolrClient is definitely a subject for heavy reuse.
On Tue, Feb 12, 2019 at 5:16 PM Martin Frank Hansen (MHQ)
wrote:
> Hi Mikhail,
>
> I am using Solrj but think I might have found the problem.
>
> I am doing an atomicUpdate on existing documents, and found out that I
> creat
did you get this error?
Usually it occurs in custom code with many new Thread() calls and is usually
cured with thread pooling.
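To illustrate the thread pooling mentioned above, here is a minimal sketch in Python rather than the SolrJ code from the thread. It assumes a pool size of 4 and uses a hypothetical `index_file` placeholder instead of a real indexing call; the point is only that a fixed pool bounds the number of native threads regardless of how many files are processed.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 4  # assumed bound; tune to what the server can sustain


def index_file(path, seen_threads):
    """Placeholder for the per-file indexing work (hypothetical)."""
    # Record which worker thread handled this file.
    seen_threads.add(threading.get_ident())
    return path


def index_all(paths, max_workers=MAX_WORKERS):
    """Process all paths with a fixed-size pool instead of one thread each."""
    seen_threads = set()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda p: index_file(p, seen_threads), paths))
    return results, seen_threads


paths = [f"file-{i}.txt" for i in range(1000)]
results, seen_threads = index_all(paths)
print(len(results), "files handled by", len(seen_threads), "threads")
```

The same idea applies to client objects: create one client and share it across tasks, rather than creating a new one per document.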
On Tue, Feb 12, 2019 at 3:25 PM Martin Frank Hansen (MHQ)
wrote:
> Hi,
>
> I am trying to create an index on a small Linux server running
> Solr-7.5.0, but
Hi,
I am having some difficulties making highlighting work. For some reason the
highlighting feature only works on some fields but not on other fields even
though these fields are stored.
An example of a request looks like this:
Please try hl.method=unified and tell us if that helps.
~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley
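A request that follows this suggestion could be assembled as in the sketch below. The host and port match the localhost URL used earlier in these threads, but the core name `mycore` and the query string are hypothetical; `hl`, `hl.fl` and `hl.method` are the standard Solr highlighting parameters, and `Sagstitel` is the field named elsewhere in the thread.

```python
from urllib.parse import urlencode


def build_highlight_query(base_url, query, fields, method="unified"):
    """Build a /select URL with highlighting enabled on the given fields."""
    params = {
        "q": query,
        "hl": "true",
        "hl.fl": ",".join(fields),  # fields to highlight
        "hl.method": method,        # highlighter implementation to use
    }
    return f"{base_url}/select?{urlencode(params)}"


# "mycore" and the query are placeholders, not values from the thread.
url = build_highlight_query(
    "http://localhost:8983/solr/mycore", "Sagstitel:test", ["Sagstitel"]
)
print(url)
```

Comparing the response for `method="unified"` with the one for `method="original"` is a quick way to check whether the choice of highlighter is what makes the difference for a given field.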
On Mon, Jun 3, 2019 at 4:06 AM Martin Frank Hansen (MHQ) wrote:
> Hi,
>
> I am having some difficulties making highlighting work. For some
6.2019 at 10:06, Martin Frank Hansen (MHQ) wrote:
>
> Hi,
>
> I am having some difficulties making highlighting work. For some reason the
> highlighting feature only works on some fields but not on other fields even
> though these fields are stored.
>
> An example of a re
of the
documents?
> On 03.06.2019 at 10:06, Martin Frank Hansen (MHQ) wrote:
>
> Hi,
>
> I am having some difficulties making highlighting work. For some reason the
> highlighting feature only works on some fields but not on other fields even
> though these fields are s
using for the field “Sagstitel”? Is it the same as other
fields?
Regards,
Edwin
On Mon, 3 Jun 2019 at 16:06, Martin Frank Hansen (MHQ) wrote:
> Hi,
>
> I am having some difficulties making highlighting work. For some
> reason the highlighting feature only works on some fields but no
ype definition of those
> fields? Could this word have been omitted or wrongly encoded during
> loading of the documents?
>
> > On 03.06.2019 at 10:06, Martin Frank Hansen (MHQ) wrote:
> >
> > Hi,
> >
> > I am having some difficulties making highlighting work. For some
Hi,
I was wondering how others are handling Solr injection in their solutions.
After reading this post:
https://www.waratek.com/apache-solr-injection-vulnerability-customer-alert/ I
can see how important it is to update to Solr-8.2 or higher.
Has anyone been successful in injecting