Re: NPE in QueryComponent.mergeIds when using timeAllowed and sorting SOLR 8.7

2021-03-03 Thread Tomás Fernández Löbbe
; org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:214) > > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2627) > > > > > > Can this be fixed in a patch for Solr 8.8? I do not want to have to go > back to Solr 6 and reindex the system, that takes 2 days using 180 EMR > instances. > > > > Please advise. Thank you. > >

Re: NPE in QueryComponent.mergeIds when using timeAllowed and sorting SOLR 8.7

2021-03-01 Thread Phill Campbell
e fixed in a patch for Solr 8.8? I do not want to have to go back > to Solr 6 and reindex the system, that takes 2 days using 180 EMR instances. > > Please advise. Thank you.

NPE in QueryComponent.mergeIds when using timeAllowed and sorting SOLR 8.7

2021-02-24 Thread Phill Campbell
equest(RequestHandlerBase.java:214) at org.apache.solr.core.SolrCore.execute(SolrCore.java:2627) Can this be fixed in a patch for Solr 8.8? I do not want to have to go back to Solr 6 and reindex the system, that takes 2 days using 180 EMR instances. Please advise. Thank you.

Re: Using multiple language stop words in Solr Core

2021-02-11 Thread Markus Jelsma
Hello Abhay, Do not enable stopwords unless you absolutely know what you are doing. In general, it is a bad practice that somehow still lingers on. But to answer the question, you must have one field and fieldType for each language, so language specific filters go there. Also, using edismax
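
Markus's advice above (one field and fieldType per language) can be sketched in the schema roughly as follows; the field names and stopword file paths are illustrative, not from the original thread:

```xml
<!-- Illustrative sketch: one fieldType per language, each with its own
     language-specific stopword list and filters. -->
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_en.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_fr.txt"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_fr.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="title_en" type="text_en" indexed="true" stored="true"/>
<field name="title_fr" type="text_fr" indexed="true" stored="true"/>
```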

Using multiple language stop words in Solr Core

2021-02-11 Thread Abhay Kumar
Hello Team, Solr provides some data types out of the box in the managed schema for different languages such as English, French, Japanese, etc. We are using the common data type "text_general" for field declarations and using stopwords.txt for stopword

Re: SSL using CloudSolrClient

2021-02-03 Thread ChienHuaWang
Thanks for the information. Could you advise whether CloudSolrClient is compatible with non-TLS? Even if the client is not configured, can it still connect to Solr (TLS enabled)? -- Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: SSL using CloudSolrClient

2021-02-03 Thread Jörn Franke
schrieb ChienHuaWang : >> >> Hi, >> >> I am implementing SSL between Solr and Client communication. The clients >> connect to Solr via CloudSolrClient >> >> According to doc >> <https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html#index-a-do

Re: SSL using CloudSolrClient

2021-02-03 Thread Jörn Franke
onnect to Solr via CloudSolrClient > > According to doc > <https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html#index-a-document-using-cloudsolrclient> > > , the passwords should also be set in clients. > However, in testing, client is still working well without

SSL using CloudSolrClient

2021-02-03 Thread ChienHuaWang
Hi, I am implementing SSL between Solr and Client communication. The clients connect to Solr via CloudSolrClient According to doc <https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html#index-a-document-using-cloudsolrclient> , the passwords should also be set in clients. H

Re: Change uniqueKey using SolrJ

2021-02-01 Thread Jason Gerlowski
Hi, SolrJ doesn't have any purpose-made request class to change the uniqueKey, afaict. However doing so is still possible (though less convenient) using the "GenericSolrRequest" class, which can be used to hit arbitrary Solr APIs. If you'd like to see better support for this in S

Re: Getting Solr's statistic using SolrJ

2021-02-01 Thread Jason Gerlowski
quest = new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/metrics/history", params); final SimpleSolrResponse response = request.process(solrClient); Hope that helps, Jason On Fri, Jan 22, 2021 at 11:21 AM Gael Jourdan-Weil wrote: > > Hello Steven, > > I believe what y
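
The SolrJ snippet quoted above corresponds to a plain HTTP call against the Metrics History API; a minimal sketch, assuming a node listening on the default port:

```
curl "http://localhost:8983/solr/admin/metrics/history?action=list"
```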

Re: Is there way to autowarm new searcher using recently ran queries

2021-01-28 Thread Chris Hostetter
: I am wondering if there is a way to warmup new searcher on commit by : rerunning queries processed by the last searcher. May be it happens by : default but then I can't understand why we see high query times if those : searchers are being warmed. it only happens by default if you have an

Re: Is there way to autowarm new searcher using recently ran queries

2021-01-27 Thread Joel Bernstein
Typically what you would do is add static warming queries to warm all the caches. These queries are hardcoded into the solrconfig.xml. You'll want to run the facets you're using in the warming queries particularly facets on string fields. Once you add these it will take longer to warm the new
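
Joel's suggestion of static warming queries can be sketched in solrconfig.xml like this; the facet field name is a placeholder, not from the thread:

```xml
<!-- Illustrative sketch: static warming queries fired whenever a new
     searcher opens, warming facets on a string field. -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="facet">true</str>
      <str name="facet.field">category_s</str>
    </lst>
  </arr>
</listener>
```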

Is there way to autowarm new searcher using recently ran queries

2021-01-27 Thread Pushkar Raste
Hi, A rookie question. We have a Solr cluster that doesn't get too much traffic. We see that our queries take long time unless we run a script to send more traffic to Solr. We are indexing data all the time and use autoCommit. I am wondering if there is a way to warmup new searcher on commit by

RE: Getting Solr's statistic using SolrJ

2021-01-22 Thread Gael Jourdan-Weil
Hello Steven, I believe what you are looking for cannot be accessed using SolrJ (I didn't really check though). But you can easily access it either via the Collections APIs and/or the Metrics API depending on what you need exactly. See https://lucene.apache.org/solr/guide/8_4/cluster-node

Getting Solr's statistic using SolrJ

2021-01-22 Thread Steven White
each core, etc. etc. using SolrJ API. Thanks Steven

Change uniqueKey using SolrJ

2021-01-22 Thread Timo Grün
Hi All, I’m currently trying to change the uniqueKey of my Solr Cloud schema using Solrj. While creating new Fields and FieldDefinitions is pretty straight forward, I struggle to find any solution to change the Unique Key field with Solrj. Any advice here? Best Regards, Timo Gruen

Re: Exact matching without using new fields

2021-01-21 Thread Alexandre Rafalovitch
rmation retrieval END > > START advanced information retrieval with solr END > > > > And with our custom query parser, when an EXACT operator is found, I > > tokenize the query to match the first case. Otherwise pass it through. > > > > Needs custom analyzers on the

Re: Exact matching without using new fields

2021-01-21 Thread Doss
> Needs custom analyzers on the query and index sides to generate the > > correct token sequences. > > > > It's worked out well for our case. > > > > Dave > > > > > > > > > > From: gnandre

Re: Exact matching without using new fields

2021-01-19 Thread gnandre
first case. Otherwise pass it through. > > Needs custom analyzers on the query and index sides to generate the > correct token sequences. > > It's worked out well for our case. > > Dave > > > > > From: gnandre > Sent: Tuesda

Re: Exact matching without using new fields

2021-01-19 Thread David R
It's worked out well for our case. Dave From: gnandre Sent: Tuesday, January 19, 2021 4:07 PM To: solr-user@lucene.apache.org Subject: Exact matching without using new fields Hi, I am aware that to do exact matching (only whatever is provided inside double quotes should be matched) in Solr, w

Exact matching without using new fields

2021-01-19 Thread gnandre
Hi, I am aware that to do exact matching (only whatever is provided inside double quotes should be matched) in Solr, we can copy existing fields with the help of copyFields into new fields that have very minimal tokenization or no tokenization (e.g. using KeywordTokenizer or using string field
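
The copyField approach described above can be sketched in the schema as follows (field names are illustrative); exact-match queries would then target the minimally-tokenized twin field:

```xml
<!-- Illustrative sketch: copy an analyzed field into a string-typed twin
     so quoted queries can be matched exactly against it. -->
<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="title_exact" type="string" indexed="true" stored="false"/>
<copyField source="title" dest="title_exact"/>
```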

Re: Solr using all available CPU and becoming unresponsive

2021-01-12 Thread Charlie Hull
for OS page cache) 2. disable swap, if you can (this is esp. important if using network storage as swap). There are potential downsides to this (so proceed with caution); but if part of your heap gets swapped out (and it almost certainly will, with a sufficiently large heap) full GCs lead to a swap storm

Re: Solr using all available CPU and becoming unresponsive

2021-01-12 Thread Michael Gibney
for OS page cache) 2. disable swap, if you can (this is esp. important if using network storage as swap). There are potential downsides to this (so proceed with caution); but if part of your heap gets swapped out (and it almost certainly will, with a sufficiently large heap) full GCs lead to a swap

Re: Solr using all available CPU and becoming unresponsive

2021-01-12 Thread Jeremy Smith
(and maybe the StopFilterFactory from the index section as well)? Thanks again, Jeremy From: Michael Gibney Sent: Monday, January 11, 2021 8:30 PM To: solr-user@lucene.apache.org Subject: Re: Solr using all available CPU and becoming unresponsive Hi Jeremy, Can

Re: Solr using all available CPU and becoming unresponsive

2021-01-11 Thread Michael Gibney
t; For the filterCache, we have tried sizes as low as 128, which caused our > CPU usage to go up and didn't solve our issue. autowarmCount used to be > much higher, but we have reduced it to try to address this issue. > > > The behavior we see: > > Solr is normally

Solr using all available CPU and becoming unresponsive

2021-01-11 Thread Jeremy Smith
have tried sizes as low as 128, which caused our CPU usage to go up and didn't solve our issue. autowarmCount used to be much higher, but we have reduced it to try to address this issue. The behavior we see: Solr is normally using ~3-6GB of heap and we usually have ~20GB of free memory

Re: Possible bug on LTR when using solr 8.6.3 - index out of bounds DisiPriorityQueue.add(DisiPriorityQueue.java:102)

2021-01-06 Thread Florin Babes
.com/apache/lucene-solr/blob/releases/lucene-solr/8.6.3/solr/contrib/ltr/src/java/org/apache/solr/ltr/feature/SolrFeature.java#L243 > [3] > https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.6.3/solr/contrib/ltr/src/java/org/apache/solr/ltr/LTRScoringQuery.java#L520-L525 >

Re:Possible bug on LTR when using solr 8.6.3 - index out of bounds DisiPriorityQueue.add(DisiPriorityQueue.java:102)

2021-01-05 Thread Christine Poerschke (BLOOMBERG/ LONDON)
-L525 From: solr-user@lucene.apache.org At: 01/04/21 17:31:44To: solr-user@lucene.apache.org Subject: Possible bug on LTR when using solr 8.6.3 - index out of bounds DisiPriorityQueue.add(DisiPriorityQueue.java:102) Hello, We are trying to update Solr from 8.3.1 to 8.6.3. On Solr 8.3.1 we

Possible bug on LTR when using solr 8.6.3 - index out of bounds DisiPriorityQueue.add(DisiPriorityQueue.java:102)

2021-01-04 Thread Florin Babes
Hello, We are trying to update Solr from 8.3.1 to 8.6.3. On Solr 8.3.1 we are using LTR in production using a MultipleAdditiveTrees model. On Solr 8.6.3 we receive an error when we try to compute some SolrFeatures. We didn't find any pattern of the queries that fail. Example: We have the following

Suggester using up memory

2020-11-20 Thread Nick Vercammen
Hey, We have a problem on one of our installations with the suggestComponent. The index has about 16 million documents and contains a "Global" field which contains the data of multiple other fields. This "Global" field is used to build up the suggestions. A short time after starting Solr it is

Re: Using fromIndex for single collection

2020-11-19 Thread Jason Gerlowski
Hi Irina, Yes, the "fromIndex" parameter can be used to perform a join from the host collection to a separate, single-shard collection in SolrCloud. If specified, this "fromIndex" collection must be present on whichever host is processing the request. (Often this involves over-replicating your
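
Jason's description translates into a join query along these lines; the collection and field names below are placeholders, not from the thread:

```
q={!join from=dept_id to=dept_id fromIndex=departments}manager_name:smith
```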

RE: Using Multiple collections with streaming expressions

2020-11-12 Thread ufuk yılmaz
Many thanks for the info Joel --ufuk Sent from Mail for Windows 10 From: Joel Bernstein Sent: 12 November 2020 17:00 To: solr-user@lucene.apache.org Subject: Re: Using Multiple collections with streaming expressions T

Re: Using Multiple collections with streaming expressions

2020-11-12 Thread Joel Bernstein
> > From: Erick Erickson > Sent: 10 November 2020 16:48 > To: solr-user@lucene.apache.org > Subject: Re: Using Multiple collections with streaming expressions > > Y > >

RE: Using Multiple collections with streaming expressions

2020-11-10 Thread ufuk yılmaz
16:48 To: solr-user@lucene.apache.org Subject: Re: Using Multiple collections with streaming expressions Y

Re: Using Multiple collections with streaming expressions

2020-11-10 Thread Erick Erickson
You need to open multiple streams, one to each collection then combine them. For instance, open a significantTerms stream to collection1, another to collection2 and wrap both in a merge stream. Best, Erick > On Nov 9, 2020, at 1:58 PM, ufuk yılmaz wrote: > > For example the streaming
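
Erick's suggestion, as a rough streaming-expression sketch: merge requires its input streams sorted on the join key, hence the sort wrappers, and the tuple field name ("term") follows the significantTerms output. Collection names are placeholders.

```
merge(
  sort(significantTerms(collection1, q="body:Solr", field="author", limit="50"), by="term asc"),
  sort(significantTerms(collection2, q="body:Solr", field="author", limit="50"), by="term asc"),
  on="term asc"
)
```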

Using Multiple collections with streaming expressions

2020-11-09 Thread ufuk yılmaz
For example the streaming expression significantTerms: https://lucene.apache.org/solr/guide/8_4/stream-source-reference.html#significantterms significantTerms(collection1, q="body:Solr", field="author", limit="50",

solr-exporter using string arrays - 2

2020-11-03 Thread Maximilian Renner
Sorry for the bad format of the first mail, once again: Hello there, while playing around with the https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml I found a bug when trying to use string arrays like 'facet.field':

solr-exporter using string arrays

2020-11-03 Thread Maximilian Renner
Hello there, while playing around with the https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml I found a bug when trying to use string arrays like 'facet.field':

Using fromIndex for single collection

2020-10-07 Thread Irina Kamalova
I suppose my question is very simple. Am I right that if I want to use joins in the single collection in SolrCloud across several shards, I need to use semantic "fromIndex"? According to documentation I should use it only if I have different collections. I have one single collection across

Re: Help using Noggit for streaming JSON data

2020-10-07 Thread Christopher Schultz
Yonik, Thanks for the reply, and apologies for the long delay in this reply. Also apologies for top-posting, I’m writing from my phone. :( Oh, of course... simply subclass the CharArr. In my case, I should be able to immediately base64-decode the value (saves 1/4 in-memory representation)

Re: Daylight savings time issue using NOW in Solr 6.1.0

2020-10-07 Thread Bernd Fehling
Hi, because you are using solr.in.cmd I guess you are using Windows OS. I don't know much about Solr and Windows but you can check your Windows, Jetty and Solr time by looking at your solr-8983-console.log file after starting Solr. First the timestamp of the file itself, then the timestamp

RE: Using streaming expressions with shards filter

2020-10-07 Thread Gael Jourdan-Weil
Thanks Joel. I will try it in the future if I still need it (for now I went for another solution that fits my needs). Gaël

Daylight savings time issue using NOW in Solr 6.1.0

2020-10-07 Thread vishal patel
Hi I am using Solr 6.1.0. My SOLR_TIMEZONE=UTC in solr.in.cmd. My current Solr server machine time zone is also UTC. My one collection has below one field in schema. Suppose my current Solr server machine time is 2020-10-01 10:00:00.000. I have one document in that collection
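
One knob worth checking in this situation: date-math rounding with NOW honors the TZ request parameter, so a day-rounded range can be pinned to a specific zone explicitly. The field name below is illustrative:

```
q=recorddate:[NOW/DAY TO NOW/DAY+1DAY]&TZ=UTC
```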

Re: Using streaming expressions with shards filter

2020-10-06 Thread Joel Bernstein
;> I expected to be able to use the "shards" parameter like on a regular >> query on "/select" for instance but this appear to not work or I don't know >> how to do it. >> >> Is this somehow a feature/restriction of Streaming expressions? >> Or am I mi

Re: Using streaming expressions with shards filter

2020-10-06 Thread Joel Bernstein
ng expressions? > Or am I missing something? > > Note that the Streaming Expression I use is actually using the "/export" > request handler. > > Example of the streaming expression: > curl -X POST -v --data-urlencode > 'expr=search(myCollection,q="*:*",fl="id"

Re: Daylight savings time issue using NOW in Solr 6.1.0

2020-10-04 Thread vishal patel
Hello, Can anyone help me? Regards, Vishal Sent from Outlook<http://aka.ms/weboutlook> From: vishal patel Sent: Thursday, October 1, 2020 4:51 PM To: solr-user@lucene.apache.org Subject: Daylight savings time issue using NOW in Solr 6.1.0 Hi I am usin

Using streaming expressions with shards filter

2020-10-01 Thread Gael Jourdan-Weil
is somehow a feature/restriction of Streaming expressions? Or am I missing something? Note that the Streaming Expression I use is actually using the "/export" request handler. Example of the streaming expression: curl -X POST -v --data-urlencode 'expr=search(myCollection,q="*:*

Daylight savings time issue using NOW in Solr 6.1.0

2020-10-01 Thread vishal patel
Hi I am using Solr 6.1.0. My SOLR_TIMEZONE=UTC in solr.in.cmd. My current Solr server machine time zone is also UTC. My one collection has below one field in schema. Suppose my current Solr server machine time is 2020-10-01 10:00:00.000. I have one document in that collection

Using Autoscaling Simulation Framework to simulate a lost node in a cluster

2020-09-21 Thread Howard Gonzalez
Hello folks, has anyone tried to use the autoscaling simulation framework to simulate a lost node in a solr cluster? I was trying to do the following: 1.- Take a current production cluster state snapshout using bin/solr autoscaling -save 2.- Modify the clusterstate and livenodes json files

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Steven White
.org/solr/guide/8_6/update-request-processors.html > and > >> see the extensive list of processors you can leverage. The specific > >> mentioned one is this one: > >> > https://lucene.apache.org/solr/8_6_0//solr-core/org/apache/solr/update/processor/StatelessScrip

Re: Handling failure when adding docs to Solr using SolrJ

2020-09-17 Thread Erick Erickson
I recommend _against_ issuing explicit commits from the client, let your solrconfig.xml autocommit settings take care of it. Make sure either your soft or hard commits open a new searcher for the docs to be searchable. I’ll bend a little bit if you can _guarantee_ that you only ever have one
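
Erick's advice maps onto the commit settings in solrconfig.xml; a minimal sketch, with placeholder intervals to tune: the hard commit guarantees durability without opening a searcher, while the soft commit makes documents visible.

```xml
<!-- Illustrative sketch: durability via hard commits, visibility via soft commits. -->
<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>5000</maxTime>
</autoSoftCommit>
```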

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Erick Erickson
>> You can read all about it at: >> https://lucene.apache.org/solr/guide/8_6/update-request-processors.html and >> see the extensive list of processors you can leverage. The specific >> mentioned one is this one: >> https://lucene.apache.org/solr/8_6_0//solr-core/org/apache

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Walter Underwood
specific > mentioned one is this one: > https://lucene.apache.org/solr/8_6_0//solr-core/org/apache/solr/update/processor/StatelessScriptUpdateProcessorFactory.html > > Just a word of warning that Stateless URP is using Javascript, which is > getting a bit of a complicated story as underlying

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Alexandre Rafalovitch
/update/processor/StatelessScriptUpdateProcessorFactory.html Just a word of warning that Stateless URP is using Javascript, which is getting a bit of a complicated story as underlying JVM is upgraded (Oracle dropped their javascript engine in JDK 14). So if one of the simpler URPs will do the job

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Steven White
e line. > > > > > > For instance, I’ve seen designs where instead of > > > field1:some_value > > > field2:other_value…. > > > > > > you use a single field with _tokens_ like: > > > field:field1_some_value > > > field:field2_oth

Handling failure when adding docs to Solr using SolrJ

2020-09-17 Thread Steven White
Hi everyone, I'm trying to figure out when and how I should handle failures that may occur during indexing. In the sample code below, look at my comment and let me know what state my index is in when things fail: SolrClient solrClient = new HttpSolrClient.Builder(url).build();

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Erick Erickson
value…. > > > > you use a single field with _tokens_ like: > > field:field1_some_value > > field:field2_other_value > > > > that drops the complexity and increases performance. > > > > Anyway, just a thought you might want to co

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Steven White
e. > > Anyway, just a thought you might want to consider. > > Best, > Erick > > > On Sep 16, 2020, at 9:31 PM, Steven White wrote: > > > > Hi everyone, > > > > I figured it out. It is as simple as creating a List and using > > that as the v

Re: Help using Noggit for streaming JSON data

2020-09-17 Thread Yonik Seeley
See this method: /** Reads a JSON string into the output, decoding any escaped characters. */ public void getString(CharArr output) throws IOException And then the idea is to create a subclass of CharArr to incrementally handle the string that is written to it. You could overload write

Help using Noggit for streaming JSON data

2020-09-17 Thread Christopher Schultz
All, Is this an appropriate forum for asking questions about how to use Noggit? The Github doesn't have any discussions available and filing an "issue" to ask a question is kinda silly. I'm happy to be redirected to the right place if this isn't appropriate. I've been able to figure out most

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Erick Erickson
. Anyway, just a thought you might want to consider. Best, Erick > On Sep 16, 2020, at 9:31 PM, Steven White wrote: > > Hi everyone, > > I figured it out. It is as simple as creating a List and using > that as the value part for SolrInputDocument.addField() API. >

Re: Doing what <copyField> does using SolrJ API

2020-09-16 Thread Steven White
Hi everyone, I figured it out. It is as simple as creating a List and using that as the value part for SolrInputDocument.addField() API. Thanks, Steven On Wed, Sep 16, 2020 at 9:13 PM Steven White wrote: > Hi everyone, > > I want to avoid creating a source="OneFieldOfMany&q
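
Steven's fix, as a minimal SolrJ sketch (field names and values are illustrative; requires the solr-solrj artifact on the classpath):

```java
import java.util.Arrays;
import org.apache.solr.common.SolrInputDocument;

public class MultiValuedExample {
  public static void main(String[] args) {
    // Passing a List as the value populates a multiValued field, similar in
    // effect to several <copyField> sources feeding one destination field.
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "42");
    doc.addField("all_text", Arrays.asList("value-from-field1", "value-from-field2"));
    System.out.println(doc.getFieldValues("all_text"));
  }
}
```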

Doing what <copyField> does using SolrJ API

2020-09-16 Thread Steven White
Hi everyone, I want to avoid creating a <copyField> in my schema (there will be over 1000 of them and maybe more so managing it will be a pain). Instead, I want to use SolrJ API to do what <copyField> does. Any example of how I can do this? If there is an example online, that would be great. Thanks in advance.

NullPointerException in IndexSearcher.explain() when using ComplexPhraseQueryParser

2020-09-09 Thread Michał Słomkowski
Hello, I get NPE when I use IndexSearcher.explain(). Checked with Lucene 8.6.0 and 8.6.2. The query: (lorem AND NOT "dolor lorem") OR ipsum The text: dolor lorem ipsum Stack trace: > java.lang.NullPointerException > at java.util.Objects.requireNonNull(Objects.java:203) > at

Retrieving Parent and Child Documents using the Block Join Query Technique when the Child and Parent Document have an identical field

2020-09-08 Thread Nagaraj S
Hi Solr Team, I am trying to retrieve the parent document by using the Block Join Parent Query Parser (q={!parent which=allParents}someChildren), but the filter condition I gave has the same field in both the parent and the child document, so the parser is throwing the error: "

HEY, are you using the Analytics contrib?

2020-09-03 Thread David Smiley
I wonder who is using the Analytics contrib? Why do you use it instead of other Solr features like the JSON Faceting module that seem to have competing functionality. My motivation is to ascertain if it ought to be maintained as a 3rd party plugin/package or remain as a 1st party contrib where

RE: Using Solr's zkcli.sh

2020-09-02 Thread Victor Kretzer
Vincent -- Your suggestion worked perfectly. After using chmod I'm now able to use the zkcli script. Thank you so much for the quick save. Victor Victor Kretzer Sitecore Developer Application Services GDC IT Solutions Office: 717-262-2080 ext. 151 www.gdcitsolutions.com -Original

Re: Using Solr's zkcli.sh

2020-09-02 Thread Vincent Brehin
commands, including zkcli. So you should first launch "sudo chmod a+x server/scripts/cloud-scripts/zkcli.sh" , then you should be able to use the command. Let us know ! Vincent On Tue, Sep 1, 2020 at 11:35 PM, Victor Kretzer wrote: > Thank you in advance. This is my first time using a
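
Vincent's chmod fix can be exercised in isolation; the sketch below uses a throwaway placeholder file, since the real target would be server/scripts/cloud-scripts/zkcli.sh under the Solr install directory:

```shell
# Create a placeholder standing in for zkcli.sh, grant the execute bit, verify.
SCRIPT=$(mktemp)
printf '#!/bin/sh\necho "zkcli placeholder"\n' > "$SCRIPT"
chmod a+x "$SCRIPT"   # the fix: sudo chmod a+x server/scripts/cloud-scripts/zkcli.sh
if [ -x "$SCRIPT" ]; then echo "executable: yes"; fi
rm -f "$SCRIPT"
```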

Using Solr's zkcli.sh

2020-09-01 Thread Victor Kretzer
Thank you in advance. This is my first time using a mailing list like this so hopefully I am doing so correctly. I am attempting to setup SolrCloud (Solr 6.6.6) and an external zookeeper ensemble on Azure. I have three dedicated to the zookeeper ensemble and two for solr all running Ubuntu

Re: PDF extraction using Tika

2020-08-26 Thread Walter Underwood
;Joe D. >>> >>> On 25/08/2020 10:54, Charlie Hull wrote: >>>> On 25/08/2020 06:04, Srinivas Kashyap wrote: >>>>> Hi Alexandre, >>>>> >>>>> Yes, these are the same PDF files running in windows and linux. There

RE: [EXT] Re: PDF extraction using Tika

2020-08-26 Thread Hanjan, Harinderdeep S.
) and if one is not responding, move on to the next one. This will also allow you to easily incorporate using multiple PDF extraction tools, should Tika fail on a PDF. The way this would work is something like this: - Your code sees a PDF - It sends the PDF to Tika Server - Tika Server parses the PDF

Re: PDF extraction using Tika

2020-08-26 Thread Jan Høydahl
> On 25/08/2020 10:54, Charlie Hull wrote: >>> On 25/08/2020 06:04, Srinivas Kashyap wrote: >>>> Hi Alexandre, >>>> >>>> Yes, these are the same PDF files running in windows and linux. There are >>>> around 30 pdf files and I tried indexing s

Re: PDF extraction using Tika

2020-08-26 Thread Charlie Hull
, these are the same PDF files running in windows and linux. There are around 30 pdf files and I tried indexing single file, but faced same error. Is it related to how PDF stored in linux? Did you try running Tika (the same version as you're using in Solr) standalone on the file as Alexandre suggested

RE: PDF extraction using Tika

2020-08-25 Thread Srinivas Kashyap
Thanks Phil, I will modify it according to the need. Thanks, Srinivas -Original Message- From: Phil Scadden Sent: 26 August 2020 02:44 To: solr-user@lucene.apache.org Subject: RE: PDF extraction using Tika Code for solrj is going to be very dependent on your needs but the beating

RE: PDF extraction using Tika

2020-08-25 Thread Phil Scadden
Admin", password); UpdateResponse ur = req.process(solr,"prindex"); req.commit(solr, "prindex"); -Original Message- From: Srinivas Kashyap Sent: Tuesday, 25 August 2020 17:04 To: solr-user@lucene.apache.org Subject: RE: PDF extraction usi

Re: PDF extraction using Tika

2020-08-25 Thread Joe Doupnik
Alexandre, Yes, these are the same PDF files running in windows and linux. There are around 30 pdf files and I tried indexing single file, but faced same error. Is it related to how PDF stored in linux? Did you try running Tika (the same version as you're using in Solr) standalone on the file

Re: PDF extraction using Tika

2020-08-25 Thread Charlie Hull
as you're using in Solr) standalone on the file as Alexandre suggested? And with regard to DIH and TIKA going away, can you share if any program which extracts from PDF and pushes into solr? https://lucidworks.com/post/indexing-with-solrj/ is one example. You should run Tika separately

RE: PDF extraction using Tika

2020-08-24 Thread Srinivas Kashyap
from PDF and pushes into solr? Thanks, Srinivas Kashyap -Original Message- From: Alexandre Rafalovitch Sent: 24 August 2020 20:54 To: solr-user Subject: Re: PDF extraction using Tika The issue seems to be more with a specific file and at the level way below Solr's or possibly even

Re: How to Write Autoscaling Policy changes to Zookeeper/SolrCloud using the autoscaling Java API

2020-08-24 Thread Howard Gonzalez
Good morning! To add more context on the question, I can successfully use the Java API to build the list of new Clauses. However, the problem that I have is that I don't know how to "write" those changes back to solr using the Java API. I see there's a writeMap method in the Po

Re: PDF extraction using Tika

2020-08-24 Thread Alexandre Rafalovitch
sed. On Mon, 24 Aug 2020 at 11:09, Srinivas Kashyap wrote: > > Hello, > > We are using TikaEntityProcessor to extract the content out of PDF and make > the content searchable. > > When jetty is run on windows based machine, we are able to successfully load > documents using

PDF extraction using Tika

2020-08-24 Thread Srinivas Kashyap
Hello, We are using TikaEntityProcessor to extract the content out of PDF and make the content searchable. When jetty is run on windows based machine, we are able to successfully load documents using full import DIH(tika entity). Here PDF's is maintained in windows file system. But when

How to Write Autoscaling Policy changes to Zookeeper/SolrCloud using the autoscaling Java API

2020-08-21 Thread Howard Gonzalez
Hello. I am trying to use the autoscaling Java API to write some cluster policy changes to a Zookeeper/SolrCloud cluster. However, I can't find the right way to do it. I can get all the autoscaling cluster policy clauses using: autoScalingConfig.getPolicy.getClusterPolicy However, after

Re: Manipulating client's query using a Query object

2020-08-17 Thread Erick Erickson
you have access to that too. >> >> Regards, >> Markus >> >> >> -Original message- >>> From:Edward Turner >>> Sent: Monday 17th August 2020 21:25 >>> To: solr-user@lucene.apache.org >>> Subject: Re: Manipulating c

Re: Manipulating client's query using a Query object

2020-08-17 Thread Edward Turner
ss to that too. > > Regards, > Markus > > > -Original message- > > From:Edward Turner > > Sent: Monday 17th August 2020 21:25 > > To: solr-user@lucene.apache.org > > Subject: Re: Manipulating client's query using a Query object > > > > H

RE: Manipulating client's query using a Query object

2020-08-17 Thread Markus Jelsma
variable (i think it was qstr) that contains the original input string. In there you have access to that too. Regards, Markus -Original message- > From:Edward Turner > Sent: Monday 17th August 2020 21:25 > To: solr-user@lucene.apache.org > Subject: Re: Manipulating client's

Re: Manipulating client's query using a Query object

2020-08-17 Thread Edward Turner
Hi Markus, That's really great info. Thank you. Supposing we've now modified the Query object, do you know how we would get the corresponding query String, which we could then forward to our Solrcloud via SolrClient? (Or should we be using this extended ExtendedDisMaxQParser class server side

RE: Manipulating client's query using a Query object

2020-08-17 Thread Markus Jelsma
- > From:Edward Turner > Sent: Monday 17th August 2020 15:53 > To: solr-user@lucene.apache.org > Subject: Manipulating client's query using a Query object > > Hi all, > > Thanks for all your help recently. We're now using the edismax query parser > and are happy with its behaviou

Manipulating client's query using a Query object

2020-08-17 Thread Edward Turner
Hi all, Thanks for all your help recently. We're now using the edismax query parser and are happy with its behaviour. We have another question which maybe someone can help with. We have one use case where we optimise our query before sending it to Solr, and we do this by manipulating

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Bram Van Dam
On 11/08/2020 13:15, Erick Erickson wrote: > CDCR is being deprecated. so I wouldn’t suggest it for the long term. Ah yes, thanks for pointing that out. That makes Dominique's alternative less attractive. I guess I'll stick to my original proposal! Thanks Erick :-) - Bram

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Dominique Bejean
>>>> > >>>> Been reading up about the various ways of creating backups. The whole > >>>> "shared filesystem for Solrcloud backups"-thing is kind of a no-go in > >>>> our environment, so I've been looking for ways around that, and her

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Erick Erickson
of a no-go in >>>> our environment, so I've been looking for ways around that, and here's >>>> what I've come up with so far: >>>> >>>> 1. Stop applications from writing to solr >>>> >>>> 2. Commit everything >>>> &g

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Dominique Bejean
backups"-thing is kind of a no-go in > >> our environment, so I've been looking for ways around that, and here's > >> what I've come up with so far: > >> > >> 1. Stop applications from writing to solr > >> > >> 2. Commit everything > >

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Erick Erickson
ications from writing to solr >> >> 2. Commit everything >> >> 3. Identify a single core for each shard in each collection >> >> 4. Snapshot that core using CREATESNAPSHOT in the Collections API >> >> 5. Once complete, re-enable application write a

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Dominique Bejean
from writing to solr > > 2. Commit everything > > 3. Identify a single core for each shard in each collection > > 4. Snapshot that core using CREATESNAPSHOT in the Collections API > > 5. Once complete, re-enable application write access to Solr > > 6. Create a back

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-10 Thread Ashwin Ramesh
've been looking for ways around that, and here's > what I've come up with so far: > > 1. Stop applications from writing to solr > > 2. Commit everything > > 3. Identify a single core for each shard in each collection > > 4. Snapshot that core using CREATESNAPSHOT in

Backups in SolrCloud using snapshots of individual cores?

2020-08-06 Thread Bram Van Dam
ng to solr 2. Commit everything 3. Identify a single core for each shard in each collection 4. Snapshot that core using CREATESNAPSHOT in the Collections API 5. Once complete, re-enable application write access to Solr 6. Create a backup from these snapshots using the replication handler's backu
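
Steps 4 and 6 of the plan above correspond roughly to these API calls; collection, core, and path names are placeholders, not from the thread:

```
# 4. snapshot the collection (Collections API)
curl "http://localhost:8983/solr/admin/collections?action=CREATESNAPSHOT&collection=myCollection&commitName=backup1"
# 6. per-core backup from that snapshot via the replication handler
curl "http://localhost:8983/solr/myCollection_shard1_replica_n1/replication?command=backup&commitName=backup1&location=/backups"
```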

Re: Querying solr using many QueryParser in one call

2020-07-20 Thread Charlie Hull
with strategies like the cacheing you describe. Charlie On 16/07/2020 18:14, harjag...@gmail.com wrote: Hi All, Below are question regarding querying solr using many QueryParser in one call. We have need to do a search by keyword and also include few specific documents to result. We don't want to use

How do I use dismax or edismax to rank using 60% tf-idf and 40% a numeric field?

2020-07-16 Thread Russell Jurney
Hello Solarians, I know how to boost a query and I see the methods for tf and idf in streaming scripting. What I don’t know is how to incorporate these things together at a specific percentage of the ranking function. How do I write a query to use dismax or edismax to rank using 60% tf-idf score
