Re: Streaming In Solr

2018-11-14 Thread Lucky Sharma
Hi Prakhar, Thanks for the reply, but what I am actually curious to know is how it is implemented internally. On Wed, Nov 14, 2018 at 12:47 PM Prakhar Nigam wrote: > > Hi Lucky, Prakhar here > > we have met at a training at Mahindra Comviva. I have found this article it > may be a little helpful >

Re: ExtractRequestHandler and Tika. Get only plain text

2018-11-14 Thread Jan Høydahl
Have you tried to specify =text -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > On 14 Nov 2018 at 12:09, marotosg wrote: > > Hi all, > > Currently I am trying to do index documents from different kinds with Solr > and tika. It's working fine but when solr returns

ExtractRequestHandler and Tika. Get only plain text

2018-11-14 Thread marotosg
Hi all, Currently I am trying to index documents of different kinds with Solr and Tika. It's working fine, but when Solr returns the content of the document, it doesn't return only the plain text. It comes back with some metadata as well. For instance my request.

Median in Solr json facet api

2018-11-14 Thread Anil
Hi, Good morning. I don't see a median aggregation in the JSON Facet API documentation. Could you please point me to the documentation for creating custom JSON facet APIs? Thanks. Regards, Anil

Re: SolrCloud scaling/optimization for high request rate

2018-11-14 Thread Toke Eskildsen
On Mon, 2018-11-12 at 14:19 +0200, Sofiya Strochyk wrote: > I'll check if the filter queries or the main query tokenizers/filters > might have anything to do with this, but I'm afraid query > optimization can only get us so far. Why do you think that? As you tried eliminating sorting and

Re: querying on field of type string doesn't work as expected

2018-11-14 Thread Erick Erickson
No it doesn't match. You have to get the search in as a single term. You get a lot of information by adding =true and looking at your parsed query. Try myFieldName:"Some\ Text" or myFieldName:Some\ Text Best, Erick On Wed, Nov 14, 2018 at 4:02 PM Angel Todorov wrote: > > Hi guys, > > I have
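
As a rough illustration of the escaping suggested above, the following Python sketch builds a single-term query string by backslash-escaping whitespace and the usual Lucene query-parser special characters. The helper names `escape_term` and `term_query` are hypothetical and are not part of Solr or any client library; the field name and value come from the thread.

```python
import re

# Whitespace plus the characters the Lucene query parser treats specially.
_SPECIAL = re.compile(r'([\s+\-&|!(){}\[\]^"~*?:\\/])')


def escape_term(value: str) -> str:
    """Backslash-escape whitespace and query-syntax characters in a value."""
    return _SPECIAL.sub(r'\\\1', value)


def term_query(field: str, value: str) -> str:
    """Build a field:value query in which the value stays a single term."""
    return f"{field}:{escape_term(value)}"


if __name__ == "__main__":
    print(term_query("myFieldName", "Some Text"))
```

The resulting string would still need URL encoding before being sent as the `q` parameter of an HTTP request.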

Re: ClassNotFound indexing crypted documents

2018-11-14 Thread Shawn Heisey
On 11/13/2018 11:51 AM, Luca Vergantini wrote: Maybe I skipped the correct steps to open an issue, but here https://issues.apache.org/jira/browse/SOLR-12985 you can find the details. I think that is at least a configuration issue for the install script, but maybe is most hard. I detected this

querying on field of type string doesn't work as expected

2018-11-14 Thread Angel Todorov
Hi guys, I have SOLR 6.5, and a custom-defined field of type string (not text or text_general). In some document, that field has a value, for example "Some Text". When I query by myFieldName:"Some Text", I don't get any matches, but I think I should, because this matches

Re: Median in Solr json facet api

2018-11-14 Thread Toke Eskildsen
On Wed, 2018-11-14 at 17:53 +0530, Anil wrote: > I don't see median aggregation in JSON facet api documentation. It's the 50th percentile: https://lucene.apache.org/solr/guide/7_5/json-facet-api.html#metrics-example - Toke Eskildsen, Royal Danish Library
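
A minimal sketch of what such a request body might look like, assuming a numeric field named `price` (a hypothetical name, not taken from the thread). The helper `median_facet_request` is illustrative only; it just assembles the JSON that would be posted to a collection's query endpoint.

```python
import json


def median_facet_request(field: str, query: str = "*:*") -> str:
    """Build a JSON request body asking for the 50th percentile of a field."""
    body = {
        "query": query,
        "limit": 0,  # only the aggregation is wanted, not the documents
        "facet": {
            # percentile(field, 50) serves as the median aggregation
            "median": f"percentile({field},50)",
        },
    }
    return json.dumps(body)


if __name__ == "__main__":
    # "price" is a hypothetical numeric field
    print(median_facet_request("price"))
```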

Re: Indexing vs Search node

2018-11-14 Thread Fernando Otero
Thanks everyone this gave me great arguments for migrating to Solr7 :D On Fri, Nov 9, 2018 at 7:50 PM Shawn Heisey wrote: > On 11/9/2018 1:58 PM, David Hastings wrote: > > I personally like standalone solr for this reason, i can tune the > indexing > > "master" for doing nothing but taking in

Re: Solr cloud change collection index directory

2018-11-14 Thread Charlie Hull
On 13/11/2018 22:34, Shawn Heisey wrote: If it's important for you to have the data separated from the program, setting the solr home is in my opinion the right way to go.  This separation is achieved by the service installer script that Solr includes, which runs on most operating systems

Issue Searching Data from multiple Databases

2018-11-14 Thread Santosh Kumar S
I am trying to implement search across multiple databases (in my case, 2 different DBs) by indexing data from multiple DB tables. I tried the approach below, but in vain: I only get data from DB 1 when I perform a full-import. Steps

Re: Median in Solr json facet api

2018-11-14 Thread Joel Bernstein
The JSON facet API uses the t-digest approach to estimate the percentiles. You can also use Solr Math Expressions to take a random sample from a field and estimate the median from the sample. Here is the Streaming Expression: let(a=random(collection1, q="*:*", fl="filesize_d", rows="25000"),
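
The sampling idea can be sketched outside Solr in plain Python. This is not the Streaming Expression from the message, which is cut off above, and it is not how Solr computes the value; it only illustrates estimating a median from a random sample. The function name `estimate_median` and the synthetic data are assumptions.

```python
import random
import statistics
from typing import Optional, Sequence


def estimate_median(values: Sequence[float], sample_size: int,
                    seed: Optional[int] = None) -> float:
    """Estimate the median of values from a random sample without replacement."""
    rng = random.Random(seed)
    if sample_size >= len(values):
        sample = list(values)  # sample would cover everything: use all values
    else:
        sample = rng.sample(values, sample_size)
    return statistics.median(sample)


if __name__ == "__main__":
    # Synthetic stand-in for a numeric field such as filesize_d
    data = [float(i) for i in range(1, 100001)]
    print(estimate_median(data, 25000, seed=1))
```

The estimate tightens as the sample grows, at the cost of fetching more rows.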

Re: Streaming In Solr

2018-11-14 Thread Joel Bernstein
The implementation is as follows: 1) There are "stream sources" that generate results from Solr Cloud collections. Some of these include: search, facet, knnSearch, random, timeseries, nodes, sql etc... 2) There are "stream decorators" that wrap stream sources and operate over the result set
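
The source/decorator structure described above can be sketched with Python generators. This is only an analogy under stated assumptions, not Solr's Java implementation: `list_source`, `having`, and `select` are hypothetical names, and an in-memory list stands in for a Solr Cloud collection.

```python
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, object]


def list_source(docs: Iterable[Record]) -> Iterator[Record]:
    """Stand-in for a stream source: emits records one at a time."""
    for doc in docs:
        yield dict(doc)


def having(stream: Iterable[Record], field: str,
           minimum: float) -> Iterator[Record]:
    """Decorator: passes through records whose field is at least minimum."""
    for record in stream:
        if float(record.get(field, 0.0)) >= minimum:
            yield record


def select(stream: Iterable[Record], fields: List[str]) -> Iterator[Record]:
    """Decorator: keeps only the named fields of each record."""
    for record in stream:
        yield {name: record[name] for name in fields if name in record}


if __name__ == "__main__":
    docs = [
        {"id": "a", "filesize_d": 10.0, "other": 1},
        {"id": "b", "filesize_d": 500.0, "other": 2},
    ]
    # Decorators wrap the source and each other; records are pulled lazily.
    pipeline = select(having(list_source(docs), "filesize_d", 100.0),
                      ["id", "filesize_d"])
    for record in pipeline:
        print(record)
```

Because each stage pulls one record at a time from the stage it wraps, nothing is materialized until the outermost stream is read.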

Delete by query in SOLR 6.3

2018-11-14 Thread RAKESH KOTE
Hi, We are using SOLR 6.3 in cloud and we have created 2 collections in a single SOLR cluster consisting of 20 shards and 3 replicas each (overall 20 x 3 = 60 instances). The first collection has close to 2.5 billion records and the second collection has 350 million records. Both the collection

Exporting results and schema design

2018-11-14 Thread Dwane Hall
Good afternoon Solr community, I have a situation where I require the following solr features. 1. Highlighting must be available for the matched search results 2. After a user performs a regular solr search (/select, rows=10) I require a drill down which has the potential to export

Re: Exporting results and schema design

2018-11-14 Thread Erick Erickson
Well, docValues doesn't necessarily waste much index space if you don't store the field and useDocValuesAsStored. It also won't beat up your machine as badly if you fetch all your fields from DV fields. To fetch a stored field, you need to > seek to the stored data on disk > decompress a 16K

Re: ExtractRequestHandler and Tika. Get only plain text

2018-11-14 Thread Sergio García Maroto
Thanks a lot Jan. That works very well. I am now trying to index the doc in Solr deleting the extractOnly parameter and can't find any similar option to get the data indexed in plain text. I am getting the metadata as well. This is my request.

Re: ExtractRequestHandler and Tika. Get only plain text

2018-11-14 Thread Sergio García Maroto
Thanks Erick. I do use this strategy for indexing data from DB. It is very flexible for me. I work in a company where .NET is the main dev platform, so it is even more important to separate things. Does your post mean that functionality for indexing documents in Solr using ExtractRequestHandler doesn't

Re: ExtractRequestHandler and Tika. Get only plain text

2018-11-14 Thread Erick Erickson
While ERH is fine for getting started, as you go toward production you'll want to consider parsing the data outside of Solr for the reasons (and example) outlined here: https://lucidworks.com/2012/02/14/indexing-with-solrj/ Best, Erick On Wed, Nov 14, 2018 at 6:46 AM Sergio García Maroto wrote:

Atomic updates and stored CopyTo destination fields

2018-11-14 Thread Jon Kjær Amundsen
Reading up on atomic updates, the Solr reference guide states the following: The core functionality of atomically updating a document requires that all fields in your schema must be configured as stored (stored="true") or docValues (docValues="true") except for fields which are destinations,
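
For context, an atomic update sends only the changed field, wrapped in a modifier such as `set`, and Solr rebuilds the rest of the document from stored or docValues fields. The sketch below only assembles such a payload; the helper name `atomic_set`, the document id `doc1`, and the field name `Y` are illustrative assumptions.

```python
import json


def atomic_set(doc_id: str, field: str, value: object,
               id_field: str = "id") -> str:
    """Build the JSON body for an atomic 'set' update of a single field."""
    # Only the unique key and the modified field are sent.
    return json.dumps([{id_field: doc_id, field: {"set": value}}])


if __name__ == "__main__":
    # "doc1" and "Y" are hypothetical; the body would be posted to /update
    print(atomic_set("doc1", "Y", "new value"))
```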

Re: index size, stored vs indexed

2018-11-14 Thread Erick Erickson
Can't really be answered. For instance, stored data is held in *.fdt files and is largely irrelevant to searching since that data is only consulted for returning stored fields of the top N docs. So if your index consists of 90% stored data it's one answer, if 10% it's totally another. the stored

Re: ExtractRequestHandler and Tika. Get only plain text

2018-11-14 Thread Erick Erickson
bq. Does your post mean that functionality for indexing documents in Solr using ExtractRequestHandler doesn't provide the option of Indexing plain data Frankly I don't know. It's just that if you plan to eventually offload the Tika parsing onto a client (or use a service), does it make sense to

Re: Atomic updates and stored CopyTo destination fields

2018-11-14 Thread Erick Erickson
I'm a little confused on what you're trying. Say your source field is Y and your destination field X. Are you saying that you want your destination field X to contain both the old value of field Y and the new value of field Y when you atomically update that field Y? H, I'm actually not sure

index size, stored vs indexed

2018-11-14 Thread David Hastings
Was wondering if anyone has an idea of the size ratio of indexed only vs stored and indexed in solr 7.x. I was going to run some testing myself later today but was curious what others have seen in this regard. Thanks, David

3 Solr instances different ports

2018-11-14 Thread cristian.tiu...@gmail.com
Hello, I want to have 3 different solr instances on the same server. 1 - Master 2 - Slave I want to have these 3 instances also on different ports. How can I do this? Thx -- Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

Re: Atomic updates and stored CopyTo destination fields

2018-11-14 Thread Jon Kjær Amundsen
It is not that I want it. I just can't reproduce it even though I read it as an expected behaviour. So I wondered if something has been changed since the warning was written, or if I had misunderstood something. On Wed, 14 Nov 2018 at 17:09, Erick Erickson wrote: > I'm a little confused on what

RE: Issue Searching Data from multiple Databases

2018-11-14 Thread Vadim Ivanov
Hi! Have you tried naming the entity in the full-import HTTP call, as in /dataimport/?command=full-import=Document1=true=true Is there something sane in the log file after that command? -- Vadim > -Original Message- > From: Santosh Kumar S [mailto:santoshkumar.saripa...@infinite.com] >

Re: 3 Solr instances different ports

2018-11-14 Thread Erick Erickson
bin/solr start -help e.g. bin/solr start -z localhost:2181 -p 8981 -s example/cloud/node1/solr On Wed, Nov 14, 2018 at 7:59 AM cristian.tiu...@gmail.com wrote: > > Helloo > > I want to have 3 different solr instances on the same server. > 1 - Master > 2 - Slave > > I want to have this 3
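
To make the command above concrete for several instances, here is a small sketch that assembles one `bin/solr start` command per instance, each with its own port (`-p`) and Solr home (`-s`). The ports and home directories are made-up examples, and the helper `solr_start_command` is hypothetical; only the `-p`, `-s`, and `-z` flags come from the message.

```python
from typing import List, Optional


def solr_start_command(port: int, solr_home: str,
                       zk_host: Optional[str] = None) -> List[str]:
    """Assemble a bin/solr start command for one instance."""
    cmd = ["bin/solr", "start", "-p", str(port), "-s", solr_home]
    if zk_host:
        # Only needed when the instance runs in SolrCloud mode
        cmd += ["-z", zk_host]
    return cmd


if __name__ == "__main__":
    # Hypothetical ports and home directories: one master and two slaves
    instances = [
        (8983, "/var/solr/master"),
        (8984, "/var/solr/slave1"),
        (8985, "/var/solr/slave2"),
    ]
    for port, home in instances:
        print(" ".join(solr_start_command(port, home)))
```

Each instance needs its own Solr home so that cores and indexes do not collide.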

Re: 3 Solr instances different ports

2018-11-14 Thread Shawn Heisey
On 11/14/2018 7:58 AM, cristian.tiu...@gmail.com wrote: I want to have 3 different solr instances on the same server. 1 - Master 2 - Slave Why do you want multiple Solr instances on the same server? If this is to mock up an install that will have the different instances on separate servers

Re: Atomic updates and stored CopyTo destination fields

2018-11-14 Thread Shawn Heisey
On 11/14/2018 10:35 AM, Jon Kjær Amundsen wrote: It is not that I want it. I just can't reproduce it even though I read it as an expected behaviour. So I wondered if something has been changed since the warning was written, or if I had misunderstood something. To my knowledge, nothing has