Slower queries with 7.3.1?

2018-05-24 Thread Will Currie
I'm seeing a 0.5s increase in avg and p99 latencies after upgrading from 7.2 to 7.3.1, e.g. 1s avg to 1.5s avg. I've tried upgrading from Java 8 to 10 (9 being EOL), suspecting https://issues.apache.org/jira/browse/LUCENE-7966. No help. I'm using boost queries. Probably abusing boost queries. Jstack

Re: Is it possible to index documents without storing their content?

2018-05-24 Thread Thomas Lustig
Thanks Emir for the great answer :) Br Tom 2018-05-23 10:16 GMT+02:00 Emir Arnautović : > Hi Tom, > Yes it is possible - see field options: https://lucene.apache.org/solr/guide/6_6/defining-fields.html#DefiningFields-OptionalFieldTypeOverrideProperties

simple enrich uploaded binary documents with sha256 hashes

2018-05-24 Thread Thomas Lustig
Dear community, I would like to automatically add a SHA-256 file hash to a document field after a binary file is posted to an ExtractingRequestHandler. At first I thought that the ExtractingRequestHandler had such a feature, but so far I have not found a configuration for it. It was mentioned that I should
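One possible approach, sketched here only as an illustration, is to compute the hash on the client before posting and pass it along as a literal field value. The sketch assumes the file bytes are available to the client, uses the `literal.<fieldname>` request parameter of the ExtractingRequestHandler, and invents the field name `sha256_s`, which is hypothetical and would have to exist in the schema. It builds the parameters only and does not send a request.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def extract_params(doc_id, file_hash, hash_field="sha256_s"):
    """Build query parameters for an /update/extract request.

    hash_field is a hypothetical field name; it must exist in the schema
    or match a dynamic field rule.
    """
    return {
        "literal.id": doc_id,
        f"literal.{hash_field}": file_hash,
        "commit": "true",
    }


# In-memory bytes stand in for the uploaded binary file.
params = extract_params("doc1", sha256_of_stream(io.BytesIO(b"example bytes")))
```

Hashing on the client keeps Solr's configuration untouched; a server-side alternative would need a custom component, which this sketch does not attempt.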

Re: Could not load collection from ZK:

2018-05-24 Thread Aman Singh
Hi Shawn & Alessandro, We tried increasing the heap as well, but we still faced the same issue. After moving ZK off the Solr servers to dedicated servers, the problem went away. Yes, while we were facing this issue GC activity was high, around 60-70% out of 400%. Regards, Aman

Sort by payload value

2018-05-24 Thread John Davis
Hello, We are trying to use payload values as described in [1] and are running into issues when issuing *sort by* payload value. Would appreciate any pointers to what we might be doing wrong. We are running solr 6.6.0. * Here's the payload value definition:

Re: Could not load collection from ZK:

2018-05-24 Thread Shawn Heisey
On 6/20/2017 9:46 AM, Aman Deep Singh wrote: > Sorry Shawn, > It didn't copy entire stacktrace I put the stacktrace at > https://www.dropbox.com/s/zf8b87m24ei2ils/solr%20exception2?dl=0 > > Note: I have shaded the solr library under com.gdn.solr620 so all solr > class will be appear as

Re: deletebyQuery vs deletebyId

2018-05-24 Thread Jay Potharaju
Hi Erick, Yes, I commented on the ticket ...after finding it during my search for the issue in the solr JIRA. Setup: 2 Nodes, 6 shards , 3 shards on each node (no replication) Collection uses implicit routing. Just to give some background ... The first time I tried it ...it worked but then when

Solr failed to start after configuring Kerberos authentication

2018-05-24 Thread adfel70
Hi, We are trying to configure Kerberos auth for Solr 6.5.1. We went over the steps as described in Solr’s ref guide, but after restart we are getting the following error: org.apache.zookeeper.client.ZookeeperSaslClient; An error: (java.security.PrivilegedActionException:

Re: How to do parallel indexing on files (not on HDFS)

2018-05-24 Thread Rahul Singh
Right, That’s why you need a place to persist the task list / graph. If you use a table, you can set a “processed” / “unprocessed” value … or a queue, then it’s delivered only once .. otherwise you have to check the indexed date from Solr, and waste a Solr call. -- Rahul Singh rahul.si...@anant.us
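The table-based variant of this idea can be sketched as follows. This is a minimal illustration, not the implementation referred to in the thread: it swaps in SQLite through Python's standard `sqlite3` module, and the table name `tasks`, the `in_progress` status and all function names are hypothetical.

```python
import sqlite3


def create_queue(conn):
    """Create the task table; each file path appears at most once."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "path TEXT PRIMARY KEY, "
        "status TEXT NOT NULL DEFAULT 'unprocessed')"
    )
    conn.commit()


def enqueue(conn, paths):
    """Add paths to the queue, silently skipping ones already recorded."""
    conn.executemany(
        "INSERT OR IGNORE INTO tasks (path) VALUES (?)", [(p,) for p in paths]
    )
    conn.commit()


def claim_next(conn):
    """Claim one unprocessed path, or return None when nothing is left."""
    while True:
        row = conn.execute(
            "SELECT path FROM tasks WHERE status = 'unprocessed' "
            "ORDER BY path LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status check in the WHERE clause means only one worker wins.
        cur = conn.execute(
            "UPDATE tasks SET status = 'in_progress' "
            "WHERE path = ? AND status = 'unprocessed'",
            (row[0],),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]


def mark_processed(conn, path):
    """Record that a path has been indexed so it is never picked up again."""
    conn.execute("UPDATE tasks SET status = 'processed' WHERE path = ?", (path,))
    conn.commit()
```

A worker would loop on `claim_next`, index the file, then call `mark_processed`; because the state lives in the table, no extra Solr query is needed to find out what has already been indexed.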

Re: Navigating through Solr Source Code

2018-05-24 Thread Christine Poerschke (BLOOMBERG/ LONDON)
Hello. Emir mentioned about starting from the feature/concept. If you haven't come across it yet then the slides and/or recording of Hoss's "Lifecycle of a Solr Search Request" talk may be of interest - http://home.apache.org/~hossman/ has links. Erick mentioned about getting a sense via unit

Re: How to do parallel indexing on files (not on HDFS)

2018-05-24 Thread Adhyan Arizki
You will still need to devise a way to partition the data source even if you are scheduling multiple jobs otherwise, you might end up digesting the same data again and again. On Fri, May 25, 2018 at 12:46 AM, Raymond Xie wrote: > Thank you all for the suggestions. I'm now

Re: Index protected zip

2018-05-24 Thread Alexandre Rafalovitch
Hmm. If it works, then it is Tika magic. Which may mean they may have a setting for passwords. Which would need to be configured and then exposed through Solr. So, I would check if you can extract text with Tika standalone first. Regards, Alex On Thu, May 24, 2018, 5:05 AM Dimitris

Re: Escaping in streaming expression

2018-05-24 Thread Joel Bernstein
I just confirmed that the following query works as expected: search(collection2, q="test_s:\"hello world\"", fl="id", sort="id desc") In this case the double quotes are used to specify a phrase query. But this fails: search(collection2, q="test_s:\"hello world", fl="id", sort="id desc") In
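For readers building such expressions in client code, the working form above can be produced programmatically. The sketch below is an assumption-laden illustration: the helper names are hypothetical, it escapes only double quotes, and it covers only the balanced phrase-query case confirmed in this message, not quotes embedded inside the phrase itself.

```python
def escape_quotes(value):
    """Backslash-escape double quotes so the value can sit inside q="..."."""
    return value.replace('"', '\\"')


def phrase_search(collection, field, phrase, fl="id", sort="id desc"):
    """Build a search() expression whose query is a phrase query on one field."""
    solr_query = f'{field}:"{phrase}"'
    return (
        f'search({collection}, q="{escape_quotes(solr_query)}", '
        f'fl="{fl}", sort="{sort}")'
    )


print(phrase_search("collection2", "test_s", "hello world"))
```

Called with the values from the message, the helper reproduces the first expression quoted above, with both phrase quotes escaped as a pair.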

Re: How to do parallel indexing on files (not on HDFS)

2018-05-24 Thread Raymond Xie
Thank you all for the suggestions. I'm now leaning toward not using traditional parallel indexing. My data are JSON files with metadata extracted from raw data received and archived into our data server cluster. Those data come in various flows and reside in their respective folders, splitting them

Re: Escaping in streaming expression

2018-05-24 Thread Joel Bernstein
Also, while looking at your query, it looks like you are getting an error from the Solr query parser. I believe this is the issue you are facing: https://issues.apache.org/jira/browse/SOLR-10894 I'll confirm, but I believe this query should work: search(collection1, q="test \"hello

Re: Escaping in streaming expression

2018-05-24 Thread Joel Bernstein
This ticket originally addressed the issue: https://issues.apache.org/jira/browse/SOLR-8409 It's a confusing ticket though and I'm not seeing test cases that prove out that this is still working. I'll write a quick test case to see how escaping of quotes is being handled. This is a followup issue

Escaping in streaming expression

2018-05-24 Thread Christian Spitzlay
Hello, I’m experimenting with streaming expressions and I wonder how to escape a double quote in a value. I am on 7.3.0 and trying with the text area on http://localhost:8983/solr/#/collname/stream The following expression works for me and returns results: search(kmm, q="sds_endpoint_name:F2",

Re: How to do parallel indexing on files (not on HDFS)

2018-05-24 Thread Rahul Singh
Resending to list to help more people.. This is an architectural pattern to solve the same issue that arises over and over again.. The queue can be anything: a table in a database, even a Solr collection. And yes, I have implemented it. I did it in C# before using a SQL Server table based

Re: How to do parallel indexing on files (not on HDFS)

2018-05-24 Thread Adhyan Arizki
Raymond, Running parallel indexing might be trickier than it looks if the scale is big. For instance, you can easily partition your data (let's say into 5 chunks) and run 5 processes to index them. However, you will need to be aware of any choke points in the pipeline along the way (e.g. I/O of
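The partitioning step mentioned here can be sketched as below. This is one illustrative scheme among many, assuming the input is a list of file paths; the function name is hypothetical, and a stable checksum is used so that every process computes the same assignment and no file is indexed twice.

```python
import zlib


def partition(paths, num_workers):
    """Assign each path to exactly one of num_workers buckets."""
    buckets = [[] for _ in range(num_workers)]
    for path in paths:
        # crc32 is stable across processes and runs, unlike the built-in
        # hash() for strings, which is randomized per interpreter start.
        index = zlib.crc32(path.encode("utf-8")) % num_workers
        buckets[index].append(path)
    return buckets
```

Each of the 5 processes would then index only its own bucket. The buckets are not guaranteed to be the same size, and the sketch does nothing about downstream bottlenecks such as the I/O limits raised in this message.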

Re: Use payloads in facet sorting

2018-05-24 Thread Mikhail Khludnev
It's not possible in JSON Facet API now. It reminds me http://shaierera.blogspot.com/2013/01/facet-associations.html and https://issues.apache.org/jira/browse/SOLR-9480. On Thu, May 24, 2018 at 11:31 AM, Tobias Kässmann wrote: > Hey, > I’m just playing around with the

Re: Question regarding TLS version for solr

2018-05-24 Thread Christopher Schultz
Anchal, On 5/24/18 6:02 AM, Anchal Sharma2 wrote: > Thanks a lot for sharing the steps. I tried a few of them. Actually, > we have already been using Solr in our application for a year or > so. We just want to encrypt it to use secure Solr now

Re: Could not load collection from ZK:

2018-05-24 Thread Alessandro Benedetti
Hi Aman, I had similar issues in the past and the reason was attributed to SOLR-8868, which unfortunately is not solved yet. Did you manage to find a different cause in your case? Hope that helps. Regards - --- Alessandro

Re: Solr streaming - get single value from tuple

2018-05-24 Thread Joel Bernstein
I've been meaning to add this, so let's create a ticket. Until it's released you can plugin the function in the solrconfig.xml. We can add functions called: getValue(tuple, key) setValue(tuple, key, value) There is also a mean() function which works on a vector which you can use right now. This

Solr streaming - get single value from tuple

2018-05-24 Thread Jan Høydahl
describe() returns a tuple. I’d like to assign the value of “mean” from that tuple to a separate variable for use in later computations. How to achieve?? Jan

Re: Solaris 10

2018-05-24 Thread Shawn Heisey
On 5/24/2018 3:40 AM, Takuya Kawasaki wrote: Please let me ask a question. I would like to use Solr on Solaris 10. But I encountered a lot of errors. First, I can’t install solr using install script in .tgz. script result shows I have to install manually not using the script. Second, I can’t

Re: Solaris 10

2018-05-24 Thread Susheel Kumar
No idea about Solaris much but the only option is to install manually as you did and try to modify /bin/solr script to get rid of the errors you are seeing etc. Thnx On Thu, May 24, 2018 at 5:40 AM, Takuya Kawasaki wrote: > Please let me ask a question. > > I would

Re: Question regarding TLS version for solr

2018-05-24 Thread Anchal Sharma2
Hi Chris, Thanks a lot for sharing the steps. I tried a few of them. Actually, we have already been using Solr in our application for a year or so. We just want to encrypt it to use secure Solr now. So I followed the steps where you created the certificates, etc. But when I go to start

Solaris 10

2018-05-24 Thread Takuya Kawasaki
Please let me ask a question. I would like to use Solr on Solaris 10. But I encountered a lot of errors. First, I can’t install solr using install script in .tgz. script result shows I have to install manually not using the script. Second, I can’t use ‘start’ command using /bin/solr script

Index protected zip

2018-05-24 Thread Dimitris Kardarakos
Hello everyone. In Solr 7.3.0 I can successfully index the content of zip files. But if the zip file is password protected, running something like the below: curl "http://localhost:8983/solr/sample/update/extract?commit=true&=enc.zip=1234; -H "Content-Type: application/zip" --data-binary

Use payloads in facet sorting

2018-05-24 Thread Tobias Kässmann
Hey, I’m just playing around with the new payload feature in Solr and try to use it within the facet component. In my documents I have a field called "keywords" and indexed there some keywords for a document with a score as payload for each. Now I want to ask a question like: "Which are the