Parent child documents partial update

2017-07-17 Thread Sujay Bawaskar
Hi, I need help understanding Solr's parent-child document partial update behaviour. Can we perform a partial update on a parent document without losing its child documents? My observation is that the parent-child relationship between documents gets lost when a partial update is performed on the parent. Any wo

Re: Create too many zookeeper connections when recreate CloudSolrServer instance

2017-07-17 Thread wg85907
Hi Shawn, Thanks for your detailed explanation. The reason I want to shut down the CloudSolrServer instance and create a new one is that I am concerned about whether it can successfully reconnect to the Zookeeper server if the Zookeeper cluster has an issue and reboots. I will do related tests with version

Get results in multiple orders (multiple boosts)

2017-07-17 Thread Luca Dall'Osto
Hello, I'm new to Solr (and to mailing lists...), and I have a question about querying content in multiple custom orders. I'm trying to query some documents boosted by 2 (or more) fields: I'm able to run a search over 2 days and return results boosted by the category field, like this: ?indent=on &def

AW: Get results in multiple orders (multiple boosts)

2017-07-17 Thread Florian Waltersdorfer
Hi, I am quite the Solr newbie myself, but have you looked at the resulting scores, e.g. via fl=*,score (that way, you can see/test how your boosting affects the results)? In a similar scenario, I am using fixed value boosts for specific field values; "^=[boost]" instead of "^[factor]", for exa

Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
We've been indexing data on a 45 node cluster with 100 shards and 3 replicas, but our indexing processes have been stopping due to errors. On the server side the error is "Error logging add". Stack trace: 2017-07-17 12:29:24.057 INFO (qtp985934102-5161548) [c:UNCLASS s:shard58 r:core_node290

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
Some more info: When I stop all the indexers, in about 5-10 minutes the cluster goes all green. When I start just one indexer, several nodes immediately go down with the 'Error adding log' message. I'm using CloudSolrClient.add(List) to do the indexing. Is this correct for SolrCloud? Tha

Re: Cant stop/start server

2017-07-17 Thread Iridian Group
So I installed Solr on another server using just the service install script and am experiencing the same issue when starting/stopping the service using /opt/solr/bin/solr stop -all. However, when using /etc/init.d/solr start and /etc/init.d/solr stop, the server starts/stops gracefully without issue.

Re: CDCR - how to deal with the transaction log files

2017-07-17 Thread Susheel Kumar
I just voted for https://issues.apache.org/jira/browse/SOLR-11069 to get it resolved, as we are discussing starting to use CDCR soon. On Fri, Jul 14, 2017 at 5:21 PM, Varun Thacker wrote: > https://issues.apache.org/jira/browse/SOLR-11069 is tracking why is > LASTPROCESSEDVERSION=-1 > on the sour

Re: Cant stop/start server

2017-07-17 Thread Susheel Kumar
Exactly. The two are different and serve different purposes, as you can see from their content. The latter refers to the former. On Mon, Jul 17, 2017 at 9:15 AM, Iridian Group wrote: > So I installed SOLR on another server using just the service install > script and am experiencing the same issue when starting/stopping

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Shawn Heisey
On 7/17/2017 6:36 AM, Joe Obernberger wrote: > We've been indexing data on a 45 node cluster with 100 shards and 3 > replicas, but our indexing processes have been stopping due to > errors. On the server side the error is "Error logging add". Stack > trace: > Caused by: org.apache.hadoop.ipc.Rem

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Susheel Kumar
There is also an analysis error. I would suggest testing the indexer on a one-shard setup first, then with a replica (1 shard and 1 replica), and then with 2 shards and 2 replicas. This would confirm whether there is a basic issue with the indexing / cluster setup. On Mon, Jul 17, 2017 at 9:04 A

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
So far we've indexed about 46 million documents, but over the weekend, these errors started coming up. I would expect that if there was a basic issue, it would have started right away? We ran a test cluster with just a few shards/replicas prior and didn't see any issues using the same indexin

Solr Subfaceting

2017-07-17 Thread Ponnuswamy, Poornima (GE Healthcare)
Hello, We have Solr version 6.4.2 and we have been using Solr Subfaceting – Terms Facet as per the document https://cwiki.apache.org/confluence/display/solr/Faceted+Search in our project. In our project which is going to go in production soon, we use it for getting the facet/subfacet counts,

Re: TransactionLog doesn't know how to serialize class java.util.UUID; try implementing ObjectResolver?

2017-07-17 Thread deviantcode
Hi Mahmoud, did you ever get to the bottom of this? I'm having the same issue on solr 5.5.2 -- View this message in context: http://lucene.472066.n3.nabble.com/TransactionLog-doesn-t-know-how-to-serialize-class-java-util-UUID-try-implementing-ObjectResolver-tp4332277p4346335.html Sent from the

Re: Creating a custom auth plugin for solr

2017-07-17 Thread srshaik
Thanks Jan. I had gone through the link, but not the code. I will look into it and try to understand. However, I had a question regarding multi-tenancy support. If I have one collection containing documents for multiple tenants, would I have to build a custom authorization plugin to prevent one cu

Re: dynamic datasource password in db_data_config file

2017-07-17 Thread javeed
Hi Team, Can you please update on this issue. Thank you

Re: Create too many zookeeper connections when recreate CloudSolrServer instance

2017-07-17 Thread Walter Underwood
If your Zookeeper cluster is rebooting frequently, you have much, much worse problems than client connections. Is Zookeeper unstable in your installation? If so, fix that. Stop hacking the client. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On J

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Erick Erickson
Joe: I agree that 46 million docs later you'd expect things to have settled out. However, I do note that you have "add-unknown-fields-to-the-schema" in your error stack which means you're using "field guessing", sometimes called data_driven. I would recommend you do _not_ use this for production a

Re: dynamic datasource password in db_data_config file

2017-07-17 Thread Amrit Sarkar
Javed, Can you let us know if you are running in standalone or cloud mode? Amrit Sarkar Search Engineer Lucidworks, Inc. 415-589-9269 www.lucidworks.com Twitter http://twitter.com/lucidworks LinkedIn: https://www.linkedin.com/in/sarkaramrit2 On Mon, Jul 17, 2017 at 11:54 AM, javeed wrote: > HI

Re: Get results in multiple orders (multiple boosts)

2017-07-17 Thread Erick Erickson
I don't think boosting is really what you want here. Boosting _influences_ the score, it does not impose an ordering. Sorting _does_ impose an ordering, the question is how to sort and the answer depends on how fixed (or not) the sorting criteria are. Do they change with different queries? If not,
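
The distinction between boosting and sorting can be sketched as two sets of request parameters. This is only an illustration: the field names (category, category_rank), the query text, and the boost value are hypothetical and do not come from the thread; bq is the boost-query parameter of the dismax/edismax parsers and sort is the standard sort parameter.

```python
from urllib.parse import urlencode

# Hypothetical field names and values, for illustration only.
base = {"q": "text:solr", "defType": "edismax", "fl": "*,score"}

# Boosting: bq raises the score of matching documents, but a strong
# text match elsewhere can still outrank them.
boosted = dict(base, bq="category:news^10")

# Sorting: category_rank is compared first; score only breaks ties.
sorted_params = dict(base, sort="category_rank asc, score desc")

boost_query = urlencode(boosted)
sort_query = urlencode(sorted_params)
print(boost_query)
print(sort_query)
```

Requesting fl=*,score in both cases makes it easy to compare how the scores and the final order differ.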

Re: solr-user-subscribe

2017-07-17 Thread Erick Erickson
Please follow the instructions here: http://lucene.apache.org/solr/community.html#mailing-lists-irc. You must use the _exact_ same e-mail as you used to subscribe. If the initial try doesn't work and following the suggestions at the "problems" link doesn't work for you, let us know. But note you

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Susheel Kumar
And there is a document ID mentioned above where it failed with an analysis error. You can look at how those documents differ, as Erick suggested. On Mon, Jul 17, 2017 at 11:53 AM, Erick Erickson wrote: > Joe: > > I agree that 46 million docs later you'd expect things to have settled > out. However, I do

Re: solr-user-subscribe

2017-07-17 Thread srshaik
I added a reply to the discussion. Please accept. On Fri, Jul 14, 2017 at 11:05 PM, Naohiko Uramoto [via Lucene] < ml+s472066n4346101...@n3.nabble.com> wrote: > solr-user-subscribe <[hidden email] > > > > -- > Naohiko Uramoto > > > --

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
Erick - thank you. I meant to disable field guessing as our indexer does this internally. Thanks for seeing that! Yes, we've seen things come in like IDs that are 12345 (int), but then next ID is 12AF456 (string). There is also a version mismatch between our Cloudera 5.10.2 hadoop version a

Re: TransactionLog doesn't know how to serialize class java.util.UUID; try implementing ObjectResolver?

2017-07-17 Thread Amrit Sarkar
I looked into the code TransactionLog.java (branch_5_5) :: JavaBinCodec.ObjectResolver resolver = new JavaBinCodec.ObjectResolver() { @Override public Object resolve(Object o, JavaBinCodec codec) throws IOException { if (o instanceof BytesRef) { BytesRef br = (BytesRef)o; codec

Re: Solr Subfaceting

2017-07-17 Thread Amrit Sarkar
Poornima, Regarding 3; You can do something like: CloudSolrClient client = new CloudSolrClient("localhost:9983"); SolrParams params = new ModifiableSolrParams().add("q","*:*") .add("json.facet","{.}"); QueryResponse response = client.query(params); Setting key and value via SolrPar
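
The same idea, passing json.facet as an ordinary request parameter, can be sketched independently of SolrJ. This is a minimal sketch: the facet names and field names (categories, brands, category, brand) are hypothetical, and the nested "facet" block shows one sub-facet under a terms facet.

```python
import json
from urllib.parse import urlencode, parse_qs

# Hypothetical facet and field names, for illustration only.
facet = {
    "categories": {
        "type": "terms",
        "field": "category",
        # Sub-facet: computed for each bucket of the parent facet.
        "facet": {
            "brands": {"type": "terms", "field": "brand"}
        },
    }
}

# json.facet is sent as a plain string parameter next to q and rows.
params = {"q": "*:*", "rows": 0, "json.facet": json.dumps(facet)}
query_string = urlencode(params)

# Round trip: decoding the parameter gives back the same structure.
decoded = json.loads(parse_qs(query_string)["json.facet"][0])
print(query_string)
```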

Re: Solr Subfaceting

2017-07-17 Thread Ponnuswamy, Poornima (GE Healthcare)
Thanks for your response. I have tried with SolrParams and it works for me. Any feedback on questions 1 & 2? Thanks, Poornima On 7/17/17, 12:38 PM, "Amrit Sarkar" wrote: Poornima, Regarding 3; You can do something like: CloudSolrClient client = new CloudSolrClient("lo

Re: Solr Subfaceting

2017-07-17 Thread Amrit Sarkar
Poornima, 1. In confluence - https://cwiki.apache.org/confluence/display/solr/Faceted+Search the page says it's experimental and may change significantly. Is it safe for us to use the Terms faceting or will it change in future releases? When will this be official? A lot of people / engineers

Re: How to exclude stop words in spellcheck collations

2017-07-17 Thread Susheel Kumar
The field which you are using for spellcheck suggestions can use a stopword filter factory. Thanks, Susheel On Sun, Jul 16, 2017 at 12:47 PM, Naveen Pajjuri wrote: > Hi, > Is there any way I can exclude stop words from the collations and > suggestions from spell check component ? > > Regards,

Re: Solr 6.6.0 - Indexing errors

2017-07-17 Thread Joe Obernberger
We use puppet to deploy the solr instance to all the nodes. I changed what was deployed to use the CDH jars, but our puppet module deletes the old directory and replaces it. So, all the core configuration files under server/solr/ were removed. Zookeeper still has the configuration, but the no

Limit to the number of cores supported?

2017-07-17 Thread Pouliot, Scott
Hey guys. We're running Solr 6.2.0 in a master/slave configuration and I was wondering if there is a limit to the number of cores this setup can support? We're having random issues where a core or two will stop responding to POSTs (GETs work fine) until we restart Solr. We've currently got 140+ c

Re: Parent child documents partial update

2017-07-17 Thread Amrit Sarkar
Sujay, Not really. Parent-child documents are stored contiguously in a single block. Read more about the parent-child relationship at: https://medium.com/@sarkaramrit2/multiple-documents-with-same-doc-id-in-index-in-solr-cloud-32c072db2164 When we perform a partial / atomic update, say {"id":"X", "fie
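
To make the two request shapes concrete, here is a small sketch of the JSON bodies involved. All ids and field names are hypothetical. It assumes that the alternative to an atomic update is to re-send the whole block (parent plus children); the truncated reply above does not confirm which workaround it goes on to recommend.

```python
import json

# Hypothetical ids and field names, for illustration only.
# An atomic update sends only the changed field, wrapped in a
# modifier such as "set".
atomic_update = [{"id": "parent-1", "title": {"set": "New title"}}]

# Assumed alternative: re-send the whole block, with the children
# nested under the _childDocuments_ key so they are indexed together.
full_block = [{
    "id": "parent-1",
    "title": "New title",
    "_childDocuments_": [
        {"id": "child-1", "color": "red"},
        {"id": "child-2", "color": "blue"},
    ],
}]

atomic_body = json.dumps(atomic_update)
block_body = json.dumps(full_block)
print(atomic_body)
print(block_body)
```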

Re: Help with updateHandler commit stats

2017-07-17 Thread Amrit Sarkar
Antonio, I think the name itself suggests what it is. Meanwhile, from the official documentation: autocommits: Total number of auto-commits executed. So yes, it is the total number of commits executed in the core's lifetime. Look into: https://cwiki.apache.org/confluence/display/solr/Performance+Statistics+Refe

CloudSolrClient preferred over LBHttpSolrClient

2017-07-17 Thread S G
Hi, Does anyone know if CloudSolrClient is preferred over LBHttpSolrClient ? If yes, why so and has there been any good performance benefits documented anywhere? Thanks SG

Re: CloudSolrClient preferred over LBHttpSolrClient

2017-07-17 Thread Amrit Sarkar
S G, Not sure about the documentation but: The CloudSolrClient uses a connection to Zookeeper to extract cluster information, such as which node is the leader for a shard in a Solr collection. To create a CloudSolrClient, all you specify is the Zookeeper hosts and which collection you want to work with. Behind

Re: CloudSolrClient preferred over LBHttpSolrClient

2017-07-17 Thread Erick Erickson
Also, since CloudSolrClient is ZK aware it is notified when any Solr instances go up and down so it will take the appropriate action. Also, when indexing CloudSolrClient will send updates to the correct leader, reducing the hops for indexing documents. Short form: CloudSolrClient is preferred over

Re: CloudSolrClient preferred over LBHttpSolrClient

2017-07-17 Thread Susheel Kumar
Also, per the definition of CloudSolrClient: SolrJ client class to communicate with SolrCloud. Instances of this class communicate with Zookeeper to discover Solr endpoints for SolrCloud collections, and then use the LBHttpSolrClient

Re: Limit to the number of cores supported?

2017-07-17 Thread Erick Erickson
I know of thousands of cores on a single Solr instance. Operationally there's no problem there, although there may be some practical issues (e.g. startup time and the like). What does your Solr log show? Two common issues: OutOfMemory errors, and not enough file handles (fix with ulimit). But withou
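
As a quick way to see the file-handle limit that ulimit controls, the sketch below reads the open-file limit of the current process on a Unix-like system. It reports the limit for the Python process itself, not for the running Solr process, so it is only indicative when run as the same user and in the same environment that starts Solr.

```python
import resource

# Soft limit is what the process is held to; hard limit is the ceiling
# the soft limit may be raised to without extra privileges.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open files: soft={soft} hard={hard}")
```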

Copy field a source of copy field

2017-07-17 Thread tstusr
Hi We want to use a copy field as a source for another copy field or some kind of post processing of a field. The problem is here. We have a field from a text that is captured by a field, like this: which has (at the end of the processing) just the words in a field.

Re: Copy field a source of copy field

2017-07-17 Thread Erick Erickson
In a word, "no". Copyfields are not chained together. I'm not at all sure what you're trying to accomplish with those filter chains anyway. By shingling _then_ doing the stopwords, you'll have some input like abies durangensis become abies abies_durangensis durangensis Then put that through your

Re: Copy field a source of copy field

2017-07-17 Thread Shawn Heisey
On 7/17/2017 4:26 PM, tstusr wrote: > We want to use a copy field as a source for another copy field or some kind > of post processing of a field. > As an example imagine we have on species > > abies durangensis > abies flinckii > > so, after post processing, we expect to have only > abies > > whi

Re: Parent child documents partial update

2017-07-17 Thread Sujay Bawaskar
Thanks Amrit. So the storage mechanism of parent-child documents limits the capability of partial updates. It would be great to have flawless parent-child index support in Solr. On 17-Jul-2017 11:14 PM, "Amrit Sarkar" wrote: > Sujay, > > Not really. Parent-child documents are stored in a single

Highlighting words with special characters

2017-07-17 Thread Lasitha Wattaladeniya
Hi devs, I have set up Solr highlighting with the default setup (only changed the fragsize to 0 to match any field length). It worked fine, but recently I discovered it doesn't highlight words with special characters in the middle. For example, let's say I have indexed email address test.f...@ra

Re: Highlighting words with special characters

2017-07-17 Thread Lasitha Wattaladeniya
Furthermore, the ngram field has the following tokenizer/filter chain in index and query: UAX29URLEmailTokenizerFactory (only in index), StopFilterFactory, LowerCaseFilterFactory, ASCIIFoldingFilterFactory, EnglishPossessiveFilterFactory, StemmerOverrideFilterFactory (only in query), NGramTokenizerFactory (only

RE: Joins in Parallel SQL?

2017-07-17 Thread imran
Is it possible to contribute towards building this capability? What part of developer documentation would be suitable for this? Regards, Imran Sent from Mail for Windows 10 From: Joel Bernstein Sent: Thursday, July 6, 2017 7:40 AM To: solr-user@lucene.apache.org Subject: Re: Joins in Parallel S

Re: Parent child documents partial update

2017-07-17 Thread Amrit Sarkar
Sujay, The Lucene index uses a flat document model, so I really do not think nested documents at the index / storage level will ever be supported unless someone changes the fundamental design of the index. Amrit Sarkar Search Engineer Lucidworks, Inc. 415-589-9269 www.lucidworks.com Twitter http://twitter.com/lu