Re: Solr 6.6.3: Cant create shard name with hyphen

2018-03-14 Thread Jay Potharaju
nvm, I see the first comment on the ticket ... that hyphens are allowed but not when it is the first character. It looks like 5.5 was the last version where it was supported. Thanks Jay Potharaju On Wed, Mar 14, 2018 at 9:29 PM, Jay Potharaju wrote: > Thanks

Problem encountered upon starting Solr after improper exit

2018-03-14 Thread YIFAN LI
To whom it may concern, I am running Solr 7.1.0 and encountered a problem starting Solr after I killed the Java process running Solr without proper cleanup. The error message that I received is as following: solr-7.1.0 liyifan$ bin/solr run dyld: Library not loaded:

Re: Solr 6.6.3: Cant create shard name with hyphen

2018-03-14 Thread Jay Potharaju
Thanks for the reply Shawn. Was this a recent change ? As per the ticket it was fixed in 6.0. Is this change(no hyphens as starting of name ) applicable to all 6x versions. Thanks > On Mar 14, 2018, at 6:25 PM, Shawn Heisey wrote: > >> On 3/14/2018 6:20 PM, Jay Potharaju

RE: solr query

2018-03-14 Thread Albert Lee
Hi Emir, If using OR-ed conditions for different years then the query will be very long if I got 100 years and I think this is not practical. You have any other idea? Regards, Albert From: Gus Heck Sent: Thursday, March 15, 2018 12:43 AM To: solr-user@lucene.apache.org Subject: Re: solr query

Re: Solr 6.6.3: Cant create shard name with hyphen

2018-03-14 Thread Shawn Heisey
On 3/14/2018 6:20 PM, Jay Potharaju wrote: > I am creating a new collection in solr 6.6.3 and it wont allow me create a > shard with hyphen. This ticket( > https://issues.apache.org/jira/browse/SOLR-8725) was closed earlier. But it > is not working for me in 6.6.3. > Upgrading from 5.3 to 6.6.3.

Re: Scoping SolrCloud setup

2018-03-14 Thread Scott Prentice
Walter... Thanks for the additional data points. Clearly we're a long way from needing anything too complex. Cheers! ...scott On 3/14/18 1:12 PM, Walter Underwood wrote: That would be my recommendation for a first setup. One Solr instance per host, one shard per collection. We run 5

Re: Zookeeper service?

2018-03-14 Thread Scott Prentice
Yeah .. I knew it was a different Apache project, but figured that since it was so tightly integrated with SolrCloud that others may have run into this issue. I did some poking around and have (for now) ended up with this .. implemented it as a service through the "systemd.unit"

Re: Solr 6.6.3: Cant create shard name with hyphen

2018-03-14 Thread Jay Potharaju
I tested in Solr 6.5.1 and it is broken there too. Any recommendation on which version of 6 has that feature working? At this time the shard name can't be changed because of a dependency on other applications. Thanks Jay Potharaju On Wed, Mar 14, 2018 at 5:20 PM, Jay Potharaju

Solr 6.6.3: Cant create shard name with hyphen

2018-03-14 Thread Jay Potharaju
Hi , I am creating a new collection in solr 6.6.3 and it wont allow me create a shard with hyphen. This ticket( https://issues.apache.org/jira/browse/SOLR-8725) was closed earlier. But it is not working for me in 6.6.3. Upgrading from 5.3 to 6.6.3. Thanks Jay

Re: Zookeeper service?

2018-03-14 Thread Shawn Heisey
On 3/14/2018 12:24 PM, Scott Prentice wrote: > We might be going at this wrong, but we've got Solr set up as a > service, so if the machine goes down it'll restart. But without > Zookeeper running as a service, that's not much help. You're probably going to be very unhappy to be told this ... but

Re: Solr on DC/OS ?

2018-03-14 Thread Rick Leir
Søren, DC/OS installs on top of Ubuntu or RedHat, and it is used to coordinate many machines so they appear as a cluster. Solr needs to be on a single machine, or in the case of SolrCloud, on many machines. It has no need of the coordination which DC/OS provides. Solr depends on direct access

Re: SynonymGraphFilterFactory with WordDelimiterGraphFilterFactory usage

2018-03-14 Thread Jay Potharaju
Thanks for the response Rick!. I checked 6.6.2 and it has the same issue. The only work around that I have now is comment out the SynonymGraphFilterFactory as we are not using synonyms as of now. But would like to know how to address this issue once we start using it down the line. Thanks J

Solr Developer needed urgently

2018-03-14 Thread asmatalib
Hi, I am Asma Talib from UTG Tech. We are looking for strong Solr Developer with good concepts and skill. Experience: 8 + Years Location: New York This position is for a Senior Search Developer, using Apache Solr for Analytics Following are the Responsibilities of candidate: *

RE: SpellCheck Reload

2018-03-14 Thread Sadiki Latty
Hello, Just bumping this question up regarding the spellcheck reload. Can anyone provide some insight on this question? Thanks in advance Sid -Original Message- From: Sadiki Latty [mailto:sla...@uottawa.ca] Sent: March-12-18 1:38 PM To: solr-user@lucene.apache.org Subject:

Re: Scoping SolrCloud setup

2018-03-14 Thread Walter Underwood
That would be my recommendation for a first setup. One Solr instance per host, one shard per collection. We run 5 million document cores with 8 GB of heap for the JVM. We size the RAM so that all the indexes fit in OS filesystem buffers. Our big cluster is 32 hosts, 21 million documents in four

Re: SynonymGraphFilterFactory with WordDelimiterGraphFilterFactory usage

2018-03-14 Thread Rick Leir
Jay Did you try using text_en_splitting copied out of another release? Though if someone went to the trouble of removing it from the example, there could be something broken in it. Cheers -- Rick -- Sorry for being brief. Alternate email is rickleir at yahoo dot com

Re: Scoping SolrCloud setup

2018-03-14 Thread Scott Prentice
Erick... Thanks. Yes. I think we were just going shard-happy without really understanding the purpose. I think we'll start by keeping things simple .. no shards, fewer replicas, maybe a bit more RAM. Then we can assess the performance and make adjustments as needed. Yes, that's the main

Expose a metric for percentage-recovered during full recoveries

2018-03-14 Thread S G
Hi, Solr does full recoveries very frequently - sometimes even for seemingly simple cases like adding a field to the schema, a couple of nodes go into recovery. It would be nice if it did not do such full recoveries so frequently but since that may require a lot of fixing, can we have a metric

Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-14 Thread Erick Erickson
I'm pretty sure you can use Streaming Expressions to get all the rows back from a sharded collection without chewing up lots of memory. Try: search(collection, q="id:*", fl="id", sort="id asc", qt="/export") on a sharded SolrCloud installation,
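A minimal sketch of how the expression above might be sent to Solr over HTTP, assuming the standard /stream request handler with an `expr` parameter. The host, port, and collection name below are placeholders, not taken from the message, and the helper name is hypothetical. The code only builds the URL; it does not contact a server.

```python
from urllib.parse import urlencode


def stream_url(base_url, collection, expr):
    """Build a URL for the /stream handler; the expression goes in 'expr'."""
    return f"{base_url}/{collection}/stream?" + urlencode({"expr": expr})


# Expression copied from the message; "collection" is the placeholder name used there.
expr = 'search(collection, q="id:*", fl="id", sort="id asc", qt="/export")'

# Assumed local SolrCloud node; adjust to your installation.
url = stream_url("http://localhost:8983/solr", "collection", expr)
print(url)
```

The URL-encoding step matters because the expression contains quotes, spaces, and parentheses that would otherwise break the query string.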

Re: Scoping SolrCloud setup

2018-03-14 Thread Erick Erickson
Scott: Eventually you'll hit the limit of your hardware, regardless of VMs. I've seen multiple VMs help a lot when you have really beefy hardware, as in 32 cores, 128G memory and the like. Otherwise it's iffier. re: sharding or not. As others wrote, sharding is only useful when a single

Re: SolrCloud update and luceneMatchVersion

2018-03-14 Thread Erick Erickson
Hendrik: There's one problem with IndexUpgraderTool. As Shawn points out, it does a forceMerge, which by default creates one large segment. This has some implications in terms of the number of deleted documents if the index has updates afterwards, see:

Zookeeper service?

2018-03-14 Thread Scott Prentice
We might be going at this wrong, but we've got Solr set up as a service, so if the machine goes down it'll restart. But without Zookeeper running as a service, that's not much help. I found the zookeeperd install, which in theory seems like it should do the trick, but that installs a new

Re: SolrCloud update and luceneMatchVersion

2018-03-14 Thread Hendrik Haddorp
Thanks for the detailed description! On 14.03.2018 16:11, Shawn Heisey wrote: On 3/14/2018 5:56 AM, Hendrik Haddorp wrote: So you are saying that we do not need to run the IndexUpgrader tool if we move from 6 to 7. Will the index be then updated automatically or will we get a problem once we

Re: solr query

2018-03-14 Thread Emir Arnautović
Right - I focused more on the fact that Albert was not just looking for the current month of the current year. Emir -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch Consulting Support Training - http://sematext.com/ > On 14 Mar 2018, at 17:43, Gus Heck

Re: solr query

2018-03-14 Thread Gus Heck
I think you have inadvertently "corrected" the intentional exclusive end on my range... [NOW/MONTH TO NOW/MONTH+1MONTH} On Wed, Mar 14, 2018 at 12:08 PM, Emir Arnautović < emir.arnauto...@sematext.com> wrote: > Hi Gus, > It is just current month, but Albert is interested in month, regardless of

Re: Scoping SolrCloud setup

2018-03-14 Thread Scott Prentice
Emir... Thanks for the input. Our larger collections are localized content, so it may make sense to shard those so we can target the specific index. I'll need to confirm how it's being used, if queries are always within a language or if they are cross-language. Thanks also for the link ..

Re: Scoping SolrCloud setup

2018-03-14 Thread Scott Prentice
Greg... Thanks. That's very helpful, and is inline with what I've been seeing. So, to be clear, you're saying that the size of all collections on a server should be less than the available RAM. It looks like we've got about 13GB of documents in all (and growing), so, if we're restricted to

Re: Why are cursor mark queries recommended over regular start, rows combination?

2018-03-14 Thread S G
Thanks everybody. This is lot of good information. And we should try to update this in the documentation too to help users make the right choice. I can take a stab at this if someone can point me how to update the documentation. Thanks SG On Tue, Mar 13, 2018 at 2:04 PM, Chris Hostetter

Re: solr query

2018-03-14 Thread Emir Arnautović
Hi Gus, It is just current month, but Albert is interested in month, regardless of year. It can be done with OR-ed conditions for different years: birthDate:[NOW/MONTH TO NOW/MONTH+1MONTH] OR birthDate:[NOW-1YEAR/MONTH TO NOW-1YEAR/MONTH+1MONTH] OR birthDate:[NOW-2YEAR/MONTH TO
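The OR-ed pattern described in this message can be generated rather than typed by hand. The following is a minimal sketch, assuming a date field named birthDate as in the message; the function name and the `years_back` parameter are hypothetical. It uses the exclusive upper bound `}` that Gus suggests elsewhere in this thread, so the first instant of the next month is not matched.

```python
def birthday_month_query(field="birthDate", years_back=2):
    """Return OR-ed range clauses covering the current month in each of the
    last `years_back` years plus the current year."""
    clauses = []
    for n in range(years_back + 1):
        # Solr date math is applied left to right: subtract years, then round to month.
        start = "NOW/MONTH" if n == 0 else f"NOW-{n}YEAR/MONTH"
        # '[' is an inclusive lower bound, '}' an exclusive upper bound.
        clauses.append(f"{field}:[{start} TO {start}+1MONTH}}")
    return " OR ".join(clauses)


print(birthday_month_query(years_back=1))
```

As Albert notes in his reply, the resulting query grows with the number of years covered, so this is an illustration of the pattern rather than a recommendation for a 100-year span.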

Re: solr query

2018-03-14 Thread Gus Heck
I think you can specify the current month with birthDate:[NOW/MONTH TO NOW/MONTH+1MONTH} does that work for you? On Wed, Mar 14, 2018 at 6:32 AM, Emir Arnautović < emir.arnauto...@sematext.com> wrote: > Actually you don’t have to add another field - there is function ms that > converts date to

Re: Implications of using implicit routing

2018-03-14 Thread Chris Ulicny
Shawn, I knew that the shard had to be specified by the indexing process or document, but I didn't realize that the uniqueness of the document across the collection also had to be handled outside of solr as well. We've used the compositeId router successfully to route documents, but it seemed

Re: Implications of using implicit routing

2018-03-14 Thread Shawn Heisey
On 3/14/2018 9:26 AM, Chris Ulicny wrote: We've been looking at using implicit for one of our collections, and there seems to be some weird behavior that we're not sure whether it was expected or not. Is it recommended to use a uniqueKey for implicit routing? Is the following behavior intended?

Re: In Place Updates not work as expected

2018-03-14 Thread Shawn Heisey
On 3/14/2018 8:27 AM, mganeshs wrote: As I mentioned before, since I am updating only docvalues i expect it should update in faster than updating normal field. Isn't it ? Maybe.  But not always. To do an in-place update, Solr must rewrite the docValues data for that field in that segment. 

Implications of using implicit routing

2018-03-14 Thread Chris Ulicny
Hi all, We've been looking at using implicit for one of our collections, and there seems to be some weird behavior that we're not sure whether it was expected or not. Is it recommended to use a uniqueKey for implicit routing? Is the following behavior intended? We have encountered the following

Re: Some performance questions....

2018-03-14 Thread Shawn Heisey
On 3/14/2018 5:49 AM, BlackIce wrote: I was just thinking Do I really need separate VM's in order to run multiple Solr instances? Doesn't it suffice to have each instance in its own user account? You can run multiple instances all under the same account on one machine.  But for a single

Re: SolrCloud update and luceneMatchVersion

2018-03-14 Thread Shawn Heisey
On 3/14/2018 5:56 AM, Hendrik Haddorp wrote: So you are saying that we do not need to run the IndexUpgrader tool if we move from 6 to 7. Will the index be then updated automatically or will we get a problem once we move to 8? If you don't run IndexUpgrader, and the index version is one that

Re: Solr Warming Up Doubts

2018-03-14 Thread Shawn Heisey
First, I need to state my agreement with what Alessandro told you.  A lot of what I am saying below is the same as that reply. On 3/7/2018 9:08 AM, Birender Rawat wrote: *1. FirstSearcher* I have added some 2 frequent used query but all my autowarmCount are set to 0. I have also added facet

Re: [nesting] Any way to return the whole hierarchical structure when doing Block Join queries?

2018-03-14 Thread Jan Høydahl
I tried to index a 3-level nested Block and expected the "1.2.1" document to have _root_=1.2 but it had the top-document as root. If each doc in addition would have a _parent_= field pointing to its nearest parent, then it would be possible to extend the [child] doc transformer to reconstruct

Re: [nesting] Any way to return the whole hierarchical structure when doing Block Join queries?

2018-03-14 Thread Anshum Gupta
Hi Jan, The way I remember it was done (or at least we did it) is by storing the depth information as a field in the document using an update request processor and using a custom transformer to reconstruct the original multi-level document from it. Also, this was a reasonably long time ago, so

Re: Solr on DC/OS ?

2018-03-14 Thread Shawn Heisey
On 3/14/2018 4:19 AM, Søren wrote: Hi, has anyone experience in running solr on DC/OS? If so, how is that achieved successfully? Solr is not in Universe. If this operating system has Oracle or OpenJDK Java (version 8 preferred) available, chances are that it will run Solr. If it can run

Re: LTR not able to upload org.apache.solr.ltr.model.MultipleAdditiveTreesModel

2018-03-14 Thread Alessandro Benedetti
This is the piece of code involved : "try { // create an instance of the model model = solrResourceLoader.newInstance( className, LTRScoringModel.class, new String[0], // no sub packages new Class[] { String.class, List.class, List.class,

Re: In Place Updates not work as expected

2018-03-14 Thread mganeshs
Hi Emir, I am using SolrJ to update the document. Is there any special API to be used for in-place updates? Yes, we are updating in batches of 1000 documents. As I mentioned before, since I am updating only docValues I expect it to update faster than updating a normal field. Isn't it?

Re: LTR not able to upload org.apache.solr.ltr.model.MultipleAdditiveTreesModel

2018-03-14 Thread Roopa Rao
Hi Alessandro, I figured the issue, the model was using a feature which was not in the features file. The error was very generic so was hard to find this. Thank you, Roopa On Wed, Mar 14, 2018 at 7:16 AM, Alessandro Benedetti wrote: > Hi Roopa, > that model changed name

Re: [nesting] Any way to return the whole hierarchical structure when doing Block Join queries?

2018-03-14 Thread Jan Høydahl
I understand that the [subquery] transformer can help build a nested response when you know the structure in advance, but what if you have some BlockJoin indexed structure with grand children (as the original question in this thread), and you want to reconstruct the full document based on what

Re: Some performance questions....

2018-03-14 Thread Deepak Goel
Have you measured the overhead of VM anytime? Or have you read it somewhere? On 14 Mar 2018 18:10, "BlackIce" wrote: > but it should be possible, without the overhead of VM's > > On Wed, Mar 14, 2018 at 1:30 PM, Deepak Goel wrote: > > > The OS

Re: Some performance questions....

2018-03-14 Thread BlackIce
but it should be possible, without the overhead of VM's On Wed, Mar 14, 2018 at 1:30 PM, Deepak Goel wrote: > The OS resources would be shared in that case > > On 14 Mar 2018 17:19, "BlackIce" wrote: > > > I was just thinking Do I really need

Re: Some performance questions....

2018-03-14 Thread Deepak Goel
The OS resources would be shared in that case On 14 Mar 2018 17:19, "BlackIce" wrote: > I was just thinking Do I really need separate VM's in order to run > multiple Solr instances? Doesn't it suffice to have each instance in its > own user account? > > Greetz > > On

Re: SolrCloud update and luceneMatchVersion

2018-03-14 Thread Hendrik Haddorp
So you are saying that we do not need to run the IndexUpgrader tool if we move from 6 to 7. Will the index be then updated automatically or will we get a problem once we move to 8? How would one use the IndexUpgrader at all with Solr? Would one need to run it against the index of every core?

Re: Some performance questions....

2018-03-14 Thread BlackIce
I was just thinking Do I really need separate VM's in order to run multiple Solr instances? Doesn't it suffice to have each instance in its own user account? Greetz On Mon, Mar 12, 2018 at 7:41 PM, BlackIce wrote: > I don't have any production logs and this all

Re: How to store files larger than zNode limit

2018-03-14 Thread Atita Arora
Thank you Markus , that's kind of relief to know ! Rick, I spent few minutes looking about puppet/ansible as I have not used them before, but this seems kind of doable. Let me give this a try and I'll let you know. Thanks, Atita On Wed, Mar 14, 2018 at 5:01 PM, Rick Leir

Re: Solr Warming Up Doubts

2018-03-14 Thread Alessandro Benedetti
I see quite a bit of confusion here : *1. FirstSearcher* I have added some 2 frequent used query but all my autowarmCount are set to 0. I have also added facet for warming. So if my autowarmCount=0, does this mean my queries are not getting cached. /First Searcher as the name suggests is the

Re: How to store files larger than zNode limit

2018-03-14 Thread Rick Leir
Could you manage userdict using Puppet or Ansible? Or whatever your automation system is. -- Sorry for being brief. Alternate email is rickleir at yahoo dot com

Re: LTR not able to upload org.apache.solr.ltr.model.MultipleAdditiveTreesModel

2018-03-14 Thread Alessandro Benedetti
Hi Roopa, that model changed name a few times, which Apache Solr version are you using ? It is very likely you are using a class name not in sync with your Apache Solr version. Regards - --- Alessandro Benedetti Search Consultant, R&D Software Engineer, Director Sease Ltd. -

Re: solr query

2018-03-14 Thread Emir Arnautović
Actually you don’t have to add another field - there is function ms that converts date to timestamp. What you can do is use frange query parser and play bit with math, e.g. sub(ms(date_field),ms(NOW/YEAR)) will give you ms elapsed since this year and you know that from 0 to 31*8640 is
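The arithmetic behind this suggestion can be sketched as follows. This is an illustration only: the helper names are hypothetical, `date_field` is the placeholder used in the message, and the frange bounds assume UTC and a date that falls in the current year (since the offset is taken from NOW/YEAR). One day is 86,400,000 ms, so a month's range is its day offsets from January 1 multiplied by that constant.

```python
from datetime import datetime

MS_PER_DAY = 86_400_000  # 24 * 60 * 60 * 1000


def month_offset_range_ms(year, month):
    """Return (start, end) offsets in ms from Jan 1 of `year` for `month`;
    `end` is the first instant of the following month."""
    jan1 = datetime(year, 1, 1)
    start = datetime(year, month, 1)
    end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
    return (start - jan1).days * MS_PER_DAY, (end - jan1).days * MS_PER_DAY


def frange_for_month(date_field, year, month):
    """Build an frange filter over sub(ms(field), ms(NOW/YEAR)) for one month.
    incu=false makes the upper bound exclusive."""
    lo, hi = month_offset_range_ms(year, month)
    return f"{{!frange l={lo} u={hi} incu=false}}sub(ms({date_field}),ms(NOW/YEAR))"


print(frange_for_month("date_field", 2018, 3))
```

The year is passed in explicitly because February's length shifts every later month's offsets in leap years.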

RE: How to store files larger than zNode limit

2018-03-14 Thread Rick Leir
Markus, Atita We set it higher too. When zk is recovering from a disconnected state it re-sends all the messages that it had been trying to send while the machines were disconnected. Is this stored in a ' transaction log' .tlog file? I am not clear on this. Zk also goes through the unsent

Solr on DC/OS ?

2018-03-14 Thread Søren
Hi, has anyone experience in running solr on DC/OS? If so, how is that achieved successfully? Solr is not in Universe. Thanks in advance, Soren

Re: SolrCloud update and luceneMatchVersion

2018-03-14 Thread Shawn Heisey
On 3/14/2018 3:04 AM, Hendrik Haddorp wrote: we have a SolrCloud 6.3 with HDFS setup and plan to upgrade to 7.2.1. The cluster upgrade instructions on https://lucene.apache.org/solr/guide/7_2/upgrading-a-solr-cluster.html does not contain any information on changing the luceneMatchVersion.

Re: Solr reload process flow

2018-03-14 Thread Emir Arnautović
Hi Akshay, 1. Solr creates new core with new schema/conf updates, opens a new searcher and replaces existing core if all went ok. If you have some issues with schema/conf or you ended up with corrupted index, loading core will fail but old one will stay there. 2. It is not creating new index,

RE: solr query

2018-03-14 Thread Albert Lee
I don’t want to add separate fields since I have many dates to index. How to index it as timestamp and do function query, any example or documentation? Regards, Albert From: Emir Arnautović Sent: Wednesday, March 14, 2018 5:38 PM To: solr-user@lucene.apache.org Subject: Re: solr query Hi

Re: Solr reload process flow

2018-03-14 Thread Shawn Heisey
On 3/14/2018 12:56 AM, Akshay Murarka wrote: I am using solr-5.4.0 in my production environment and am trying to automate the reload/restart process of the solr collections based on certain specific conditions. I noticed that on solr reload the thread count increases a lot there by resulting

Re: solr query

2018-03-14 Thread Emir Arnautović
Hi Albert, The simplest solution is to index month/year as separate fields. Alternative is to index it as timestamp and do function query to do some math and filter out records. Emir -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch Consulting Support Training -

RE: solr query

2018-03-14 Thread Albert Lee
NOW/MONTH and NOW/YEAR to get the start of month/year, but how can I get the current month regardless of year? Like the use case: people whose birthdate is this month? Regards, Albert From: Emir Arnautović Sent: Wednesday, March 14, 2018 5:26 PM To: solr-user@lucene.apache.org Subject: Re: solr

Re: solr query

2018-03-14 Thread Emir Arnautović
Hi Albert, It does - you can use NOW/MONTH and NOW/YEAR to get the start of month/year. Here is reference to date math: https://lucene.apache.org/solr/guide/6_6/working-with-dates.html#WorkingwithDates-DateMathSyntax

Re: Scoping SolrCloud setup

2018-03-14 Thread Emir Arnautović
Hi Scott, There is no definite answer - it depends on your documents and query patterns. Sharding does come with an overhead but also allows Solr to parallelise search. Query latency is usually something that tells you if you need to split collection to multiple shards or not. If you are

Solr reload process flow

2018-03-14 Thread Akshay Murarka
Hey, I am using solr-5.4.0 in my production environment and am trying to automate the reload/restart process of the solr collections based on certain specific conditions. I noticed that on solr reload the thread count increases a lot there by resulting in increased latencies. So I read about

SolrCloud update and luceneMatchVersion

2018-03-14 Thread Hendrik Haddorp
Hi, we have a SolrCloud 6.3 with HDFS setup and plan to upgrade to 7.2.1. The cluster upgrade instructions on https://lucene.apache.org/solr/guide/7_2/upgrading-a-solr-cluster.html does not contain any information on changing the luceneMatchVersion. If we change the luceneMatchVersion

Re: Boosting with 0 factor

2018-03-14 Thread Emir Arnautović
Hi Dariusz, It will match but it will not affect the score. If you have a single field boosted with 0, score will be 0. You can use debugQuery=true to see how query is parsed and see that there is a component even boost is 0. HTH, Emir -- Monitoring - Log Management - Alerting - Anomaly

AW: Navigation/Paging

2018-03-14 Thread Sebastian Riemer
Dear Shawn, thank you so much for taking the time for this detailed answer! It helps me very much and I'm very grateful. 1) As you've suggested, we already load the data for detail pages from our relational db, just using the documentId from Solr to look it up. 2) Our index size won't ever

AW: Navigation/Paging

2018-03-14 Thread Sebastian Riemer
Hi Rick, thanks for pointing this out - that's the solution I was thinking about too "... -> I guess this we could handle by >simply checking and sending a second query where the param "start" >would be adjusted accordingly ..." Just checking if there are other options, Thanks again!