Lose Solr config on zookeeper when it is restarted

2015-10-07 Thread CrazyDiamond
Sometimes when ZooKeeper (single mode) is restarted, it loses the Solr collections. Furthermore, when I manually upload it again, no state.json is created in the collection; clusterstate.json is created instead. I use Solr 5.1.0. -- View this message in context:

Re: Lose Solr config on zookeeper when it is restarted

2015-10-07 Thread Upayavira
On Wed, Oct 7, 2015, at 09:42 PM, CrazyDiamond wrote:
> sometimes when Zookeeper( single mode) is restarted it lose solr
> collections. Furthermore when i manually upload it again then no
> state.json
> is created in collection but clusterstate.json is created instead.
> i use solr 5.1.0
How

Re: Unexpected delayed document deletion with atomic updates

2015-10-07 Thread Upayavira
What ID are you using? Are you possibly using the same ID field for both, so the second document you visit causes the first to be overwritten?
Upayavira
On Wed, Oct 7, 2015, at 06:38 PM, Erick Erickson wrote:
> This certainly should not be happening. I'd
> take a careful look at what you

Re: Is solr.StandardDirectoryFactory an MMapDirectory?

2015-10-07 Thread Shawn Heisey
On 10/7/2015 12:00 PM, Eric Torti wrote:
> Can we read "high reopen rate" as "frequent soft commits"? (In our
> case, hard commits do not open a searcher. But soft commits do).
>
> Considering it does mean "frequent soft commits", I'd say that it
> doesn't fit our setup because we have an index

admin-extra

2015-10-07 Thread Upayavira
Do you use admin-extra within the admin UI? If so, please go to [1] and document your use case. The feature currently isn't implemented in the new admin UI, and without use-cases, it likely won't be - so if you want it in there, please help us understand how you use it! Thanks! Upayavira [1]

words n-gram analyser

2015-10-07 Thread vit
Does Solr 4.2 have an n-gram filter over words rather than characters, unlike EdgeNGramFilterFactory? I hoped NGramTokenFilterFactory would serve this purpose, but it looks like it also creates n-grams over characters. I used it this way in the hope of getting 3-word to 10-word n-grams

Re: Instant Page Previews

2015-10-07 Thread Alexandre Rafalovitch
I don't think that particular functionality has anything directly to do with Solr. You will have a server component that indexes the web page (I am guessing) into Solr. That same component can generate the preview image. Your frontend UI will get the URL/id from Solr and display the related image. Solr

Re: Numeric Sorting with 0 and NULL Values

2015-10-07 Thread Todd Long
Todd Long wrote
> I'm curious as to where the loss of precision would be when using
> "-(Double.MAX_VALUE)" as you mentioned? Also, any specific reason why you
> chose that over Double.MIN_VALUE (sorry, just making sure I'm not missing
> something)?
So, to answer my own question it looks like
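One fact that bears on the question above: in Java, Double.MIN_VALUE is the smallest positive double (about 4.9e-324), not the most negative one, so it does not sort below negative numbers. A minimal sketch in Python, whose floats use the same IEEE 754 double format; the helper name `sort_key` and the sample values are made up for illustration, and this is not how Solr itself handles missing values.

```python
import sys

# Python floats are IEEE 754 doubles, like Java's double.
MAX_VALUE = sys.float_info.max   # counterpart of Java's Double.MAX_VALUE
MIN_POSITIVE = 5e-324            # counterpart of Java's Double.MIN_VALUE (smallest positive value)


def sort_key(value, missing):
    """Return a sort key, substituting `missing` for None (NULL)."""
    return missing if value is None else value


values = [3.5, None, 0.0, -2.0]

# Using -MAX_VALUE as the stand-in puts NULLs below every real number.
nulls_first = sorted(values, key=lambda v: sort_key(v, -MAX_VALUE))

# Using the smallest positive double puts NULLs just above zero instead.
nulls_after_zero = sorted(values, key=lambda v: sort_key(v, MIN_POSITIVE))
```

With the first stand-in the NULL lands before -2.0; with the second it lands between 0.0 and 3.5, which is usually not what a "sort NULLs last or first" requirement intends.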

RE: Cannot connect to a zookeeper 3.4.6 instance via zkCli.cmd

2015-10-07 Thread Adrian Liew
Hi Shawn, To reiterate, this is the exception I get if unable to connect to Zookeeper service:
E:\solr-5.3.0\server\scripts\cloud-scripts>zkcli.bat -z 10.0.0.4:2181 -cmd list
Exception in thread "main" org.apache.solr.common.SolrException: java.util.concurrent.TimeoutException: Could not

RE: If zookeeper is down, SolrCloud nodes will not start correctly, even if zookeeper is started later

2015-10-07 Thread Adrian Liew
Hi Shawn, Thanks for informing me. I guess the worst-case scenario is that all 3 ZK services are down, and that is unlikely to be the case. At this juncture, as you said, the viable workaround is a manual approach: start up the services in sequence to ensure a quorum can take place. So the

Re: If zookeeper is down, SolrCloud nodes will not start correctly, even if zookeeper is started later

2015-10-07 Thread Shawn Heisey
On 10/6/2015 10:22 PM, Adrian Liew wrote:
> Hence, the issue is that upon startup of three machines, the startup of ZK
> and Solr is out of sequence that causes SolrCloud to behave unexpectedly.
> Noting there is Jira ticket addressed here for Solr 4.9 above to include an
> improvement to the

Re: Lose Solr config on zookeeper when it is restarted

2015-10-07 Thread Erick Erickson
Sounds like you're somehow mixing old and new versions of the ZK state when you restart. I have no idea how that would be happening, but... It's consistent. If you're somehow creating collections with the new format where state.json is kept per collection, but when you restart you're somehow

Re: Run Solr 5.3.0 as a Service on Windows using NSSM

2015-10-07 Thread Zheng Lin Edwin Yeo
Hi Adrian and Upayavira, It works fine when I start Solr outside NSSM. As for the NSSM, so far I haven't tried the automatic startup yet. I start the services for ZooKeeper and Solr in NSSM manually from the Windows Component Services, so the ZooKeeper will have been started before I start Solr.

Re: Lose Solr config on zookeeper when it is restarted

2015-10-07 Thread CrazyDiamond
ZK is stand-alone, but I think the Solr node is ephemeral.

Re: words n-gram analyser

2015-10-07 Thread Erick Erickson
I think that ShingleFilterFactory is what you're looking for, see: https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
Best, Erick
On Wed, Oct 7, 2015 at 4:29 PM, vit wrote:
> Does Solr 4.2 have n-gram filter over words, not symbols like
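To illustrate what "n-grams over words" (shingles) means, here is a minimal sketch in Python. It is not Solr's implementation; the function name `word_shingles` and its parameters are made up for illustration, loosely mirroring the minShingleSize and maxShingleSize attributes of ShingleFilterFactory, and it assumes plain whitespace tokenization.

```python
def word_shingles(text, min_size=3, max_size=10):
    """Return every run of min_size..max_size consecutive words in `text`."""
    tokens = text.split()  # assumption: whitespace tokenization only
    shingles = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            end = start + size
            if end > len(tokens):
                break  # not enough words left for this or any larger size
            shingles.append(" ".join(tokens[start:end]))
    return shingles


example = word_shingles("please divide this sentence", min_size=2, max_size=3)
```

Each shingle is built from whole words, in contrast to NGramTokenFilterFactory and EdgeNGramFilterFactory, which slice the characters of a single token.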

Re: tlog replay

2015-10-07 Thread Erick Erickson
The only way I can account for such a large file off the top of my head is if, for some reason, the Solr on the node somehow was failing to index documents and kept adding them to the log for a long time. But how that would happen without the node being in recovery mode I'm not sure. I

Scramble data

2015-10-07 Thread Tarala, Magesh
Folks, I have a strange question. We have a Solr implementation that we would like to demo to external customers. But we don't want to display the real data, which contains our customer information and so is sensitive data. What's the best way to scramble the data of the Solr Query results? By

Re: tlog replay

2015-10-07 Thread Erick Erickson
Uhm, that's very weird. Updates are not applied from the tlog. Rather the raw doc is forwarded to the replica which both indexes the doc and writes it to the local tlog. So having a 14G tlog on a follower but a small tlog on the leader is definitely strange, especially if it persists over time. I

Re: tlog replay

2015-10-07 Thread Rallavagu
Thanks Erick. Eventually, followers caught up but the 14G tlog file still persists and they are healthy. Is there anything to look for? Will monitor and see how long will it take before it disappears. Evaluating move to Solr 5.3. On 10/7/15 7:51 PM, Erick Erickson wrote: Uhm, that's very

tlog replay

2015-10-07 Thread Rallavagu
Solr 4.6.1, single shard, 4-node cloud, 3-node ZK. I'd like to understand the behavior better when a large number of updates happen on the leader and generate a huge tlog (sometimes 14G in my case) on the other nodes. At the same time, the leader's tlog is a few KB. So, what is the rate at which the changes from

Exclude documents having same data in two fields

2015-10-07 Thread Aman Tandon
Hi, Is there a way in Solr to remove from the search results all documents in which two of the fields, *mapping* and *title*, are exactly the same? With Regards Aman Tandon

Re: Scramble data

2015-10-07 Thread Erick Erickson
Probably sanitize the data on the front end? Something simple like putting "REDACTED" in all of the customer-sensitive fields. You might also write a DocTransformer plugin; all you have to do is subclass DocTransformer and override one very simple "transform" method.
Best, Erick
On Wed,
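The front-end approach suggested above can be sketched in a few lines of Python. This is only an illustration: the field names in `SENSITIVE_FIELDS` are hypothetical, and the documents are assumed to arrive as plain dictionaries, for example parsed from a JSON query response.

```python
# Hypothetical field names; replace with the fields that hold customer data.
SENSITIVE_FIELDS = {"customer_name", "email"}


def redact(docs, sensitive_fields=SENSITIVE_FIELDS, placeholder="REDACTED"):
    """Return copies of `docs` with sensitive field values replaced."""
    return [
        {
            name: (placeholder if name in sensitive_fields else value)
            for name, value in doc.items()
        }
        for doc in docs
    ]


results = [{"id": "42", "customer_name": "Acme Corp", "email": "a@example.com"}]
demo_results = redact(results)
```

The original documents are left untouched, so the same result list can still be used internally while only the redacted copies are rendered in the demo.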

Re: Pressed optimize and now SOLR is not indexing while optimize is going on

2015-10-07 Thread Eric Torti
Hello Shawn, I'm sorry to divert this thread a little bit. But could you please point me to resources that explain in depth how the OS uses the non-Java memory to cache index data?
> Whatever RAM is left over after you give 12GB to Java for Solr will be
> used automatically by the

Re: Query to count matching terms and disable 'coord' multiplication

2015-10-07 Thread Alessandro Benedetti
Hi, regarding 1, you should take a look at all the similarity implementations; maybe there's a good fit there for your use case! Another interesting read could be: http://opensourceconnections.com/blog/2014/01/20/build-your-own-custom-lucene-query-and-scorer/ from Doug. I remember I saw that

Re: Run Solr 5.3.0 as a Service on Windows using NSSM

2015-10-07 Thread Upayavira
Wrap your script that starts Solr with one that checks it can access Zookeeper before attempting to start Solr, that way, once ZK starts, Solr will come up. Then, hand *that* script to NSSM. And finally, when one of you has got a setup that works with NSSM starting Solr via the default
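The wrapper idea above can be sketched as a small Python helper that blocks until a TCP port accepts connections. This is a sketch under assumptions: the function name `wait_for_port` and its parameters are made up for illustration, and a successful TCP connect only shows that something is listening on the port, not that ZooKeeper has a quorum.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=60.0, interval_s=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError if nothing is listening yet.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

A wrapper script would call something like `wait_for_port("10.0.0.4", 2181)` (the address and port are taken from the zkcli example elsewhere in this listing) and start Solr only when it returns True; that wrapper is then what gets handed to NSSM.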

Re: efficient sort by title (multi word field)

2015-10-07 Thread Alessandro Benedetti
Hi Gili, if you want to use sorting, you will need to index the extra string field. I suggest you enable only DocValues on that field. DocValues are not compatible with analysed text fields, and neither is sorting. [1]

Re: Pressed optimize and now SOLR is not indexing while optimize is going on

2015-10-07 Thread Toke Eskildsen
On Wed, 2015-10-07 at 07:03 -0300, Eric Torti wrote:
> I'm sorry to diverge this thread a little bit. But could please point me to
> resources that explain deeply how this process of OS using the non-java
> memory to cache index data?

RE: Run Solr 5.3.0 as a Service on Windows using NSSM

2015-10-07 Thread Adrian Liew
Hi Edwin, You may want to try exploring some of the configuration properties in ZooKeeper. http://zookeeper.apache.org/doc/r3.4.5/zookeeperAdmin.html#sc_zkMulitServerSetup My recommendation is to try running your batch files outside of NSSM so it is easier to debug and observe what you

Highlighting tag is not showing occasionally

2015-10-07 Thread Zheng Lin Edwin Yeo
Hi, Has anyone faced the problem where, when using highlighting, some results are returned without any highlighting (i.e. no tag)? I found that there is a match in another field which I did not include in my hl.fl parameters when I do fl=*, but that same word

Re: Pivot facets

2015-10-07 Thread Alessandro Benedetti
I agree with Hoss, is this what you are expecting ? Indexing ... Doc 1 : Country: England Region: Greater London City: London Doc2 : Country:England City: Manchester Query results Country : England(2) Region : Greater london(1) City:

Fuzzy search for names and phrases

2015-10-07 Thread vit
Could someone share their experience applying fuzzy name search with Solr? It should not be just the kind that uses edit distance. I also want to cover cases with split and merge like "OneIndustrial" vs "One Industrial", etc.

Unexpected delayed document deletion with atomic updates

2015-10-07 Thread John Smith
Hi, I'm bumping on the following problem with update XML messages. The idea is to record the number of clicks for a document: each time, a message is sent to .../update such as this one: abc 1 1.05 (Clicks is an int field; Boost is a float field, it's updated to reflect the change in
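The XML markup of the update message was lost in this archive snippet; only the values "abc", "1" and "1.05" survive. As a rough sketch of what an atomic update of this kind generally looks like, the Python below builds one with the standard library. The unique-key field name "id" and the choice of the "inc" and "set" modifiers are assumptions, not the poster's confirmed message.

```python
import xml.etree.ElementTree as ET


def click_update(doc_id, boost, id_field="id"):
    """Build an atomic-update XML message: increment Clicks, overwrite Boost."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    # Assumption: the unique key field is called "id".
    ET.SubElement(doc, "field", name=id_field).text = doc_id
    # Assumption: "inc" adds 1 to the current Clicks value.
    ET.SubElement(doc, "field", name="Clicks", update="inc").text = "1"
    # Assumption: "set" replaces the stored Boost value.
    ET.SubElement(doc, "field", name="Boost", update="set").text = str(boost)
    return ET.tostring(add, encoding="unicode")


message = click_update("abc", 1.05)
```

Sending such a message individually (with curl or post.jar, as suggested in the replies) and inspecting exactly what goes over the wire is a quick way to confirm that every field carries an update modifier; a field sent without one is treated as a plain value in a full document replacement.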

Re: Pressed optimize and now SOLR is not indexing while optimize is going on

2015-10-07 Thread Shawn Heisey
On 10/7/2015 4:03 AM, Eric Torti wrote:
> I'm sorry to diverge this thread a little bit. But could please point me to
> resources that explain deeply how this process of OS using the non-java
> memory to cache index data?
>
>> Whatever RAM is left over after you give 12GB to Java for Solr will be

Re: Pressed optimize and now SOLR is not indexing while optimize is going on

2015-10-07 Thread Siddhartha Singh Sandhu
Thanks from my end too. And thanks for the question Eric that added a lot to my understanding as well.
Regards. Sid.
On Wed, Oct 7, 2015 at 10:04 AM, Eric Torti wrote:
> Cool, Toke and Shawn!
>
> That's exactly what I was looking for. I'll have a look at those resources
>

Re: Pressed optimize and now SOLR is not indexing while optimize is going on

2015-10-07 Thread Eric Torti
Cool, Toke and Shawn! That's exactly what I was looking for. I'll have a look at those resources and if something is yet unclear I'll open a thread for it.
Thanks for the information, Eric
On Wed, Oct 7, 2015 at 10:29 AM, Shawn Heisey wrote:
> On 10/7/2015 4:03 AM, Eric

Re: Solr cross core join special condition

2015-10-07 Thread Ryan Josal
I developed a join transformer plugin that did that (although it didn't flatten the results like that). The one thing that was painful about it is that the TextResponseWriter has references to both the IndexSchema and SolrReturnFields objects for the primary core. So when you add a SolrDocument

Re: EdgeNGramFilterFactory question

2015-10-07 Thread vit
any experience with EdgeNGramFilterFactory will be appreciated

Re: If zookeeper is down, SolrCloud nodes will not start correctly, even if zookeeper is started later

2015-10-07 Thread Shawn Heisey
On 10/7/2015 3:06 AM, Adrian Liew wrote:
> Thanks for informing me. I guess the worst case scenario is that all 3 ZK
> services are down and that may be unlikely the case. At this juncture, as you
> said the viable workaround is a manual approach to start up the services in
> sequence in

Re: Is solr.StandardDirectoryFactory an MMapDirectory?

2015-10-07 Thread Shawn Heisey
On 10/7/2015 8:48 AM, Eric Torti wrote:
> <directoryFactory class="${solr.directoryFactory:solr.StandardDirectoryFactory}"
> name="DirectoryFactory"/>
>
> I'm just starting to grasp different strategies for Directory
> implementation. Can I assume that solr.StandardDirectoryFactory is a
> MMapDirectory as

Is solr.StandardDirectoryFactory an MMapDirectory?

2015-10-07 Thread Eric Torti
Hello, I'm running a 5.2.1 SolrCloud cluster and I see that one of my cores is configured under solrconfig.xml to use
<directoryFactory class="${solr.directoryFactory:solr.StandardDirectoryFactory}" name="DirectoryFactory"/>
I'm just starting to grasp different strategies for Directory implementation. Can I assume that solr.StandardDirectoryFactory is a MMapDirectory as described by Uwe Schindler

Re: Unexpected delayed document deletion with atomic updates

2015-10-07 Thread Erick Erickson
This certainly should not be happening. I'd take a careful look at what you actually send. My _guess_ is that you're not sending the update command you think you are. As a test you could just curl (or use post.jar) to send these types of commands up individually. Perhaps looking at the solr

Instant Page Previews

2015-10-07 Thread Lewin Joy (TMS)
Hi, Is there any way we can implement instant page previews in Solr? I just saw that Google Search Appliance has this out of the box, just like what google.com had previously. We need to display the content of the result record when hovering over the link. Thanks, Lewin

Re: EdgeNGramFilterFactory question

2015-10-07 Thread Walter Underwood
You would need an analyzer or char filter factory that removed all spaces. But then you would only get one “edge”. That would make “to be or not to be” into the single token “tobeornottobe”. I don’t think that fixes anything. Stemming and prefix matching do very different things. Use them in
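The point about getting only one "edge" can be seen in a short Python sketch. It only imitates the effect of removing spaces and then taking edge n-grams; the function names are made up for illustration and this is not Solr's analysis chain.

```python
def strip_spaces(text):
    """Collapse a phrase into a single token by dropping all whitespace."""
    return "".join(text.split())


def edge_ngrams(token, min_size=1, max_size=None):
    """Return the prefixes of `token` with lengths min_size..max_size."""
    if max_size is None:
        max_size = len(token)
    upper = min(max_size, len(token))
    return [token[:n] for n in range(min_size, upper + 1)]


single_token = strip_spaces("to be or not to be")
prefixes = edge_ngrams(single_token, min_size=1, max_size=5)
```

Every prefix starts at the beginning of the phrase, so a query such as "not to" would never match: the interior word boundaries are gone once the spaces are removed.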

Re: Pressed optimize and now SOLR is not indexing while optimize is going on

2015-10-07 Thread Walter Underwood
Unix has a “buffer cache”, often called a file cache. This chapter discusses the Linux buffer cache, which is very similar to other Unix implementations. Essentially, all unused RAM is used to make disk access faster. http://www.tldp.org/LDP/sag/html/buffer-cache.html

Re: Solr cross core join special condition

2015-10-07 Thread Susheel Kumar
You may want to take a look at the new Solr feature, the Streaming API & Expressions https://issues.apache.org/jira/browse/SOLR-7584?filter=12333278, for making joins between collections.
On Wed, Oct 7, 2015 at 9:42 AM, Ryan Josal wrote:
> I developed a join transformer plugin that did

Re: Is solr.StandardDirectoryFactory an MMapDirectory?

2015-10-07 Thread Eric Torti
Thanks, Shawn.
> After a look at the code, I found that StandardDirectoryFactory should
> use MMap if the OS and Java version support it. If support isn't there,
> it will use conventional file access methods. As far as I know, all
> 64-bit Java versions and 64-bit operating systems will

Re: Is solr.StandardDirectoryFactory an MMapDirectory?

2015-10-07 Thread Eric Torti
Correcting: When I mentioned high non-JVM memory usage, what I probably meant was high virtual memory allocation.
On Wed, Oct 7, 2015 at 3:00 PM, Eric Torti wrote:
> Thanks, Shawn.
>
>> After a look at the code, I found that StandardDirectoryFactory should
>> use MMap if

how to deployed another web project into jetty server(solr inbuilt)

2015-10-07 Thread Mugeesh Husain
I am using Solr 5.3 with the inbuilt Jetty server. Everything Solr-related is working fine. My problem: I have a Spring-based web application for admin configuration. I don't have another server, so I want to deploy it to the Jetty server. I have googled but still could not find a suitable answer. can we

Re: how to deployed another web project into jetty server(solr inbuilt)

2015-10-07 Thread Daniel Collins
The short answer is that technically it might be possible, but it's not a supported configuration. As of Solr 5.x (I forget the exact version), the use of Jetty is an implementation detail. You should treat Solr as a black box; whether it uses Jetty or not is irrelevant, and not something you can "piggy