Re: Replication Problem from solr-3.6 to solr-4.0

2014-07-24 Thread Sree..
I did optimize the master and the slave started replicating the indices!

solr always loading and not any response

2014-07-24 Thread zhijun liu
Hi all, the Solr admin page is always loading, and when I send a query request I also cannot get any response. The TCP connection always stays ESTABLISHED. Only restarting the Solr service fixes it. How can I find out what the problem is? Solr: 4.6, Jetty: 8. Thanks so much.

Re: solr always loading and not any response

2014-07-24 Thread Alexandre Rafalovitch
Is it on the same machine or on a different one? Either way, try to open the developer console in the browser and see what's happening on the network when you reload. Also, check on the Solr side whether you get any messages in the console. Maybe you are hitting an exception of some sort. Regards,

Re: integrating Accumulo with solr

2014-07-24 Thread Ali Nazemian
Dear Joe, Hi, I am going to store the crawled web pages in Accumulo as the main storage part of my project, and I need to give this data to Solr for indexing and user searches. I need to do some social and web analysis on my data as well as having some security features. Therefore Accumulo is my

Need a tipp, how to find documents where content is tel aviv but user query is telaviv?

2014-07-24 Thread Sven Schönfeldt
Hi Solr-Users, what is the best way to find documents when the user misspells a word in the query? For example, the user searches for "telaviv". The search result should also include documents where the content is "tel aviv". Any tips or keywords on how to do that kind of query? Regards, Sven

Re: Need a tipp, how to find documents where content is tel aviv but user query is telaviv?

2014-07-24 Thread Alexandre Rafalovitch
How often does this happen? Could use synonyms if not too many. On 24/07/2014 3:08 pm, Sven Schönfeldt schoenfe...@subshell.com wrote: Hi Solr-Users, what is the best way to find documents, where the user write a wrong word in query. For example the user search for „telaviv“. the search

Where can I get information about sold Cloud H/W spec

2014-07-24 Thread Lee Chunki
Hi, I am trying to build a SolrCloud cluster. Do you know where I can get information like: * SolrCloud support for heterogeneous servers * HDD * SSD vs. SAS vs. …. * no RAID vs. RAID-5 vs. RAID-0 vs. …. * Network * 100MB vs. 1GB vs. …. * …. Of course, it will depend on data size, traffic and so

Re: Need a tipp, how to find documents where content is tel aviv but user query is telaviv?

2014-07-24 Thread Sven Schönfeldt
So I will need SynonymFilterFactory at indexing time, right? Any chance to get it to work at query time? On 24.07.2014 at 10:24, Alexandre Rafalovitch arafa...@gmail.com wrote: How often does this happen? Could use synonyms if not too many. On 24/07/2014 3:08 pm, Sven Schönfeldt

Re: Need a tipp, how to find documents where content is tel aviv but user query is telaviv?

2014-07-24 Thread Alexandre Rafalovitch
You can put the SynonymFilterFactory at query time as well. But it's less reliable. Especially if the text is tel aviv and the query is telaviv, you need to make sure to enable auto phrase search as well. Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and
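A rough illustration of the query-time idea in the message above: map the run-together form to the spaced form and search the spaced form as a phrase. This is a plain-Python sketch of the concept, not Solr's SynonymFilterFactory; the mapping table and the function name expand_query are made up for the example.

```python
# Hypothetical query-side synonym table: run-together form -> spaced form.
SYNONYMS = {"telaviv": "tel aviv"}


def expand_query(user_query):
    """Return the query with each known term OR-ed with its synonym as a phrase."""
    parts = []
    for term in user_query.lower().split():
        synonym = SYNONYMS.get(term)
        if synonym is None:
            parts.append(term)
        else:
            # Quote the multi-word synonym so it is searched as a phrase,
            # which is the "auto phrase search" concern raised in the thread.
            parts.append('({} OR "{}")'.format(term, synonym))
    return " ".join(parts)


print(expand_query("hotel telaviv"))
```

The phrase quoting matters: without it, "tel aviv" would match documents that contain the two words anywhere, not next to each other.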

Re: Where can I get information about sold Cloud H/W spec

2014-07-24 Thread Alexandre Rafalovitch
Have you tried searching the mailing list archives? Some of these things have been discussed a number of times. SSDs are definitely good for Solr. But also you may get more specific help if you say what kind of volume/throughput of data you are looking at. Regards, Alex.

Re: How to migrate content of a collection to a new collection

2014-07-24 Thread Per Steffensen
On 23/07/14 17:13, Erick Erickson wrote: Per: Given that you said that the field redefinition also includes routing info Exactly. It would probably be much faster to make sure that the new collection has the same number of shards on each Solr-machine and that the routing-ranges are

Re: How to migrate content of a collection to a new collection

2014-07-24 Thread Per Steffensen
Thanks for replying. I tried this poor man's cursor approach out ad hoc, but I get an OOM. Pretty sure this is because you need all uniqueKey values in FieldCache in order to be able to sort on it. We do not have memory for that, and never will. Our uniqueKey field does not use DocValues. Just out of
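For readers unfamiliar with the term, here is a minimal sketch of what a "poor man's cursor" is usually taken to mean: sort by the unique key and, for each page, ask only for keys greater than the last one seen. That reading is an assumption, since the message does not spell it out. The in-memory fetch_page function is a hypothetical stand-in for a Solr request; the sort it performs is the step that, according to the message, needs every uniqueKey value in FieldCache.

```python
def fetch_page(doc_ids, last_id, rows):
    """Stand-in for a Solr request: ids greater than last_id, sorted ascending."""
    matching = sorted(d for d in doc_ids if last_id is None or d > last_id)
    return matching[:rows]


def walk_all(doc_ids, rows=2):
    """Visit every id once by repeatedly asking for the page after the last id seen."""
    seen = []
    last_id = None
    while True:
        page = fetch_page(doc_ids, last_id, rows)
        if not page:
            break
        seen.extend(page)
        # The last id of this page becomes the lower bound of the next one.
        last_id = page[-1]
    return seen


print(walk_all(["c", "a", "e", "b", "d"]))
```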

RE: Any Solr consultants available??

2014-07-24 Thread Markus Jelsma
Hahaha thanks wunder, made me laugh! -Original message- From:Walter Underwood wun...@wunderwood.org Sent: Thursday 24th July 2014 2:07 To: solr-user@lucene.apache.org Subject: Re: Any Solr consultants available?? When I see job postings like this, I have to assume they were

Re: Mixing ordinary and nested documents

2014-07-24 Thread Bjørn Axelsen
thank you very much :-) 2014-07-22 16:34 GMT+02:00 Umesh Prasad umesh.i...@gmail.com: public static DocSet mapChildDocsToParentOnly(DocSet childDocSet) { DocSet mappedParentDocSet = new BitDocSet(); DocIterator childIterator = childDocSet.iterator(); while

Tracking request for Solr index backup

2014-07-24 Thread zzT
One way to get an index backup in Solr is through an HTTP call like this: http://localhost:8983/solr/replication?command=backup I have 2 questions regarding this: 1) Is there a way to get information on the progress of the backup operation, much like the async param that was introduced in 4.8
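A small sketch of building the replication handler URLs from the question, using only the Python standard library. The backup URL is the one quoted in the message. The details command also exists on the replication handler, but whether its output answers the progress question is an assumption here, not something the thread confirms. The helper name replication_url is made up.

```python
from urllib.parse import urlencode

# Base URL taken from the message above.
BASE = "http://localhost:8983/solr/replication"


def replication_url(command, **params):
    """Build a replication handler URL for the given command."""
    query = {"command": command}
    query.update(params)
    return BASE + "?" + urlencode(query)


print(replication_url("backup"))
# Assumption: polling "details" afterwards may show the state of the last backup.
print(replication_url("details", wt="json"))
```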

Re: how to achieve static boost in solr

2014-07-24 Thread rahulmodi
Thanks a lot Erick, I have looked at the Query Elevation Component. It works, but the problem is that if I need to add a new query tag or update an existing query tag in the elevate.xml file, I need to restart the server for it to take effect. I have also used forceElevation=true; even then it requires

Re: integrating Accumulo with solr

2014-07-24 Thread Joe Gresock
Ali, Sounds like a good choice. It's pretty standard to store the primary storage id as a field in Solr so that you can search the full text in Solr and then retrieve the full document elsewhere. I would recommend creating a document structure in Solr with whatever fields you want indexed (most
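The pattern described above (search in Solr, then fetch the full document from the primary store by its id) can be sketched with two in-memory stand-ins. Nothing here talks to Accumulo or Solr; the dictionaries, field names, and function names are all hypothetical and only show the flow of ids between the two systems.

```python
# Stand-in for the primary store (Accumulo in the thread): row id -> full document.
PRIMARY_STORE = {
    "row1": "full text of the first crawled page about solr",
    "row2": "full text of the second crawled page about accumulo",
}

# Stand-in for the Solr index: the primary storage id plus the searchable fields.
SEARCH_INDEX = [
    {"id": "row1", "text": "solr"},
    {"id": "row2", "text": "accumulo"},
]


def search_ids(term):
    """Return the primary storage ids of index entries matching the term."""
    return [doc["id"] for doc in SEARCH_INDEX if term in doc["text"]]


def search_and_fetch(term):
    """Search the index first, then load each full document from the primary store."""
    return [PRIMARY_STORE[row_id] for row_id in search_ids(term)]


print(search_and_fetch("accumulo"))
```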

Auto Suggest

2014-07-24 Thread benjelloun
Hello, Does solr.SuggestComponent work on a multiValued field, to suggest not just one word but the whole sentence? <field name="suggestField" type="textSuggest" multiValued="true" indexed="true"/> Regards, Anass BENJELLOUN

Re: Need a tipp, how to find documents where content is tel aviv but user query is telaviv?

2014-07-24 Thread Sven Schönfeldt
Thank You Alex! On 24.07.2014 at 11:08, Alexandre Rafalovitch arafa...@gmail.com wrote: You can put the SynonymFilterFactory at query time as well. But it's less reliable. Especially if the text is tel aviv and the query is telaviv, you need to make sure to enable auto phrase search as well.

Re: Need a tipp, how to find documents where content is tel aviv but user query is telaviv?

2014-07-24 Thread Jack Krupansky
Google handles this type of word concatenation quite well... but Solr does not out of the box, at least not automatically. Solr does have a word break spell checker: https://cwiki.apache.org/confluence/display/solr/Spell+Checking And described in more detail, with examples in my

Re: Need a tipp, how to find documents where content is tel aviv but user query is telaviv?

2014-07-24 Thread Sven Schönfeldt
Thanks! That's my core problem, to let Solr search a bit like GSA :-) Greetz On 24.07.2014 at 14:27, Jack Krupansky j...@basetechnology.com wrote: Google handles this type of word concatenation quite well... but Solr does not out of the box, at least in terms of automatically. Solr does

Java heap space error

2014-07-24 Thread Ameya Aware
Hi, I am in the process of indexing around 200,000 documents. I have increased the Java heap space to 4 GB using the command below: java -Xmx4096M -Xms4096M -jar start.jar Still, after indexing around 15000 documents, it gives a Java heap space error again. Any fix for this? Thanks, Ameya

Re: how to fully test a response writer

2014-07-24 Thread Mikhail Khludnev
Hello, I think you can check TestDistributedSearch or other descendants of BaseDistributedSearchTestCase (I don't think you need to look at *Zk* tests). On Wed, Jul 23, 2014 at 5:03 PM, Matteo Grolla matteo.gro...@gmail.com wrote: Hi, I developed a new SolResponseWriter but I'm not

Re: Passivate core in Solr Cloud

2014-07-24 Thread Aurélien MAZOYER
Thank you Erick and Alex for your answers. The "lots of cores" stuff seems to meet my requirement, but it is a problem if it does not work with SolrCloud. Is there an issue open for this problem? If I understand correctly, the only solution for me is to use multiple monoinstances of Solr using transient

Re: Java heap space error

2014-07-24 Thread Marcello Lorenzi
Hi, Did you set a Garbage collection strategy on your JVM ? Marcello On 07/24/2014 03:32 PM, Ameya Aware wrote: Hi I am in process of indexing around 2,00,000 documents. I have increase java jeap space to 4 GB using below command : java -Xmx4096M -Xms4096M -jar start.jar Still after

Re: Java heap space error

2014-07-24 Thread Ameya Aware
I did not make any other change than this.. rest of the settings are default. Do i need to set garbage collection strategy? On Thu, Jul 24, 2014 at 9:49 AM, Marcello Lorenzi mlore...@sorint.it wrote: Hi, Did you set a Garbage collection strategy on your JVM ? Marcello On 07/24/2014

Re: Java heap space error

2014-07-24 Thread Marcello Lorenzi
I think that with a large heap it is advisable to monitor the garbage collection behavior and try to add a strategy adapted to your performance needs. In my production environment, with a heap of 6 GB, I set these parameters (server with 8 cores): -server -Xms6144m -Xmx6144m -XX:MaxPermSize=512m

Re: Java heap space error

2014-07-24 Thread Ameya Aware
Ooh, OK. So you mean that since I am using a large heap but didn't set my garbage collection, that's why I am getting the Java heap space error? On Thu, Jul 24, 2014 at 9:58 AM, Marcello Lorenzi mlore...@sorint.it wrote: I think that on large heap is suggested to monitor the garbage

Re: Java heap space error

2014-07-24 Thread François Schiettecatte
A default garbage collector will be chosen for you by the VM, might help to get the stack trace to look at. François On Jul 24, 2014, at 10:06 AM, Ameya Aware ameya.aw...@gmail.com wrote: ooh ok. So you want to say that since i am using large heap but didnt set my garbage collection,
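One way to have something concrete to look at after a heap space error, sketched as a small Python helper that assembles the start command from the earlier message plus two standard HotSpot options for writing a heap dump when an OutOfMemoryError occurs. The helper name and the dump directory are hypothetical, and adding these flags is a suggestion of this sketch, not something proposed in the thread.

```python
def solr_start_command(heap_mb, dump_dir):
    """Build the java command line with equal min/max heap and heap-dump-on-OOM."""
    heap = "{}M".format(heap_mb)
    return [
        "java",
        "-Xms" + heap,
        "-Xmx" + heap,
        # Standard HotSpot flags: write a heap dump when an OutOfMemoryError occurs.
        "-XX:+HeapDumpOnOutOfMemoryError",
        "-XX:HeapDumpPath=" + dump_dir,
        "-jar",
        "start.jar",
    ]


# "/tmp/solr-dumps" is an example path, not one from the thread.
print(" ".join(solr_start_command(4096, "/tmp/solr-dumps")))
```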

Re: integrating Accumulo with solr

2014-07-24 Thread Ali Nazemian
Thank you very much. Nice idea, but how can Solr and Accumulo be synchronized in this way? I know that Solr can be integrated with HDFS, and Accumulo also works on top of HDFS. So can I use HDFS as the integration point? I mean set Solr to use HDFS as a source of documents as well as the

Re: Java heap space error

2014-07-24 Thread Boon Low
How about simply increasing the heap size if RAM is available? You should also check the update handler config, e.g. auto commit, if docs aren’t being written to disk, they would be hanging around in memory. And “openSearcher” setting too as opening new searchers consumes memory, especially if

Re: integrating Accumulo with solr

2014-07-24 Thread Jack Krupansky
If you are not a true hard-core gunslinger who is willing to dive in and integrate the code yourself, instead you should give serious consideration to a product such as DataStax Enterprise that fully integrates and packages a NoSQL database (Cassandra) and Solr for search. The security aspects

Re: integrating Accumulo with solr

2014-07-24 Thread Erik Hatcher
Just FYI, the blog Joe mentioned below (authored by me) has been adjusted to Solr 4.x in the original blog location here: http://searchhub.org/2012/02/22/custom-security-filtering-in-solr/ Erik On Jul 24, 2014, at 8:03 AM, Joe Gresock jgres...@gmail.com wrote: Ali, Sounds like

Get Data under Highlight Json value pair

2014-07-24 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
I am trying to get the content under the highlighting JSON string, but I am not able to map the values as the highlighting has and values. E.g. below. How can I get the value? Is there any option in the query syntax? Currently I use hl=on and hl.fl=List of Fields highlighting:{ :{

SolrJ POJO Annotations

2014-07-24 Thread David Philip
Hi, This question is related to SolrJ document as a bean. I have an entity that has another entity within it. Could you please tell me how to annotate for inner entities? The issue I am facing is the inner entities fields are missing while indexing. In the below example, It is just adding

Re: Need a tipp, how to find documents where content is tel aviv but user query is telaviv?

2014-07-24 Thread Jack Krupansky
And I should have added that the advantage of the word break approach is that it automatically handles both splitting and combining words, all based on the index, with no need to mess with creating synonyms. Also, there is a dictionary-based filter called
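To make the "splitting and combining, all based on the index" idea concrete, here is a toy sketch that checks candidate splits and joins against a set of indexed terms. It illustrates the concept only and is not how Solr's word break spell checker is implemented; the vocabulary sets and function names are invented.

```python
def split_word(term, vocabulary):
    """Return every two-way split of term whose halves are both indexed terms."""
    return [
        (term[:i], term[i:])
        for i in range(1, len(term))
        if term[:i] in vocabulary and term[i:] in vocabulary
    ]


def combine_words(first, second, vocabulary):
    """Return the joined form if it is an indexed term, otherwise None."""
    joined = first + second
    return joined if joined in vocabulary else None


# Hypothetical set of terms present in the index.
indexed_terms = {"tel", "aviv", "hotel"}
print(split_word("telaviv", indexed_terms))
print(combine_words("tel", "aviv", indexed_terms))
```

Because both directions consult the indexed terms, no synonym list has to be maintained by hand, which is the advantage the message points out.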

To warm the whole cache of Solr other than the only autowarmcount

2014-07-24 Thread YouPeng Yang
Hi, I think it is wonderful to have caches autowarmed when a commit or soft commit happens. However, if I want to warm the whole cache rather than only autowarmCount entries, the default autowarming operation will take a very long time. So it occurred to me that maybe it is a good idea to just

Multipart documents with different update cycles

2014-07-24 Thread Aurélien MAZOYER
Hello, I have to index a dataset containing multipart documents. The main part and the user metadata part have different update cycles: we want to update the user metadata part frequently without having to refetch the main part from the datasource or storing every field in order to use

Re: How to migrate content of a collection to a new collection

2014-07-24 Thread Chris Hostetter
: I tried this poor mans cursor approach out ad-hoc, but I get OOM. Pretty : sure this is because you need all uniqueKey-values in FieldCache in order to : be able to sort on it. We do not have memory for that - and never will. Our : uniqueKey field is not DocValue. : Just out of curiosity : *

Re: spatial search: find result in bbox OR first result outside bbox

2014-07-24 Thread david.w.smi...@gmail.com
Hi Elisabeth, Sorry for not responding sooner; I forgot. You’re in need of some spatial nearest-neighbor code I wrote but it isn’t open-sourced yet. It works on the RPT grid. Anyway, you should consider doing this in two searches: the first query tries the bbox provided, and if that returns
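The two-search idea can be sketched in plain Python over a list of points. The message is cut off before it says what the second search does, so the fallback used here (the single point nearest to the center of the box, by simple planar distance) is an assumption, and all names are hypothetical. A real setup would run both steps as Solr queries and use a proper geographic distance.

```python
def in_bbox(point, bbox):
    """True if the (x, y) point lies inside the (min_x, min_y, max_x, max_y) box."""
    x, y = point
    min_x, min_y, max_x, max_y = bbox
    return min_x <= x <= max_x and min_y <= y <= max_y


def bbox_or_nearest(points, bbox):
    """First search: everything inside the box. Fallback: the single nearest point."""
    inside = [p for p in points if in_bbox(p, bbox)]
    if inside or not points:
        return inside
    # Assumed fallback: nearest to the box center, using squared planar distance.
    min_x, min_y, max_x, max_y = bbox
    cx, cy = (min_x + max_x) / 2.0, (min_y + max_y) / 2.0
    nearest = min(points, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
    return [nearest]


points = [(1, 1), (5, 5), (10, 10)]
print(bbox_or_nearest(points, (0, 0, 2, 2)))
print(bbox_or_nearest(points, (6, 6, 7, 7)))
```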

RE: To warm the whole cache of Solr other than the only autowarmcount

2014-07-24 Thread Matt Kuiper (Springblox)
I don't believe this would work. My understanding (please correct if I have this wrong) is that the underlying Lucene document ids have a potential to change and so when a newSearcher is created the caches must be regenerated and not copied. Matt -Original Message- From: YouPeng Yang

Re: Shuffling results

2014-07-24 Thread babenis
Could you possibly elaborate on what that function could look like and how to use it? I have an ecommerce site with lots of products and some categories have 50 times more products than others, and i would like to shuffle resultset in a way that if the search is conducted by parent category id

Re: Shuffle results a little

2014-07-24 Thread babenis
were you ever able to figure out a way to do this Shuffling of the result set? I'm looking to do the same thing, but shuffle results not only by brand, but also by child categories, mainly because we have very dominant categories and would still like products from other cats to be visible

Re: integrating Accumulo with solr

2014-07-24 Thread Ali Nazemian
Dear Jack, Thank you. I am aware of datastax but I am looking for integrating accumulo with solr. This is something like what sqrrl guys offer. Regards. On Thu, Jul 24, 2014 at 7:27 PM, Jack Krupansky j...@basetechnology.com wrote: If you are not a true hard-core gunslinger who is willing to

Re: integrating Accumulo with solr

2014-07-24 Thread Jack Krupansky
Like I said, you're going to have to be a real, hard-core gunslinger to do that well. Sqrrl uses Lucene directly, BTW: Full-Text Search: Utilizing open-source Lucene and custom indexing methods, Sqrrl Enterprise users can conduct real-time, full-text search across data in Sqrrl Enterprise.

Re: SolrCloud extended warmup support

2014-07-24 Thread Jeff Wartes
Well, I’m not sure what to say. I’ve been observing a noticeable latency decrease over the first few thousand queries. I’m not doing anything too tricky either. Same exact query pattern, only one fq, always on the same field, no faceting. The only potential suspects that occur to me could be that

RE: SolrCloud extended warmup support

2014-07-24 Thread Toke Eskildsen
Jeff Wartes [jwar...@whitepages.com] wrote: Well, I’m not sure what to say. I’ve been observing a noticeable latency decrease over the first few thousand queries. How exactly do you get the index files fully cached? The cp-command will (at least for some systems) happily skip copying if the

Re: Shuffle results a little

2014-07-24 Thread Ahmet Arslan
Hi Babenis, https://cwiki.apache.org/confluence/display/solr/Query+Re-Ranking  is a good place to implement such diversity functionality. There is no stock solution currently other than field collapsing and random fields. Ahmet On Thursday, July 24, 2014 10:04 PM, babenis babe...@gmail.com

Re: Shuffling results

2014-07-24 Thread Joel Bernstein
This is the kind of use case the RankQuery API was created for. It allows you to write your own Lucene ranking collector and plug it in. It's an expert level java API so you'll need to program in Java and understand a lot about how Lucene collectors work, but it's cool stuff to learn. Joel

Re: Shuffling results

2014-07-24 Thread Joel Bernstein
Here's a blog post describing the RankQuery API: http://heliosearch.org/solrs-new-rankquery-feature/ Joel Bernstein Search Engineer at Heliosearch On Thu, Jul 24, 2014 at 6:22 PM, Joel Bernstein joels...@gmail.com wrote: This is the kind of use case the RankQuery API was created for. It allows you

Understanding the Debug explanations for Query Result Scoring/Ranking

2014-07-24 Thread O. Olson
Hi, If you add debug=true to the Solr request (and wt=xml if your current output is not XML), you get a node in the resulting XML that is named debug. It has a child node called explain, which has a list showing why the results are ranked in a particular
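A minimal sketch of assembling such a request URL with the two parameters mentioned above, using only the Python standard library. The base URL (host, port, core name, and the /select handler) and the helper name are assumptions for illustration; only debug=true and wt=xml come from the message.

```python
from urllib.parse import urlencode


def debug_query_url(base_url, query):
    """Build a select URL that asks for XML output with debug information."""
    params = [("q", query), ("debug", "true"), ("wt", "xml")]
    return base_url + "?" + urlencode(params)


# Hypothetical local core; adjust to the actual host and core name.
print(debug_query_url("http://localhost:8983/solr/collection1/select", "tel aviv"))
```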

Re: Understanding the Debug explanations for Query Result Scoring/Ranking

2014-07-24 Thread Uwe Reh
Hi, to get an idea of the meaning of all these numbers, have a look at http://explain.solr.pl. I like this tool, it's great. Uwe On 25.07.2014 at 00:45, O. Olson wrote: Hi, If you add debug=true to the Solr request (and wt=xml if your current output is not XML), you would get a

Re: Understanding the Debug explanations for Query Result Scoring/Ranking

2014-07-24 Thread Koji Sekiguchi
Hi, In addition, this might be useful: Fundamentals of Information Retrieval, Illustration with Apache Lucene https://www.youtube.com/watch?v=SCsS5ePGmCs This video is about 40 minutes long, but you can fast forward to 24:00 to learn scoring based on vector space model and how Lucene customize

Re: Where can I get information about sold Cloud H/W spec

2014-07-24 Thread Lee Chunki
Hi Alex, Thank you for your reply. let me check mailing list archive again. Regards, Chunki. On Jul 24, 2014, at 6:11 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: Have you tried searching the mailing list archives? Some of these things have been discussed a number of times. SSDs are

Are there any performance impact of using a non-standard length UUID as the unique key of Solr?

2014-07-24 Thread He haobo
Hi, In our Solr collection (Solr 4.8), we have the following unique key definition. <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" /> <uniqueKey>id</uniqueKey> In our external Java program, we will generate a UUID with UUID.randomUUID().toString() first. Then,

Re: Are there any performance impact of using a non-standard length UUID as the unique key of Solr?

2014-07-24 Thread Mark Miller
Some good info on unique id’s for Lucene / Solr can be found here:  http://blog.mikemccandless.com/2014/05/choosing-fast-unique-identifier-uuid.html --  Mark Miller about.me/markrmiller On July 24, 2014 at 9:51:28 PM, He haobo (haob...@gmail.com) wrote: Hi, In our Solr collection (Solr 4.8),

Re: Where can I get information about sold Cloud H/W spec

2014-07-24 Thread Alexandre Rafalovitch
http://search-lucene.com/ is helpful for looking around. Regards, Alex. On Fri, Jul 25, 2014

Re: Multipart documents with different update cycles

2014-07-24 Thread Alexandre Rafalovitch
Do you search the frequently changing user-metadata? If not, maybe the external file field is helpful. https://cwiki.apache.org/confluence/display/solr/Working+with+External+Files+and+Processes Regards, Alex.

Re: To warm the whole cache of Solr other than the only autowarmcount

2014-07-24 Thread YouPeng Yang
To Matt: Thank you, your opinion is very valuable. So I have checked the source code for how the cache warming works. It seems to just put items of the old caches into the new caches. I will pull Mark Miller into this discussion. He is one of the Solr developers whom I had contacted