Re: URLDataSource : indexing from other Solr servers
Just in case the url is not available from outside my network, here is what the url response looks like:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1007</int>
    <lst name="params">
      <str name="q">*:*</str>
    </lst>
  </lst>
  <result name="response" numFound="89993613" start="0">
    <doc>
      <str name="address_combo">1518 INDIANA CT, IRVING, TX</str>
      <str name="air_conditioning">Central</str>
      <int name="avm">200600</int>
      <float name="avm_confidence">0.31</float>
      <int name="avm_high">230690</int>
      <int name="avm_low">170510</int>
      <str name="basement">No Basement</str>
      <str name="batch_address">1518 INDIANA CT</str>
      <str name="batch_city">IRVING</str>
      <str name="batch_country">US</str>
      <str name="batch_state">TX</str>
      <str name="batch_zip">75060</str>
      <float name="bath">2.0</float>
      <int name="bed">4</int>
      <str name="cbsa_label">Dallas-Fort Worth-Arlington</str>
      <str name="city">IRVING</str>
      <str name="construction_type">Frame</str>
      <str name="county_label">Dallas</str>
      <int name="delta_avm">38300</int>
      <date name="delta_avm_timestamp">2014-03-11T00:00:01Z</date>
      <int name="delta_home_score">-6</int>
      <date name="delta_home_score_timestamp">2014-03-11T00:00:01Z</date>
      <int name="delta_investor_score">-12</int>
      <date name="delta_investor_score_timestamp">2014-03-11T00:00:01Z</date>
      <float name="delta_tax_rate">-3.0E-4</float>
      <date name="delta_tax_rate_timestamp">2013-07-10T00:00:01Z</date>
      <int name="estimated_rent">1550</int>
      <str name="exterior_wall">Brick veneer</str>
      <str name="fireplace">1</str>
      <str name="foundation">Slab</str>
      <int name="garage">4</int>
      <str name="heating">Central</str>
      <int name="home_score">29</int>
      <float name="hpi">146.7849</float>
      <int name="investor_score">38</int>
      <date name="last_tran_date">2010-01-13T00:00:01Z</date>
      <int name="last_tran_price">0</int>
      <double name="lat">32.79920959472656</double>
      <str name="latlng_combo">32.799209594726562,-96.926918029785156</str>
      <double name="lng">-96.92691802978516</double>
      <str name="place_label">IRVING</str>
      <str name="property_type">SFH</str>
      <int name="sqft">2348</int>
      <long name="sqft_lot">0</long>
      <str name="state">TX</str>
      <str name="state_label">Texas</str>
      <str name="street_address">1518 INDIANA CT</str>
      <str name="street_name">INDIANA CT</str>
      <str name="street_name_sz">INDIANA CT</str>
      <str name="street_no_sz">1518</str>
      <long name="sz_id">500018666323</long>
      <float name="tax_rate">0.0178</float>
      <float name="taxes">3893.0</float>
      <str name="tract_label">IRVING 015000</str>
      <int name="ttl_assessed">0</int>
      <int name="year_built">2002</int>
      <str name="zip">75060</str>
      <date name="timestamp">2014-04-20T16:28:52.467Z</date>
    </doc>
    <doc>
      <str name="address_combo">2600 ASH CRK, MESQUITE, TX</str>
      <str name="air_conditioning">Central</str>
      <int name="avm">144200</int>
      <float name="avm_confidence">0.28</float>
      <int name="avm_high">165830</int>
      <int name="avm_low">122570</int>
      <str name="basement">No Basement</str>
      <str name="batch_address">2600 ASH CREEK</str>
      <str name="batch_city">MESQUITE</str>
      <str name="batch_country">US</str>
      <str name="batch_state">TX</str>
      <str name="batch_zip">75181</str>
      <float name="bath">2.0</float>
      <int name="bed">4</int>
      <str name="cbsa_label">Dallas-Fort Worth-Arlington</str>
      <str name="city">MESQUITE</str>
      <str name="construction_type">Frame</str>
      <str name="county_label">Dallas</str>
      <int name="delta_avm">100</int>
      <date name="delta_avm_timestamp">2014-04-11T00:00:01Z</date>
      <int name="delta_home_score">-1</int>
      <date name="delta_home_score_timestamp">2014-03-11T00:00:01Z</date>
      <int name="delta_investor_score">-1</int>
      <date name="delta_investor_score_timestamp">2014-04-11T00:00:01Z</date>
      <float name="delta_tax_rate">-3.0E-4</float>
      <date name="delta_tax_rate_timestamp">2013-07-10T00:00:01Z</date>
      <int name="estimated_rent">1470</int>
      <str name="exterior_wall">Brick veneer</str>
      <str name="fireplace">1</str>
      <str name="foundation">Slab</str>
      <int name="garage">1</int>
      <str name="heating">Central</str>
      <int name="home_score">35</int>
      <float name="hpi">153.4116</float>
      <int name="investor_score">54</int>
      <date name="last_tran_date">2006-01-20T00:00:01Z</date>
      <int name="last_tran_price">0</int>
      <double name="lat">32.7484283447266</double>
      <str name="latlng_combo">32.7484283447266,-96.5575180053711</str>
      <double name="lng">-96.5575180053711</double>
      <str name="place_label">MESQUITE</str>
      <str name="property_type">SFH</str>
      <int name="sqft">2189</int>
      <long name="sqft_lot">0</long>
      <str name="state">TX</str>
      <str name="state_label">Texas</str>
      <str name="street_address">2600 ASH CRK</str>
      <str name="street_name">ASH CRK</str>
      <str name="street_name_sz">ASH CRK</str>
      <str name="street_no_sz">2600</str>
      <long name="sz_id">500018666324</long>
      <float name="tax_rate">0.0178</float>
      <float name="taxes">3345.0</float>
      <str name="tract_label">MESQUITE 017304</str>
      <int name="ttl_assessed">0</int>
      <int name="year_built">1996</int>
      <str name="zip">75181</str>
      <date name="timestamp">2014-04-20T16:28:52.467Z</date>
    </doc>
  </result>
</response>

-- View this message in context: http://lucene.472066.n3.nabble.com/URLDataSource-indexing-from-other-Solr-servers-tp4135321p4135332.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Problem when i'm trying to search something
Thanks for replying ! This is my Schema.xml. -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-when-i-m-trying-to-search-something-tp4135045p4135427.html Sent from the Solr - User mailing list archive at Nabble.com.
Fwd: Inconsistent response from Cloud Query
Copying the community: looking forward to your response. -- Forwarded message -- From: Vineet Mishra clearmido...@gmail.com Date: Mon, May 12, 2014 at 5:57 PM Subject: Re: Inconsistent response from Cloud Query To: solr-user@lucene.apache.org Hi Shawn, There is no recovery case for me, nor is a commit pending. The case I am talking about is when I restart the Cloud all over again with the index already flushed to disk. Thanks! On Sun, May 11, 2014 at 10:17 PM, Shawn Heisey s...@elyograg.org wrote: On 5/9/2014 11:42 AM, Cool Techi wrote: We have noticed Solr returns inconsistent results during replica recovery when not all replicas are in the same state, so when your query goes to a replica which might be recovering or still copying the index, the counts may differ. regards, Ayush SolrCloud should never send requests to a replica that is recovering. If that is happening (which I think is unlikely), then it's a bug. If *you* send a request to a replica that is still recovering, I would expect SolrCloud to redirect the request elsewhere unless distrib=false is used. I'm not sure whether that actually happens, though. Thanks, Shawn
Re: URLDataSource : indexing from other Solr servers
On 12 May 2014 22:52, helder.sepulveda helder.sepulv...@homes.com wrote: Here is the data config:

<dataConfig>
  <dataSource type="URLDataSource" />
  <document name="listingcore">
    <entity name="listing" pk="link"
            url="http://slszip11.as.homes.com/solr/select?q=*:*"
            processor="XPathEntityProcessor"
            forEach="/response/result/doc"
            transformer="DateFormatTransformer">
      <field column="batch_address" xpath="/response/result/doc/str[@name='batch_address']" />
      <field column="batch_state" xpath="/response/result/doc/str[@name='batch_state']" />
      <field column="batch_city" xpath="/response/result/doc/str[@name='batch_city']" />
      <field column="batch_zip" xpath="/response/result/doc/str[@name='batch_zip']" />
      <field column="sz_id" xpath="/response/result/doc/long[@name='sz_id']" />
    </entity>
  </document>
</dataConfig>

Hmm, I see no issues here. Can you also share your Solr schema? Is the URL accessible, and do the results from Solr show properly when loaded in a browser window? I cannot seem to reach slszip11.as.homes.com, but that could be because it is restricted to certain IPs. Regards, Gora
Re: Easises way to insatll solr cloud with tomcat
Check out HDS from Heliosearch - it comes packaged with Tomcat, ready to go: http://heliosearch.com/download.html -- Jack Krupansky -Original Message- From: Aman Tandon Sent: Monday, May 12, 2014 8:23 AM To: solr-user@lucene.apache.org Subject: Re: Easises way to insatll solr cloud with tomcat Can anybody help me out?? With Regards Aman Tandon On Mon, May 12, 2014 at 1:24 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, I tried to set up SolrCloud with Jetty, which works fine. But in our production environment we use Tomcat, so I need to set up SolrCloud with Tomcat. Please help me out with how to set up SolrCloud with Tomcat on a single machine. Thanks in advance. With Regards Aman Tandon
Re: Join in solr to get data from two cores
No reply from anybody... seems strange? On Fri, May 9, 2014 at 9:47 AM, Kamal Kishore kamal.kish...@indiamart.com wrote: Any updates guys? On Thu, May 8, 2014 at 2:05 PM, Kamal Kishore kamal.kish...@indiamart.com wrote: Dear Team, I have two Solr cores: one containing product information, and a second that has customer points. I am looking at a Solr join to query the first (product) core and boost the results based on customer points in the second core. I am not able to frame a Solr query for this. Moreover, Solr is not allowing me to get data from both cores. With Regards, Kamal Kishore -- - https://play.google.com/store/apps/details?id=com.indiamart.m Follow IndiaMART.com http://www.indiamart.com/ for latest updates on this and more: https://plus.google.com/+indiamart https://www.facebook.com/IndiaMART https://twitter.com/IndiaMART Mobile Channel: https://play.google.com/store/apps/details?id=com.indiamart.m http://m.indiamart.com/
Solrcore.properties variable question.
Hi, We have a couple of Solr servers acting as master and slave, and each server has the same number of cores. We are trying to configure solrcore.properties so that a script is able to add cores without changing solrcore.properties, using a hack like this:

enable.master=false
enable.slave=true
master_url=http://master_solr:8983/solr/${solr.core.name}

Our idea is for solr.core.name to be the dynamic variable, but once we go to the admin, the master URL is not showing the last part. Is there a format error or something trivial I'm missing? Thanks, Guido.
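[Editor's note] If the ${solr.core.name} reference is not expanded inside solrcore.properties itself, one documented alternative is to move the per-core part into solrconfig.xml, where core properties are substituted when each core loads. A minimal sketch, assuming the stock ReplicationHandler; the host and enable.slave property come from the post above, the rest is illustrative:

```xml
<!-- solrconfig.xml, shared by all cores; ${solr.core.name} is resolved
     per core at load time, so every slave core polls its own master core -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="enable">${enable.slave:false}</str>
    <str name="masterUrl">http://master_solr:8983/solr/${solr.core.name}</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```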
Re: SolrCloud - Highly Reliable / Scalable Resources?
Hi, Re: "we have suffered several issues which always seem quite problematic to resolve" - try grabbing the latest version if you can. We identified a number of issues in older SolrCloud versions when working on large client setups with thousands of cores, but a lot of those issues have been fixed in the more recent versions. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Mon, May 12, 2014 at 9:53 AM, Darren Lee d...@amplience.com wrote: Hi everyone, We have been using SolrCloud (4.4) for ~6 months now. Functionally it's excellent, but we have suffered several issues which always seem quite problematic to resolve. I was wondering if anyone in the community can recommend good resources/reading for setting up a highly scalable, highly reliable cluster. A lot of what I see in the Solr documentation is aimed at small setups or is quite sparse. Dealing with topics like:
* Capacity planning
* Losing nodes
* Voting panic
* Recovery failure
* Replication factors
* Elasticity / Auto scaling / Scaling recipes
* Exhibitor
* Container configuration, concurrency limits, packet drop tuning
* Increasing capacity without downtime
* Scalable approaches to full indexing hundreds of millions of documents
* External health check vs CloudSolrServer
* Separate vs local zookeeper
* Benchmarks
Sorry, I know that's a lot to ask, heh. We are going to run a project for a month or so soon where we re-write all our run books and do deeper testing on various failure scenarios and the above, but any starting point would be much appreciated. Thanks all, Darren
Please add me to Contributors Group
Hi, I'd like to be added to the contributors group. My wiki username is gireesh Thanks Gireesh
Re: SWF content not indexed
Hi Ahmet, thank you for your response... yes, I think I need Tika to do this job, using it like an OCR. I have not yet dug deep into it. Any other suggestion from the group is welcome. Regards, Mauro On Sun, May 11, 2014 at 2:34 PM, Ahmet Arslan iori...@yahoo.com wrote: Hi, Solr/Lucene only deals with text. There are some other projects that extract text from rich documents. Solr Cell uses http://tika.apache.org for extraction. Maybe Tika (or some other tool) already extracts text from SWF? On Sunday, May 11, 2014 9:40 AM, Mauro Gregorio Binetti maurogregorio.bine...@gmail.com wrote: Hi guys, how can I make it possible to index the content of SWF files? I'm using Solr 3.6.0. Regards, Mauro
KeywordTokenizerFactory splits the string for the exclamation mark
Hi All, I have the following field settings in my Solr schema:

<field name="Exact_Word" omitPositions="true" termVectors="false"
       omitTermFreqAndPositions="true" compressed="true" type="string_ci"
       multiValued="false" indexed="true" stored="true" required="false"
       omitNorms="true"/>
<field name="Word" compressed="true" type="email_text_ptn" multiValued="false"
       indexed="true" stored="true" required="false" omitNorms="true"/>
<fieldtype name="string_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>
<copyField source="Word" dest="Exact_Word"/>

As you can see, Exact_Email has the KeywordTokenizerFactory, and that should treat the string as it is. But when I enter an email with the string d!sdasdsdwasd...@dsadsadas.edu it splits the string in two. I was under the impression that KeywordTokenizerFactory would treat the string as it is. Following is the query debug result; there you can see it has split the word: parsedquery: +((DisjunctionMaxQuery((Exact_Email:d)) -DisjunctionMaxQuery((Exact_Email:sdasdsdwasd...@dsadsadas.edu)))~1). Can someone please tell me why it produces the query like this? If I put a string without the ! sign, the produced query is: parsedquery: +DisjunctionMaxQuery((Exact_Email:testresu...@testdomain.com)). I thought if the KeywordTokenizerFactory is applied then it should return the exact string as it is. Please help me understand what is going wrong here. -- View this message in context: http://lucene.472066.n3.nabble.com/KeywordTokenizerFactory-splits-the-string-for-the-exclamation-mark-tp4135460.html Sent from the Solr - User mailing list archive at Nabble.com.
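[Editor's note] The debug output above suggests an explanation worth checking: the query parser treats `!` as a special (prohibit/NOT) operator and splits the input *before* any analyzer - including KeywordTokenizerFactory - ever sees it, which is why the tokenizer settings make no difference. Escaping the character or quoting the whole value keeps the parser from splitting; a sketch using the field and value from the post:

```
q=Exact_Email:d\!sdasdsdwasd...@dsadsadas.edu
q=Exact_Email:"d!sdasdsdwasd...@dsadsadas.edu"
```

The same applies to the other special characters in Lucene query syntax (+ - && || ! ( ) { } [ ] ^ " ~ * ? : \).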
Re: ContributorsGroup add request
Shawn- Thanks much. Icon ideas have been recorded. -Jim On 5/11/2014 10:39 AM, Shawn Heisey wrote: On 5/10/2014 6:35 PM, Jim Martin wrote: Please add me to the ContributorsGroup; I've got some Solr icons I'd like to suggest to the community. Perhaps down the road I can contribute more. I'm the team lead at Overstock.com for search, and Solr is the foundation of what we do. Username: JamesMartin I went to add you, but someone else has already done so. It's entirely possible that because of the Apache email outage, they have already replied, but the message hasn't made it through to the list yet. I'm adding you as a CC here (which I normally don't do) so that you'll get notified faster. Thanks, Shawn
Re: Join in solr to get data from two cores
Any updates guys? On Thu, May 8, 2014 at 2:05 PM, Kamal Kishore kamal.kish...@indiamart.com wrote: Dear Team, I have two Solr cores: one containing product information, and a second that has customer points. I am looking at a Solr join to query the first (product) core and boost the results based on customer points in the second core. I am not able to frame a Solr query for this. Moreover, Solr is not allowing me to get data from both cores. With Regards, Kamal Kishore
Re: Too many documents Exception
As noted by others: you should definitely look into sharding your index -- fundamentally there is no way to have that many documents in a single Lucene index. However: this is a terrible error for you to get; something in the stack should really have given you an error when you tried to add the too-many documents, not later when you opened the searcher. I've opened an issue to look into adding that... https://issues.apache.org/jira/browse/SOLR-6065 Off the top of my head, I don't know if/how you can easily fix your current index to be usable again.
: Date: Wed, 7 May 2014 09:54:54 +0900
: From: [Tech Fun]山崎 yamaz...@techfun.jp
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: Too many documents Exception
:
: Hello everybody,
:
: Solr 4.3.1 (and 4.7.1), Num Docs + Deleted Docs > 2147483647 (Integer.MAX_VALUE):
: Caused by: java.lang.IllegalArgumentException: Too many documents,
: composite IndexReaders cannot exceed 2147483647
:
: It seems to be similar to this unresolved e-mail thread:
: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/browser
:
: How can I fix this? Or is this the Solr specification?
:
: log.
:
: ERROR org.apache.solr.core.CoreContainer – Unable to create core: collection1
: org.apache.solr.common.SolrException: Error opening new searcher
:   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
:   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:618)
:   at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:949)
:   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
:   at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
:   at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
:   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
:   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
:   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
:   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
:   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
:   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
:   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
:   at java.lang.Thread.run(Thread.java:662)
: Caused by: org.apache.solr.common.SolrException: Error opening new searcher
:   at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1438)
:   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1550)
:   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:796)
:   ... 13 more
: Caused by: org.apache.solr.common.SolrException: Error opening Reader
:   at org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:172)
:   at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:183)
:   at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:179)
:   at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1414)
:   ... 15 more
: Caused by: java.lang.IllegalArgumentException: Too many documents,
: composite IndexReaders cannot exceed 2147483647
:   at org.apache.lucene.index.BaseCompositeReader.<init>(BaseCompositeReader.java:77)
:   at org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:368)
:   at org.apache.lucene.index.StandardDirectoryReader.<init>(StandardDirectoryReader.java:42)
:   at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:71)
:   at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783)
:   at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
:   at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:88)
:   at org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:34)
:   at org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:169)
:   ... 18 more
: ERROR org.apache.solr.core.CoreContainer –
: null:org.apache.solr.common.SolrException: Unable to create core:
: collection1
:   at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1450)
:   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:993)
:   at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
:   at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
:   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
:   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
:   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
:   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
:   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
:   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
:   at
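[Editor's note] For the sharding route the responses point at, a rough sizing rule is shards >= ceil(totalDocs / 2147483647), counting deleted docs as well, since they count against the per-index limit. Even two shards would keep each core's Lucene index under the limit here. A hypothetical Collections API call (collection name, shard count, and config name are all illustrative, not from the thread):

```
http://localhost:8983/solr/admin/collections?action=CREATE&name=collection1&numShards=2&replicationFactor=2&collection.configName=myconf
```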
Re: URLDataSource : indexing from other Solr servers
On 5/12/2014 10:11 AM, helder.sepulveda wrote: I have been trying to index data from other Solr servers, but the import always shows: Indexing completed. Added/Updated: 0 documents. Deleted 0 documents. Requests: 1, Fetched: 0, Skipped: 0, Processed I'm wondering why you're using the XPathEntityProcessor instead of SolrEntityProcessor. http://wiki.apache.org/solr/DataImportHandler#SolrEntityProcessor Solr (as of version 3.6) comes with the capability to fully understand the output from another Solr server, so you should probably be using that instead of trying to parse XML. Thanks, Shawn
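[Editor's note] For reference, the SolrEntityProcessor variant of the data config from this thread might look like the sketch below; the source URL and field list are taken from the original config, everything else is left at defaults:

```xml
<dataConfig>
  <document>
    <!-- SolrEntityProcessor queries the remote Solr instance directly
         and maps returned fields by name, so no XPath mapping is needed -->
    <entity name="listing"
            processor="SolrEntityProcessor"
            url="http://slszip11.as.homes.com/solr"
            query="*:*"
            fl="batch_address,batch_state,batch_city,batch_zip,sz_id"/>
  </document>
</dataConfig>
```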
Re: Replica active during warming
If you are sure about this, can you file a JIRA issue? -- Mark Miller about.me/markrmiller On May 12, 2014 at 8:50:42 PM, lboutros (boutr...@gmail.com) wrote: Dear All, we just finished the migration of a cluster from Solr 4.3.1 to Solr 4.6.1. With Solr 4.3.1, a node was not considered active before the end of the warming process. Now, with Solr 4.6.1, a replica is considered active during the warming process. This means that if you restart a replica or create a new one, queries will be sent to this replica and will hang until the end of the warming process (we do not use cold searchers). We have quite long warming queries and this is a big issue. Is there a parameter I do not know of that could control this behavior? Thanks, Ludovic. - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/Replica-active-during-warming-tp4135274.html Sent from the Solr - User mailing list archive at Nabble.com.
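[Editor's note] The "we do not use cold searchers" remark above refers to these solrconfig.xml settings, shown here for context (values are illustrative): with useColdSearcher set to false, requests are supposed to block until the first searcher has finished warming rather than run against an unwarmed one.

```xml
<query>
  <!-- false: block incoming requests until warming completes -->
  <useColdSearcher>false</useColdSearcher>
  <!-- cap on concurrent warming searchers -->
  <maxWarmingSearchers>2</maxWarmingSearchers>
</query>
```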
Re: Too many documents Exception
One of the hard-core Lucene guys is going to have to help you out. Or you may have to write some custom code to fix the index for any such shard. If you have deleted any documents, it may be sufficient to simply optimize the index. -- Jack Krupansky -Original Message- From: yamazaki Sent: Wednesday, May 7, 2014 8:15 PM To: solr-user@lucene.apache.org Subject: Re: Too many documents Exception Thanks, Jack. Is there a way to suppress this exception? For example, <maxMergeDocs>2147483647</maxMergeDocs>? When this exception occurs, the index will not be read. If SolrCloud is used, some data cannot be read: shard1 documents over 2^31-1, shard2 documents not over; shard1 goes down and shard1's index is dead. -- yamazaki 2014-05-07 11:01 GMT+09:00 Jack Krupansky j...@basetechnology.com: Lucene only supports 2^31-1 documents in an index, so Solr can only support 2^31-1 documents in a single shard. I think it's a bug that Lucene doesn't throw an exception when more than that number of documents have been inserted. Instead, you get this error when Solr tries to read such an overstuffed index. -- Jack Krupansky -Original Message- From: [Tech Fun]山崎 Sent: Tuesday, May 6, 2014 8:54 PM To: solr-user@lucene.apache.org Subject: Too many documents Exception Hello everybody, Solr 4.3.1 (and 4.7.1), Num Docs + Deleted Docs > 2147483647 (Integer.MAX_VALUE): Caused by: java.lang.IllegalArgumentException: Too many documents, composite IndexReaders cannot exceed 2147483647 It seems to be similar to this unresolved e-mail thread: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/browser How can I fix this? Or is this the Solr specification? log.
ERROR org.apache.solr.core.CoreContainer – Unable to create core: collection1
org.apache.solr.common.SolrException: Error opening new searcher
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:618)
  at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:949)
  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
  at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
  at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
  at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
  at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1438)
  at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1550)
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:796)
  ... 13 more
Caused by: org.apache.solr.common.SolrException: Error opening Reader
  at org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:172)
  at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:183)
  at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:179)
  at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1414)
  ... 15 more
Caused by: java.lang.IllegalArgumentException: Too many documents, composite IndexReaders cannot exceed 2147483647
  at org.apache.lucene.index.BaseCompositeReader.<init>(BaseCompositeReader.java:77)
  at org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:368)
  at org.apache.lucene.index.StandardDirectoryReader.<init>(StandardDirectoryReader.java:42)
  at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:71)
  at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783)
  at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
  at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:88)
  at org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:34)
  at org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:169)
  ... 18 more
ERROR org.apache.solr.core.CoreContainer – null:org.apache.solr.common.SolrException: Unable to create core: collection1
  at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1450)
  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:993)
  at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
  at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at
Re: Join in solr to get data from two cores
Probably because we answered a nearly identical request yesterday. It had items in one core and counts in a different one. Please read all the responses to this e-mail. http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201405.mbox/browser Specifically, these responses: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201405.mbox/browser http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201405.mbox/browser http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201405.mbox/browser wunder On May 13, 2014, at 12:03 AM, Kamal Kishore kamal.kish...@indiamart.com wrote: No reply from anybody... seems strange? On Fri, May 9, 2014 at 9:47 AM, Kamal Kishore kamal.kish...@indiamart.com wrote: Any updates guys? On Thu, May 8, 2014 at 2:05 PM, Kamal Kishore kamal.kish...@indiamart.com wrote: Dear Team, I have two Solr cores: one containing product information, and a second that has customer points. I am looking at a Solr join to query the first (product) core and boost the results based on customer points in the second core. I am not able to frame a Solr query for this. Moreover, Solr is not allowing me to get data from both cores. With Regards, Kamal Kishore -- Walter Underwood wun...@wunderwood.org
Independent/Selfcontained Solr Unit testing with JUnit
Hi, Is there any way to run self-contained JUnit tests for, say, a Solr-dependent class, where it doesn't depend on Solr being up and running at localhost:8983? I have a collection etc. set up on the Solr server. Is it possible to mock it with an embedded Solr easily, with a @Before or @BeforeClass annotation in JUnit 4? Any pointers to examples would be awesome (I am also trying to look in the source). TIA, Vijay
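[Editor's note] One possibility, sketched under the assumption that the solr-test-framework artifact is on the test classpath: SolrTestCaseJ4 starts an embedded in-JVM core from your own config and schema, so nothing needs to be running at localhost:8983. The file paths, class name, and assertion below are placeholders, not from the thread:

```java
import org.apache.solr.SolrTestCaseJ4;
import org.junit.BeforeClass;
import org.junit.Test;

public class EmbeddedSolrTest extends SolrTestCaseJ4 {

  @BeforeClass
  public static void startEmbeddedCore() throws Exception {
    // Spins up an embedded core; the arguments are the config file, the
    // schema file, and the solr home directory holding them (placeholders).
    initCore("solrconfig.xml", "schema.xml", "src/test/resources/solr");
  }

  @Test
  public void emptyIndexReturnsNoDocs() throws Exception {
    // assertQ runs the query against the embedded core and checks the
    // response XML against the given XPath expression.
    assertQ(req("q", "*:*"), "//result[@numFound='0']");
  }
}
```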
Grouping on int field in SolrCloud raises exception
Wondering if anyone else has this issue? We have a grouping field which we defined as an integer; when we run a query grouping on that field it works fine in a non-cloud configuration, but when we try the same query in a SolrCloud configuration with multiple shards, we get the following error: Type mismatch: wkcluster was indexed as NUMERIC

Schema:
<field name="wkcluster" indexed="true" stored="true" type="int" docValues="true" />

Query:
q=*:*&group=true&group.field=wkcluster&group.limit=3

We worked around it by redefining the field as string and re-indexing the content. Is there any restriction on the type of a field used for grouping/collapsing, especially in a distributed config?
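[Editor's note] The workaround described above amounts to this schema change, followed (as the post notes) by a full re-index of the content:

```xml
<!-- string instead of int sidesteps the distributed-grouping type error -->
<field name="wkcluster" indexed="true" stored="true" type="string" />
```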
Re: DataImport using SqlEntityProcessor running Out of Memory
Hello O., It seems to me (but it's better to look at the heap histogram) that buffering sub-entities in SortedMapBackedCache blows the heap. I'm aware of two directions: - use a file-based cache instead; I don't know exactly how it works, but you can start from https://issues.apache.org/jira/browse/SOLR-2382 and check how to enable the BerkeleyDB cache; - personally, I'm promoting merging resultsets ordered by the RDBMS: https://issues.apache.org/jira/browse/SOLR-4799 On Fri, May 9, 2014 at 7:16 PM, O. Olson olson_...@yahoo.it wrote: I have a data schema which is hierarchical, i.e. I have an entity and a number of attributes. For a small subset of the data - about 300 MB - I can do the import with 3 GB of memory. Now, with the entire 4 GB dataset, I find I cannot do the import with 9 GB of memory. I am using the SqlEntityProcessor as below:

<dataConfig>
  <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost\MSSQLSERVER;databaseName=SolrDB;user=solrusr;password=solrusr;"/>
  <document>
    <entity name="Entity" query="SELECT EntID, Image FROM ENTITY_TABLE">
      <field column="EntID" name="EntID" />
      <field column="Image" name="Image" />
      <entity name="EntityAttribute1"
              query="SELECT AttributeValue, EntID FROM ATTR_TABLE WHERE AttributeID=1"
              cacheKey="EntID" cacheLookup="Entity.EntID"
              processor="SqlEntityProcessor" cacheImpl="SortedMapBackedCache">
        <field column="AttributeValue" name="EntityAttribute1" />
      </entity>
      <entity name="EntityAttribute2"
              query="SELECT AttributeValue, EntID FROM ATTR_TABLE WHERE AttributeID=2"
              cacheKey="EntID" cacheLookup="Entity.EntID"
              processor="SqlEntityProcessor" cacheImpl="SortedMapBackedCache">
        <field column="AttributeValue" name="EntityAttribute2" />
      </entity>
    </entity>
  </document>
</dataConfig>

What is the best way to import this data? Doing it without a cache results in many SQL queries. With the cache, I run out of memory. I'm curious why 4 GB of data cannot fit entirely in memory. One thing I need to mention is that I have about 400 to 500 attributes. Thanks in advance for any helpful advice.
O. -- View this message in context: http://lucene.472066.n3.nabble.com/DataImport-using-SqlEntityProcessor-running-Out-of-Memory-tp4135080.html Sent from the Solr - User mailing list archive at Nabble.com. -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
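[Editor's note] If you pursue the SOLR-2382 route Mikhail mentions, the sub-entity config would change roughly as sketched below. Note that BerkleyBackedCache and its persistCache* attributes come from the SOLR-2382 patch/extras rather than stock DIH, so the class name, attribute names, and the extra BerkeleyDB JE jar should all be verified against that issue before relying on this:

```xml
<entity name="EntityAttribute1"
        query="SELECT AttributeValue, EntID FROM ATTR_TABLE WHERE AttributeID=1"
        cacheKey="EntID" cacheLookup="Entity.EntID"
        processor="SqlEntityProcessor"
        cacheImpl="BerkleyBackedCache"
        persistCacheBaseDir="/var/dih-cache"
        persistCacheName="attr1">
  <field column="AttributeValue" name="EntityAttribute1" />
</entity>
```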
Re: retreive all the fields in join
Thanks Walter, yeah, we are now trying to use the external file field as you mentioned. On May 12, 2014 11:11 PM, Walter Underwood wun...@wunderwood.org wrote: Top management has given requirements that force a broken design. They are requiring something that is impossible with Solr. 1. Flatten the data. You get one table, no joins. 2. 12M records is not a big Solr index. That should work fine. 3. If the supplier activity points are updated frequently, you could use an external file field for those, but they still need to be flattened. wunder On May 12, 2014, at 7:21 AM, Aman Tandon antn.s...@gmail.com wrote: Yeah, I understand, but I got the requirement from top management. The requirements are: core1: in this we want to keep the supplier activity points. case 2: we want to boost those records which are present in core1 by the amount of supplier activity points. I know we can keep that supplier score in the same core, but this requires full indexing of 12M records, whereas there are only about 1 lakh suppliers, which won't cost much. With Regards Aman On Mon, May 12, 2014 at 7:44 PM, Erick Erickson erickerick...@gmail.com wrote: Any time you find yourself trying to use Solr like a DB, stop. Solr joins are _not_ DB joins; the data from the "from" core is not returned (I think there are a few special cases where you can make this happen, though). Try denormalizing your data if at all possible; that's what Solr does best: searching single records. Best, Erick On Sun, May 11, 2014 at 6:40 PM, Aman Tandon amantandon...@gmail.com wrote: please help me out here!! With Regards Aman Tandon On Sun, May 11, 2014 at 1:44 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, Is there a way possible to retrieve all the fields present in both the cores (core1 and core2)? e.g. core1: {id:111, name: abc} core2: {page:17, type: fiction} What I want is that, on querying both the cores, I retrieve results containing all 4 fields: id and name from core1, and page and type from core2. Is it possible?
With Regards Aman Tandon -- Walter Underwood wun...@wunderwood.org
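Walter's external-file-field suggestion might look roughly like this in schema.xml (a sketch only; the field name supplier_points and the key field are hypothetical):

```xml
<!-- schema.xml: keyed float values loaded from an external file,
     usable in function queries without reindexing documents -->
<fieldType name="pointsFile" class="solr.ExternalFileField"
           keyField="id" defVal="0" valType="pfloat"/>
<field name="supplier_points" type="pointsFile" indexed="false" stored="false"/>
```

The values would live in a file named external_supplier_points in the core's data directory, one key=value pair per line (e.g. doc1=42.5), and could then drive boosting via a function query such as bf=field(supplier_points), so the frequently-changing activity points are updated by replacing the file rather than reindexing the 12M documents.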
Re: solr optimize on fnm file
Oh my! I see what your problem is, but I rather doubt that it'll be addressed. You've obviously been stress-testing the indexing process and have a bunch of garbage left over that's not getting removed on optimize. But it's such a unique case that I don't know if anyone is really interested in fixing it. I'll bring it up on the dev list, but I expect you'll need to blow away your entire index and re-index from scratch.

Best,
Erick

On Tue, May 6, 2014 at 4:09 PM, googoo liu...@gmail.com wrote:

For our setup, the .fnm file size is 123M. Internally it has 2.6M fields. The problem is the facet operation: faceting takes a while. We are stuck in the call stack below for 11 seconds.

java.util.HashMap.transfer(Unknown Source)
java.util.HashMap.resize(Unknown Source)
java.util.HashMap.addEntry(Unknown Source)
java.util.HashMap.put(Unknown Source)
org.apache.lucene.index.FieldInfos$Builder.addOrUpdateInternal(FieldInfos.java:285)
org.apache.lucene.index.FieldInfos$Builder.add(FieldInfos.java:302)
org.apache.lucene.index.FieldInfos$Builder.add(FieldInfos.java:251)
org.apache.lucene.index.MultiFields.getMergedFieldInfos(MultiFields.java:276)
org.apache.lucene.index.SlowCompositeReaderWrapper.getFieldInfos(SlowCompositeReaderWrapper.java:220)
org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1116)
org.apache.lucene.search.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:1106)
org.apache.solr.request.SimpleFacets.getFieldCacheCounts(SimpleFacets.java:574)
org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:429)
org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:517)
org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:252)
org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:78)

-- View this message in context: http://lucene.472066.n3.nabble.com/solr-optimize-on-fnm-file-tp4134969p4135037.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: query(subquery, default) filters results
Thanks Yonik and Chris for the brief explanation. I too was unaware of this behavior of the query parser. Yonik, can you suggest how I can learn about the query parsers in depth and gain expertise in reading debug query output?

With Regards
Aman Tandon

On Wed, May 7, 2014 at 12:35 AM, Yonik Seeley yo...@heliosearch.com wrote:

On Tue, May 6, 2014 at 5:08 AM, Matteo Grolla matteo.gro...@gmail.com wrote:

Hi everybody, I'm having trouble with the function query query(subquery, default) http://wiki.apache.org/solr/FunctionQuery#query when running this:

http://localhost:8983/solr/select?q=query($qq,1)&qq={!dismax qf=text}hard drive

The default query syntax is lucene, so query(... will just be parsed as text. Try:

q={!func}query($qq,1)

or:

defType=func&q=query($qq,1)

-Yonik
http://heliosearch.org - facet functions, subfacets, off-heap filters + fieldcache
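Both of Yonik's fixes route the string through the function query parser instead of the default lucene parser. A small stdlib-only sketch assembling the first form as a properly URL-encoded request (the host and handler path are taken from Matteo's example):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class FuncQueryUrl {
    // Builds the request Yonik suggested:
    //   q={!func}query($qq,1)&qq={!dismax qf=text}hard drive
    // $qq inside the function query dereferences the separate qq parameter,
    // so both values must be encoded independently.
    static String build() throws UnsupportedEncodingException {
        String qq = URLEncoder.encode("{!dismax qf=text}hard drive", "UTF-8");
        String q  = URLEncoder.encode("{!func}query($qq,1)", "UTF-8");
        return "http://localhost:8983/solr/select?q=" + q + "&qq=" + qq;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build());
    }
}
```

Pasting the unencoded `{!func}` syntax straight into a browser often works too, but encoding it explicitly avoids surprises with the `{`, `$`, and space characters.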
getting direct link to solr result.
Hello all! I have been using Solr for a few days, but I still don't understand how I can get a direct link to open the document I'm looking for. I tried to do that, but the only information I can retrieve from the JSON result from Solr is ID, Name, Modified date... I'm working on an Android application, and I want the user to get a direct link to the file he searched for. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/getting-direct-link-to-solr-result-tp4135084.html Sent from the Solr - User mailing list archive at Nabble.com.
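Solr returns only the stored fields you indexed; it does not keep or serve the original files. A common approach (a sketch; the field name file_url is hypothetical) is to index a stored field holding each document's download URL and request it in the field list:

```xml
<!-- schema.xml: a stored-only field carrying the document's original location -->
<field name="file_url" type="string" indexed="false" stored="true"/>
```

A query like /select?q=report&fl=id,name,file_url&wt=json would then return the link alongside each hit, and the Android app can open it directly. The crawler or indexing code has to populate this field at index time.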
Re: Solrj Default Data Format
Hi Furkan,

If I were to guess, the XML format is more cross-compatible with different versions of SolrJ. But it might not be intentional. In any case, feeding your SolrServer a BinaryResponseParser will switch it over to javabin.

Michael Della Bitta
Applications Developer
o: +1 646 532 3062
appinions inc. "The Science of Influence Marketing"
18 East 41st Street, New York, NY 10017
t: @appinions https://twitter.com/Appinions | g+: plus.google.com/appinions
w: appinions.com http://www.appinions.com/

On Thu, May 8, 2014 at 10:17 AM, Furkan KAMACI furkankam...@gmail.com wrote:

Hi; I found the reason for the weird format from my previous mail. I have now captured the data with Wireshark and I see that it is pure XML, and the content type is set to application/xml. Any ideas about why it is not javabin? Thanks; Furkan KAMACI

2014-05-07 22:16 GMT+03:00 Furkan KAMACI furkankam...@gmail.com:

Hmmm, I see that it is like XML format, but not quite. I added three documents but got something like this:

<add>
  <doc boost="1.0">
    <field name="id">id1</field>
    <field name="id">id2</field>
    <field name="id">id3</field>
    <field name="id">id4</field>
    <field name="d">d1</field>
    <field name="d">d2</field>
    <field name="d">d3</field>
    <field name="d">d4</field>
  </doc>
  <doc boost="1.0"></doc>
  <doc boost="1.0"></doc>
  <doc boost="1.0"></doc>
</add>

Is this javabin format? I mean, optimized XML with a first byte of 2? Thanks; Furkan KAMACI

2014-05-07 22:04 GMT+03:00 Furkan KAMACI furkankam...@gmail.com:

Hi; I am testing SolrJ. I use Solr 4.5.1 and HttpSolrServer for my test. I just generate some SolrInputDocuments and call the add method of the server to add them. When I track the request I see that the data is in XML format instead of javabin. Am I missing anything? Thanks; Furkan KAMACI
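As a sketch (assuming SolrJ 4.x on the classpath; the URL is a placeholder): the response side is controlled by the parser Michael mentions, while the request side — the XML document adds Furkan captured — is controlled by the request writer, which defaults to XML.

```java
import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
import org.apache.solr.client.solrj.impl.BinaryResponseParser;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
// Send updates (document adds) as javabin instead of the default XML:
server.setRequestWriter(new BinaryRequestWriter());
// Parse responses as javabin (this is already SolrJ's default parser):
server.setParser(new BinaryResponseParser());
```

So if Wireshark shows the add requests going out as application/xml, it is the missing BinaryRequestWriter rather than the response parser that explains it.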
KeywordTokenizerFactory splits the string for the exclamation mark
Hi Jack,

Please have a look at this. I have the following field settings in my Solr schema:

<field name="Exact_Word" omitPositions="true" termVectors="false" omitTermFreqAndPositions="true" compressed="true" type="string_ci" multiValued="false" indexed="true" stored="true" required="false" omitNorms="true"/>
<field name="Word" compressed="true" type="email_text_ptn" multiValued="false" indexed="true" stored="true" required="false" omitNorms="true"/>
<fieldtype name="string_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldtype>
<copyField source="Word" dest="Exact_Word"/>

As you can see, Exact_Word has the KeywordTokenizerFactory, and that should treat the string as it is. The following is my responseHeader. As you can see, I am searching my string only in the field Exact_Word and expecting it to return the Word field and the score:

responseHeader:{ status:0, QTime:14, params:{ explainOther:, fl:Word,score, debugQuery:on, indent:on, start:0, q:d!sdasdsdwasd!a...@dsadsadas.edu, qf:Exact_Word, wt:json, fq:, version:2.2, rows:10}}

But when I enter an email with the string d! sdasdsdwasd...@dsadsadas.edu it splits the string in two. I was under the impression that KeywordTokenizerFactory would treat the string as it is. The following is the query debug result; there you can see it has split the word:

parsedquery:+((DisjunctionMaxQuery((Exact_Word:d)) -DisjunctionMaxQuery((Exact_Word:sdasdsdwasd...@dsadsadas.edu)))~1)

Can someone please tell me why it produces the query like this? If I put a string without the ! sign, the produced query is:

parsedquery:+DisjunctionMaxQuery((Exact_Word:d_sdasdsdwasd_...@dsadsadas.edu))

This is what I expected Solr to do even with the ! mark. With the _ character it won't split the string and treats it as it is. I thought that if KeywordTokenizerFactory is applied, it should return the exact string as it is. Please help me understand what is going wrong here.

-- View this message in context: http://lucene.472066.n3.nabble.com/KeywordTokenizerFactory-splits-the-string-for-the-exclamation-mark-tp4135474.html Sent from the Solr - User mailing list archive at Nabble.com.
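The analyzer is likely not the issue here: KeywordTokenizerFactory only runs after the query parser has already split the input, and judging by the parsed query, the parser in use treats a bare ! as the NOT operator (hence the `-DisjunctionMaxQuery` clause). Escaping or phrase-quoting keeps the parser from interpreting it (a sketch with a hypothetical address, since the original is elided in the archive):

```
q=d\!sdasdsdwasd\!user@example.edu
q="d!sdasdsdwasd!user@example.edu"
```

With the operator neutralized, the whole string reaches the Exact_Word analyzer intact and KeywordTokenizerFactory behaves as expected.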
Re: Independent/Selfcontained Solr Unit testing with JUnit
On 5/13/2014 12:46 PM, Vijay Balakrishnan wrote:

Is there any way to run self-contained JUnit tests for, say, a Solr-dependent class, where it doesn't depend on Solr being up and running at localhost:8983? I have a collection etc. set up on the Solr server. Is it possible to mock it with an EmbeddedSolr easily, with a @Before or @BeforeClass annotation in JUnit 4? Any pointers to examples would be awesome (I am also trying to look in the source).

An example of a Solr unit test that fires up Jetty (actually, more than one instance of Jetty) before testing is located here in the source download or checkout:

solr/solrj/src/test/org/apache/solr/client/solrj/TestLBHttpSolrServer.java

Thanks,
Shawn
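For a test that avoids a network port entirely, an EmbeddedSolrServer can be spun up in @Before, as a sketch (assuming solr-core 4.x on the test classpath; the solr home path and core name are placeholders):

```java
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.CoreContainer;
import org.junit.After;
import org.junit.Before;

public class MySolrComponentTest {
    private CoreContainer container;
    private EmbeddedSolrServer server;

    @Before
    public void setUp() {
        // The solr home directory must contain solr.xml and the core's conf/ files,
        // e.g. copied into src/test/resources.
        container = new CoreContainer("/path/to/test-solr-home");
        container.load();
        server = new EmbeddedSolrServer(container, "collection1");
    }

    @After
    public void tearDown() {
        container.shutdown();
    }
}
```

The server object implements the same SolrServer API as HttpSolrServer, so the class under test can accept it via constructor injection without knowing it is embedded.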
RE: Easiest way to install Solr Cloud with Tomcat
Check out http://heliosearch.com/download.html It is a distribution of Apache Solr packaged with Tomcat. I have found it simple to use.

Matt

-----Original Message-----
From: Aman Tandon [mailto:amantandon...@gmail.com]
Sent: Monday, May 12, 2014 6:24 AM
To: solr-user@lucene.apache.org
Subject: Re: Easiest way to install Solr Cloud with Tomcat

Can anybody help me out?

With Regards
Aman Tandon

On Mon, May 12, 2014 at 1:24 PM, Aman Tandon amantandon...@gmail.com wrote:

Hi, I tried to set up SolrCloud with Jetty, which works fine. But in our production environment we use Tomcat, so I need to set up SolrCloud with Tomcat. Please help me with how to set up SolrCloud with Tomcat on a single machine. Thanks in advance.

With Regards
Aman Tandon
search multiple cores
Hi, I am trying to join across multiple cores using a query-time join. The following is my setup (3 cores, Solr 4.7):

core1: 0.5 million documents
core2: 4 million documents and growing. This contains the child documents for documents in core1.
core3: 2 million documents and growing. Contains records from all users.

core2 contains documents that are accessible to each user based on their permissions. The number of documents accessible to a user ranges from a couple of thousand to 100,000. I would like to get results by combining all three cores. For each search I get documents from core3, then query core1 to get parent documents, then core2 to get the appropriate child documents depending on user permissions. I'm referring to this link to join across cores: http://stackoverflow.com/questions/12665797/is-solr-4-0-capable-of-using-join-for-multiple-core

{!join from=fromField to=toField fromIndex=fromCoreName}fromQuery

This is not working for me. Can anyone suggest why it is not working? Any pointers on how to search across multiple cores?

thanks
J
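For reference, a concrete shape of that join syntax (the field names and the user id are hypothetical) for querying core1 restricted to parents whose children in core2 match a permission filter:

```
q={!join from=parent_id to=id fromIndex=core2}user_id:12345
```

Two caveats worth checking: fromIndex must name a core living on the same Solr instance as the core being queried, and the join returns only documents and fields from the "to" core, so combining fields from all three cores has to be done as separate queries stitched together client-side.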
Re: What is the usage of solr.NumericPayloadTokenFilterFactory
I do have basic coverage for that filter (and all other filters) and the parameter values in my e-book: http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html That said, are you sure you want to be using the payload feature of Lucene?

-- Jack Krupansky

-----Original Message-----
From: ienjreny
Sent: Monday, May 12, 2014 12:51 PM
To: solr-user@lucene.apache.org
Subject: What is the usage of solr.NumericPayloadTokenFilterFactory

Dears: Can anybody explain in a simple way what the benefits of solr.NumericPayloadTokenFilterFactory are, and what the acceptable values for typeMatch are? Thanks in advance

-- View this message in context: http://lucene.472066.n3.nabble.com/What-is-the-usage-of-solr-NumericPayloadTokenFilterFactory-tp4135326.html Sent from the Solr - User mailing list archive at Nabble.com.
Error when creating collection
Solr version: 4.2.1

I'm creating a collection via Java using this function call:

String collection = "profile-2";
CoreAdminRequest.Create createRequest = new CoreAdminRequest.Create();
createRequest.setCoreName(collection);
createRequest.setCollection(collection);
createRequest.setInstanceDir(collection);
createRequest.setNumShards(1);
createRequest.process(server);

It is timing out with this exception (from the solr.out logs):

SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 'profile-2': Could not get shard_id for core: profile-2 coreNodeName:192.168.1.152:8983_solr_profile-2
at org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:483)
at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:140)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:591)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:192)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
...
Caused by: org.apache.solr.common.SolrException: Could not get shard_id for core: profile-2 coreNodeName:192.168.1.152:8983_solr_profile-2
at org.apache.solr.cloud.ZkController.doGetShardIdProcess(ZkController.java:1221)
at org.apache.solr.cloud.ZkController.preRegister(ZkController.java:1290)
at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:861)

In a development environment the zookeeper/solr instances are running with elevated permissions and this function worked without error. In a test environment (which matches the production environment) the permissions are more restricted. I made sure the group/owner of the /usr/local/solr directory are set up to be the correct user. Any insight into potential file permissions that I may be overlooking?

Thank you,
Mark