Re: Solr for real time analytics system
Thanks, Bhimavarapu, for the information. We are creating our own dashboard, so we probably won't need Kibana/Banana. I was more curious about Solr's support for fast aggregation queries over a very large data set. As suggested, I guess Elasticsearch has this capability.

Are there any published metrics or data on Elasticsearch/Solr performance in this area that I can refer to?

Thanks
Rohit

On Thu, Feb 4, 2016 at 11:48 AM, CKReddy Bhimavarapu <chaitu...@gmail.com> wrote:

> Hello Rohit,
>
> You can use the Banana project which was forked from Kibana
> <https://github.com/elastic/kibana>, and works with all kinds of time
> series (and non-time series) data stored in Apache Solr
> <https://lucene.apache.org/solr/>. It uses Kibana's powerful dashboard
> configuration capabilities, ports key panels to work with Solr, and
> provides significant additional capabilities, including new panels that
> leverage D3.js <http://d3js.org/>
>
> > would need mostly aggregation queries like sum/average/groupby etc, but
> > data set is quite huge. The aggregation queries should be very fast.
>
> All your requirements can be served by Banana, but I'm not sure
> how fast Solr is compared to ELK <https://www.elastic.co/products>
>
> On Thu, Feb 4, 2016 at 10:51 AM, Rohit Kumar <rohitkumarbhagat...@gmail.com> wrote:
>
> > Hi
> >
> > I am quite new to Solr. I have to build a real time analytics system which
> > displays metrics based on multiple filters over a huge data set (~50 million
> > documents with ~100 fields). I would need mostly aggregation queries like
> > sum/average/groupby etc, but data set is quite huge. The aggregation
> > queries should be very fast.
> >
> > Is Solr suitable for such use cases?
> >
> > Thanks
> > Rohit
>
> --
> ckreddybh. <chaitu...@gmail.com>
Solr for real time analytics system
Hi

I am quite new to Solr. I have to build a real-time analytics system that displays metrics based on multiple filters over a huge data set (~50 million documents with ~100 fields). I would mostly need aggregation queries such as sum, average and group-by, and they need to be very fast despite the size of the data set.

Is Solr suitable for such use cases?

Thanks
Rohit
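To make the kind of request concrete, here is a minimal sketch of a filtered sum/average/group-by query expressed through Solr's JSON Facet API. It assumes Solr 5.1 or later, where that API is available. The field names `region` and `amount` and the two filters are hypothetical, and the sketch only builds the request parameters; it says nothing about how fast such a query runs on ~50 million documents.

```python
import json
from urllib.parse import urlencode


def build_aggregation_params(filters, group_field, metric_field):
    """Build query parameters for a grouped sum/average request.

    filters      -- list of filter queries, one fq parameter each
    group_field  -- field to group by (terms facet)
    metric_field -- numeric field to aggregate
    """
    facet = {
        "by_group": {
            "type": "terms",
            "field": group_field,
            "limit": 10,
            # Nested aggregations are computed per bucket.
            "facet": {
                "total": f"sum({metric_field})",
                "average": f"avg({metric_field})",
            },
        }
    }
    params = [("q", "*:*"), ("rows", "0")]  # rows=0: only aggregates are needed
    params += [("fq", f) for f in filters]
    params.append(("json.facet", json.dumps(facet)))
    return params


# Hypothetical filters and fields, for illustration only.
params = build_aggregation_params(["country:IN", "year:2016"], "region", "amount")
query_string = urlencode(params)
print(query_string)
```

The resulting query string would be appended to the collection's `/select` handler. Keeping each filter in its own `fq` parameter lets Solr cache the filters independently.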
Section Search in SOLR
Hi,

I have the following Solr documents indexed:

<doc>
  <str name="id">1</str>
  <arr name="companyName">
    <str>Boeing</str>
    <str>Kaseya</str>
  </arr>
  <arr name="positionName">
    <str>Executive</str>
    <str>Technician</str>
  </arr>
</doc>

<doc>
  <str name="id">2</str>
  <arr name="companyName">
    <str>Boeing</str>
    <str>Kodak</str>
  </arr>
  <arr name="positionName">
    <str>Technician</str>
    <str>Executive</str>
  </arr>
</doc>

Company name and position name are multivalued fields whose values are maintained in matching order.

This is the Solr query:

fq=companyName:Boeing&fq=positionName:Executive

It returns both documents, as expected.

What changes do I have to make to search for companyName:Boeing and positionName:Executive at the same index in the corresponding multivalued fields, i.e. so that only doc id 1 is returned?

Thanks,
Rohit Kumar
Re: Section Search in SOLR
Thanks, Jack, for the quick reply. My question was probably not detailed enough, so let me add more explanation.

*Option 1:*

Even if I flatten my document to store separate *experiences* in a multivalued field, Solr will still return doc ids 1 and 2 if I query:

fq=experience:Boeing&fq=experience:Executive

<doc>
  <str name="id">1</str>
  <arr name="companyName">
    <str>Boeing</str>
    <str>Kaseya</str>
  </arr>
  <arr name="positionName">
    <str>Executive</str>
    <str>Technician</str>
  </arr>
  <arr name="experience">
    <str>Boeing, Executive</str>
    <str>Kaseya, Technician</str>
  </arr>
</doc>

<doc>
  <str name="id">2</str>
  <arr name="companyName">
    <str>Boeing</str>
    <str>Kodak</str>
  </arr>
  <arr name="positionName">
    <str>Technician</str>
    <str>Executive</str>
  </arr>
  <arr name="experience">
    <str>Boeing, Technician</str>
    <str>Kodak, Executive</str>
  </arr>
</doc>

*Option 2:*

Store each experience in a separate field and generate the query

q=(exp1:(Boeing AND Executive) OR exp2:(Boeing AND Executive))

which returns the docs with the expected match.

<doc>
  <str name="id">2</str>
  ...
  <str name="exp1">Boeing, Executive</str>
  <str name="exp2">Kodak, Executive</str>
</doc>

Please suggest.

I would just love to know how LinkedIn does it to show facets for people working in a company with given titles.

Thanks

On Sat, Sep 28, 2013 at 9:58 PM, Jack Krupansky j...@basetechnology.com wrote:

"multivalued fields maintained in order"

That is not a feature supported by Solr. Solr will maintain the order of an individual multivalued field and will return the values of that field in order, but makes no other use of the order.

Ditto for "corresponding" multivalued fields. Solr does not support any correspondence between multivalued fields.

You must flatten your data to achieve any correspondence.

Multivalued fields are a powerful feature of Solr, but you must be extremely careful to use them only in moderation.

-- Jack Krupansky

-----Original Message-----
From: Rohit Kumar
Sent: Saturday, September 28, 2013 12:11 PM
To: solr-user@lucene.apache.org
Subject: Section Search in SOLR

Hi,

I have following SOLR documents indexed.

<doc>
  <str name="id">1</str>
  <arr name="companyName">
    <str>Boeing</str>
    <str>Kaseya</str>
  </arr>
  <arr name="positionName">
    <str>Executive</str>
    <str>Technician</str>
  </arr>
</doc>

<doc>
  <str name="id">2</str>
  <arr name="companyName">
    <str>Boeing</str>
    <str>Kodak</str>
  </arr>
  <arr name="positionName">
    <str>Technician</str>
    <str>Executive</str>
  </arr>
</doc>

Company name and Position name are multivalued fields maintained in order.

The following is the solr query.

fq=companyName:Boeing&fq=positionName:Executive

which returns both the documents as expected.

What changes will i have to make to be able to search for companyName:Boeing and positionName:Executive both at same indexes in the corresponding multivalued fields i.e. should return me only doc id 1.

Thanks,
Rohit Kumar
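One way to read the "flatten your data" advice is to index each company/position pair as a single, untokenized value and filter on the exact pair. The sketch below models that idea in plain Python. The separator `|`, the helper names, and the assumption that `experience` is a multivalued field of type `string` (so each value matches only as a whole) are illustrative choices, not something stated in the thread.

```python
def pair_experiences(companies, positions, sep="|"):
    """Combine parallel company/position lists into one value per experience."""
    if len(companies) != len(positions):
        raise ValueError("companies and positions must have the same length")
    return [f"{c}{sep}{p}" for c, p in zip(companies, positions)]


def experience_filter(company, position, sep="|"):
    """Build a filter query that matches one exact company/position pair."""
    return f'experience:"{company}{sep}{position}"'


# The two documents from the thread, flattened into paired values.
docs = {
    "1": pair_experiences(["Boeing", "Kaseya"], ["Executive", "Technician"]),
    "2": pair_experiences(["Boeing", "Kodak"], ["Technician", "Executive"]),
}

# Exact matching on the pair, which is how an untokenized field behaves.
wanted = "Boeing|Executive"
matches = [doc_id for doc_id, exps in docs.items() if wanted in exps]

print(experience_filter("Boeing", "Executive"))
print(matches)
```

Because the pair is matched as a whole, doc 2 no longer matches on "Boeing" from one experience and "Executive" from another. The trade-off is that matching becomes exact and case-sensitive unless the values are normalized before indexing.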
Frequent softCommits leading to high faceting times?
Hi,

We are running *SOLR 4.3* with an 8 GB index on:

Ubuntu 12.04, 64-bit
Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
Single core.
16 GB RAM

We just started using the autoSoftCommit feature and noticed that facet queries slowed down from milliseconds to about a minute. We have *8 facet fields*. We add close to 300 documents per second during peak intervals.

<autoCommit>
  <maxTime>60</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>

Here is some information I got with debugQuery. Please note that *facet time is more than 50 seconds.*

<lst name="process">
  <double name="time">50779.0</double>
  <lst name="network">
    <double name="time">0.0</double>
  </lst>
  <lst name="query">
    <double name="time">41.0</double>
  </lst>
  <lst name="facet">
    <double name="time">50590.0</double>
  </lst>
  <lst name="mlt">
    <double name="time">0.0</double>
  </lst>
  <lst name="highlight">
    <double name="time">0.0</double>
  </lst>
  <lst name="stats">
    <double name="time">0.0</double>
  </lst>
  <lst name="connection">
    <double name="time">5.0</double>
  </lst>
  <lst name="debug">
    <double name="time">143.0</double>
  </lst>
</lst>

Please help.

Thanks,
Rohit Kumar
Searching solr on school name during year
Hi,

Currently I have a student search which lets me search for students in a school. I am looking at adding a year search to the existing schema, which would let users search for students who were in a school during a given year. I have a proposed schema change that adds the year component to support this search.

Existing schema (no year information currently):

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="name" type="text_general" indexed="true" stored="true" />
<field name="schoolName" type="text_general" indexed="true" stored="true" multiValued="true"/>

Current sample data:

name: Borris Mayers
schoolName: Canterbury University

New schema:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="name" type="text_general" indexed="true" stored="true" />
<field name="schoolName" type="text_general" indexed="true" stored="true" multiValued="true"/>
<field name="schoolNameWithTermOriginal" type="string" indexed="false" stored="true" multiValued="true"/>

Sample data:

name: Borris Mayers
schoolName: Canterbury University, start_2001, year_2001, year_2002, year_2003, year_2004, year_2005, end_2005
schoolNameWithTermOriginal: Canterbury University||2001-2005

Please let me know whether this is a sound approach or whether there is a better way to do it. I am using Solr 4.3.

Thanks,
Rohit Kumar
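The index-time expansion described in this proposal can be sketched as follows. The function names are hypothetical; the output format simply mirrors the sample data in the post and is not an endorsement of the approach over alternatives.

```python
def school_value(school, start_year, end_year):
    """Build the indexed schoolName value with start/year/end marker tokens."""
    if end_year < start_year:
        raise ValueError("end_year must not be before start_year")
    years = [f"year_{y}" for y in range(start_year, end_year + 1)]
    parts = [school, f"start_{start_year}", *years, f"end_{end_year}"]
    return ", ".join(parts)


def school_original(school, start_year, end_year):
    """Build the stored-only display value for schoolNameWithTermOriginal."""
    return f"{school}||{start_year}-{end_year}"


indexed = school_value("Canterbury University", 2001, 2005)
stored = school_original("Canterbury University", 2001, 2005)
print(indexed)
print(stored)
```

With one such value per school in the multivalued field, a query for a school in a given year has to match the school name and the `year_YYYY` token inside the same value, not just anywhere in the document.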
Searching in stopwords
I have a company search which uses stopwords at query time. My stopwords list has entries like:

HR
Club
India
Pvt.
Ltd.

So if I search for a company like "HR Club" I get no results. Similarly, a search for "India HR" gives no results.

How can I get results when querying for the following companies?

1. HR India
2. HR Club
3. HR India Pvt Ltd

I would still like to keep the above list of stopwords, since these words occur very frequently in company text. Please advise if I need to change my strategy altogether.

<field name="company" type="text_lowercase_whitespace" indexed="true" stored="true" />

<fieldType name="text_lowercase_whitespace" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

Thanks
Rohit Kumar
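The behaviour described above follows from what stop filtering does to a name made up entirely of stopwords. The sketch below is a deliberately simplified model (lowercasing, whitespace splitting, stopword removal), not Solr's actual analysis chain, and it ignores stemming and the tokenizer differences between the index and query analyzers. "Acme" is a made-up company name.

```python
# Stopword list taken from the post, lowercased to mimic ignoreCase="true".
STOPWORDS = {"hr", "club", "india", "pvt.", "ltd."}


def analyze(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [tok for tok in text.lower().split() if tok not in STOPWORDS]


def matches(query, company):
    """True if the query has terms left and all of them occur in the company."""
    query_terms = analyze(query)
    indexed_terms = set(analyze(company))
    return bool(query_terms) and all(t in indexed_terms for t in query_terms)


print(analyze("HR Club"))                  # nothing survives the stop filter
print(analyze("Acme HR India Pvt. Ltd."))  # only the distinctive word is kept
print(matches("HR Club", "HR Club"))
```

In this model the query "HR Club" has no terms left after filtering, and the company "HR Club" has no indexed terms either, so there is nothing to match on. That suggests the names have to keep these words in at least one field if they are to be found by them.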
Using Solr to search between two Strings without using index
Hi,

I have a scenario.

String array = ["Input1 is good", "Input2 is better", "Input2 is sweet", "Input3 is bad"]

I want to compare the string array against the given input:

String inputarray = ["Input1", "Input2"]

It involves no indexes. I just want to use the power of string search to do a runtime search on the array, and it should return

["Input1 is good", "Input2 is better", "Input2 is sweet"]

Thanks
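For an in-memory array like the one above, plain string matching may already be enough, without involving Solr at all. A minimal sketch, assuming "match" means that one of the input strings appears as a whole whitespace-separated word in the sentence:

```python
def filter_by_inputs(sentences, inputs):
    """Return the sentences that contain at least one input as a whole word."""
    wanted = set(inputs)
    # set intersection is non-empty when any word of the sentence is wanted
    return [s for s in sentences if wanted & set(s.split())]


array = ["Input1 is good", "Input2 is better", "Input2 is sweet", "Input3 is bad"]
inputarray = ["Input1", "Input2"]

result = filter_by_inputs(array, inputarray)
print(result)
```

The original order of the array is preserved, and each sentence is scanned once.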
Auto Soft commit not working !!!
My Solr config has:

<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<!-- softAutoCommit is like autoCommit except it causes a
     'soft' commit which only ensures that changes are visible
     but does not ensure that data is synced to disk. This is
     faster and more near-realtime friendly than a hard commit.
-->
<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>

The machine is Ubuntu 13 / 4 cores / 16 GB RAM, with 6 GB given to Solr running on Tomcat.

Still, when I add documents to Solr and then search, it returns 0 hits. It takes a long time before a document actually starts showing up. Can somebody help?

Thanks
Re: Auto Soft commit not working !!!
I checked the Tomcat logs. Although the config says to commit every 15000 ms:

<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

strangely, there are no commit entries in the logs. Did I miss anything?

-----

I am having issues with auto soft commit (near real time). I am using Solr 4.0 on Tomcat. The index size is 10.95 GB. With this configuration it takes more than 60 seconds before an indexed document is returned. When I add documents to Solr and search after the soft commit interval has passed, it returns 0 hits. It takes a long time before a document actually starts showing up, even longer than the autoCommit interval.

<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>

The machine is Ubuntu 13 / 4 cores / 16 GB RAM, with 6 GB given to Solr running on Tomcat.

On Fri, Jul 5, 2013 at 12:13 AM, Daniel Collins danwcoll...@gmail.com wrote:

You should see the commit messages in the solr logs, do they come up at the expected frequency?

On 4 July 2013 15:35, Rohit Kumar rohit.kku...@gmail.com wrote:

My solr config has :

<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<!-- softAutoCommit is like autoCommit except it causes a
     'soft' commit which only ensures that changes are visible
     but does not ensure that data is synced to disk. This is
     faster and more near-realtime friendly than a hard commit.
-->
<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>

Machine is ubuntu 13 / 4 cores / 16GB RAM. Given 6gb to Solr running over tomcat.

Still when i am adding documents to solr and searching its returning 0 hits. Its taking long before the document actually starts showing up. Can somebody help.

Thanks
Re: Auto Soft commit not working !!!
1. Do you have an update processor chain that doesn't have RunUpdate in it?
   *No.*

2. Is the updateLog solrconfig directive missing?
   *Bang on. It was still commented out!*

3. Is _version_ missing from your schema?
   *Checked it, and it is present.*

I will test again and update soon.

Thanks

On Fri, Jul 5, 2013 at 8:30 AM, Jack Krupansky j...@basetechnology.com wrote:

1. Do you have an update processor chain that doesn't have RunUpdate in it?

2. Is the updateLog solrconfig directive missing?

3. Is _version_ missing from your schema?

-- Jack Krupansky

-----Original Message-----
From: Rohit Kumar
Sent: Thursday, July 04, 2013 9:22 PM
To: solr-user@lucene.apache.org
Subject: Re: Auto Soft commit not working !!!

I checked with the tomcat logs. Although the config says it to commit every 15000ms

<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

Strangely there are no commit logs. Did i miss anything?

-----

Having issues in Soft Auto commit (Near Real Time). Am using solr 4.0 on tomcat. The index size is 10.95 GB. With this configuration it takes more than 60 seconds to return the indexed document. When adding documents to solr and searching after soft commit time, its returning 0 hits. Its taking long before the document actually starts showing up, even more than the autoCommit interval.

<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>

Machine is ubuntu 13 / 4 cores / 16GB RAM. Given 6gb to Solr running over tomcat.

On Fri, Jul 5, 2013 at 12:13 AM, Daniel Collins danwcoll...@gmail.com wrote:

You should see the commit messages in the solr logs, do they come up at the expected frequency?

SOLR : ArrayIndexOutOfBoundsException from SolrDispatchFilter
Need help to figure out the error below.

*Code Snippet*:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.document.Document;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.common.util.SimpleOrderedMap;
import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

public class ConnectionComponent extends SearchComponent {

    @Override
    public void process(ResponseBuilder rb) throws IOException {
        NamedList nList = new SimpleOrderedMap();
        NamedList nl = new SimpleOrderedMap();
        List<Document> ld = new ArrayList<Document>();

        Document mydoc = new Document();
        mydoc.add(f); // IndexableField f, not null
        ld.add(mydoc);

        // Nest the document list inside two named lists.
        nl.add(someKey, ld);
        nList.add(otherKey, nl);

        // rb is the ResponseBuilder instance; add the result to the response.
        rb.rsp.add(returnKey, nList);
    }
}

ERROR org.apache.solr.servlet.SolrDispatchFilter ? null:java.lang.ArrayIndexOutOfBoundsException: -1
    at java.util.ArrayList.get(ArrayList.java:324)
    at java.util.Collections$UnmodifiableList.get(Collections.java:1152)
    at org.apache.solr.response.transform.ValueSourceAugmenter.transform(ValueSourceAugmenter.java:92)
    at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:165)
    at org.apache.solr.response.JSONWriter.writeArray(JSONResponseWriter.java:526)
    at org.apache.solr.response.TextResponseWriter.writeArray(TextResponseWriter.java:289)
    at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:192)
    at org.apache.solr.response.JSONWriter.writeNamedListAsMapWithDups(JSONResponseWriter.java:183)
    at org.apache.solr.response.JSONWriter.writeNamedList(JSONResponseWriter.java:299)
    at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:188)