Index mysql database using data import handler in solr
I want to index a MySQL database in Solr using the Data Import Handler. I have made two tables. The first table holds the metadata of a file:

create table filemetadata (
  id varchar(20) primary key,
  filename varchar(50),
  path varchar(200),
  size varchar(10),
  author varchar(50)
);

+----+----------+----------+------+--------+
| id | filename | path     | size | author |
+----+----------+----------+------+--------+
| 1  | abc.txt  | c:\files | 2kb  | eric   |
| 2  | xyz.docx | c:\files | 5kb  | john   |
| 3  | pqr.txt  | c:\files | 10kb | mike   |
+----+----------+----------+------+--------+

The second table contains the favourite info about a particular file in the above table:

create table filefav (
  fid varchar(20) primary key,
  id varchar(20),
  favouritedby varchar(300),
  favouritedtime varchar(10),
  FOREIGN KEY (id) REFERENCES filemetadata(id)
);

+-----+----+--------------+----------------+
| fid | id | favouritedby | favouritedtime |
+-----+----+--------------+----------------+
| 1   | 1  | ross         | 22:30          |
| 2   | 1  | josh         | 12:56          |
| 3   | 2  | johny        | 03:03          |
| 4   | 2  | sean         | 03:45          |
+-----+----+--------------+----------------+

Here 'id' is a foreign key. The second table shows which person has marked which document as his/her favourite. E.g. the file abc.txt, represented by id = 1, has been marked favourite (see column favouritedby) by ross and josh.

I want to index the files as follows. Each document should have the following fields:

id, filename, path, size, author - to be taken from the first table, filemetadata
favouritedby - this field should contain the names of all the people from the second table, filefav (from the favouritedby column), who like that particular file.

E.g. after indexing, doc 1 should have:

id = 1
filename = abc.txt
path = c:\files
size = 2kb
author = eric
favouritedby = ross, josh

How do I achieve this?
I have written a data-config.xml (which is not giving the desired result) as follows:

<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/test" user="root" password="root"/>
  <document name="filemetadata">
    <entity name="restaurant" query="select * from filemetadata">
      <field column="id" name="id"/>
      <entity name="filefav"
              query="select favouritedby from filefav where id=${filemetadata.id}">
        <field column="favouritedby" name="favouritedby1"/>
      </entity>
      <field column="filename" name="name1"/>
      <field column="path" name="path1"/>
      <field column="size" name="size1"/>
      <field column="author" name="author1"/>
    </entity>
  </document>
</dataConfig>

Can anyone explain how I achieve this?

--
View this message in context: http://lucene.472066.n3.nabble.com/Index-mysql-database-using-data-import-handler-in-solr-tp4077205.html
Sent from the Solr - User mailing list archive at Nabble.com.
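One likely culprit in the config above: the child entity's query references ${filemetadata.id}, but the parent entity is named "restaurant", so DIH never resolves the variable. A possible corrected sketch (assuming the schema defines the listed field names, with favouritedby1 multiValued so it can hold several names per file):

```xml
<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/test" user="root" password="root"/>
  <document name="filemetadata">
    <!-- entity name now matches the variable used in the child query -->
    <entity name="filemetadata" query="select * from filemetadata">
      <field column="id" name="id"/>
      <field column="filename" name="name1"/>
      <field column="path" name="path1"/>
      <field column="size" name="size1"/>
      <field column="author" name="author1"/>
      <!-- one child row per favourite; DIH collects them into the
           multiValued favouritedby1 field of the parent document -->
      <entity name="filefav"
              query="select favouritedby from filefav where id='${filemetadata.id}'">
        <field column="favouritedby" name="favouritedby1"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```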
Re: Indexing database in Solr using Data Import Handler
On 11 July 2013 11:13, archit2112 archit2...@gmail.com wrote:

I'm trying to index a MySQL database using the Data Import Handler in Solr. [...] Everything is working, but the favouritedby1 field is not getting indexed, i.e. that field does not exist when I run the *:* query. Can you please help me out?

Please show us your schema.xml. Does it have a favouritedby1 field, and the other fields that you are trying to add through DIH?

Regards,
Gora
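If those fields are missing, a minimal sketch of the schema.xml entries the posted config expects might look like this (field names come from the poster's data-config.xml; the types are assumptions, and favouritedby1 must be multiValued to hold several names per file):

```xml
<field name="id"            type="string"       indexed="true" stored="true"/>
<field name="name1"         type="text_general" indexed="true" stored="true"/>
<field name="path1"         type="string"       indexed="true" stored="true"/>
<field name="size1"         type="string"       indexed="true" stored="true"/>
<field name="author1"       type="text_general" indexed="true" stored="true"/>
<!-- multiValued: one value per matching filefav row -->
<field name="favouritedby1" type="string"       indexed="true" stored="true" multiValued="true"/>
```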
Performance of cross join vs block join
Hello,

Does anyone know about some measurements in terms of performance for cross joins compared to joins inside a single index? Is a join inside a single index that stores all documents of various types (from the parent table or from children tables) with a discriminator field faster than a cross join (where each document type resides in its own index)?

I have performed some tests, but it seems to me that having a join in a single index (a bigger index) does not add much speed improvement compared to cross joins. Why would a block join be faster than a cross join, if this is the case? What are the variables that count when trying to improve the query execution time?

Thanks!
Mihaela
Re: Performance of cross join vs block join
Mihaela,

For me it's reasonable that a single-core join takes the same time as a cross-core one; I just can't see what gain could be obtained in the former case. I can hardly comment on the join code - I looked into it, and it's not trivial, at least.

With block join it doesn't need to obtain parentId term values/numbers and look up parents by them; both of these actions are expensive. Also, block join works as an iterator, but join needs to allocate memory for the parents bitset and populate it out of order, which impacts scalability. Also, in None scoring mode BJQ doesn't need to walk through all children, but only hits the first. Another nice feature is 'both side leapfrog': if you have a highly restrictive filter/query that intersects with BJQ, it allows skipping many parents and children as well, which is not possible in Join, which has a fairly 'full-scan' nature. The main performance factor for Join is the number of child docs.

I'm not sure I got all your questions; please specify them in more detail if something is still unclear. Have you seen my benchmark http://blog.griddynamics.com/2012/08/block-join-query-performs.html ?

On Thu, Jul 11, 2013 at 1:52 PM, mihaela olteanu mihaela...@yahoo.com wrote:

Hello, Does anyone know about some measurements in terms of performance for cross joins compared to joins inside a single index? Is it faster to join inside a single index that stores all documents of various types (from the parent table or from children tables) with a discriminator field, compared to the cross join (basically in this case each document type resides in its own index)? I have performed some tests but to me it seems that having a join in a single index (bigger index) does not add too much speed improvement compared to cross joins. Why would a block join be faster than a cross join if this is the case? What are the variables that count when trying to improve the query execution time?

Thanks!
Mihaela

--
Sincerely yours
Mikhail Khludnev
Principal Engineer, Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
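For reference, the two query shapes being compared look roughly like this (field and core names are hypothetical; the {!parent} block-join parser described here shipped in later Solr releases, so at the time of this thread it required the Lucene-level BJQ):

```text
# block join: parents and children indexed together as one block;
# "which" is a query matching all parent documents
/select?q={!parent which="doc_type:parent"}child_field:value

# cross-core join: children live in their own core
/select?q={!join from=parent_id to=id fromIndex=childcore}child_field:value
```

The block-join form can iterate parent/child blocks in order, which is what enables the leapfrogging described above; the join form must first collect all matching parent ids.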
request to be added as a wiki contributor
Hi, My wiki username is AndyMacKinlay. Can I please be added to the ContributorsGroup? Thanks, Andy
Term component regex to remove stopwords
Hi,

Can the TermsComponent parameter terms.regex be used to ignore stop words?

Regards
Shruti

--
View this message in context: http://lucene.472066.n3.nabble.com/Term-component-regex-to-remove-stopwords-tp4077196.html
Sent from the Solr - User mailing list archive at Nabble.com.
Problem using Term Component in solr
Hi All,

I am using the Terms component in Solr for searching titles in short form, using the wildcard patterns .* and [a-z0-9]*. I am using the Terms component specifically because wildcards are not working in a select?q= query search. Examples of some titles are:

1) Medicine, Health Care and Philosophy
2) Medical Physics
3) Physics of fluids
4) Medical Engineering and Physics

When I do the Solr query:

localhost:8080/solr3.6/OA/terms?terms.fl=title&terms.regex=phy.* fluids&terms.regex.flag=case_insensitive&terms.limit=10

the output is the 3rd title, "Physics of fluids". This is the relevant output.

But when I do the Solr query:

localhost:8080/solr3.6/OA/terms?terms.fl=title&terms.regex=med.* phy.*&terms.regex.flag=case_insensitive&terms.limit=10

the output is the 2nd and 4th titles: "Medical Engineering and Physics" and "Medical Physics". This is irrelevant; I want only one result for this query, i.e. "Medical Physics".

I have changed my wildcard pattern to [a-z0-9]* instead of .*, but then the first query doesn't work, as "of" is included in "Physics of fluids". The second query, however, works fine. An example of the query is:

localhost:8080/solr3.6/OA/terms?terms.fl=title&terms.regex=med[a-z0-9]* phy[a-z0-9]*&terms.regex.flag=case_insensitive&terms.limit=10

This works fine and gives one output, "Medical Physics". If there is another way of searching, using the Terms component or without it, please suggest how to neglect such stop words. Note: the Terms component works only on string dataType fields. :(

--
View this message in context: http://lucene.472066.n3.nabble.com/Problem-using-Term-Component-in-solr-tp4077200.html
Sent from the Solr - User mailing list archive at Nabble.com.
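The difference between the two wildcard styles comes down to plain regex semantics, which can be checked outside Solr. A small sketch (terms.regex does a full match on each indexed term, which for a string field is the whole title; the "lenient" pattern at the end is a hypothetical workaround that tolerates one short stopword between the stems):

```java
import java.util.regex.Pattern;

public class TermsRegexDemo {
    // Full, case-insensitive match - what terms.regex with
    // terms.regex.flag=case_insensitive effectively does per term.
    static boolean matches(String regex, String term) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                      .matcher(term).matches();
    }

    public static void main(String[] args) {
        // Greedy ".*" happily spans the words in between, so both titles match:
        assert matches("med.* phy.*", "Medical Physics");
        assert matches("med.* phy.*", "Medical Engineering and Physics");

        // "[a-z0-9]*" cannot cross a space, so only the adjacent pair matches:
        assert matches("med[a-z0-9]* phy[a-z0-9]*", "Medical Physics");
        assert !matches("med[a-z0-9]* phy[a-z0-9]*", "Medical Engineering and Physics");

        // ...but it also fails when a stopword sits between the two words:
        assert !matches("phy[a-z0-9]* fluids", "Physics of fluids");

        // Allowing an optional short word between the stems covers both cases:
        String lenient = "phy[a-z0-9]*( [a-z]{1,3})? fluids";
        assert matches(lenient, "Physics of fluids");

        System.out.println("ok");
    }
}
```

Run with assertions enabled (java -ea TermsRegexDemo).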
Re: How to make 'fq' optional?
https://lucene.apache.org/solr/4_2_0/solr-core/org/apache/solr/search/SwitchQParserPlugin.html

Hoss cares about you!

On Wed, Jul 10, 2013 at 10:40 PM, Learner bbar...@gmail.com wrote:

I am trying to make a variable in fq optional, e.g.:

/select?first_name=peter&fq=$first_name&q=*:*

I don't want the above query to throw an error or die whenever the variable first_name is not passed to the query; instead it should return the results for the rest of the query. I can use switch, but it's difficult to handle each and every case using switch (as I need to handle switch for so many variables)... Is there a way to resolve this some other way?

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-make-fq-optional-tp4077042.html
Sent from the Solr - User mailing list archive at Nabble.com.

--
Sincerely yours
Mikhail Khludnev
Principal Engineer, Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
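The pattern from that javadoc, adapted to this example (handler, param, and field names are guesses): an appended fq that collapses to a match-all query when first_name is absent or empty, and otherwise delegates to a filter that dereferences the param:

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="appends">
    <!-- empty/missing first_name falls into case='*:*';
         any other value falls through to default=$fq_name -->
    <str name="fq">{!switch case='*:*' default=$fq_name v=$first_name}</str>
  </lst>
  <lst name="defaults">
    <str name="fq_name">{!field f=first_name v=$first_name}</str>
  </lst>
</requestHandler>
```

One such appended fq is needed per optional variable, but each is a one-liner rather than a case-by-case switch.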
Re: Performance of cross join vs block join
In my current use case I have 4 tables with a one-to-many relationship between them (one is the parent and the rest are the children), and I have created a separate Solr core for each table. Now I have the request to return all those parents that match a certain criterion, or where one of their children matches the same criterion or a different one. Given that moving all these documents into a single core implies more changes in the current code than keeping the cores as they are, I also considered the solution with a union of cross joins. Next I performed some tests and saw that having a join in a single core does not add too much compared to a union of cross joins, hence I don't know which solution to adopt. Do you see a use case where I would hit a wall if I keep the documents in separate cores?

BTW, the link below does not work (I found it while searching this topic); it displays an empty page.

Thanks,
Mihaela

From: Mikhail Khludnev mkhlud...@griddynamics.com
To: solr-user solr-user@lucene.apache.org; mihaela olteanu mihaela...@yahoo.com
Sent: Thursday, July 11, 2013 2:25 PM
Subject: Re: Performance of cross join vs block join

Mihaela,

For me it's reasonable that a single-core join takes the same time as a cross-core one; I just can't see what gain could be obtained in the former case. I can hardly comment on the join code - I looked into it, and it's not trivial, at least.

With block join it doesn't need to obtain parentId term values/numbers and look up parents by them; both of these actions are expensive. Also, block join works as an iterator, but join needs to allocate memory for the parents bitset and populate it out of order, which impacts scalability. Also, in None scoring mode BJQ doesn't need to walk through all children, but only hits the first. Another nice feature is 'both side leapfrog': if you have a highly restrictive filter/query that intersects with BJQ, it allows skipping many parents and children as well, which is not possible in Join, which has a fairly 'full-scan' nature. The main performance factor for Join is the number of child docs.

I'm not sure I got all your questions; please specify them in more detail if something is still unclear. Have you seen my benchmark http://blog.griddynamics.com/2012/08/block-join-query-performs.html ?

On Thu, Jul 11, 2013 at 1:52 PM, mihaela olteanu mihaela...@yahoo.com wrote:

Hello, Does anyone know about some measurements in terms of performance for cross joins compared to joins inside a single index? Is it faster to join inside a single index that stores all documents of various types (from the parent table or from children tables) with a discriminator field, compared to the cross join (basically in this case each document type resides in its own index)? I have performed some tests but to me it seems that having a join in a single index (bigger index) does not add too much speed improvement compared to cross joins. Why would a block join be faster than a cross join if this is the case? What are the variables that count when trying to improve the query execution time?

Thanks!
Mihaela

--
Sincerely yours
Mikhail Khludnev
Principal Engineer, Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
Re: amount of values in a multi value field - is denormalization always the best option?
I also have a similar scenario, where fundamentally I have to retrieve all urls where a userid has been found. So, in my schema, I designed the url as a (string) key and a (possibly huge) list of attributes automatically mapped to strings. For example:

Url1 (key):
- language: en
- content:userid1
- content:userid1
- content:userid1 (i.e. 3 times actually for user 1)
- content:userid2
- content:userid3
- author:userid4

and so on and so forth. So, if I did understand, you're saying that this is a bad design? How should I fix my schema, in your opinion, in that case?

Best,
Flavio

On Wed, Jul 10, 2013 at 11:53 PM, Jack Krupansky j...@basetechnology.com wrote:

Simple answer: avoid large numbers of values in a single document. There should only be a modest to moderate number of fields in a single document.

Is the data relatively static, or subject to frequent updates? To update any field of a single document, even with atomic update, requires Solr to read and rewrite every field of the document. So, lots of smaller documents are best for a frequent-update scenario.

Multivalued fields are great for storing a relatively small list of values. You can add to the list easily, but under the hood Solr must read and rewrite the full list as well as the full document. And there is no way to address or synchronize individual elements of multivalued fields.

Joins are great... if used in moderation. Heavy use of joins is not a great idea.

-- Jack Krupansky

-----Original Message-----
From: Marcelo Elias Del Valle
Sent: Wednesday, July 10, 2013 5:37 PM
To: solr-user@lucene.apache.org
Subject: amount of values in a multi value field - is denormalization always the best option?

Hello,

I have asked a question recently about solr limitations and some about joins. It comes that this question is about both at the same time. I am trying to figure out how to denormalize my data so I will need just 1 document in my index instead of performing a join.
I figure one way of doing this is storing an entity as a multivalued field, instead of storing different fields. Let me give an example. Consider the entities:

User:
  id: 1
  name: Joan of Arc
  age: 27

Webpage:
  id: 1
  url: http://wiki.apache.org/solr/Join
  category: Technical
  user_id: 1

  id: 2
  url: http://stackoverflow.com
  category: Technical
  user_id: 1

Instead of creating 1 document for the user, 1 for webpage 1 and 1 for webpage 2 (1 parent and 2 children), I could store the webpages in user multivalued fields, as follows:

User:
  id: 1
  name: Joan of Arc
  age: 27
  webpage1: [id: 1, url: http://wiki.apache.org/solr/Join, category: Technical]
  webpage2: [id: 2, url: http://stackoverflow.com, category: Technical]

It would probably perform better than the join, right? However, it made me think about solr limitations again. What if I have 200 million webpages (200 million fields) per user? Or imagine a case where I could have 200 million values in a field, like in the case where I need to index every html DOM element (div, a, etc.) of every web page a user visited. I mean, if I need to do the query and this is a business requirement no matter what, although denormalizing could be better than using query-time joins, I wonder if distributing the data present in this single document along the cluster wouldn't give me better performance. And this is something I won't get with block joins or multivalued fields...

I guess there is probably no right answer for this question (at least not a known one), and I know I should create a POC to check how each performs... But do you think such a large number of values in a single document could make denormalization not possible in an extreme case like this? Would you share my thoughts if I said denormalization is not always the right option?

Best regards,
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
Re: request to be added as a wiki contributor
Done. On Wed, Jul 10, 2013 at 10:25 PM, Andrew MacKinlay admac...@gmail.com wrote: Hi, My wiki username is AndyMacKinlay. Can I please be added to the ContributorsGroup? Thanks, Andy
Applying Sum on Field
Hi,

I'm a new solr user. I wanted to know: is there any way to apply sum on a field in the result documents of a group query? Following is the query and its result set. I wanted to apply sum on the 'price' field, grouping on type:

Sample input:

<doc>
  <str name="id">3</str>
  <str name="type">Caffe</str>
  <str name="content">Yummm Drinking a latte at Caffe Grecco in SF's historic North Beach Learning text analysis with SolrInAction by Manning on my iPad</str>
  <long name="_version_">1440257540658036736</long>
  <int name="price">250</int>
</doc>
<doc>
  <str name="id">1</str>
  <str name="type">Caffe</str>
  <str name="content">Yummm Drinking a latte at Caffe Grecco in SF's historic North Beach Learning text analysis with SolrInAction by Manning on my iPad</str>
  <long name="_version_">1440257592044552192</long>
  <int name="price">100</int>
</doc>

Query:

http://localhost:8080/solr/collection2/select?q=caffe&df=content&group=true&group.field=type

Your help will be greatly appreciated!

Regards,
Jamshaid
Re: Commit different database rows to solr with same id value?
Just use the address in the URL. You don't have to use the core name if the defaults are set, which is usually collection1. So it's something like

http://host:port/solr/core2/update? blah blah blah

Erick

On Wed, Jul 10, 2013 at 4:17 PM, Jason Huang jason.hu...@icare.com wrote:

Thanks David. I am actually trying to commit the database row on the fly, not via DIH. :)

Anyway, if I understand you correctly, basically you are suggesting to modify the value of the primary key and pass the new value to id before committing to Solr. This could probably be one solution.

What if I want to commit the data from table2 to a new core? Anyone know how I can do that?

thanks,
Jason

On Wed, Jul 10, 2013 at 11:18 AM, David Quarterman da...@corexe.com wrote:

Hi Jason,

Assuming you're using DIH, why not build a new, unique id within the query to use as the 'doc_id' for SOLR? We do something like this in one of our collections. In MySQL, try this (don't know what it would be for any other db, but there must be equivalents):

select @rownum:=@rownum+1 rowid, t.* from (main select query) t, (select @rownum:=0) s

Regards,

DQ

-----Original Message-----
From: Jason Huang [mailto:jason.hu...@icare.com]
Sent: 10 July 2013 15:50
To: solr-user@lucene.apache.org
Subject: Commit different database rows to solr with same id value?

Hello,

I am trying to use Solr to store fields from two different database tables, where the primary keys are in the format of 1, 2, 3, ...

In Java, we build different POJO classes for these two database tables:

table1.java

@SolrIndex(name="id")
private String idTable1

table2.java

@SolrIndex(name="id")
private String idTable2

And later we add the fields defined in the two different types of tables and commit them to solrServer.

Here is the scenario where I am having issues:

(1) commit a row from table1 with primary key = 3 - this generates a document in Solr
(2) commit another row from table2 with the same value of primary key = 3 - this overwrites the document generated in step (1).

What we really want to achieve is to keep both rows in (1) and (2), because they are from different tables. I've read something from a google search and it appears that we might be able to do it by keeping multiple cores in Solr. Could anyone point at how to implement multiple cores to achieve this? To be more specific, when I commit the row as a document, I don't have a place to pick a certain core, and I am not sure if it makes any sense for me to specify a core when I commit the document, since the layer I am working on should abstract that away from me.

The second question is - if we don't want to do multicore (since we can't easily search for related data between multiple cores), how can we resolve this issue so that rows from different database tables which share the same primary key both still exist? We don't want to have to always change the primary key format to ensure uniqueness of the primary key among all the different types of database tables.

thanks!

Jason
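David's rownum trick can also be done on the Java side before calling solr.add(), by prefixing the primary key with its source table; a minimal sketch (helper and class names hypothetical):

```java
public class CompositeId {
    // Prefix the primary key with its source table so rows from
    // different tables never collide on Solr's uniqueKey.
    static String solrId(String table, String pk) {
        return table + "-" + pk;
    }

    public static void main(String[] args) {
        // Row 3 from each table now maps to a distinct Solr document id.
        assert solrId("table1", "3").equals("table1-3");
        assert !solrId("table1", "3").equals(solrId("table2", "3"));
        System.out.println("ok");
    }
}
```

A separate stored field holding the raw key (e.g. idTable1) keeps the original value searchable.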
Re: Applying Sum on Field
Hi,

If you mean adding up numeric values stored in fields - no, Solr doesn't do this by default. We had a similar requirement, and created a custom SearchComponent to handle sum, average, stats etc. There are a number of things you need to bear in mind, such as:

* Handling errors when a query asks for sums on fields that are non-numeric
* Performance issues - e.g. are you willing to wait to add up 50 million fields of stringified numbers
* How to return result payloads in a client-friendly way
* Being prepared to coalesce results from multi-shard/distributed queries

It's not trivial, but it is do-able.

Peter

On Thu, Jul 11, 2013 at 12:56 PM, Jamshaid Ashraf jamshaid...@gmail.com wrote:

Hi,

I'm a new solr user. I wanted to know: is there any way to apply sum on a field in the result documents of a group query? Following is the query and its result set. I wanted to apply sum on the 'price' field, grouping on type:

Sample input:

<doc>
  <str name="id">3</str>
  <str name="type">Caffe</str>
  <str name="content">Yummm Drinking a latte at Caffe Grecco in SF's historic North Beach Learning text analysis with SolrInAction by Manning on my iPad</str>
  <long name="_version_">1440257540658036736</long>
  <int name="price">250</int>
</doc>
<doc>
  <str name="id">1</str>
  <str name="type">Caffe</str>
  <str name="content">Yummm Drinking a latte at Caffe Grecco in SF's historic North Beach Learning text analysis with SolrInAction by Manning on my iPad</str>
  <long name="_version_">1440257592044552192</long>
  <int name="price">100</int>
</doc>

Query:

http://localhost:8080/solr/collection2/select?q=caffe&df=content&group=true&group.field=type

Your help will be greatly appreciated!

Regards,
Jamshaid
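Depending on the Solr version, the stock StatsComponent may already be close enough for simple sums: stats.field computes sum/min/max/mean for a numeric field, and stats.facet breaks the stats down per facet value (URL adapted from the example in this thread; rows=0 just suppresses the document list):

```text
http://localhost:8080/solr/collection2/select?q=caffe&df=content&rows=0&stats=true&stats.field=price&stats.facet=type
```

For the sample data above this would report sum=350 under the "Caffe" facet bucket, though grouping semantics (group.field) are not involved, so it fits only when per-facet-value sums are what is actually wanted.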
Re: Commit different database rows to solr with same id value?
cool. so far I've been using the default collection 1 only. thanks, Jason On Thu, Jul 11, 2013 at 7:57 AM, Erick Erickson erickerick...@gmail.comwrote: Just use the address in the url. You don't have to use the core name if the defaults are set, which is usually collection1. So it's something like http://host:port/solr/core2/update? blah blah blah Erick On Wed, Jul 10, 2013 at 4:17 PM, Jason Huang jason.hu...@icare.com wrote: Thanks David. I am actually trying to commit the database row on the fly, not DIH. :) Anyway, if I understand you correctly, basically you are suggesting to modify the value of the primary key and pass the new value to id before committing to solr. This could probably be one solution. What if I want to commit the data from table2 to a new core? Anyone knows how I can do that? thanks, Jason On Wed, Jul 10, 2013 at 11:18 AM, David Quarterman da...@corexe.com wrote: Hi Jason, Assuming you're using DIH, why not build a new, unique id within the query to use as the 'doc_id' for SOLR? We do something like this in one of our collections. In MySQL, try this (don't know what it would be for any other db but there must be equivalents): select @rownum:=@rownum+1 rowid, t.* from (main select query) t, (select @rownum:=0) s Regards, DQ -Original Message- From: Jason Huang [mailto:jason.hu...@icare.com] Sent: 10 July 2013 15:50 To: solr-user@lucene.apache.org Subject: Commit different database rows to solr with same id value? Hello, I am trying to use Solr to store fields from two different database tables, where the primary keys are in the format of 1, 2, 3, In Java, we build different POJO classes for these two database tables: table1.java @SolrIndex(name=id) private String idTable1 table2.java @SolrIndex(name=id) private String idTable2 And later we add these fields defined in the two different types of tables and commit it to solrServer. 
Here is the scenario where I am having issues: (1) commit a row from table1 with primary key = 3, this generates a document in Solr (2) commit another row from table2 with the same value of primary key = 3, this overwrites the document generated in step (1). What we really want to achieve is to keep both rows in (1) and (2) because they are from different tables. I've read something from google search and it appears that we might be able to do it via keeping multiple cores in solr? Could anyone point at how to implement multiple core to achieve this? To be more specific, when I commit the row as a document, I don't have a place to pick a certain core and I am not sure if it makes any sense for me to specify a core when I commit the document since the layer I am working on should abstract it away from me. The second question is - if we don't want to do a multicore (since we can't easily search for related data between multiple cores), how can we resolve this issue so both rows from different database table which shares the same primary key still exist? We don't want to have to always change the primary key format to ensure a uniqueness of the primary key among all different types of database tables. thanks! Jason
Solr caching clarifications
Hello,

As a result of frequent Java OOM exceptions, I am trying to investigate the Solr JVM memory heap usage. Please correct me if I am mistaken; this is my understanding of the usages for the heap (per replica on a Solr instance):

1. Buffers for indexing - bounded by ramBufferSize
2. Solr caches
3. Segment merges
4. Miscellaneous - buffers for tlogs, servlet overhead, etc.

Particularly I'm concerned by Solr caches and segment merges.

1. How much memory (bytes per doc) do filterCache entries (BitDocSet) and queryResultCache entries (DocList) consume? I understand it is related to the skip spaces between the doc ids that match (so it's not saved as a bitmap). But basically, is every id saved as a Java int?
2. queryResultMaxDocsCached (for example = 100) - does this mean that any query resulting in more than 100 docs will not be cached (at all) in the queryResultCache? Or does it have to do with the documentCache?
3. documentCache - the wiki says it should be greater than max_results * concurrent_queries. Max results is just the number of rows displayed (rows - start), right? Not queryResultWindowSize.
4. enableLazyFieldLoading=true - when querying for ids only (fl=id), will this cache be used? (at the expense of evicting docs that were already loaded with stored fields)
5. How large is the heap used by merges? Assuming we have a merge of 10 segments of 500MB each (half inverted files - *.pos, *.doc etc.; half non-inverted files - *.fdt, *.tvd), how much heap should be left unused for this merge?

Thanks in advance,
Manu
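On question 1, a rough rule of thumb (back-of-envelope, not exact Solr internals): a filterCache entry backed by a bitset costs about maxDoc/8 bytes regardless of how many docs match, while a small sparse set stored as sorted ints costs about 4 bytes per hit. A sketch:

```java
public class FilterCacheEstimate {
    // A bitset-backed DocSet needs one bit per document in the index,
    // i.e. roughly maxDoc / 8 bytes, independent of the hit count.
    static long bitsetBytes(long maxDoc) {
        return maxDoc / 8;
    }

    // A sparse entry stored as an int array needs ~4 bytes per matching doc.
    static long intSetBytes(long numHits) {
        return numHits * 4;
    }

    public static void main(String[] args) {
        // 10M-doc index: every bitset-backed filterCache entry is ~1.25 MB.
        assert bitsetBytes(10_000_000) == 1_250_000;
        // A filter matching only 50k docs is cheaper as an int set (~200 KB).
        assert intSetBytes(50_000) == 200_000;
        System.out.println("ok");
    }
}
```

Multiplying the per-entry bitset cost by the filterCache size setting gives a quick upper bound on that cache's heap footprint.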
solr 4.3.0 cloud in Tomcat, link many collections to Zookeeper
Hi,

We are testing solr 4.3.0 in Tomcat (considering upgrading from solr 3.6.1 to 4.3.0). In the WIKI page for SolrCloud in Tomcat, http://wiki.apache.org/solr/SolrCloudTomcat , we need to link each collection explicitly:

///
8) Link uploaded config with target collection
java -classpath .:/home/myuser/solr-war-lib/* org.apache.solr.cloud.ZkCLI -cmd linkconfig -collection mycollection -confname ...
///

But our application has many cores (a few thousand, which all share the same schema/config); is there a more convenient way?

Thanks very much for helps,
Lisheng
What happens in indexing request in solr cloud if Zookeepers are all dead?
Hi,

In the latest solr cloud doc, it is mentioned that if all Zookeepers are dead, distributed query still works because solr remembers the cluster state. How about indexing request handling if all Zookeepers are dead - does solr need Zookeeper to know which box is master and which is slave for indexing to work? Could solr remember the master/slave relations without Zookeeper?

Also, the doc said a Zookeeper quorum needs a majority rule, so that we must have 3 Zookeepers to handle the case where one instance has crashed. What would happen if we have two instances in the quorum and one instance crashes (or a quorum having 3 instances but two of them crashed)? I felt the last one should take over?

Thanks very much for helps,
Lisheng
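On the quorum question, ZooKeeper's majority rule is just "alive > total/2": the remaining node cannot take over, because it can never prove it holds a majority. A sketch of the arithmetic:

```java
public class ZkQuorum {
    // ZooKeeper stays available only while a strict majority of the
    // ensemble is alive: alive > ensembleSize / 2.
    static boolean hasQuorum(int ensembleSize, int alive) {
        return alive > ensembleSize / 2;
    }

    public static void main(String[] args) {
        assert hasQuorum(3, 2);   // a 3-node ensemble tolerates 1 failure
        assert !hasQuorum(3, 1);  // ...but not 2 failures
        assert !hasQuorum(2, 1);  // a 2-node ensemble tolerates none
        System.out.println("ok");
    }
}
```

So a 2-node ensemble is actually less resilient than a single node: it doubles the chance that some node dies, yet any single death loses the quorum. This is why odd ensemble sizes (3, 5, ...) are recommended.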
Solr 4.3.0 memory usage is higher than solr 3.6.1?
Hi, We are testing solr 4.3.0 in Tomcat (considering upgrading solr 3.6.1 to 4.3.0), we have many cores (a few thousands). We have noticed solr 4.3.0 memory usage is much higher than solr 3.6.1 (without using solr cloud yet). With 2K cores, solr 3.6.1 is using 1.5G, but solr 4.3.0 is using close to 3G memory, when Tomcat is initially started. We used shareSchema and sharedLib, we also disabled searcher warm-up during startup. We are still debugging the issue, we would appreciate if you could provide any guidance? Thanks very much for helps, Lisheng
nested queries + joins performance
Hello,

Continuing to have fun with joins, I finally figured out a way to make my joins work. Suppose I have inserted data as below, using solrj. I want to select a parent (room) that has both:
- a keyboard and a mouse
- a monitor and a tablet

In my data, below, only room2 should be a match. I was able to get this working using the following solr query:

q=*:* AND _query_:"{!join from=root_id to=id}acessory1:Keyboard AND acessory2:Mouse" AND _query_:"{!join from=root_id to=id}acessory1:Monitor AND acessory2:Tablet"

As you can see, I am using nested queries. I have a result for each join, and the results are merged as I was expecting. The problem is that I can have about 20 nested joins in a query, sometimes.

Question: how is this performed in Solr, under the hood? If I have 100 million documents, will all my joins (20 joins, for instance) be applied to 100 million documents? Or will they be applied to the result of the prior join? For instance, suppose my first query returns 10 documents (selected among 100 million). Will the other queries apply only to this result, or will they apply to the entire index, with the results merged afterwards?
Data inserted with solrJ:

public void insertDocuments() throws DataGrinderException, SolrServerException, IOException, DGIndexException {
    SolrServer solr = DGSolrServer.get();

    // Add parent
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "room1");
    doc.addField("cor_parede", "white");
    doc.addField("num_cadeiras", 34);
    solr.add(doc);

    // Add children
    SolrInputDocument doc2 = new SolrInputDocument();
    doc2.addField("id", "computer1");
    doc2.addField("acessory1", "Keyboard");
    doc2.addField("acessory2", "Mouse");
    doc2.addField("root_id", "room1");
    solr.add(doc2);

    doc2 = new SolrInputDocument();
    doc2.addField("id", "computer2");
    doc2.addField("acessory1", "Monitor");
    doc2.addField("acessory2", "Mouse");
    doc2.addField("root_id", "room1");
    solr.add(doc2);

    doc2 = new SolrInputDocument();
    doc2.addField("id", "computer3");
    doc2.addField("acessory1", "Keyboard");
    doc2.addField("acessory2", "Camera");
    doc2.addField("root_id", "room1");
    solr.add(doc2);

    doc2 = new SolrInputDocument();
    doc2.addField("id", "computer4");
    doc2.addField("acessory1", "Tablet");
    doc2.addField("acessory2", "Mouse USB");
    doc2.addField("root_id", "room1");
    solr.add(doc2);

    // Add parent
    doc = new SolrInputDocument();
    doc.addField("id", "room2");
    doc.addField("cor_parede", "black");
    doc.addField("num_cadeiras", 35);
    solr.add(doc);

    // Add children
    doc2 = new SolrInputDocument();
    doc2.addField("id", "computer5");
    doc2.addField("acessory1", "Keyboard");
    doc2.addField("acessory2", "Mouse");
    doc2.addField("root_id", "room2");
    solr.add(doc2);

    doc2 = new SolrInputDocument();
    doc2.addField("id", "computer6");
    doc2.addField("acessory1", "Monitor");
    doc2.addField("acessory2", "Tablet");
    doc2.addField("root_id", "room2");
    solr.add(doc2);

    UpdateResponse response = solr.add(doc);
    if (response.getStatus() != 0)
        throw new DGIndexException("Could not insert document to solr!");
    solr.commit();
}

Best regards,
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
Re: amount of values in a multi value field - is denormalization always the best option?
Hello Flavio, Out of curiosity, are you already using this in prod? Would you share your results/benchmarks with us (not sure if you have any)? I wonder how it is performing for you. I was thinking of using a very similar schema to yours. The thing is: each option has drawbacks; there is no good or bad schema, if I understood things correctly. Even joins, which are something we should avoid using in a NoSQL technology like Solr, may be a good option in some cases. I guess sometimes the only things that can answer some questions are POCs and benchmarks. I am not a Solr expert, and there are several committers on this list who might help you much better than I can, but the way I see it, you should try your solution, see how it performs, and keep looking for alternatives that perform better, if possible. As I said, I am not an expert, but I wouldn't call your model a bad model that needs fixing. It's a possible model and, who knows, maybe another model could perform better. It's like with algorithms: we should assume we can always do better... Best regards, Marcelo. 2013/7/11 Flavio Pompermaier pomperma...@okkam.it I also have a similar scenario, where fundamentally I have to retrieve all urls where a userid has been found. So, in my schema, I designed the url as (string) key and a (possibly huge) list of attributes automatically mapped to strings. For example: Url1 (key): - language: en - content:userid1 - content:userid1 - content:userid1 (i.e. 3 times actually for user 1) - content:userid2 - content:userid3 - author:userid4 and so on and so forth. So, if I did understand, you're saying that this is a bad design? How should I fix my schema in your opinion in that case? Best, Flavio On Wed, Jul 10, 2013 at 11:53 PM, Jack Krupansky j...@basetechnology.com wrote: Simple answer: avoid a large number of values in a single document. There should only be a modest to moderate number of fields in a single document.
Is the data relatively static, or subject to frequent updates? To update any field of a single document, even with atomic update, requires Solr to read and rewrite every field of the document. So, lots of smaller documents are best for a frequent-update scenario. Multivalued fields are great for storing a relatively small list of values. You can add to the list easily, but under the hood, Solr must read and rewrite the full list as well as the full document. And, there is no way to address or synchronize individual elements of multivalued fields. Joins are great... if used in moderation. Heavy use of joins is not a great idea. -- Jack Krupansky -Original Message- From: Marcelo Elias Del Valle Sent: Wednesday, July 10, 2013 5:37 PM To: solr-user@lucene.apache.org Subject: amount of values in a multi value field - is denormalization always the best option? Hello, I have asked a question recently about solr limitations and some about joins. It turns out this question is about both at the same time. I am trying to figure out how to denormalize my data so I will need just 1 document in my index instead of performing a join. I figure one way of doing this is storing an entity as a multivalued field, instead of storing different fields. Let me give an example. Consider the entities: User: id: 1 name: Joan of Arc age: 27 Webpage: id: 1 url: http://wiki.apache.org/solr/Join category: Technical user_id: 1 id: 2 url: http://stackoverflow.com category: Technical user_id: 1 Instead of creating 1 document for the user, 1 for webpage 1 and 1 for webpage 2 (1 parent and 2 children), I could store webpages in a user multivalued field, as follows: User: id: 1 name: Joan of Arc age: 27 webpage1: [id: 1, url: http://wiki.apache.org/solr/Join, category: Technical] webpage2: [id: 2, url: http://stackoverflow.com, category: Technical] It would probably perform better than the join, right?
However, it made me think about Solr limitations again. What if I have 200 million webpages (200 million fields) per user? Or imagine a case where I could have 200 million values in a field, as in the case where I need to index every HTML DOM element (div, a, etc.) of each web page a user visited. I mean, if I need to do the query and this is a business requirement no matter what, then although denormalizing could be better than using query-time joins, I wonder if distributing the data present in this single document across the cluster wouldn't give me better performance. And this is something I won't get with block joins or multivalued fields... I guess there is probably no right answer for this question (at least not a known one), and I know I should create a POC to check how each performs...
edismax behaviour with japanese
Hello, I have a text field and a text_ja field, where text uses an English analyzer and text_ja a Japanese one; I index both via copyField from other fields. I'm trying to search both fields using edismax and the qf parameter, but I see strange behaviour from edismax. I wonder if someone can give me a hint as to what's going on and what I am doing wrong. When I run this query I can see that Solr is searching both fields, but the text_ja: query contains only partial text while text: contains the complete text:

http://localhost/solr/core0/select/?indent=on&rows=100&debug=query&defType=edismax&qf=text+text_ja&q=このたびは

<lst name="debug">
  <str name="rawquerystring">このたびは</str>
  <str name="querystring">このたびは</str>
  <str name="parsedquery">(+DisjunctionMaxQuery((text_ja:たび | text:このたびは)))/no_coord</str>
  <str name="parsedquery_toString">+(text_ja:たび | text:このたびは)</str>
  <str name="QParser">ExtendedDismaxQParser</str>
</lst>

Now, if I remove the last two characters from the query string, Solr will not search text_ja at all, at least that's what I understand from the debug output:

http://localhost/solr/core0/select/?indent=on&rows=100&debug=query&defType=edismax&qf=text+text_ja&q=このた

<lst name="debug">
  <str name="rawquerystring">このた</str>
  <str name="querystring">このた</str>
  <str name="parsedquery">(+DisjunctionMaxQuery((text:このた)))/no_coord</str>
  <str name="parsedquery_toString">+(text:このた)</str>
  <str name="QParser">ExtendedDismaxQParser</str>
</lst>

With another string of Japanese text, Solr now splits the query into multiple text_ja queries:

http://localhost/solr/core0/select/?indent=on&rows=100&debug=query&defType=edismax&qf=text+text_ja&q=システムをお買い求めいただき

<lst name="debug">
  <str name="rawquerystring">システムをお買い求めいただき</str>
  <str name="querystring">システムをお買い求めいただき</str>
  <str name="parsedquery">(+DisjunctionMaxQuery((((text_ja:システム text_ja:買い求める text_ja:いただく)~3) | text:システムをお買い求めいただき)))/no_coord</str>
  <str name="parsedquery_toString">+(((text_ja:システム text_ja:買い求める text_ja:いただく)~3) | text:システムをお買い求めいただき)</str>
  <str name="QParser">ExtendedDismaxQParser</str>
</lst>

Thank you.
Re: amount of values in a multi value field - is denormalization always the best option?
Yeah, probably you're right... I have to test different configurations! That is why I'd like to know the available solutions in advance. Fortunately I'm still developing, so I'm still in a position to investigate the solution. Obviously I'll do some benchmarking on it, but I should know the alternatives... so I asked the list! I'm sure someone will give me some hints, at least I hope :) Best, Flavio On Thu, Jul 11, 2013 at 3:46 PM, Marcelo Elias Del Valle mvall...@gmail.com wrote: Hello Flavio, Out of curiosity, are you already using this in prod? Would you share your results / benchmarks with us? (not sure if you have some). I wonder how it is performing for you. I was thinking in using a very similar schema, comparing to yours. The thing is: each option has drawbacks, there is no good or bad schema, if I understood things correctly. Even joins, which is something we should avoid using in a nosql technology like solr, may be a good option in some cases, I guess sometimes the only thing that can answer some questions are POCs and benchmarks. I am not a solr expert, there are several commiters on this list that might help you much better than I, but the way I think you should try your solution, see how it performs, and keep looking for alternatives that perform better forever, if possible. As I said, I am not an expert, but I wouldn't call your model a bad model that needs fix. It's a possible model and who knows, maybe other model could perform better. It's like in the case of an algorithm, we should assume we can always do better... Best regards, Marcelo. 2013/7/11 Flavio Pompermaier pomperma...@okkam.it I also have a similar scenario, where fundamentally I have to retrieve all urls where a userid has been found. So, in my schema, I designed the url as (string) key and a (possible huge) list of attributes automatically mapped to strings. For example: Url1 (key): - language: en - content:userid1 - content:userid1 - content:userid1 (i.e.
3 times actually for user 1) - content:userid2 - content:userid3 - author:userid4 and so on and so forth. So, if I did understand, you're saying that this is a bad design? How should I fix my schema in your opinion in that case? Best, Flavio On Wed, Jul 10, 2013 at 11:53 PM, Jack Krupansky j...@basetechnology.com wrote: Simple answer: avoid large number of values in a single document. There should only be a modest to moderate number of fields in a single document. Is the data relatively static, or subject to frequent updates? To update any field of a single document, even with atomic update, requires Solr to read and rewrite every field of the document. So, lots of smaller documents are best for a frequent update scenario. Multivalues fields are great for storing a relatively small list of values. You can add to the list easily, but under the hood, Solr must read and rewrite the full list as well as the full document. And, there is no way to address or synchronize individual elements of multivalued fields. Joins are great... if used in moderation. Heavy use of joins is not a great idea. -- Jack Krupansky -Original Message- From: Marcelo Elias Del Valle Sent: Wednesday, July 10, 2013 5:37 PM To: solr-user@lucene.apache.org Subject: amount of values in a multi value field - is denormalization always the best option? Hello, I have asked a question recently about solr limitations and some about joins. It comes that this question is about both at the same time. I am trying to figure how to denormalize my data so I will need just 1 document in my index instead of performing a join. I figure one way of doing this is storing an entity as a multivalued field, instead of storing different fields. Let me give an example. 
Consider the entities: User: id: 1 type: Joan of Arc age: 27 Webpage: id: 1 url: http://wiki.apache.org/solr/**Join http://wiki.apache.org/solr/Join category: Technical user_id: 1 id: 2 url: http://stackoverflow.com category: Technical user_id: 1 Instead of creating 1 document for user, 1 for webpage 1 and 1 for webpage 2 (1 parent and 2 childs) I could store webpages in a user multivalued field, as follows: User: id: 1 name: Joan of Arc age: 27 webpage1: [id:1, url: http://wiki.apache.org/solr/**Join http://wiki.apache.org/solr/Join, category: Technical] webpage2: [id:2, url: http://stackoverflow.com;, category: Technical] It would probably perform better than the join, right? However, it made me think about solr limitations again. What if I have 200 million webpges (200 million fields) per user? Or imagine a case where I could have 200 million values on a field, like
How to boost relevance based on distance and age..
Here is the structure of the Solr document:

<doc>
  <str name="latlong">52.401790,4.936660</str>
  <date name="dateOfBirth">1993-12-09T00:00:00Z</date>
</doc>

I would like to search for documents based on the following weighted criteria: - distance 0-10 miles: weight 40 - distance 10 miles and above: weight 20 - age 0-20 years: weight 20 - age 20 years and above: weight 10 Wondering what the recommended approaches are to build Solr queries for this? Thanks -Vineel -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-boost-relevance-based-on-distance-and-age-tp4077330.html Sent from the Solr - User mailing list archive at Nabble.com.
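One common way to express tiered weights like these is with additive boost functions (bf) under edismax. A hedged sketch only, reusing the field names above: it assumes geodist() for distance (which returns kilometers, so ~16.09 km stands in for 10 miles) and ms() date math for age, and the map() upper bounds are purely illustrative:

```
q=*:*&defType=edismax
  &sfield=latlong&pt=52.401790,4.936660
  &bf=map(geodist(),0,16.09,40,20)
  &bf=map(ms(NOW-20YEARS,dateOfBirth),0,9999999999999,10,20)
```

Here map(x,min,max,target,value) yields target when x falls in [min,max] and value otherwise, so the first bf gives 40 within ~10 miles and 20 beyond, and the second gives 10 when dateOfBirth is more than 20 years ago (ms() positive) and 20 otherwise, matching the weights listed above.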
Too many documents, composite IndexReaders cannot exceed 2147483647
Hello everybody, somehow we managed to overload our Solr 4.2.0 server with too many documents (many of which are already deleted, but the index is not optimized). Now Solr cannot be started anymore; see the full stack trace below.

Caused by: java.lang.IllegalArgumentException: Too many documents, composite IndexReaders cannot exceed 2147483647
    at org.apache.lucene.index.BaseCompositeReader.<init>(BaseCompositeReader.java:79)
    at org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:339)
    at org.apache.lucene.index.StandardDirectoryReader.<init>(StandardDirectoryReader.java:42)
    at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:71)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783)
    at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
    at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:87)
    at org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:34)
    at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:124)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1391)

We would like to bring Solr up at least in a maintenance mode to perform the optimize, after which the deleted documents should be removed and we would have only 1.5 billion docs. How can we accomplish this? Thanks and regards Manuel
SolrJ and initializing logger in solr 4.3?
I am using SolrJ in a Java (actually jruby) project, with Solr 4.3. When I instantiate an HttpSolrServer, I get the dreaded: log4j:WARN No appenders could be found for logger (org.apache.solr.client.solrj.impl.HttpClientUtil). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Using SolrJ as an embedded library in my own software, what is the proper or 'best practice' way -- or failing that, just any way at all -- to initialize log4j under Solr 4.3? I am not super familiar with Java or log4j; hopefully there is an easy way to do this? (If someone has a way especially suited for jruby, even better; but just a standard Java answer would be great too.) Thanks for any advice!
Re: SolrJ and initializing logger in solr 4.3?
Hi Jonathan, I think you just need some config on the classpath: http://logging.apache.org/log4j/1.2/manual.html#defaultInit Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1 917 477 7906 appinions inc. “The Science of Influence Marketing” 18 East 41st Street New York, NY 10017 t: @appinions https://twitter.com/Appinions | g+: plus.google.com/appinions w: appinions.com http://www.appinions.com/ On Thu, Jul 11, 2013 at 12:45 PM, Jonathan Rochkind rochk...@jhu.edu wrote: I am using SolrJ in a Java (actually jruby) project, with Solr 4.3. When I instantiate an HttpSolrServer, I get the dreaded: log4j:WARN No appenders could be found for logger (org.apache.solr.client.solrj.impl.HttpClientUtil). log4j:WARN Please initialize the log4j system properly. log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. Using SolrJ as an embedded library in my own software, what is the proper or 'best practice' way -- or failing that, just any way at all -- to initialize log4j under Solr 4.3? I am not super familiar with Java or log4j; hopefully there is an easy way to do this? (If someone has a way especially suited for jruby, even better; but just a standard Java answer would be great too.) Thanks for any advice!
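Concretely, the default-initialization link above means log4j 1.2 looks for a log4j.properties (or log4j.xml) on the classpath. A minimal sketch of such a file (the appender name and pattern are just a starting point, not anything SolrJ requires):

```properties
# log4j.properties -- place at the root of the classpath
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{HH:mm:ss} %-5p %c{1} - %m%n
```

Alternatively, calling org.apache.log4j.BasicConfigurator.configure() once at startup (before creating the HttpSolrServer) wires up a console appender programmatically, which may be easier to invoke from jruby than managing the classpath.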
Re: amount of values in a multi value field - is denormalization always the best option?
Again, generally, if the number of values is relatively modest and you don't need to discriminate (tell which one matches on a search) and you don't edit the list, a multivalued field makes perfect sense, but if any of those requirements is not true, then you need to represent the items as discrete Solr documents. But, it does all depend on your particular data and particular requirements. -- Jack Krupansky -Original Message- From: Flavio Pompermaier Sent: Thursday, July 11, 2013 7:50 AM To: solr-user@lucene.apache.org Subject: Re: amount of values in a multi value field - is denormalization always the best option? I also have a similar scenario, where fundamentally I have to retrieve all urls where a userid has been found. So, in my schema, I designed the url as (string) key and a (possible huge) list of attributes automatically mapped to strings. For example: Url1 (key): - language: en - content:userid1 - content:userid1 - content:userid1 (i.e. 3 times actually for user 1) - content:userid2 - content:userid3 - author:userid4 and so on and so forth. So, if I did understand, you're saying that this is a bad design? How should I fix my schema in your opinion in that case? Best, Flavio On Wed, Jul 10, 2013 at 11:53 PM, Jack Krupansky j...@basetechnology.comwrote: Simple answer: avoid large number of values in a single document. There should only be a modest to moderate number of fields in a single document. Is the data relatively static, or subject to frequent updates? To update any field of a single document, even with atomic update, requires Solr to read and rewrite every field of the document. So, lots of smaller documents are best for a frequent update scenario. Multivalues fields are great for storing a relatively small list of values. You can add to the list easily, but under the hood, Solr must read and rewrite the full list as well as the full document. And, there is no way to address or synchronize individual elements of multivalued fields. 
Joins are great... if used in moderation. Heavy use of joins is not a great idea. -- Jack Krupansky -Original Message- From: Marcelo Elias Del Valle Sent: Wednesday, July 10, 2013 5:37 PM To: solr-user@lucene.apache.org Subject: amount of values in a multi value field - is denormalization always the best option? Hello, I have asked a question recently about solr limitations and some about joins. It comes that this question is about both at the same time. I am trying to figure how to denormalize my data so I will need just 1 document in my index instead of performing a join. I figure one way of doing this is storing an entity as a multivalued field, instead of storing different fields. Let me give an example. Consider the entities: User: id: 1 type: Joan of Arc age: 27 Webpage: id: 1 url: http://wiki.apache.org/solr/**Joinhttp://wiki.apache.org/solr/Join category: Technical user_id: 1 id: 2 url: http://stackoverflow.com category: Technical user_id: 1 Instead of creating 1 document for user, 1 for webpage 1 and 1 for webpage 2 (1 parent and 2 childs) I could store webpages in a user multivalued field, as follows: User: id: 1 name: Joan of Arc age: 27 webpage1: [id:1, url: http://wiki.apache.org/solr/**Joinhttp://wiki.apache.org/solr/Join, category: Technical] webpage2: [id:2, url: http://stackoverflow.com;, category: Technical] It would probably perform better than the join, right? However, it made me think about solr limitations again. What if I have 200 million webpges (200 million fields) per user? Or imagine a case where I could have 200 million values on a field, like in the case I need to index every html DOM element (div, a, etc.) for each web page user visited. I mean, if I need to do the query and this is a business requirement no matter what, although denormalizing could be better than using query time joins, I wonder it distributing the data present in this single document along the cluster wouldn't give me better performance. 
And this is something I won't get with block joins or multivalued fields... I guess there is probably no right answer for this question (at least not a known one), and I know I should create a POC to check how each perform... But do you think a so large number of values in a single document could make denormalization not possible in an extreme case like this? Would you share my thoughts if I said denormalization is not always the right option? Best regards, -- Marcelo Elias Del Valle http://mvalle.com - @mvallebr
Re: What happens in indexing request in solr cloud if Zookeepers are all dead?
There are no masters or slaves in SolrCloud - it is fully distributed and master-free. Leaders are temporary and can vary over time. The basic idea for quorum is to prevent split brain - two (or more) distinct sets of nodes (zookeeper nodes, that is) each thinking they constitute the authoritative source for access to configuration information. The trick is to require (N/2)+1 nodes for quorum. For n=3, quorum would be (3/2)+1 = 1+1 = 2, so one node can be down. For n=1, quorum = (1/2)+1 = 0 + 1 = 1. For n=2, quorum would be (2/2)+1 = 1 + 1 = 2, so no nodes can be down. IOW, for n=2 no nodes can be down for the cluster to do updates. -- Jack Krupansky -Original Message- From: Zhang, Lisheng Sent: Thursday, July 11, 2013 9:28 AM To: solr-user@lucene.apache.org Subject: What happens in indexing request in solr cloud if Zookeepers are all dead? Hi, In solr cloud latest doc, it mentioned that if all Zookeepers are dead, distributed query still works because solr remembers the cluster state. How about the indexing request handling if all Zookeepers are dead, does solr needs Zookeeper to know which box is master and which is slave for indexing to work? Could solr remember master/slave relations without Zookeeper? Also doc said Zookeeper quorum needs to have a majority rule so that we must have 3 Zookeepers to handle the case one instance is crashed, what would happen if we have two instances in quorum and one instance is crashed (or quorum having 3 instances but two of them are crashed)? I felt the last one should take over? Thanks very much for helps, Lisheng
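Jack's quorum arithmetic can be sketched in a few lines (integer division, as in Java; the class and method names here are just for illustration):

```java
// Quorum math for a ZooKeeper ensemble of n nodes: quorum is (n/2)+1
// with integer division, and the ensemble tolerates n - quorum failures.
public class QuorumMath {

    // Minimum number of live nodes needed for a quorum of an n-node ensemble.
    static int quorum(int n) {
        return (n / 2) + 1;
    }

    // How many nodes an ensemble of n can lose and still take updates.
    static int tolerableFailures(int n) {
        return n - quorum(n);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println("n=" + n + ": quorum=" + quorum(n)
                    + ", nodes that can be down=" + tolerableFailures(n));
        }
    }
}
```

Note the counter-intuitive consequence Jack points out: a 2-node ensemble tolerates zero failures, so it is no more available than a single node (and strictly worse for updates).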
Re: Applying Sum on Field
Take a look at the stats component, which calculates aggregate values. It has a facet parameter that may or may not give you something similar to what you want. Or, just form a query that matches the results of the group, and then get the stats. See: http://wiki.apache.org/solr/StatsComponent -- Jack Krupansky -Original Message- From: Jamshaid Ashraf Sent: Thursday, July 11, 2013 7:56 AM To: solr-user@lucene.apache.org Subject: Applying Sum on Field Hi, I'm a new Solr user. I wanted to know: is there any way to apply sum on a field in the result documents of a group query? Following is the query and its result set. I want to apply sum on the 'price' field, grouping on 'type':

Sample input:

<doc>
  <str name="id">3</str>
  <str name="type">Caffe</str>
  <str name="content">Yummm Drinking a latte at Caffe Grecco in SF shistoric North Beach Learning text analysis with SolrInAction by Manning on my iPad</str>
  <long name="_version_">1440257540658036736</long>
  <int name="price">250</int>
</doc>
<doc>
  <str name="id">1</str>
  <str name="type">Caffe</str>
  <str name="content">Yummm Drinking a latte at Caffe Grecco in SF shistoric North Beach Learning text analysis with SolrInAction by Manning on my iPad</str>
  <long name="_version_">1440257592044552192</long>
  <int name="price">100</int>
</doc>

Query:

http://localhost:8080/solr/collection2/select?q=caffe&df=content&group=true&group.field=type

Your help will be greatly appreciated! Regards, Jamshaid
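Spelling out the StatsComponent suggestion against Jamshaid's own core, the request would look something like the following (stats, stats.field and stats.facet are standard StatsComponent parameters; host and core names are taken from the original query):

```
http://localhost:8080/solr/collection2/select?q=caffe&df=content&rows=0
    &stats=true&stats.field=price&stats.facet=type
```

This returns min/max/sum/mean (and more) of price, broken down per value of type; with the two sample docs above, the facet entry for type=Caffe would report sum=350.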
Thousands of cluster state change events per second from zookeeper
Hi, We have 3 search client nodes connected to a 12x2 Solr 4.2.1 cluster through CloudSolrServer. We are noticing thousands of such events being logged every second on these client nodes and filling up the logs quickly. Are there any known bug in Zookeeper or SolrJ client that can cause this? When we restarted one of the search client node, the notifications stopped on all 3 clients. But I am sure it will reappear though because this behavior is intermittent. This behavior does not seem to be correlated to indexing since this notifications happens whether or not indexing is happening. Jul 11 2013 10:38:18.537 PDT [-0700] [http-8080-1-EventThread] INFO o.a.solr.common.cloud.ZkStateReader - A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 24) Jul 11 2013 10:38:18.538 PDT [-0700] [http-8080-1-EventThread] INFO o.a.solr.common.cloud.ZkStateReader - A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, has occurred - updating... (live nodes size: 24) Jul 11 2013 10:38:18.540 PDT [-0700] [http-8080-1-EventThread] INFO o.a.solr.common.cloud.ZkStateReader - A cluster state change: WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json, Thanks -Shankar
RE: What happens in indexing request in solr cloud if Zookeepers are all dead?
Yes, I should not have used word master/slave for solr cloud! So if all Zookeepers are dead, could indexing requests be handled properly (could solr remember the setting for indexing)? Thanks very much for helps, Lisheng -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Thursday, July 11, 2013 10:46 AM To: solr-user@lucene.apache.org Subject: Re: What happens in indexing request in solr cloud if Zookeepers are all dead? There are no masters or slaves in SolrCloud - it is fully distributed and master-free. Leaders are temporary and can vary over time. The basic idea for quorum is to prevent split brain - two (or more) distinct sets of nodes (zookeeper nodes, that is) each thinking they constitute the authoritative source for access to configuration information. The trick is to require (N/2)+1 nodes for quorum. For n=3, quorum would be (3/2)+1 = 1+1 = 2, so one node can be down. For n=1, quorum = (1/2)+1 = 0 + 1 = 1. For n=2, quorum would be (2/2)+1 = 1 + 1 = 2, so no nodes can be down. IOW, for n=2 no nodes can be down for the cluster to do updates. -- Jack Krupansky -Original Message- From: Zhang, Lisheng Sent: Thursday, July 11, 2013 9:28 AM To: solr-user@lucene.apache.org Subject: What happens in indexing request in solr cloud if Zookeepers are all dead? Hi, In solr cloud latest doc, it mentioned that if all Zookeepers are dead, distributed query still works because solr remembers the cluster state. How about the indexing request handling if all Zookeepers are dead, does solr needs Zookeeper to know which box is master and which is slave for indexing to work? Could solr remember master/slave relations without Zookeeper? Also doc said Zookeeper quorum needs to have a majority rule so that we must have 3 Zookeepers to handle the case one instance is crashed, what would happen if we have two instances in quorum and one instance is crashed (or quorum having 3 instances but two of them are crashed)? I felt the last one should take over? 
Thanks very much for helps, Lisheng
Re: Moving replica from node to node?
Yeah, though CREATE and UNLOAD end up being kind of funny descriptors. You'd think LOAD and UNLOAD or CREATE and DELETE or something... On Wed, Jul 10, 2013 at 11:35 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Thanks Mark. I assume you are referring to using the Core Admin API - CREATE and UNLOAD? Added https://issues.apache.org/jira/browse/SOLR-5032 Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Mon, Jul 8, 2013 at 10:50 PM, Mark Miller markrmil...@gmail.com wrote: It's simply a sugar method that no one has gotten to yet. I almost have once or twice, but I always have moved onto other things before even starting. It's fairly simple to just start another replica on the TO node and then delete the replica on the FROM node, so not a lot of urgency. - Mark On Jul 8, 2013, at 10:18 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, Solr(Cloud) currently doesn't have any facility to move a specific replica from one node to the other. How come? Is there a technical or philosophical reason, or just the 24 hours/day reason? Thanks, Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm -- - Mark
What does too many merges...stalling in indexwriter log mean?
Hello, We are seeing the message too many merges...stalling in our indexwriter log. Is this something to be concerned about? Does it mean we need to tune something in our indexing configuration? Tom
Leader Election, when?
I have a working ZooKeeper ensemble running with 3 instances and also a SolrCloud cluster with some Solr instances. I've created a collection configured with 2 shards. Then I: create 1 core on instance1, create 1 core on instance2, create 1 core on instance1, create 1 core on instance2, just to have this configuration: instance1: shard1_leader, shard2_replica; instance2: shard1_replica, shard2_leader. If I add 2 cores to instance1 and then 2 cores to instance2, both leaders will be on instance1 and no re-election is done: instance1: shard1_leader, shard2_leader; instance2: shard1_replica, shard2_replica. Back to my ideal scenario (leaders on separate instances): when I add a third instance with 2 replicas and kill one of my instances running a leader, the election picks the instance that already has a leader. My question is why ZooKeeper behaves this way. Shouldn't it distribute leaders? If I put some stress on a double-leader instance, is ZooKeeper going to run an election? -- View this message in context: http://lucene.472066.n3.nabble.com/Leader-Election-when-tp4077381.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Moving replica from node to node?
And CREATE and UNLOAD are almost exactly the wrong descriptors, because CREATE loads up a core that's already there, and UNLOAD can in fact delete it from the filesystem… Alan Woodward www.flax.co.uk On 11 Jul 2013, at 20:15, Mark Miller wrote: Yeah, though CREATE and UNLOAD end up being kind of funny descriptors. You'd think LOAD and UNLOAD or CREATE and DELETE or something... On Wed, Jul 10, 2013 at 11:35 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Thanks Mark. I assume you are referring to using the Core Admin API - CREATE and UNLOAD? Added https://issues.apache.org/jira/browse/SOLR-5032 Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Mon, Jul 8, 2013 at 10:50 PM, Mark Miller markrmil...@gmail.com wrote: It's simply a sugar method that no one has gotten to yet. I almost have once or twice, but I always have moved onto other things before even starting. It's fairly simple to just start another replica on the TO node and then delete the replica on the FROM node, so not a lot of urgency. - Mark On Jul 8, 2013, at 10:18 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, Solr(Cloud) currently doesn't have any facility to move a specific replica from one node to the other. How come? Is there a technical or philosophical reason, or just the 24 hours/day reason? Thanks, Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm -- - Mark
SolrJ 4.3 to Solr 1.4
So, I'm trying to use SolrJ 4.3 to talk to an old Solr 1.4 -- specifically, to add documents. The wiki at http://wiki.apache.org/solr/Solrj suggests, I think, that this should work, so long as you: server.setParser(new XMLResponseParser()); However, when I do this, I still get an org.apache.solr.common.SolrException: parsing error from org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:143). (If I _don't_ setParser to XML and use the binary parser, I get a fully expected error about binary format corruption -- that part is expected and I understand it; that's why you have to use the XMLResponseParser instead.) Am I not doing enough to my SolrJ 4.3 to get it to talk to the Solr 1.4 server in pure XML? I've set the parser to the XMLResponseParser; do I also have to somehow tell it to actually use the Solr 1.4 XML update handler or something? I don't entirely understand what I'm talking about. Alternately... is it just a lost cause trying to get SolrJ 4.3 to talk to Solr 1.4? Is the wiki wrong that this is possible? Thanks for any help, Jonathan
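For reference, the response parser only controls one direction of the wire format; the update request format is set separately via the request writer. A hedged sketch of wiring both to XML -- untested against 1.4, and the server URL is a placeholder, not from the original post:

```java
// Sketch: force SolrJ 4.x to speak XML in both directions.
// The URL below is an assumption for illustration only.
HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");

// Parse responses as XML instead of the default javabin:
server.setParser(new XMLResponseParser());

// Send update requests as XML too; RequestWriter is the XML writer,
// while BinaryRequestWriter is the javabin one:
server.setRequestWriter(new RequestWriter());
```

Whether this is sufficient for a 1.4 server I can't say; it only rules out the javabin mismatch on both request and response.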
Re: What happens in indexing request in solr cloud if Zookeepers are all dead?
Sorry, no updates if no Zookeepers. There would be no way to assure that any node knows the proper configuration. Queries are a little safer using most recent configuration without zookeeper, but update consistency requires accurate configuration information. -- Jack Krupansky -Original Message- From: Zhang, Lisheng Sent: Thursday, July 11, 2013 2:59 PM To: solr-user@lucene.apache.org Subject: RE: What happens in indexing request in solr cloud if Zookeepers are all dead? Yes, I should not have used word master/slave for solr cloud! So if all Zookeepers are dead, could indexing requests be handled properly (could solr remember the setting for indexing)? Thanks very much for helps, Lisheng -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Thursday, July 11, 2013 10:46 AM To: solr-user@lucene.apache.org Subject: Re: What happens in indexing request in solr cloud if Zookeepers are all dead? There are no masters or slaves in SolrCloud - it is fully distributed and master-free. Leaders are temporary and can vary over time. The basic idea for quorum is to prevent split brain - two (or more) distinct sets of nodes (zookeeper nodes, that is) each thinking they constitute the authoritative source for access to configuration information. The trick is to require (N/2)+1 nodes for quorum. For n=3, quorum would be (3/2)+1 = 1+1 = 2, so one node can be down. For n=1, quorum = (1/2)+1 = 0 + 1 = 1. For n=2, quorum would be (2/2)+1 = 1 + 1 = 2, so no nodes can be down. IOW, for n=2 no nodes can be down for the cluster to do updates. -- Jack Krupansky -Original Message- From: Zhang, Lisheng Sent: Thursday, July 11, 2013 9:28 AM To: solr-user@lucene.apache.org Subject: What happens in indexing request in solr cloud if Zookeepers are all dead? Hi, In solr cloud latest doc, it mentioned that if all Zookeepers are dead, distributed query still works because solr remembers the cluster state. 
How about the handling of indexing requests if all Zookeepers are dead? Does solr need Zookeeper to know which box is master and which is slave for indexing to work? Could solr remember master/slave relations without Zookeeper? Also, the doc said the Zookeeper quorum needs a majority rule, so we must have 3 Zookeepers to handle the case where one instance crashes. What would happen if we have two instances in the quorum and one of them crashes (or a quorum of 3 instances where two of them crash)? I felt the last one should take over? Thanks very much for the help, Lisheng
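Jack's majority-rule arithmetic ((N/2)+1 with integer division) can be sketched in a few lines; this is an illustration of the formula only, not actual ZooKeeper code, and the class and method names are invented:

```java
// Sketch of the ZooKeeper majority-quorum arithmetic Jack describes.
// Illustration only; ZooKeeper computes this internally.
class QuorumMath {
    // Minimum live nodes for a quorum in an n-node ensemble: (N/2)+1,
    // using integer division.
    static int quorum(int n) {
        return n / 2 + 1;
    }

    // Node failures an n-node ensemble can tolerate while keeping quorum.
    static int tolerated(int n) {
        return n - quorum(n);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println("ensemble=" + n + " quorum=" + quorum(n)
                    + " tolerated failures=" + tolerated(n));
        }
    }
}
```

For a 3-node ensemble this gives quorum 2 (one failure tolerated); for 2 nodes it gives quorum 2, so a 2-node ensemble tolerates no failures, matching Jack's point that n=2 buys you nothing over n=1 for updates.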
Re: SolrJ 4.3 to Solr 1.4
: However, when I do this, I still get a org.apache.solr.common.SolrException: : parsing error from : org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:143) it's impossible to guess what the underlying problem might be unless you can provide us the full error. The one thing i can think of that might not be obvious is that the legacy header format (query param version=1.0 vs version=2.0) might be confusing the XMLResponseParser ... i don't remember when the default changed but i thought it was *before* Solr 1.4 https://wiki.apache.org/solr/XMLResponseFormat#A.27version.27 -Hoss
Re: Partial Matching in both query and field
Jack, This still isn't working. I just upgraded to 3.6.2 to verify that wasn't the issue. Here's query information: lst name=params str name=debugQueryon/str str name=indenton/str str name=start0/str str name=q0_extrafield1_n:20454/str str name=rows10/str str name=version2.2/str /lst /lst result name=response numFound=0 start=0/ lst name=debug str name=rawquerystring0_extrafield1_n:20454/str str name=querystring0_extrafield1_n:20454/str str name=parsedqueryPhraseQuery(0_extrafield1_n:2o45 o454 2o454)/str str name=parsedquery_toString0_extrafield1_n:2o45 o454 2o454/str lst name=explain/ str name=QParserLuceneQParser/str Here's the applicable lines from schema.xml: fieldType name=ngram class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=1 splitOnCaseChange=0 splitOnNumerics=0 preserveOriginal=0/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l replacement=i replace=all/ filter class=solr.NGramFilterFactory minGramSize=4 maxGramSize=16/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.NGramTokenizerFactory minGramSize=4 maxGramSize=16 / tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.PatternReplaceFilterFactory pattern=[^A-Za-z0-9]+ replacement= replace=all/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory 
pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l replacement=i replace=all/ !--filter class=solr.NGramFilterFactory minGramSize=4 maxGramSize=4 /-- filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer /fieldType dynamicField name=*_n type=ngram indexed=true stored=true / It looks like it's generating phrases to me even though I have it set to false. James [image: SearchSpring | Findability Unleashed] James Bathgate | Sr. Developer Toll Free (888) 643-9043 x610 - Fax (719) 358-2027 4291 Austin Bluffs Pkwy #206 | Colorado Springs, CO 80918 www.searchspring.net http://www.searchspring.net On Tue, Jul 2, 2013 at 2:47 PM, Jack Krupansky j...@basetechnology.com wrote: Ahhh... you put autoGeneratePhraseQueries=false on the field - but it needs to be on the field type. You can see from the parsed query that it generated the phrase. -- Jack Krupansky -Original Message- From: James Bathgate Sent: Tuesday, July 02, 2013 5:35 PM To: solr-user@lucene.apache.org Subject: Re: Partial Matching in both query and field Jack, I've already tried that, here's my query: str name=debugQueryon/str str name=indenton/str str name=start0/str str name=q0_extrafield1_n:20454/str str name=q.opOR/str str name=rows10/str str name=version2.2/str Here's the parsed query: str name=parsedquery_toString0_extrafield1_n:2o45 o454 2o454/str Here's the applicable lines from schema.xml: fieldType name=ngram class=solr.TextField positionIncrementGap=100 analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=1 splitOnCaseChange=0 splitOnNumerics=0 preserveOriginal=0/ filter
class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l replacement=i replace=all/ filter class=solr.NGramFilterFactory minGramSize=4 maxGramSize=16/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.NGramTokenizerFactory minGramSize=4 maxGramSize=16 / filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.PatternReplaceFilterFactory pattern=[^A-Za-z0-9]+ replacement= replace=all/ filter
Re: SolrJ 4.3 to Solr 1.4
Huh, that might have been a false problem of some kind. At the moment, it looks like I _do_ have my SolrJ 4.3 successfully talking to a Solr 1.4, so long as I setParser(new XMLResponseParser()). Not sure what I changed or what wasn't working before, but great! So nevermind. Although if anyone reading this wants to share any other potential gotchas on solrj 4.3 talking to solr 1.4, feel free! On 7/11/13 4:24 PM, Jonathan Rochkind wrote: So, trying to use a SolrJ 4.3 to talk to an old Solr 1.4. Specifically to add documents. The wiki at http://wiki.apache.org/solr/Solrj suggests, I think, that this should work, so long as you: server.setParser(new XMLResponseParser()); However, when I do this, I still get a org.apache.solr.common.SolrException: parsing error from org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:143) (If I _don't_ setParser to XML, and use the binary parser... I get a fully expected error about binary format corruption -- that part is expected and I understand it, that's why you have to use the XMLResponseParser instead). Am I not doing enough to my SolrJ 4.3 to get it to talk to the Solr 1.4 server in pure XML? I've set the parser to the XMLResponseParser, do I also have to somehow tell it to actually use the Solr 1.4 XML update handler or something? I don't entirely understand what I'm talking about. Alternately... is it just a lost cause trying to get SolrJ 4.3 to talk to Solr 1.4, is the wiki wrong that this is possible? Thanks for any help, Jonathan
Re: Partial Matching in both query and field
I just noticed I pasted the wrong fieldType with the extra tokenizer not commented out. fieldType name=ngram class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=1 splitOnCaseChange=0 splitOnNumerics=0 preserveOriginal=0/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l replacement=i replace=all/ filter class=solr.NGramFilterFactory minGramSize=4 maxGramSize=16/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.NGramTokenizerFactory minGramSize=4 maxGramSize=16 / filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.PatternReplaceFilterFactory pattern=[^A-Za-z0-9]+ replacement= replace=all/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l replacement=i replace=all/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer /fieldType [image: SearchSpring | Findability Unleashed] James Bathgate | Sr. Developer Toll Free (888) 643-9043 x610 - Fax (719) 358-2027 4291 Austin Bluffs Pkwy #206 | Colorado Springs, CO 80918 www.searchspring.net http://www.searchspring.net On Thu, Jul 11, 2013 at 2:15 PM, James Bathgate ja...@b7interactive.com wrote: Jack, This still isn't working. I just upgraded to 3.6.2 to verify that wasn't the issue.
Here's query information: lst name=params str name=debugQueryon/str str name=indenton/str str name=start0/str str name=q0_extrafield1_n:20454/str str name=rows10/str str name=version2.2/str /lst /lst result name=response numFound=0 start=0/ lst name=debug str name=rawquerystring0_extrafield1_n:20454/str str name=querystring0_extrafield1_n:20454/str str name=parsedqueryPhraseQuery(0_extrafield1_n:2o45 o454 2o454)/str str name=parsedquery_toString0_extrafield1_n:2o45 o454 2o454/str lst name=explain/ str name=QParserLuceneQParser/str Here's the applicable lines from schema.xml: fieldType name=ngram class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=1 splitOnCaseChange=0 splitOnNumerics=0 preserveOriginal=0/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l replacement=i replace=all/ filter class=solr.NGramFilterFactory minGramSize=4 maxGramSize=16/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.NGramTokenizerFactory minGramSize=4 maxGramSize=16 / tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.PatternReplaceFilterFactory pattern=[^A-Za-z0-9]+ replacement= replace=all/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l 
replacement=i replace=all/ !--filter class=solr.NGramFilterFactory minGramSize=4 maxGramSize=4 /-- filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer /fieldType dynamicField name=*_n type=ngram indexed=true stored=true / It looks like it's generating phrases to me even though I have it set to false. James [image: SearchSpring | Findability Unleashed] James Bathgate | Sr. Developer Toll Free (888) 643-9043 x610 - Fax (719) 358-2027 4291 Austin Bluffs Pkwy #206 | Colorado Springs, CO 80918 www.searchspring.net http://www.searchspring.net On Tue, Jul 2, 2013 at 2:47 PM,
Re: Partial Matching in both query and field
A couple of possibilities: 1. Make sure to reload the core. 2. Check that the Solr schema version is new enough to recognize autoGeneratePhraseQueries. 3. What query parser are you using? -- Jack Krupansky -Original Message- From: James Bathgate Sent: Thursday, July 11, 2013 5:26 PM To: solr-user@lucene.apache.org Subject: Re: Partial Matching in both query and field I just noticed I pasted the wrong fieldType with the extra tokenizer not commented out. fieldType name=ngram class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=1 splitOnCaseChange=0 splitOnNumerics=0 preserveOriginal=0/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l replacement=i replace=all/ filter class=solr.NGramFilterFactory minGramSize=4 maxGramSize=16/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.NGramTokenizerFactory minGramSize=4 maxGramSize=16 / filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.PatternReplaceFilterFactory pattern=[^A-Za-z0-9]+ replacement= replace=all/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l replacement=i replace=all/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer /fieldType [image: SearchSpring | Findability Unleashed] James Bathgate | Sr. 
Developer Toll Free (888) 643-9043 x610 - Fax (719) 358-2027 4291 Austin Bluffs Pkwy #206 | Colorado Springs, CO 80918 www.searchspring.net http://www.searchspring.net On Thu, Jul 11, 2013 at 2:15 PM, James Bathgate ja...@b7interactive.com wrote: Jack, This still isn't working. I just upgraded to 3.6.2 to verify that wasn't the issue. Here's query information: lst name=params str name=debugQueryon/str str name=indenton/str str name=start0/str str name=q0_extrafield1_n:20454/str str name=rows10/str str name=version2.2/str /lst /lst result name=response numFound=0 start=0/ lst name=debug str name=rawquerystring0_extrafield1_n:20454/str str name=querystring0_extrafield1_n:20454/str str name=parsedqueryPhraseQuery(0_extrafield1_n:2o45 o454 2o454)/str str name=parsedquery_toString0_extrafield1_n:2o45 o454 2o454/str lst name=explain/ str name=QParserLuceneQParser/str Here's the applicable lines from schema.xml: fieldType name=ngram class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=1 splitOnCaseChange=0 splitOnNumerics=0 preserveOriginal=0/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l replacement=i replace=all/ filter class=solr.NGramFilterFactory minGramSize=4 maxGramSize=16/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.NGramTokenizerFactory minGramSize=4 maxGramSize=16 / tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory
ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.PatternReplaceFilterFactory pattern=[^A-Za-z0-9]+ replacement= replace=all/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l replacement=i replace=all/ !--filter class=solr.NGramFilterFactory minGramSize=4 maxGramSize=4 /-- filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer /fieldType dynamicField name=*_n type=ngram indexed=true stored=true / It looks like it's generating phrases to me even though I have it set
Re: What does too many merges...stalling in indexwriter log mean?
On 7/11/2013 1:47 PM, Tom Burton-West wrote: We are seeing the message too many merges...stalling in our indexwriter log. Is this something to be concerned about? Does it mean we need to tune something in our indexing configuration? It sounds like you've run into the maximum number of simultaneous merges, which I believe defaults to two, or maybe three. The following config section in indexConfig will likely take care of the issue. This assumes 3.6 or later; I believe that on older versions, this goes in indexDefaults. mergeScheduler class=org.apache.lucene.index.ConcurrentMergeScheduler int name=maxThreadCount1/int int name=maxMergeCount6/int /mergeScheduler Looking through the source code to confirm, this definitely seems like the case. Increasing maxMergeCount is likely going to speed up your indexing, at least by a little bit. A value of 6 is probably high enough for mere mortals, but you guys don't do anything small, so I won't begin to speculate what you'll need. If you are using spinning disks, you'll want maxThreadCount at 1. If you're using SSD, then you can likely increase that value. Thanks, Shawn
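The backpressure Shawn describes can be pictured with a toy model: indexing proceeds while pending merges stay within maxMergeCount, and stalls once they exceed it. This is an illustration only, not Lucene's actual ConcurrentMergeScheduler logic; the class and method names are invented:

```java
// Toy model of merge backpressure: indexing stalls once pending merges
// exceed maxMergeCount. Illustration only -- not Lucene code.
class MergeBackpressure {
    final int maxMergeCount;
    int pendingMerges = 0;

    MergeBackpressure(int maxMergeCount) {
        this.maxMergeCount = maxMergeCount;
    }

    void mergeStarted()  { pendingMerges++; }
    void mergeFinished() { pendingMerges--; }

    // True when indexing threads would stall, waiting for merges to drain.
    boolean indexingStalled() {
        return pendingMerges > maxMergeCount;
    }

    public static void main(String[] args) {
        MergeBackpressure m = new MergeBackpressure(2);
        m.mergeStarted();
        m.mergeStarted();
        System.out.println(m.indexingStalled()); // false: 2 pending, limit 2
        m.mergeStarted();
        System.out.println(m.indexingStalled()); // true: 3 pending exceeds limit
    }
}
```

Raising maxMergeCount in the config above raises the threshold in this model, which is why it can reduce the "too many merges...stalling" pauses.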
Re: SolrJ 4.3 to Solr 1.4
On 7/11/2013 2:24 PM, Jonathan Rochkind wrote: (If I _don't_ setParser to XML, and use the binary parser... I get a fully expected error about binary format corruption -- that part is expected and I understand it, that's why you have to use the XMLResponseParser instead). Am I not doing enough to my SolrJ 4.3 to get it to talk to the Solr 1.4 server in pure XML? I've set the parser to the XMLResponseParser, do I also have to somehow tell it to actually use the Solr 1.4 XML update handler or something? I don't entirely understand what I'm talking about. From everything I understand, it should be possible to make this work. For XML updates, the handler should be /update on both 1.4 and 4.x. There might be some additional steps that need to be taken, but without more info I'm not sure what those steps might be. Is there more to the client-side exception? Do you see anything in the server-side logs? If your server is logging at INFO, you should hopefully be able to see some of the actual request. Can you share a larger snippet of your SolrJ code? Thanks, Shawn
Re: Partial Matching in both query and field
1. My general process for a schema change (I know it's overkill) is delete the data directory, reload, index data, reload again. 2. I'm using schema version 1.5 on Solr 3.6.2. schema name=SearchSpringDefault version=1.5 3. LuceneQParser, but I've also tried dismax and edismax. Here's the solrQueryParser setting from my schema; I think OR is correct for this. solrQueryParser defaultOperator=OR/ James [image: SearchSpring | Findability Unleashed] James Bathgate | Sr. Developer Toll Free (888) 643-9043 x610 - Fax (719) 358-2027 4291 Austin Bluffs Pkwy #206 | Colorado Springs, CO 80918 www.searchspring.net http://www.searchspring.net On Thu, Jul 11, 2013 at 2:29 PM, Jack Krupansky j...@basetechnology.com wrote: A couple of possibilities: 1. Make sure to reload the core. 2. Check that the Solr schema version is new enough to recognize autoGeneratePhraseQueries. 3. What query parser are you using? -- Jack Krupansky -Original Message- From: James Bathgate Sent: Thursday, July 11, 2013 5:26 PM To: solr-user@lucene.apache.org Subject: Re: Partial Matching in both query and field I just noticed I pasted the wrong fieldType with the extra tokenizer not commented out.
fieldType name=ngram class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=1 splitOnCaseChange=0 splitOnNumerics=0 preserveOriginal=0/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l replacement=i replace=all/ filter class=solr.NGramFilterFactory minGramSize=4 maxGramSize=16/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.NGramTokenizerFactory minGramSize=4 maxGramSize=16 / filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.PatternReplaceFilterFactory pattern=[^A-Za-z0-9]+ replacement= replace=all/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l replacement=i replace=all/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer /fieldType [image: SearchSpring | Findability Unleashed] James Bathgate | Sr. Developer Toll Free (888) 643-9043 x610 - Fax (719) 358-2027 4291 Austin Bluffs Pkwy #206 | Colorado Springs, CO 80918 www.searchspring.net http://www.searchspring.net On Thu, Jul 11, 2013 at 2:15 PM, James Bathgate ja...@b7interactive.com wrote: Jack, This still isn't working. I just upgraded to 3.6.2 to verify that wasn't the issue.
Here's query information: lst name=params str name=debugQueryon/str str name=indenton/str str name=start0/str str name=q0_extrafield1_n:20454/str str name=rows10/str str name=version2.2/str /lst /lst result name=response numFound=0 start=0/ lst name=debug str name=rawquerystring0_extrafield1_n:20454/str str name=querystring0_extrafield1_n:20454/str str name=parsedqueryPhraseQuery(0_extrafield1_n:2o45 o454 2o454)/str str name=parsedquery_toString0_extrafield1_n:2o45 o454 2o454/str lst name=explain/ str name=QParserLuceneQParser/str Here's the applicable lines from schema.xml: fieldType name=ngram class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=1 splitOnCaseChange=0 splitOnNumerics=0 preserveOriginal=0/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.PatternReplaceFilterFactory pattern=0 replacement=o replace=all/ filter class=solr.PatternReplaceFilterFactory pattern=1|l replacement=i replace=all/ filter class=solr.NGramFilterFactory minGramSize=4 maxGramSize=16/ filter class=solr.RemoveDuplicatesTokenFilterFactory/ /analyzer analyzer
How to set a condition over stats result
Hello, I am trying to see how I can test the sum of values of an attribute across docs, i.e. whether sum(myfieldvalue) > 100. I know I can use the stats module, which compiles the sum of my attribute on a certain facet, but how can I test this result (i.e. is sum > 100) within my stats query? From what I read, it's not yet supported to perform a function on the stats module.. Any other way to do this? Cheers, Matt NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
POST question
I want to use a browser and use HTTP POST to add a single document (not a file) to Solr. I don't want to use cURL. I've made several attempts, such as the following: http://localhost:8080/solr/update?commit=truestream.type=text/xml;adddocfield name=id61234567/fieldfield name=titleWAR OF THE WORLDS/fielddoc/add I get the following message, which makes it appear the POST was successful, but when I query on the id, there are no results. I've committed in a separate post too, but again, no results. ?xml version=1.0 encoding=UTF-8 ? - response - lstname=responseHeader intname=status0/int intname=QTime15/int /lst /response It's probably a syntax error, but not sure. I'm using Solr 3.6 on Windows XP SP 3. Any help would be appreciated.
RE: POST question
Hi John, You can't make a browser do an HTTP POST just by entering a URL - that is an HTTP GET. So - use curl, or make a small application for doing the HTTP POST. Or even better: use a browser plugin. Several of these exist. Example: the DEV HTTP CLIENT extension for Chrome. Roland Villemoes -Original Message- From: John Randall [mailto:jmr...@yahoo.com] Sent: 12. juli 2013 00:12 To: solr-user@lucene.apache.org Subject: POST question I want to use a browser and use HTTP POST to add a single document (not a file) to Solr. I don't want to use cURL. I've made several attempts, such as the following: http://localhost:8080/solr/update?commit=truestream.type=text/xml;adddocfield name=id61234567/fieldfield name=titleWAR OF THE WORLDS/fielddoc/add I get following message which makes it appear the POST was successful, but when I query on the id, there are no results. I've commited in a separate post too, but again, no results. ?xml version=1.0 encoding=UTF-8 ? - response - lstname=responseHeader intname=status0/int intname=QTime15/int /lst /response It's probably a syntax error, but not sure. I'm using Solr 3.6 on Windows XP SP 3. Any help would be appreciated.
Re: POST question
On 7/11/2013 4:12 PM, John Randall wrote: I want to use a browser and use HTTP POST to add a single document (not a file) to Solr. I don't want to use cURL. I've made several attempts, such as the following: http://localhost:8080/solr/update?commit=truestream.type=text/xml;adddocfield name=id61234567/fieldfield name=titleWAR OF THE WORLDS/fielddoc/add I get following message which makes it appear the POST was successful, but when I query on the id, there are no results. I've commited in a separate post too, but again, no results. ?xml version=1.0 encoding=UTF-8 ? - response - lstname=responseHeader intname=status0/int intname=QTime15/int /lst /response This is actually not a POST. It's a GET -- that's the only kind of request you can make from a browser with a URL that's typed or pasted. In order to get a POST request from a browser, you need to have an HTML page with an HTML form in it and submit that form. I'm not going to go into how to do this here, because that is basic HTML stuff. If you use the stream.body parameter for your XML update, you might be able to use a GET request and have it actually work. http://wiki.apache.org/solr/UpdateXmlMessages#Updating_via_GET URL encoding the XML characters is required, as mentioned on that page. I recently tried to do this myself on Solr 4.4-SNAPSHOT, and it didn't work. I never did figure out why. It's probably more likely to work on a 3.x version. Thanks, Shawn
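For the stream.body GET that Shawn mentions, the required URL encoding can be done with Java's standard java.net.URLEncoder. This is a sketch only: buildUpdateUrl is a hypothetical helper, the host/port/path are taken from John's example, and whether the update actually succeeds still depends on the server accepting stream.body:

```java
import java.net.URLEncoder;

// Sketch: build a stream.body GET URL for an XML update, per
// http://wiki.apache.org/solr/UpdateXmlMessages#Updating_via_GET
// buildUpdateUrl is a hypothetical helper; base URL is John's example setup.
class StreamBodyUrl {
    static String buildUpdateUrl(String baseUrl, String xml) {
        try {
            // URL-encode the XML so <, >, quotes and spaces survive the query string.
            return baseUrl + "/update?commit=true&stream.body="
                    + URLEncoder.encode(xml, "UTF-8");
        } catch (java.io.UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String xml = "<add><doc>"
                + "<field name=\"id\">61234567</field>"
                + "<field name=\"title\">WAR OF THE WORLDS</field>"
                + "</doc></add>";
        System.out.println(buildUpdateUrl("http://localhost:8080/solr", xml));
    }
}
```

The printed URL can then be pasted into the browser as an ordinary GET, which is the workaround Shawn describes for browsers that cannot issue a POST from the address bar.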
Re: POST question
I'll try the plugin. Thanks. From: Roland Villemoes r...@alpha-solutions.dk To: solr-user@lucene.apache.org solr-user@lucene.apache.org; John Randall jmr...@yahoo.com Sent: Thursday, July 11, 2013 6:21 PM Subject: RE: POST question Hi John, You can't make a browser to a HTTP POST by adding a URL in a browser. You are doing a HTTP GET. So - use curl, or make a small application for doing the HTTP POST. Or even better: Use a browser plugin. Several of these exists. Example: DEV HTTP CLIENT extension for Chrome. Roland Villemoes -Original Message- From: John Randall [mailto:jmr...@yahoo.com] Sent: 12. juli 2013 00:12 To: solr-user@lucene.apache.org Subject: POST question I want to use a browser and use HTTP POST to add a single document (not a file) to Solr. I don't want to use cURL. I've made several attempts, such as the following: http://localhost:8080/solr/update?commit=truestream.type=text/xml;adddocfield name=id61234567/fieldfield name=titleWAR OF THE WORLDS/fielddoc/add I get following message which makes it appear the POST was successful, but when I query on the id, there are no results. I've commited in a separate post too, but again, no results. ?xml version=1.0 encoding=UTF-8 ? - response - lstname=responseHeader intname=status0/int intname=QTime15/int /lst /response It's probably a syntax error, but not sure. I'm using Solr 3.6 on Windows XP SP 3. Any help would be appreciated.
Re: POST question
I'll probably move to Solr 4.x, so I'm going to try a plugin instead. Thanks for your insights. From: Shawn Heisey s...@elyograg.org To: solr-user@lucene.apache.org Sent: Thursday, July 11, 2013 6:28 PM Subject: Re: POST question On 7/11/2013 4:12 PM, John Randall wrote: I want to use a browser and use HTTP POST to add a single document (not a file) to Solr. I don't want to use cURL. I've made several attempts, such as the following: http://localhost:8080/solr/update?commit=truestream.type=text/xml;adddocfield name=id61234567/fieldfield name=titleWAR OF THE WORLDS/fielddoc/add I get following message which makes it appear the POST was successful, but when I query on the id, there are no results. I've commited in a separate post too, but again, no results. ?xml version=1.0 encoding=UTF-8 ? - response - lstname=responseHeader intname=status0/int intname=QTime15/int /lst /response This is actually not a POST. It's a GET -- that's the only kind of request you can make from a browser with a URL that's typed or pasted. In order to get a POST request from a browser, you need to have an HTML page with an HTML form in it and submit that form. I'm not going to go into how to do this here, because that is basic HTML stuff. If you use the stream.body parameter for your XML update, you might be able to use a GET request and have it actually work. http://wiki.apache.org/solr/UpdateXmlMessages#Updating_via_GET URL encoding the XML characters is required, as mentioned on that page. I recently tried to do this myself on Solr 4.4-SNAPSHOT, and it didn't work. I never did figure out why. It's probably more likely to work on a 3.x version. Thanks, Shawn
preferred container for running SolrCloud
1) Jboss 2) Jetty 3) Tomcat 4) Other.. ?
Re: preferred container for running SolrCloud
We're running under Jetty. Sent from my iPhone On Jul 11, 2013, at 6:06 PM, Ali, Saqib docbook@gmail.com wrote: 1) Jboss 2) Jetty 3) Tomcat 4) Other.. ?
Re: preferred container for running SolrCloud
With the embedded Zookeeper or a separate Zookeeper? Also, have you run into any issues running SolrCloud on Jetty? On Thu, Jul 11, 2013 at 7:01 PM, Saikat Kanjilal sxk1...@hotmail.com wrote: We're running under jetty. Sent from my iPhone On Jul 11, 2013, at 6:06 PM, Ali, Saqib docbook@gmail.com wrote: 1) Jboss 2) Jetty 3) Tomcat 4) Other.. ?
Re: preferred container for running SolrCloud
In production, I'd highly recommend running ZooKeeper separately, as that'd give you, among other things, the liberty of shutting down a SolrCloud instance without taking the ensemble down with it. I haven't heard of or seen any SolrCloud issues while running it on Jetty. On Fri, Jul 12, 2013 at 7:57 AM, Ali, Saqib docbook@gmail.com wrote: With the embedded Zookeeper or separate Zookeeper? Also have run into any issues with running SolrCloud on jetty? On Thu, Jul 11, 2013 at 7:01 PM, Saikat Kanjilal sxk1...@hotmail.com wrote: We're running under jetty. Sent from my iPhone On Jul 11, 2013, at 6:06 PM, Ali, Saqib docbook@gmail.com wrote: 1) Jboss 2) Jetty 3) Tomcat 4) Other.. ? -- Anshum Gupta http://www.anshumgupta.net
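As a sketch, pointing a Solr 4.x node (running under the bundled Jetty) at an external ZooKeeper ensemble is done with the zkHost system property; the host names and ports below are placeholders, and the ensemble itself must already be running:

```
java -DzkHost=zk1:2181,zk2:2181,zk3:2181 -jar start.jar
```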
Re: preferred container for running SolrCloud
Embedded Zookeeper is only for dev. Production needs to run a ZK cluster. --wunder On Jul 11, 2013, at 7:27 PM, Ali, Saqib wrote: With the embedded Zookeeper or separate Zookeeper? Also have run into any issues with running SolrCloud on jetty? On Thu, Jul 11, 2013 at 7:01 PM, Saikat Kanjilal sxk1...@hotmail.comwrote: We're running under jetty. Sent from my iPhone On Jul 11, 2013, at 6:06 PM, Ali, Saqib docbook@gmail.com wrote: 1) Jboss 2) Jetty 3) Tomcat 4) Other.. ?
Re: preferred container for running SolrCloud
Thanks, Walter. And the container? On Thu, Jul 11, 2013 at 7:55 PM, Walter Underwood wun...@wunderwood.org wrote: Embedded Zookeeper is only for dev. Production needs to run a ZK cluster. --wunder On Jul 11, 2013, at 7:27 PM, Ali, Saqib wrote: With the embedded Zookeeper or separate Zookeeper? Also have run into any issues with running SolrCloud on jetty? On Thu, Jul 11, 2013 at 7:01 PM, Saikat Kanjilal sxk1...@hotmail.com wrote: We're running under jetty. Sent from my iPhone On Jul 11, 2013, at 6:06 PM, Ali, Saqib docbook@gmail.com wrote: 1) Jboss 2) Jetty 3) Tomcat 4) Other.. ?
Re: preferred container for running SolrCloud
We use Tomcat for everything. It might not be the best, but it is what our Ops group is used to. wunder On Jul 11, 2013, at 7:58 PM, Ali, Saqib wrote: Thanks Walter. And the container.. On Thu, Jul 11, 2013 at 7:55 PM, Walter Underwood wun...@wunderwood.orgwrote: Embedded Zookeeper is only for dev. Production needs to run a ZK cluster. --wunder On Jul 11, 2013, at 7:27 PM, Ali, Saqib wrote: With the embedded Zookeeper or separate Zookeeper? Also have run into any issues with running SolrCloud on jetty? On Thu, Jul 11, 2013 at 7:01 PM, Saikat Kanjilal sxk1...@hotmail.com wrote: We're running under jetty. Sent from my iPhone On Jul 11, 2013, at 6:06 PM, Ali, Saqib docbook@gmail.com wrote: 1) Jboss 2) Jetty 3) Tomcat 4) Other.. ? -- Walter Underwood wun...@wunderwood.org
RE: preferred container for running SolrCloud
Separate Zookeeper. Date: Thu, 11 Jul 2013 19:27:18 -0700 Subject: Re: preferred container for running SolrCloud From: docbook@gmail.com To: solr-user@lucene.apache.org With the embedded Zookeeper or separate Zookeeper? Also have run into any issues with running SolrCloud on jetty? On Thu, Jul 11, 2013 at 7:01 PM, Saikat Kanjilal sxk1...@hotmail.comwrote: We're running under jetty. Sent from my iPhone On Jul 11, 2013, at 6:06 PM, Ali, Saqib docbook@gmail.com wrote: 1) Jboss 2) Jetty 3) Tomcat 4) Other.. ?
RE: preferred container for running SolrCloud
One last thing: no issues with Jetty. The issues we did have were actually with running separate ZooKeeper clusters. From: sxk1...@hotmail.com To: solr-user@lucene.apache.org Subject: RE: preferred container for running SolrCloud Date: Thu, 11 Jul 2013 20:13:27 -0700 Separate Zookeeper. Date: Thu, 11 Jul 2013 19:27:18 -0700 Subject: Re: preferred container for running SolrCloud From: docbook@gmail.com To: solr-user@lucene.apache.org With the embedded Zookeeper or separate Zookeeper? Also have run into any issues with running SolrCloud on jetty? On Thu, Jul 11, 2013 at 7:01 PM, Saikat Kanjilal sxk1...@hotmail.com wrote: We're running under jetty. Sent from my iPhone On Jul 11, 2013, at 6:06 PM, Ali, Saqib docbook@gmail.com wrote: 1) Jboss 2) Jetty 3) Tomcat 4) Other.. ?
Re: How to set a condition over stats result
None that I know of, short of writing a custom search component. Seriously, you could hack up a copy of the stats component with your own logic.

Actually... this may be a case for the new, proposed Script Request Handler, which would let you execute a query and then do any custom JavaScript logic you wanted. When we get that feature, it might be interesting to implement a variation of the standard stats component as a JavaScript script, and then people could easily hack it for requests such as yours. Fascinating.

-- Jack Krupansky

-----Original Message-----
From: Matt Lieber
Sent: Thursday, July 11, 2013 6:08 PM
To: solr-user@lucene.apache.org
Subject: How to set a condition over stats result

Hello,

I am trying to see how I can test the sum of values of an attribute across docs, i.e. whether sum(myfieldvalue) > 100. I know I can use the stats component, which computes the sum of my attribute over a certain facet, but how can I perform a test on this result (i.e., is sum > 100) within my stats query? From what I read, applying a function to the stats result is not yet supported. Any other way to do this?

Cheers, Matt

NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
Re: How to set a condition over stats result
What if you perform sub(sum(myfieldvalue),100) > 0 using frange?

From: Jack Krupansky j...@basetechnology.com
To: solr-user@lucene.apache.org
Sent: Friday, July 12, 2013 7:44 AM
Subject: Re: How to set a condition over stats result

None that I know of, short of writing a custom search component. Seriously, you could hack up a copy of the stats component with your own logic. Actually... this may be a case for the new, proposed Script Request Handler, which would let you execute a query and then do any custom JavaScript logic you wanted. When we get that feature, it might be interesting to implement a variation of the standard stats component as a JavaScript script, and then people could easily hack it for requests such as yours. Fascinating.

-- Jack Krupansky

-----Original Message-----
From: Matt Lieber
Sent: Thursday, July 11, 2013 6:08 PM
To: solr-user@lucene.apache.org
Subject: How to set a condition over stats result

Hello, I am trying to see how I can test the sum of values of an attribute across docs, i.e. whether sum(myfieldvalue) > 100. I know I can use the stats component, which computes the sum of my attribute over a certain facet, but how can I perform a test on this result (i.e., is sum > 100) within my stats query? From what I read, applying a function to the stats result is not yet supported. Any other way to do this? Cheers, Matt
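For reference, the frange filter being suggested would look roughly like the sketch below (the field name is a placeholder). One caveat worth flagging: function queries such as sum() are evaluated per document, so this filters documents by a per-document computed value; it does not test an aggregate over the whole result set, which is what the original question asks for:

```
fq={!frange l=0}sub(sum(myfieldvalue,0),100)
```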
How to set a condition on the number of docs found
Hello there,

I would like to be able to know whether I got over a certain threshold of doc results, i.e. test that (result.numFound > 10) evaluates to true. Is there a way to do this? I can't seem to find how, other than doing the test in the client app, which is not great.

Thanks, Matt
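As far as I know there is no built-in way to have Solr itself fail the request on numFound, so the check usually does end up client-side. A minimal sketch of that check against a canned response (the dict shape follows Solr's standard JSON response format; in practice `raw` would come from an HTTP request to /select with wt=json):

```python
import json

# canned Solr-style JSON response standing in for the real HTTP reply
raw = '{"response": {"numFound": 42, "start": 0, "docs": []}}'

def num_found_over(resp, threshold):
    """Return True if the response matched more than `threshold` documents."""
    return resp["response"]["numFound"] > threshold

resp = json.loads(raw)
print(num_found_over(resp, 10))  # prints True, since 42 > 10
```

The check is cheap because numFound is in the response header of every search, even with rows=0, so no documents need to be fetched just to test the threshold.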
Re: Usage of CloudSolrServer?
Hi, I am using CloudSolrServer to connect to SolrCloud, and I'm indexing documents through the SolrJ API using a CloudSolrServer object. The index is triggered on the master node of a collection, but when I try to find the status of the loading, the message is returned from a replica where the status is null. How do I find out which instance CloudSolrServer is connecting to?

--
View this message in context: http://lucene.472066.n3.nabble.com/Usage-of-CloudSolrServer-tp4056052p4077471.html
Sent from the Solr - User mailing list archive at Nabble.com.