Confusion around Binary/XML in SolrJ
I am using Solr 1.4 dev in a multicore setup. Each core's solrconfig.xml has the following lines:

    <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
    <requestHandler name="/update/javabin" class="solr.BinaryUpdateRequestHandler" />

I am using SolrJ with EmbeddedSolrServer. When I add a POJO (with @Field annotations), the data does not get indexed, whereas if I add a SolrInputDocument, the data gets indexed.

PS: In both cases I add the data using addBean/add, then commit, followed by optimize.

PPS: The final intention is to do all indexing and searching in the binary format, since I am running on a single machine.

Could someone provide insights on this issue?

Thanks!
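For readers unfamiliar with bean binding: addBean reads the annotated fields of the object and copies them into a document before sending it, so both paths end up adding the same kind of document. The sketch below illustrates that idea in plain Java only. The annotation IndexField, the bean Item, and the class BeanBinder are hypothetical stand-ins, not SolrJ's @Field or its binder, and the real implementation may differ.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for SolrJ's @Field; NOT the real annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface IndexField {
    // Index field name; an empty value means "use the Java field name".
    String value() default "";
}

// Example bean; the field names are made up for illustration.
class Item {
    @IndexField("id") String id;
    @IndexField String name;
    String notIndexed; // no annotation, so the binder skips it

    Item(String id, String name, String notIndexed) {
        this.id = id;
        this.name = name;
        this.notIndexed = notIndexed;
    }
}

final class BeanBinder {
    // Copies every annotated field of the bean into a name -> value map.
    static Map<String, Object> toDocument(Object bean) {
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Field f : bean.getClass().getDeclaredFields()) {
            IndexField ann = f.getAnnotation(IndexField.class);
            if (ann == null) {
                continue; // unannotated fields are not indexed
            }
            f.setAccessible(true);
            String key = ann.value().isEmpty() ? f.getName() : ann.value();
            try {
                doc.put(key, f.get(bean));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read field " + f.getName(), e);
            }
        }
        return doc;
    }
}
```

Calling BeanBinder.toDocument(new Item("MA147LL/A", "iPod", "x")) yields a map with the keys id and name only. If a bean-based add silently indexes nothing, checking that the fields are actually annotated and populated is a reasonable first step.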
Re: Confusion around Binary/XML in SolrJ
Another observation: I am also unable to delete documents using the EmbeddedSolrServer (on a specific core).

Steps:

1) I have 2 cores (core0, core1). Each of them has ~10 records.

2) I run the following:

    System.setProperty("solr.solr.home", "/home/user/projects/solr/example/multi");
    File home = new File("/home/user/projects/solr/example/multi");
    File f = new File(home, "solr.xml");
    CoreContainer coreContainer = new CoreContainer();
    coreContainer.load("/home/user/projects/solr/example/multi", f);
    SolrServer server = new EmbeddedSolrServer(coreContainer, "core1");
    server.deleteByQuery("*:*");
    server.commit();
    server.optimize();

3) When I check the status using http://localhost:8983/solr/admin/cores?action=STATUS , I still see the same numDocs.

4) If I delete using CommonsHttpSolrServer instead, it works fine:

    String url = "http://localhost:8983/solr/core1";
    CommonsHttpSolrServer server = new CommonsHttpSolrServer(url);
    server.setSoTimeout(1000); // socket read timeout
    server.setConnectionTimeout(100);
    server.setDefaultMaxConnectionsPerHost(100);
    server.setMaxTotalConnections(100);
    server.setFollowRedirects(false); // defaults to false
    server.setAllowCompression(true);
    server.setMaxRetries(1); // defaults to 0. > 1 not recommended.
    server.setRequestWriter(new BinaryRequestWriter());
    server.deleteByQuery("*:*");
    server.commit();
    server.optimize();

Thanks!

On Mon, Jul 20, 2009 at 3:26 PM, Code Tester codetester.codetes...@gmail.com wrote:
I am using solr 1.4 dev in a multicore way. [...]
Re: Confusion around Binary/XML in SolrJ
Sorry everyone. Found the issue. It was because of a very stupid assumption: my code and Solr were running as 2 different processes! (The weird part is that when I ran the code using EmbeddedSolrServer, it did not throw any exception saying that a server was already running on that port.)

Thanks!

On Mon, Jul 20, 2009 at 3:41 PM, Code Tester codetester.codetes...@gmail.com wrote:
Another observation: I am even unable to delete documents using the EmbeddedSolrServer ( on a specific core ) [...]
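Note on the missing exception: EmbeddedSolrServer runs the cores inside the calling JVM and does not open an HTTP port, so it cannot collide with a Solr server that is already listening. One way to notice the "two processes" situation early is to probe the port before starting the embedded server. The sketch below is plain Java (no SolrJ); the class name PortProbe and the timeout value are illustrative assumptions, and 8983 is simply the port from the URLs in this thread.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortProbe {
    // Returns true if something accepts TCP connections on host:port
    // within timeoutMillis, false on refusal, timeout, or other I/O error.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 8983 is the port used by the example Solr server in this thread.
        if (isListening("localhost", 8983, 500)) {
            System.out.println("A server is already listening on port 8983; "
                    + "an embedded server would be a second, separate process.");
        } else {
            System.out.println("Nothing is listening on port 8983.");
        }
    }
}
```

This only detects that some process is listening on the port, not that it is Solr or that it serves the same index directory.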
Re: Solr MultiCore query
Thanks ahammad for the quick reply. As suggested, I am trying out the multicore way of implementing the search. I am working through the multicore example and am stuck on an issue. Here is what I did and the issue I am facing:

1) Downloaded 1.4 and started the multicore example using

    java -Dsolr.solr.home=multicore -jar start.jar

2) There were 2 files under example/multicore/exampledocs/, which I added to the 2 cores respectively. (In total, 3 docs are present in those 2 files, and all of them contain the word 'ipod'.)

3) When I query using

    http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*:*

I get all 3 results. But when I query using

    http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=ipod

I get no results :(

What could be the issue?

Thanks!

On Fri, Jul 17, 2009 at 7:20 PM, ahammad ahmed.ham...@gmail.com wrote:

Hello,

I'm not sure what the best way is to do this, but I have done something identical. I have the same requirements, i.e. several datasources. I also used SolrJ and JSP for this.

The way I ended up doing it was to create a multicore environment, one core per datasource. When I do a query across several datasources, I use shards. Solr automatically returns a hybrid result set that way, sorted by Solr's default scoring.

Faceting comes into the picture when you want to show the number of documents per datasource and have the ability to narrow down the result set. The way I did it was to add a field called dataSource to all the documents and inject a default value of the data source name (in your case, D1, D2 ...). You can do this by adding this in the schema:

    <field name="dataSource" type="string" indexed="true" stored="true" required="true" default="D1"/>

When you perform a query across multiple datasources, you will use shards.
Here is an example:

    http://localhost:8080/solr/core1/select?shards=localhost:8080/solr/core1,localhost:8080/solr/core2&q=some query

That will search on both cores 1 and 2.

To facet on the datasource in order to categorize the result set, you can simply add this snippet to the query:

    facet=on&facet.field=dataSource

This will return the datasources that are defined, with their number of results for the query. Making the facet results clickable in order to narrow down the results can be achieved by adding a filter to the query that restricts it to a specific dataSource.

I actually ended up creating a fairly intuitive front-end for my system with faceting, filtering, paging etc., all using JSP and SolrJ. SolrJ is powerful enough to handle all of the backend processing.

Good luck!

joe_coder wrote:
I missed adding some size-related information in the query above. D1 and D2 would have close to 1 million records each; D3 would have ~10 million records. Thanks!
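Because the separators between parameters are easy to lose when these URLs are typed or pasted by hand, it can help to assemble the query string in code. The following is a minimal sketch in plain Java, assuming the host, port, and core names from the example above; the class name ShardQueryUrl is made up for illustration, and fq is used here as the filter parameter for narrowing to one dataSource.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class ShardQueryUrl {
    // Joins the parameters with '?' and '&' and URL-encodes keys and values.
    static String build(String baseUrl, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(separator)
              .append(encode(e.getKey()))
              .append('=')
              .append(encode(e.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 is always supported", ex);
        }
    }

    public static void main(String[] args) {
        // Insertion order is kept so the URL reads like the hand-written one.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("shards", "localhost:8080/solr/core1,localhost:8080/solr/core2");
        params.put("q", "some query");
        params.put("facet", "on");
        params.put("facet.field", "dataSource");
        params.put("fq", "dataSource:D1"); // narrow the results to one datasource
        System.out.println(build("http://localhost:8080/solr/core1/select", params));
    }
}
```

The encoded form looks different from the hand-written URL (for example, ':' becomes %3A and the space becomes '+'), but the server decodes it back to the same parameter values.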
Re: Solr MultiCore query
Both schema.xml files (in example/multicore/core0/conf and example/multicore/core1/conf) already have

    <defaultSearchField>name</defaultSearchField>

Here are the query responses:

1) http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*:*

    <response>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">254</int>
      </lst>
      <result name="response" numFound="3" start="0">
        <doc>
          <str name="id">MA147LL/A</str>
          <str name="name">Apple 60 GB iPod with Video Playback Black</str>
        </doc>
        <doc>
          <str name="id">F8V7067-APL-KIT</str>
          <str name="name">Belkin Mobile Power Cord for iPod w/ Dock</str>
        </doc>
        <doc>
          <str name="id">IW-02</str>
          <str name="name">iPod &amp; iPod Mini USB 2.0 Cable</str>
        </doc>
      </result>
    </response>

2) http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=ipod

No result

3) http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=name:ipod

No result

What may be happening?

Thanks!

On Fri, Jul 17, 2009 at 11:37 PM, ahammad ahmed.ham...@gmail.com wrote:

Hello joe_coder,

Are you using the default example docs in your queries? If so, then I see that the word ipod appears in a field called name. By default, the default search field (defined in schema.xml) is the field called text. This means that when you submit a query without specifying which field to look in (using the field:query notation), Solr automatically assumes that you are looking in the field called text. If you change your query to q=name:ipod, you should get the results back.

One way to prevent this is to change your default search field to something else. Alternatively, if you want to search on multiple fields, you can copy all those fields to the text field and go from there. This can be useful if, for example, you had a book library to search through. You may need to search on title, short summary, description etc. simultaneously. You can copy all those things to the text field and then search on the text field, which contains all the information that you wanted to search on.
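When comparing several queries like the ones above, it can be quicker to read numFound from each XML response than to eyeball the whole document. The following is a small sketch using only the JDK's XML parser; the class name NumFound is made up for illustration, and the input in main is a trimmed copy of the response shown in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

final class NumFound {
    // Returns the numFound attribute of the first <result> element.
    static int parse(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("response is not well-formed XML", e);
        }
        Element result = (Element) doc.getElementsByTagName("result").item(0);
        if (result == null) {
            throw new IllegalArgumentException("no <result> element in response");
        }
        return Integer.parseInt(result.getAttribute("numFound"));
    }

    public static void main(String[] args) {
        // Trimmed copy of the q=*:* response from this thread.
        String xml = "<response>"
                + "<lst name=\"responseHeader\"><int name=\"status\">0</int></lst>"
                + "<result name=\"response\" numFound=\"3\" start=\"0\">"
                + "<doc><str name=\"id\">MA147LL/A</str></doc>"
                + "</result></response>";
        System.out.println("numFound = " + parse(xml));
    }
}
```

A numFound of 0 for q=name:ipod, alongside a non-zero count for q=*:*, points at how the name field is analyzed or queried rather than at the shards setup.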