Confusion around Binary/XML in SolrJ

2009-07-20 Thread Code Tester
I am using Solr 1.4-dev in a multicore setup. Each core's
solrconfig.xml has the following lines:

<requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
<requestHandler name="/update/javabin" class="solr.BinaryUpdateRequestHandler" />

I am using SolrJ with an EmbeddedSolrServer. When I try to add a POJO (with
@Field annotations), the data does not get indexed. However, if I add a
SolrInputDocument instead, the data gets indexed.

PS: In both cases I add the data using addBean/add, then commit, followed by
optimize.

PPS: The end goal is for all indexing and searching to use the binary (javabin)
format, since I am running everything on a single machine.
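
For reference, here is roughly what I am doing in both cases (Item is just a
placeholder bean with the same shape as my real class; server is the
EmbeddedSolrServer instance):

// @Field is org.apache.solr.client.solrj.beans.Field
public class Item {
    @Field
    public String id;

    @Field
    public String name;
}

// POJO path (does not get indexed for me)
Item item = new Item();
item.id = "1";
item.name = "test";
server.addBean(item);
server.commit();
server.optimize();

// SolrInputDocument path (works)
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "1");
doc.addField("name", "test");
server.add(doc);
server.commit();
server.optimize();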

Could someone provide insight into this issue?

Thanks!


Re: Confusion around Binary/XML in SolrJ

2009-07-20 Thread Code Tester
Another observation:

I am also unable to delete documents using the EmbeddedSolrServer (on a
specific core).

Steps:

1) I have 2 cores (core0, core1). Each of them has ~10 records.

2) System.setProperty("solr.solr.home", "/home/user/projects/solr/example/multi");
File home = new File("/home/user/projects/solr/example/multi");
File f = new File(home, "solr.xml");
CoreContainer coreContainer = new CoreContainer();
coreContainer.load("/home/user/projects/solr/example/multi", f);
SolrServer server = new EmbeddedSolrServer(coreContainer, "core1");

server.deleteByQuery("*:*");
server.commit();
server.optimize();

3) When I check the status using
http://localhost:8983/solr/admin/cores?action=STATUS , I still see the same
numDocs.

4) If I try deleting using CommonsHttpSolrServer, it works fine:
String url = "http://localhost:8983/solr/core1";
CommonsHttpSolrServer server = new CommonsHttpSolrServer(url);
server.setSoTimeout(1000);  // socket read timeout
server.setConnectionTimeout(100);
server.setDefaultMaxConnectionsPerHost(100);
server.setMaxTotalConnections(100);
server.setFollowRedirects(false);  // defaults to false
server.setAllowCompression(true);
server.setMaxRetries(1);  // defaults to 0; 1 not recommended
server.setRequestWriter(new BinaryRequestWriter());

server.deleteByQuery("*:*");
server.commit();
server.optimize();

Thanks!


Re: Confusion around Binary/XML in SolrJ

2009-07-20 Thread Code Tester
Sorry everyone, I found the issue. It was caused by a very stupid assumption
on my part.

My code and Solr were running as 2 different processes! (The weird part is
that when I ran the code using EmbeddedSolrServer, it did not throw any
exception indicating that a server was already running on that port.)

Thanks!


Re: Solr MultiCore query

2009-07-17 Thread Code Tester
Thanks ahammad for the quick reply.

As suggested, I am trying out the multicore way of implementing the search. I
am working through the multicore example and getting stuck on an issue. Here
is what I did and the issue I am facing:

1) Downloaded 1.4 and started the multicore example using java
-Dsolr.solr.home=multicore -jar start.jar

2) There were 2 files present under example/multicore/exampledocs/, which I
added to the 2 cores respectively. (In total, 3 docs are present in those 2
files, and all of them contain the word 'ipod'.)

3) When I query using
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*:*
I get all 3 results.

But when I query using
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=ipod
I get no results :(

What could be the issue?

Thanks!


On Fri, Jul 17, 2009 at 7:20 PM, ahammad ahmed.ham...@gmail.com wrote:


 Hello,

 I'm not sure what the best way is to do this, but I have done something
 identical.

 I have the same requirements, i.e. several datasources. I also used SolrJ and
 JSP for this. The way I ended up doing it was to create a multicore
 environment, one core per datasource. When I do a query across several
 datasources, I use shards. Solr automatically returns a hybrid result set
 that way, sorted by Solr's default scoring.

 Faceting comes into the picture when you want to show the number of documents
 per datasource and have the ability to narrow down the result set. The way I
 did it was to add a field called dataSource to all the documents and inject
 it with a default value of the data source name (in your case, D1, D2 ...).
 You can do this by adding this to the schema:

 <field name="dataSource" type="string" indexed="true" stored="true"
 required="true" default="D1"/>

 When you perform a query across multiple datasources, you will use shards.
 Here is an example:

 http://localhost:8080/solr/core1/select?shards=localhost:8080/solr/core1,localhost:8080/solr/core2&q=some query

 That will search on both cores 1 and 2.
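
 From SolrJ, the same shard query can be expressed roughly like this (a minimal
 sketch; the server URL and query string are placeholders):

 CommonsHttpSolrServer server =
     new CommonsHttpSolrServer("http://localhost:8080/solr/core1");
 SolrQuery query = new SolrQuery("some query");
 // distribute the query across both cores
 query.set("shards", "localhost:8080/solr/core1,localhost:8080/solr/core2");
 QueryResponse rsp = server.query(query);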

 To facet on the datasource in order to categorize the result set, you can
 simply add this snippet to the query:

 &facet=on&facet.field=dataSource

 This will return the datasources that are defined, along with the number of
 results each has for the query.

 Making the facet results clickable in order to narrow down the results can
 be achieved by adding a filter to the query and filtering on a specific
 dataSource. I actually ended up creating a fairly intuitive front-end for my
 system with faceting, filtering, paging, etc., all using JSP and SolrJ. SolrJ
 is powerful enough to handle all of the backend processing.
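
 As a rough sketch of that (reusing the core URLs and the dataSource field
 from above; the query string and filter value are just placeholders):

 SolrQuery query = new SolrQuery("some query");
 query.set("shards", "localhost:8080/solr/core1,localhost:8080/solr/core2");
 query.setFacet(true);
 query.addFacetField("dataSource");
 // when a facet value is clicked, narrow the results to that datasource
 query.addFilterQuery("dataSource:D1");

 QueryResponse rsp = server.query(query);
 // facet counts per datasource
 for (FacetField.Count count : rsp.getFacetField("dataSource").getValues()) {
     System.out.println(count.getName() + ": " + count.getCount());
 }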

 Good luck!






 joe_coder wrote:
 
  I missed adding some size related information in the query above.
 
  D1 and D2 would have close to 1 million records each
  D3 would have ~10 million records.
 
  Thanks!
 





Re: Solr MultiCore query

2009-07-17 Thread Code Tester
Both schema.xml files (in example/multicore/core0/conf and
example/multicore/core1/conf) already have

<defaultSearchField>name</defaultSearchField>

Here are the query responses:

1)
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*:*

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">254</int>
  </lst>
  <result name="response" numFound="3" start="0">
    <doc>
      <str name="id">MA147LL/A</str>
      <str name="name">Apple 60 GB iPod with Video Playback Black</str>
    </doc>
    <doc>
      <str name="id">F8V7067-APL-KIT</str>
      <str name="name">Belkin Mobile Power Cord for iPod w/ Dock</str>
    </doc>
    <doc>
      <str name="id">IW-02</str>
      <str name="name">iPod &amp; iPod Mini USB 2.0 Cable</str>
    </doc>
  </result>
</response>

2)
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=ipod
No results

3)
http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=name:ipod
No results

What may be happening?

Thanks!

On Fri, Jul 17, 2009 at 11:37 PM, ahammad ahmed.ham...@gmail.com wrote:


 Hello joe_coder,

 Are you using the default example docs in your queries?

 If so, then I see that the word ipod appears in a field called name. By
 default, the default search field (defined in schema.xml) is the field
 called text. This means that when you submit a query without specifying
 which field to search in (using the field:query notation), Solr automatically
 assumes that you are looking in the field called text.

 If you change your query to q=name:ipod, you should get the results back.

 One way to prevent this is to change your default search field to something
 else. Alternatively, if you want to search on multiple fields, you can copy
 all those fields to the text field and go from there. This can be useful
 if, for example, you had a book library to search through. You may need to
 search on title, short summary, description, etc. simultaneously. You can
 copy all those things to the text field and then search on the text field,
 which contains all the information that you wanted to search on.
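
 For example, the copying itself can be set up with copyField declarations in
 schema.xml, roughly like this (the source field names are just illustrative):

 <copyField source="title" dest="text"/>
 <copyField source="summary" dest="text"/>
 <copyField source="description" dest="text"/>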

