Re: Using solr with cassandra

2015-08-30 Thread Doug Turnbull
Hi, my colleague Chris Bradford had a similar issue and ended up using Spark
as a Cassandra-to-Solr pipeline. He summarized the work in his talk:

https://m.youtube.com/watch?v=t4ONAI6YkPI&list=PL-x35fyliRwjSJ3D50uXcvJc_lFOTkfLS&index=1

On Sunday, August 30, 2015, bipin bipin@gmail.com wrote:

 Hi Upayavira, DataStax is not an option, so I will have to write my own
 tool.
 Can you give any pointers on how data is written into Solr? I am new to
 Solr. All I could find is that interaction happens using curl requests. So is
 HTTP the only way to add documents?



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Using-solr-with-cassandra-tp4226137p4226143.html
 Sent from the Solr - User mailing list archive at Nabble.com.



-- 
*Doug Turnbull **| *Search Relevance Consultant | OpenSource Connections
http://opensourceconnections.com, LLC | 240.476.9983
Author: Relevant Search http://manning.com/turnbull
This e-mail and all contents, including attachments, is considered to be
Company Confidential unless explicitly stated otherwise, regardless
of whether attachments are marked as such.


Re: commit of xml update by AJAX

2015-08-30 Thread Upayavira


On Sat, Aug 29, 2015, at 05:30 PM, Szűcs Roland wrote:
 Hello SOLR experts,
 
 I am new to Solr, as you will see from my problem. I am just trying to
 understand how Solr works. I use one core (BandW) on my local machine and
 I use JavaScript for learning purposes.
 
 I have a test schema.xml with two fields: id and title. I managed to run
 queries with faceting, autocomplete, etc. In all cases I used the Ajax POST
 method. For example, my search was (searchWithSuggest.searchAjaxRequest is
 an XMLHttpRequest object):
 var s = document.getElementById(searchWithSuggest.inputBoxId).value;
 var params = 'q=' + s + '&start=0&rows=10';
 a = searchWithSuggest.solrServer + '/query';
 searchWithSuggest.searchAjaxRequest.open("POST", a, true);
 searchWithSuggest.searchAjaxRequest.setRequestHeader("Content-type",
     "application/x-www-form-urlencoded");
 searchWithSuggest.searchAjaxRequest.send(encodeURIComponent(params));
 
 It worked fine. I thought that an XML update would work the same way, so I
 tried to add and index one new document by XML (a is an XMLHttpRequest
 object):
 a.open("POST", "http://localhost:8983/solr/bandw/update", true);
 a.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
 a.send(encodeURIComponent("stream.body=<add commitWithin='5000'><doc>" +
     "<field name='id'>3222</field><field name='title'>Blade</field></doc></add>"));
 
 I got a response with the error: missing content stream.
 
 I changed only the a.open function call to this one:
 a.open("POST", "http://localhost:8983/solr/bandw/update?commit=true", true);
 The rest of the code did not change.
 Finally, I got a response with no error from Solr. Later it turned out that
 the new doc was not indexed at all.
 
 My questions:
 1. If I get no error from Solr, what is wrong with the second solution and
 how can I fix it?
 2. Is there any way to put all the parameters into the a.send call, as in
 the case of queries? I tried
 a.send(encodeURIComponent("commit=true&stream.body=<add commitWithin='5000'><doc>" +
     "<field name='id'>3222</field><field name='title'>Blade</field></doc></add>"));
 but it did not work.
 3. Why do 95% of the examples in the Solr wiki pages relate to curl? Is this
 the most efficient alternative? Is there a mapping between curl syntax and
 the POST request?
 
 Best Regards,
 Roland

You're using a POST to fake a GET - just make the Content-type text/xml
(or application/xml, I forget) and call a.send("<add></add>");

You may need the encodeURIComponent, not sure.

The stream.body feature allows you to do an HTTP GET that has a stream
within it, but you are already doing a POST so it isn't needed.

Upayavira
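
The advice above can be sketched roughly as follows. This is a minimal
illustration, not code from the thread: the helper names (escapeXml,
buildAddXml, postXmlUpdate) are hypothetical, and it assumes the XML is sent
unencoded as the raw POST body with Content-type application/xml.

```javascript
// Escape characters that are special inside XML text and attribute values.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build an <add> message for one document given as a plain object.
function buildAddXml(doc, commitWithinMs) {
  var fields = Object.keys(doc).map(function (name) {
    return '<field name="' + escapeXml(name) + '">' +
      escapeXml(doc[name]) + "</field>";
  }).join("");
  return '<add commitWithin="' + commitWithinMs + '"><doc>' +
    fields + "</doc></add>";
}

// Send the XML as the request body. `xhr` is any XMLHttpRequest-like object.
function postXmlUpdate(xhr, updateUrl, xml) {
  xhr.open("POST", updateUrl, true);
  xhr.setRequestHeader("Content-type", "application/xml");
  // Assumption: no stream.body parameter and no URL encoding,
  // because the body itself is the XML document.
  xhr.send(xml);
}
```

In a browser this might be called as
postXmlUpdate(new XMLHttpRequest(), "http://localhost:8983/solr/bandw/update",
buildAddXml({ id: "3222", title: "Blade" }, 5000)).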


Re: Using solr with cassandra

2015-08-30 Thread Upayavira


On Sun, Aug 30, 2015, at 10:43 AM, bipin wrote:
 Hi all,
 
 How can I integrate Solr with Cassandra? I am using Cassandra as my
 primary database. I have seen Solandra, but the project is dead. What ways
 are available now?

I think you have two choices - use the DataStax product, or write your
own tool to push stuff to Solr at the same time as it goes into
Cassandra.

Upayavira
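
As a rough sketch of the "write your own tool" option, the functions below
turn rows already read from Cassandra into batched HTTP update requests for
Solr's JSON update format. Everything here is an assumption for illustration:
the function names, the core name "mycore", the row shape and the batch size
are hypothetical, and reading from Cassandra and actually sending the request
are left to the caller.

```javascript
// Split rows into batches so that each HTTP request carries many documents.
function toBatches(rows, batchSize) {
  var batches = [];
  for (var i = 0; i < rows.length; i += batchSize) {
    batches.push(rows.slice(i, i + batchSize));
  }
  return batches;
}

// Describe one update request: a JSON array of documents posted to /update.
// commitWithin asks Solr to commit within the given number of milliseconds.
function buildSolrJsonUpdate(solrBaseUrl, core, rows, commitWithinMs) {
  return {
    method: "POST",
    url: solrBaseUrl + "/" + encodeURIComponent(core) +
      "/update?commitWithin=" + commitWithinMs,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(rows)
  };
}

// Build one request per batch; the caller sends them with any HTTP client.
function buildUpdateRequests(solrBaseUrl, core, rows, batchSize, commitWithinMs) {
  return toBatches(rows, batchSize).map(function (batch) {
    return buildSolrJsonUpdate(solrBaseUrl, core, batch, commitWithinMs);
  });
}
```

Batching is a design choice rather than a requirement: sending one document
per request also works, but many small requests are a common source of
HTTP overhead.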


Re: Dynamic field rule plugin?

2015-08-30 Thread Upayavira
On Fri, Aug 28, 2015, at 11:09 PM, Hari Iyer wrote:
 Hi,
 
 I am new to Solr and am trying to create dynamic field rules in my
 schema.
 
 I would like to use the field name suffix to indicate other properties
 besides the data type and multiValued, as provided in the default schema.
 
 It appears that specifying this via a pattern leads to duplication, as
 there are various combinations that need to be specified. It would help to
 have code where I can build parts of the rule, e.g.
 
 if suffix has '_s' then set stored=true
 if suffix has '_m' then set multiValued=true
 and so on
 
 From the documentation and various implementation examples (Drupal etc.) I
 can only see them specifying all combinations.
 
 Is there any way (plugin?) to incrementally build the rule?

Nope - you will need to have one for each combination, so _s, _sm, _m,
etc. Each field can only match a single field type, otherwise it'd be
impossible to work out which rules should be considered from the
different field types.

Upayavira
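
Since every combination has to be listed, one way to avoid typing them by
hand is to generate the dynamicField lines with a small script and paste them
into the schema. The sketch below is only an illustration: the suffix letters
follow the question ('s' for stored, 'm' for multiValued), while the function
name, the "string" field type and indexed="true" are assumptions.

```javascript
// Each suffix letter switches one schema attribute on.
var FLAGS = [
  { letter: "s", attribute: "stored" },
  { letter: "m", attribute: "multiValued" }
];

// Return one <dynamicField> line per non-empty combination of flags.
function dynamicFieldRules(flags, fieldType) {
  var rules = [];
  // Bitmasks 1 .. 2^n - 1 enumerate every non-empty subset of the flags.
  for (var mask = 1; mask < (1 << flags.length); mask++) {
    var suffix = "";
    var attrs = "";
    flags.forEach(function (flag, i) {
      var on = (mask & (1 << i)) !== 0;
      if (on) { suffix += flag.letter; }
      attrs += " " + flag.attribute + '="' + on + '"';
    });
    rules.push('<dynamicField name="*_' + suffix + '" type="' + fieldType +
      '" indexed="true"' + attrs + "/>");
  }
  return rules;
}
```

Calling dynamicFieldRules(FLAGS, "string") yields rules for *_s, *_m and
*_sm; adding a third flag to FLAGS would yield seven.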


Re: Using solr with cassandra

2015-08-30 Thread Christopher Bradford
We are using SolrJ. Check out this sample code:
https://gist.github.com/bradfordcp/7661a2d31403325c036b

On Sun, Aug 30, 2015 at 11:13 AM Doug Turnbull 
dturnb...@opensourceconnections.com wrote:

 Pretty sure he was using HTTP, as Solr doesn't have another way to connect
 to it. I'd be curious what about HTTP caused bottlenecks. SolrJ has a number
 of strategies, including the load-balancing SolrServer, which forwards
 each document to the shard's leader, that might be useful to you.

 -Doug

 On Sun, Aug 30, 2015 at 9:51 AM, bipin bipin@gmail.com wrote:

 That was a very good presentation. I have used Spark before, so yay. I had
 also thought of using SolrCloud, and it validates that. Regarding the API used
 to connect to Solr, I found this:
 https://cwiki.apache.org/confluence/display/solr/Using+SolrJ. Is this the
 same one that Chris was using? I want to know if he was using a different
 way to connect to Solr than HTTP requests, because it had caused bottlenecks.






Re: commit of xml update by AJAX

2015-08-30 Thread Szűcs Roland
Thanks Erick,

Your blog post made it clear. It was looong, but not too long.

Roland

2015-08-29 19:00 GMT+02:00 Erick Erickson erickerick...@gmail.com:

 1. My first guess is that your autocommit
 section in solrconfig.xml has <openSearcher>false</openSearcher>.
 So the commitWithin happened but a new searcher
 was not opened, thus the document is invisible.
 Try issuing a separate commit or change that value
 in solrconfig.xml and try again.

 Here's a long post on all this:

 https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

 2. No clue, since I'm pretty Ajax-ignorant.

 3. Because curl is easily downloadable at worst, is most often
 already on someone's machine, and lets people at least get started.
 Pretty soon, though, for production situations people will use SolrJ
 or the like, or use one of the off-the-shelf tools packaged around
 Solr.

 Best
 Erick
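
The "separate commit" suggested in point 1 could look something like the
sketch below. It assumes the commit is sent as a <commit/> message to the
core's XML update handler; the function names and the request-object shape are
hypothetical, and `xhr` stands for any XMLHttpRequest-like object.

```javascript
// Describe an explicit commit request for one core.
function buildCommitRequest(coreUrl) {
  return {
    method: "POST",
    url: coreUrl.replace(/\/+$/, "") + "/update", // tolerate a trailing slash
    headers: { "Content-type": "application/xml" },
    body: "<commit/>"
  };
}

// Send a request object built above through an XMLHttpRequest-like object.
function sendRequest(xhr, request) {
  xhr.open(request.method, request.url, true);
  Object.keys(request.headers).forEach(function (name) {
    xhr.setRequestHeader(name, request.headers[name]);
  });
  xhr.send(request.body);
}
```

For the core used in this thread the call might be
sendRequest(new XMLHttpRequest(),
buildCommitRequest("http://localhost:8983/solr/bandw")).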



Re: commit of xml update by AJAX

2015-08-30 Thread Szűcs Roland
Hi Upayavira,

You were right. I only had to change the Content-type to application/xml
and it worked correctly.

Roland



Re: Using solr with cassandra

2015-08-30 Thread Doug Turnbull
Pretty sure he was using HTTP, as Solr doesn't have another way to connect
to it. I'd be curious what about HTTP caused bottlenecks. SolrJ has a number
of strategies, including the load-balancing SolrServer, which forwards
each document to the shard's leader, that might be useful to you.

-Doug

On Sun, Aug 30, 2015 at 9:51 AM, bipin bipin@gmail.com wrote:

 That was a very good presentation. I have used Spark before, so yay. I had
 also thought of using SolrCloud, and it validates that. Regarding the API used
 to connect to Solr, I found this:
 https://cwiki.apache.org/confluence/display/solr/Using+SolrJ. Is this the
 same one that Chris was using? I want to know if he was using a different
 way to connect to Solr than HTTP requests, because it had caused bottlenecks.





Using join vs flattening structure

2015-08-30 Thread Brian Narsi
I have read a lot about using flattened structures in Solr (instead of
relational ones). It looks like a flattened structure is preferable. But in
our case we have to consider using a (sort of) relational structure to keep
index maintenance costs low.

Does anyone have deeper insight into this?

1) When should we definitely use relational type of structure and use join?
(instead of flattened structure)

2) When should we definitely use flattened structure (instead of
relational)?

3) What are the signs that one has made a wrong choice of flattened vs
relational?

4) Any best practices when relational structure and join is used?

5) I understand that Parallel SQL (in Solr) will have more support for
relational functionality? Any ETA on when Parallel SQL will support joins?

Thanks for your help!


'missing content stream' issuing expungeDeletes=true

2015-08-30 Thread Derek Poh

Hi

I tried doing an expungeDeletes=true with the following but got the
message 'missing content stream'. What am I missing? Do I need to provide
additional parameters?


curl 'http://127.0.0.1:8983/solr/supplier/update/json?expungeDeletes=true';

Thanks,
Derek
