solr cloud without hard commit?

2017-09-28 Thread Wei
Hello All,

What are the impacts if solr cloud is configured to have only soft commits
but no hard commits? In this way if a non-leader node crashes, will it
still be able to recover from the leader? Basically we are wondering, in a
read-heavy & write-heavy scenario, whether taking hard commits out could
help improve query performance, and what the consequences would be.

Thanks,
Wei
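
For reference, "soft commits only" is normally configured in solrconfig.xml by leaving <autoCommit> out and keeping just <autoSoftCommit>. A minimal sketch, with placeholder interval values rather than recommendations:

  <updateHandler class="solr.DirectUpdateHandler2">
    <updateLog/>
    <!-- no <autoCommit> block: nothing is ever hard-committed automatically -->
    <autoSoftCommit>
      <maxTime>5000</maxTime> <!-- placeholder: open a new searcher every 5 seconds -->
    </autoSoftCommit>
  </updateHandler>

One consequence to keep in mind: hard commits are what truncate the transaction logs, so without them the tlogs keep growing and recovery/restart replay gets slower.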


Re: streaming with SolrJ

2017-09-28 Thread Joel Bernstein
There isn't much documentation for how to use the Streaming API java
classes directly. All of the effort has been going into Streaming
Expressions which you send to the /stream handler to execute. Over time
it's become more and more complicated to use the Java classes because there
are so many of them and because their initialization can be complex. All of
the test cases are now focused on exercising the underlying classes through
the expressions.
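
For illustration, a minimal SolrJ sketch of sending an expression string to the /stream handler instead of constructing the stream classes directly (the node URL, collection name and expression below are placeholders):

  import org.apache.solr.client.solrj.io.SolrClientCache;
  import org.apache.solr.client.solrj.io.Tuple;
  import org.apache.solr.client.solrj.io.stream.SolrStream;
  import org.apache.solr.client.solrj.io.stream.StreamContext;
  import org.apache.solr.common.params.ModifiableSolrParams;

  public class StreamHandlerExample {
    public static void main(String[] args) throws Exception {
      // Placeholder expression; anything the /stream handler accepts works here.
      String expr = "search(gettingstarted, q=\"*:*\", fl=\"id\", sort=\"id asc\", qt=\"/export\")";

      ModifiableSolrParams params = new ModifiableSolrParams();
      params.set("expr", expr);      // the streaming expression to execute
      params.set("qt", "/stream");   // route the request to the /stream handler

      SolrClientCache cache = new SolrClientCache();
      StreamContext context = new StreamContext();
      context.setSolrClientCache(cache);

      // SolrStream simply forwards the expression to one node's /stream handler.
      SolrStream stream = new SolrStream("http://localhost:8983/solr/gettingstarted", params);
      stream.setStreamContext(context);
      try {
        stream.open();
        for (Tuple tuple = stream.read(); !tuple.EOF; tuple = stream.read()) {
          System.out.println(tuple.getString("id"));
        }
      } finally {
        stream.close();
        cache.close();
      }
    }
  }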


Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, Sep 28, 2017 at 4:47 PM, Hendrik Haddorp 
wrote:

> hm, thanks, but why are all those withFunctionName calls required and how
> did you get to this?
>
>
> On 28.09.2017 22:01, Susheel Kumar wrote:
>
>> I have this snippet with couple of functions e.g. if that helps
>>
>> ---
>>  TupleStream stream;
>>  List tuples;
>>  StreamContext streamContext = new StreamContext();
>>  SolrClientCache solrClientCache = new SolrClientCache();
>>  streamContext.setSolrClientCache(solrClientCache);
>>
>>  StreamFactory factory = new StreamFactory()
>>   .withCollectionZkHost("gettingstarted", "localhost:2181")
>>  .withFunctionName("search", CloudSolrStream.class)
>>.withFunctionName("select", SelectStream.class)
>>.withFunctionName("add", AddEvaluator.class)
>>.withFunctionName("if", IfThenElseEvaluator.class)
>>.withFunctionName("gt", GreaterThanEvaluator.class)
>>.withFunctionName("let", LetStream.class)
>>.withFunctionName("get", GetStream.class)
>>.withFunctionName("echo", EchoStream.class)
>>.withFunctionName("merge", MergeStream.class)
>>.withFunctionName("sort", SortStream.class)
>>.withFunctionName("tuple", TupStream.class)
>>.withFunctionName("rollup",RollupStream.class)
>>.withFunctionName("hashJoin", HashJoinStream.class)
>>.withFunctionName("complement", ComplementStream.class)
>>.withFunctionName("fetch", FetchStream.class)
>>.withFunctionName("having",HavingStream.class)
>> //  .withFunctionName("eq", EqualsEvaluator.class)
>>.withFunctionName("count", CountMetric.class)
>>.withFunctionName("facet", FacetStream.class)
>>.withFunctionName("sum", SumMetric.class)
>>.withFunctionName("unique", UniqueStream.class)
>>.withFunctionName("uniq", UniqueMetric.class)
>>.withFunctionName("innerJoin", InnerJoinStream.class)
>>.withFunctionName("intersect", IntersectStream.class)
>>.withFunctionName("replace", ReplaceOperation.class)
>>
>>;
>>  try {
>>  clause = getClause();
>>stream = factory.constructStream(clause);
>>stream.setStreamContext(streamContext);
>>tuples = getTuples(stream);
>>
>>for(Tuple tuple : tuples )
>>{
>>System.out.println(tuple.getString("id"));
>>System.out.println(tuple.getString("business_email_s"));
>>  
>>
>>}
>>
>>    System.out.println("Total tuples returned "+tuples.size());
>>
>>
>> ---
>> private static String getClause() {
>> String clause = "select(search(gettingstarted,\n" +
>> "q=*:* NOT personal_email_s:*,\n" +
>> "fl=\"id,business_email_s\",\n" +
>> "sort=\"business_email_s asc\"),\n" +
>> "id,\n" +
>> "business_email_s,\n" +
>> "personal_email_s,\n" +
>> "replace(personal_email_s,null,withField=business_email_s)\n" +
>> ")";
>> return clause;
>> }
>>
>>
>> On Thu, Sep 28, 2017 at 3:35 PM, Hendrik Haddorp > >
>> wrote:
>>
>> Hi,
>>>
>>> I'm trying to use the streaming API via SolrJ but have some trouble with
>>> the documentation and samples. In the reference guide I found the below
>>> example in http://lucene.apache.org/solr/guide/6_6/streaming-expression
>>> s.html. Problem is that "withStreamFunction" does not seem to exist.
>>> There is "withFunctionName", which would match the arguments but there is
>>> no documentation in the JavaDoc nor is the sample stating why I would
>>> need
>>> all those "with" calls if pretty much everything is also in the last
>>> "constructStream" method call. I was planning to retrieve a few fields
>>> for
>>> all documents in a collection but have trouble to figure out what is the
>>> correct way to do so. The documentation also uses "/export" and
>>> "/search",
>>> with little explanation on the differences. Would really appreciate a
>>> pointer to some simple samples.
>>>
>>> The org.apache.solr.client.solrj.io package provides Java classes that
>>> compile streaming expressions into streaming API objects. These classes
>>> can
>>> be used to execute streaming expressions from inside a Java application.
>>> For example:
>>>
>>> StreamFactory streamFactory = new StreamFactory().withCollection
>>> ZkHost("collection1",
>>> zkServer.getZkAddress())
>>>  .withStreamFunction("search", CloudSolrStream.class)
>>>  

How to recover from failed SPLITSHARD?

2017-09-28 Thread Kai 'wusel' Siering
Hi,

this is with SolrCloud 6.5.1 on Ubuntu LTS 16.04 and OpenJDK 8, 4 Solr in Cloud 
mode, external ZK.

I tried to split my collection's shard1 (500 GB) with SPLITSHARD; it kind of 
worked. After more than 8 hours the new shards left "construction" state — and 
entered "recovery" :( Another about 12 hours later, Out of Memory errors with 
"could not create thread" happened. Node 10.10.10.162 took leadership of 
shard1, but since we still saw errors on searches, I stopped solr on 
10.10.10.161, changed heap from 24G to 31G and rebooted the system, just in 
case — good time to install latest patches. 10.10.10.161 came back and shards 
shard1, shard1_0 and shard1_1 started recovery. But unfortunately, 
10.10.10.162, leader for shard2 which was being split as well, hit "something": 
solr.log was not updated anymore and the UI didn't work anymore, so in the end I 
stopped solr there as well (it finished instantly) and rebooted. Now both are 
running with 31G java heap, shard1 and shard2 are synced and I try to clean up 
before retrying.

Of shard2, only a shard2_0 without any replicas was left over, and DELETESHARD 
cleaned it up.

But shard1 has shard1_0 and shard1_1, each with two replicas. DELETESHARD 
errored out, so I ran DELETEREPLICA on all of them. This worked, but "parts of" 
shard1_0 and shard1_1 are still there and I cannot delete them:

$ wget -q -O - 
'http://10.10.10.162:8983/solr/admin/collections?wt=json&action=CLUSTERSTATUS' 
| jq
[…]
  "shard1_0": {
"range": "8000-bfff",
"state": "recovery_failed",
"replicas": {}
  },
  "shard1_1": {
"parent": "shard1",
"shard_parent_node": "10.10.10.161:8983_solr",
"range": "c000-",
"state": "recovery_failed",
"shard_parent_zk_session": "98682039611162624",
"replicas": {}
  }
[…]


$ wget -O - 
'http://10.10.10.161:8983/solr/admin/collections?action=DELETESHARD&shard=shard1_1&collection=collection'
--2017-09-29 01:01:16--  
http://10.10.10.161:8983/solr/admin/collections?action=DELETESHARD&shard=shard1_1&collection=collection
Connecting to 10.10.10.161:8983... connected.
HTTP request sent, awaiting response... 400 Bad Request
2017-09-29 01:01:16 ERROR 400: Bad Request.
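
For reference, the documented form of that request separates every parameter with an ampersand; a sketch with the collection name left as a placeholder:

$ curl "http://10.10.10.161:8983/solr/admin/collections?action=DELETESHARD&collection=<collection>&shard=shard1_1&wt=json"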

Any hint on how to fix this appreciated ;)

Regards,
-kai





Re: streaming with SolrJ

2017-09-28 Thread Hendrik Haddorp
hm, thanks, but why are all those withFunctionName calls required and 
how did you get to this?


On 28.09.2017 22:01, Susheel Kumar wrote:

I have this snippet with couple of functions e.g. if that helps

---
 TupleStream stream;
 List tuples;
 StreamContext streamContext = new StreamContext();
 SolrClientCache solrClientCache = new SolrClientCache();
 streamContext.setSolrClientCache(solrClientCache);

 StreamFactory factory = new StreamFactory()
  .withCollectionZkHost("gettingstarted", "localhost:2181")
 .withFunctionName("search", CloudSolrStream.class)
   .withFunctionName("select", SelectStream.class)
   .withFunctionName("add", AddEvaluator.class)
   .withFunctionName("if", IfThenElseEvaluator.class)
   .withFunctionName("gt", GreaterThanEvaluator.class)
   .withFunctionName("let", LetStream.class)
   .withFunctionName("get", GetStream.class)
   .withFunctionName("echo", EchoStream.class)
   .withFunctionName("merge", MergeStream.class)
   .withFunctionName("sort", SortStream.class)
   .withFunctionName("tuple", TupStream.class)
   .withFunctionName("rollup",RollupStream.class)
   .withFunctionName("hashJoin", HashJoinStream.class)
   .withFunctionName("complement", ComplementStream.class)
   .withFunctionName("fetch", FetchStream.class)
   .withFunctionName("having",HavingStream.class)
//  .withFunctionName("eq", EqualsEvaluator.class)
   .withFunctionName("count", CountMetric.class)
   .withFunctionName("facet", FacetStream.class)
   .withFunctionName("sum", SumMetric.class)
   .withFunctionName("unique", UniqueStream.class)
   .withFunctionName("uniq", UniqueMetric.class)
   .withFunctionName("innerJoin", InnerJoinStream.class)
   .withFunctionName("intersect", IntersectStream.class)
   .withFunctionName("replace", ReplaceOperation.class)

   ;
 try {
 clause = getClause();
   stream = factory.constructStream(clause);
   stream.setStreamContext(streamContext);
   tuples = getTuples(stream);

   for(Tuple tuple : tuples )
   {
   System.out.println(tuple.getString("id"));
   System.out.println(tuple.getString("business_email_s"));
 

   }

   System.out.println("Total tuples retunred "+tuples.size());


---
private static String getClause() {
String clause = "select(search(gettingstarted,\n" +
"q=*:* NOT personal_email_s:*,\n" +
"fl=\"id,business_email_s\",\n" +
"sort=\"business_email_s asc\"),\n" +
"id,\n" +
"business_email_s,\n" +
"personal_email_s,\n" +
"replace(personal_email_s,null,withField=business_email_s)\n" +
")";
return clause;
}


On Thu, Sep 28, 2017 at 3:35 PM, Hendrik Haddorp 
wrote:


Hi,

I'm trying to use the streaming API via SolrJ but have some trouble with
the documentation and samples. In the reference guide I found the below
example in http://lucene.apache.org/solr/guide/6_6/streaming-expression
s.html. Problem is that "withStreamFunction" does not seem to exist.
There is "withFunctionName", which would match the arguments but there is
no documentation in the JavaDoc nor is the sample stating why I would need
all those "with" calls if pretty much everything is also in the last
"constructStream" method call. I was planning to retrieve a few fields for
all documents in a collection but have trouble to figure out what is the
correct way to do so. The documentation also uses "/export" and "/search",
with little explanation on the differences. Would really appreciate a
pointer to some simple samples.

The org.apache.solr.client.solrj.io package provides Java classes that
compile streaming expressions into streaming API objects. These classes can
be used to execute streaming expressions from inside a Java application.
For example:

StreamFactory streamFactory = new 
StreamFactory().withCollectionZkHost("collection1",
zkServer.getZkAddress())
 .withStreamFunction("search", CloudSolrStream.class)
 .withStreamFunction("unique", UniqueStream.class)
 .withStreamFunction("top", RankStream.class)
 .withStreamFunction("group", ReducerStream.class)
 .withStreamFunction("parallel", ParallelStream.class);

ParallelStream pstream = (ParallelStream)streamFactory.
constructStream("parallel(collection1, group(search(collection1,
q=\"*:*\", fl=\"id,a_s,a_i,a_f\", sort=\"a_s asc,a_f asc\",
partitionKeys=\"a_s\"), by=\"a_s asc\"), workers=\"2\",
zkHost=\""+zkHost+"\", sort=\"a_s asc\")");

regards,
Hendrik





Re: streaming with SolrJ

2017-09-28 Thread Susheel Kumar
I have this snippet with couple of functions e.g. if that helps

---
TupleStream stream;
List tuples;
StreamContext streamContext = new StreamContext();
SolrClientCache solrClientCache = new SolrClientCache();
streamContext.setSolrClientCache(solrClientCache);

StreamFactory factory = new StreamFactory()
 .withCollectionZkHost("gettingstarted", "localhost:2181")
.withFunctionName("search", CloudSolrStream.class)
  .withFunctionName("select", SelectStream.class)
  .withFunctionName("add", AddEvaluator.class)
  .withFunctionName("if", IfThenElseEvaluator.class)
  .withFunctionName("gt", GreaterThanEvaluator.class)
  .withFunctionName("let", LetStream.class)
  .withFunctionName("get", GetStream.class)
  .withFunctionName("echo", EchoStream.class)
  .withFunctionName("merge", MergeStream.class)
  .withFunctionName("sort", SortStream.class)
  .withFunctionName("tuple", TupStream.class)
  .withFunctionName("rollup",RollupStream.class)
  .withFunctionName("hashJoin", HashJoinStream.class)
  .withFunctionName("complement", ComplementStream.class)
  .withFunctionName("fetch", FetchStream.class)
  .withFunctionName("having",HavingStream.class)
//  .withFunctionName("eq", EqualsEvaluator.class)
  .withFunctionName("count", CountMetric.class)
  .withFunctionName("facet", FacetStream.class)
  .withFunctionName("sum", SumMetric.class)
  .withFunctionName("unique", UniqueStream.class)
  .withFunctionName("uniq", UniqueMetric.class)
  .withFunctionName("innerJoin", InnerJoinStream.class)
  .withFunctionName("intersect", IntersectStream.class)
  .withFunctionName("replace", ReplaceOperation.class)

  ;
try {
clause = getClause();
  stream = factory.constructStream(clause);
  stream.setStreamContext(streamContext);
  tuples = getTuples(stream);

  for(Tuple tuple : tuples )
  {
  System.out.println(tuple.getString("id"));
  System.out.println(tuple.getString("business_email_s"));


  }

  System.out.println("Total tuples returned "+tuples.size());


---
private static String getClause() {
String clause = "select(search(gettingstarted,\n" +
"q=*:* NOT personal_email_s:*,\n" +
"fl=\"id,business_email_s\",\n" +
"sort=\"business_email_s asc\"),\n" +
"id,\n" +
"business_email_s,\n" +
"personal_email_s,\n" +
"replace(personal_email_s,null,withField=business_email_s)\n" +
")";
return clause;
}


On Thu, Sep 28, 2017 at 3:35 PM, Hendrik Haddorp 
wrote:

> Hi,
>
> I'm trying to use the streaming API via SolrJ but have some trouble with
> the documentation and samples. In the reference guide I found the below
> example in http://lucene.apache.org/solr/guide/6_6/streaming-expression
> s.html. Problem is that "withStreamFunction" does not seem to exist.
> There is "withFunctionName", which would match the arguments but there is
> no documentation in the JavaDoc nor is the sample stating why I would need
> all those "with" calls if pretty much everything is also in the last
> "constructStream" method call. I was planning to retrieve a few fields for
> all documents in a collection but have trouble to figure out what is the
> correct way to do so. The documentation also uses "/export" and "/search",
> with little explanation on the differences. Would really appreciate a
> pointer to some simple samples.
>
> The org.apache.solr.client.solrj.io package provides Java classes that
> compile streaming expressions into streaming API objects. These classes can
> be used to execute streaming expressions from inside a Java application.
> For example:
>
> StreamFactory streamFactory = new 
> StreamFactory().withCollectionZkHost("collection1",
> zkServer.getZkAddress())
> .withStreamFunction("search", CloudSolrStream.class)
> .withStreamFunction("unique", UniqueStream.class)
> .withStreamFunction("top", RankStream.class)
> .withStreamFunction("group", ReducerStream.class)
> .withStreamFunction("parallel", ParallelStream.class);
>
> ParallelStream pstream = (ParallelStream)streamFactory.
> constructStream("parallel(collection1, group(search(collection1,
> q=\"*:*\", fl=\"id,a_s,a_i,a_f\", sort=\"a_s asc,a_f asc\",
> partitionKeys=\"a_s\"), by=\"a_s asc\"), workers=\"2\",
> zkHost=\""+zkHost+"\", sort=\"a_s asc\")");
>
> regards,
> Hendrik
>


streaming with SolrJ

2017-09-28 Thread Hendrik Haddorp

Hi,

I'm trying to use the streaming API via SolrJ but have some trouble with 
the documentation and samples. In the reference guide I found the below 
example in 
http://lucene.apache.org/solr/guide/6_6/streaming-expressions.html. 
Problem is that "withStreamFunction" does not seem to exist. There is 
"withFunctionName", which would match the arguments but there is no 
documentation in the JavaDoc nor is the sample stating why I would need 
all those "with" calls if pretty much everything is also in the last 
"constructStream" method call. I was planning to retrieve a few fields 
for all documents in a collection but have trouble to figure out what is 
the correct way to do so. The documentation also uses "/export" and 
"/search", with little explanation on the differences. Would really 
appreciate a pointer to some simple samples.


The org.apache.solr.client.solrj.io package provides Java classes that 
compile streaming expressions into streaming API objects. These classes 
can be used to execute streaming expressions from inside a Java 
application. For example:


StreamFactory streamFactory = new 
StreamFactory().withCollectionZkHost("collection1", zkServer.getZkAddress())

.withStreamFunction("search", CloudSolrStream.class)
.withStreamFunction("unique", UniqueStream.class)
.withStreamFunction("top", RankStream.class)
.withStreamFunction("group", ReducerStream.class)
.withStreamFunction("parallel", ParallelStream.class);

ParallelStream pstream = 
(ParallelStream)streamFactory.constructStream("parallel(collection1, 
group(search(collection1, q=\"*:*\", fl=\"id,a_s,a_i,a_f\", sort=\"a_s 
asc,a_f asc\", partitionKeys=\"a_s\"), by=\"a_s asc\"), workers=\"2\", 
zkHost=\""+zkHost+"\", sort=\"a_s asc\")");


regards,
Hendrik


Re: how to recover from OpenSearcher called on closed core

2017-09-28 Thread Erick Erickson
Are you using NFS or other shared file system? I have some details
from Uwe Schindler on issues with NFS resulting from the fact that NFS
is not POSIX compliant.

Best,
Erick

On Thu, Sep 28, 2017 at 9:32 AM, rubi.hali  wrote:
> Hi Nawaz
>
> No, we are not doing any upgrade.
>
> We hardly have 3 documents so we don't feel the need of having a cloud
> configuration
>
> Regarding the exception, we analyzed that before this error comes we always see
> the Caching Directory Factory closing the core
>
> Plus we tried Solr 6.2 version and the same exception was not happening
>
> Do you have any idea if this issue or any such issue with replication exists
> in 6.1 which was resolved in 6.2
>
> Thanks in advance
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: how to recover from OpenSearcher called on closed core

2017-09-28 Thread rubi.hali
Hi Nawaz

No, we are not doing any upgrade.

We hardly have 3 documents so we don't feel the need of having a cloud
configuration

Regarding the exception, we analyzed that before this error comes we always see
the Caching Directory Factory closing the core

Plus we tried Solr 6.2 version and the same exception was not happening

Do you have any idea if this issue or any such issue with replication exists
in 6.1 which was resolved in 6.2

Thanks in advance



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Unauthorized Requests on Empty Solr Node

2017-09-28 Thread Chris Ulicny
Hi all,

I've run into an issue with using the basic authentication plugin that
comes with solr 6.3.0 that seems to prevent requests from being processed
in certain situations.

Essentially, if we have a solr node that is part of a cloud but contains no
replicas for any collection, it cannot process search requests from the
"solrreader" or "solrwriter" user for any of those collections. It just
returns a 403 Unauthorized request error.

I noticed some JIRA tickets for issues with blockUnknown functionality, but
both true and false result in the same problem.

The security.json file content is included below. Is there something wrong
with the permissions that were set that prevents the "reader" and "writer"
roles from communicating with the other nodes or is there something else I
should be looking into? I have the steps to replicate the issue if the
security.json shouldn't be the problem.

Thanks,
Chris

{  "authentication":{
"blockUnknown":true,
"class":"solr.BasicAuthPlugin",
"credentials":{
  "solradmin":"hashedpassword",
  "solrreader":"hashedpassword",
  "solrwriter":"hashedpassword"},
"":{"v":3}},
  "authorization":{
"class":"solr.RuleBasedAuthorizationPlugin",
"permissions":[
  {"name":"read","role":"reader"},
  {"name":"security-read","role":"reader"},
  {"name":"schema-read","role":"reader"},
  {"name":"config-read","role":"reader"},
  {"name":"core-admin-read","role":"reader"},
  {"name":"collection-admin-read","role":"reader"},
  {"name":"update","role":"writer"},
  {"name":"security-edit","role":"admin"},
  {"name":"schema-edit","role":"admin"},
  {"name":"config-edit","role":"admin"},
  {"name":"core-admin-edit","role":"admin"},
  {"name":"collection-admin-edit","role":"admin"},
  {"name":"all","role":"admin"}],
"user-role":{
  "solradmin":["reader","writer","admin"],
  "solrreader":["reader"],
  "solrwriter":["reader","writer"]},
"":{"v":2}}}


RE: Modifing create_core's instanceDir attribute

2017-09-28 Thread Miller, William K - Norman, OK - Contractor
Thanks to you all.  When I used the curl command (which I had forgotten to use) 
and put the url in quotes it worked with one exception.  It did not copy the 
"conf" folder from my custom_configs folder that I had created under the 
configsets folder.  I was able to just add a copy command in my shell script to 
copy this folder over and it works just fine now.

So again thanks to all of you for your help.




~~~
William Kevin Miller

ECS Federal, Inc.
USPS/MTSC
(405) 573-2158


-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org] 
Sent: Thursday, September 28, 2017 10:02 AM
To: solr-user@lucene.apache.org
Subject: Re: Modifing create_core's instanceDir attribute

On 9/27/2017 10:24 AM, Miller, William K - Norman, OK - Contractor wrote:
> Thanks Erick for pointing me in this direction.  Unfortunately when I try to 
> us this I get an error.  Here is the command that I am using and the response 
> I get:
>
> https://solrserver:8983/solr/admin/cores?action=CREATE&name=mycore&ins
> tanceDir=/var/solr/data/mycore&dataDir=data&configSet=custom_configs
>
>
> [1] 32023
> [2] 32024
> [3] 32025
> -bash: https://solrserver:8983/solr/admin/cores?action=CREATE: No such 
> file or directory [4] 32026
> [1]   Exit 127
> https://solrserver:8983/solr/adkmin/cores?action=CREATE
> [2]   Donename=mycore
> [3]-  DoneinstanceDir=/var/solr/data/mycore
> [4]+  DonedataDir=data

It appears that you are trying to type the bare URL into a shell prompt as a 
command.  The shell doesn't know how to deal with a URL -- a URL isn't a 
program or a shell command.

If you put the URL into a browser, which knows how to deal with it, the request 
will go to Solr, and then you can deal with any further problems.

If you want to do it on the commandline, you're going to have to have a valid 
command/program for the shell.  The "curl" and "wget" programs are commonly 
available on systems with a shell prompt.  Here's one command that might work.  
Replace the text URL with your actual URL, and be sure that you keep the quotes:

curl "URL"

Thanks,
Shawn



Re: CDCR does not work

2017-09-28 Thread Amrit Sarkar
Pretty much what Webster and Erick mentioned, else please try the pdf I
attached. I followed the official documentation doing that.

Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2

On Thu, Sep 28, 2017 at 8:56 PM, Erick Erickson 
wrote:

> If Webster's idea doesn't solve it, the next thing to check is your
> tlogs on the source cluster. If you have a successful connection to
> the target and it's operative, the tlogs should be regularly pruned.
> If not, they'll collect updates forever.
>
> Also, your Solr logs should show messages as CDCR does its work, to
> you see any evidence that it's
> 1> running
> 2> sending docs?
>
> Also, your problem description doesn't provide any information other
> than "it doesn't work", which makes it very hard to offer anything
> except generalities, you might review:
>
> https://wiki.apache.org/solr/UsingMailingLists
>
> Best,
> Erick
>
>
> On Thu, Sep 28, 2017 at 7:47 AM, Webster Homer 
> wrote:
> > Check that you have autoCommit enabled in the target schema.
> >
> > Try sending a commit to the target collection. If you don't have
> autoCommit
> > enabled then the data could be replicating but not committed so not
> > searchable
> >
> > On Thu, Sep 28, 2017 at 1:57 AM, Jiani Yang  wrote:
> >
> >> Hi,
> >>
> >> Recently I am trying to use CDCR to do the replication of my solr
> cluster.
> >> I have done exactly as what the tutorial says, the tutorial link is
> shown
> >> below:
> >> https://lucene.apache.org/solr/guide/6_6/cross-data-
> >> center-replication-cdcr.html
> >>
> >> But I cannot see any change on target data center even every status
> looks
> >> fine. I have been stuck in this situation for a week and could not find
> a
> >> way to resolve it, could you please help me?
> >>
> >> Please reply me ASAP! Thank you!
> >>
> >> Best,
> >> Jiani
> >>
> >
>


Re: Filter Factory question

2017-09-28 Thread Erick Erickson
PatternCaptureGroupTokenFilter has been around since 2013 (at least
that's the earliest revision in Git). I located it even in 5x so it
should be there in
...lucene/analysis/common/src/java/org/apache/lucene/analysis/pattern

Best,
Erick

On Thu, Sep 28, 2017 at 7:45 AM, Webster Homer  wrote:
> It's still buggy, so not ready to share.
>
> I keep a copy of Solr source which I use for this type of development. I
> don't see PatternCaptureGroupTokenFilterFactory in the Solr 6.2 code base
> at all. I was thinking of seeing how it treated the positions etc...
>
> My code now looks reasonable in the Analysis tool,  but doesn't seem to
> create searchable lucene data. I've changed it considerably since my first
> post so I see output in the tool which was an improvement
>
>
> On Wed, Sep 27, 2017 at 10:30 AM, Stefan Matheis 
> wrote:
>
>> > In any case I figured out my problem. I was over thinking it.
>>
>> Mind to share?
>>
>> -Stefan
>>
>> On Sep 27, 2017 4:34 PM, "Webster Homer"  wrote:
>>
>> > There is a need for a special filter since the input has to be
>> normalized.
>> > That is the main requirement, splitting into pieces is optional. As far
>> as
>> > I know there is nothing in solr that knows about molecular formulas.
>> >
>> > In any case I figured out my problem. I was over thinking it.
>> >
>> > On Wed, Sep 27, 2017 at 3:52 AM, Emir Arnautović <
>> > emir.arnauto...@sematext.com> wrote:
>> >
>> > > Hi Homer,
>> > > There is no need for special filter, there is one that is for some
>> reason
>> > > not part of documentation (will ask why so follow that thread if
>> decided
>> > to
>> > > go this way): You can use something like:
>> > > <filter class="solr.PatternCaptureGroupFilterFactory" pattern="([A-Z][a-z]?\d+)" preserveOriginal="true" />
>> > >
>> > > This will capture all atom counts as separate tokens.
>> > >
>> > > HTH,
>> > > Emir
>> > >
>> > > > On 26 Sep 2017, at 23:14, Webster Homer 
>> > wrote:
>> > > >
>> > > > I am trying to create a filter that normalizes an input token, but
>> also
>> > > > splits it into multiple pieces. Sort of like what the
>> > WordDelimiterFilter
>> > > > does.
>> > > >
>> > > > It's meant to take a molecular formula like C2H6O and normalize it to
>> > > C2H6O1
>> > > >
>> > > > That part works. However I was also going to have it put out the
>> > > individual
>> > > > atom counts as tokens.
>> > > > C2H6O1
>> > > > C2
>> > > > H6
>> > > > O1
>> > > >
>> > > > When I enable this feature in the factory, I don't get any output at
>> > all.
>> > > >
>> > > > I looked over a couple of filters that do what I want and it's not
>> > > entirely
>> > > > clear what they're doing. So I have some questions:
>> > > > Looking at ShingleFilter and WordDelimiterFilter
>> > > > They both set several attributes:
>> > > > CharTermAttribute : Seems to be the actual terms being set. Seemed
>> > > straight
>> > > > forward, works fine when I only have one term to add.
>> > > >
>> > > > PositionIncrementAttribute: What does this do? It appears that
>> > > > WordDelimiterFilter sets this to 0 most of the time. This has decent
>> > > > documentation.
>> > > >
>> > > > OffsetAttribute: I think that this tracks offsets for each term being
>> > > > processed. Not really sure though. The documentation mentions tokens.
>> > So
>> > > if
>> > > > I have multiple variations for for a token is this for each
>> variation?
>> > > >
>> > > > TypeAttribute: default is "word". Don't know what this is for.
>> > > >
>> > > > PositionLengthAttribute: WordDelimiterFilter doesn't use this but
>> > Shingle
>> > > > does. It defaults to 1. What's it good for when should I use it?
>> > > >
>> > > > Here is my incrementToken method.
>> > > >
>> > > >@Override
>> > > >public boolean incrementToken() throws IOException {
>> > > >while(true) {
>> > > >if (!hasSavedState) {
>> > > >if (! input.incrementToken()) {
>> > > >return false;
>> > > >}
>> > > >if (! generateFragments) { // This part works fine!
>> > > >String normalizedFormula = molFormula.normalize(new
>> > > > String(termAttribute.buffer()));
>> > > >char[]newBuffer = normalizedFormula.toCharArray();
>> > > >termAttribute.setEmpty();
>> > > >termAttribute.copyBuffer(newBuffer, 0, newBuffer.length);
>> > > >return true;
>> > > >}
>> > > >formulas = molFormula.normalizeToList(new
>> > > > String(termAttribute.buffer()));
>> > > >iterator = formulas.listIterator();
>> > > >savedPositionIncrement += posIncAttribute.getPositionIncrement();
>> > > >hasSavedState = true;
>> > > >first = true;
>> > > >saveState();
>> > > >}
>> > > >if (!iterator.hasNext()) {
>> > > >posIncAttribute.setPositionIncrement(savedPositionIncrement);
>> > > >savedPositionIncrement = 0;
>> > > >hasSavedState = false;
>> > > >continue;
>> > > >}
>> > > >String formula = iterator.next();
>> > > >

Re: CDCR does not work

2017-09-28 Thread Erick Erickson
If Webster's idea doesn't solve it, the next thing to check is your
tlogs on the source cluster. If you have a successful connection to
the target and it's operative, the tlogs should be regularly pruned.
If not, they'll collect updates forever.
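
As a quick check, the CDCR API on the source collection reports that backlog; a sketch with placeholder host and collection:

$ curl "http://source-host:8983/solr/<collection>/cdcr?action=QUEUES"

QUEUES returns the queue size per target along with the total tlog count and size on the source.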

Also, your Solr logs should show messages as CDCR does its work, to
you see any evidence that it's
1> running
2> sending docs?

Also, your problem description doesn't provide any information other
than "it doesn't work", which makes it very hard to offer anything
except generalities, you might review:

https://wiki.apache.org/solr/UsingMailingLists

Best,
Erick


On Thu, Sep 28, 2017 at 7:47 AM, Webster Homer  wrote:
> Check that you have autoCommit enabled in the target schema.
>
> Try sending a commit to the target collection. If you don't have autoCommit
> enabled then the data could be replicating but not committed so not
> searchable
>
> On Thu, Sep 28, 2017 at 1:57 AM, Jiani Yang  wrote:
>
>> Hi,
>>
>> Recently I am trying to use CDCR to do the replication of my solr cluster.
>> I have done exactly as what the tutorial says, the tutorial link is shown
>> below:
>> https://lucene.apache.org/solr/guide/6_6/cross-data-
>> center-replication-cdcr.html
>>
>> But I cannot see any change on target data center even every status looks
>> fine. I have been stuck in this situation for a week and could not find a
>> way to resolve it, could you please help me?
>>
>> Please reply me ASAP! Thank you!
>>
>> Best,
>> Jiani
>>
>


Re: Modifing create_core's instanceDir attribute

2017-09-28 Thread Shawn Heisey
On 9/27/2017 10:24 AM, Miller, William K - Norman, OK - Contractor wrote:
> Thanks Erick for pointing me in this direction.  Unfortunately when I try to 
> us this I get an error.  Here is the command that I am using and the response 
> I get:
>
> https://solrserver:8983/solr/admin/cores?action=CREATE&name=mycore&instanceDir=/var/solr/data/mycore&dataDir=data&configSet=custom_configs
>
>
> [1] 32023
> [2] 32024
> [3] 32025
> -bash: https://solrserver:8983/solr/admin/cores?action=CREATE: No such file 
> or directory
> [4] 32026
> [1]   Exit 127
> https://solrserver:8983/solr/adkmin/cores?action=CREATE
> [2]   Donename=mycore
> [3]-  DoneinstanceDir=/var/solr/data/mycore
> [4]+  DonedataDir=data

It appears that you are trying to type the bare URL into a shell prompt
as a command.  The shell doesn't know how to deal with a URL -- a URL
isn't a program or a shell command.

If you put the URL into a browser, which knows how to deal with it, the
request will go to Solr, and then you can deal with any further problems.

If you want to do it on the commandline, you're going to have to have a
valid command/program for the shell.  The "curl" and "wget" programs are
commonly available on systems with a shell prompt.  Here's one command
that might work.  Replace the text URL with your actual URL, and be sure
that you keep the quotes:

curl "URL"

Thanks,
Shawn



Re: how to recover from OpenSearcher called on closed core

2017-09-28 Thread Nawab Zada Asad Iqbal
Hi

Are you upgrading from an earlier version? If not, I am curious why not try
SolrCloud instead of Master/Slave.
Is there any other error before this error in the logs? Did the core close
after a crash?



Regards
Nawab

On Thu, Sep 28, 2017 at 2:57 AM, rubi.hali  wrote:

> Hi
>
> we are using Solr 6.1.0 version. We have done a Master/Slave Setup where in
> Slaves we have enabled replication polling after 300 seconds
>
> But after every replication poll, we are getting an error : Index Fetch
> Failed: opening NewSearcher called on closed core.
>
> We have enabled  softcommit after 30 ms and hardcommit with 25000 docs
> and 6 secs
> In slaves we have kept opensearcher true in case of hardcommit.
>
> we are really not sure if this issue has anything to do with our commit
> strategy.
>
> Please let me know if there is any possible explanation for why this is
> happening.
>
> From logs analysis, I observed the Caching Directory Factory is closing the
> core and after that Replication Handler starts throwing this exception.
>
> Does this exception will have any impact on memory consumption on slaves??
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Solr Beginner!!

2017-09-28 Thread Nawab Zada Asad Iqbal
Hi Jaya

Text extraction is a step before you put data into solr. Say, you have pdf
or doc type documents, you will extract the text (minus unnecessary
formatting details etc.) and store in solr. Later you can query it as you
said. I have not worked in the extraction area, but look at this for an idea:
https://lucene.apache.org/solr/guide/6_6/uploading-data-with-solr-cell-using-apache-tika.html

`Tika will automatically attempt to determine the input document type
(Word, PDF, HTML) and extract the content appropriately. If you like, you
can explicitly specify a MIME type for Tika with the stream.type parameter.`
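
As a sketch (core name, literal field and file name are placeholders), a document can be pushed through the extracting handler with:

$ curl "http://localhost:8983/solr/mycollection/update/extract?literal.id=memo1&commit=true" -F "myfile=@memo1.pdf"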



Regards
Nawab


On Thu, Sep 28, 2017 at 6:56 AM, Johnson, Jaya 
wrote:

> Hi:
> I am trying to ingest a few memos - they do not have any standard format
> (json, xml etc etc) but just plain text however the memos all follow some
> template. What I would like to do post-ingestion is to extract keywords and
> some values around it. So say for instance if the text contains the key
> word Outstanding Amount: 1000.  I would like to search for Outstanding
> Amount (I can do that using the query interface), but how do I extract the
> entire string Outstanding Amount +3or4 words from Solr.
>
> I am really new to solr so any documentation etc would be super helpful.
> Is Solr also the right tool for this use case?
>
> Thanks.
>


Re: CDCR does not work

2017-09-28 Thread Webster Homer
Check that you have autoCommit enabled in the target schema.

Try sending a commit to the target collection. If you don't have autoCommit
enabled then the data could be replicating but not committed so not
searchable
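
For example, a hard commit can be sent to the target with something like (placeholder host and collection):

$ curl "http://target-host:8983/solr/<target_collection>/update?commit=true"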

On Thu, Sep 28, 2017 at 1:57 AM, Jiani Yang  wrote:

> Hi,
>
> Recently I am trying to use CDCR to do the replication of my solr cluster.
> I have done exactly as what the tutorial says, the tutorial link is shown
> below:
> https://lucene.apache.org/solr/guide/6_6/cross-data-
> center-replication-cdcr.html
>
> But I cannot see any change on target data center even every status looks
> fine. I have been stuck in this situation for a week and could not find a
> way to resolve it, could you please help me?
>
> Please reply me ASAP! Thank you!
>
> Best,
> Jiani
>

-- 


This message and any attachment are confidential and may be privileged or 
otherwise protected from disclosure. If you are not the intended recipient, 
you must not copy this message or attachment or disclose the contents to 
any other person. If you have received this transmission in error, please 
notify the sender immediately and delete the message and any attachment 
from your system. Merck KGaA, Darmstadt, Germany and any of its 
subsidiaries do not accept liability for any omissions or errors in this 
message which may arise as a result of E-Mail-transmission or for damages 
resulting from any unauthorized changes of the content of this message and 
any attachment thereto. Merck KGaA, Darmstadt, Germany and any of its 
subsidiaries do not guarantee that this message is free of viruses and does 
not accept liability for any damages caused by any virus transmitted 
therewith.

Click http://www.emdgroup.com/disclaimer to access the German, French, 
Spanish and Portuguese versions of this disclaimer.


Re: Solr cloud most stable version

2017-09-28 Thread Nawab Zada Asad Iqbal
Hi Lars

Although that doesn't really answer whether 6.6.1 is the most stable
one or not, there has been a recent security fix, so definitely go to
6.6.1.

Copied the detail below:-



CVE-2017-9803: Security vulnerability in kerberos delegation token
functionality

Severity: Important

Vendor:
The Apache Software Foundation

Versions Affected:
Apache Solr 6.2.0 to 6.6.0

Description:

Solr's Kerberos plugin can be configured to use delegation tokens,
which allows an application to reuse the authentication of an end-user
or another application.
There are two issues with this functionality (when using
SecurityAwareZkACLProvider type of ACL provider e.g.
SaslZkACLProvider),

Firstly, access to the security configuration can be leaked to users
other than the solr super user. Secondly, malicious users can exploit
this leaked configuration for privilege escalation to further
expose/modify private data and/or disrupt operations in the Solr
cluster.

The vulnerability is fixed from Solr 6.6.1 onwards.

Mitigation:
6.x users should upgrade to 6.6.1

Credit:
This issue was discovered by Hrishikesh Gadre of Cloudera Inc.

References:
https://issues.apache.org/jira/browse/SOLR-11184
https://wiki.apache.org/solr/SolrSecurity

On Thu, Sep 28, 2017 at 6:24 AM, Lars Karlsson <
lars.karlsson.st...@gmail.com> wrote:

> Hi, wanted to check if anyone can help guide with most stable version
> between
>
> 6.3 and 6.6.1
>
> Which should I choose ?
>
> And, are there any performance tests that one can look at for each release?
>
> Regards
> Lars
>


Re: Filter Factory question

2017-09-28 Thread Webster Homer
It's still buggy, so not ready to share.

I keep a copy of Solr source which I use for this type of development. I
don't see PatternCaptureGroupTokenFilterFactory in the Solr 6.2 code base
at all. I was thinking of seeing how it treated the positions etc...

My code now looks reasonable in the Analysis tool,  but doesn't seem to
create searchable lucene data. I've changed it considerably since my first
post so I see output in the tool which was an improvement


On Wed, Sep 27, 2017 at 10:30 AM, Stefan Matheis 
wrote:

> > In any case I figured out my problem. I was over thinking it.
>
> Mind to share?
>
> -Stefan
>
> On Sep 27, 2017 4:34 PM, "Webster Homer"  wrote:
>
> > There is a need for a special filter since the input has to be
> normalized.
> > That is the main requirement, splitting into pieces is optional. As far
> as
> > I know there is nothing in solr that knows about molecular formulas.
> >
> > In any case I figured out my problem. I was over thinking it.
> >
> > On Wed, Sep 27, 2017 at 3:52 AM, Emir Arnautović <
> > emir.arnauto...@sematext.com> wrote:
> >
> > > Hi Homer,
> > > There is no need for special filter, there is one that is for some
> reason
> > > not part of documentation (will ask why so follow that thread if
> decided
> > to
> > > go this way): You can use something like:
> > > <filter class="solr.PatternCaptureGroupFilterFactory" pattern="([A-Z][a-z]?\d+)" preserveOriginal="true" />
> > >
> > > This will capture all atom counts as separate tokens.
> > >
> > > HTH,
> > > Emir
> > >
> > > > On 26 Sep 2017, at 23:14, Webster Homer 
> > wrote:
> > > >
> > > > I am trying to create a filter that normalizes an input token, but
> also
> > > > splits it inot multiple pieces. Sort of like what the
> > WordDelimiterFilter
> > > > does.
> > > >
> > > > It's meant to take a molecular formula like C2H6O and normalize it to
> > > C2H6O1
> > > >
> > > > That part works. However I was also going to have it put out the
> > > individual
> > > > atom counts as tokens.
> > > > C2H6O1
> > > > C2
> > > > H6
> > > > O1
> > > >
> > > > When I enable this feature in the factory, I don't get any output at
> > all.
> > > >
> > > > I looked over a couple of filters that do what I want and it's not
> > > entirely
> > > > clear what they're doing. So I have some questions:
> > > > Looking at ShingleFilter and WordDelimiterFilter
> > > > They both set several attributes:
> > > > CharTermAttribute : Seems to be the actual terms being set. Seemed
> > > straight
> > > > forward, works fine when I only have one term to add.
> > > >
> > > > PositionIncrementAttribute: What does this do? It appears that
> > > > WordDelimiterFilter sets this to 0 most of the time. This has decent
> > > > documentation.
> > > >
> > > > OffsetAttribute: I think that this tracks offsets for each term being
> > > > processed. Not really sure though. The documentation mentions tokens.
> > So
> > > if
> > > > I have multiple variations for for a token is this for each
> variation?
> > > >
> > > > TypeAttribute: default is "word". Don't know what this is for.
> > > >
> > > > PositionLengthAttribute: WordDelimiterFilter doesn't use this but
> > Shingle
> > > > does. It defaults to 1. What's it good for when should I use it?
> > > >
> > > > Here is my incrementToken method.
> > > >
> > > >@Override
> > > >public boolean incrementToken() throws IOException {
> > > >while(true) {
> > > >if (!hasSavedState) {
> > > >if (! input.incrementToken()) {
> > > >return false;
> > > >}
> > > >if (! generateFragments) { // This part works fine!
> > > >String normalizedFormula = molFormula.normalize(new
> > > > String(termAttribute.buffer()));
> > > >char[]newBuffer = normalizedFormula.toCharArray();
> > > >termAttribute.setEmpty();
> > > >termAttribute.copyBuffer(newBuffer, 0, newBuffer.length);
> > > >return true;
> > > >}
> > > >formulas = molFormula.normalizeToList(new
> > > > String(termAttribute.buffer()));
> > > >iterator = formulas.listIterator();
> > > >savedPositionIncrement += posIncAttribute.getPositionIncrement();
> > > >hasSavedState = true;
> > > >first = true;
> > > >saveState();
> > > >}
> > > >if (!iterator.hasNext()) {
> > > >posIncAttribute.setPositionIncrement(savedPositionIncrement);
> > > >savedPositionIncrement = 0;
> > > >hasSavedState = false;
> > > >continue;
> > > >}
> > > >String formula = iterator.next();
> > > >int startOffset = savedStartOffset;
> > > >
> > > >if (first) {
> > > >termAttribute.setEmpty();
> > > >}
> > > >int endOffset = savedStartOffset + formula.length();
> > > >System.out.printf("Writing formula %s %d to %d%n", formula,
> > > > startOffset, endOffset);;
> > > >termAttribute.append(formula);
> > > >offsetAttribute.setOffset(startOffset, endOffset);
> > > >savedStartOffset 

how to recover from OpenSearcher called on closed core

2017-09-28 Thread rubi.hali
Hi

we are using Solr 6.1.0 version. We have done a Master/Slave Setup where in
Slaves we have enabled replication polling after 300 seconds

But after every replication poll, we are getting an error : Index Fetch
Failed: opening NewSearcher called on closed core.

We have enabled  softcommit after 30 ms and hardcommit with 25000 docs
and 6 secs 
In slaves we have kept opensearcher true in case of hardcommit.

we are really not sure if this issue has anything to do with our commit
strategy.

Please let me know if there is any possible explanation for why this is
happening. 

From logs analysis, I observed the Caching Directory Factory is closing the
core and after that Replication Handler starts throwing this exception.

Does this exception will have any impact on memory consumption on slaves??



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


CDCR does not work

2017-09-28 Thread Jiani Yang
Hi,

Recently I am trying to use CDCR to do the replication of my solr cluster.
I have done exactly as what the tutorial says, the tutorial link is shown
below:
https://lucene.apache.org/solr/guide/6_6/cross-data-center-replication-cdcr.html

But I cannot see any change on target data center even every status looks
fine. I have been stuck in this situation for a week and could not find a
way to resolve it, could you please help me?

Please reply me ASAP! Thank you!

Best,
Jiani


how to recover from OpenSearcher called on closed core

2017-09-28 Thread rubi.hali
Hi

We are using solr6.1 version and have a master slave setup.

We have one master and two slaves .

We have enabled replication poll on slaves at an interval of 300s which
results in an error
and says *Index Fetch Failed : Open NewSearcher Called on closed core*

And our commit strategy involves both hardcommit and softcommit
our hardcommit conditions are
maxDocs 25000
maxTime 6
and OpenSearcher is false for master but in case of slaves it is true


and our Softcommit involves 30 as soft commit time.

Please let me know if this error is due to any of the configurations we have
done.

Analysis from logs is that Caching Directory is closing the core and when
replication happens it starts throwing the error.

Thanks in advance












--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Solr Beginner!!

2017-09-28 Thread Johnson, Jaya
Hi:
I am trying to ingest a few memos - they do not have any standard format (json, 
xml etc etc) but just plain text however the memos all follow some template. 
What I would like to do post-ingestion is to extract keywords and some values 
around it. So say for instance if the text contains the key word Outstanding 
Amount: 1000.  I would like to search for Outstanding Amount (I can do that 
using the query interface), but how do I extract the entire string Outstanding 
Amount +3or4 words from Solr.

I am really new to solr so any documentation etc would be super helpful. Is 
Solr also the right tool for this use case?

Thanks.
-

Moody's monitors email communications through its networks for regulatory 
compliance purposes and to protect its customers, employees and business and 
where allowed to do so by applicable law. The information contained in this 
e-mail message, and any attachment thereto, is confidential and may not be 
disclosed without our express permission. If you are not the intended recipient 
or an employee or agent responsible for delivering this message to the intended 
recipient, you are hereby notified that you have received this message in error 
and that any review, dissemination, distribution or copying of this message, or 
any attachment thereto, in whole or in part, is strictly prohibited. If you 
have received this message in error, please immediately notify us by telephone, 
fax or e-mail and delete the message and all of its attachments. Every effort 
is made to keep our network free from viruses. You should, however, review this 
e-mail message, as well as any attachment thereto, for viruses. We take no 
responsibility and have no liability for any computer virus which may be 
transferred via this e-mail message.

-


Solr cloud most stable version

2017-09-28 Thread Lars Karlsson
Hi, wanted to check if anyone can help guide with most stable version
between

6.3 and 6.6.1

Which should I choose ?

And, are there any performance tests that one can look at for each release?

Regards
Lars


Re: SOLR terminology

2017-09-28 Thread alessandro.benedetti
From the Solr wiki[1]:

*Logical*
/Collection/ : It is a collection of documents which share the same logical
domain and data structure

*Physical*
/Solr Node/ : It is a single instance of a Solr Server. From OS point of
view it is a single Java Process ( internally it is the Solr Web App
deployed in a Jetty Server)
/Solr Core/ : It is a single Index ( with its own configuration) within a
single Solr instance. It is the physical counterpart of a collection( or a
collection shard if the collection is fragmented)
/Solr Cluster /: It is a group of Solr Instances which collaborates through
the supervision of Apache zookeeper instance(s)


[1] https://lucene.apache.org/solr/guide/6_6/how-solrcloud-works.html



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: SOLR terminology

2017-09-28 Thread Emir Arnautović
Hi,
Let’s start from the top and introduce also Shards, Primaries and Replicas:
SolrCluster is a cluster of Solr Nodes. Nodes are part of the same cluster if 
reading configuration from the same “folder” of the same Zookeeper ensemble 
(ensemble = cluster in ZK terminology).
Node is the instance of Solr process - whenever you run “solr start -c…” you 
are starting a new Node. That is Jetty process hosting Solr servlet. 
A Solr Cluster can host one or more Collections, where a collection is a logical 
representation of an Index.
A Collection can be sharded into one or more Shards and each shard can have one or 
more copies called Replicas. One copy is declared as Primary and it can change.
Each Replica is one Solr Core which is the same as Solr Index in standalone 
mode.

HTH,
Emir

> On 28 Sep 2017, at 04:27, Gunalan V  wrote:
> 
> Hello,
> 
> Could someone please tell me the difference between Solr Core (core),
> Collections, Nodes, SolrCluster referred in SolrColud. It's bit confusing.
> 
> If there are any diagrammatic representation or example please share me.
> 
> 
> Thanks!



Re: SOLR terminology

2017-09-28 Thread Rick Leir

Gunalan,

Solr Core (core), is one-to-one with a Solr process and its data directory. It 
can be a shard, or part of a replica.
Collection - is one or more shards grouped together, and can be replicated for 
reliability, availability and performance
Node - is a machine in a Zookeeper group
SolrCluster - is a Zookeeper group of nodes

cheers -- Rick

On 2017-09-27 10:27 PM, Gunalan V wrote:

Hello,

Could someone please tell me the difference between Solr Core (core),
Collections, Nodes, SolrCluster referred in SolrColud. It's bit confusing.

If there are any diagrammatic representation or example please share me.


Thanks!





Re: Where the uploaded configset from SOLR into zookeeper ensemble resides?

2017-09-28 Thread Michael Kuhlmann
Do you find your configs in the Solr admin panel, in the Cloud --> Tree
folder?

-Michael

Am 28.09.2017 um 04:50 schrieb Gunalan V:
> Hello,
> 
> Could you please let me know where can I find the uploaded configset from
> SOLR into zookeeper ensemble ?
> 
> In docs it says they will be under "/configs/", but I'm not able to see
> the configs directory in zookeeper. Please let me know if I need to check
> somewhere else.
> 
> 
> Thanks!
> 



Re: Modifing create_core's instanceDir attribute

2017-09-28 Thread Michael Kuhlmann
I'd rather say you didn't quote the URL when sending it using curl.

Bash accepts the ampersand as a request to execute curl including the
URL up to CREATE in background - that's why the error is included within
the next output, followed by "Exit" - and then tries to execute the
following part of the URL as additional commands, which of course fails.

Just put the URL in quotes, and it will work much better.

-Michael

Am 27.09.2017 um 23:14 schrieb Miller, William K - Norman, OK - Contractor:
> I understand that this has to be done on the command line, but I don't know 
> where to put this structure or what it should look like.  Can you please be 
> more specific in this answer?  I have only been working with Solr for about 
> six months.
> 
> 
> 
> 
> ~~~
> William Kevin Miller
> 
> ECS Federal, Inc.
> USPS/MTSC
> (405) 573-2158
> 
> 
> -Original Message-
> From: Erick Erickson [mailto:erickerick...@gmail.com] 
> Sent: Wednesday, September 27, 2017 3:57 PM
> To: solr-user
> Subject: Re: Modifing create_core's instanceDir attribute
> 
> Standard command-line. You're doing this on the box itself, not through a 
> REST API.
> 
> Erick
> 
> On Wed, Sep 27, 2017 at 10:26 AM, Miller, William K - Norman, OK - Contractor 
>  wrote:
>> This is my first time to try using the core admin API.  How do I go about 
>> creating the directory structure?
>>
>>
>>
>>
>> ~~~
>> William Kevin Miller
>>
>> ECS Federal, Inc.
>> USPS/MTSC
>> (405) 573-2158
>>
>>
>> -Original Message-
>> From: Erick Erickson [mailto:erickerick...@gmail.com]
>> Sent: Wednesday, September 27, 2017 11:45 AM
>> To: solr-user
>> Subject: Re: Modifing create_core's instanceDir attribute
>>
>> Right, the core admin API is pretty low-level, it expects the base directory 
>> exists, you have to create the directory structure by hand.
>>
>> Best,
>> Erick
>>
>> On Wed, Sep 27, 2017 at 9:24 AM, Miller, William K - Norman, OK - Contractor 
>>  wrote:
>>> Thanks Erick for pointing me in this direction.  Unfortunately when I try 
>>> to us this I get an error.  Here is the command that I am using and the 
>>> response I get:
>>>
>>> https://solrserver:8983/solr/admin/cores?action=CREATE&name=mycore&ins
>>> tanceDir=/var/solr/data/mycore&dataDir=data&configSet=custom_configs
>>>
>>>
>>> [1] 32023
>>> [2] 32024
>>> [3] 32025
>>> -bash: https://solrserver:8983/solr/admin/cores?action=CREATE: No 
>>> such file or directory [4] 32026
>>> [1] Exit 127
>>> https://solrserver:8983/solr/adkmin/cores?action=CREATE
>>> [2] Donename=mycore
>>> [3]-DoneinstanceDir=/var/solr/data/mycore
>>> [4]+DonedataDir=data
>>>
>>>
>>> I even tried to use the UNLOAD action to remove a core and got the same 
>>> type of error as the -bash line above.
>>>
>>> I have tried searching online for an answer and have found nothing so far.  
>>> Any ideas why this error is occuring.
>>>
>>>
>>>
>>> ~~~
>>> William Kevin Miller
>>>
>>> ECS Federal, Inc.
>>> USPS/MTSC
>>> (405) 573-2158
>>>
>>> -Original Message-
>>> From: Erick Erickson [mailto:erickerick...@gmail.com]
>>> Sent: Tuesday, September 26, 2017 3:33 PM
>>> To: solr-user
>>> Subject: Re: Modifing create_core's instanceDir attribute
>>>
>>> I don't think you can. You can, however, use the core admin API to do 
>>> that,
>>> see:
>>> https://lucene.apache.org/solr/guide/6_6/coreadmin-api.html#coreadmin
>>> -
>>> api
>>>
>>> Best,
>>> Erick
>>>
>>> On Tue, Sep 26, 2017 at 1:14 PM, Miller, William K - Norman, OK - 
>>> Contractor  wrote:
>>>
 I know that when the create_core command is used that it sets the 
 core to the name of the parameter supplied with the “-c” option and 
 the instanceDir attribute in the http is also set to the name of the core.
 What I want is to tell the create_core to use a different 
 instanceDir parameter.  How can I go about doing this?





 I am using Solr 6.5.1 and it is running on a linux server using the 
 apache tomcat webserver.











 ~~~

 William Kevin Miller

 [image: ecsLogo]

 ECS Federal, Inc.

 USPS/MTSC

 (405) 573-2158






Re: Solrcloud configuration

2017-09-28 Thread Shashi Roushan
Hello All,

Thanks for the replies. After a long time, I found the solution for uploading the
configuration in Solr Cloud from the following link:

http://mtitek.com/tutorials/solr/collections.php

Regards,
Shashi Roushan

On Sep 20, 2017 3:42 AM, "John Bickerstaff" 
wrote:

This may also be of some assistance:

https://gist.github.com/maxivak/3e3ee1fca32f3949f052

I haven't tested, just found it.

On Tue, Sep 19, 2017 at 4:10 PM, John Bickerstaff 
wrote:

> This may be of some assistance...
>
> http://lucene.apache.org/solr/guide/6_6/
>
> There is a section discussing sharding and another section that includes
> the schema.
>
> On Tue, Sep 19, 2017 at 1:42 PM, Shashi Roushan 
> wrote:
>
>> Hello David
>>
>> No, I didn't read any documentation on the schema and DIH.
>>
>> Actually we are already using Solr 4 version. I am now upgrading to
>> solrcloud with shards. I have done lots of googling, but am not getting
>> relevant
>> information on DIH and schema with Solr shards. I am getting results with
>> the older
>> version of Solr.
>>
>>
>> On Sep 20, 2017 12:58 AM, "David Hastings" 
>> wrote:
>>
>> Did you read the documentation on the schema and the DIH?
>>
>> On Tue, Sep 19, 2017 at 3:04 PM, Shashi Roushan 
>> wrote:
>>
>> > Hi All,
>> >
>> > I need your help to configure solrcloud with shards.
>> > I have created collection and shards using solr6 and Zookeeper. It's
>> working
>> > fine.
>> > My problems are:
>> > Where do I put the schema and dbdataconfig files?
>> > How can I use DIH to import data from SQL server to solr?
>> >  In older version I was using schema and DIH to import data from SQL
>> > server.
>> >
>> > Please help.
>> >
>> > Regards
>> > Shashi Roushan
>> >
>>
>
>