Solrj does not support ltr ?

2018-06-18 Thread shreck





--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Solr Odbc for Parallel Sql integration with Tableau

2018-06-18 Thread Joel Bernstein
That's interesting that you were able to set up OpenLink. At Alfresco we've
done quite a bit of work on Solr's JDBC driver to integrate it with the
Alfresco repository, which uses Solr. But we haven't yet tackled the ODBC
setup. That will come very soon. To really take advantage of Tableau's
capabilities we will need to add joins to Solr's parallel SQL. Solr already
uses Apache Calcite, which has a join optimizer, so mainly this would
involve hooking up the various Streaming Expression joins.
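
In the meantime, Parallel SQL is already reachable from Java through the
bundled JDBC driver. A minimal sketch (the ZooKeeper address, collection and
field names below are placeholders for your own setup):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SolrSqlExample {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper host/port and collection name -- adjust to your cluster.
        String url = "jdbc:solr://localhost:9983?collection=myCollection";
        try (Connection con = DriverManager.getConnection(url);
             Statement stmt = con.createStatement();
             // fieldA is a placeholder field; LIMIT keeps the example small.
             ResultSet rs = stmt.executeQuery("SELECT id, fieldA FROM myCollection LIMIT 10")) {
            while (rs.next()) {
                System.out.println(rs.getString("id") + " -> " + rs.getString("fieldA"));
            }
        }
    }
}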

Joel Bernstein
http://joelsolr.blogspot.com/

On Mon, Jun 18, 2018 at 6:37 PM, Aroop Ganguly 
wrote:

> Ok I was able to set up the ODBC bridge (using OpenLink) and I see the
> collections popping up in Tableau too.
> But I am unable to actually get data flowing into Tableau reports because,
> Tableau keeps creating inner queries and Solr seems to hate inner queries.
> Is there a way to do inner queries in Solr Parallel Sql ?
>
> > On Jun 18, 2018, at 12:30 PM, Aroop Ganguly 
> wrote:
> >
> >
> > Hi Everyone
> >
> > I am not sure if something has been done on this yet, though I did see a
> JIRA with links to the parallel sql documentation, but I do not think that
> answers the question.
> >
> > I love the jdbc driver and it works well for many UIs but there are
> other systems that need an ODBC driver.
> >
> > Can anyone share any guidance as to how this can be done or has been
> done by others.
> >
> > Thanks
> > Aroop
>
>


Re: How to find out which search terms have matches in a search

2018-06-18 Thread Derek Poh

Hi Erik

I have explored the facet query but it does not really help. Thank you for
your suggestion.


On 12/6/2018 7:49 PM, Erik Hatcher wrote:

Derek -

One trick I like to do is try various forms of a query all in one go.   With 
facet=on, you can:

   facet.query=big brown bear
   facet.query=big brown
   facet.query=brown bear
   facet.query=big
   facet.query=brown
   facet.query=bear

The returned counts give you an indication of what queries matched docs in the 
result set, and which didn’t.   If you did this with q=*:* you’d see how each 
of those matched across the entire collection.
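
From SolrJ the same trick looks roughly like this sketch ('client' is an
existing SolrClient; the query strings are just the variants above):

SolrQuery q = new SolrQuery("big brown bear");   // the user's original query
q.setFacet(true);
// One facet.query per variant; each count tells you whether that form matched anything.
q.addFacetQuery("big brown bear");
q.addFacetQuery("big brown");
q.addFacetQuery("brown bear");
q.addFacetQuery("big");
q.addFacetQuery("brown");
q.addFacetQuery("bear");
QueryResponse rsp = client.query(q);
System.out.println(rsp.getFacetQuery());         // map of facet.query -> count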

Grouping and group.query could be used similarly.

I’ve used facet.query to do some Venn diagramming of overlap of search results like 
that.   An oldie but a goodie: 
https://www.slideshare.net/lucenerevolution/hatcher-erik-rapid-prototyping-with-solr/12
 


4.10.4?   woah

Erik Hatcher
Senior Solutions Architect, Lucidworks.com



On Jun 11, 2018, at 11:16 PM, Derek Poh  wrote:

Hi

How can I find out which search terms have matches in a search?

Eg.
The search terms are "big brown bear".And only "big" and "brown" have matches 
in the searchresult.
Can Solr return this information that "big" and "brown" have matches in the 
search result?
I want touse this information to display on the search result page that "big" and 
"brown" have matches.
Somethinglike "big brown bear".

Amusing solr 4.10.4.

Derek


Re: How to find out which search terms have matches in a search

2018-06-18 Thread Derek Poh
Seems like the Highlight feature could help but with some workaround.
Will need to explore more on it. Thank you.
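
From SolrJ, turning highlighting on for the query looks roughly like this
sketch (the field name and pre/post tags are placeholders; 'client' is an
existing SolrClient):

SolrQuery q = new SolrQuery("big brown bear");
q.setHighlight(true);
q.addHighlightField("description");        // placeholder field name
q.setHighlightSimplePre("<em>");
q.setHighlightSimplePost("</em>");
QueryResponse rsp = client.query(q);
// docId -> (field -> highlighted snippets); terms that matched come back wrapped in the tags.
Map<String, Map<String, List<String>>> highlighting = rsp.getHighlighting();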


On 12/6/2018 5:32 PM, Alessandro Benedetti wrote:

I would recommend looking into the Highlight feature [1].
There are a few implementations and any of them should be all right for your user
requirement.

Regards

[1] https://lucene.apache.org/solr/guide/7_3/highlighting.html



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html






Re: tlogs not deleting

2018-06-18 Thread Susheel Kumar
You may have to DISABLEBUFFER on the source cluster to get rid of tlogs.
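
Assuming the tlogs are being held because CDCR buffering is enabled, the call
is roughly (collection name is a placeholder, and it should be run against the
source cluster):

http://host:8983/solr/<collection>/cdcr?action=DISABLEBUFFER

After that, old tlogs should start getting pruned on subsequent hard commits.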

On Mon, Jun 18, 2018 at 6:13 PM, Brian Yee  wrote:

> So I've read a bunch of stuff on hard/soft commits and tlogs. As I
> understand, after a hard commit, solr is supposed to delete old tlogs
> depending on the numRecordsToKeep and maxNumLogsToKeep values in the
> autocommit settings in solrconfig.xml. I am occasionally seeing solr fail
> to do this and the tlogs just build up over time and eventually we run out
> of disk space on the VM and this causes problems for us. This does not
> happen all the time, only sometimes. I currently have a tlog directory that
> has 123G worth of tlogs. The last hard commit on this node was 10 minutes
> ago but these tlogs date back to 3 days ago.
>
> We have sometimes found that restarting solr on the node will get it to
> clean up the old tlogs, but we really want to find the root cause and fix
> it if possible so we don't keep getting disk space alerts and have to adhoc
> restart nodes. Has anyone seen an issue like this before?
>
> My update handler settings look like this:
>   
>
>   
>
>   ${solr.ulog.dir:}
>   ${solr.ulog.numVersionBuckets:
> 65536}
> 
> 
> 60
> 25
> false
> 
> 
> 12
> 
>
>   
> 100
>   
>
>   
>


Re: Solr Odbc for Parallel Sql integration with Tableau

2018-06-18 Thread Erick Erickson
Asher:

Please follow the instructions here:
http://lucene.apache.org/solr/community.html#mailing-lists-irc. You
must use the _exact_ same e-mail as you used to subscribe.

If the initial try doesn't work and following the suggestions at the
"problems" link doesn't work for you, let us know. But note you need
to show us the _entire_ return header to allow anyone to diagnose the
problem.

Best,
Erick

On Mon, Jun 18, 2018 at 4:08 PM, Asher Shih  wrote:
> unsubscribe
>
> On Mon, Jun 18, 2018 at 3:37 PM, Aroop Ganguly  
> wrote:
>> Ok I was able to set up the ODBC bridge (using OpenLink) and I see the
>> collections popping up in Tableau too.
>> But I am unable to actually get data flowing into Tableau reports because, 
>> Tableau keeps creating inner queries and Solr seems to hate inner queries.
>> Is there a way to do inner queries in Solr Parallel Sql ?
>>
>>> On Jun 18, 2018, at 12:30 PM, Aroop Ganguly  wrote:
>>>
>>>
>>> Hi Everyone
>>>
>>> I am not sure if something has been done on this yet, though I did see a 
>>> JIRA with links to the parallel sql documentation, but I do not think that 
>>> answers the question.
>>>
>>> I love the jdbc driver and it works well for many UIs but there are other 
>>> systems that need an ODBC driver.
>>>
>>> Can anyone share any guidance as to how this can be done or has been done 
>>> by others.
>>>
>>> Thanks
>>> Aroop
>>


Re: Solr Odbc for Parallel Sql integration with Tableau

2018-06-18 Thread Asher Shih
unsubscribe

On Mon, Jun 18, 2018 at 3:37 PM, Aroop Ganguly  wrote:
> Ok I was able to set up the ODBC bridge (using OpenLink) and I see the
> collections popping up in Tableau too.
> But I am unable to actually get data flowing into Tableau reports because, 
> Tableau keeps creating inner queries and Solr seems to hate inner queries.
> Is there a way to do inner queries in Solr Parallel Sql ?
>
>> On Jun 18, 2018, at 12:30 PM, Aroop Ganguly  wrote:
>>
>>
>> Hi Everyone
>>
>> I am not sure if something has been done on this yet, though I did see a 
>> JIRA with links to the parallel sql documentation, but I do not think that 
>> answers the question.
>>
>> I love the jdbc driver and it works well for many UIs but there are other 
>> systems that need an ODBC driver.
>>
>> Can anyone share any guidance as to how this can be done or has been done by 
>> others.
>>
>> Thanks
>> Aroop
>


Configuring SolrJ for Kerberized Environments

2018-06-18 Thread Everly Okorji
Hi,

So, for context, I have very little experience with Kerberos. The
environment has SolrCloud configured, and I am using SolrJ libraries from
Solr 7.0.0 and attempting to set up my application to be able to make Solr
requests when Kerberos is enabled. Specifically, I am making a request to
add solr fields to my schema. The same request is successful when Kerberos
is not enabled. Also, note that I am able to

I went over the documentation which looks to be outdated - at least the *Using
SolrJ with a Kerberized Solr* section - as it references a removed class
*Krb5HttpClientConfigurer*. I tried to use the *Krb5HttpClientBuilder*
class to simulate the behavior, but it seems that my configuration is
incomplete or incorrect, as I have gotten a number of errors depending on
what was tried:

- I attempted to use the Krb5HttpClientBuilder to replicate the behavior,
but I kept getting an error with following cause when the request is made:
Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot
retry request with a non-repeatable request entity.
at
org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:225)
at
org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:88)
at
org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at
org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
... 38 more


Based on this, I included a line to allow for preemptive authentication by
setting the following before configuring the builder:
HttpClientUtil.addRequestInterceptor(new PreemptiveAuth(new
SPNegoScheme()));

Based on this new configuration, I am now seeing a checksum failure error:
Caused by:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http:///solr/pantheon: Expected mime type
application/octet-stream but got text/html. 


Error 403 GSSException: Failure unspecified at GSS-API level
(Mechanism level: Checksum failed)

HTTP ERROR 403
Problem accessing /solr/pantheon/schema. Reason:
GSSException: Failure unspecified at GSS-API level (Mechanism
level: Checksum failed)Powered by
Jetty://




at
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:591)
at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:253)
at
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:242)
at
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
at
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:436)
... 31 more

I understand that the GSS API is responsible for actually fetching the
Kerberos ticket for a client, and then authenticating and authorizing my
application to talk to the solr server. I'm just not sure if the
application is pulling the correct credentials or where exactly this
failure is happening, if it is related to my configuration or if I am just
using an untested approach.

This is what my jaas config file looks like. According to the docs, I first
tried with just the *Client* configuration, and then I added the
*SolrJClient* config to see if that helped. No change in behavior.
Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
useTicketCache=true
debug=true
keyTab=""
principal="";
};

SolrJClient {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab=""
  storeKey=true
  useTicketCache=true
  debug=true
  principal="";
};

Random facts to note, including some of the many things I tried:
- I tried setting my jaas config useTicketCache=false, same error.
- The system property java.security.auth.login.config is set to point to
the jaas config for the application.
- The solr parameter PROP_FOLLOW_REDIRECTS is set to false.
- Zookeeper is used. I tried using the solr url instead of a zkUrl when
building the CloudSolrClient, no luck there either.


Could this also be a problem with my principals or jaas configuration? More
specifically, what are the correct steps to follow on SolrJ 7.0.0 on a
kerberized environment? If I have that and can follow step-by-step, at
least I know where things fail. At the moment, I'm running around in
circles and not sure what I'm looking for. A lot of StackOverflow questions
were looked at and tried, but either I'm stepping on my own toes or my
issue seems to be unique. Hopefully someone can spot something I missed.


Regards,
Everly


Re: Solr Odbc for Parallel Sql integration with Tableau

2018-06-18 Thread Aroop Ganguly
Ok I was able to set up the ODBC bridge (using OpenLink) and I see the
collections popping up in Tableau too.
But I am unable to actually get data flowing into Tableau reports because, 
Tableau keeps creating inner queries and Solr seems to hate inner queries.
Is there a way to do inner queries in Solr Parallel Sql ?

> On Jun 18, 2018, at 12:30 PM, Aroop Ganguly  wrote:
> 
> 
> Hi Everyone
> 
> I am not sure if something has been done on this yet, though I did see a JIRA 
> with links to the parallel sql documentation, but I do not think that answers 
> the question.
> 
> I love the jdbc driver and it works well for many UIs but there are other 
> systems that need an ODBC driver.
> 
> Can anyone share any guidance as to how this can be done or has been done by 
> others.
> 
> Thanks
> Aroop



tlogs not deleting

2018-06-18 Thread Brian Yee
So I've read a bunch of stuff on hard/soft commits and tlogs. As I understand, 
after a hard commit, solr is supposed to delete old tlogs depending on the 
numRecordsToKeep and maxNumLogsToKeep values in the autocommit settings in 
solrconfig.xml. I am occasionally seeing solr fail to do this and the tlogs 
just build up over time and eventually we run out of disk space on the VM and 
this causes problems for us. This does not happen all the time, only sometimes. 
I currently have a tlog directory that has 123G worth of tlogs. The last hard 
commit on this node was 10 minutes ago but these tlogs date back to 3 days ago.

We have sometimes found that restarting solr on the node will get it to clean 
up the old tlogs, but we really want to find the root cause and fix it if 
possible so we don't keep getting disk space alerts and have to adhoc restart 
nodes. Has anyone seen an issue like this before?

My update handler settings look like this:
  

  

  ${solr.ulog.dir:}
  ${solr.ulog.numVersionBuckets:65536}


60
25
false


12


  
100
  

  


Re: Streaming expressions and fetch()

2018-06-18 Thread Dariusz Wojtas
Hi,
I think this might give some clue.
I tried to reproduce the issue with a collection called testCloud.

fetch(testCloud1,
  search(testCloud1, q="*:*", fq="type:name", fl="parentId",
sort="parentId asc"),
  fl="id,name",
  on="parentId=id")

The expression above produces 3 log entries presented below (just cut the
content before 'webapp' in each line to save space):

webapp=/solr path=/stream
params={expr=fetch(testCloud1,%0a++search(testCloud1,+q%3D"*:*",+fq%3D"type:name",+fl%3D"parentId",+sort%3D"parentId+asc"),%0a++fl%3D"id,name",%0a++on%3D"parentId%3Did")&_=1529178931117}
status=0 QTime=1
webapp=/solr path=/select
params={q=*:*=false=parentId=type:name=parentId+asc=json=2.2}
hits=1 status=0 QTime=1
webapp=/solr path=/select
params={q={!+df%3Did+q.op%3DOR+cache%3Dfalse+}+123=false=id,name,_version_=_version_+desc=50=json=2.2}
hits=0 status=0 QTime=1

If I use the 3rd line parameters with an url:
http://10.0.75.1:8983/solr/testCloud1/select?q={!+df%3Did+q.op%3DOR+cache%3Dfalse+}+123=false=id,name,_version_=_version_+desc=50=json=2.2

then the result set is empty. It searches for an 'id' value of 123.
BUT if I remove the plus sign before the '123' and have url like this:
http://10.0.75.1:8983/solr/testCloud1/select?q={!+df%3Did+q.op%3DOR+cache%3Dfalse+}123=false=id,name,_version_=_version_+desc=50=json=2.2

THEN IT RETURNS SINGLE ROW WITH EXPECTED VALUES.
Maybe this gives some light? Maybe it's about the enriching query syntax?

I have tried with fetch containing query that returns more identifiers.
In the 3rd log entry the identifiers start with a plus sign and are
separated with pluses, as in the log entry below

q={!+df%3Did+q.op%3DOR+cache%3Dfalse+}+123+124=false=id,name,_version_=_version_+desc=50=json=2.2
No results returned, and the data is not enriched with additional
attributes.

Best regards,
Darek


On Mon, Jun 18, 2018 at 3:07 PM, Joel Bernstein  wrote:

> There is a test case working that is basically the same construct that you
> are having issues with. So, I think the next step is to try and reproduce
> the problem that you are seeing in a test case.
>
> If you have a small sample test dataset I can use to reproduce the error
> please create a jira ticket and I will work on the issue.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Sun, Jun 17, 2018 at 2:40 PM, Dariusz Wojtas  wrote:
>
> > Hi,
> > I am trying to use streaming expressions with SOLR 7.3.1.
> > I have successfully used innerJoin, leftOuterJoin and several other
> > functions but failed to achieve expected results with the fetch()
> function.
> >
> > Example below is simplified, in reality the base search() function uses
> > fuzzy matching and scoring. And works perfectly.
> > But I need to enrich the search results with additional column from the
> > same collection.
> > search() call does a query on nested documents, and returns parentId
> (yes,
> > i know there is _root_, tried it as well) + some calculated custom
> values,
> > requiring some aggregation calls, like rollup(). This part works
> perfectly.
> > But then I want to enrich the resultset with attributes from the top
> level
> > document, where "parentId=id".
> > And all my attempts to fetch additional data have failed, the fetch()
> call
> > below always gives the same results as the search() call inside.
> >
> > fetch(users,
> >   search(users, q="*:*", fq="type:name", fl="parentId",
> sort="parentId
> > asc"),
> >   fl="id,name",
> >   on="parentId=id")
> >
> > As I understand fetch() should retrieve only records narrowed by the
> > "parentId" results.
> > If I call leftOuterJoin(), then I lose the benefit of such nice
> narrowing
> > call.
> > Any clue what i am doing wrong with fetch()?
> >
> > Best regards,
> > Darek
> >
>


Solr Odbc for Parallel Sql integration with Tableau

2018-06-18 Thread Aroop Ganguly


Hi Everyone

I am not sure if something has been done on this yet, though I did see a JIRA 
with links to the parallel sql documentation, but I do not think that answers 
the question.

I love the jdbc driver and it works well for many UIs but there are other 
systems that need an ODBC driver.

Can anyone share any guidance as to how this can be done or has been done by 
others.

Thanks
Aroop

Re: Connection Problem with CloudSolrClient.Builder().build When passing a Zookeeper Addresses and RootParam

2018-06-18 Thread Andy C
From the error, I think the issue is with your zookeeperList definition.

Try changing:


zookeeperList.add("http://100.12.119.10:2281;);
zookeeperList.add("http://100.12.119.10:2282;);
zookeeperList.add("http://100.12.119.10:2283;);

to


zookeeperList.add("100.12.119.10:2281");
zookeeperList.add("100.12.119.10:2282");
zookeeperList.add("100.12.119.10:2283");

If you are not using a chroot in Zookeeper then just use chrootOption =
Optional.empty(); (as you have done).

Intent of my code was to support both using a chroot and not using a
chroot. The value of _zkChroot is read from a config file in code not shown.
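
Putting both changes together, a minimal sketch (hosts taken from your snippet;
the collection name is a placeholder):

List<String> zookeeperList = new ArrayList<String>();
zookeeperList.add("100.12.119.10:2281");
zookeeperList.add("100.12.119.10:2282");
zookeeperList.add("100.12.119.10:2283");

// No chroot in ZooKeeper, so pass Optional.empty() rather than null.
CloudSolrClient solrClient =
    new CloudSolrClient.Builder(zookeeperList, Optional.empty()).build();
solrClient.setDefaultCollection("myCollection");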

- Andy -


Collection level Permission

2018-06-18 Thread Antony A
Hi,

I am trying to find some help with the format of the collection name under
permissions.

The collection can be "*", "null", "collection_name". Is there a way to
group a set of collections?

example:

This format is not working.

{"collection": ["colname1, colname2, colname3"], "path": "/select", "role":
["dev"], "index": 1},

Thanks,
Antony


Re: Connection Problem with CloudSolrClient.Builder().build When passing a Zookeeper Addresses and RootParam

2018-06-18 Thread THADC
Thanks. So I tried what you had, however, you are not specifying _zkChroot,
so I don't know what to put there. I don't know what that value would be for
me, or if I even need it. So I commented out most of your example to:

Optional<String> chrootOption = null;
  //  if (StringUtils.isNotBlank(_zkChroot)) 
  //  { 
  // chrootOption = Optional.of(_zkChroot); 
 //   } 
//else 
//{ 
   chrootOption = Optional.empty(); 

by the way which StringUtils API are you using? There are a couple that have
that isNotBlank.

In any event, I am now getting a different exception when trying to create
the client:

 ERROR [] - 
java.lang.RuntimeException: Error committing document
 
(a bunch of stack details left out here..)

Caused by: org.apache.solr.common.SolrException:
java.lang.IllegalArgumentException: Invalid path string
"//172.16.120.14:2281,http://172.16.120.14:2282,http://172.16.120.14:2283;
caused by empty node name specified @1
at
org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:171)
at
org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:120)
at
org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:110)
at
org.apache.solr.common.cloud.ZkStateReader.<init>(ZkStateReader.java:285)
at
org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.connect(ZkClientClusterStateProvider.java:155)
at
org.apache.solr.client.solrj.impl.CloudSolrClient.connect(CloudSolrClient.java:399)
at
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:828)
at
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:818)

So, perhaps I need the root specified (since it says "caused by empty node
name specified @1")? If so, I don't know what it would be.

Thanks for any insights.




--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Connection Problem with CloudSolrClient.Builder().build When passing a Zookeeper Addresses and RootParam

2018-06-18 Thread Jason Gerlowski
Hi,

Yes, Andy has the right idea.  If no zk-chroot is being used,
"Optional.empty()" is the correct value to specify, not null.

This API is a bit trappy (or at least unintuitive), and we're in the
process of hashing out some doc improvements (and/or API changes).  If
you're curious or would otherwise like to weigh in, check out SOLR-12309.

Best,

Jason

On Mon, Jun 18, 2018 at 1:09 PM, Andy C  wrote:

> I am using the following (Solr 7.3.1) successfully:
>
> import java.util.Optional;
>
>  Optional<String> chrootOption = null;
>  if (StringUtils.isNotBlank(_zkChroot))
>  {
> chrootOption = Optional.of(_zkChroot);
>  }
>  else
>  {
> chrootOption = Optional.empty();
>  }
>  CloudSolrClient client = new CloudSolrClient.Builder(_zkHostList,
> chrootOption).build();
>
> Adapted from code I found somewhere (unit test?). Intent is to support the
> option of configuring a chroot or not (stored in "_zkChroot")
>
> - Andy -
>
> On Mon, Jun 18, 2018 at 12:53 PM, THADC  >
> wrote:
>
> > Hello,
> >
> > I am using solr 7.3 and zookeeper 3.4.10. I have custom client code that
> is
> > supposed to connect to a zookeeper cluster. For the sake of clarity, the
> > main code focus:
> >
> >
> > private synchronized void initSolrClient()
> > {
> > List<String> zookeeperList = new ArrayList<String>();
> >
> > zookeeperList.add("http://100.12.119.10:2281;);
> > zookeeperList.add("http://100.12.119.10:2282;);
> > zookeeperList.add("http://100.12.119.10:2283;);
> >
> > String collectionName = "myCollection"
> >
> > log.debug("in initSolrClient(), collectionName: " +
> > collectionName);
> >
> > try {
> > solrClient = new CloudSolrClient.Builder(
> zookeeperList,
> > null).build();
> >
> > } catch (Exception e) {
> > log.info("Exception creating solr client object.
> > ");
> > e.printStackTrace();
> > }
> > solrClient.setDefaultCollection(collectionName);
> > }
> >
> > Before executing, I test that all three zoo nodes are running
> > (./bin/zkServer.sh status zoo.cfg, ./bin/zkServer.sh status zoo2.cfg,
> > ./bin/zkServer.sh status zoo3.cfg). The status shows the quorum is
> > up and running, with one nodes as the leader and the other two as
> > followers.
> >
> > When I execute my java client to connect to the zookeeper cluster, I get
> :
> >
> > java.lang.NullPointerException
> > at
> > org.apache.solr.client.solrj.impl.CloudSolrClient$Builder.<init>(CloudSolrClient.java:1387)
> >
> >
> > I am assuming it has a problem with my null value for zkChroot, but not
> > certain. The API says zkChroot is the path to the root ZooKeeper node
> > containing Solr data. May be empty if Solr-data is located at the
> ZooKeeper
> > root.
> >
> > I am confused on what exactly should go here, and when it can be null. I
> > cannot find any coding examples.
> >
> > Any help greatly appreciated.
> >
> >
> >
> >
> > --
> > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
> >
>


Re: Connection Problem with CloudSolrClient.Builder().build When passing a Zookeeper Addresses and RootParam

2018-06-18 Thread Andy C
I am using the following (Solr 7.3.1) successfully:

import java.util.Optional;

 Optional<String> chrootOption = null;
 if (StringUtils.isNotBlank(_zkChroot))
 {
chrootOption = Optional.of(_zkChroot);
 }
 else
 {
chrootOption = Optional.empty();
 }
 CloudSolrClient client = new CloudSolrClient.Builder(_zkHostList,
chrootOption).build();

Adapted from code I found somewhere (unit test?). Intent is to support the
option of configuring a chroot or not (stored in "_zkChroot")

- Andy -

On Mon, Jun 18, 2018 at 12:53 PM, THADC 
wrote:

> Hello,
>
> I am using solr 7.3 and zookeeper 3.4.10. I have custom client code that is
> supposed to connect to a zookeeper cluster. For the sake of clarity, the
> main code focus:
>
>
> private synchronized void initSolrClient()
> {
> List<String> zookeeperList = new ArrayList<String>();
>
> zookeeperList.add("http://100.12.119.10:2281;);
> zookeeperList.add("http://100.12.119.10:2282;);
> zookeeperList.add("http://100.12.119.10:2283;);
>
> String collectionName = "myCollection"
>
> log.debug("in initSolrClient(), collectionName: " +
> collectionName);
>
> try {
> solrClient = new 
> CloudSolrClient.Builder(zookeeperList,
> null).build();
>
> } catch (Exception e) {
> log.info("Exception creating solr client object.
> ");
> e.printStackTrace();
> }
> solrClient.setDefaultCollection(collectionName);
> }
>
> Before executing, I test that all three zoo nodes are running
> (./bin/zkServer.sh status zoo.cfg, ./bin/zkServer.sh status zoo2.cfg,
> ./bin/zkServer.sh status zoo3.cfg). The status shows the quorum is
> up and running, with one nodes as the leader and the other two as
> followers.
>
> When I execute my java client to connect to the zookeeper cluster, I get :
>
> java.lang.NullPointerException
> at
> org.apache.solr.client.solrj.impl.CloudSolrClient$Builder.<init>(CloudSolrClient.java:1387)
>
>
> I am assuming it has a problem with my null value for zkChroot, but not
> certain. The API says zkChroot is the path to the root ZooKeeper node
> containing Solr data. May be empty if Solr-data is located at the ZooKeeper
> root.
>
> I am confused on what exactly should go here, and when it can be null. I
> cannot find any coding examples.
>
> Any help greatly appreciated.
>
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Connection Problem with CloudSolrClient.Builder().build When passing a Zookeeper Addresses and RootParam

2018-06-18 Thread THADC
Hello,

I am using solr 7.3 and zookeeper 3.4.10. I have custom client code that is
supposed to connect to a zookeeper cluster. For the sake of clarity, the
main code focus:


private synchronized void initSolrClient()
{   
List<String> zookeeperList = new ArrayList<String>();

zookeeperList.add("http://100.12.119.10:2281;);
zookeeperList.add("http://100.12.119.10:2282;);
zookeeperList.add("http://100.12.119.10:2283;);

String collectionName = "myCollection"

log.debug("in initSolrClient(), collectionName: " + 
collectionName);

try {
solrClient = new CloudSolrClient.Builder(zookeeperList, 
null).build();

} catch (Exception e) {
log.info("Exception creating solr client object. ");
e.printStackTrace();
}
solrClient.setDefaultCollection(collectionName);
}

Before executing, I test that all three zoo nodes are running
(./bin/zkServer.sh status zoo.cfg, ./bin/zkServer.sh status zoo2.cfg,
./bin/zkServer.sh status zoo3.cfg). The status shows the quorum is
up and running, with one nodes as the leader and the other two as followers.

When I execute my java client to connect to the zookeeper cluster, I get :

java.lang.NullPointerException
at
org.apache.solr.client.solrj.impl.CloudSolrClient$Builder.<init>(CloudSolrClient.java:1387)


I am assuming it has a problem with my null value for zkChroot, but not
certain. The API says zkChroot is the path to the root ZooKeeper node
containing Solr data. May be empty if Solr-data is located at the ZooKeeper
root.

I am confused on what exactly should go here, and when it can be null. I
cannot find any coding examples.

Any help greatly appreciated.




--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Sorting on ip address

2018-06-18 Thread David Hastings
Sorry, I meant converting an IP address to a numeric value; example in MySQL:
https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_inet-aton
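
A minimal Java sketch of the same conversion, which could be done client-side
before indexing the value into a numeric Solr field of your choice:

// Equivalent of MySQL's INET_ATON for IPv4: "10.1.2.3" -> 167838211L.
static long ipv4ToLong(String ip) {
    long value = 0;
    for (String octet : ip.split("\\.")) {
        value = (value << 8) | Integer.parseInt(octet);
    }
    return value;
}

Sorting on the resulting long field then orders documents by IP address.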

On Mon, Jun 18, 2018 at 12:35 PM, root23  wrote:

> I am sorry, I am not sure what you mean by store as atom. Is that a
> fieldType
> in solr?
> I couldnt find it in here
> https://lucene.apache.org/solr/guide/6_6/field-types-
> included-with-solr.html
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Sorting on ip address

2018-06-18 Thread root23
I am sorry, I am not sure what you mean by store as atom. Is that a fieldType
in solr?
I couldnt find it in here
https://lucene.apache.org/solr/guide/6_6/field-types-included-with-solr.html



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Sorting on ip address

2018-06-18 Thread Dave
Store it as an atom rather than an IP address.

> On Jun 18, 2018, at 12:14 PM, root23  wrote:
> 
> Hi all,
> is there a built-in data type which I can use for an IP address that can
> provide sorting of IP addresses based on the class? If not, then what is the
> best way to sort based on IP address?
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Sorting on ip address

2018-06-18 Thread root23
Hi all,
is there a built-in data type which I can use for an IP address that can
provide sorting of IP addresses based on the class? If not, then what is the
best way to sort based on IP address?



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Securying ONLY the web interface console

2018-06-18 Thread Amanda Shuman
Hi Shawn et al,

As a follow-up to this - then how would you solve the issue? I tried to use
the instructions to set up basic authentication in solr (per a Stack
Overflow post) and it worked to secure things, but the web app couldn't
access solr. Tampering with the app code - which is the solr plug-in used
for Omeka (https://github.com/scholarslab/SolrSearch) - would require a lot
of extra work, so I'm wondering if there's a simpler solution. One of the
developers on that told me to do a reverse proxy like the second poster on
this chain more or less suggests. But from what I understand of what you
wrote, this is not ideal because it only protects the admin UI panel and
not everything else. So how then should I secure everything with the
exception of calls coming from this web app?

Best,
Amanda


--
Dr. Amanda Shuman
Post-doc researcher, University of Freiburg, The Maoist Legacy Project

PhD, University of California, Santa Cruz
http://www.amandashuman.net/
http://www.prchistoryresources.org/
Office: +49 (0) 761 203 4925


On Mon, Mar 19, 2018 at 11:03 PM, Shawn Heisey  wrote:

> On 3/19/2018 11:19 AM, Jesus Olivan wrote:
> > i'm trying to password protect only Solr web interface (not queries
> > launched from my app). I'm currently using SolrCloud 6.6.0 with external
> > zookeepers. I've read tons of Docs about it, but i couldn't find a proper
> > way to secure ONLY the web admin console. Can anybody give me some light
> > about it, please? =)
>
> When you add authentication, it's not actually the admin UI that needs
> authentication.  It's all the API requests (queries and the like) that
> the admin UI makes which require authentication.
>
> The admin UI itself is completely static HTML, CSS, Javascript, and
> images -- it doesn't have ANY information about your installation.
> Requiring authentication for that doesn't make any sense at all --
> there's nothing sensitive in those files.
>
> When you access the admin UI, the UI pieces are downloaded to your
> browser, and then the UI actually runs in your browser, accessing the
> API endpoints.  When the UI running in your browser first accesses one
> of those endpoints, you get the authentication prompt.
>
> If we only secured the admin UI and not the API, then somebody who has
> direct access to your Solr server could do whatever they wanted.  The
> admin UI is just a convenience.  Everything it does can be done directly.
>
> Thanks,
> Shawn
>
>


SOLR migration

2018-06-18 Thread Ana Mercan (RO)
Hi,

I have the following scenario, I'm having a shared cluster solr
installation environment (app server 1-app server 2 load balanced) which
has 4 solr instances.

After reviewing the space audit we have noticed that the partition where
the installation resides is too big versus what is used in terms of space.

Therefore we have installed a new drive which is smaller and now we want to
migrate from the old drive (E:) to the new drive (F:).

Can you please provide an official answer whether this is a supported
scenario?

If yes, will you please share the steps with us?

Thanks,

Ana



Re: Achieving AutoComplete feature using Solrj client

2018-06-18 Thread Alessandro Benedetti
Indeed, you first configure it in the solrconfig.xml ( manually).

Then you can query and parse the response as you like with the SolrJ client
library.
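
Once it is configured, querying it from SolrJ looks roughly like this sketch
(the handler path and dictionary name are whatever you configured in
solrconfig.xml):

SolrQuery query = new SolrQuery();
query.setRequestHandler("/suggest");                 // handler path from solrconfig.xml
query.setParam("suggest", "true");
query.setParam("suggest.dictionary", "mySuggester"); // dictionary name from solrconfig.xml
query.setParam("suggest.q", "sol");                  // the user's partial input
QueryResponse response = client.query(query);
// Structured access to the suggestions (added in SOLR-7719).
SuggesterResponse suggesterResponse = response.getSuggesterResponse();
Map<String, List<String>> suggestions = suggesterResponse.getSuggestedTerms();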

Cheers



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


some solr replicas down

2018-06-18 Thread Satya Marivada
Hi, We are using solr 6.3.0 and a collection has 3 of 4 replicas down and 1
is up and serving.

I see a single-line error repeating in the logs as below; no other specific
exception apart from it. Wondering what this message is saying: is it
the cause of the nodes being down? But I saw that this happened even before the
replicas went down.

2018-06-18 04:45:51.818 ERROR (qtp1528637575-27215) [c:poi s:shard1
r:core_node5 x:poi_shard1_replica3] o.a.s.s.PKIAuthenticationPlugin Invalid
key request timestamp: 1529297138215 , received timestamp: 1529297151817 ,
TTL: 5000

Thanks,
Satya


Re: Streaming Expressions: Merge array values? Inverse of cartesianProduct()

2018-06-18 Thread Joel Bernstein
You are doing things correctly. I was incorrect about the behavior of the
group() operation.

I think the behavior you are looking for should be done using reduce() but
we'll need to create a reduce operation that does this. If you want to
create a ticket we can work through exactly how the operation would work.

Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Jun 15, 2018 at 8:04 AM, Christian Spitzlay <
christian.spitz...@biologis.com> wrote:

> Hi,
>
> I had come across the reduce function in the docs but
> I have a hard time getting it to work; I haven't found any documentation
> on it
> or its parameters, and the source code of the GroupOperation doesn't
> explain it either ...
> For example, what is the "n" parameter about?
>
>
> I constructed a source stream to produce the input from my second example:
>
> merge(
> sort(cartesianProduct(tuple(k1="1", k2=array(a)), k2, productSort="k1
> asc"), by="k1 asc"),
> sort(cartesianProduct(tuple(k1="2", k2=array(b,c)), k2,
> productSort="k1 asc"), by="k1 asc"),
> on="k1 asc"
>   )
>
> --->
>
> {
>   "result-set": {
> "docs": [
>   {
> "k1": "1",
> "k2": "a"
>   },
>   {
> "k1": "2",
> "k2": "b"
>   },
>   {
> "k1": "2",
> "k2": "c"
>   },
>   {
> "EOF": true,
> "RESPONSE_TIME": 0
>   }
> ]
>   }
> }
>
>
>
> Then wrapped in a reduce function:
>
>
> reduce(
>   merge(
> sort(cartesianProduct(tuple(k1="1", k2=array(a)), k2, productSort="k1
> asc"), by="k1 asc"),
> sort(cartesianProduct(tuple(k1="2", k2=array(b,c)), k2,
> productSort="k1 asc"), by="k1 asc"),
> on="k1 asc"
>   ),
>   by="k1",
>   group(sort="k1 asc", n="10")
> )
>
> --->
>
> {
>   "result-set": {
> "docs": [
>   {
> "k1": "1",
> "k2": "a",
> "group": [
>   {
> "k1": "1",
> "k2": "a"
>   }
> ]
>   },
>   {
> "k1": "2",
> "k2": "c",
> "group": [
>   {
> "k1": "2",
> "k2": "c"
>   },
>   {
> "k1": "2",
> "k2": "b"
>   }
> ]
>   },
>   {
> "EOF": true,
> "RESPONSE_TIME": 0
>   }
> ]
>   }
> }
>
>
> It adds a field "group" that contains an array of the unchanged input
> documents with the same "by" value,
> not grouped values.
>
>
>
> Or am I doing it wrong?
>
> Christian Spitzlay
>
>
>
>
>
>
>
>
>
>
> > Am 15.06.2018 um 01:48 schrieb Joel Bernstein :
> >
> > Actually you're second example is probably a straight forward:
> >
> > reduce(select(...), group(...), by="k1")
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> > On Thu, Jun 14, 2018 at 7:33 PM, Joel Bernstein 
> wrote:
> >
> >> Take a look at the reduce() function. You'll have to write a custom
> reduce
> >> operation but you can follow the example here:
> >>
> >> https://github.com/apache/lucene-solr/blob/master/solr/
> >> solrj/src/java/org/apache/solr/client/solrj/io/ops/GroupOperation.java
> >>
> >> You can plug in your custom reduce operation in the solrconfig.xml and
> use
> >> it like any other function. If you're interested in working on this you
> >> could create a ticket and I can provide guidance.
> >>
> >>
> >> Joel Bernstein
> >> http://joelsolr.blogspot.com/
> >>
> >> 2018-06-14 13:13 GMT-04:00 Christian Spitzlay <
> >> christian.spitz...@biologis.com>:
> >>
> >>> Hi,
> >>>
> >>> is there a way to merge array values?
> >>>
> >>> Something that transforms
> >>>
> >>> {
> >>>  "k1": "1",
> >>>  "k2": ["a", "b"]
> >>> },
> >>> {
> >>>  "k1": "2",
> >>>  "k2": ["c", "d"]
> >>> },
> >>> {
> >>>  "k1": "2",
> >>>  "k2": ["e", "f"]
> >>> }
> >>>
> >>> into
> >>>
> >>> {
> >>>  "k1": "1",
> >>>  "k2": ["a", "b"]
> >>> },
> >>> {
> >>>  "k1": "2",
> >>>  "k2": ["c", "d", "e", "f"]
> >>> }
> >>>
> >>>
> >>> And an inverse of cartesianProduct() that transforms
> >>>
> >>> {
> >>>  "k1": "1",
> >>>  "k2": "a"
> >>> },
> >>> {
> >>>  "k1": "2",
> >>>  "k2": "b"
> >>> },
> >>> {
> >>>  "k1": "2",
> >>>  "k2": "c"
> >>> }
> >>>
> >>> into
> >>>
> >>> {
> >>>  "k1": "1",
> >>>  "k2": ["a"]
> >>> },
> >>> {
> >>>  "k1": "2",
> >>>  "k2": ["b", "c"]
> >>> }
> >>>
> >>>
> >>> Christian
> >>>
> >>>
> >>>
> >>
>
>


Re: Streaming expressions and fetch()

2018-06-18 Thread Joel Bernstein
There is a test case working that is basically the same construct that you
are having issues with. So, I think the next step is to try and reproduce
the problem that you are seeing in a test case.

If you have a small sample test dataset I can use to reproduce the error
please create a jira ticket and I will work on the issue.

Joel Bernstein
http://joelsolr.blogspot.com/

On Sun, Jun 17, 2018 at 2:40 PM, Dariusz Wojtas  wrote:

> Hi,
> I am trying to use streaming expressions with SOLR 7.3.1.
> I have successfully used innerJoin, leftOuterJoin and several other
> functions but failed to achieve expected results with the fetch() function.
>
> Example below is simplified, in reality the base search() function uses
> fuzzy matching and scoring. And works perfectly.
> But I need to enrich the search results with additional column from the
> same collection.
> search() call does a query on nested documents, and returns parentId (yes,
> i know there is _root_, tried it as well) + some calculated custom values,
> requiring some aggregation calls, like rollup(). This part works perfectly.
> But then I want to enrich the resultset with attributes from the top level
> document, where "parentId=id".
> And all my attempts to fetch additional data have failed, the fetch() call
> below always gives the same results as the search() call inside.
>
> fetch(users,
>   search(users, q="*:*", fq="type:name", fl="parentId", sort="parentId
> asc"),
>   fl="id,name",
>   on="parentId=id")
>
> As I understand fetch() should retrieve only records narrowed by the
> "parentId" results.
> If I call leftOuterJoin(), then I lose the benefit of such nice narrowing
> call.
> Any clue what i am doing wrong with fetch()?
>
> Best regards,
> Darek
>


Re: Autoscaling and inactive shards

2018-06-18 Thread Andrzej Białecki


> On 18 Jun 2018, at 14:02, Jan Høydahl  wrote:
> 
> Is there still a valid reason to keep the inactive shards around?
> If shard splitting is robust, could not the split operation delete the 
> inactive shard once the new shards are successfully loaded, just like what 
> happens during an automated merge of segments?
> 


Shard splitting is not robust :) There are some interesting partial failure 
scenarios in SplitShardCmd that still need fixing - most likely a complete 
rewrite of SplitShardCmd is required to improve error handling, perhaps also to 
use a more efficient index splitting algorithm.

Until this is done shard splitting leaves the original shard for a while, and 
then InactiveShardPlanAction removes them after their TTL expired (default is 2 
days).

> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> 
>> 18. jun. 2018 kl. 12:12 skrev Andrzej Białecki 
>> :
>> 
>> If I’m not mistaken the weird accounting of “inactive” shard cores is caused 
>> also by the fact that individual cores that constitute replicas in the 
>> inactive shard are still loaded, so they still affect the number of active 
>> cores. If that’s the case then we should probably fix this to prevent 
>> loading the cores from inactive (but still present) shards.
>> 
>>> On 14 Jun 2018, at 04:27, Shalin Shekhar Mangar  
>>> wrote:
>>> 
>>> Yes, I believe Noble is working on this. See
>>> https://issues.apache.org/jira/browse/SOLR-11985
>>> 
>>> On Wed, Jun 13, 2018 at 1:35 PM Jan Høydahl  wrote:
>>> 
 Ok, get the meaning of preferences.
 
 Would there be a way to write a generic rule that would suggest moving
 shards to obtain balance, without specifying absolute core counts? I.e. if
 you have three nodes
 A: 3 cores
 B: 5 cores
 C: 3 cores
 
 Then that rule would suggest two moves to end up with 4 cores on all three
 (unless that would violate disk space or load limits)?
 
 --
 Jan Høydahl, search solution architect
 Cominvent AS - www.cominvent.com
 
> 12. jun. 2018 kl. 08:10 skrev Shalin Shekhar Mangar <
 shalinman...@gmail.com>:
> 
> Hi Jan,
> 
> Comments inline:
> 
> On Tue, Jun 12, 2018 at 2:19 AM Jan Høydahl >>> > wrote:
> 
>> Hi
>> 
>> I'm trying to have Autoscaling move a shard to another node after
 manually
>> splitting.
>> We have two nodes, one has a shard1 and the other node is empty.
>> 
>> After SPLITSHARD you have
>> 
>> * shard1 (inactive)
>> * shard1_0
>> * shard1_1
>> 
>> For autoscaling we have the {"minimize" : "cores"} cluster preference
>> active. Because of that I'd expect that Autoscaling would suggest to
 move
>> e.g. shard1_1 to the other (empty) node, but it doesn't. Then I create a
>> rule just to test {"cores": "<2", "node": "#ANY"}, but still no
>> suggestions. Not until I delete the inactive shard1, then it suggests to
>> move one of the two remaining shards to the other node.
>> 
>> So my two questions are
>> 1. Is it by design that inactive shards "count" wrt #cores?
>> I understand that it consumes disk but it is not active otherwise,
>> so one could argue that it should not be counted in core/replica
 rules?
>> 
> 
> Today, inactive slices also count towards the number of cores -- though
> technically correct, it is probably an oversight.
> 
> 
>> 2. Why is there no suggestion to move a shard due to the "minimize
 cores"
>> reference itself?
>> 
> 
> The /autoscaling/suggestions end point only suggests if there are policy
> violations. Preferences such as minimize:cores are more of a sorting
 order
> so they aren't really being violated. After you add the rule, the
 framework
> still cannot give a suggestion that satisfies your rule. This is because
> even if shard1_1 is moved to node2, node1 still has shard1 and shard1_0.
 So
> the system ends up not suggesting anything. You should get a suggestion
 if
> you add a third node to the cluster though.
> 
> Also see SOLR-11997 > which
> will tell users that a suggestion could not be returned because we cannot
> satisfy the policy. There are a slew of other improvements to suggestions
> planned that will return suggestions even when there are no policy
> violations.
> 
> 
>> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com 
>> 
>> 
> 
> --
> Regards,
> Shalin Shekhar Mangar.
 
 
>>> 
>>> -- 
>>> Regards,
>>> Shalin Shekhar Mangar.
>> 
> 



Re: Autoscaling and inactive shards

2018-06-18 Thread Jan Høydahl
Is there still a valid reason to keep the inactive shards around?
If shard splitting is robust, could not the split operation delete the inactive 
shard once the new shards are successfully loaded, just like what happens 
during an automated merge of segments?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 18. jun. 2018 kl. 12:12 skrev Andrzej Białecki 
> :
> 
> If I’m not mistaken the weird accounting of “inactive” shard cores is caused 
> also by the fact that individual cores that constitute replicas in the 
> inactive shard are still loaded, so they still affect the number of active 
> cores. If that’s the case then we should probably fix this to prevent loading 
> the cores from inactive (but still present) shards.
> 
>> On 14 Jun 2018, at 04:27, Shalin Shekhar Mangar  
>> wrote:
>> 
>> Yes, I believe Noble is working on this. See
>> https://issues.apache.org/jira/browse/SOLR-11985
>> 
>> On Wed, Jun 13, 2018 at 1:35 PM Jan Høydahl  wrote:
>> 
>>> Ok, get the meaning of preferences.
>>> 
>>> Would there be a way to write a generic rule that would suggest moving
>>> shards to obtain balance, without specifying absolute core counts? I.e. if
>>> you have three nodes
>>> A: 3 cores
>>> B: 5 cores
>>> C: 3 cores
>>> 
>>> Then that rule would suggest two moves to end up with 4 cores on all three
>>> (unless that would violate disk space or load limits)?
>>> 
>>> --
>>> Jan Høydahl, search solution architect
>>> Cominvent AS - www.cominvent.com
>>> 
 12. jun. 2018 kl. 08:10 skrev Shalin Shekhar Mangar <
>>> shalinman...@gmail.com>:
 
 Hi Jan,
 
 Comments inline:
 
 On Tue, Jun 12, 2018 at 2:19 AM Jan Høydahl >> > wrote:
 
> Hi
> 
> I'm trying to have Autoscaling move a shard to another node after
>>> manually
> splitting.
> We have two nodes, one has a shard1 and the other node is empty.
> 
> After SPLITSHARD you have
> 
> * shard1 (inactive)
> * shard1_0
> * shard1_1
> 
> For autoscaling we have the {"minimize" : "cores"} cluster preference
> active. Because of that I'd expect that Autoscaling would suggest to
>>> move
> e.g. shard1_1 to the other (empty) node, but it doesn't. Then I create a
> rule just to test {"cores": "<2", "node": "#ANY"}, but still no
> suggestions. Not until I delete the inactive shard1, then it suggests to
> move one of the two remaining shards to the other node.
> 
> So my two questions are
> 1. Is it by design that inactive shards "count" wrt #cores?
> I understand that it consumes disk but it is not active otherwise,
> so one could argue that it should not be counted in core/replica
>>> rules?
> 
 
 Today, inactive slices also count towards the number of cores -- though
 technically correct, it is probably an oversight.
 
 
> 2. Why is there no suggestion to move a shard due to the "minimize
>>> cores"
> reference itself?
> 
 
 The /autoscaling/suggestions end point only suggests if there are policy
 violations. Preferences such as minimize:cores are more of a sorting
>>> order
 so they aren't really being violated. After you add the rule, the
>>> framework
 still cannot give a suggestion that satisfies your rule. This is because
 even if shard1_1 is moved to node2, node1 still has shard1 and shard1_0.
>>> So
 the system ends up not suggesting anything. You should get a suggestion
>>> if
 you add a third node to the cluster though.
 
 Also see SOLR-11997 >> https://issues.apache.org/jira/browse/SOLR-11997>> which
 will tell users that a suggestion could not be returned because we cannot
 satisfy the policy. There are a slew of other improvements to suggestions
 planned that will return suggestions even when there are no policy
 violations.
 
 
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com 
> 
> 
 
 --
 Regards,
 Shalin Shekhar Mangar.
>>> 
>>> 
>> 
>> -- 
>> Regards,
>> Shalin Shekhar Mangar.
> 



Re: Autoscaling and inactive shards

2018-06-18 Thread Andrzej Białecki
If I’m not mistaken the weird accounting of “inactive” shard cores is caused 
also by the fact that individual cores that constitute replicas in the inactive 
shard are still loaded, so they still affect the number of active cores. If 
that’s the case then we should probably fix this to prevent loading the cores 
from inactive (but still present) shards.

> On 14 Jun 2018, at 04:27, Shalin Shekhar Mangar  
> wrote:
> 
> Yes, I believe Noble is working on this. See
> https://issues.apache.org/jira/browse/SOLR-11985
> 
> On Wed, Jun 13, 2018 at 1:35 PM Jan Høydahl  wrote:
> 
>> Ok, get the meaning of preferences.
>> 
>> Would there be a way to write a generic rule that would suggest moving
>> shards to obtain balance, without specifying absolute core counts? I.e. if
>> you have three nodes
>> A: 3 cores
>> B: 5 cores
>> C: 3 cores
>> 
>> Then that rule would suggest two moves to end up with 4 cores on all three
>> (unless that would violate disk space or load limits)?
>> 
>> --
>> Jan Høydahl, search solution architect
>> Cominvent AS - www.cominvent.com
>> 
>>> 12. jun. 2018 kl. 08:10 skrev Shalin Shekhar Mangar <
>> shalinman...@gmail.com>:
>>> 
>>> Hi Jan,
>>> 
>>> Comments inline:
>>> 
>>> On Tue, Jun 12, 2018 at 2:19 AM Jan Høydahl > > wrote:
>>> 
 Hi
 
 I'm trying to have Autoscaling move a shard to another node after
>> manually
 splitting.
 We have two nodes, one has a shard1 and the other node is empty.
 
 After SPLITSHARD you have
 
 * shard1 (inactive)
 * shard1_0
 * shard1_1
 
 For autoscaling we have the {"minimize" : "cores"} cluster preference
 active. Because of that I'd expect that Autoscaling would suggest to
>> move
 e.g. shard1_1 to the other (empty) node, but it doesn't. Then I create a
 rule just to test {"cores": "<2", "node": "#ANY"}, but still no
 suggestions. Not until I delete the inactive shard1, then it suggests to
 move one of the two remaining shards to the other node.
 
 So my two questions are
 1. Is it by design that inactive shards "count" wrt #cores?
  I understand that it consumes disk but it is not active otherwise,
  so one could argue that it should not be counted in core/replica
>> rules?
 
>>> 
>>> Today, inactive slices also count towards the number of cores -- though
>>> technically correct, it is probably an oversight.
>>> 
>>> 
 2. Why is there no suggestion to move a shard due to the "minimize
>> cores"
 reference itself?
 
>>> 
>>> The /autoscaling/suggestions end point only suggests if there are policy
>>> violations. Preferences such as minimize:cores are more of a sorting
>> order
>>> so they aren't really being violated. After you add the rule, the
>> framework
>>> still cannot give a suggestion that satisfies your rule. This is because
>>> even if shard1_1 is moved to node2, node1 still has shard1 and shard1_0.
>> So
>>> the system ends up not suggesting anything. You should get a suggestion
>> if
>>> you add a third node to the cluster though.
>>> 
>>> Also see SOLR-11997 > https://issues.apache.org/jira/browse/SOLR-11997>> which
>>> will tell users that a suggestion could not be returned because we cannot
>>> satisfy the policy. There are a slew of other improvements to suggestions
>>> planned that will return suggestions even when there are no policy
>>> violations.
>>> 
>>> 
 
 --
 Jan Høydahl, search solution architect
 Cominvent AS - www.cominvent.com 
 
 
>>> 
>>> --
>>> Regards,
>>> Shalin Shekhar Mangar.
>> 
>> 
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.



Is anybody using UIMA with Solr?

2018-06-18 Thread Alexandre Rafalovitch
Hi,

Solr ships an UIMA component and examples that haven't worked for a
while. Details are in:
https://issues.apache.org/jira/browse/SOLR-11694

The choices for developers are:
1) Rip UIMA out (and save space)
2) Update UIMA to latest 2.x version
3) Update UIMA to super-latest possibly-breaking 3.x

The most likely choice at this point is 1. But I am curious (given
that UIMA is in IBM Watson...) if anybody actually has a use-case that
strongly votes for options 2 or 3, given that the update effort is
probably not trivial.

Note that if you use UIMA with Solr, but in a configuration completely
different from that shipped (so the options 2/3 would still be
irrelevant), it could be still fun to share the knowledge in this
thread, with the appropriate disclaimer.

Regards,
   Alex.


Re: Achieving AutoComplete feature using Solrj client

2018-06-18 Thread Arunan Sugunakumar
Hi,

Thank you for your help. As far as I understood, I cannot configure and
enable the suggester through solrj client. The configuration should be done
manually.

Arunan



On 18 June 2018 at 14:50, Alessandro Benedetti  wrote:

> Hi,
> me and Tommaso contributed this few years ago.[1]
> You can easily get the suggester response now from the Solr response.
> Of course you need to configure and enable the suggester first.[2][3][4]
>
>
> [1] https://issues.apache.org/jira/browse/SOLR-7719
> [2] https://sease.io/2015/07/solr-you-complete-me.html
> [3] https://lucidworks.com/2015/03/04/solr-suggester/
> [4]
> https://sease.io/2018/06/apache-lucene-blendedinfixsuggester-how-it-
> works-bugs-and-improvements.html
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


Re: Achieving AutoComplete feature using Solrj client

2018-06-18 Thread Alessandro Benedetti
Hi,
me and Tommaso contributed this few years ago.[1]
You can easily get the suggester response now from the Solr response.
Of course you need to configure and enable the suggester first.[2][3][4]


[1] https://issues.apache.org/jira/browse/SOLR-7719
[2] https://sease.io/2015/07/solr-you-complete-me.html
[3] https://lucidworks.com/2015/03/04/solr-suggester/
[4]
https://sease.io/2018/06/apache-lucene-blendedinfixsuggester-how-it-works-bugs-and-improvements.html



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


CursorMarks and 'end of results'

2018-06-18 Thread David Frese

Hi List,

the documentation of 'cursorMarks' recommends fetching until a query
returns the same cursorMark that was passed in to the request.


But that always requires an additional request at the end, so I wonder
if I can stop already if a request returns fewer results than requested
(num rows). There won't be new documents added during the search in my
use case, so could there ever be a non-empty 'page' after a non-full
'page'?
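
For reference, the loop in question follows the usual pattern, roughly like
this sketch (collection, sort field and page size are placeholders):

SolrQuery q = new SolrQuery("*:*");
q.setRows(100);
q.setSort(SolrQuery.SortClause.asc("id"));   // sort must include the uniqueKey field
String cursorMark = CursorMarkParams.CURSOR_MARK_START;
boolean done = false;
while (!done) {
    q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
    QueryResponse rsp = client.query(q);
    for (SolrDocument doc : rsp.getResults()) {
        // process doc
    }
    String nextCursorMark = rsp.getNextCursorMark();
    // Documented stop condition: the cursorMark comes back unchanged.
    done = cursorMark.equals(nextCursorMark);
    cursorMark = nextCursorMark;
}

So the question is whether the loop could instead stop as soon as
rsp.getResults().size() < 100.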


Thanks very much.

--
David Frese
+49 7071 70896 75

Active Group GmbH
Hechinger Str. 12/1, 72072 Tübingen
Registergericht: Amtsgericht Stuttgart, HRB 224404
Geschäftsführer: Dr. Michael Sperber


Re: Solr Issue after the DSE upgrade

2018-06-18 Thread Charlie Hull

On 17/06/2018 03:10, Umadevi Nalluri wrote:

I am getting Connection refused (Connection refused) when I am running
reload_core with dsetool after we set up JMX. This issue has been happening since the
DSE upgrade to 5.0.12. Can someone please help with this issue? Is this a bug?
Is there a workaround for this?


dsetool appears to be a utility from Datastax - have you tried asking 
them for support?


Charlie


Thanks
Kantheti




--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk


Re: Soft commit impact on replication

2018-06-18 Thread Adarsh_infor
Hi Erick,

Thanks for the response. 

First, we are not indexing on the slave. And we are not re-indexing/optimizing
the entire core on the master node.

The only warning which I see in the log is "Unable clean the unused index
directory so starting full copy".  
That one I can understand and I don't have an issue with it, as it's normal
behaviour. But most of the time it just triggers a
full copy without any details in the log.

And recently on one of the nodes I enabled soft commit on the master and
monitored the corresponding slave node; what I observed is that it didn't
trigger a full copy even once for almost 3 consecutive days. So I am
wondering: do we need to have soft commit enabled on the master for replication
to happen smoothly, and if so, what is the dependency there?


Thanks 



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html