Boosting Documents using the field Value

2017-06-23 Thread govind nitk
Hi Solr,

My Index Data:

id name category domain domain_ct
1 Banana Fruits Home > Fruits > Banana 2
2 Orange Fruits Home > Fruits > Orange 4
3 Samsung Mobile Electronics > Mobile > Samsung 3


I am able to retrieve the documents with dismax parser with the weights
mentioned as below.

http://localhost:8983/solr/my_index/select?defType=dismax&indent=on&q=fruits&qf=category^0.9 name^0.7&wt=json


Is it possible to retrieve the documents with weight taken from the indexed
field like:

http://localhost:8983/solr/my_index/select?defType=dismax&indent=on&q=fruits&qf=category^domain_ct name^domain_ct&wt=json

Is it possible to give the weight from an indexed field? Am I doing
something wrong?
Is there any other way of doing this?


Regards
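
For reference, a sketch of the usual approach rather than an answer from the
thread: qf boosts must be constants, but the edismax parser accepts a
multiplicative boost function, so a per-document weight can be read from a
numeric, single-valued field such as domain_ct:

http://localhost:8983/solr/my_index/select?defType=edismax&indent=on&q=fruits&qf=category^0.9 name^0.7&boost=domain_ct&wt=json

dismax itself only supports the additive bf parameter, e.g. bf=domain_ct.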


Re: Collection name in result

2017-06-23 Thread Jagrut Sharma
Thanks Erick & Alessandro. I am querying the collections by including
'collection=c1,c2' syntax in the query. The suggestion of adding an
additional field to the schema works. I configured this additional field to
hold the collection's name and now the result from the collection includes
the name.

--
Jagrut

On Fri, Jun 23, 2017 at 7:51 AM, Erick Erickson 
wrote:

> Hmmm, you might be able to use a DocTransformer. Hmm, in fact that
> would work if you can keep the two separate. It just injects fields
> into the docs when they're being output. It would be the world's
> simplest transformer if it injected a hard-coded collection name
> (you'd have to have unique transformers per collection). With a little
> more work it should be possible to get the collection name from the
> available data.
>
> Best,
> Erick
>
> On Fri, Jun 23, 2017 at 2:21 AM, alessandro.benedetti
>  wrote:
> > I second Erick,
> > it would be as easy as adding this field to the schema:
> >
> > <field name="collection" type="string" indexed="false" stored="true" default="your collection name"/>
> >
> > If you are using inter-collection queries, just be aware there are a lot of
> > tricky and subtle problems with it (such as the unique identifier must have
> > the same field name, distributed IDF across collections, etc. etc.)
> > I am preparing a blog post related to that.
> > I will keep you updated.
> >
> > Cheers
> >
> >
> >
> > -
> > ---
> > Alessandro Benedetti
> > Search Consultant, R&D Software Engineer, Director
> > Sease Ltd. - www.sease.io
> > --
> > View this message in context: http://lucene.472066.n3.
> nabble.com/Collection-name-in-result-tp4342474p4342501.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>
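
For reference, a sketch of the constant-injection idea without a custom
transformer: Solr's built-in ValueAugmenterFactory can attach a fixed value to
each returned document at request time, e.g.

fl=id,score,collection_name:[value v='c1']

where collection_name is just an illustrative label and each collection's
request handler would hard-code its own v= value.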


Re: Proximity searches with a wildcard

2017-06-23 Thread Erick Erickson
Have you looked at ComplexPhraseQueryParser? It adds some expense, of
course. See SOLR-1604.

Best,
Erick

On Fri, Jun 23, 2017 at 1:06 PM, Michael Craven  wrote:
> I apologize in advance if this is too off topic or basic. I am a web 
> developer on a Drupal site tasked with trying to improve searching 
> capabilities for our users. A product manager asked me if proximity
> searches and wildcards worked on our search form. I did some testing and
> found that, yes, each works. However, what does not seem to work is both
> together. That is, looking for words or phrases near each other seems to work
> and looking for truncated words like pregna* seems to work, but the two
> together in one search string do not. Does anyone have experience in the
> Drupal context who could give me some advice or point me to a resource that
> can?
>
> Thanks
>
> -Michael
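
For reference, a hedged sketch of the syntax once the complexphrase parser is
available (the field name body is an assumption): it allows a wildcard term
inside a proximity phrase, which the default lucene parser does not, e.g.

q={!complexphrase inOrder=false}body:"pregna* outcomes"~5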


RE: Proximity searches with a wildcard

2017-06-23 Thread Markus Jelsma
Sure: 
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

-Original message-
> From:Michael Craven 
> Sent: Friday 23rd June 2017 22:06
> To: solr-user@lucene.apache.org
> Subject: Proximity searches with a wildcard
> 
> I apologize in advance if this is too off topic or basic. I am a web 
> developer on a Drupal site tasked with trying to improve searching 
> capabilities for our users. A product manager asked me if proximity
> searches and wildcards worked on our search form. I did some testing and
> found that, yes, each works. However, what does not seem to work is both
> together. That is, looking for words or phrases near each other seems to work
> and looking for truncated words like pregna* seems to work, but the two
> together in one search string do not. Does anyone have experience in the
> Drupal context who could give me some advice or point me to a resource that
> can?
> 
> Thanks
> 
> -Michael


Proximity searches with a wildcard

2017-06-23 Thread Michael Craven
I apologize in advance if this is too off topic or basic. I am a web developer 
on a Drupal site tasked with trying to improve searching capabilities for our 
users. A product manager asked me if proximity searches and wildcards worked
on our search form. I did some testing and found that, yes, each works. However,
what does not seem to work is both together. That is, looking for words or
phrases near each other seems to work and looking for truncated words like
pregna* seems to work, but the two together in one search string do not. Does
anyone have experience in the Drupal context who could give me some advice or
point me to a resource that can?

Thanks

-Michael

RE: Questions about typical/simple clustered Solr software and hardware architecture

2017-06-23 Thread Markus Jelsma
Hello, see inline.
-Original message-
> From:ken edward 
> Sent: Friday 23rd June 2017 21:07
> To: solr-user@lucene.apache.org
> Subject: Questions about typical/simple clustered Solr software and hardware 
> architecture
> 
> Hello,
> 
> I am brand new to Solr, and trying to ramp up quick. Please correct me
> if I am wrong, but from what I read, in a true production environment,
> is it true that :
> 
> 1. Solr is made up of only "node" processes and "zookeeper" processes?

You run N Solr nodes depending on your needs. For an HA environment,
you run one or more shards (depending on the size of the data) and three or
more replicas. These are all Solr nodes. You also need at least three
ZooKeepers for proper HA.

> 
> 2. Each node and zookeeper process ideally runs on its own physical server?

Doesn't need to be physical; virtual is fine. ZooKeeper can run on small VMs
without issues.

> 
> 3. Searches can be sent to any of the node processes?

Yes.

> 
> 4. A typical HA configuration would put a proxy or load balancer out
> in front of the nodes to distribute the work?

Yes, or a cluster-aware client, such as SolrJ if your application uses Java.

> 
> Ken
> 
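
For reference, a minimal SolrJ sketch of such a cluster-aware client; the
ZooKeeper hosts and collection name are placeholders, and the builder shown is
the SolrJ 6.x API:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class CloudQueryExample {
    public static void main(String[] args) throws Exception {
        // The client reads cluster state from ZooKeeper and load-balances
        // requests across the live Solr nodes itself.
        CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("zk1:2181,zk2:2181,zk3:2181")
                .build();
        client.setDefaultCollection("mycollection");
        QueryResponse rsp = client.query(new SolrQuery("*:*"));
        System.out.println("hits: " + rsp.getResults().getNumFound());
        client.close();
    }
}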


Does solr support multiple collapse officially?

2017-06-23 Thread Jeffery Yuan
I am trying to use multiple collapse (post) filters:
- collapse on seriesId if the seriesId values are the same
- collapse on tmsId (another id) if they are the same

But it seems that Solr doesn't support multiple collapse filters.

I get NullPointerException sometimes, which I summarized at 
https://issues.apache.org/jira/browse/SOLR-10885

-- Copied from SOLR-10885
Solr collapse is a great function to collapse related data so that we
only show one document in the search results.
I just found one issue with it - it throws NullPointerException in some
cases.
To reproduce it, first ingest some data - AND commit multiple times.
1. When there is no data that matches the query:
http://localhost:8983/solr/thecollection/select?defType=edismax&q=non-existType:*&fq={!collapse field=seriesId nullPolicy=expand}&fq={!collapse field=programId nullPolicy=expand}

- But the problem only happens if I use both collapse fqs; if I just use one
of them, it is fine.

*2. When the data that matches the query doesn't have the collapse fields
- This is kind of a big problem, as we may store different kinds of docs in
one collection, and one query may match different kinds of docs.
If some docs (docType1) have the same value for field1, we want to collapse
them; if other docs (docType2) have the same value for field2, do the same thing.*
- channel data doesn't have seriesId or programId
http://localhost:8983/solr/thecollection/select?defType=edismax&q=docType:channel&fq={!collapse field=seriesId nullPolicy=expand}&fq={!collapse field=programId nullPolicy=expand}
But the problem only happens if I use both collapse fqs; if I just use one
of them, it is fine.
Exception from log:
Caused by:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
from server at http://localhost:8983/solr/searchItems_shard1_replica3:
java.lang.NullPointerException
at
org.apache.solr.search.CollapsingQParserPlugin$OrdScoreCollector.finish(CollapsingQParserPlugin.java:617)
at
org.apache.solr.search.CollapsingQParserPlugin$OrdScoreCollector.finish(CollapsingQParserPlugin.java:667)
at
org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:256)
at
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1823)
at
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1640)
at
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:611)
at
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:533)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:295)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2299)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:658)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
int nextDocBase = currentContext + 1 < this.contexts.length ?
this.contexts[(currentContext + 1)].docBase : this.maxDoc;
(line 617 of CollapsingQParserPlugin.java in Solr 6.4.1)
Seems related to https://issues.apache.org/jira/browse/SOLR-8807,
but SOLR-8807 only fixes an issue related to the spell checker.
I may test this with the latest Solr 6.6.0 when I have time.
Updated:
Does Solr support multiple collapse fields?
The query occasionally works (1/10 maybe), but other times it throws
NullPointerException:
http://localhost:18983/solr/thecollection/select?q=programId:* AND id:*&defType=edismax&fq={!collapse+field=id}&fq={!collapse+field=programId}



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Does-solr-support-multiple-collapse-officially-tp4342614.html
Sent from the Solr - User mailing list archive at Nabble.com.


Questions about typical/simple clustered Solr software and hardware architecture

2017-06-23 Thread ken edward
Hello,

I am brand new to Solr, and trying to ramp up quick. Please correct me
if I am wrong, but from what I read, in a true production environment,
is it true that :

1. Solr is made up of only "node" processes and "zookeeper" processes?

2. Each node and zookeeper process ideally runs on its own physical server?

3. Searches can be sent to any of the node processes?

4. A typical HA configuration would put a proxy or load balancer out
in front of the nodes to distribute the work?

Ken


Re: Facet is not working while querying with group

2017-06-23 Thread Aman Deep Singh
1. No, it is with a schema with some dynamic fields, but the facet fields are
proper fields.
2. No copyField destination is a stored field; all are set as stored=false.



On Fri, Jun 23, 2017 at 10:21 PM Erick Erickson 
wrote:

> OK, new collection.
>
> 1> With schemaless? When you add a document in schemaless mode, it
> makes some guesses that may not play nice later.
>
> 2> Are you storing the _destination_ of any copyField? Atomic updates
> do odd things if you set stored="true" for fields that are
> destinations for atomic updates, specifically accumulate values in
> them. You should set stored="false" for all destinations of copyField
> directives.
>
> Best,
> Erick
>
> On Fri, Jun 23, 2017 at 9:23 AM, Aman Deep Singh
>  wrote:
> > No Shawn,
> > I downloaded the latest Solr again, then ran it without installing via the
> > command ./bin/solr -c,
> > then uploaded the fresh configset and created the new collection.
> > Then I created a single document in Solr,
> > then did an atomic update,
> > and the same error occurred again.
> >
> >
> > On Fri, Jun 23, 2017 at 7:53 PM Shawn Heisey 
> wrote:
> >
> >> On 6/20/2017 11:01 PM, Aman Deep Singh wrote:
> >> > If I am using docValues=false getting this exception
> >> > java.lang.IllegalStateException: Type mismatch: isBlibliShipping was
> >> > indexed with multiple values per document, use SORTED_SET instead at
> >> >
> >>
> org.apache.solr.uninverting.FieldCacheImpl$SortedDocValuesCache.createValue(FieldCacheImpl.java:799)
> >> > at
> >> >
> >>
> org.apache.solr.uninverting.FieldCacheImpl$Cache.get(FieldCacheImpl.java:187)
> >> > at
> >> >
> >>
> org.apache.solr.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:767)
> >> > at
> >> >
> >>
> org.apache.solr.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:747)
> >> > at
> >> > But if docValues=true then getting this error
> >> > java.lang.IllegalStateException: unexpected docvalues type NUMERIC for
> >> > field 'isBlibliShipping' (expected=SORTED). Re-index with correct
> >> docvalues
> >> > type. at
> org.apache.lucene.index.DocValues.checkField(DocValues.java:212)
> >> > at org.apache.lucene.index.DocValues.getSorted(DocValues.java:264) at
> >> >
> >>
> org.apache.lucene.search.grouping.term.TermGroupFacetCollector$SV.doSetNextReader(TermGroupFacetCollector.java:129)
> >> > at
> >> >
> >>
> org.apache.lucene.search.SimpleCollector.getLeafCollector(SimpleCollector.java:33)
> >> > at
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:659)
> >> at
> >> > org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472)
> at
> >> >
> >>
> org.apache.solr.request.SimpleFacets.getGroupedCounts(SimpleFacets.java:692)
> >> > at
> >> >
> org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:476)
> >> > at
> >> >
> org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:405)
> >> > at
> >> >
> >>
> org.apache.solr.request.SimpleFacets.lambda$getFacetFieldCounts$0(SimpleFacets.java:803)
> >> >
> >> > It Only appear in case when we facet on group query normal facet works
> >> fine
> >> >
> >> > Also appears only when we atomically update the document.
> >>
> >> These errors look like problems that appear when you *change* the
> >> schema, but try to use that new schema with an existing Lucene index
> >> directory.  As Erick already mentioned, certain changes in the schema
> >> *require* completely deleting the index directory and
> >> restarting/reloading, or starting with a brand new index.  Deleting all
> >> documents instead of wiping out the index may leave Lucene remnants with
> >> incorrect metadata for the new schema.
> >>
> >> What you've said elsewhere in the thread is that you're starting with a
> >> brand new collection ... but the error messages suggest that we're still
> >> dealing with an index where you had one schema setting, indexed some
> >> data, then changed the schema without completely wiping out the index
> >> from the disk.
> >>
> >> Thanks,
> >> Shawn
> >>
> >>
>


Re: Using of Streaming to join between shards

2017-06-23 Thread Erick Erickson
You've provided no information to help guide an answer and even with
more information there are too many variables to say definitively.

There are quite a number of Streaming join options, see:
https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions.
You'll have to do some exploration of the various ones mentioned on
that page as they pertain to your particular use case.

Best,
Erick
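
For reference, a hedged sketch of an innerJoin over two hypothetical
collections; both sides of innerJoin must be sorted on the join key, whereas
hashJoin has no sort requirement but reads the hashed stream fully into memory:

innerJoin(
  search(people, q=*:*, fl="personId,name", sort="personId asc"),
  search(pets, q=*:*, fl="personId,petName", sort="personId asc"),
  on="personId")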

On Fri, Jun 23, 2017 at 8:04 AM, mganeshs  wrote:
> Hi,
>
> So far we had only one shard, so joins are working fine. And now as our data
> is growing, we would like to go for new shards, and we would like to go with
> only the default sharding mechanism for various reasons.
>
> Due to this, join will fail, as it's not supported if we have more than one
> shard.
>
> For this reason we are planning to use streaming.
>
> Can you suggest whether streaming can be used like we used join before?
> Will there be any penalty wrt response time and CPU utilization?
>
> Currently we are using a simple join, which is like a one-to-one mapping sort of
> join. For this, when I move to Streaming, what kind of join should I go for?
> hashJoin or leftOuterJoin or innerJoin etc.?
>
> Pls suggest,
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Using-of-Streaming-to-join-between-shards-tp4342563.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Facet is not working while querying with group

2017-06-23 Thread Erick Erickson
OK, new collection.

1> With schemaless? When you add a document in schemaless mode, it
makes some guesses that may not play nice later.

2> Are you storing the _destination_ of any copyField? Atomic updates
do odd things if you set stored="true" for fields that are
destinations for atomic updates, specifically accumulate values in
them. You should set stored="false" for all destinations of copyField
directives.
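
For example, a minimal schema sketch of that setup (field and type names are
illustrative):

<field name="text_all" type="text_general" indexed="true" stored="false" multiValued="true"/>
<copyField source="name" dest="text_all"/>
<copyField source="category" dest="text_all"/>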

Best,
Erick

On Fri, Jun 23, 2017 at 9:23 AM, Aman Deep Singh
 wrote:
> No Shawn,
> I downloaded the latest Solr again, then ran it without installing via the
> command ./bin/solr -c,
> then uploaded the fresh configset and created the new collection.
> Then I created a single document in Solr,
> then did an atomic update,
> and the same error occurred again.
>
>
> On Fri, Jun 23, 2017 at 7:53 PM Shawn Heisey  wrote:
>
>> On 6/20/2017 11:01 PM, Aman Deep Singh wrote:
>> > If I am using docValues=false getting this exception
>> > java.lang.IllegalStateException: Type mismatch: isBlibliShipping was
>> > indexed with multiple values per document, use SORTED_SET instead at
>> >
>> org.apache.solr.uninverting.FieldCacheImpl$SortedDocValuesCache.createValue(FieldCacheImpl.java:799)
>> > at
>> >
>> org.apache.solr.uninverting.FieldCacheImpl$Cache.get(FieldCacheImpl.java:187)
>> > at
>> >
>> org.apache.solr.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:767)
>> > at
>> >
>> org.apache.solr.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:747)
>> > at
>> > But if docValues=true then getting this error
>> > java.lang.IllegalStateException: unexpected docvalues type NUMERIC for
>> > field 'isBlibliShipping' (expected=SORTED). Re-index with correct
>> docvalues
>> > type. at org.apache.lucene.index.DocValues.checkField(DocValues.java:212)
>> > at org.apache.lucene.index.DocValues.getSorted(DocValues.java:264) at
>> >
>> org.apache.lucene.search.grouping.term.TermGroupFacetCollector$SV.doSetNextReader(TermGroupFacetCollector.java:129)
>> > at
>> >
>> org.apache.lucene.search.SimpleCollector.getLeafCollector(SimpleCollector.java:33)
>> > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:659)
>> at
>> > org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472) at
>> >
>> org.apache.solr.request.SimpleFacets.getGroupedCounts(SimpleFacets.java:692)
>> > at
>> > org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:476)
>> > at
>> > org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:405)
>> > at
>> >
>> org.apache.solr.request.SimpleFacets.lambda$getFacetFieldCounts$0(SimpleFacets.java:803)
>> >
>> > It Only appear in case when we facet on group query normal facet works
>> fine
>> >
>> > Also appears only when we atomically update the document.
>>
>> These errors look like problems that appear when you *change* the
>> schema, but try to use that new schema with an existing Lucene index
>> directory.  As Erick already mentioned, certain changes in the schema
>> *require* completely deleting the index directory and
>> restarting/reloading, or starting with a brand new index.  Deleting all
>> documents instead of wiping out the index may leave Lucene remnants with
>> incorrect metadata for the new schema.
>>
>> What you've said elsewhere in the thread is that you're starting with a
>> brand new collection ... but the error messages suggest that we're still
>> dealing with an index where you had one schema setting, indexed some
>> data, then changed the schema without completely wiping out the index
>> from the disk.
>>
>> Thanks,
>> Shawn
>>
>>


Re: Index 0, Size 0 - hashJoin Stream function Error

2017-06-23 Thread Susheel Kumar
Thanks for confirming.  Here is the JIRA

https://issues.apache.org/jira/browse/SOLR-10944

On Fri, Jun 23, 2017 at 11:20 AM, Joel Bernstein  wrote:

> yeah, this looks like a bug in the get expression.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Fri, Jun 23, 2017 at 11:07 AM, Susheel Kumar 
> wrote:
>
> > Hi Joel,
> >
> > As I am getting deeper, it doesn't look like a problem due to hashJoin etc.
> >
> >
> > Below is a simple let expr where the search does not find a match and
> > returns 0 results. In that case, I would expect get(a) to show an EOF tuple,
> > but instead it throws an exception. It looks like something wrong/a bug in the
> > code. Please suggest.
> >
> > ===
> > let(a=search(collection1,
> > q=id:9,
> > fl="id,business_email",
> > sort="business_email asc"),
> > get(a)
> > )
> >
> >
> > {
> >   "result-set": {
> > "docs": [
> >   {
> > "EXCEPTION": "Index: 0, Size: 0",
> > "EOF": true,
> > "RESPONSE_TIME": 8
> >   }
> > ]
> >   }
> > }
> >
> >
> >
> >
> >
> >
> >
> > On Fri, Jun 23, 2017 at 7:44 AM, Joel Bernstein 
> > wrote:
> >
> > > Ok, I hadn't anticipated some of the scenarios that you've been trying
> > out.
> > > Particularly reading streams into variables and performing joins etc...
> > >
> > > The main idea with variables was to use them with the new statistical
> > > evaluators. So you perform retrievals (search, random, nodes, knn
> etc...)
> > > set the results to variables and then perform statistical analysis.
> > >
> > > The problem with joining variables is that it doesn't scale very well
> > > because all the records are read into memory. Also the parallel stream
> > > won't work over variables.
> > >
> > > Joel Bernstein
> > > http://joelsolr.blogspot.com/
> > >
> > > On Thu, Jun 22, 2017 at 3:50 PM, Susheel Kumar 
> > > wrote:
> > >
> > > > Hi Joel,
> > > >
> > > > I am able to reproduce this in a simple way.  Looks like Let Stream is
> > > > having some issues.  The below complement function works fine if I execute
> > > > it outside let and returns an EOF:true tuple, but if a tuple with EOF:true
> > > > is assigned to a let variable, it gets changed to EXCEPTION "Index 0, Size 0"
> > > > etc.
> > > >
> > > > So the let stream is not able to handle a stream/result which has only an EOF
> > > > tuple, and it breaks the whole let expression block.
> > > >
> > > >
> > > > ===Complement inside let
> > > > let(
> > > > a=echo(Hello),
> > > > b=complement(sort(select(tuple(id=1,email="A"),id,email),by="id
> > > asc,email
> > > > asc"),
> > > > sort(select(tuple(id=1,email="A"),id,email),by="id asc,email asc"),
> > > > on="id,email"),
> > > > c=get(b),
> > > > get(a)
> > > > )
> > > >
> > > > Result
> > > > ===
> > > > {
> > > >   "result-set": {
> > > > "docs": [
> > > >   {
> > > > "EXCEPTION": "Index: 0, Size: 0",
> > > > "EOF": true,
> > > > "RESPONSE_TIME": 1
> > > >   }
> > > > ]
> > > >   }
> > > > }
> > > >
> > > > ===Complement outside let
> > > >
> > > > complement(sort(select(tuple(id=1,email="A"),id,email),by="id
> > asc,email
> > > > asc"),
> > > > sort(select(tuple(id=1,email="A"),id,email),by="id asc,email asc"),
> > > > on="id,email")
> > > >
> > > > Result
> > > > ===
> > > > { "result-set": { "docs": [ { "EOF": true, "RESPONSE_TIME": 0 } ] } }
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Thu, Jun 22, 2017 at 11:55 AM, Susheel Kumar <
> susheel2...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Sorry for typo
> > > > >
> > > > > Facing a weird behavior when using hashJoin / innerJoin etc. The
> > below
> > > > > expression display tuples from variable a shown below
> > > > >
> > > > >
> > > > > let(a=fetch(SMS,having(rollup(over=email,
> > > > >  count(email),
> > > > > select(search(SMS,
> > > > > q=*:*,
> > > > > fl="id,dv_sv_business_email",
> > > > > sort="dv_sv_business_email asc"),
> > > > >id,
> > > > >dv_sv_business_email as email)),
> > > > > eq(count(email),1)),
> > > > > fl="id,dv_sv_business_email as email",
> > > > > on="email=dv_sv_business_email"),
> > > > > b=fetch(SMS,having(rollup(over=email,
> > > > >  count(email),
> > > > > select(search(SMS,
> > > > > q=*:*,
> > > > > fl="id,dv_sv_personal_email",
> > > > > sort="dv_sv_personal_email asc"),
> > > > >id,
> > > > >dv_sv_personal_email as email)),
> > > > > eq(count(email),1)),
> > > > > fl="id,dv_sv_personal_email as email",
> > > > > on="email=dv_sv_personal_email"),
> > > > > c=innerJoin(sort(get(a),by="email asc"),sort(get(b),by="email
> > > > > asc"),on="email"),
> > > > > 

Re: Facet is not working while querying with group

2017-06-23 Thread Aman Deep Singh
No Shawn,
I downloaded the latest Solr again, then ran it without installing via the
command ./bin/solr -c,
then uploaded the fresh configset and created the new collection.
Then I created a single document in Solr,
then did an atomic update,
and the same error occurred again.


On Fri, Jun 23, 2017 at 7:53 PM Shawn Heisey  wrote:

> On 6/20/2017 11:01 PM, Aman Deep Singh wrote:
> > If I am using docValues=false getting this exception
> > java.lang.IllegalStateException: Type mismatch: isBlibliShipping was
> > indexed with multiple values per document, use SORTED_SET instead at
> >
> org.apache.solr.uninverting.FieldCacheImpl$SortedDocValuesCache.createValue(FieldCacheImpl.java:799)
> > at
> >
> org.apache.solr.uninverting.FieldCacheImpl$Cache.get(FieldCacheImpl.java:187)
> > at
> >
> org.apache.solr.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:767)
> > at
> >
> org.apache.solr.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:747)
> > at
> > But if docValues=true then getting this error
> > java.lang.IllegalStateException: unexpected docvalues type NUMERIC for
> > field 'isBlibliShipping' (expected=SORTED). Re-index with correct
> docvalues
> > type. at org.apache.lucene.index.DocValues.checkField(DocValues.java:212)
> > at org.apache.lucene.index.DocValues.getSorted(DocValues.java:264) at
> >
> org.apache.lucene.search.grouping.term.TermGroupFacetCollector$SV.doSetNextReader(TermGroupFacetCollector.java:129)
> > at
> >
> org.apache.lucene.search.SimpleCollector.getLeafCollector(SimpleCollector.java:33)
> > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:659)
> at
> > org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472) at
> >
> org.apache.solr.request.SimpleFacets.getGroupedCounts(SimpleFacets.java:692)
> > at
> > org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:476)
> > at
> > org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:405)
> > at
> >
> org.apache.solr.request.SimpleFacets.lambda$getFacetFieldCounts$0(SimpleFacets.java:803)
> >
> > It Only appear in case when we facet on group query normal facet works
> fine
> >
> > Also appears only when we atomically update the document.
>
> These errors look like problems that appear when you *change* the
> schema, but try to use that new schema with an existing Lucene index
> directory.  As Erick already mentioned, certain changes in the schema
> *require* completely deleting the index directory and
> restarting/reloading, or starting with a brand new index.  Deleting all
> documents instead of wiping out the index may leave Lucene remnants with
> incorrect metadata for the new schema.
>
> What you've said elsewhere in the thread is that you're starting with a
> brand new collection ... but the error messages suggest that we're still
> dealing with an index where you had one schema setting, indexed some
> data, then changed the schema without completely wiping out the index
> from the disk.
>
> Thanks,
> Shawn
>
>


Re: Index 0, Size 0 - hashJoin Stream function Error

2017-06-23 Thread Joel Bernstein
yeah, this looks like a bug in the get expression.

Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Jun 23, 2017 at 11:07 AM, Susheel Kumar 
wrote:

> Hi Joel,
>
> As I am getting deeper, it doesn't look like a problem due to hashJoin etc.
>
>
> Below is a simple let expr where the search does not find a match and
> returns 0 results. In that case, I would expect get(a) to show an EOF tuple,
> but instead it throws an exception. It looks like something wrong/a bug in the
> code. Please suggest.
>
> ===
> let(a=search(collection1,
> q=id:9,
> fl="id,business_email",
> sort="business_email asc"),
> get(a)
> )
>
>
> {
>   "result-set": {
> "docs": [
>   {
> "EXCEPTION": "Index: 0, Size: 0",
> "EOF": true,
> "RESPONSE_TIME": 8
>   }
> ]
>   }
> }
>
>
>
>
>
>
>
> On Fri, Jun 23, 2017 at 7:44 AM, Joel Bernstein 
> wrote:
>
> > Ok, I hadn't anticipated some of the scenarios that you've been trying
> out.
> > Particularly reading streams into variables and performing joins etc...
> >
> > The main idea with variables was to use them with the new statistical
> > evaluators. So you perform retrievals (search, random, nodes, knn etc...)
> > set the results to variables and then perform statistical analysis.
> >
> > The problem with joining variables is that it doesn't scale very well
> > because all the records are read into memory. Also the parallel stream
> > won't work over variables.
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> > On Thu, Jun 22, 2017 at 3:50 PM, Susheel Kumar 
> > wrote:
> >
> > > Hi Joel,
> > >
> > > I am able to reproduce this in a simple way.  Looks like Let Stream is
> > > having some issues.  The below complement function works fine if I execute
> > > it outside let and returns an EOF:true tuple, but if a tuple with EOF:true
> > > is assigned to a let variable, it gets changed to EXCEPTION "Index 0, Size 0"
> > > etc.
> > >
> > > So the let stream is not able to handle a stream/result which has only an EOF
> > > tuple, and it breaks the whole let expression block.
> > >
> > >
> > > ===Complement inside let
> > > let(
> > > a=echo(Hello),
> > > b=complement(sort(select(tuple(id=1,email="A"),id,email),by="id
> > asc,email
> > > asc"),
> > > sort(select(tuple(id=1,email="A"),id,email),by="id asc,email asc"),
> > > on="id,email"),
> > > c=get(b),
> > > get(a)
> > > )
> > >
> > > Result
> > > ===
> > > {
> > >   "result-set": {
> > > "docs": [
> > >   {
> > > "EXCEPTION": "Index: 0, Size: 0",
> > > "EOF": true,
> > > "RESPONSE_TIME": 1
> > >   }
> > > ]
> > >   }
> > > }
> > >
> > > ===Complement outside let
> > >
> > > complement(sort(select(tuple(id=1,email="A"),id,email),by="id
> asc,email
> > > asc"),
> > > sort(select(tuple(id=1,email="A"),id,email),by="id asc,email asc"),
> > > on="id,email")
> > >
> > > Result
> > > ===
> > > { "result-set": { "docs": [ { "EOF": true, "RESPONSE_TIME": 0 } ] } }
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Jun 22, 2017 at 11:55 AM, Susheel Kumar  >
> > > wrote:
> > >
> > > > Sorry for typo
> > > >
> > > > Facing a weird behavior when using hashJoin / innerJoin etc. The
> below
> > > > expression display tuples from variable a shown below
> > > >
> > > >
> > > > let(a=fetch(SMS,having(rollup(over=email,
> > > >  count(email),
> > > > select(search(SMS,
> > > > q=*:*,
> > > > fl="id,dv_sv_business_email",
> > > > sort="dv_sv_business_email asc"),
> > > >id,
> > > >dv_sv_business_email as email)),
> > > > eq(count(email),1)),
> > > > fl="id,dv_sv_business_email as email",
> > > > on="email=dv_sv_business_email"),
> > > > b=fetch(SMS,having(rollup(over=email,
> > > >  count(email),
> > > > select(search(SMS,
> > > > q=*:*,
> > > > fl="id,dv_sv_personal_email",
> > > > sort="dv_sv_personal_email asc"),
> > > >id,
> > > >dv_sv_personal_email as email)),
> > > > eq(count(email),1)),
> > > > fl="id,dv_sv_personal_email as email",
> > > > on="email=dv_sv_personal_email"),
> > > > c=innerJoin(sort(get(a),by="email asc"),sort(get(b),by="email
> > > > asc"),on="email"),
> > > > #d=select(get(c),id,email),
> > > > get(a)
> > > > )
> > > >
> > > > var a result
> > > > ==
> > > > {
> > > >   "result-set": {
> > > > "docs": [
> > > >   {
> > > > "count(email)": 1,
> > > > "id": "1",
> > > > "email": "A"
> > > >   },
> > > >   {
> > > > "count(email)": 1,
> > > > "id": "2",
> > > > "email": "C"
> > > >   },
> > > >   {
> > > > "EOF": true,
> > > > "RESPONSE_TIME": 18
> > > >   }
> > > > ]
> > > >   

Re: Index 0, Size 0 - hashJoin Stream function Error

2017-06-23 Thread Susheel Kumar
Hi Joel,

As I am getting deeper, it doesn't look like a problem due to hashJoin etc.


Below is a simple let expr where the search does not find a match and
returns 0 results. In that case, I would expect get(a) to show an EOF tuple,
but instead it throws an exception. It looks like something wrong/a bug in the
code. Please suggest.

===
let(a=search(collection1,
q=id:9,
fl="id,business_email",
sort="business_email asc"),
get(a)
)


{
  "result-set": {
"docs": [
  {
"EXCEPTION": "Index: 0, Size: 0",
"EOF": true,
"RESPONSE_TIME": 8
  }
]
  }
}







On Fri, Jun 23, 2017 at 7:44 AM, Joel Bernstein  wrote:

> Ok, I hadn't anticipated some of the scenarios that you've been trying out.
> Particularly reading streams into variables and performing joins etc...
>
> The main idea with variables was to use them with the new statistical
> evaluators. So you perform retrievals (search, random, nodes, knn etc...)
> set the results to variables and then perform statistical analysis.
>
> The problem with joining variables is that it doesn't scale very well
> because all the records are read into memory. Also the parallel stream
> won't work over variables.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Thu, Jun 22, 2017 at 3:50 PM, Susheel Kumar 
> wrote:
>
> > Hi Joel,
> >
> > I am able to reproduce this in a simple way.  Looks like Let Stream is
> > having some issues.  The below complement function works fine if I execute
> > it outside let and returns an EOF:true tuple, but if a tuple with EOF:true
> > is assigned to a let variable, it gets changed to EXCEPTION "Index 0, Size 0"
> > etc.
> >
> > So the let stream is not able to handle a stream/result which has only an EOF
> > tuple, and it breaks the whole let expression block.
> >
> >
> > ===Complement inside let
> > let(
> > a=echo(Hello),
> > b=complement(sort(select(tuple(id=1,email="A"),id,email),by="id
> asc,email
> > asc"),
> > sort(select(tuple(id=1,email="A"),id,email),by="id asc,email asc"),
> > on="id,email"),
> > c=get(b),
> > get(a)
> > )
> >
> > Result
> > ===
> > {
> >   "result-set": {
> > "docs": [
> >   {
> > "EXCEPTION": "Index: 0, Size: 0",
> > "EOF": true,
> > "RESPONSE_TIME": 1
> >   }
> > ]
> >   }
> > }
> >
> > ===Complement outside let
> >
> > complement(sort(select(tuple(id=1,email="A"),id,email),by="id asc,email
> > asc"),
> > sort(select(tuple(id=1,email="A"),id,email),by="id asc,email asc"),
> > on="id,email")
> >
> > Result
> > ===
> > { "result-set": { "docs": [ { "EOF": true, "RESPONSE_TIME": 0 } ] } }
> >
> >
> >
> >
> >
> >
> >
> >
> > On Thu, Jun 22, 2017 at 11:55 AM, Susheel Kumar 
> > wrote:
> >
> > > Sorry for typo
> > >
> > > Facing a weird behavior when using hashJoin / innerJoin etc. The below
> > > expression display tuples from variable a shown below
> > >
> > >
> > > let(a=fetch(SMS,having(rollup(over=email,
> > >  count(email),
> > > select(search(SMS,
> > > q=*:*,
> > > fl="id,dv_sv_business_email",
> > > sort="dv_sv_business_email asc"),
> > >id,
> > >dv_sv_business_email as email)),
> > > eq(count(email),1)),
> > > fl="id,dv_sv_business_email as email",
> > > on="email=dv_sv_business_email"),
> > > b=fetch(SMS,having(rollup(over=email,
> > >  count(email),
> > > select(search(SMS,
> > > q=*:*,
> > > fl="id,dv_sv_personal_email",
> > > sort="dv_sv_personal_email asc"),
> > >id,
> > >dv_sv_personal_email as email)),
> > > eq(count(email),1)),
> > > fl="id,dv_sv_personal_email as email",
> > > on="email=dv_sv_personal_email"),
> > > c=innerJoin(sort(get(a),by="email asc"),sort(get(b),by="email
> > > asc"),on="email"),
> > > #d=select(get(c),id,email),
> > > get(a)
> > > )
> > >
> > > var a result
> > > ==
> > > {
> > >   "result-set": {
> > > "docs": [
> > >   {
> > > "count(email)": 1,
> > > "id": "1",
> > > "email": "A"
> > >   },
> > >   {
> > > "count(email)": 1,
> > > "id": "2",
> > > "email": "C"
> > >   },
> > >   {
> > > "EOF": true,
> > > "RESPONSE_TIME": 18
> > >   }
> > > ]
> > >   }
> > > }
> > >
> > > And after uncomment var d above, even though we are displaying a, we
> get
> > > results shown below. I understand that join in my test data didn't find
> > any
> > > match but then it should not skew up the results of var a.  When data
> > > matches during join then its fine but otherwise I am running into this
> > > issue and whole next expressions doesn't get evaluated due to this...
> > >
> > >
> > > after uncomment var d
> > > ===
> > > {
> > >   "result-set": {
> > > "docs": [
> 

Using of Streaming to join between shards

2017-06-23 Thread mganeshs
Hi,

So far we had only one shard, so joins are working fine. And now as our data
is growing, we would like to go for new shards, and we would like to go with
only the default sharding mechanism for various reasons.

Due to this, join will fail, as it's not supported if we have more than one
shard.

For this reason we are planning to use streaming.

Can you suggest whether streaming can be used like we used join before?
Will there be any penalty wrt response time and CPU utilization?

Currently we are using a simple join, which is like a one-to-one mapping sort of
join. For this, when I move to Streaming, what kind of join should I go for?
hashJoin or leftOuterJoin or innerJoin etc.?

Pls suggest,




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Using-of-Streaming-to-join-between-shards-tp4342563.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Collection name in result

2017-06-23 Thread Erick Erickson
Hmmm, you might be able to use a DocTransformer. Hmm, in fact that
would work if you can keep the two separate. It just injects fields
into the docs when they're being output. It would be the world's
simplest transformer if it injected a hard-coded collection name
(you'd have to have unique transformers per collection). With a little
more work it should be possible to get the collection name from the
available data.

Best,
Erick

On Fri, Jun 23, 2017 at 2:21 AM, alessandro.benedetti
 wrote:
> I second Erick,
> it would be as easy as adding this field to the schema:
>
> <field name="collection" type="string" indexed="false" stored="true" default="your collection name"/>
>
> If you are using inter-collection queries, just be aware there are a lot of
> tricky and subtle problems with it (such as the unique identifier must have
> the same field name, distributed IDF across collections, etc. etc.)
> I am preparing a blog post related to that.
> I will keep you updated.
>
> Cheers
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Collection-name-in-result-tp4342474p4342501.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Query Partial Matching on auto schema

2017-06-23 Thread Erick Erickson
I simply do not recommend going to production with schemaless. That
mechanism must make certain assumptions about the data and simply
cannot anticipate all the types of searching you need to do.

As Alessandro says, you can define whatever you want "by hand" and
still have schemaless add the rest. It becomes a matter of preference:
would you rather have documents with fields that haven't been seen
before fail immediately? Or would you rather have them get new fields
that you then have to discover? I prefer the former.

Best,
Erick

On Fri, Jun 23, 2017 at 3:41 AM, alessandro.benedetti
 wrote:
> Quoting the official solr documentation :
> " You Can Still Be Explicit
> Even if you want to use schemaless mode for most fields, you can still use
> the Schema API to pre-emptively create some fields, with explicit types,
> before you index documents that use them.
>
> Internally, the Schema API and the Schemaless Update Processors both use the
> same Managed Schema functionality."
>
> Even using schemaless you can use the managed schema APi to define your own
> field types and fields.
>
> For more info [1]
>
> [1]
> https://lucene.apache.org/solr/guide/6_6/schemaless-mode.html#SchemalessMode-EnableManagedSchema
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Query-Partial-Matching-on-auto-schema-tp4342502p4342509.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Facet is not working while querying with group

2017-06-23 Thread Shawn Heisey
On 6/20/2017 11:01 PM, Aman Deep Singh wrote:
> If I am using docValues=false getting this exception
> java.lang.IllegalStateException: Type mismatch: isBlibliShipping was
> indexed with multiple values per document, use SORTED_SET instead at
> org.apache.solr.uninverting.FieldCacheImpl$SortedDocValuesCache.createValue(FieldCacheImpl.java:799)
> at
> org.apache.solr.uninverting.FieldCacheImpl$Cache.get(FieldCacheImpl.java:187)
> at
> org.apache.solr.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:767)
> at
> org.apache.solr.uninverting.FieldCacheImpl.getTermsIndex(FieldCacheImpl.java:747)
> at
> But if docValues=true then getting this error
> java.lang.IllegalStateException: unexpected docvalues type NUMERIC for
> field 'isBlibliShipping' (expected=SORTED). Re-index with correct docvalues
> type. at org.apache.lucene.index.DocValues.checkField(DocValues.java:212)
> at org.apache.lucene.index.DocValues.getSorted(DocValues.java:264) at
> org.apache.lucene.search.grouping.term.TermGroupFacetCollector$SV.doSetNextReader(TermGroupFacetCollector.java:129)
> at
> org.apache.lucene.search.SimpleCollector.getLeafCollector(SimpleCollector.java:33)
> at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:659) at
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:472) at
> org.apache.solr.request.SimpleFacets.getGroupedCounts(SimpleFacets.java:692)
> at
> org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:476)
> at
> org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:405)
> at
> org.apache.solr.request.SimpleFacets.lambda$getFacetFieldCounts$0(SimpleFacets.java:803)
>
> It Only appear in case when we facet on group query normal facet works fine
>
> Also appears only when we atomically update the document.

These errors look like problems that appear when you *change* the
schema, but try to use that new schema with an existing Lucene index
directory.  As Erick already mentioned, certain changes in the schema
*require* completely deleting the index directory and
restarting/reloading, or starting with a brand new index.  Deleting all
documents instead of wiping out the index may leave Lucene remnants with
incorrect metadata for the new schema.

What you've said elsewhere in the thread is that you're starting with a
brand new collection ... but the error messages suggest that we're still
dealing with an index where you had one schema setting, indexed some
data, then changed the schema without completely wiping out the index
from the disk.

Thanks,
Shawn



Re: [Solr Ref guide 6.6] Search not working

2017-06-23 Thread Cassandra Targett
There is an open JIRA issue to provide this search:
https://issues.apache.org/jira/browse/SOLR-10299.

Yes, it's pretty ironic that the docs for a search engine don't have a
search engine, and I agree it's absolutely necessary, but it's not
done yet.

The title keyword "search" (I hate to even call it "search" - you're
right, today "autocomplete" is a better word for it) can be expanded
relatively easily to include the body of content also. However, this
has not been tested so we have no idea how well it will perform with
the size of our content - the author of that JavaScript says it can be
bad in some situations. This could maybe be a stopgap until a full
search solution is put into place, but maybe the effort to do the
stopgap and make it work well would be better spent implementing Solr
for the Ref Guide instead.

There were 1000 details to doing this transition, and search is the
only feature from the old system that didn't make the cut before
release (and yes, it IS released - see my announcement to this list on
Tuesday). We could have held things up until someone helped make it
happen, or we could move to the new approach and get the 100 other
benefits the change provides the community right away. We chose the
latter.

I do intend to get to search at some point, hopefully sooner rather
than later. But as we say, we're all volunteers here and all of us
have other commitments - to our employers, to our families, etc. -
that sometimes take precedence. If you feel this is something that
should be worked on immediately, you (and anyone reading this) are
strongly encouraged - no, welcomed - to contribute ideas, time, and/or
code to push it forward faster.

On Fri, Jun 23, 2017 at 5:36 AM, alessandro.benedetti
 wrote:
> Hi all,
> I was just using the new Solr Ref Guide[1] (If I understood correctly this
> is going to be the next official documentation for Solr).
>
> Unfortunately search within the guide works really badly...
> The autocomplete seems to be just on the page title (including headings would
> help a lot).
> If you don't accept any suggestion, it doesn't allow you to search (!!!).
> I tried on Safari and Chrome.
>
> For the reference guide of a search engine, it is not nice to have the search
> feature in this state.
> Actually, being an entry point for developers and users interested in Solr,
> it should showcase an amazing and intuitive search and ease the life of people
> looking for documentation.
> I may be stating the obvious, so concretely: is anybody working to fix this? Is
> this because it has not been released officially yet?
>
>
> [1] https://lucene.apache.org/solr/guide/6_6/
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-Ref-guide-6-6-Search-not-working-tp4342508.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Sort using If function containing multiple conditions

2017-06-23 Thread spanchal
Hi all,

I am trying to sort my results using a function query.
sort=if(eq(FIELD1,1) AND eq(FIELD2,1),1,0) desc

but this is giving error:
"error":{
"metadata":[
  "error-class","org.apache.solr.common.SolrException",
  "root-error-class","org.apache.solr.common.SolrException"],
"msg":"Can't determine a Sort Order (asc or desc) in sort spec
'if(eq(FIELD1,1) AND eq(FIELD2,1),1,0) desc', pos=16",
"code":400}}


Although this is working:
sort=if(eq(FIELD1,1),1,0) desc

Can you help with how I can sort using an if function containing multiple clauses?

Thanks,
Saurabh.
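
For reference, a sketch of the usual rewrite rather than an answer from the
thread: function queries do not understand the AND keyword, but the boolean
and() function expresses the same condition, assuming both fields are
single-valued and numeric:

sort=if(and(eq(FIELD1,1),eq(FIELD2,1)),1,0) desc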



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Sort-using-If-function-containing-multiple-conditions-tp4342539.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Complement Stream function - Invalid ReducerStream - substream comparator (sort) must be a superset of this stream's comparator

2017-06-23 Thread Joel Bernstein
I think it makes sense to create a jira ticket.

Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, Jun 22, 2017 at 2:43 PM, Susheel Kumar 
wrote:

> Please let me know if I shall create a JIRA, and I can provide both
> expressions and data to reproduce.
>
> On Thu, Jun 22, 2017 at 11:23 AM, Susheel Kumar 
> wrote:
>
> > Yes, I tried building up the expression piece by piece, but it looks like there
> > is an issue with how complement expects/behaves with sort.
> >
> > If I use the below g and h exprs inside complement, which are already sorted
> > (via sort), then it doesn't work:
> >
> > e=select(get(c),id,email),
> > f=select(get(d),id,email),
> > g=sort(get(e),by="id asc,email asc"),
> > h=sort(get(f),by="id asc,email asc"),
> > i=complement(get(g),get(h),on="id,email"),
> >
> > while the below worked when I used the e and f exprs and sorted them within the
> > complement function instead of using g and h directly:
> >
> > e=select(get(c),id,email),
> > f=select(get(d),id,email),
> > g=sort(get(e),by="id asc,email asc"),
> > h=sort(get(f),by="id asc,email asc"),
> > i=complement(
> > sort(get(e),by="id asc,email asc"),sort(get(f),by="id asc,email asc")
> > ,on="id,email"),
> >
> > So I am good for now with the above approach, but I am running into another issue
> > with an empty/null/"Index 0, Size 0" set and will start another thread for
> > that (need your help there :-)).
> >
> > Appreciate and thanks for all your help while I try to solve my use case
> > using streaming expressions.
> >
> >
> > On Thu, Jun 22, 2017 at 11:10 AM, Joel Bernstein 
> > wrote:
> >
> >> I suspect something is wrong in the syntax but I'm not seeing it.
> >>
> >> Have you tried building up the expression piece by piece until you get
> the
> >> syntax error?
> >>
> >> Joel Bernstein
> >> http://joelsolr.blogspot.com/
> >>
> >> On Wed, Jun 21, 2017 at 3:20 PM, Susheel Kumar 
> >> wrote:
> >>
> >> > While simple complement works in this way
> >> >
> >> > ===
> >> > complement(merge(sort(select(echo("A"),echo as email),by="email
> asc"),
> >> > sort(select(echo("B"),echo as email),by="email asc"),
> >> > on="email asc"),
> >> > merge(sort(select(echo("A"),echo as email),by="email asc"),
> >> > sort(select(echo("D"),echo as email),by="email asc"),on="email asc"),
> >> > on="email")
> >> >
> >> > BUT below it doesn't work when used in a similar way
> >> >
> >> > ===
> >> > let(a=fetch(collection1,having(rollup(over=email,
> >> >  count(email),
> >> > select(search(collection1,
> >> > q=*:*,
> >> > fl="id,business_email",
> >> > sort="business_email asc"),
> >> >id,
> >> >business_email as email)),
> >> > eq(count(email),1)),
> >> > fl="id,business_email as email",
> >> > on="email=business_email"),
> >> > b=fetch(collection1,having(rollup(over=email,
> >> >  count(email),
> >> > select(search(collection1,
> >> > q=*:*,
> >> > fl="id,personal_email",
> >> > sort="personal_email asc"),
> >> >id,
> >> >personal_email as email)),
> >> > eq(count(email),1)),
> >> > fl="id,personal_email as email",
> >> > on="email=personal_email"),
> >> > c=hashJoin(get(a),hashed=get(b),on="email"),
> >> > d=hashJoin(get(b),hashed=get(a),on="email"),
> >> > e=select(get(c),id,email),
> >> > f=select(get(d),id,email),
> >> > g=sort(get(e),by="id asc,email asc"),
> >> > h=sort(get(f),by="id asc,email asc"),
> >> > i=complement(get(g),get(h),on="id,email"),
> >> > get(i)
> >> > )
> >> >
> >> >
> >> > On Wed, Jun 21, 2017 at 11:29 AM, Susheel Kumar <
> susheel2...@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > Two issues with complement function (solr 6.6)
> >> > >
> >> > > 1)  When i execute below streaming expression,
> >> > >
> >> > > ==
> >> > >
> >> > > let(a=fetch(collection1,having(rollup(over=email,
> >> > >  count(email),
> >> > > select(search(collection1,
> >> > > q=*:*,
> >> > > fl="id,business_email",
> >> > > sort="business_email asc"),
> >> > >id,
> >> > >business_email as email)),
> >> > > eq(count(email),1)),
> >> > > fl="id,business_email as email",
> >> > > on="email=business_email"),
> >> > > b=fetch(collection1,having(rollup(over=email,
> >> > >  count(email),
> >> > > select(search(collection1,
> >> > > q=*:*,
> >> > > fl="id,personal_email",
> >> > > sort="personal_email asc"),
> >> > >id,
> >> > >personal_email as email)),
> >> > > eq(count(email),1)),
> >> > > fl="id,personal_email as email",
> >> > > on="email=personal_email"),
> >> > > c=hashJoin(get(a),hashed=get(b),on="email"),
> >> > > 

Re: Index 0, Size 0 - hashJoin Stream function Error

2017-06-23 Thread Joel Bernstein
Ok, I hadn't anticipated some of the scenarios that you've been trying out.
Particularly reading streams into variables and performing joins etc...

The main idea with variables was to use them with the new statistical
evaluators. So you perform retrievals (search, random, nodes, knn etc...)
set the results to variables and then perform statistical analysis.

The problem with joining variables is that it doesn't scale very well
because all the records are read into memory. Also the parallel stream
won't work over variables.
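
For reference, a hedged sketch of that intended pattern; the collection and
field names are placeholders, and the evaluators shown assume a Solr version
with the math expressions (7.x):

let(a=random(collection1, q="*:*", rows="500", fl="price"),
    b=col(a, price),
    tuple(avg=mean(b)))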

Joel Bernstein
http://joelsolr.blogspot.com/

On Thu, Jun 22, 2017 at 3:50 PM, Susheel Kumar 
wrote:

> Hi Joel,
>
> I am able to reproduce this in a simple way.  Looks like Let Stream is
> having some issues.  The below complement function works fine if I execute
> it outside let and returns an EOF:true tuple, but if a tuple with EOF:true
> is assigned to a let variable, it gets changed to EXCEPTION "Index 0, Size 0"
> etc.
>
> So the let stream is not able to handle a stream/result which has only an EOF
> tuple, and it breaks the whole let expression block.
>
>
> ===Complement inside let
> let(
> a=echo(Hello),
> b=complement(sort(select(tuple(id=1,email="A"),id,email),by="id asc,email
> asc"),
> sort(select(tuple(id=1,email="A"),id,email),by="id asc,email asc"),
> on="id,email"),
> c=get(b),
> get(a)
> )
>
> Result
> ===
> {
>   "result-set": {
> "docs": [
>   {
> "EXCEPTION": "Index: 0, Size: 0",
> "EOF": true,
> "RESPONSE_TIME": 1
>   }
> ]
>   }
> }
>
> ===Complement outside let
>
> complement(sort(select(tuple(id=1,email="A"),id,email),by="id asc,email
> asc"),
> sort(select(tuple(id=1,email="A"),id,email),by="id asc,email asc"),
> on="id,email")
>
> Result
> ===
> { "result-set": { "docs": [ { "EOF": true, "RESPONSE_TIME": 0 } ] } }
>
>
>
>
>
>
>
>
> On Thu, Jun 22, 2017 at 11:55 AM, Susheel Kumar 
> wrote:
>
> > Sorry for typo
> >
> > Facing a weird behavior when using hashJoin / innerJoin etc. The below
> > expression display tuples from variable a shown below
> >
> >
> > let(a=fetch(SMS,having(rollup(over=email,
> >  count(email),
> > select(search(SMS,
> > q=*:*,
> > fl="id,dv_sv_business_email",
> > sort="dv_sv_business_email asc"),
> >id,
> >dv_sv_business_email as email)),
> > eq(count(email),1)),
> > fl="id,dv_sv_business_email as email",
> > on="email=dv_sv_business_email"),
> > b=fetch(SMS,having(rollup(over=email,
> >  count(email),
> > select(search(SMS,
> > q=*:*,
> > fl="id,dv_sv_personal_email",
> > sort="dv_sv_personal_email asc"),
> >id,
> >dv_sv_personal_email as email)),
> > eq(count(email),1)),
> > fl="id,dv_sv_personal_email as email",
> > on="email=dv_sv_personal_email"),
> > c=innerJoin(sort(get(a),by="email asc"),sort(get(b),by="email
> > asc"),on="email"),
> > #d=select(get(c),id,email),
> > get(a)
> > )
> >
> > var a result
> > ==
> > {
> >   "result-set": {
> > "docs": [
> >   {
> > "count(email)": 1,
> > "id": "1",
> > "email": "A"
> >   },
> >   {
> > "count(email)": 1,
> > "id": "2",
> > "email": "C"
> >   },
> >   {
> > "EOF": true,
> > "RESPONSE_TIME": 18
> >   }
> > ]
> >   }
> > }
> >
> > And after uncomment var d above, even though we are displaying a, we get
> > results shown below. I understand that join in my test data didn't find
> any
> > match but then it should not skew up the results of var a.  When data
> > matches during join then its fine but otherwise I am running into this
> > issue and whole next expressions doesn't get evaluated due to this...
> >
> >
> > after uncomment var d
> > ===
> > {
> >   "result-set": {
> > "docs": [
> >   {
> > "EXCEPTION": "Index: 0, Size: 0",
> > "EOF": true,
> > "RESPONSE_TIME": 44
> >   }
> > ]
> >   }
> > }
> >
> > On Thu, Jun 22, 2017 at 11:51 AM, Susheel Kumar 
> > wrote:
> >
> >> Hello Joel,
> >>
> >> Facing a weird behavior when using hashJoin / innerJoin etc. The below
> >> expression display tuples from variable a   and the moment I use get on
> >> innerJoin / hashJoin expr on variable c
> >>
> >>
> >> let(a=fetch(SMS,having(rollup(over=email,
> >>  count(email),
> >> select(search(SMS,
> >> q=*:*,
> >> fl="id,dv_sv_business_email",
> >> sort="dv_sv_business_email asc"),
> >>id,
> >>dv_sv_business_email as email)),
> >> eq(count(email),1)),
> >> fl="id,dv_sv_business_email as email",
> >> on="email=dv_sv_business_email"),
> >> b=fetch(SMS,having(rollup(over=email,
> >>  count(email),
> >> 

Re: Query Partial Matching on auto schema

2017-06-23 Thread alessandro.benedetti
Quoting the official solr documentation : 
" You Can Still Be Explicit
Even if you want to use schemaless mode for most fields, you can still use
the Schema API to pre-emptively create some fields, with explicit types,
before you index documents that use them.

Internally, the Schema API and the Schemaless Update Processors both use the
same Managed Schema functionality."

Even using schemaless you can use the managed schema APi to define your own
field types and fields.

For more info [1]

[1]
https://lucene.apache.org/solr/guide/6_6/schemaless-mode.html#SchemalessMode-EnableManagedSchema
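
For reference, a minimal sketch of pre-creating a field through the Schema API;
the collection name and field definition are placeholders:

curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-field": {"name":"name", "type":"text_general", "indexed":true, "stored":true}
}' http://localhost:8983/solr/mycollection/schema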



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Query-Partial-Matching-on-auto-schema-tp4342502p4342509.html
Sent from the Solr - User mailing list archive at Nabble.com.


[Solr Ref guide 6.6] Search not working

2017-06-23 Thread alessandro.benedetti
Hi all,
I was just using the new Solr Ref Guide[1] (If I understood correctly this
is going to be the next official documentation for Solr).

Unfortunately search within the guide works really badly...
The autocomplete seems to be just on the page title (including headings would
help a lot).
If you don't accept any suggestion, it doesn't allow you to search (!!!).
I tried on Safari and Chrome.

For the reference guide of a search engine, it is not nice to have the search
feature in this state.
Actually, being an entry point for developers and users interested in Solr,
it should showcase an amazing and intuitive search and ease the life of people
looking for documentation.
I may be stating the obvious, so concretely: is anybody working to fix this? Is
this because it has not been released officially yet?


[1] https://lucene.apache.org/solr/guide/6_6/



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Ref-guide-6-6-Search-not-working-tp4342508.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Query Partial Matching on auto schema

2017-06-23 Thread Guilleret Florian
Yes I mean schemaless.

So with schemaless it's impossible to have what I expect?

Guilleret Florian 
Tel : +33 6 21 28 43 06

2017-06-23 12:26 GMT+02:00 alessandro.benedetti :

> With automatic schema do you mean schemaless ?
> You will need to define a schema managed/old legacy style as you prefer.
>
> Then you define a field type that suits your needs (for example with an
> edge n-gram token filter [1]).
> And you assign that field type to a specific field.
>
> Then in your request handler / when you build your query, just use that field
> to search.
>
> Regards
>
> [1]
> https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html#
> FilterDescriptions-EdgeN-GramFilter
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/Query-Partial-Matching-on-auto-schema-tp4342502p4342506.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Query Partial Matching on auto schema

2017-06-23 Thread alessandro.benedetti
With automatic schema do you mean schemaless ?
You will need to define a schema managed/old legacy style as you prefer.

Then you define a field type that suits your needs (for example with an
edge n-gram token filter [1]).
And you assign that field type to a specific field.

Then in your request handler / when you build your query, just use that field
to search.

Regards

[1]
https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html#FilterDescriptions-EdgeN-GramFilter
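
For reference, a hedged sketch of such a field type; the name and gram sizes
are illustrative. Grams are produced at index time only, so a query term like
NHLDO can match the indexed NHLDO457:

<fieldType name="text_prefix" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>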



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Query-Partial-Matching-on-auto-schema-tp4342502p4342506.html
Sent from the Solr - User mailing list archive at Nabble.com.


Query Partial Matching on auto schema

2017-06-23 Thread Guilleret Florian
Hi,

I use SOLR 5.2.1 with automatic schema.

But when I try a query with a partial word, Solr doesn't find anything.

Example:

I request a query:
   NHLDO

Solr returns nothing, but there is a document with the name NHLDO457.

If I request this query:
NHLDO457

Solr returns the document.


So how can I configure Solr to retrieve documents even with a partial word in
the query with an auto schema?

Kind Regards
Guilleret Florian


Re: Collection name in result

2017-06-23 Thread alessandro.benedetti
I second Erick,
it would be as easy as adding this field to the schema:

<field name="collection" type="string" indexed="false" stored="true" default="your collection name"/>

If you are using inter-collection queries, just be aware there are a lot of
tricky and subtle problems with it (such as the unique identifier must have
the same field name, distributed IDF across collections, etc. etc.)
I am preparing a blog post related to that.
I will keep you updated.

Cheers



-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Collection-name-in-result-tp4342474p4342501.html
Sent from the Solr - User mailing list archive at Nabble.com.