Re: Solr cloud replica nodes missing some documents

2017-08-21 Thread Sanjay Lokhande


Any pointers, guys?



From:   Sanjay Lokhande/India/IBM
To: solr-user@lucene.apache.org
Date:   08/18/2017 02:12 PM
Subject:Solr cloud replica nodes missing some documents


Hello guys,

  I have a 5-node SolrCloud setup with a single shard. The Solr version
is 5.2.1.
  server1 (http://146.XXX.com:4001/solr/contracts_shard1_replica4) is the
leader.
  A document with id '43e14a86cbdd422880cac22d9a15d3c0' was not replicated
to 3 nodes.
  The log shows that the "{add=[43e14a86cbdd422880cac22d9a15d3c0
(1573510697298427904)]}" request was received only by the leader and the
server5 node.
  The server2, server3 and server4 nodes did not receive the request, and
hence the document is missing on these nodes.
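
One way to confirm which replicas hold the document is to query each core
directly with distrib=false, so the request is not forwarded to other
replicas. The sketch below only builds the URLs. The core names come from
the log excerpts in this message; the host names and the port for server2
to server4 are placeholders (assumptions), and the core on server5 is left
out because it does not appear in the logs.

```python
from urllib.parse import urlencode

# Core names are taken from the log excerpts; the base URLs are
# placeholders (assumption), not the real hosts.
BASE_URLS = [
    "http://server1:4001/solr",
    "http://server2:4001/solr",
    "http://server3:4001/solr",
    "http://server4:4001/solr",
]
CORES = [
    "contracts_shard1_replica4",
    "contracts_shard1_replica5",
    "contracts_shard1_replica3",
    "contracts_shard1_replica1",
]


def replica_check_urls(base_urls, cores, doc_id):
    """Return one /select URL per replica that asks only that core for doc_id."""
    params = urlencode({
        "q": f"id:{doc_id}",
        "distrib": "false",  # do not fan out to other replicas
        "rows": "0",         # only numFound is of interest
        "wt": "json",
    })
    return [f"{base}/{core}/select?{params}"
            for base, core in zip(base_urls, cores)]


urls = replica_check_urls(BASE_URLS, CORES, "43e14a86cbdd422880cac22d9a15d3c0")
for url in urls:
    print(url)
```

Comparing numFound across the responses shows which replicas are missing
the document.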

 Search "43e14a86cbdd422880cac22d9a15d3c0 "
  C:\solrIssue\solr_server1.log
INFO  - 2017-07-21 05:54:59.853; [contracts shard1 core_node2
contracts_shard1_replica4]
org.apache.solr.update.processor.LogUpdateProcessor;
[contracts_shard1_replica4] webapp=/solr path=/update params=
{wt=javabin=2} {deleteByQuery=id:(9467353f398448788c261aa347d75b8b
93332ab7f7ff4141a371713871ab65ad 8568e0eab8364bfc89c876aadfa01022
43e14a86cbdd422880cac22d9a15d3c0 a0af8cb24ef94d25b9691eee1f7024ca 8ad...
INFO  - 2017-07-21 05:54:59.853; [contracts shard1 core_node2
contracts_shard1_replica4]
org.apache.solr.update.processor.LogUpdateProcessor;
[contracts_shard1_replica4] webapp=/solr path=/update params=
{wt=javabin=2} {deleteByQuery=id:(9467353f398448788c261aa347d75b8b
93332ab7f7ff4141a371713871ab65ad 8568e0eab8364bfc89c876aadfa01022
43e14a86cbdd422880cac22d9a15d3c0 a0af8cb24ef94d25b9691eee1f7024ca 8ad...
INFO  - 2017-07-21 05:59:23.845; [contracts shard1 core_node2
contracts_shard1_replica4]
org.apache.solr.update.processor.LogUpdateProcessor;
[contracts_shard1_replica4] webapp=/solr path=/update params=
{wt=javabin=2} {add=[43e14a86cbdd422880cac22d9a15d3c0
(1573510697298427904)]} 0 26582
  C:\solrIssue\solr_server2\solr.log.1
INFO  - 2017-07-21 05:54:59.595; [contracts shard1 core_node4
contracts_shard1_replica5]
org.apache.solr.update.processor.LogUpdateProcessor;
[contracts_shard1_replica5] webapp=/solr path=/update params=
{update.distrib=FROMLEADER&_version_=-1573510446380482560=http://146.XXX.com:4001/solr/contracts_shard1_replica4/=javabin=2}
 {deleteByQuery=id:(9467353f398448788c261aa347d75b8b
93332ab7f7ff4141a371713871ab65ad 8568e0eab8364bfc89c876aadfa01022
43e14a86cbdd422880cac22d9a15d3c0 a0af8cb24ef94d25b9691eee1f7024ca 8ad...
INFO  - 2017-07-21 05:54:59.595; [contracts shard1 core_node4
contracts_shard1_replica5]
org.apache.solr.update.processor.LogUpdateProcessor;
[contracts_shard1_replica5] webapp=/solr path=/update params=
{update.distrib=FROMLEADER&_version_=-1573510446380482560=http://146.XXX.com:4001/solr/contracts_shard1_replica4/=javabin=2}
 {deleteByQuery=id:(9467353f398448788c261aa347d75b8b
93332ab7f7ff4141a371713871ab65ad 8568e0eab8364bfc89c876aadfa01022
43e14a86cbdd422880cac22d9a15d3c0 a0af8cb24ef94d25b9691eee1f7024ca 8ad...
  C:\solrIssue\solr_server3.log
INFO  - 2017-07-21 05:54:59.844; [contracts shard1 core_node1
contracts_shard1_replica3]
org.apache.solr.update.processor.LogUpdateProcessor;
[contracts_shard1_replica3] webapp=/solr path=/update params=
{update.distrib=FROMLEADER&_version_=-1573510446380482560=http://146.XXX.com:4001/solr/contracts_shard1_replica4/=javabin=2}
 {deleteByQuery=id:(9467353f398448788c261aa347d75b8b
93332ab7f7ff4141a371713871ab65ad 8568e0eab8364bfc89c876aadfa01022
43e14a86cbdd422880cac22d9a15d3c0 a0af8cb24ef94d25b9691eee1f7024ca 8ad...
INFO  - 2017-07-21 05:54:59.844; [contracts shard1 core_node1
contracts_shard1_replica3]
org.apache.solr.update.processor.LogUpdateProcessor;
[contracts_shard1_replica3] webapp=/solr path=/update params=
{update.distrib=FROMLEADER&_version_=-1573510446380482560=http://146.XXX.com:4001/solr/contracts_shard1_replica4/=javabin=2}
 {deleteByQuery=id:(9467353f398448788c261aa347d75b8b
93332ab7f7ff4141a371713871ab65ad 8568e0eab8364bfc89c876aadfa01022
43e14a86cbdd422880cac22d9a15d3c0 a0af8cb24ef94d25b9691eee1f7024ca 8ad...
  C:\solrIssue\solr_server4\solr.log.1
INFO  - 2017-07-21 05:54:59.734; [contracts shard1 core_node3
contracts_shard1_replica1]
org.apache.solr.update.processor.LogUpdateProcessor;
[contracts_shard1_replica1] webapp=/solr path=/update params=
{update.distrib=FROMLEADER&_version_=-1573510446380482560=http://146.XXX.com:4001/solr/contracts_shard1_replica4/=javabin=2}
 {deleteByQuery=id:(9467353f398448788c261aa347d75b8b
93332ab7f7ff4141a371713871ab65ad 8568e0eab8364bfc89c876aadfa01022
43e14a86cbdd422880cac22d9a15d3c0 a0af8cb24ef94d25b9691eee1f7024ca 8ad...
INFO  - 2017-07-21 05:54:59.734; [contracts shard1 core_node3
contracts_shard1_replica1]
org.apache.solr.update.processor.LogUpdateProcessor;
[contracts_shard1_replica1] webapp=/solr path=/update params=

Re: Return only matched multi-valued field

2017-08-21 Thread Koji Sekiguchi

Hi,

I don't think Lucene/Solr can tell you which value of the field matched the
query you posted. The usual way to find out is to use the Highlighter.

Koji


On 2017/08/22 2:46, ruby wrote:

Is there a way to return only the matched field from a multivalued field
using filtering?





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Return-only-matched-multi-valued-field-tp4351494.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: FastVector does not highlight for phrase query when it contains stop word/s

2017-08-21 Thread Rick Leir
Recent discussions have recommended that you not use stop words in any case. 
Cheers -- Rick

On August 21, 2017 11:47:11 AM EDT, Jagdish Vasani 
 wrote:
>Hi  Solr Users,
>
>I come across issue that fast Vector highlighter does not highlight
>field values when search for phrase query contains stop word.
>For example , Query is "blue is the sky" , it will return result but
>highlighting will not available for this field.
>
>I have applied,
>hl.usePhraseHighlighter=true
>hl.preserveMulti=true
>hl.highlightMultiTerm=true
>hl.fragsize=1500
>hl.snippets=5
>hl=on
>hl.fl=
>hl.tag.pre=
>hl.tag.post==
>hl.method = fastVector
>
>schema.xml, fields having
>indexed="true" termOffsets="true" stored="true" termPositions="true"
>termVectors="true" multiValued="true"
>
>I have used solr 6.4.2
>
>Does that correct or I am missing some thing ?
>
>Thanks,
>Jagdish
>
>
>
>
>NOTICE TO RECIPIENT(s):This e-mail message may contain confidential or
>legally privileged information and is intended only for the use of the
>intended recipient(s). Any unauthorized disclosure, dissemination,
>distribution, copying or the taking of any action in reliance on the
>information herein is prohibited. E-mails are not secure and cannot be
>guaranteed to be error free as they can be intercepted, amended, or
>contain viruses. Although The Digital Group has taken reasonable
>precautions to ensure no viruses are present in this email, the company
>cannot accept responsibility for any loss or damage arising from the
>use of this email or attachments. Any opinion defamatory or deemed to
>be defamatory or any material which could be reasonably branded to be a
>species of plagiarism and other statements contained in this message
>and any attachment are solely those of the author and do not
>necessarily represent those of the company.

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 

Re: Huge Facets and Streaming

2017-08-21 Thread Joel Bernstein
The current approach for high cardinality aggregations is the MapReduce
approach:

parallel(rollup(search()))

But what Yonik describes would be much more efficient.


Joel Bernstein
http://joelsolr.blogspot.com/

On Mon, Aug 21, 2017 at 3:44 PM, Mikhail Khludnev  wrote:

> Thanks for sharing this idea, Younik!
> I've raised https://issues.apache.org/jira/browse/SOLR-11271.
>
> On Mon, Aug 21, 2017 at 4:00 PM, Yonik Seeley  wrote:
>
> > On Mon, Aug 21, 2017 at 6:01 AM, Mikhail Khludnev 
> wrote:
> > > Hello!
> > >
> > > I need to count really wide facet on 30 shards index with roughly 100M
> > > docs, the facet response is about 100M values takes 0.5G in text file.
> > >
> > > So, far I experimented with old facets. It calculates per shard facets
> > > fine, but then a node which attempts to merge such 30 responses fails
> due
> > > to OOM. It's reasonable.
> > >
> > > I suppose I'll get pretty much same with json.facet, or it's better
> > > scalable?
> > >
> > > I want to experiment with Streaming Expression, which I've never taken
> > yet.
> > > I've found facet() expression and select() with partitionKeys they'll
> try
> > > to merge facet values in FacetComponent/Module anyway.
> > > Is there a way to merge per-shard facet responses with Streaming?
> >
> > Yeah, I think I've mentioned before that this is the way it should be
> > implemented (per-shard distrib=false facet request merged by streaming
> > expression).
> > The JSON Facet "stream" method does stream (i.e. does not build up the
> > response all in memory first), but only at the shard level and not at
> > the distrib/merge level.  This could then be fed into streaming to get
> > exact facets (and streaming facets).  But I don't think this has been
> > done yet.
> >
> > -Yonik
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


Re: Huge Facets and Streaming

2017-08-21 Thread Mikhail Khludnev
Thanks for sharing this idea, Yonik!
I've raised https://issues.apache.org/jira/browse/SOLR-11271.

On Mon, Aug 21, 2017 at 4:00 PM, Yonik Seeley  wrote:

> On Mon, Aug 21, 2017 at 6:01 AM, Mikhail Khludnev  wrote:
> > Hello!
> >
> > I need to count really wide facet on 30 shards index with roughly 100M
> > docs, the facet response is about 100M values takes 0.5G in text file.
> >
> > So, far I experimented with old facets. It calculates per shard facets
> > fine, but then a node which attempts to merge such 30 responses fails due
> > to OOM. It's reasonable.
> >
> > I suppose I'll get pretty much same with json.facet, or it's better
> > scalable?
> >
> > I want to experiment with Streaming Expression, which I've never taken
> yet.
> > I've found facet() expression and select() with partitionKeys they'll try
> > to merge facet values in FacetComponent/Module anyway.
> > Is there a way to merge per-shard facet responses with Streaming?
>
> Yeah, I think I've mentioned before that this is the way it should be
> implemented (per-shard distrib=false facet request merged by streaming
> expression).
> The JSON Facet "stream" method does stream (i.e. does not build up the
> response all in memory first), but only at the shard level and not at
> the distrib/merge level.  This could then be fed into streaming to get
> exact facets (and streaming facets).  But I don't think this has been
> done yet.
>
> -Yonik
>



-- 
Sincerely yours
Mikhail Khludnev


Return only matched multi-valued field

2017-08-21 Thread ruby
Is there a way to return only the matched field from a multivalued field
using filtering?





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Return-only-matched-multi-valued-field-tp4351494.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: FastVector does not highlight for phrase query when it contains stop word/s

2017-08-21 Thread Leonardo Perez Pulido
Hi,
Do you analyze your text with StopFilterFactory in your field type?
Regards.

On Mon, Aug 21, 2017 at 11:47 AM, Jagdish Vasani <
jagdish.vas...@thedigitalgroup.com> wrote:

> Hi  Solr Users,
>
> I come across issue that fast Vector highlighter does not highlight field
> values when search for phrase query contains stop word.
> For example , Query is "blue is the sky" , it will return result but
> highlighting will not available for this field.
>
> I have applied,
> hl.usePhraseHighlighter=true
> hl.preserveMulti=true
> hl.highlightMultiTerm=true
> hl.fragsize=1500
> hl.snippets=5
> hl=on
> hl.fl=
> hl.tag.pre=
> hl.tag.post==
> hl.method = fastVector
>
> schema.xml, fields having
> indexed="true" termOffsets="true" stored="true" termPositions="true"
> termVectors="true" multiValued="true"
>
> I have used solr 6.4.2
>
> Does that correct or I am missing some thing ?
>
> Thanks,
> Jagdish
>
>
> 
>


FastVector does not highlight for phrase query when it contains stop word/s

2017-08-21 Thread Jagdish Vasani
Hi  Solr Users,

I came across an issue where the FastVector highlighter does not highlight
field values when a phrase query contains stop words.
For example, the query "blue is the sky" returns results, but no
highlighting is available for this field.

I have applied:
hl.usePhraseHighlighter=true
hl.preserveMulti=true
hl.highlightMultiTerm=true
hl.fragsize=1500
hl.snippets=5
hl=on
hl.fl=
hl.tag.pre=
hl.tag.post==
hl.method=fastVector

In schema.xml, the fields have:
indexed="true" termOffsets="true" stored="true" termPositions="true" 
termVectors="true" multiValued="true"

I am using Solr 6.4.2.

Is that correct, or am I missing something?

Thanks,
Jagdish






Re: Solr sort incorrectly

2017-08-21 Thread amukherjee10
Thanks a lot Erick!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-sort-incorrectly-tp4351419p4351463.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr sort incorrectly

2017-08-21 Thread Erick Erickson
String types do no analysis whatsoever. Therefore case matters: all
upper case sorts before all lower case. Punctuation matters. Spaces
matter, etc.

What people often do is create a special field for sorting and
normalize it by folding case etc. Usually you have to do this on the
input. You might also be able to use a KeywordTokenizer to produce
sortable tokens on text fields, which would allow you to put the
normalization into your schema instead. Look for "alphaOnlySort" in
some of the example schemas for a model.

Best,
Erick
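
A small illustration of the point above, using the four titles from the
question. Python compares strings by code point, which for these ASCII
titles mimics how an unanalyzed string field sorts. The sort_key function
is a hypothetical normalization (case folding plus trimming), not something
Solr provides under that name.

```python
titles = [
    "onRisks - The variable affecting choices in the 21st century",
    "onRisks - Econs, Humans and the Perception",
    "Witch Redemption Authorization",
    "Walk Infographic - Rising",
]


def sort_key(title):
    # Hypothetical normalization for a dedicated sort field:
    # trim surrounding whitespace and fold case.
    return title.strip().casefold()


# Raw comparison: lower-case "o" ranks above upper-case "W",
# so the "onRisks" titles come first in descending order.
raw_desc = sorted(titles, reverse=True)

# Case-folded comparison: "witch" ranks above "walk" and "onrisks".
normalized_desc = sorted(titles, key=sort_key, reverse=True)

print(raw_desc[0])
print(normalized_desc[0])
```

With the raw values the first "onRisks" title comes out on top, as reported;
with the folded key "Witch Redemption Authorization" does.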

On Mon, Aug 21, 2017 at 3:52 AM, amukherjee10  wrote:
> I am passing a Solr query to fetch few results. Im trying to sort the data
> based on field of "string" type - documenttitle_s. Below is the full query
> that is getting passed:
> http://localhost:8000/solr/web/select?q=**=/standard=documenttitle_t+documentproductname_s+documenttypetext_s+mediacontent_t=0=(_latestversion:true)=(_template:f2909e5b58954a53affe8360b275a739)=0=documenttitle_s+desc=50=true=2.2
>
> The issue is that, though most of the data is getting sorted, few results
> are causing some anomaly. For example: text1 = onRisks - The variable
> affecting choices in the 21st century, text2 = onRisks - Econs, Humans and
> the Perception, text3 = Witch Redemption Authorization, text4 = Walk
> Infographic - Rising
> Ideally text3 should come on top when sorted in desc manner, but text1 is
> coming on top.
>
> Can anyone tell me what is going wrong here?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Solr-sort-incorrectly-tp4351419.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: EdgeNGramFilterFactory did not work

2017-08-21 Thread Erick Erickson
Have you explored eDismax? You specify what fields you want the query
distributed over, often in solrconfig.xml in a request handler.

Best,
Erick
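
A rough sketch of what such a request could look like when the fields are
passed on the URL instead of being configured in solrconfig.xml. defType
and qf are standard eDismax parameters. The base URL and the collection
name "products" are placeholders (assumptions); the field names are the
ones mentioned in this thread.

```python
from urllib.parse import urlencode


def edismax_url(base_url, collection, user_query, fields):
    """Build a /select URL that spreads the query over several fields."""
    params = urlencode({
        "q": user_query,
        "defType": "edismax",
        "qf": " ".join(fields),  # query fields, space separated
    })
    return f"{base_url}/{collection}/select?{params}"


# Base URL and collection name are hypothetical.
url = edismax_url(
    "http://localhost:8983/solr",
    "products",
    "27VAN",
    ["textNgramStr2", "name", "description"],
)
print(url)
```

Each field listed in qf is searched with its own analyzer, so the n-gram
field and the plain text fields can be combined in one query.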

On Mon, Aug 21, 2017 at 5:46 AM, Guilleret Florian
 wrote:
> Ok perfect I will do that then i have only 5-6 field so its ok.
>
> Thank you for your help !
>
> Guilleret Florian 
> Tel : +33 6 21 28 43 06
>
> 2017-08-21 13:32 GMT+02:00 Junte Zhang :
>
>> Unfortunately not. I would recommend not to use a catch-all field (saves
>> storage!), but list all the fields you consider "fulltext" in your
>> application in a disjunction query. This should not come at the expensive
>> of performance, unless you have a huge amount of fields. Then it should
>> work.
>>
>> /JZ
>>
>> -Original Message-
>> From: Guilleret Florian [mailto:guilleret.flor...@gmail.com]
>> Sent: Monday, August 21, 2017 11:21 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: EdgeNGramFilterFactory did not work
>>
>> Yes i use the catch all field like :
>>
>> 
>>
>> And you rights when I request solr like = this ?df=textNgramStr2 it works
>> perfectly.
>>
>> But is there a way to still use the catch all field ? Because i have other
>> file like name and description who are only in the _text_ and not in
>> textNgramStr2
>>
>> Kind regards
>>
>> Guilleret Florian  Tel : +33 6 21 28 43 06
>>
>> 2017-08-21 11:07 GMT+02:00 Junte Zhang :
>>
>> > You have to specify the field where you specified this field analyzer
>> > in your request. If you use the catch all field by omitting the field,
>> > it does not use your filter factory.
>> >
>> > /JZ
>> >
>> > -Original Message-
>> > From: Guilleret Florian [mailto:guilleret.flor...@gmail.com]
>> > Sent: Thursday, August 17, 2017 2:42 PM
>> > To: solr-user@lucene.apache.org
>> > Subject: EdgeNGramFilterFactory did not work
>> >
>> > Hi,
>> >
>> > I want to got this use :
>> >
>> > My document got SKU. Like : 27VAN670
>> >
>> > When i query 27VAN670 solr return my document well.
>> >
>> > But when i query 27VAN solr return 0 document;
>> >
>> > So someone telle me to use EdgeNGramFilterFactory.
>> >
>> > With SOLR Api I add new Field Type :
>> >
>> > {
>> >   "add-field-type" : {
>> >  "name":"textNgram2",
>> >  "class":"solr.TextField",
>> >  "positionIncrementGap":"100",
>> >  "analyzer" : {
>> > "tokenizer":{
>> >"class":"solr.StandardTokenizerFactory" },
>> > "filters":[{
>> >"class":"solr.EdgeNGramFilterFactory",
>> >"minGramSize":"1",
>> >"maxGramSize":"15"
>> >}]}
>> >}
>> >
>> > }
>> >
>> > And I also create a new Field :
>> >
>> > {
>> >   "add-field":{
>> >  "name":"textNgramStr2",
>> >  "type":"textNgram2",
>> >  "stored":true,
>> >  "indexed":true
>> >   }
>> > }
>> >
>> > Now in SOLR admin I cans saw my new textNgramStr2. And when I update
>> > my document in solr I put SKU of my product in textNgramStr2.
>> >
>> > But solr still found nothing when I request 27VAN
>> >
>> > Can someone help me on that ?
>> >
>>


facet processing module in Version 6.x needs significantly more time compared to version 4.10

2017-08-21 Thread guenterh.li...@bluewin.ch
Hi,
I can't figure out why facet processing in version 6 takes
significantly more time than in version 4.
The debug timings (for 30 million documents):
Solr 4
280.00.0280.0
(once the query is cached)
Before caching: between 1.5 and 2 sec
Solr 6.x (my last try was with 6.6)
without docValues on the faceting fields (same schema as version 4)
5874.00.05873.00.0
The time does not improve even after repeating the query several times.
Solr 6.6 with docValues on the faceting fields
9837.00.09837.00.0
Query used (our production system with version 4):
http://search.swissbib.ch/solr/sb-biblio/select?debugQuery=true=*:*=true=union=navAuthor_full=format=language=navSub_green=navSubform=publishDate=edismax=2=arrarr=recip(abs(ms(NOW/DAY,freshness)),3.16e-10,100,100)=*,score=250=0=AND=score+desc=0=START_HILITE=100=END_HILITE=false=title_short^1000+title_alt^200+title_sub^200+title_old^200+title_new^200+author^750+author_additional^100+author_additional_dsv11_txt_mv^100+title_additional_dsv11_txt_mv^100+series^200+topic^500+addfields_txt_mv^50+publplace_txt_mv^25+publplace_dsv11_txt_mv^25+fulltext+callnumber^1000+ctrlnum^1000+publishDate+isbn+variant_isbn_isn_mv+issn+localcode+id=title_short^1000=1=fulltext&=xml=count
Running the queries on smaller indices (8 million docs), the difference is
similar, although the absolute processing times are smaller.
Any hints as to why there is such a huge difference?
Günter


Re: Huge Facets and Streaming

2017-08-21 Thread Yonik Seeley
On Mon, Aug 21, 2017 at 6:01 AM, Mikhail Khludnev  wrote:
> Hello!
>
> I need to count really wide facet on 30 shards index with roughly 100M
> docs, the facet response is about 100M values takes 0.5G in text file.
>
> So, far I experimented with old facets. It calculates per shard facets
> fine, but then a node which attempts to merge such 30 responses fails due
> to OOM. It's reasonable.
>
> I suppose I'll get pretty much same with json.facet, or it's better
> scalable?
>
> I want to experiment with Streaming Expression, which I've never taken yet.
> I've found facet() expression and select() with partitionKeys they'll try
> to merge facet values in FacetComponent/Module anyway.
> Is there a way to merge per-shard facet responses with Streaming?

Yeah, I think I've mentioned before that this is the way it should be
implemented (per-shard distrib=false facet request merged by streaming
expression).
The JSON Facet "stream" method does stream (i.e. does not build up the
response all in memory first), but only at the shard level and not at
the distrib/merge level.  This could then be fed into streaming to get
exact facets (and streaming facets).  But I don't think this has been
done yet.

-Yonik
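
To illustrate the idea only (this is not Solr code and not an existing
streaming expression): if every shard emits its facet buckets sorted by
value, the merge can consume the per-shard streams lazily and keep just one
pending bucket per shard in memory. The function name and the sample data
are made up for the sketch.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_shard_facets(*shard_streams):
    """Lazily merge per-shard (value, count) streams sorted by value.

    Only one pending bucket per shard is held at a time, so the merged
    result never has to be built up in memory.
    """
    merged = heapq.merge(*shard_streams, key=itemgetter(0))
    for value, group in groupby(merged, key=itemgetter(0)):
        yield value, sum(count for _, count in group)


# Made-up per-shard responses, each already sorted by facet value.
shard_a = [("apple", 3), ("cherry", 1)]
shard_b = [("apple", 2), ("banana", 5)]

merged = list(merge_shard_facets(iter(shard_a), iter(shard_b)))
print(merged)
```

In a real setup the inputs would be the per-shard distrib=false responses
read as streams, and the output would be written out as it is produced
rather than collected into a list.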


Re: EdgeNGramFilterFactory did not work

2017-08-21 Thread Guilleret Florian
OK, perfect. I will do that then; I have only 5-6 fields, so it's fine.

Thank you for your help!

Guilleret Florian 
Tel : +33 6 21 28 43 06

2017-08-21 13:32 GMT+02:00 Junte Zhang :

> Unfortunately not. I would recommend not to use a catch-all field (saves
> storage!), but list all the fields you consider "fulltext" in your
> application in a disjunction query. This should not come at the expensive
> of performance, unless you have a huge amount of fields. Then it should
> work.
>
> /JZ
>
> -Original Message-
> From: Guilleret Florian [mailto:guilleret.flor...@gmail.com]
> Sent: Monday, August 21, 2017 11:21 AM
> To: solr-user@lucene.apache.org
> Subject: Re: EdgeNGramFilterFactory did not work
>
> Yes i use the catch all field like :
>
> 
>
> And you rights when I request solr like = this ?df=textNgramStr2 it works
> perfectly.
>
> But is there a way to still use the catch all field ? Because i have other
> file like name and description who are only in the _text_ and not in
> textNgramStr2
>
> Kind regards
>
> Guilleret Florian  Tel : +33 6 21 28 43 06
>
> 2017-08-21 11:07 GMT+02:00 Junte Zhang :
>
> > You have to specify the field where you specified this field analyzer
> > in your request. If you use the catch all field by omitting the field,
> > it does not use your filter factory.
> >
> > /JZ
> >
> > -Original Message-
> > From: Guilleret Florian [mailto:guilleret.flor...@gmail.com]
> > Sent: Thursday, August 17, 2017 2:42 PM
> > To: solr-user@lucene.apache.org
> > Subject: EdgeNGramFilterFactory did not work
> >
> > Hi,
> >
> > I want to got this use :
> >
> > My document got SKU. Like : 27VAN670
> >
> > When i query 27VAN670 solr return my document well.
> >
> > But when i query 27VAN solr return 0 document;
> >
> > So someone telle me to use EdgeNGramFilterFactory.
> >
> > With SOLR Api I add new Field Type :
> >
> > {
> >   "add-field-type" : {
> >  "name":"textNgram2",
> >  "class":"solr.TextField",
> >  "positionIncrementGap":"100",
> >  "analyzer" : {
> > "tokenizer":{
> >"class":"solr.StandardTokenizerFactory" },
> > "filters":[{
> >"class":"solr.EdgeNGramFilterFactory",
> >"minGramSize":"1",
> >"maxGramSize":"15"
> >}]}
> >}
> >
> > }
> >
> > And I also create a new Field :
> >
> > {
> >   "add-field":{
> >  "name":"textNgramStr2",
> >  "type":"textNgram2",
> >  "stored":true,
> >  "indexed":true
> >   }
> > }
> >
> > Now in SOLR admin I cans saw my new textNgramStr2. And when I update
> > my document in solr I put SKU of my product in textNgramStr2.
> >
> > But solr still found nothing when I request 27VAN
> >
> > Can someone help me on that ?
> >
>


RE: EdgeNGramFilterFactory did not work

2017-08-21 Thread Junte Zhang
Unfortunately not. I would recommend not using a catch-all field (this saves
storage!) and instead listing all the fields you consider "fulltext" in your
application in a disjunction query. This should not come at the expense of
performance unless you have a huge number of fields. Then it should work.

/JZ

-Original Message-
From: Guilleret Florian [mailto:guilleret.flor...@gmail.com] 
Sent: Monday, August 21, 2017 11:21 AM
To: solr-user@lucene.apache.org
Subject: Re: EdgeNGramFilterFactory did not work

Yes i use the catch all field like :



And you rights when I request solr like = this ?df=textNgramStr2 it works 
perfectly.

But is there a way to still use the catch all field ? Because i have other file 
like name and description who are only in the _text_ and not in
textNgramStr2

Kind regards

Guilleret Florian  Tel : +33 6 21 28 43 06

2017-08-21 11:07 GMT+02:00 Junte Zhang :

> You have to specify the field where you specified this field analyzer 
> in your request. If you use the catch all field by omitting the field, 
> it does not use your filter factory.
>
> /JZ
>
> -Original Message-
> From: Guilleret Florian [mailto:guilleret.flor...@gmail.com]
> Sent: Thursday, August 17, 2017 2:42 PM
> To: solr-user@lucene.apache.org
> Subject: EdgeNGramFilterFactory did not work
>
> Hi,
>
> I want to got this use :
>
> My document got SKU. Like : 27VAN670
>
> When i query 27VAN670 solr return my document well.
>
> But when i query 27VAN solr return 0 document;
>
> So someone telle me to use EdgeNGramFilterFactory.
>
> With SOLR Api I add new Field Type :
>
> {
>   "add-field-type" : {
>  "name":"textNgram2",
>  "class":"solr.TextField",
>  "positionIncrementGap":"100",
>  "analyzer" : {
> "tokenizer":{
>"class":"solr.StandardTokenizerFactory" },
> "filters":[{
>"class":"solr.EdgeNGramFilterFactory",
>"minGramSize":"1",
>"maxGramSize":"15"
>}]}
>}
>
> }
>
> And I also create a new Field :
>
> {
>   "add-field":{
>  "name":"textNgramStr2",
>  "type":"textNgram2",
>  "stored":true,
>  "indexed":true
>   }
> }
>
> Now in SOLR admin I cans saw my new textNgramStr2. And when I update 
> my document in solr I put SKU of my product in textNgramStr2.
>
> But solr still found nothing when I request 27VAN
>
> Can someone help me on that ?
>


Huge Facets and Streaming

2017-08-21 Thread Mikhail Khludnev
Hello!

I need to compute a really wide facet on a 30-shard index with roughly 100M
docs; the facet response is about 100M values and takes 0.5G as a text file.

So far I have experimented with the old facets. They calculate per-shard
facets fine, but the node that attempts to merge the 30 responses then fails
with an OOM, which is understandable.

I suppose I'll get pretty much the same with json.facet, or does it scale
better?

I want to experiment with Streaming Expressions, which I have never tried.
I've found the facet() expression and select() with partitionKeys, but they
will try to merge facet values in FacetComponent/Module anyway.
Is there a way to merge per-shard facet responses with Streaming?

-- 
Sincerely yours
Mikhail Khludnev


Solr sort incorrectly

2017-08-21 Thread amukherjee10
I am sending a Solr query to fetch a few results. I'm trying to sort the data
on a field of "string" type, documenttitle_s. Below is the full query
that is being sent:
http://localhost:8000/solr/web/select?q=**=/standard=documenttitle_t+documentproductname_s+documenttypetext_s+mediacontent_t=0=(_latestversion:true)=(_template:f2909e5b58954a53affe8360b275a739)=0=documenttitle_s+desc=50=true=2.2

The issue is that, although most of the data is sorted correctly, a few
results are out of order. For example: text1 = onRisks - The variable
affecting choices in the 21st century, text2 = onRisks - Econs, Humans and
the Perception, text3 = Witch Redemption Authorization, text4 = Walk
Infographic - Rising.
Ideally text3 should come first when sorting in descending order, but text1
comes first instead.

Can anyone tell me what is going wrong here?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-sort-incorrectly-tp4351419.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: EdgeNGramFilterFactory did not work

2017-08-21 Thread Guilleret Florian
Yes, I use the catch-all field, like this:



And you are right: when I query Solr with ?df=textNgramStr2, it works
perfectly.

But is there a way to still use the catch-all field? I have other fields,
such as name and description, that are only in _text_ and not in
textNgramStr2.

Kind regards

Guilleret Florian 
Tel : +33 6 21 28 43 06

2017-08-21 11:07 GMT+02:00 Junte Zhang :

> You have to specify the field where you specified this field analyzer in
> your request. If you use the catch all field by omitting the field, it does
> not use your filter factory.
>
> /JZ
>
> -Original Message-
> From: Guilleret Florian [mailto:guilleret.flor...@gmail.com]
> Sent: Thursday, August 17, 2017 2:42 PM
> To: solr-user@lucene.apache.org
> Subject: EdgeNGramFilterFactory did not work
>
> Hi,
>
> I want to got this use :
>
> My document got SKU. Like : 27VAN670
>
> When i query 27VAN670 solr return my document well.
>
> But when i query 27VAN solr return 0 document;
>
> So someone telle me to use EdgeNGramFilterFactory.
>
> With SOLR Api I add new Field Type :
>
> {
>   "add-field-type" : {
>  "name":"textNgram2",
>  "class":"solr.TextField",
>  "positionIncrementGap":"100",
>  "analyzer" : {
> "tokenizer":{
>"class":"solr.StandardTokenizerFactory" },
> "filters":[{
>"class":"solr.EdgeNGramFilterFactory",
>"minGramSize":"1",
>"maxGramSize":"15"
>}]}
>}
>
> }
>
> And I also create a new Field :
>
> {
>   "add-field":{
>  "name":"textNgramStr2",
>  "type":"textNgram2",
>  "stored":true,
>  "indexed":true
>   }
> }
>
> Now in SOLR admin I cans saw my new textNgramStr2. And when I update my
> document in solr I put SKU of my product in textNgramStr2.
>
> But solr still found nothing when I request 27VAN
>
> Can someone help me on that ?
>


RE: EdgeNGramFilterFactory did not work

2017-08-21 Thread Junte Zhang
In your request, you have to specify the field for which you defined this
analyzer. If you omit the field and fall back to the catch-all field, your
filter factory is not used.

/JZ
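
For readers unfamiliar with edge n-grams, here is a simplified sketch of
what the filter produces for one token. It is an approximation written for
illustration, not the Lucene implementation; the defaults mirror the
minGramSize and maxGramSize values from the field type quoted below, and it
assumes the tokenizer passes 27VAN670 through as a single token.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the prefixes of token from min_gram to max_gram characters."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


# The SKU from the question, assumed to arrive as one token.
grams = edge_ngrams("27VAN670")
print(grams)
print("27VAN" in grams)
```

Because 27VAN is one of the indexed prefixes in textNgramStr2, a query
against that field matches. The catch-all field is analyzed differently and
only holds the full token, so the same query finds nothing there.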

-Original Message-
From: Guilleret Florian [mailto:guilleret.flor...@gmail.com] 
Sent: Thursday, August 17, 2017 2:42 PM
To: solr-user@lucene.apache.org
Subject: EdgeNGramFilterFactory did not work

Hi,

I have the following use case:

My documents have an SKU, for example 27VAN670.

When I query 27VAN670, Solr returns my document.

But when I query 27VAN, Solr returns 0 documents.

Someone told me to use EdgeNGramFilterFactory.

Using the Solr Schema API, I added a new field type:

{
  "add-field-type": {
    "name": "textNgram2",
    "class": "solr.TextField",
    "positionIncrementGap": "100",
    "analyzer": {
      "tokenizer": {
        "class": "solr.StandardTokenizerFactory"
      },
      "filters": [{
        "class": "solr.EdgeNGramFilterFactory",
        "minGramSize": "1",
        "maxGramSize": "15"
      }]
    }
  }
}

I also created a new field:

{
  "add-field":{
 "name":"textNgramStr2",
 "type":"textNgram2",
 "stored":true,
 "indexed":true
  }
}

Now in the Solr admin UI I can see my new textNgramStr2 field. When I update
my documents in Solr, I put the product's SKU in textNgramStr2.

But Solr still finds nothing when I query 27VAN.

Can someone help me with that?