Re: Re: Multi-language Spellcheck

2019-08-29 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Thanks, everyone!
-- 
Audrey Lorberfeld
Data Scientist, w3 Search
Digital Workplace Engineering
CIO, Finance and Operations
IBM
audrey.lorberf...@ibm.com
 

On 8/29/19, 11:28 AM, "Atita Arora"  wrote:

I would agree with the suggestion; I remember something similar being
presented at Berlin Buzzwords 19.


Re: Multi-language Spellcheck

2019-08-29 Thread Atita Arora
I would agree with the suggestion; I remember something similar being
presented at Berlin Buzzwords 19.

On Thu, Aug 29, 2019, 5:03 PM Jörn Franke  wrote:

> It could be sensible to have one spellchecker per language (either as a
> separate endpoint or selected by a query parameter at runtime). Alternatively,
> depending on your use case, you could get away with a generic field type that
> does not do anything language-specific, but I doubt it.


Re: Multi-language Spellcheck

2019-08-29 Thread Jörn Franke
It could be sensible to have one spellchecker per language (either as a
separate endpoint or selected by a query parameter at runtime). Alternatively,
depending on your use case, you could get away with a generic field type that
does not do anything language-specific, but I doubt it.
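
To make the query-parameter variant concrete, here is a minimal Python sketch of the client side. It assumes that solrconfig.xml defines one spellchecker per language, and the names `spellcheck_en` and `spellcheck_es`, the host, the collection, and the `/spell` handler path are all hypothetical. `spellcheck.dictionary` is the standard request parameter for picking a named spellchecker at query time. As far as I know, "queryAnalyzerFieldType" is set once per spellcheck component, so language-specific query analysis would probably call for the separate-endpoint variant instead.

```python
from urllib.parse import urlencode

# Hypothetical mapping from language code to the spellchecker name
# configured in solrconfig.xml (these names are assumptions).
DICTIONARY_BY_LANGUAGE = {
    "en": "spellcheck_en",
    "es": "spellcheck_es",
}


def build_spellcheck_url(base_url, query, language):
    """Build a request URL that asks Solr to spellcheck `query`
    against the dictionary registered for `language`."""
    try:
        dictionary = DICTIONARY_BY_LANGUAGE[language]
    except KeyError:
        raise ValueError(f"no spellchecker configured for language {language!r}")
    params = {
        "q": query,
        "spellcheck": "true",
        # Selects which named spellchecker handles this request.
        "spellcheck.dictionary": dictionary,
        "wt": "json",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical host, collection, and handler path.
url = build_spellcheck_url("http://localhost:8983/solr/docs/spell", "busqeda", "es")
print(url)
```

The language itself would have to come from the application (a UI setting or a language detector); the sketch only shows how it could be mapped to a dictionary name.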

> On 29.08.2019 at 16:20, Audrey Lorberfeld - audrey.lorberf...@ibm.com wrote:
> 
> Hi All,
> 
> We are starting up an internal search engine that has to work for many 
> different languages. We are starting with a POC of Spanish and English 
> documents, and we are using the DirectSolrSpellChecker. 
> 
> From reading others' threads online, I know that we have to have multiple 
> spellcheckers to do this (1 for each language). However, would someone be 
> able to clarify what should go in the "queryAnalyzerFieldType" tag? It seems 
> that the tag can only take a single field. So, does that mean that I have to 
> have a copy field that collates all tokens from all languages? Image of code 
> attached for reference & sample code of English-only spellchecker below: 
> 
> 
> 
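>   <str name="queryAnalyzerFieldType">???</str>
>
>   <lst name="spellchecker">
>     <str name="name">default</str>
>     <str name="field">minimal_en</str>
>     <str name="classname">solr.DirectSolrSpellChecker</str>
>     <str name="distanceMeasure">internal</str>
>     <float name="accuracy">0.5</float>
>     <int name="maxEdits">2</int>
>     <int name="minPrefix">1</int>
>     <int name="maxInspections">5</int>
>     <int name="minQueryLength">4</int>
>     <float name="maxQueryFrequency">0.05</float>
>   </lst>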
> ...
> 
> Thank you!
> 
> -- 
> Audrey Lorberfeld
> Data Scientist, w3 Search
> Digital Workplace Engineering
> CIO, Finance and Operations
> IBM
> audrey.lorberf...@ibm.com
> 
> 
> On 8/29/19, 10:12 AM, "Joe Obernberger"  wrote:
> 
>Thank you Erick.  I'm upgrading from 7.6.0 and as far as I can tell the 
>schema and configuration (solrconfig.xml) isn't different (apart from 
>the version).  Right now, I'm at a loss.  I still have the 7.6.0 cluster 
>running and the query works OK there.
> 
>Sure seems like I'm missing a field called 'features', but it's not 
>defined in the prior schema either.  Thanks again!
> 
>-Joe
> 
>>On 8/28/2019 6:19 PM, Erick Erickson wrote:
>> What it says ;)
>> 
>> My guess is that your configuration mentions the field “features” somewhere,
>> perhaps in carrot.snippet or carrot.title.
>> 
>> But it’s a guess.
>> 
>> Best,
>> Erick
>> 
>>> On Aug 28, 2019, at 5:18 PM, Joe Obernberger  
>>> wrote:
>>> 
>>> Hi All - trying to use clustering with SolrCloud 8.2, but getting this 
>>> error:
>>> 
>>> "msg":"Error from server at null: org.apache.solr.search.SyntaxError: Query 
>>> Field 'features' is not a valid field name",
>>> 
>>> The URL I'm using is:
>>> http://solrServer:9100/solr/DOCS/select?q=*%3A*&qt=/clustering&clustering=true&clustering.collection=true
>>> 
>>> Thanks for any ideas!
>>> 
>>> Complete response:
>>> {
>>>   "responseHeader":{
>>>     "zkConnected":true,
>>>     "status":400,
>>>     "QTime":38,
>>>     "params":{
>>>       "q":"*:*",
>>>       "qt":"/clustering",
>>>       "clustering":"true",
>>>       "clustering.collection":"true"}},
>>>   "error":{
>>>     "metadata":[
>>>       "error-class","org.apache.solr.common.SolrException",
>>>       "root-error-class","org.apache.solr.common.SolrException",
>>>       "error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException",
>>>       "root-error-class","org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException"],
>>>     "msg":"Error from server at null: org.apache.solr.search.SyntaxError: Query Field 'features' is not a valid field name",
>>>     "code":400}}
>>> 
>>> 
>>> -Joe
>>> 
>> 
>> 
> 
> 
>