RE: FW: NRTCachingDirectory threads stuck

2015-02-23 Thread Moshe Recanati
Thank you.



Regards,
Moshe Recanati
SVP Engineering
Office + 972-73-2617564
Mobile  + 972-52-6194481
Skype    :  recanati

More at:  www.kmslh.com | LinkedIn | FB


-Original Message-
From: Mikhail Khludnev [mailto:mkhlud...@griddynamics.com] 
Sent: Sunday, February 22, 2015 6:16 PM
To: solr-user
Subject: Re: FW: NRTCachingDirectory threads stuck

On Sun, Feb 22, 2015 at 1:54 PM, Moshe Recanati mos...@kmslh.com wrote:

 Hi Mikhail,
 Thank you.
 1. Regarding jetty threads - how can I reduce them?


https://wiki.eclipse.org/Jetty/Howto/High_Load#Thread_Pool
Note that you'll get a 503 (or a similar error) when the pool size is exceeded.
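For reference, capping the pool in Jetty's etc/jetty.xml looks roughly like the sketch below (element and class names follow the Jetty wiki page above; the numbers are only illustrative):

```xml
<!-- etc/jetty.xml: illustrative thread pool cap -->
<Configure id="Server" class="org.eclipse.jetty.server.Server">
  <Set name="ThreadPool">
    <New class="org.eclipse.jetty.util.thread.QueuedThreadPool">
      <Set name="minThreads">8</Set>
      <!-- start near the number of CPU cores, as suggested above -->
      <Set name="maxThreads">32</Set>
    </New>
  </Set>
</Configure>
```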


 2. Is it related to the fact we're running Solr 4.0 in parallel on 
 this machine?


Are their index dirs different? Regardless, running two servers on the same
machine leads to resource contention. What does `top` say?



 Thank you


 Regards,
 Moshe Recanati
 SVP Engineering
 Office + 972-73-2617564
 Mobile  + 972-52-6194481
 Skype:  recanati

 More at:  www.kmslh.com | LinkedIn | FB


 -Original Message-
 From: Mikhail Khludnev [mailto:mkhlud...@griddynamics.com]
 Sent: Sunday, February 22, 2015 11:18 AM
 To: solr-user
 Subject: Re: FW: NRTCachingDirectory threads stuck

 Hello,

 I checked 20020.tdump. From the update perspective it's OK: I see a
 single thread that committed and is awaiting the opening of a searcher.
 There are a few very bad signs:
 - there are many threads executing search requests in parallel, roughly
 a hundred of them. This is a dead end. Consider limiting the number of
 Jetty threads, starting from the number of cores available;
 - the heap is full, which is a no-go for Java. Either increase it,
 reduce the load, or make sure there are no leaks;
 - I see many threads executing Luke handler code; that might be a
 misconfiguration, or the regular approach for Solr replication. I'm not
 sure here.


 On Sun, Feb 22, 2015 at 9:57 AM, Moshe Recanati mos...@kmslh.com wrote:

   Hi,
 
  I saw message rejected because of attachment.
 
  I uploaded data to drive
 
 
  https://drive.google.com/file/d/0B0GR0M-lL5QHVDNjZlUwVTR2QTQ/view?usp=sharing
 
 
 
  Moshe
 
 
 
  *From:* Moshe Recanati [mailto:mos...@kmslh.com]
  *Sent:* Sunday, February 22, 2015 8:37 AM
  *To:* solr-user@lucene.apache.org
  *Subject:* RE: NRTCachingDirectory threads stuck
 
 
 
  *From:* Moshe Recanati
  *Sent:* Sunday, February 22, 2015 8:34 AM
  *To:* solr-user@lucene.apache.org
  *Subject:* NRTCachingDirectory threads stuck
 
 
 
  Hi,
 
  We're running two Solr servers on the same machine.

  One is Solr 4.0 and the second is Solr 4.7.1.

  In Solr 4.7.1 we see very strange behavior: while indexing documents we
  get a memory spike from 1 GB to 4 GB in a couple of minutes, and a huge
  number of threads stuck in the

  NRTCachingDirectory.openInput method.
 
 
 
  Thread dump and GC log attached.
 
 
 
  Are you familiar with this behavior? What can be the trigger for this?
 
 
 
  Thank you,
 
 
 
 
 
  *Regards,*
 
  *Moshe Recanati*
 
  *SVP Engineering*
 
  Office + 972-73-2617564
 
  Mobile  + 972-52-6194481
 
  Skype:  recanati
  [image: KMS2]
  http://finance.yahoo.com/news/kms-lighthouse-named-gartner-cool-121000184.html
 
  More at:  www.kmslh.com | LinkedIn
  http://www.linkedin.com/company/kms-lighthouse | FB 
  https://www.facebook.com/pages/KMS-lighthouse/123774257810917
 
 
 
 
 



 --
 Sincerely yours
 Mikhail Khludnev
 Principal Engineer,
 Grid Dynamics

 http://www.griddynamics.com
 mkhlud...@griddynamics.com




--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
mkhlud...@griddynamics.com


highlighting the boolean query

2015-02-23 Thread Dmitry Kan
Hello!

In Solr 4.3.1 there seems to be some inconsistency in the highlighting of a
boolean query:

a OR (b c) OR d

This returns a proper hit, which shows that only d was included in the
document score calculation.

But the highlighter returns both d and c in <em> tags.

Is this a known issue of the standard highlighter? Can it be mitigated?


-- 
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info


Re: Question on CloudSolrServer API

2015-02-23 Thread Shalin Shekhar Mangar
By default the max connections is set to 128 and max connections per host
is 32. You can configure an HttpClient as per your needs and pass it as a
parameter to CloudSolrServer's constructor.
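A sketch of doing that in SolrJ 4.x; the class and property names below are from the 4.x API as I recall them, and the pool limits and ZooKeeper addresses are illustrative:

```java
// Build an HttpClient with explicit pool limits and hand it to
// CloudSolrServer via an LBHttpSolrServer (SolrJ 4.x API sketch).
ModifiableSolrParams params = new ModifiableSolrParams();
params.set(HttpClientUtil.PROP_MAX_CONNECTIONS, 256);          // default 128
params.set(HttpClientUtil.PROP_MAX_CONNECTIONS_PER_HOST, 64);  // default 32
HttpClient httpClient = HttpClientUtil.createClient(params);

LBHttpSolrServer lb = new LBHttpSolrServer(httpClient);
CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181", lb);
server.setDefaultCollection("collection1");
// Reuse this one instance for all requests; it is thread-safe.
```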

On Mon, Feb 23, 2015 at 3:49 PM, Manohar Sripada manohar...@gmail.com
wrote:

 Thanks for the response. How do I control the number of pooled
 connections in the SolrJ client? Also, what are the default values for
 the maximum number of connections?

 - Thanks

 On Thu, Feb 19, 2015 at 6:09 PM, Shalin Shekhar Mangar 
 shalinman...@gmail.com wrote:

  No, you should reuse the same CloudSolrServer instance for all requests.
  It is a thread-safe object. You could also create a static/common
  HttpClient instance and pass it to the constructor of CloudSolrServer,
  but even if you don't, it will create one internally and use it for all
  requests so that connections can be pooled.
  On 19-Feb-2015 1:44 pm, Manohar Sripada manohar...@gmail.com wrote:
 
   Hi All,
  
   I am using CloudSolrServer API of SolrJ library from my application to
   query Solr. Here, I am creating a new connection to Solr for every
 search
   that I am doing. Once I got the results I am closing the connection.
  
   Is this the correct way? How does Solr create connections internally?
  Does
   it maintain a pool of connections (if so how to configure it)?
  
   Thanks,
   Manohar
  
 




-- 
Regards,
Shalin Shekhar Mangar.


Re: Question on CloudSolrServer API

2015-02-23 Thread Manohar Sripada
Thanks for the response. How do I control the number of pooled connections
in the SolrJ client? Also, what are the default values for the maximum
number of connections?

- Thanks

On Thu, Feb 19, 2015 at 6:09 PM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 No, you should reuse the same CloudSolrServer instance for all requests. It
 is a thread-safe object. You could also create a static/common HttpClient
 instance and pass it to the constructor of CloudSolrServer, but even if you
 don't, it will create one internally and use it for all requests so that
 connections can be pooled.
 On 19-Feb-2015 1:44 pm, Manohar Sripada manohar...@gmail.com wrote:

  Hi All,
 
  I am using CloudSolrServer API of SolrJ library from my application to
  query Solr. Here, I am creating a new connection to Solr for every search
  that I am doing. Once I got the results I am closing the connection.
 
  Is this the correct way? How does Solr create connections internally?
 Does
  it maintain a pool of connections (if so how to configure it)?
 
  Thanks,
  Manohar
 



CollationKeyFilterFactory stops suggestions and collations

2015-02-23 Thread Nitin Solanki
Hello all,
  I am working on collations. In the Solr docs I found that Unicode
collation makes searching fast. But after applying
CollationKeyFilterFactory in schema.xml, both suggestions and collations
stop. Please check the configurations below and help me.

*Schema.xml:*

<fieldType name="textSpell" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.CollationKeyFilterFactory" language=""
            strength="primary"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.CollationKeyFilterFactory" language=""
            strength="primary"/>
  </analyzer>
</fieldType>
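One thing that stands out in the field type above: the language attribute is empty. The UnicodeCollation wiki page shows it carrying a locale code, so a filled-in variant (the "en" value here is only an example) would look like:

```xml
<filter class="solr.CollationKeyFilterFactory" language="en"
        strength="primary"/>
```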


Solrconfig.xml:

<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="df">gram_ci</str>
    <!-- Solr will use suggestions from both the 'default' spellchecker
         and from the 'wordbreak' spellchecker and combine them.
         collations (re-written queries) can include a combination of
         corrections from both spellcheckers -->
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck">on</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">25</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.maxResultsForSuggest">10</str>
    <str name="spellcheck.alternativeTermCount">25</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.maxCollations">100</str>
    <str name="spellcheck.maxCollationTries">1000</str>
    <str name="spellcheck.collateExtendedResults">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
    <!-- <str>suggest</str> -->
    <!-- <str>query</str> -->
  </arr>
</requestHandler>


Atomic Update while having fields with attribute stored=true in schema

2015-02-23 Thread Rahul Bhooteshwar
Hi,
I have around 50 fields in my schema, of which 20 are stored="true" and
the rest are stored="false".
For partial updates (atomic updates), it is mentioned in many places that
the fields in the schema should have stored="true". I have also tried an
atomic update on documents having fields with stored="false" and
indexed="true", and it didn't work (my whole document vanished from Solr,
or at least I am unable to search it now). That is although I didn't
change the existing values of the fields having stored="false".

Which means I have to change all my fields to stored="true" if I want to
use atomic updates, right?
Will it affect the performance of Solr? If yes, what is the best practice
to reduce the performance degradation as much as possible? Thanks in
advance.

Thanks and Regards,
Rahul Bhooteshwar
Enterprise Software Engineer
HotWax Systems http://www.hotwaxsystems.com - The global leader in
innovative enterprise commerce solutions powered by Apache OFBiz.
ApacheCon US 2014 Silver Sponsor
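For context, an atomic update goes through the normal update endpoint; a minimal sketch follows (the URL, core name, and field names here are made up for illustration):

```shell
# Set one field; Solr re-reads all other stored fields and re-writes the
# whole document behind the scenes, which is why stored="true" matters.
curl 'http://localhost:8983/solr/collection1/update?commit=true' \
  -H 'Content-Type: application/json' \
  -d '[{"id": "doc1", "price": {"set": 19.95}}]'
```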


Re: Atomic Update while having fields with attribute stored=true in schema

2015-02-23 Thread Yago Riveiro
Fields with stored=true have the downside of disk space: your index will
grow.

Maybe updating the whole document can be an option...









—
/Yago Riveiro

On Mon, Feb 23, 2015 at 1:02 PM, Rahul Bhooteshwar
rahul.bhootesh...@hotwaxsystems.com wrote:

 Hi Yago Riveiro,
 Thanks for your quick reply. I am using Solr for faceted search via
 *SolrJ*. I am using facet queries and filter queries. I am new to Solr,
 so I would like to know the best practice for handling such scenarios.
 Thanks and Regards,
 Rahul Bhooteshwar
 Enterprise Software Engineer
 HotWax Systems http://www.hotwaxsystems.com - The global leader in
 innovative enterprise commerce solutions powered by Apache OFBiz.
 ApacheCon US 2014 Silver Sponsor
 On Mon, Feb 23, 2015 at 5:42 PM, Yago Riveiro yago.rive...@gmail.com
 wrote:
 "Which means I have to change all my fields to stored="true" if I want to
 use atomic updates, right?"




 Yes, and re-index all your data.




 "Will it affect the performance of Solr?"




 What type of queries are you doing now?


 —
 /Yago Riveiro

 On Mon, Feb 23, 2015 at 12:05 PM, Rahul Bhooteshwar
 rahul.bhootesh...@hotwaxsystems.com wrote:

  Hi,
  I have around 50 fields in my schema, of which 20 are stored="true"
  and the rest are stored="false".
  For partial updates (atomic updates), it is mentioned in many places
  that the fields in the schema should have stored="true". I have also
  tried an atomic update on documents having fields with stored="false"
  and indexed="true", and it didn't work (my whole document vanished
  from Solr, or at least I am unable to search it now). That is although
  I didn't change the existing values of the fields having
  stored="false".
  Which means I have to change all my fields to stored="true" if I want
  to use atomic updates, right?
  Will it affect the performance of Solr? If yes, what is the best
  practice to reduce the performance degradation as much as possible?
  Thanks in advance.
  Thanks and Regards,
  Rahul Bhooteshwar
  Enterprise Software Engineer
  HotWax Systems http://www.hotwaxsystems.com - The global leader in
  innovative enterprise commerce solutions powered by Apache OFBiz.
  ApacheCon US 2014 Silver Sponsor


Re: Atomic Update while having fields with attribute stored=true in schema

2015-02-23 Thread Yago Riveiro
"Which means I have to change all my fields to stored="true" if I want to
use atomic updates, right?"




Yes, and re-index all your data.




"Will it affect the performance of Solr?"




What type of queries are you doing now?


—
/Yago Riveiro

On Mon, Feb 23, 2015 at 12:05 PM, Rahul Bhooteshwar
rahul.bhootesh...@hotwaxsystems.com wrote:

 Hi,
 I have around 50 fields in my schema, of which 20 are stored="true" and
 the rest are stored="false".
 For partial updates (atomic updates), it is mentioned in many places
 that the fields in the schema should have stored="true". I have also
 tried an atomic update on documents having fields with stored="false"
 and indexed="true", and it didn't work (my whole document vanished from
 Solr, or at least I am unable to search it now). That is although I
 didn't change the existing values of the fields having stored="false".
 Which means I have to change all my fields to stored="true" if I want to
 use atomic updates, right?
 Will it affect the performance of Solr? If yes, what is the best
 practice to reduce the performance degradation as much as possible?
 Thanks in advance.
 Thanks and Regards,
 Rahul Bhooteshwar
 Enterprise Software Engineer
 HotWax Systems http://www.hotwaxsystems.com - The global leader in
 innovative enterprise commerce solutions powered by Apache OFBiz.
 ApacheCon US 2014 Silver Sponsor

Re: Solr 4.x to Solr 5 = org.noggit.JSONParser$ParseException

2015-02-23 Thread Alan Woodward
I think this means you've got an older version of noggit around.  You need 
version 0.6.

Alan Woodward
www.flax.co.uk
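If the project is built with Maven (an assumption; adjust to your build tool), pinning noggit would look like:

```xml
<dependency>
  <groupId>org.noggit</groupId>
  <artifactId>noggit</artifactId>
  <version>0.6</version>
</dependency>
```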


On 23 Feb 2015, at 13:00, Clemens Wyss DEV wrote:

 Just about to upgrade to Solr5. My UnitTests fail:
 13:50:41.178 [main] ERROR org.apache.solr.core.CoreContainer - Error creating 
 core [1-de_CH]: null
 java.lang.ExceptionInInitializerError: null
   at 
 org.apache.solr.core.SolrConfig.getConfigOverlay(SolrConfig.java:359) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at org.apache.solr.core.SolrConfig.getOverlay(SolrConfig.java:808) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at 
 org.apache.solr.core.SolrConfig.getSubstituteProperties(SolrConfig.java:798) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at org.apache.solr.core.Config.init(Config.java:152) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at org.apache.solr.core.Config.init(Config.java:92) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at org.apache.solr.core.SolrConfig.init(SolrConfig.java:180) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at 
 org.apache.solr.core.SolrConfig.readFromResourceLoader(SolrConfig.java:158) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at 
 org.apache.solr.core.ConfigSetService.createSolrConfig(ConfigSetService.java:80)
  ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at 
 org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:61) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:511) 
 [solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:488) 
 [solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at 
 ch.mysign.search.solr.EmbeddedSolrMode.prepareCore(EmbeddedSolrMode.java:51) 
 [target/:na]
 ...
   at 
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
  [.cp/:na]
 Caused by: org.noggit.JSONParser$ParseException: Expected string: 
 char=u,position=2 BEFORE='{ u' AFTER='pdateHandler : { autoCo'
   at org.noggit.JSONParser.err(JSONParser.java:223) ~[noggit.jar:na]
   at org.noggit.JSONParser.nextEvent(JSONParser.java:671) ~[noggit.jar:na]
   at org.noggit.ObjectBuilder.getObject(ObjectBuilder.java:123) 
 ~[noggit.jar:na]
   at org.apache.solr.core.ConfigOverlay.clinit(ConfigOverlay.java:213) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   ... 56 common frames omitted
 
  Looks like the exception occurs in the ConfigOverlay static block, line 213:
 editable_prop_map =  (Map)new ObjectBuilder(new JSONParser(new StringReader(
  MAPPING))).getObject();
 
 What is happening?



Re: CollationKeyFilterFactory stops suggestions and collations

2015-02-23 Thread Nitin Solanki
Hi all,
I have found that I need to use Unicode collation, which requires
*lucene-collation-2.9.1.jar*. I am using Solr 4.10.2 and have downloaded
lucene-collation-2.9.1.jar. Where do I have to put it, or is it already
built into Solr?
If it is already in Solr, then why are suggestions and collations not
coming back? Any help, please?


On Mon, Feb 23, 2015 at 4:43 PM, Nitin Solanki nitinml...@gmail.com wrote:

 Hello all,
   I am working on collations. In the Solr docs I found that Unicode
 collation makes searching fast. But after applying
 CollationKeyFilterFactory in schema.xml, both suggestions and collations
 stop. Please check the configurations below and help me.

 *Schema.xml:*

 <fieldType name="textSpell" class="solr.TextField"
            positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.CollationKeyFilterFactory" language=""
             strength="primary"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.CollationKeyFilterFactory" language=""
             strength="primary"/>
   </analyzer>
 </fieldType>


 Solrconfig.xml:

 <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
   <lst name="defaults">
     <str name="df">gram_ci</str>
     <!-- Solr will use suggestions from both the 'default' spellchecker
          and from the 'wordbreak' spellchecker and combine them.
          collations (re-written queries) can include a combination of
          corrections from both spellcheckers -->
     <str name="spellcheck.dictionary">default</str>
     <str name="spellcheck">on</str>
     <str name="spellcheck.extendedResults">true</str>
     <str name="spellcheck.count">25</str>
     <str name="spellcheck.onlyMorePopular">true</str>
     <str name="spellcheck.maxResultsForSuggest">10</str>
     <str name="spellcheck.alternativeTermCount">25</str>
     <str name="spellcheck.collate">true</str>
     <str name="spellcheck.maxCollations">100</str>
     <str name="spellcheck.maxCollationTries">1000</str>
     <str name="spellcheck.collateExtendedResults">true</str>
   </lst>
   <arr name="last-components">
     <str>spellcheck</str>
     <!-- <str>suggest</str> -->
     <!-- <str>query</str> -->
   </arr>
 </requestHandler>



Solr 4.x to Solr 5 = org.noggit.JSONParser$ParseException

2015-02-23 Thread Clemens Wyss DEV
Just about to upgrade to Solr5. My UnitTests fail:
13:50:41.178 [main] ERROR org.apache.solr.core.CoreContainer - Error creating 
core [1-de_CH]: null
java.lang.ExceptionInInitializerError: null
at 
org.apache.solr.core.SolrConfig.getConfigOverlay(SolrConfig.java:359) 
~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
at org.apache.solr.core.SolrConfig.getOverlay(SolrConfig.java:808) 
~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
at 
org.apache.solr.core.SolrConfig.getSubstituteProperties(SolrConfig.java:798) 
~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
at org.apache.solr.core.Config.init(Config.java:152) 
~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
at org.apache.solr.core.Config.init(Config.java:92) 
~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
at org.apache.solr.core.SolrConfig.init(SolrConfig.java:180) 
~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
at 
org.apache.solr.core.SolrConfig.readFromResourceLoader(SolrConfig.java:158) 
~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
at 
org.apache.solr.core.ConfigSetService.createSolrConfig(ConfigSetService.java:80)
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
at 
org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:61) 
~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:511) 
[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:488) 
[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
at 
ch.mysign.search.solr.EmbeddedSolrMode.prepareCore(EmbeddedSolrMode.java:51) 
[target/:na]
...
at 
org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
 [.cp/:na]
Caused by: org.noggit.JSONParser$ParseException: Expected string: 
char=u,position=2 BEFORE='{ u' AFTER='pdateHandler : { autoCo'
at org.noggit.JSONParser.err(JSONParser.java:223) ~[noggit.jar:na]
at org.noggit.JSONParser.nextEvent(JSONParser.java:671) ~[noggit.jar:na]
at org.noggit.ObjectBuilder.getObject(ObjectBuilder.java:123) 
~[noggit.jar:na]
at org.apache.solr.core.ConfigOverlay.clinit(ConfigOverlay.java:213) 
~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
... 56 common frames omitted

Looks like the exception occurs in the ConfigOverlay static block, line 213:
editable_prop_map =  (Map)new ObjectBuilder(new JSONParser(new StringReader(
  MAPPING))).getObject();

What is happening?


Re: Atomic Update while having fields with attribute stored=true in schema

2015-02-23 Thread Rahul Bhooteshwar
Hi Yago Riveiro,
Thanks for your quick reply. I am using Solr for faceted search via
*SolrJ*. I am using facet queries and filter queries. I am new to Solr, so
I would like to know the best practice for handling such scenarios.

Thanks and Regards,
Rahul Bhooteshwar
Enterprise Software Engineer
HotWax Systems http://www.hotwaxsystems.com - The global leader in
innovative enterprise commerce solutions powered by Apache OFBiz.
ApacheCon US 2014 Silver Sponsor

On Mon, Feb 23, 2015 at 5:42 PM, Yago Riveiro yago.rive...@gmail.com
wrote:

 "Which means I have to change all my fields to stored="true" if I want
 to use atomic updates, right?"




 Yes, and re-index all your data.




 "Will it affect the performance of Solr?"




 What type of queries are you doing now?


 —
 /Yago Riveiro

 On Mon, Feb 23, 2015 at 12:05 PM, Rahul Bhooteshwar
 rahul.bhootesh...@hotwaxsystems.com wrote:

  Hi,
  I have around 50 fields in my schema, of which 20 are stored="true"
  and the rest are stored="false".
  For partial updates (atomic updates), it is mentioned in many places
  that the fields in the schema should have stored="true". I have also
  tried an atomic update on documents having fields with stored="false"
  and indexed="true", and it didn't work (my whole document vanished
  from Solr, or at least I am unable to search it now). That is although
  I didn't change the existing values of the fields having
  stored="false".
  Which means I have to change all my fields to stored="true" if I want
  to use atomic updates, right?
  Will it affect the performance of Solr? If yes, what is the best
  practice to reduce the performance degradation as much as possible?
  Thanks in advance.
  Thanks and Regards,
  Rahul Bhooteshwar
  Enterprise Software Engineer
  HotWax Systems http://www.hotwaxsystems.com - The global leader in
  innovative enterprise commerce solutions powered by Apache OFBiz.
  ApacheCon US 2014 Silver Sponsor



Re: Solr 4.x to Solr 5 = org.noggit.JSONParser$ParseException

2015-02-23 Thread Noble Paul
This code is executed every time Solr is initialized and it is unlikely
that it is a bug.
Are you using an older version of noggit.jar by any chance?


On Mon, Feb 23, 2015 at 6:30 PM, Clemens Wyss DEV clemens...@mysign.ch
wrote:

 Just about to upgrade to Solr5. My UnitTests fail:
 13:50:41.178 [main] ERROR org.apache.solr.core.CoreContainer - Error
 creating core [1-de_CH]: null
 java.lang.ExceptionInInitializerError: null
 at
 org.apache.solr.core.SolrConfig.getConfigOverlay(SolrConfig.java:359)
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
 at org.apache.solr.core.SolrConfig.getOverlay(SolrConfig.java:808)
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
 at
 org.apache.solr.core.SolrConfig.getSubstituteProperties(SolrConfig.java:798)
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
 at org.apache.solr.core.Config.init(Config.java:152)
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
 at org.apache.solr.core.Config.init(Config.java:92)
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
 at org.apache.solr.core.SolrConfig.init(SolrConfig.java:180)
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
 at
 org.apache.solr.core.SolrConfig.readFromResourceLoader(SolrConfig.java:158)
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
 at
 org.apache.solr.core.ConfigSetService.createSolrConfig(ConfigSetService.java:80)
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
 at
 org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:61)
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
 at
 org.apache.solr.core.CoreContainer.create(CoreContainer.java:511)
 [solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
 at
 org.apache.solr.core.CoreContainer.create(CoreContainer.java:488)
 [solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
 at
 ch.mysign.search.solr.EmbeddedSolrMode.prepareCore(EmbeddedSolrMode.java:51)
 [target/:na]
 ...
 at
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
 [.cp/:na]
 Caused by: org.noggit.JSONParser$ParseException: Expected string:
 char=u,position=2 BEFORE='{ u' AFTER='pdateHandler : { autoCo'
 at org.noggit.JSONParser.err(JSONParser.java:223) ~[noggit.jar:na]
 at org.noggit.JSONParser.nextEvent(JSONParser.java:671)
 ~[noggit.jar:na]
 at org.noggit.ObjectBuilder.getObject(ObjectBuilder.java:123)
 ~[noggit.jar:na]
 at
 org.apache.solr.core.ConfigOverlay.clinit(ConfigOverlay.java:213)
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
 ... 56 common frames omitted

 Looks like the exception occurs in the ConfigOverlay static block, line 213:
 editable_prop_map =  (Map)new ObjectBuilder(new JSONParser(new
 StringReader(
   MAPPING))).getObject();

 What is happening?




-- 
-
Noble Paul


Stop solr query

2015-02-23 Thread Moshe Recanati
Hi,
Recently there were scenarios in which queries that users sent to Solr got
stuck and increased our Solr heap.
Is there any option to kill, or time out via an external command, a query
that hasn't returned from Solr?

Thank you,
Regards,
Moshe Recanati
SVP Engineering
Office + 972-73-2617564
Mobile  + 972-52-6194481
Skype:  recanati
[KMS2]http://finance.yahoo.com/news/kms-lighthouse-named-gartner-cool-121000184.html
More at:  www.kmslh.comhttp://www.kmslh.com/ | 
LinkedInhttp://www.linkedin.com/company/kms-lighthouse | 
FBhttps://www.facebook.com/pages/KMS-lighthouse/123774257810917




incorrect Java version reported in solr dashboard

2015-02-23 Thread SolrUser1543
I have upgraded the Java version from 1.7 to 1.8 on a Linux server.
After the upgrade, if I run `java -version` I can see that it really
changed to the new one.

But when I run Solr, it still reports the old version in the dashboard's
JVM section.

What could be the reason? 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/incorrect-Java-version-reported-in-solr-dashboard-tp4188236.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: incorrect Java version reported in solr dashboard

2015-02-23 Thread Michael Della Bitta
You're probably launching Solr using the older version of Java somehow. You
should make sure your PATH and JAVA_HOME variables point at your Java 8
install from the point of view of the script or configuration that launches
Solr.
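A minimal sketch, assuming a typical Linux layout (the JDK path below is an example; substitute the actual location of your Java 8 install):

```shell
# Make the shell (or startup script) that launches Solr see the new JDK.
export JAVA_HOME=/usr/lib/jvm/java-1.8.0
export PATH="$JAVA_HOME/bin:$PATH"
```

Afterwards, run `java -version` in that same shell to confirm it now reports 1.8, then restart Solr from it.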

Hope that helps.

Michael Della Bitta

Senior Software Engineer

o: +1 646 532 3062

appinions inc.

“The Science of Influence Marketing”

18 East 41st Street

New York, NY 10017

t: @appinions https://twitter.com/Appinions | g+:
plus.google.com/appinions
https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts
w: appinions.com http://www.appinions.com/

On Mon, Feb 23, 2015 at 9:19 AM, SolrUser1543 osta...@gmail.com wrote:

 I have upgraded the Java version from 1.7 to 1.8 on a Linux server.
 After the upgrade, if I run `java -version` I can see that it really
 changed to the new one.

 But when I run Solr, it still reports the old version in the dashboard's
 JVM section.

 What could be the reason?



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/incorrect-Java-version-reported-in-solr-dashboard-tp4188236.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Used CollationKeyFilterFactory, Seems not to be working

2015-02-23 Thread Ahmet Arslan
Hi Nitin,


How can you pass an empty value to the language attribute?
Is this intentional?

What is your intention in using that filter with the suggestion
functionality?

Ahmet

On Monday, February 23, 2015 5:03 PM, Nitin Solanki nitinml...@gmail.com 
wrote:



Hi,
  I have integrated CollationKeyFilterFactory in schema.xml and re-indexed
the data.

<filter class="solr.CollationKeyFilterFactory" language=""
        strength="primary"/>

I need to use this because I want to build collations fast.
Referred link: http://wiki.apache.org/solr/UnicodeCollation

But it stops both suggestions and collations. *Why?*

I have also tested *CollationKeyFilterFactory* in the Solr admin Analysis
page. There, CKF shows some Chinese-looking output.

*Please, any help?*


Re: Stop solr query

2015-02-23 Thread Shawn Heisey
On 2/23/2015 7:23 AM, Moshe Recanati wrote:
 Recently there were scenarios in which queries that users sent to
 Solr got stuck and increased our Solr heap.

 Is there any option to kill, or time out via an external command, a
 query that hasn't returned from Solr?


The best thing you can do is examine all user input and stop such
queries before they execute, especially if they are the kind of query
that will cause your heap to grow out of control.

The timeAllowed parameter can abort a query that takes too long in
certain phases of the query.  In recent months, Solr has been modified
so that timeAllowed will take effect during more query phases.  It is
not a perfect solution, but it can be better than nothing.

http://wiki.apache.org/solr/CommonQueryParameters#timeAllowed
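As an illustration (URL and collection name are made up), the parameter is simply added to the request:

```shell
# Abort work after ~2 seconds in the query phases that honor timeAllowed;
# partial results may be returned and flagged in the response header.
curl 'http://localhost:8983/solr/collection1/select?q=*:*&timeAllowed=2000'
```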

Be aware that sometimes legitimate queries will be slow, and using
timeAllowed may cause those queries to fail.

Thanks,
Shawn



[ANNOUNCE] Luke 4.10.3 released

2015-02-23 Thread Dmitry Kan
Hello,

Luke 4.10.3 has been released. Download it here:

https://github.com/DmitryKey/luke/releases/tag/luke-4.10.3

The release has been tested against the solr-4.10.3 based index.

Issues fixed in this release: #13
https://github.com/DmitryKey/luke/pull/13
Apache License 2.0 abbreviation changed from ASL 2.0 to ALv2

Thanks to respective contributors!


P.S. waiting for lucene 5.0 artifacts to hit public maven repositories for
the next major release of luke.

-- 
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info


Used CollationKeyFilterFactory, Seems not to be working

2015-02-23 Thread Nitin Solanki
Hi,
  I have integrated CollationKeyFilterFactory in schema.xml and re-indexed
the data.

<filter class="solr.CollationKeyFilterFactory" language=""
        strength="primary"/>

I need to use this because I want to build collations fast.
Referred link: http://wiki.apache.org/solr/UnicodeCollation

But it stops both suggestions and collations. *Why?*

I have also tested *CollationKeyFilterFactory* in the Solr admin Analysis
page. There, CKF shows some Chinese-looking output.

*Please, any help?*


AW: Solr 4.x to Solr 5 = org.noggit.JSONParser$ParseException

2015-02-23 Thread Clemens Wyss DEV
Bingo!  thx for the hint

-Ursprüngliche Nachricht-
Von: Alan Woodward [mailto:a...@flax.co.uk] 
Gesendet: Montag, 23. Februar 2015 15:00
An: solr-user@lucene.apache.org
Betreff: Re: Solr 4.x to Solr 5 = org.noggit.JSONParser$ParseException

I think this means you've got an older version of noggit around.  You need 
version 0.6.

Alan Woodward
www.flax.co.uk


On 23 Feb 2015, at 13:00, Clemens Wyss DEV wrote:

 Just about to upgrade to Solr5. My UnitTests fail:
 13:50:41.178 [main] ERROR org.apache.solr.core.CoreContainer - Error 
 creating core [1-de_CH]: null
 java.lang.ExceptionInInitializerError: null
   at 
 org.apache.solr.core.SolrConfig.getConfigOverlay(SolrConfig.java:359) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at org.apache.solr.core.SolrConfig.getOverlay(SolrConfig.java:808) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at 
 org.apache.solr.core.SolrConfig.getSubstituteProperties(SolrConfig.java:798) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at org.apache.solr.core.Config.init(Config.java:152) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at org.apache.solr.core.Config.init(Config.java:92) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at org.apache.solr.core.SolrConfig.init(SolrConfig.java:180) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at 
 org.apache.solr.core.SolrConfig.readFromResourceLoader(SolrConfig.java:158) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at 
 org.apache.solr.core.ConfigSetService.createSolrConfig(ConfigSetService.java:80)
  ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at 
 org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:61) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:511) 
 [solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:488) 
 [solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   at 
 ch.mysign.search.solr.EmbeddedSolrMode.prepareCore(EmbeddedSolrMode.java:51) 
 [target/:na] ...
   at 
 org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:192)
  [.cp/:na] Caused by: org.noggit.JSONParser$ParseException: Expected string: 
 char=u,position=2 BEFORE='{ u' AFTER='pdateHandler : { autoCo'
   at org.noggit.JSONParser.err(JSONParser.java:223) ~[noggit.jar:na]
   at org.noggit.JSONParser.nextEvent(JSONParser.java:671) ~[noggit.jar:na]
   at org.noggit.ObjectBuilder.getObject(ObjectBuilder.java:123) 
 ~[noggit.jar:na]
   at org.apache.solr.core.ConfigOverlay.clinit(ConfigOverlay.java:213) 
 ~[solr-core.jar:5.0.0 1659987 - anshumgupta - 2015-02-15 12:26:10]
   ... 56 common frames omitted
 
 Looks like the exception occurs in the ConfigOverlay static block, line 213:
 editable_prop_map = (Map) new ObjectBuilder(new JSONParser(new StringReader(
  MAPPING))).getObject();
 
 What is happening?



Re: Strange search behaviour when upgrading to 4.10.3

2015-02-23 Thread Rishi Easwaran
Thanks Shawn.
Just ran the analysis between 4.6 and 4.10; there seems to be only one
difference between the outputs: the positionLength value is set in 4.10. Does
that mean anything?

Version 4.10 (SF):

  text=message  raw_bytes=[6d 65 73 73 61 67 65]  start=0  end=7
  positionLength=1  type=ALNUM  position=1

Version 4.6 (SF):

  text=message  raw_bytes=[6d 65 73 73 61 67 65]  type=ALNUM  start=0
  end=7  position=1

Thanks,
Rishi.


 

-Original Message-
From: Shawn Heisey apa...@elyograg.org
To: solr-user solr-user@lucene.apache.org
Sent: Fri, Feb 20, 2015 6:51 pm
Subject: Re: Strange search behaviour when upgrading to 4.10.3


On 2/20/2015 4:24 PM, Rishi Easwaran wrote:
 Also, the tokenizer we use is very similar to the following.
 ftp://zimbra.imladris.sk/src/HELIX-720.fbsd/ZimbraServer/src/java/com/zimbra/cs/index/analysis/UniversalTokenizer.java
 ftp://zimbra.imladris.sk/src/HELIX-720.fbsd/ZimbraServer/src/java/com/zimbra/cs/index/analysis/UniversalLexer.jflex


 From the looks of it the text is being indexed as a single token and not 
broken across whitespace. 

I can't claim to know how analyzer code works.  I did manage to see the
code, but it doesn't mean much to me.

I would suggest using the analysis tab in the Solr admin interface.  On
that page, select the field or fieldType, set the verbose flag and
type the actual field contents into the index side of the page.  When
you click the Analyze Values button, it will show you what Solr does
with the input at index time.

Do you still have access to any machines (dev or otherwise) running the
old version with the custom component? If so, do the same things on the
analysis page for that version that you did on the new version, and see
whether it does something different.  If it does do something different,
then you will need to track down the problem in the code for your custom
analyzer.

Thanks,
Shawn


 


Is Solr best for did you mean functionality just like Google?

2015-02-23 Thread Nitin Solanki
Hello,
  I have run into a hard problem. I want to implement spell/query
correction. I have 49 GB of indexed data on which I have applied the
spellchecker. I want to do the same as Google: *did you mean*.
*Example* - If a user types a question/query that is misspelled or
mistyped, I need to give them a suggestion, like "Did you mean ...".
Is Solr a good fit for this?


Warm Regards,
Nitin Solanki


Re: Collations are not working fine.

2015-02-23 Thread Nitin Solanki
Hi Charles,
 How did you patch the suggester to get frequency information in
the spellcheck response?
That sounds very useful; I would like to do the same.


On Mon, Feb 16, 2015 at 7:59 PM, Reitzel, Charles 
charles.reit...@tiaa-cref.org wrote:

 I have been working with collations the last couple days and I kept adding
 the collation-related parameters until it started working for me.   It
 seems I needed <str name="spellcheck.collateMaxCollectDocs">50</str>.

 But, I am using the Suggester with the WFSTLookupFactory.

 Also, I needed to patch the suggester to get frequency information in the
 spellcheck response.

 -Original Message-
 From: Rajesh Hazari [mailto:rajeshhaz...@gmail.com]
 Sent: Friday, February 13, 2015 3:48 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Collations are not working fine.

 Hi Nitin,

 Can you try with the below config? We have these configs and they seem to be
 working for us.

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

  <str name="queryAnalyzerFieldType">text_general</str>


   <lst name="spellchecker">
 <str name="name">wordbreak</str>
 <str name="classname">solr.WordBreakSolrSpellChecker</str>
 <str name="field">textSpell</str>
 <str name="combineWords">true</str>
 <str name="breakWords">false</str>
 <int name="maxChanges">5</int>
   </lst>

<lst name="spellchecker">
 <str name="name">default</str>
 <str name="field">textSpell</str>
 <str name="classname">solr.IndexBasedSpellChecker</str>
 <str name="spellcheckIndexDir">./spellchecker</str>
 <str name="accuracy">0.75</str>
 <float name="thresholdTokenFrequency">0.01</float>
 <str name="buildOnCommit">true</str>
 <str name="spellcheck.maxResultsForSuggest">5</str>
  </lst>


   </searchComponent>



 <str name="spellcheck">true</str>
 <str name="spellcheck.dictionary">default</str>
 <str name="spellcheck.dictionary">wordbreak</str>
 <int name="spellcheck.count">5</int>
 <str name="spellcheck.alternativeTermCount">15</str>
 <str name="spellcheck.collate">true</str>
 <str name="spellcheck.onlyMorePopular">false</str>
 <str name="spellcheck.extendedResults">true</str>
 <str name="spellcheck.maxCollations">100</str>
 <str name="spellcheck.collateParam.mm">100%</str>
 <str name="spellcheck.collateParam.q.op">AND</str>
 <str name="spellcheck.maxCollationTries">1000</str>


 *Rajesh.*

 On Fri, Feb 13, 2015 at 1:01 PM, Dyer, James james.d...@ingramcontent.com
 
 wrote:

  Nitin,
 
  Can you post the full spellcheck response when you query:
 
  q=gram_ci:"gone wthh thes wint"&wt=json&indent=true&shards.qt=/spell
 
  James Dyer
  Ingram Content Group
 
 
  -Original Message-
  From: Nitin Solanki [mailto:nitinml...@gmail.com]
  Sent: Friday, February 13, 2015 1:05 AM
  To: solr-user@lucene.apache.org
  Subject: Re: Collations are not working fine.
 
  Hi James Dyer,
    I did the same as you told me and used
  WordBreakSolrSpellChecker instead of shingles. But collations
  are still not coming or working.
  For instance, I tried to get a collation of "gone with the wind" by
  searching "gone wthh thes wint" on field=gram_ci but didn't succeed.
  Even so, I am getting the suggestions of wthh as *with*, thes as *the*,
 wint as *wind*.
  Also, I have documents in which "gone with the wind" occurs 167
  times. I don't know whether I am missing something or not.
  Please check my solr configuration below:
 
  *URL: *localhost:8983/solr/wikingram/spell?q=gram_ci:"gone wthh thes
  wint"&wt=json&indent=true&shards.qt=/spell
 
  *solrconfig.xml:*
 
  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpellCi</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">gram_ci</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">0</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">2</int>
    <float name="maxQueryFrequency">0.9</float>
    <str name="comparatorClass">freq</str>
  </lst>
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">gram</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">5</int>
  </lst>
  </searchComponent>

  <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="df">gram_ci</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck">on</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">25</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.maxResultsForSuggest">1</str>
    <str name="spellcheck.alternativeTermCount">25</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.maxCollations">50</str>
    <str name="spellcheck.maxCollationTries">50</str>
    <str name="spellcheck.collateExtendedResults">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>

Re: syntax for increasing java memory

2015-02-23 Thread Walter Underwood
That depends on the JVM you are using. For the Oracle JVMs, use this to get a 
list of extended options:

java -X

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


On Feb 23, 2015, at 8:21 AM, Kevin Laurie superinterstel...@gmail.com wrote:

 Hi Guys,
 I am a newbie on Solr and I am just using it for dovecot sake.
 Could you help advise the correct syntax to increase java heap size using
 the  -xmx option(or advise some easy-to-read literature for configuring) ?
 Much appreciate if you could help. I just need this to sort out the problem
 with my Dovecot FTS.
 Thanks
 Kevin



Re: syntax for increasing java memory

2015-02-23 Thread Kevin Laurie
Hi Walter
Got it.
java -Xmx1024m -jar start.jar
Thanks
Kevin
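The ordering is the key detail here: the `java` launcher treats everything after the jar file name as arguments to the application (Jetty, in this case), not to the JVM, so a trailing `-Xmx` is silently ignored. A sketch of the distinction (heap sizes are illustrative):

```shell
# JVM flags must come before -jar; anything after "start.jar" is handed
# to the application, not the JVM.
correct='java -Xms512m -Xmx1024m -jar start.jar'
wrong='java -jar start.jar -Xmx1024m'   # -Xmx is ignored in this position
echo "$correct"
```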

On Tue, Feb 24, 2015 at 1:00 AM, Kevin Laurie superinterstel...@gmail.com
wrote:

 Hi Walter,

 I am running :-
 Oracle Corporation OpenJDK 64-Bit Server VM (1.7.0_65 24.65-b04)

 I tried running with this command:-

 java -jar start.jar -Xmx1024m
 WARNING: System properties and/or JVM args set.  Consider using --dry-run
 or --exec
 0[main] INFO  org.eclipse.jetty.server.Server  ? jetty-8.1.10.v20130312
 61   [main] INFO  org.eclipse.jetty.deploy.providers.ScanningAppProvider
 ? Deployment monitor /opt/solr/contexts at interval 0

 Still getting 500m.

 Any advice? Will check java -X out.


 On Tue, Feb 24, 2015 at 12:49 AM, Walter Underwood wun...@wunderwood.org
 wrote:

 That depends on the JVM you are using. For the Oracle JVMs, use this to
 get a list of extended options:

 java -X

 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)


 On Feb 23, 2015, at 8:21 AM, Kevin Laurie superinterstel...@gmail.com
 wrote:

  Hi Guys,
  I am a newbie on Solr and I am just using it for dovecot sake.
  Could you help advise the correct syntax to increase java heap size
 using
  the  -xmx option(or advise some easy-to-read literature for
 configuring) ?
  Much appreciate if you could help. I just need this to sort out the
 problem
  with my Dovecot FTS.
  Thanks
  Kevin





Re: Used CollationKeyFilterFactory, Seems not to be working

2015-02-23 Thread Ahmet Arslan
Hi Nitin,

I think that token filter factory has nothing to do with
collations in the spellchecker domain. A single term appearing in different
domains is causing confusion.


solr.CollationKeyFilterFactory is mainly intended for locale-sensitive sorting.
For example, I used the type below to fix a sorting problem with Turkish strings.

<fieldType name="collatedTURKISH" class="solr.CollationField" language="tr"/>

Ahmet

 




On Monday, February 23, 2015 6:18 PM, Nitin Solanki nitinml...@gmail.com 
wrote:
Hi Ahmet,
 language="" means that it is used for any language -
"simply define the language as the empty string" for most languages.

*Intention:* I am working on spell/question correction. Just like Google, I
want to do the same "did you mean".

Using the spellchecker, I get both suggestions and collations. But the
collations are not coming as I expected. The reason is
spellcheck.maxCollationTries: if I set
spellcheck.maxCollationTries=10 then it gives about 10 results.
Sometimes the expected collation doesn't come within those 10 collations. So I
increased the value to 16000 and the results come, but it takes around 15 sec.
on 49 GB of indexed data. That is the worst case. Then, somewhere in Solr, I
found *unicode collation*, which says it builds collations fast.
Is it fast? Or am I doing something wrong with collations?


On Mon, Feb 23, 2015 at 9:12 PM, Ahmet Arslan iori...@yahoo.com.invalid
wrote:

 Hi Nitin,


 How can you pass empty value to the language attribute?
 Is this intentional?

 What is your intention to use that filter with suggestion functionality?

 Ahmet

 On Monday, February 23, 2015 5:03 PM, Nitin Solanki nitinml...@gmail.com
 wrote:



  Hi,
    I have integrated CollationKeyFilterFactory in schema.xml and re-indexed
  the data.

  *<filter class="solr.CollationKeyFilterFactory" language=""
  strength="primary"/>*

  I need to use this because I want to build collations fast.
  Referred link: http://wiki.apache.org/solr/UnicodeCollation

  But it stops both suggestions and collations. *Why?*

  I have also tested *CollationKeyFilterFactory* in the Solr admin analysis
  page. There, CKF shows some Chinese-looking output.

  *Please, any help?*



Re: Collations are not working fine.

2015-02-23 Thread Rajesh Hazari
Hi,

we have used the spellcheck component with the below configs to get the best
collation (exact collation) when a query has either a single term or multiple
terms.

As Charles mentioned above, we do have a check on getOriginalFrequency()
for each term in our service before we send the spellcheck response to the
client; this may not be the case for you. Hope this helps.

<requestHandler name="/select" class="solr.SearchHandler">
<!-- default values for query parameters can be specified, these
 will be overridden by parameters in the request
  -->
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">100</int>
<str name="df">textSpell</str>
 <str name="spellcheck">true</str>
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck.dictionary">wordbreak</str>
<int name="spellcheck.count">5</int>
*<str name="spellcheck.alternativeTermCount">15</str>*
*<str name="spellcheck.collate">true</str>*
*<str name="spellcheck.onlyMorePopular">false</str>*
*<str name="spellcheck.extendedResults">true</str>*
*<str name="spellcheck.maxCollations">100</str>*
*<str name="spellcheck.collateParam.mm">100%</str>*
*<str name="spellcheck.collateParam.q.op">AND</str>*
*<str name="spellcheck.maxCollationTries">1000</str>*
<str name="q.op">OR</str>
.
.
..   </lst> </requestHandler>
.
.
.

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">

 <lst name="spellchecker">
<str name="name">wordbreak</str>
<str name="classname">solr.WordBreakSolrSpellChecker</str>
<str name="field">textSpell</str>
<str name="combineWords">true</str>
<str name="breakWords">false</str>
<int name="maxChanges">5</int>
  </lst>

   <lst name="spellchecker">
<str name="name">default</str>
<str name="field">textSpell</str>
<str name="classname">solr.IndexBasedSpellChecker</str>
<!-- <str name="classname">solr.DirectSolrSpellChecker</str> -->
<str name="spellcheckIndexDir">./spellchecker</str>
<!-- <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str> -->
<str name="accuracy">0.75</str>
<float name="thresholdTokenFrequency">0.01</float>
<str name="buildOnCommit">true</str>
<str name="spellcheck.maxResultsForSuggest">5</str>
 </lst>


  </searchComponent>



*Rajesh**.*

On Fri, Feb 20, 2015 at 8:42 AM, Nitin Solanki nitinml...@gmail.com wrote:

 How to get only the best collations whose hits are more and need to sort
 them?

 On Wed, Feb 18, 2015 at 3:53 AM, Reitzel, Charles 
 charles.reit...@tiaa-cref.org wrote:

  Hi Nitin,
 
  I was trying many different options for a couple different queries.   In
  fact, I have collations working ok now with the Suggester and WFSTLookup.
   The problem may have been due to a different dictionary and/or lookup
  implementation and the specific options I was sending.
 
  In general, we're using spellcheck for search suggestions.   The
 Suggester
  component (vs. Suggester spellcheck implementation), doesn't handle all
 of
  our cases.  But we can get things working using the spellcheck interface.
  What gives us particular troubles are the cases where a term may be valid
  by itself, but also be the start of longer words.
 
  The specific terms are acronyms specific to our business.   But I'll
  attempt to show generic examples.
 
  E.g. a partial term like fo can expand to fox, fog, etc. and a full
 term
  like brown can also expand to something like brownstone.   And, yes, the
  collation brownstone fox is nonsense.  But assume, for the sake of
  argument, it appears in our documents somewhere.
 
  For multiple term query with a spelling error (or partially typed term):
  brown fo
 
  We get collations in order of hits, descending like ...
  brown fox,
  brown fog,
  brownstone fox.
 
  So far, so good.
 
  For a single term query, brown, we get a single suggestion, brownstone
 and
  no collations.
 
  So, we don't know to keep the term brown!
 
  At this point, we need spellcheck.extendedResults=true and look at the
  origFreq value in the suggested corrections.  Unfortunately, the
 Suggester
  (spellcheck dictionary) does not populate the original frequency
  information.  And, without this information, the SpellCheckComponent
 cannot
  format the extended results.
 
  However, with a simple change to Suggester.java, it was easy to get the
  needed frequency information use it to make a sound decision to keep or
  drop the input term.   But I'd be much obliged if there is a better way
 to
  go about it.
 
  Configs below.
 
  Thanks,
  Charlie
 
  <!-- SpellCheck component -->
    <searchComponent class="solr.SpellCheckComponent" name="suggestSC">
      <lst name="spellchecker">
        <str name="name">suggestDictionary</str>
        <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
        <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
        <str name="field">text_all</str>
        <float name="threshold">0.0001</float>
        <str name="exactMatchFirst">true</str>
        <str name="buildOnCommit">true</str>
      </lst>
    </searchComponent>

  <!-- Request Handler -->
  <requestHandler name="/tcSuggest" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="title">Search

syntax for increasing java memory

2015-02-23 Thread Kevin Laurie
Hi Guys,
 I am a newbie on Solr and I am just using it for Dovecot's sake.
Could you help advise the correct syntax to increase the Java heap size using
the -Xmx option (or suggest some easy-to-read literature on configuring it)?
Much appreciated if you could help. I just need this to sort out the problem
with my Dovecot FTS.
Thanks
Kevin


Re: highlighting the boolean query

2015-02-23 Thread Dmitry Kan
Erick,

nope, we are using std lucene qparser with some customizations, that do not
affect the boolean query parsing logic.

Should we try some other highlighter?

On Mon, Feb 23, 2015 at 6:57 PM, Erick Erickson erickerick...@gmail.com
wrote:

 Are you using edismax?

 On Mon, Feb 23, 2015 at 3:28 AM, Dmitry Kan solrexp...@gmail.com wrote:
  Hello!
 
  In Solr 4.3.1 there seems to be some inconsistency with the highlighting
 of
  the boolean query:
 
  a OR (b c) OR d
 
  This returns a proper hit, which shows that only d was included into the
  document score calculation.
 
  But the highlighter returns both d and c in <em> tags.
 
  Is this a known issue of the standard highlighter? Can it be mitigated?
 
 
  --
  Dmitry Kan
  Luke Toolbox: http://github.com/DmitryKey/luke
  Blog: http://dmitrykan.blogspot.com
  Twitter: http://twitter.com/dmitrykan
  SemanticAnalyzer: www.semanticanalyzer.info




-- 
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info


Re: highlighting the boolean query

2015-02-23 Thread Erick Erickson
Are you using edismax?

On Mon, Feb 23, 2015 at 3:28 AM, Dmitry Kan solrexp...@gmail.com wrote:
 Hello!

 In Solr 4.3.1 there seems to be some inconsistency with the highlighting of
 the boolean query:

 a OR (b c) OR d

 This returns a proper hit, which shows that only d was included into the
 document score calculation.

 But the highlighter returns both d and c in <em> tags.

 Is this a known issue of the standard highlighter? Can it be mitigated?


 --
 Dmitry Kan
 Luke Toolbox: http://github.com/DmitryKey/luke
 Blog: http://dmitrykan.blogspot.com
 Twitter: http://twitter.com/dmitrykan
 SemanticAnalyzer: www.semanticanalyzer.info


Re: Used CollationKeyFilterFactory, Seems not to be working

2015-02-23 Thread Nitin Solanki
Hi Ahmet,
 language="" means that it is used for any language -
"simply define the language as the empty string" for most languages.

*Intention:* I am working on spell/question correction. Just like Google, I
want to do the same "did you mean".

Using the spellchecker, I get both suggestions and collations. But the
collations are not coming as I expected. The reason is
spellcheck.maxCollationTries: if I set
spellcheck.maxCollationTries=10 then it gives about 10 results.
Sometimes the expected collation doesn't come within those 10 collations. So I
increased the value to 16000 and the results come, but it takes around 15 sec.
on 49 GB of indexed data. That is the worst case. Then, somewhere in Solr, I
found *unicode collation*, which says it builds collations fast.
Is it fast? Or am I doing something wrong with collations?

On Mon, Feb 23, 2015 at 9:12 PM, Ahmet Arslan iori...@yahoo.com.invalid
wrote:

 Hi Nitin,


 How can you pass empty value to the language attribute?
 Is this intentional?

 What is your intention to use that filter with suggestion functionality?

 Ahmet

 On Monday, February 23, 2015 5:03 PM, Nitin Solanki nitinml...@gmail.com
 wrote:



  Hi,
    I have integrated CollationKeyFilterFactory in schema.xml and re-indexed
  the data.

  *<filter class="solr.CollationKeyFilterFactory" language=""
  strength="primary"/>*

  I need to use this because I want to build collations fast.
  Referred link: http://wiki.apache.org/solr/UnicodeCollation

  But it stops both suggestions and collations. *Why?*

  I have also tested *CollationKeyFilterFactory* in the Solr admin analysis
  page. There, CKF shows some Chinese-looking output.

  *Please, any help?*



Optimize maxSegments=2 not working right with Solr 4.10.2

2015-02-23 Thread Tom Burton-West
Hello,

We normally run an optimize with maxSegments=2  after our daily indexing.
This has worked without problem on Solr 3.6.  We recently moved to Solr
4.10.2 and on several shards the optimize completed with no errors in the
logs, but left more than 2 segments.

We send this xml to Solr
<optimize maxSegments="2"/>
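Over HTTP, the command is posted to the update handler; a sketch (host, port, and core name here are placeholders, not taken from this thread):

```shell
# Sketch: issue the forced merge to the update handler as an XML body.
cmd="curl 'http://localhost:8983/solr/core/update' -H 'Content-Type: text/xml' --data-binary '<optimize maxSegments=\"2\"/>'"
echo "$cmd"
```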

I've attached a copy of the indexwriter log for one of the shards where
there were 4 segments rather than the requested number (i.e., there should
have been only 2) at the end of the optimize.  It looks like a
merge was done down to two segments and then somehow another process
flushed some postings to disk, creating two more segments.  Then there are
messages about 2 of the remaining 4 segments being too big. (See below.)

What we expected is that the remainng 2 small segments (about 40MB) would
get merged with the smaller of the two large segments, i.e. with the 56GB
segment, since we gave the argument maxSegments=2.   This didn't happen.


Any suggestions about how to troubleshoot this issue would be appreciated.

Tom

---
Excerpt from indexwriter log:

[TMP][http-8091-Processor5]: findForcedMerges maxSegmentCount=2  ...
...
[IW][Lucene Merge Thread #0]: merge time 3842310 msec for 65236 docs
...
[TMP][http-8091-Processor5]: findMerges: 4 segments
 [TMP][http-8091-Processor5]:   seg=_1fzb(4.10.2):C1081559/24089:delGen=9
size=672402.066 MB [skip: too large]
 [TMP][http-8091-Processor5]:   seg=_1gj2(4.10.2):C65236/2:delGen=1
size=56179.245 MB [skip: too large]
 [TMP][http-8091-Processor5]:   seg=_1gj0(4.10.2):C16 size=44.280 MB
 [TMP][http-8091-Processor5]:   seg=_1gj1(4.10.2):C8 size=40.442 MB
 [TMP][http-8091-Processor5]:   allowedSegmentCount=3 vs count=4 (eligible
count=2) tooBigCount=2


build-1.iw.2015-02-23.txt.gz
Description: GNU Zip compressed data


Re: syntax for increasing java memory

2015-02-23 Thread Kevin Laurie
Hi Walter,

I am running :-
Oracle Corporation OpenJDK 64-Bit Server VM (1.7.0_65 24.65-b04)

I tried running with this command:-

java -jar start.jar -Xmx1024m
WARNING: System properties and/or JVM args set.  Consider using --dry-run
or --exec
0[main] INFO  org.eclipse.jetty.server.Server  ? jetty-8.1.10.v20130312
61   [main] INFO  org.eclipse.jetty.deploy.providers.ScanningAppProvider  ?
Deployment monitor /opt/solr/contexts at interval 0

Still getting 500m.

Any advice? Will check java -X out.


On Tue, Feb 24, 2015 at 12:49 AM, Walter Underwood wun...@wunderwood.org
wrote:

 That depends on the JVM you are using. For the Oracle JVMs, use this to
 get a list of extended options:

 java -X

 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)


 On Feb 23, 2015, at 8:21 AM, Kevin Laurie superinterstel...@gmail.com
 wrote:

  Hi Guys,
  I am a newbie on Solr and I am just using it for dovecot sake.
  Could you help advise the correct syntax to increase java heap size using
  the  -xmx option(or advise some easy-to-read literature for configuring)
 ?
  Much appreciate if you could help. I just need this to sort out the
 problem
  with my Dovecot FTS.
  Thanks
  Kevin




RE: Collations are not working fine.

2015-02-23 Thread Reitzel, Charles
I filed issue SOLR-7144 with the patch attached.   It's probably best to get 
some feedback from developers.  It may not be the right approach, etc.

Also, spellcheck.maxCollationTries > 0 is the parameter needed to get collation 
results that respect the current filter queries, etc.

Set spellcheck.maxCollations > 1 to get multiple collation results.   However, 
if the original query has only a single term, there will be no collation 
results.   Thus, for single term queries, you need to look at the original 
frequency information to determine if the original term is valid or not.   
There may be spellcheck suggestions even for terms with origFreq > 0.

-Original Message-
From: Nitin Solanki [mailto:nitinml...@gmail.com] 
Sent: Monday, February 23, 2015 11:35 AM
To: solr-user@lucene.apache.org
Subject: Re: Collations are not working fine.

Hi Charles,
 How did you patch the suggester to get frequency information in the 
spellcheck response?
That sounds very useful; I would like to do the same.


On Mon, Feb 16, 2015 at 7:59 PM, Reitzel, Charles  
charles.reit...@tiaa-cref.org wrote:

 I have been working with collations the last couple days and I kept adding
 the collation-related parameters until it started working for me.   It
 seems I needed <str name="spellcheck.collateMaxCollectDocs">50</str>.

 But, I am using the Suggester with the WFSTLookupFactory.

 Also, I needed to patch the suggester to get frequency information in 
 the spellcheck response.

 -Original Message-
 From: Rajesh Hazari [mailto:rajeshhaz...@gmail.com]
 Sent: Friday, February 13, 2015 3:48 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Collations are not working fine.

 Hi Nitin,

 Can you try with the below config? We have these configs and they seem to be
 working for us.

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

  <str name="queryAnalyzerFieldType">text_general</str>


   <lst name="spellchecker">
 <str name="name">wordbreak</str>
 <str name="classname">solr.WordBreakSolrSpellChecker</str>
 <str name="field">textSpell</str>
 <str name="combineWords">true</str>
 <str name="breakWords">false</str>
 <int name="maxChanges">5</int>
   </lst>

<lst name="spellchecker">
 <str name="name">default</str>
 <str name="field">textSpell</str>
 <str name="classname">solr.IndexBasedSpellChecker</str>
 <str name="spellcheckIndexDir">./spellchecker</str>
 <str name="accuracy">0.75</str>
 <float name="thresholdTokenFrequency">0.01</float>
 <str name="buildOnCommit">true</str>
 <str name="spellcheck.maxResultsForSuggest">5</str>
  </lst>


   </searchComponent>



 <str name="spellcheck">true</str>
 <str name="spellcheck.dictionary">default</str>
 <str name="spellcheck.dictionary">wordbreak</str>
 <int name="spellcheck.count">5</int>
 <str name="spellcheck.alternativeTermCount">15</str>
 <str name="spellcheck.collate">true</str>
 <str name="spellcheck.onlyMorePopular">false</str>
 <str name="spellcheck.extendedResults">true</str>
 <str name="spellcheck.maxCollations">100</str>
 <str name="spellcheck.collateParam.mm">100%</str>
 <str name="spellcheck.collateParam.q.op">AND</str>
 <str name="spellcheck.maxCollationTries">1000</str>


 *Rajesh.*

 On Fri, Feb 13, 2015 at 1:01 PM, Dyer, James 
 james.d...@ingramcontent.com
 
 wrote:

  Nitin,
 
  Can you post the full spellcheck response when you query:
 
  q=gram_ci:"gone wthh thes wint"&wt=json&indent=true&shards.qt=/spell
 
  James Dyer
  Ingram Content Group
 
 
  -Original Message-
  From: Nitin Solanki [mailto:nitinml...@gmail.com]
  Sent: Friday, February 13, 2015 1:05 AM
  To: solr-user@lucene.apache.org
  Subject: Re: Collations are not working fine.
 
  Hi James Dyer,
    I did the same as you told me and used 
  WordBreakSolrSpellChecker instead of shingles. But collations 
  are still not coming or working.
  For instance, I tried to get a collation of "gone with the wind" by 
  searching "gone wthh thes wint" on field=gram_ci but didn't succeed.
  Even so, I am getting the suggestions of wthh as *with*, thes as *the*,
 wint as *wind*.
  Also, I have documents in which "gone with the wind" occurs 167 
  times. I don't know whether I am missing something or not.
  Please check my solr configuration below:
 
  *URL: *localhost:8983/solr/wikingram/spell?q=gram_ci:"gone wthh thes 
  wint"&wt=json&indent=true&shards.qt=/spell
 
  *solrconfig.xml:*
 
  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpellCi</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">gram_ci</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">0</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">2</int>
    <float name="maxQueryFrequency">0.9</float>
    <str name="comparatorClass">freq</str>
  </lst>
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">gram</str>
    <str

Re: Suggestion on distinct/ group by for a field ?

2015-02-23 Thread Erick Erickson
Maybe pivot facets will do what you need? See:

https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-Pivot(DecisionTree)Faceting
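A pivot-facet request for the category/name case quoted below might look like this sketch (host, port, and core name are taken from the quoted URL; the rest is illustrative):

```shell
# Sketch: facet.pivot nests name counts under each category, giving a
# per-category list of distinct names with their counts.
url='http://localhost:8081/solr/core_test/select?q=*:*&rows=0&facet=true&facet.pivot=category,name'
echo "curl '$url'"
```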

Best,
Erick

On Mon, Feb 23, 2015 at 11:31 AM, Vishal Swaroop vishal@gmail.com wrote:
 Please suggest on how to get the distinct count for a field (name).

 Summary : I have data indexed in the following format
 category name value
 Cat1 A 1
 Cat1 A 2
 Cat1 B 3
 Cat1 B 4

 I tried getting the distinct name count... but it returns 4 records
 instead of 2 (i.e. A, B):
 http://localhost:8081/solr/core_test/select?q=category:Cat1&fl=category,name&wt=json&indent=true&facet.mincount=1&facet=true

 In Oracle I can easily get the distinct count using group by:
 select c.cat, count(distinct i.name) from category c, itemname i, value v
 where v.item_id = i.id and i.cat_id = c.id and c.cat = 'Cat1' group by
 c.cat
 Result:
 Cat1 2

 Thanks

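Besides pivot facets, a plain field facet on name gives the distinct values (with their counts) directly, and the number of facet buckets is the distinct count. A sketch against the same core as above:

```
http://localhost:8081/solr/core_test/select?q=category:Cat1&rows=0&wt=json&indent=true&facet=true&facet.field=name&facet.mincount=1
```

With the sample data this should return only the buckets A and B under facet_fields.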

Basic Multilingual search capability

2015-02-23 Thread Rishi Easwaran
Hi All,

For our use case we don't really need to do a lot of manipulation of incoming 
text during index time. At most, removal of common stop words and tokenizing 
emails/filenames etc. if possible. We get text documents from our end users, which can 
be in any language (sometimes a combination) and we cannot determine the language 
of the incoming text. Language detection at index time is not necessary.

Which analyzer is recommended to achieve basic multilingual search capability 
for a use case like this?
I have read a bunch of posts about using a combination of StandardTokenizer or 
ICUTokenizer, LowerCaseFilter and ReversedWildcardFilter factory, but I am looking for 
ideas, suggestions, and best practices.

http://lucene.472066.n3.nabble.com/ICUTokenizer-or-StandardTokenizer-or-for-quot-text-all-quot-type-field-that-might-include-non-whitess-td4142727.html#a4144236
http://lucene.472066.n3.nabble.com/How-to-implement-multilingual-word-components-fields-schema-td4157140.html#a4158923
https://issues.apache.org/jira/browse/SOLR-6492  

 
Thanks,
Rishi.
 


Re: highlighting the boolean query

2015-02-23 Thread Erick Erickson
Highlighting is such a pain...

what does the parsed query look like? If the default operator is OR,
then this seems correct as both 'd' and 'c' appear in the doc. So
I'm a bit puzzled by your statement that c didn't contribute to the score.

If the parsed query is, indeed
a +b +c d

then it does look like something with the highlighter. Whether other
highlighters are better for this case... no clue ;(

Best,
Erick

On Mon, Feb 23, 2015 at 9:36 AM, Dmitry Kan solrexp...@gmail.com wrote:
 Erick,

 nope, we are using std lucene qparser with some customizations, that do not
 affect the boolean query parsing logic.

 Should we try some other highlighter?

 On Mon, Feb 23, 2015 at 6:57 PM, Erick Erickson erickerick...@gmail.com
 wrote:

 Are you using edismax?

 On Mon, Feb 23, 2015 at 3:28 AM, Dmitry Kan solrexp...@gmail.com wrote:
  Hello!
 
   In Solr 4.3.1 there seems to be some inconsistency with the highlighting of
   the boolean query:
 
  a OR (b c) OR d
 
  This returns a proper hit, which shows that only d was included into the
  document score calculation.
 
   But the highlighter returns both d and c in <em> tags.
 
  Is this a known issue of the standard highlighter? Can it be mitigated?
 
 
  --
  Dmitry Kan
  Luke Toolbox: http://github.com/DmitryKey/luke
  Blog: http://dmitrykan.blogspot.com
  Twitter: http://twitter.com/dmitrykan
  SemanticAnalyzer: www.semanticanalyzer.info




 --
 Dmitry Kan
 Luke Toolbox: http://github.com/DmitryKey/luke
 Blog: http://dmitrykan.blogspot.com
 Twitter: http://twitter.com/dmitrykan
 SemanticAnalyzer: www.semanticanalyzer.info

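One mitigation worth trying while the cause is investigated: the standard highlighter honors an hl.q parameter that overrides q for highlighting only, so the clause that actually scored the document can be highlighted on its own. A sketch (the field name is illustrative):

```
q=a OR (b c) OR d&hl=true&hl.fl=content&hl.q=d
```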

Suggestion on distinct/ group by for a field ?

2015-02-23 Thread Vishal Swaroop
Please suggest on how to get the distinct count for a field (name).

Summary : I have data indexed in the following format
category name value
Cat1 A 1
Cat1 A 2
Cat1 B 3
Cat1 B 4

I tried getting the distinct name count... but it returns 4 records
instead of 2 (i.e. A, B):
http://localhost:8081/solr/core_test/select?q=category:Cat1&fl=category,name&wt=json&indent=true&facet.mincount=1&facet=true

In Oracle I can easily get the distinct count using group by:
select c.cat, count(distinct i.name) from category c, itemname i, value v
where v.item_id = i.id and i.cat_id = c.id and c.cat = 'Cat1' group by
c.cat
Result:
Cat1 2

Thanks


SolrCloud 4.10.3 Security

2015-02-23 Thread mihaela olteanu
Hello,
Does anyone know why Basic authentication has not yet been released for 
SolrCloud as described on the wiki page: 
https://wiki.apache.org/solr/SolrSecurity? Is there any plan in the near future 
for closing this issue: https://issues.apache.org/jira/browse/SOLR-4470 ?
Isn't there already a very basic implementation that could be released?
Thanks a lot!
Mihaela

 

more like this and term vectors

2015-02-23 Thread Scott C. Cote
Is there a way to configure the More Like This query handler and also receive 
the corresponding term vectors (tf-idf)?

I tried creating a “search component” for the term vectors and adding it to 
the mlt handler, but that did not work.

Here is what I tried:

 <searchComponent name="tvComponent"
     class="org.apache.solr.handler.component.TermVectorComponent"/>

 <requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
   <lst name="defaults">
     <str name="mlt.fl">filteredText</str>
     <str name="mlt.mintf">1</str>
     <str name="mlt.mindf">1</str>
     <str name="mlt.interestingTerms">list</str>
     <bool name="tv">true</bool>
   </lst> 
   <arr name="last-components">
     <str>tvComponent</str>
   </arr>
 </requestHandler>

Now I realize that I could turn on the debug parameter, but that does not 
contain all of the tf-idf information (at least not like the tv component provides).

Thanks,

SCott

Re: more like this and term vectors

2015-02-23 Thread Jack Krupansky
It's never helpful when you merely say that it did not work - detail the
symptom, please.

Post both the query and the response. As well as the field and type
definitions for the fields for which you expected term vectors - no term
vectors are enabled by default.

-- Jack Krupansky

On Mon, Feb 23, 2015 at 2:48 PM, Scott C. Cote scottcc...@yahoo.com.invalid
 wrote:

 Is there a way to configure the More Like This query handler and also
 receive the corresponding term vectors (tf-idf)?

 I tried creating a “search component” for the term vectors and adding
 it to the mlt handler, but that did not work.

 Here is what I tried:

 <searchComponent name="tvComponent"
     class="org.apache.solr.handler.component.TermVectorComponent"/>

 <requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
   <lst name="defaults">
     <str name="mlt.fl">filteredText</str>
     <str name="mlt.mintf">1</str>
     <str name="mlt.mindf">1</str>
     <str name="mlt.interestingTerms">list</str>
     <bool name="tv">true</bool>
   </lst>
   <arr name="last-components">
     <str>tvComponent</str>
   </arr>
 </requestHandler>

 Now I realize that I could turn on the debug parameter, but that does not
 contain all of the tf-idf information (at least not like the tv component
 provides).

 Thanks,

 SCott
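To make Jack's point below concrete: TermVectorComponent can only return vectors for fields that actually store them, which is off by default. A sketch for the field used in the /mlt config above (the type name is an assumption), after which a full re-index is needed:

```xml
<field name="filteredText" type="text_general" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```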


Re: Basic Multilingual search capability

2015-02-23 Thread Alexandre Rafalovitch
Which languages are you expecting to deal with? Multilingual support
is a complex issue. Even if you think you don't need much, it is
usually a lot more complex than expected, especially around relevancy.

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 23 February 2015 at 16:19, Rishi Easwaran rishi.easwa...@aol.com wrote:
 Hi All,

 For our use case we don't really need to do a lot of manipulation of incoming 
 text during index time. At most, removal of common stop words and tokenizing 
 emails/filenames etc. if possible. We get text documents from our end users, 
 which can be in any language (sometimes a combination) and we cannot determine 
 the language of the incoming text. Language detection at index time is not 
 necessary.

 Which analyzer is recommended to achieve basic multilingual search capability 
 for a use case like this?
 I have read a bunch of posts about using a combination of StandardTokenizer or 
 ICUTokenizer, LowerCaseFilter and ReversedWildcardFilter factory, but I am
 looking for ideas, suggestions, and best practices.

 http://lucene.472066.n3.nabble.com/ICUTokenizer-or-StandardTokenizer-or-for-quot-text-all-quot-type-field-that-might-include-non-whitess-td4142727.html#a4144236
 http://lucene.472066.n3.nabble.com/How-to-implement-multilingual-word-components-fields-schema-td4157140.html#a4158923
 https://issues.apache.org/jira/browse/SOLR-6492


 Thanks,
 Rishi.



Error instantiating class: 'org.apache.lucene.collation.CollationKeyFilterFactory'

2015-02-23 Thread Nitin Solanki
Hi,
   I am using the Collation Key Filter. After adding it to schema.xml, I get an error.

*Schema.xml*
<field name="gram" type="textSpell" indexed="true" stored="true"
required="true" multiValued="false"/>

</fieldType><fieldType name="textSpell" class="solr.TextField"
positionIncrementGap="100">
   <analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.CollationKeyFilterFactory" language=""
strength="primary"/>
   </analyzer>
   <analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.CollationKeyFilterFactory" language=""
strength="primary"/>
   </analyzer>
</fieldType>


*It throws this error:*

Problem accessing /solr/. Reason:

{msg=SolrCore 'collection1' is not available due to init failure:
Could not load conf for core collection1: Plugin init failure for
[schema.xml] fieldType textSpell: Plugin init failure for
[schema.xml] analyzer/filter: Error instantiating class:
'org.apache.lucene.collation.CollationKeyFilterFactory'. Schema file
is /configs/myconf/schema.xml,trace=org.apache.solr.common.SolrException:
SolrCore 'collection1' is not available due to init failure: Could not
load conf for core collection1: Plugin init failure for [schema.xml]
fieldType textSpell: Plugin init failure for [schema.xml]
analyzer/filter: Error instantiating class:
'org.apache.lucene.collation.CollationKeyFilterFactory'. Schema file
is /configs/myconf/schema.xml
at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:745)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:347)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at 
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
at 
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at 
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:745)
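If the goal is just locale-insensitive keys, a commonly suggested alternative that avoids the analyzer filter-factory loading path entirely is the dedicated solr.CollationField type (the Solr collation documentation shows language="" selecting the root locale). Note that it treats the whole field value as a single token, so it suits sorting and exact matching rather than a tokenized spellcheck field:

```xml
<fieldType name="textSpellCollated" class="solr.CollationField"
           language="" strength="primary"/>
```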


Geo Aggregations and Search Alerts in Solr

2015-02-23 Thread Richard Gibbs
Hi There,

I am in the process of choosing a search technology for one of my projects
and I was looking into Solr and Elasticsearch.

Two features that I am most interested in are geo aggregations (for map
clustering) and search alerts. Elasticsearch seems to have these two
features built in.

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/geo-aggs.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-percolate.html

I couldn't find relevant documentation for Solr and therefore I am not sure
whether these features are readily available in Solr. Can you please let me
know whether they are available? If not, are there ways to achieve the same
with Solr?

Thank you.


Query: no result returned if use AND OR operators

2015-02-23 Thread arthur.hk.c...@gmail.com
Hi,

My Solr is 4.10.2

When I use the web UI to run a simple query: "1"+AND+"2"


1) from the log, I can see the hits=8
7629109 [qtp1702388274-16] INFO  org.apache.solr.core.SolrCore  – [infocast] 
webapp=/solr path=/clustering 
params={q=1+AND+2&wt=velocity&v.template=cluster_results} hits=8 
status=0 QTime=21 

However, from the query page, it returns
2) 0 results found in 5 ms Page 0 of 0
  0 results found. Page 0 of 0


3) If I use the Admin page to run the query, I get 3 results back:

{
  "responseHeader": {
    "status": 0,
    "QTime": 5,
    "params": {
      "indent": "true",
      "q": "\"1\" AND \"2\"",
      "_": "1424761089223",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 3,
    "start": 0,
    "docs": [
      {
        "title": [ ….

Very strange to me, please help!

Regards

Re: Basic Multilingual search capability

2015-02-23 Thread Walter Underwood
It isn’t just complicated, it can be impossible.

Do you have content in Chinese or Japanese? Those languages (and some others) 
do not separate words with spaces. You cannot even do word search without a 
language-specific, dictionary-based parser.

German is space separated, except many noun compounds are not space-separated.

Do you have Finnish content? Entire prepositional phrases turn into word 
endings.

Do you have Arabic content? That is even harder.

If all your content is in space-separated languages that are not heavily 
inflected, you can kind of do OK with a language-insensitive approach. But it 
hits the wall pretty fast.

One thing that does work pretty well is trademarked names (LaserJet, Coke, 
etc). Those are spelled the same in all languages and usually not inflected.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

On Feb 23, 2015, at 8:00 PM, Rishi Easwaran rishi.easwa...@aol.com wrote:

 Hi Alex,
 
 There is no specific language list.
 For example: the documents that need to be indexed are emails or any 
 messages for a global customer base. The messages back and forth could be in 
 any language or a mix of languages.

 I understand relevancy, stemming, etc. become extremely complicated with 
 multilingual support, but our first goal is to be able to tokenize and 
 provide basic search capability for any language. Ex: when a document 
 contains hello or здравствуйте, the analyzer creates tokens and provides 
 exact-match search results.

 Now it would be great if it had the capability to tokenize email addresses 
 (ex: he...@aol.com - I think StandardTokenizer already does this) and filenames 
 (здравствуйте.pdf), but maybe we can use filters to accomplish that. 
 
 Thanks,
 Rishi.
 
 -Original Message-
 From: Alexandre Rafalovitch arafa...@gmail.com
 To: solr-user solr-user@lucene.apache.org
 Sent: Mon, Feb 23, 2015 5:49 pm
 Subject: Re: Basic Multilingual search capability
 
 
 Which languages are you expecting to deal with? Multilingual support
 is a complex issue. Even if you think you don't need much, it is
 usually a lot more complex than expected, especially around relevancy.
 
 Regards,
   Alex.
 
 Sign up for my Solr resources newsletter at http://www.solr-start.com/
 
 
 On 23 February 2015 at 16:19, Rishi Easwaran rishi.easwa...@aol.com wrote:
 Hi All,
 
 For our use case we don't really need to do a lot of manipulation of incoming 
 text during index time. At most, removal of common stop words and tokenizing 
 emails/filenames etc. if possible. We get text documents from our end users, 
 which can be in any language (sometimes a combination) and we cannot determine 
 the language of the incoming text. Language detection at index time is not 
 necessary.

 Which analyzer is recommended to achieve basic multilingual search capability 
 for a use case like this?
 I have read a bunch of posts about using a combination of StandardTokenizer or 
 ICUTokenizer, LowerCaseFilter and ReversedWildcardFilter factory, but I am 
 looking for ideas, suggestions, and best practices.
 
 http://lucene.472066.n3.nabble.com/ICUTokenizer-or-StandardTokenizer-or-for-quot-text-all-quot-type-field-that-might-include-non-whitess-td4142727.html#a4144236
 http://lucene.472066.n3.nabble.com/How-to-implement-multilingual-word-components-fields-schema-td4157140.html#a4158923
 https://issues.apache.org/jira/browse/SOLR-6492
 
 
 Thanks,
 Rishi.
 
 
 



Re: Special character and wildcard matching

2015-02-23 Thread Jack Krupansky
Is it really a string field - as opposed to a text field? Show us the field
and field type.

Besides, if it really were a raw name, wouldn't that be a capital B?

-- Jack Krupansky

On Mon, Feb 23, 2015 at 6:52 PM, Arun Rangarajan arunrangara...@gmail.com
wrote:

 I have a string field raw_name like this in my document:

  {"raw_name": "beyoncé"}

 (Notice that the last character is a special character.)

 When I issue this wildcard query:

 q=raw_name:beyonce*

 i.e. with the last character simply being the ASCII 'e', Solr returns me
 the above document.

 How do I prevent this?



Re: Special character and wildcard matching

2015-02-23 Thread Jack Krupansky
But how is that lowercasing occurring? I mean, solr.StrField doesn't do
that.

Some containers default to automatically mapping accented characters, so
that the accented e would then get indexed as a normal e, and then your
wildcard would match it, and an accented e in a query would get mapped as
well and then match the normal e in the index. What does your query
response look like?

This blog post explains that problem:
http://bensch.be/tomcat-solr-and-special-characters

Note that you could make your string field a text field with the keyword
tokenizer and then filter it for lower case, such as when the user query
might have a capital B. String field is most appropriate when the field
really is 100% raw.


-- Jack Krupansky

On Mon, Feb 23, 2015 at 7:37 PM, Arun Rangarajan arunrangara...@gmail.com
wrote:

 Yes, it is a string field and not a text field.

 <fieldType name="string" class="solr.StrField" sortMissingLast="true"
 omitNorms="true"/>
 <field name="raw_name" type="string" indexed="true" stored="true" />

 Lower-casing is done for case-insensitive matching.

 On Mon, Feb 23, 2015 at 4:01 PM, Jack Krupansky jack.krupan...@gmail.com
 wrote:

  Is it really a string field - as opposed to a text field? Show us the
 field
  and field type.
 
  Besides, if it really were a raw name, wouldn't that be a capital B?
 
  -- Jack Krupansky
 
  On Mon, Feb 23, 2015 at 6:52 PM, Arun Rangarajan 
 arunrangara...@gmail.com
  
  wrote:
 
   I have a string field raw_name like this in my document:
  
    {"raw_name": "beyoncé"}
  
   (Notice that the last character is a special character.)
  
   When I issue this wildcard query:
  
   q=raw_name:beyonce*
  
   i.e. with the last character simply being the ASCII 'e', Solr returns
 me
   the above document.
  
   How do I prevent this?
  
 

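Jack's suggestion can be sketched as a lowercased keyword-tokenized type (names are illustrative). Without an ASCIIFoldingFilterFactory in the chain, the accented é is indexed as-is, so beyonce* should no longer match beyoncé; adding the folding filter would deliberately make them match in both directions:

```xml
<fieldType name="string_ci" class="solr.TextField" sortMissingLast="true">
  <analyzer>
    <!-- keep the whole value as one token -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- case-insensitive matching, as in the original lower-cased string field -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="raw_name" type="string_ci" indexed="true" stored="true"/>
```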


Re: Basic Multilingual search capability

2015-02-23 Thread Rishi Easwaran
Hi Alex,

There is no specific language list.
For example: the documents that need to be indexed are emails or any messages 
for a global customer base. The messages back and forth could be in any 
language or a mix of languages.
 
I understand relevancy, stemming, etc. become extremely complicated with 
multilingual support, but our first goal is to be able to tokenize and provide 
basic search capability for any language. Ex: when a document contains hello 
or здравствуйте, the analyzer creates tokens and provides exact-match search 
results.

Now it would be great if it had the capability to tokenize email addresses 
(ex: he...@aol.com - I think StandardTokenizer already does this) and filenames 
(здравствуйте.pdf), but maybe we can use filters to accomplish that. 

Thanks,
Rishi.
 
 
-Original Message-
From: Alexandre Rafalovitch arafa...@gmail.com
To: solr-user solr-user@lucene.apache.org
Sent: Mon, Feb 23, 2015 5:49 pm
Subject: Re: Basic Multilingual search capability


Which languages are you expecting to deal with? Multilingual support
is a complex issue. Even if you think you don't need much, it is
usually a lot more complex than expected, especially around relevancy.

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 23 February 2015 at 16:19, Rishi Easwaran rishi.easwa...@aol.com wrote:
 Hi All,

 For our use case we don't really need to do a lot of manipulation of incoming 
text during index time. At most, removal of common stop words and tokenizing 
emails/filenames etc. if possible. We get text documents from our end users, 
which can be in any language (sometimes a combination) and we cannot determine 
the language of the incoming text. Language detection at index time is not 
necessary.

 Which analyzer is recommended to achieve basic multilingual search capability 
for a use case like this?
 I have read a bunch of posts about using a combination of StandardTokenizer or 
ICUTokenizer, LowerCaseFilter and ReversedWildcardFilter factory, but I am 
looking for ideas, suggestions, and best practices.

 http://lucene.472066.n3.nabble.com/ICUTokenizer-or-StandardTokenizer-or-for-quot-text-all-quot-type-field-that-might-include-non-whitess-td4142727.html#a4144236
 http://lucene.472066.n3.nabble.com/How-to-implement-multilingual-word-components-fields-schema-td4157140.html#a4158923
 https://issues.apache.org/jira/browse/SOLR-6492


 Thanks,
 Rishi.


 


Special character and wildcard matching

2015-02-23 Thread Arun Rangarajan
I have a string field raw_name like this in my document:

{"raw_name": "beyoncé"}

(Notice that the last character is a special character.)

When I issue this wildcard query:

q=raw_name:beyonce*

i.e. with the last character simply being the ASCII 'e', Solr returns me
the above document.

How do I prevent this?


Re: Special character and wildcard matching

2015-02-23 Thread Arun Rangarajan
Yes, it is a string field and not a text field.

<fieldType name="string" class="solr.StrField" sortMissingLast="true"
omitNorms="true"/>
<field name="raw_name" type="string" indexed="true" stored="true" />

Lower-casing is done for case-insensitive matching.

On Mon, Feb 23, 2015 at 4:01 PM, Jack Krupansky jack.krupan...@gmail.com
wrote:

 Is it really a string field - as opposed to a text field? Show us the field
 and field type.

 Besides, if it really were a raw name, wouldn't that be a capital B?

 -- Jack Krupansky

 On Mon, Feb 23, 2015 at 6:52 PM, Arun Rangarajan arunrangara...@gmail.com
 
 wrote:

  I have a string field raw_name like this in my document:
 
   {"raw_name": "beyoncé"}
 
  (Notice that the last character is a special character.)
 
  When I issue this wildcard query:
 
  q=raw_name:beyonce*
 
  i.e. with the last character simply being the ASCII 'e', Solr returns me
  the above document.
 
  How do I prevent this?
 



apache solr - dovecot - some search fields works some dont

2015-02-23 Thread Kevin Laurie
Hi,
I finally understand how Solr works (somewhat). It's a bit complicated as I am
new to the whole concept, but I understand it as a search engine. I am using
Solr with dovecot,
and I found out that some search fields from the inbox work and others don't.
For example, if I search To and From, Apache Solr processes the query in
its log and gives me an output; however, if I search something in the
Body, it stalls with no output.
I am guessing this is some schema.xml problem. Could you advise?
Oh, I already addressed the Java heap size problem.
I have underlined the syntax that shows it.
I am guessing it's only the body search that fails, and it might be
schema.xml related.



*3374412 [qtp1728413448-16] INFO  org.apache.solr.core.SolrCore  ?
[collection1] webapp=/solr path=/select
params={sort=uid+asc&fl=uid,score&q=subject:dave+OR+from:dave+OR+to:dave&fq=%2Bbox:ac553604f7314b54e6233555fc1a+%2Buser:b...@email.net&rows=107161}
hits=571 status=0 QTime=706 *
3379438 [qtp1728413448-18] INFO  org.apache.solr.servlet.
SolrDispatchFilter  ? [admin] webapp=null path=/admin/info/logging
params={_=1424714397078&since=1424711021771&wt=json} status=0 QTime=0
3389791 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714407453&since=1424711021771&wt=json} status=0 QTime=1
3400172 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714417834&since=1424711021771&wt=json} status=0 QTime=1
3410544 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714428205&since=1424711021771&wt=json} status=0 QTime=0
3420895 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714438558&since=1424711021771&wt=json} status=0 QTime=0
3431247 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714448908&since=1424711021771&wt=json} status=0 QTime=1
3441671 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714459334&since=1424711021771&wt=json} status=0 QTime=1
3452017 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714469679&since=1424711021771&wt=json} status=0 QTime=1
3462363 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714480026&since=1424711021771&wt=json} status=0 QTime=0
3472707 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714490369&since=1424711021771&wt=json} status=0 QTime=0
3483139 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714500802&since=1424711021771&wt=json} status=0 QTime=1
3493590 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714511246&since=1424711021771&wt=json} status=0 QTime=0
3504027 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714521691&since=1424711021771&wt=json} status=0 QTime=0
3514477 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714532137&since=1424711021771&wt=json} status=0 QTime=1
3524933 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714542598&since=1424711021771&wt=json} status=0 QTime=0
3535288 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714552951&since=1424711021771&wt=json} status=0 QTime=0
3545634 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714563290&since=1424711021771&wt=json} status=0 QTime=0
3556077 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714573714&since=1424711021771&wt=json} status=0 QTime=0
3566496 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714584157&since=1424711021771&wt=json} status=0 QTime=1
3576937 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714594601&since=1424711021771&wt=json} status=0 QTime=0
3587273 [qtp1728413448-18] INFO
org.apache.solr.servlet.SolrDispatchFilter  ? [admin] webapp=null
path=/admin/info/logging
params={_=1424714604939&since=1424711021771&wt=json} status=0 

snapinstaller does not start newSearcher

2015-02-23 Thread alxsss
Hello,

I am using the latest Solr (solr trunk). I run snapinstaller and see that it 
copies the snapshot to the index folder, but the changes are not picked up.

The logs in the slave after running snapinstaller are:

44302 [qtp1312571113-14] INFO  org.apache.solr.update.UpdateHandler  – start 
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
44303 [qtp1312571113-14] INFO  org.apache.solr.update.UpdateHandler  – No 
uncommitted changes. Skipping IW.commit.
44304 [qtp1312571113-14] INFO  org.apache.solr.core.SolrCore  – 
SolrIndexSearcher has not changed - not re-opening: 
org.apache.solr.search.SolrIndexSearcher
44305 [qtp1312571113-14] INFO  org.apache.solr.update.UpdateHandler  – 
end_commit_flush
44305 [qtp1312571113-14] INFO  
org.apache.solr.update.processor.LogUpdateProcessor  – [product] webapp=/solr 
path=/update params={} {commit=} 0 57

Restarting Solr gives:

 Error creating core [product]: Error opening new searcher
org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:873)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:646)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:491)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:255)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:845)
... 9 more

Any idea what causes this issue?

Thanks in advance.
Alex.



Re: Basic Multilingual search capability

2015-02-23 Thread Rishi Easwaran
Hi Wunder,

Yes, we do expect incoming documents to contain Chinese/Japanese/Arabic 
content.

From what you have mentioned, it looks like we need to auto-detect the incoming 
content language and tokenize/filter after that.
But I thought the ICU tokenizer had the capability to do that 
(https://cwiki.apache.org/confluence/display/solr/Tokenizers#Tokenizers-ICUTokenizer):
"This tokenizer processes multilingual text and tokenizes it appropriately 
based on its script attribute."
Or am I missing something? 

Thanks,
Rishi.

 

 

-Original Message-
From: Walter Underwood wun...@wunderwood.org
To: solr-user solr-user@lucene.apache.org
Sent: Mon, Feb 23, 2015 11:17 pm
Subject: Re: Basic Multilingual search capability


It isn’t just complicated, it can be impossible.

Do you have content in Chinese or Japanese? Those languages (and some others) 
do 
not separate words with spaces. You cannot even do word search without a 
language-specific, dictionary-based parser.

German is space separated, except many noun compounds are not space-separated.

Do you have Finnish content? Entire prepositional phrases turn into word 
endings.

Do you have Arabic content? That is even harder.

If all your content is in space-separated languages that are not heavily 
inflected, you can kind of do OK with a language-insensitive approach. But it 
hits the wall pretty fast.

One thing that does work pretty well is trademarked names (LaserJet, Coke, 
etc). 
Those are spelled the same in all languages and usually not inflected.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

On Feb 23, 2015, at 8:00 PM, Rishi Easwaran rishi.easwa...@aol.com wrote:

 Hi Alex,
 
 There is no specific language list.  
 For example: the documents that need to be indexed are emails or any 
 messages for a global customer base. The messages back and forth could be 
 in any language or a mix of languages.
 
 I understand relevancy, stemming, etc. become extremely complicated with 
multilingual support, but our first goal is to be able to tokenize and provide 
basic search capability for any language. E.g. when the document contains hello 
or здравствуйте, the analyzer creates tokens and provides exact-match search 
results.
 
 Now it would be great if it had the capability to tokenize email addresses 
(e.g. he...@aol.com; I think StandardTokenizer already does this) and filenames 
(здравствуйте.pdf), but maybe we can use filters to accomplish that. 
 
 Thanks,
 Rishi.
 
 -Original Message-
 From: Alexandre Rafalovitch arafa...@gmail.com
 To: solr-user solr-user@lucene.apache.org
 Sent: Mon, Feb 23, 2015 5:49 pm
 Subject: Re: Basic Multilingual search capability
 
 
 Which languages are you expecting to deal with? Multilingual support
 is a complex issue. Even if you think you don't need much, it is
 usually a lot more complex than expected, especially around relevancy.
 
 Regards,
   Alex.
 
 Sign up for my Solr resources newsletter at http://www.solr-start.com/
 
 
 On 23 February 2015 at 16:19, Rishi Easwaran rishi.easwa...@aol.com wrote:
 Hi All,
 
 For our use case we don't really need to do a lot of manipulation of 
 incoming text during index time. At most removal of common stop words, and 
 tokenizing emails/filenames etc. if possible. We get text documents from 
 our end users, which can be in any language (sometimes a combination), and 
 we cannot determine the language of the incoming text. Language detection 
 at index time is not necessary.
 
 Which analyzer is recommended to achieve basic multilingual search 
 capability for a use case like this?
 I have read a bunch of posts about using a combination of StandardTokenizer 
 or ICUTokenizer, LowerCaseFilterFactory, and ReversedWildcardFilterFactory, 
 but am looking for ideas, suggestions, and best practices.
 
 http://lucene.472066.n3.nabble.com/ICUTokenizer-or-StandardTokenizer-or-for-quot-text-all-quot-type-field-that-might-include-non-whitess-td4142727.html#a4144236
 http://lucene.472066.n3.nabble.com/How-to-implement-multilingual-word-components-fields-schema-td4157140.html#a4158923
 https://issues.apache.org/jira/browse/SOLR-6492
 
 
 Thanks,
 Rishi.
 
 
 


 


Setting Up an External ZooKeeper Ensemble

2015-02-23 Thread CKReddy Bhimavarapu
Hi,
  I did follow all the steps in
[https://cwiki.apache.org/confluence/display/solr/Setting+Up+an+External+ZooKeeper+Ensemble]
but I am still getting this error:
Waiting to see Solr listening on port 8983 [-]  Still not seeing Solr
listening on 8983 after 30 seconds!
WARN  - 2015-02-24 05:50:19.161;
org.apache.zookeeper.ClientCnxn$SendThread; Session 0x0 for server null,
unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
WARN  - 2015-02-24 05:50:20.262;
org.apache.zookeeper.ClientCnxn$SendThread; Session 0x0 for server null,
unexpected error, closing socket connection and attempting reconnect


Where am I going wrong?

-- 
ckreddybh. chaitu...@gmail.com
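(For comparison with the guide referenced above, a minimal three-node setup looks roughly like the sketch below; hostnames and paths are placeholders. A "Session 0x0 ... Connection refused" warning usually means ZooKeeper itself is not running, or the -z addresses do not match zoo.cfg's clientPort.)

```shell
# conf/zoo.cfg on each ZooKeeper node (placeholder hosts and paths):
#   tickTime=2000
#   initLimit=10
#   syncLimit=5
#   dataDir=/var/lib/zookeeper     # must contain a "myid" file (1, 2, or 3)
#   clientPort=2181
#   server.1=zk1.example.com:2888:3888
#   server.2=zk2.example.com:2888:3888
#   server.3=zk3.example.com:2888:3888

# start each ZooKeeper node, then verify it answers before starting Solr
bin/zkServer.sh start
echo ruok | nc zk1.example.com 2181    # a healthy node replies "imok"

# point Solr at the ensemble, using the clientPort addresses
bin/solr start -c -z "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181"
```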


Re: Basic Multilingual search capability

2015-02-23 Thread Trey Grainger
Hi Rishi,

I don't generally recommend a language-insensitive approach except for
really simple multilingual use cases (for most of the reasons Walter
mentioned), but the ICUTokenizer is probably the best bet you're going to
have if you really want to go that route and only need exact-match on the
tokens that are parsed. It won't work that well for all languages (CJK
languages, for example), but it will work fine for many.

It is also possible to handle multi-lingual content in a more intelligent
(i.e. per-language configuration) way in your search index, of course.
There are three primary strategies (i.e. ways that actually work in the
real world) to do this:
1) create a separate field for each language and search across all of them
at query time
2) create a separate core per language-combination and search across all of
them at query time
3) invoke multiple language-specific analyzers within a single field's
analyzer and index/query using one or more of those languages' analyzers
for each document/query.
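As a rough sketch of the first strategy (the field names here are illustrative; text_en, text_de, and text_ja are language-specific field types like those shipped in Solr's example schema):

```xml
<!-- Illustrative: one field per language, each with its own analysis chain. -->
<field name="content_en" type="text_en" indexed="true" stored="true"/>
<field name="content_de" type="text_de" indexed="true" stored="true"/>
<field name="content_ja" type="text_ja" indexed="true" stored="true"/>
```

A query then searches all of them at once, e.g.
q=hello&defType=edismax&qf=content_en content_de content_ja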

These are listed in ascending order of complexity, and each can be valid
based upon your use case. For at least the first and third cases, you can
use index-time language detection to map to the appropriate
fields/analyzers if you are otherwise unaware of the languages of the
content from your application layer. The third option requires custom code
(included in the large Multilingual Search chapter of Solr in Action
http://solrinaction.com and soon to be contributed back to Solr via
SOLR-6492 https://issues.apache.org/jira/browse/SOLR-6492), but it
enables you to index an arbitrarily large number of languages into the same
field if needed, while preserving language-specific analysis for each
language.
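The index-time language detection mentioned above can be done with Solr's langid update processor; a sketch (field names are illustrative, and the langid contrib must be on the classpath):

```xml
<!-- Sketch: detect the language of the "content" field; with langid.map,
     the text is routed into per-language fields such as content_en. -->
<updateRequestProcessorChain name="langid">
  <processor class="solr.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <str name="langid.fl">content</str>
    <str name="langid.langField">language_s</str>
    <bool name="langid.map">true</bool>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```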

I presented in detail on the above strategies at Lucene/Solr Revolution
last November, so you may consider checking out the presentation and/or
slides to assess whether one of these strategies will work for your use case:
http://www.treygrainger.com/posts/presentations/semantic-multilingual-strategies-in-lucenesolr/

For the record, I'd highly recommend going with the first strategy (a
separate field per language) if you can, as it is certainly the simplest of
the approaches (albeit the one that scales the least well after you add
more than a few languages to your queries). If you want to stay simple and
stick with the ICUTokenizer then it will work to a point, but some of the
problems Walter mentioned may eventually bite you if you are supporting
certain groups of languages.

All the best,

Trey Grainger
Co-author, Solr in Action
Director of Engineering, Search & Recommendations @ CareerBuilder

On Mon, Feb 23, 2015 at 11:14 PM, Walter Underwood wun...@wunderwood.org
wrote:

 It isn’t just complicated, it can be impossible.

 Do you have content in Chinese or Japanese? Those languages (and some
 others) do not separate words with spaces. You cannot even do word search
 without a language-specific, dictionary-based parser.

 German is space separated, except many noun compounds are not
 space-separated.

 Do you have Finnish content? Entire prepositional phrases turn into word
 endings.

 Do you have Arabic content? That is even harder.

 If all your content is in space-separated languages that are not heavily
 inflected, you can kind of do OK with a language-insensitive approach. But
 it hits the wall pretty fast.

 One thing that does work pretty well is trademarked names (LaserJet, Coke,
 etc). Those are spelled the same in all languages and usually not inflected.

 wunder
 Walter Underwood
 wun...@wunderwood.org
 http://observer.wunderwood.org/  (my blog)

 On Feb 23, 2015, at 8:00 PM, Rishi Easwaran rishi.easwa...@aol.com
 wrote:

  Hi Alex,
 
  There is no specific language list.
  For example: the documents that needs to be indexed are emails or any
 messages for a global customer base. The messages back and forth could be
 in any language or mix of languages.
 
  I understand relevancy, stemming etc becomes extremely complicated with
 multilingual support, but our first goal is to be able to tokenize and
 provide basic search capability for any language. Ex: When the document
 contains hello or здравствуйте, the analyzer creates tokens and provides
 exact match search results.
 
  Now it would be great if it had capability to tokenize email addresses
 (ex:he...@aol.com- i think standardTokenizer already does this),
 filenames (здравствуйте.pdf), but maybe we can use filters to accomplish
 that.
 
  Thanks,
  Rishi.
 
  -Original Message-
  From: Alexandre Rafalovitch arafa...@gmail.com
  To: solr-user solr-user@lucene.apache.org
  Sent: Mon, Feb 23, 2015 5:49 pm
  Subject: Re: Basic Multilingual search capability
 
 
  Which languages are you expecting to deal with? Multilingual support
  is a complex issue. Even if you think you don't need much, it is
  usually a lot more complex than expected, especially around relevancy.
 
  Regards,
Alex.
  
  Sign up for my Solr resources