Re: search filter

2013-05-23 Thread Kamal Palei
It looks like I am getting the exception below:


May 22, 2013 10:52:11 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NumberFormatException: For input string: "[3 TO 9] OR
salary:0"
at
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:438)
at java.lang.Long.parseLong(Long.java:478)


Regards
kamal


On Thu, May 23, 2013 at 11:19 AM, Kamal Palei palei.ka...@gmail.com wrote:

 Hi Rafał Kuć,
 I tried both fq=Salary:[5+TO+10]+OR+Salary:0 and fq=Salary:[5 TO 10] OR
 Salary:0; in both cases I retrieved 0 results.

 I use drupal along with solr, my code looks as below.

     if ($include_0_salary == 1) {
         $conditions['fq'][0] = 'salary:[' . $min_ctc . '+TO+' . $max_ctc
             . ']+OR+salary:0';
     }
     else {
         $conditions['fq'][0] = 'salary:[' . $min_ctc . ' TO ' . $max_ctc . ']';
     }
     $conditions['fq'][1] = 'experience:[' . $min_exp . ' TO ' . $max_exp . ']';

     $results = apachesolr_search_search_execute($keys, $conditions);
 It looks like when include_0_salary is false, I am getting results as expected.
 If include_0_salary is true, I get 0 results, which means that for me
 $conditions['fq'][0] = 'salary:[5 TO 10] OR salary:0' did not work.

 Can somebody tell me what I am doing wrong here?

 Best regards
 kamal




 On Wed, May 22, 2013 at 7:00 PM, Rafał Kuć r@solr.pl wrote:

 Hello!

 You can try sending a filter like this fq=Salary:[5+TO+10]+OR+Salary:0

 It should work

 --
 Regards,
  Rafał Kuć
  Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch -
 ElasticSearch

  Dear All,
  Can I write a search filter for a field having a value in a range or a
  specific value?

  Say I want to have a filter like:
  1. Select profiles with salary 5 to 10, or salary 0.

  So I expect profiles having a salary of either 0, or 5, 6, 7, 8, 9, 10, etc.

  It should be possible; can somebody help me with the syntax of the 'fq'
  filter, please?

  Best Regards
  kamal





OPENNLP current patch compiling problem for 4.x branch

2013-05-23 Thread Patrick Mi
Hi,

I checked out from here
http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_3_0 and
downloaded the latest patch LUCENE-2899-current.patch.

The patch applied OK, but when I did 'ant compile' I got the following error:


==
[javac]
/home/lucene_solr_4_3_0/lucene/analysis/opennlp/src/java/org/apache/lucene/a
nalysis/opennlp/FilterPayloadsFilter.java:43: error: cannot find symbol
[javac] super(Version.LUCENE_44, input);
[javac]  ^
[javac]   symbol:   variable LUCENE_44
[javac]   location: class Version
[javac] 1 error
==

I compiled it on trunk without a problem.

Is this patch supposed to work for 4.X?

Regards,
Patrick 



Re: Solr french search optimisation

2013-05-23 Thread It-forum

Hello again,

Can anyone help me, please?

David

On 22/05/2013 18:09, It-forum wrote:

Hello to all,

I'm trying to set up Solr 4.2 to index and search French content.

I defined a special fieldtype for French content:

<fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="French" protected="protwords.txt"/>
  </analyzer>

  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory"
            language="French" protected="protwords.txt"/>
  </analyzer>
</fieldType>


Unfortunately, this field does not behave as I wish.

I'd like to be able to get results from misspelled words.

E.g., I wish to get the same results typing "Pompe à chaleur" as typing
"pomppe a chaler", or with "solère" and "solaire".

I have not found the right way to create a fieldtype to reach this aim.

Thanks in advance for your help; do not hesitate to ask for more information
if needed.


Regards

David






Re: Solr french search optimisation

2013-05-23 Thread Cristian Cascetta
Hello,

I think you're confusing three different things:

1) schema and field definitions are for precision/recall: treating a field
differently means different search results and different result ranking
2) the "pomppe a chaler" problem is more a spellchecking problem
http://wiki.apache.org/solr/SpellCheckComponent
3) "solère" vs. "solaire" is a phonetic search problem
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PhoneticFilterFactory
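
For the phonetic case, a minimal sketch of an extra field type (assuming the
stock DoubleMetaphone encoder, which is English-oriented and only a rough
approximation for French):

<fieldType name="text_phonetic" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone"
            inject="false"/>
  </analyzer>
</fieldType>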

Hope this helps a little,

cristian


2013/5/23 It-forum it-fo...@meseo.fr

 Hello again,

 Can anyone help me, please?

 David

 On 22/05/2013 18:09, It-forum wrote:

  [...]






Solr 4.3: node is seen as active in Zk while in recovery mode + endless recovery

2013-05-23 Thread AlexeyK
Consider the following:
Solr 4.3, 2 node test cluster, each is a leader.
During indexing (or immediately after it, before a hard commit) I shut down one
of them and restarted it later.
The tlog is about 200 MB in size.
I see recurring 'Reordered DBQs detected' messages in the log. It seems like an
endless loop, because THE VERY SAME update query appears thousands of times; it
has been running for a long time now.
In the meanwhile, the node is inaccessible (obviously), but in the Zk state
it appears as active, NOT in recovery mode or down.
It seems that this is caused by a recent change in ZkController which adds
recovery logic to the 'register' routine.

Regards,
Alexey




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-3-node-is-seen-as-active-in-Zk-while-in-recovery-mode-endless-recovery-tp4065549.html


Re: How to query docs with an indexed polygon field in java?

2013-05-23 Thread David Smiley (@MITRE.org)
Hi Kevenz,


kevenz wrote
 ...
 String sql = indexType:219  AND 
 geo:Contains(POINT(114.078327401257,22.5424866754136));
 ...
 Then I got an error at java.lang.IllegalArgumentException: missing
 parens: Contains. Is there any suggestion?

First of all, if your query shape is a point, then use Intersects, which is
semantically equivalent but works much faster.  One error in your query is
that your quotes look messed up.  Another is that you used a comma to
separate the X and Y when you should use a space (because you are using WKT
syntax via POINT).  Try this:

   indexType:219 AND geo:"Contains(POINT(114.078327401257
22.5424866754136))"

This will also work using lat,lon non-WKT syntax:

   indexType:219 AND geo:"Contains(22.5424866754136, 114.078327401257)"

Disclaimer: I didn't run these, I just typed them in the email.

~ David



-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-query-docs-with-an-indexed-polygon-field-in-java-tp4065512p4065550.html


Re: Solr 4.3: node is seen as active in Zk while in recovery mode + endless recovery

2013-05-23 Thread AlexeyK
A small correction: it's not an endless loop, but painfully slow processing,
which includes running a delete query and then an insertion. Each document from
the tlog takes tens of seconds to process (more than 100 times slower than
during the normal insertion process).



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-3-node-is-seen-as-active-in-Zk-while-in-recovery-mode-endless-recovery-tp4065549p4065551.html


Re: Solr french search optimisation

2013-05-23 Thread fbrisbart
You can also think about using a SynonymFilter if you can list the
misspelled words.

That's a quick and dirty solution, but it's easier to add a "pomppe => pompe"
mapping to a synonym list than to tune a phonetic filter.
NB: re-indexing is required whenever the synonyms file changes.
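
For example (a sketch; the mapping file name is made up, and the filter would
go in the index-time analyzer of the French field):

# synonyms-fr.txt
pomppe => pompe
chaler => chaleur

<filter class="solr.SynonymFilterFactory" synonyms="synonyms-fr.txt"
        ignoreCase="true" expand="false"/>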

Franck Brisbart

On Thursday, 23 May 2013 at 08:59 +0200, Cristian Cascetta wrote:
  [...]




Re: [ANNOUNCE] Web Crawler

2013-05-23 Thread Dominique Bejean

Hi,

Release 3.0.3 was tested with:

* Oracle Java 6, but it should work fine with version 7
* Tomcat 5.5, 6 and 7
* PHP 5.2.x and 5.3.x
* Apache 2.2.x
* MongoDB 64-bit 2.2 (known issue with 2.4)

The new release 4.0.0-alpha-2 is available on GitHub:
https://github.com/bejean/crawl-anywhere


The pre-requisites are:

Oracle Java 6 or higher
Tomcat 5.5 or higher
Apache 2.2 or higher
PHP 5.2.x, 5.3.x or 5.4.x
MongoDB 64-bit 2.2 or higher
Solr 3.x or 4.x (configuration files provided for Solr 4.3.0)

And the up-to-date installation instructions are here:
http://www.crawl-anywhere.com/installation-v400/


Please read the GitHub project home page; all the information is provided
there.

Regards.

Dominique




On 23/05/13 07:38, Rajesh Nikam wrote:

Hi,

Crawl Anywhere seems to be using old versions of Java, Tomcat, etc.

http://www.crawl-anywhere.com/installation-v300/

Will it work with newer versions of this required software?

Is there an updated installation guide available?

Thanks
Rajesh





On Wed, May 22, 2013 at 6:48 PM, Dominique Bejean
dominique.bej...@eolya.fr wrote:


Hi,

Crawl-Anywhere is now open-source -
https://github.com/bejean/crawl-anywhere

Best regards.


On 02/03/11 10:02, findbestopensource wrote:

Hello Dominique Bejean,

Good job.

We identified almost 8 open source web crawlers
(http://www.findbestopensource.com/tagged/webcrawler); I don't
know how far yours would be different from the rest.

Your license states that it is not open source, but it is free
for personal use.

Regards
Aditya
www.findbestopensource.com


On Wed, Mar 2, 2011 at 5:55 AM, Dominique Bejean
dominique.bej...@eolya.fr wrote:

Hi,

I would like to announce Crawl Anywhere. Crawl-Anywhere is a Java
web crawler. It includes:

  * a crawler
  * a document processing pipeline
  * a Solr indexer

The crawler has a web administration interface for managing the web
sites to be crawled. Each web site crawl is configured with a lot of
possible parameters (not all mandatory):

  * number of simultaneous items crawled by site
  * recrawl period rules based on item type (html, PDF, …)
  * item type inclusion / exclusion rules
  * item path inclusion / exclusion / strategy rules
  * max depth
  * web site authentication
  * language
  * country
  * tags
  * collections
  * ...

The pipeline includes various ready-to-use stages (text
extraction, language detection, a Solr ready-to-index XML
writer, ...).

Everything is very configurable and extensible, either by
scripting or by Java coding.

With scripting, you can help the crawler handle JavaScript
links, or help the pipeline extract the relevant title and
clean up the HTML pages (remove menus, headers, footers, ...).

With Java coding, you can develop your own pipeline stages.

The Crawl Anywhere web site provides good explanations and
screenshots. Everything is documented in a wiki.

The current version is 1.1.4. You can download and try it
out from here: www.crawl-anywhere.com


Regards

Dominique



--
Dominique Béjean

+33 6 08 46 12 43
skype: dbejean
www.eolya.fr
www.crawl-anywhere.com
www.mysolrserver.com




--
Dominique Béjean
+33 6 08 46 12 43
skype: dbejean
www.eolya.fr
www.crawl-anywhere.com



Distributed query: strange behavior.

2013-05-23 Thread Luis Cappa Banda
Hello, guys!

I'm running Solr 4.3.0 and I've noticed strange behavior during distributed
query execution. Currently I have three Solr servers as shards, and when I do
the following query...


http://localhost:11080/twitter/data/select?q=*:*&rows=10&shards=localhost:11080/twitter/data,localhost:12080/twitter/data,localhost:13080/twitter/data&wt=json

numFound = 47131


I've queried each Solr shard server one by one and the total number of
documents is correct. However, when I change the rows parameter from 10 to 100,
the total numFound changes:

http://localhost:11080/twitter/data/select?q=*:*&rows=100&shards=localhost:11080/twitter/data,localhost:12080/twitter/data,localhost:13080/twitter/data&wt=json

numFound = 47124

And if I set rows=50, the numFound count changes again:

http://localhost:11080/twitter/data/select?q=*:*&rows=50&shards=localhost:11080/twitter/data,localhost:12080/twitter/data,localhost:13080/twitter/data&wt=json

numFound = 47129


What's happening here? Does anybody know? Is it a distributed search bug or
something?

Thank you very much in advance!


Best regards,

-- 
- Luis Cappa


Re: Boosting Documents

2013-05-23 Thread Oussama Jilal
Oh thank you Chris, this is much clearer, and thank you for updating the 
Wiki too.


On 05/22/2013 08:29 PM, Chris Hostetter wrote:

: NOTE: make sure norms are enabled (omitNorms=false in the schema.xml) for
: any fields where the index-time boost should be stored.
:
: In my case where I only need to boost the whole document (not a specific
: field), do I have to activate the  omitNorms=false  for all the fields
: in the schema ?

docBoost is really just syntactic sugar for a field boost on each field in
the document -- it's factored into the norm value for each field in the
document.  (I'll update the wiki to make this more clear)

If you do a query that doesn't utilize any field which has norms, then the
docBoost you specified when indexing the document never comes into play.


In general, doc boosts and field boosts, and the way they come into play
as part of the field norm, are fairly inflexible and (in my opinion)
antiquated.  A much better way of dealing with this type of problem is
also discussed in the section of the wiki you linked to.  Immediately
below...

http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts

...you'll find...

http://wiki.apache.org/solr/SolrRelevancyFAQ#Field_Based_Boosting
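
For example, a document-level popularity value can be folded in as a
multiplicative query-time boost with edismax (a sketch; the numeric
"popularity" field is made up):

q=ipod&defType=edismax&boost=log(sum(popularity,1))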


-Hoss




Re: Solr french search optimisation

2013-05-23 Thread It-forum

Hello,

Thanks, Cristian, for your details.

I totally agree with your explanation; these are two different aspects
which I need to solve.


Could you clarify a few more things:

- SpellCheckComponent and Phonetic: should they be used while indexing or only
while querying?


- Does the spellcheck component return only the right spelling, or is it
used to search within the results?


- If I want to solve spelling, phonetic and stemming problems in the French
language, can I use only one field or should I use several with
different filters?


Regards

David


On 23/05/2013 08:59, Cristian Cascetta wrote:

[...]



Re: Facet pivot 50.000.000 different values

2013-05-23 Thread Carlos Bonilla
In case anyone is interested, I solved my problem using the grouping
feature:

query -- the filter query (if any)
field -- the field whose distinct values you want to count (in my case field B)

SolrQuery solrQuery = new SolrQuery(query);
solrQuery.add("group", "true");
solrQuery.add("group.field", "B"); // group by the field
solrQuery.add("group.ngroups", "true");
solrQuery.setRows(0);

And in the response, getNGroups() will give you the total number of distinct
values (the total number of distinct B values).
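
Reading the count back with SolrJ might look like this (a sketch; server
setup omitted):

QueryResponse rsp = server.query(solrQuery);
// with group.ngroups=true, each group command carries the distinct-value count
int distinctB = rsp.getGroupResponse().getValues().get(0).getNGroups();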

Cheers,
Carlos.


2013/5/18 Carlos Bonilla carlosbonill...@gmail.com

 Hi Mikhail,
 yes the thing is that I need to take into account different queries and
 that's why I can't use the Terms Component.

 Cheers.


 2013/5/17 Mikhail Khludnev mkhlud...@griddynamics.com

 On Fri, May 17, 2013 at 12:47 PM, Carlos Bonilla
 carlosbonill...@gmail.comwrote:

  We
  only need to calculate how many different B values have more than 1
  document but it takes ages
 

 Carlos,
 It's not clear whether you need to take the results of a query into account or
 just gather statistics from the index. If the latter, you can just enumerate
 the terms and look at TermsEnum.docFreq(). Am I getting it right?


 --
 Sincerely yours
 Mikhail Khludnev
 Principal Engineer,
 Grid Dynamics

 http://www.griddynamics.com
  mkhlud...@griddynamics.com





Re: Solr french search optimisation

2013-05-23 Thread Cristian Cascetta
 Could you clarify a few more things:

 - SpellCheckComponent and Phonetic: should they be used while indexing or only
 while querying?


SpellCheck: you can define a specific field for spellchecking (in this
sense it's query/schema time) or you can create a specific vocabulary for
spell-checking. I strongly suggest going through the documentation
(http://wiki.apache.org/solr/SpellCheckComponent) for this component; every
time I used it I needed to customize and adapt the configuration.



 - Does the spellcheck component return only the right spelling, or is it used
 to search within the results?


I'm not sure, please check the documentation, but I remember that you can
configure it to directly re-execute the spell-corrected query AND show some
alternatives/suggestions to the user (obviously this is a display/frontend
choice).



 - If I want to solve spelling, phonetic and stemming problems in the French
 language, can I use only one field or should I use several with different
 filters?



I don't think it's possible to use only one field. From my experience, I
suggest using multiple fields for multiple purposes. If you're scared by the
index size, remember that fields that are indexed and NOT stored don't grow
your index that much. Set as stored only the fields you need to display to the
end user.
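
A common pattern is to copy one source field into several differently
analyzed fields (a sketch; the field and type names are made up):

<field name="title" type="text_fr" indexed="true" stored="true"/>
<field name="title_phonetic" type="text_phonetic" indexed="true" stored="false"/>
<copyField source="title" dest="title_phonetic"/>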


Re: Solr Faceting doesn't return values.

2013-05-23 Thread Sandeep Mestry
<str name="msg">org.apache.solr.search.SyntaxError: Cannot parse
'*mm_state_code:(**TX)*': Encountered " ":" ": "" at line 1, column 14.
Was expecting one of:</str>

This suggests to me that you kept the df parameter in the query, hence it
was forming mm_state_code:mm_state_code:(TX). Can you try it exactly the way
I gave you, i.e. without the df parameter?
Also, can you post your schema.xml and the /select handler config from
solrconfig.xml?


On 22 May 2013 18:36, samabhiK qed...@gmail.com wrote:

 When I use your query, I get :

 <?xml version="1.0" encoding="UTF-8"?>
 <response>

 <lst name="responseHeader">
   <int name="status">400</int>
   <int name="QTime">12</int>
   <lst name="params">
     <str name="facet">true</str>
     <str name="df">mm_state_code</str>
     <str name="indent">true</str>
     <str name="q">*mm_state_code:(**TX)*</str>
     <str name="_">1369244078714</str>
     <str name="debug">all</str>
     <str name="facet.field">sa_site_city</str>
     <str name="wt">xml</str>
   </lst>
 </lst>
 <lst name="error">
   <str name="msg">org.apache.solr.search.SyntaxError: Cannot parse
 '*mm_state_code:(**TX)*': Encountered " ":" ": "" at line 1, column 14.
 Was expecting one of:
 <EOF>
 <AND> ...
 <OR> ...
 <NOT> ...
 "+" ...
 "-" ...
 <BAREOPER> ...
 "(" ...
 "*" ...
 "^" ...
 <QUOTED> ...
 <TERM> ...
 <FUZZY_SLOP> ...
 <PREFIXTERM> ...
 <WILDTERM> ...
 <REGEXPTERM> ...
 "[" ...
 "{" ...
 <LPARAMS> ...
 <NUMBER> ...
 </str>
   <int name="code">400</int>
 </lst>
 </response>

 Not sure why the data won't show up. Almost all the records have data in the
 field sa_site_city, and it is also indexed. :(



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Solr-Faceting-doesn-t-return-values-tp4065276p4065406.html



Re: Solr4.2 - Fuzzy Search Problems

2013-05-23 Thread meghana
Thanks Chris,

For my 2nd query (~1 returning words at an edit distance of 2), that may be the
issue.

I am still looking into my last issue; I hope the JIRA helps to resolve it.


Chris Hostetter-3 wrote
 : 
 : 2) although I set editing distance to 1 in my query (e.g. worde~1), solr
 : returns me results having 2 editing distance (like WORDOES, WORHEE,
 WORKEE,
 : .. ect. )
 
 fuzzy search works on *terms* in your index -- if you use a stemmer when
 you index your data (your schema shows that you are) then a word in your
 input like WORDOES might wind up in your index as a term within the edit
 distance you specified (ie: wordo or word or something similar)
 
 : 3) Last and major issue, I had very few data at startup in my solr core
 (say
 : around 1K - 2K ), at that time, when i was searching with worde~1 , it
 was
 : returning many records (around 450).
 : 
 : Then I ingested few more records in my solr core (say around 1K). It was
 : ingested successfully , no errors or warning in Log. After that when I
 : performed the same fuzzy search (worde~1) on previous records only, not
 in
 : new ingested records , It did not return me previous results(around 450)
 as
 : well, and return total 1 record only having highlight as WORD!N .
 
 This sounds like the same issue as discribed in SOLR-4824...
 
 https://issues.apache.org/jira/browse/SOLR-4824
 
 
 -Hoss





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr4-2-Fuzzy-Search-Problems-tp4063199p4065576.html


index multiple files into one index entity

2013-05-23 Thread Mark.Kappe
Hello solr team,

I want to index multiple files into one Solr index entity, with the same id.
We are using Solr 4.1.


I try it with following source fragment:

public void addContentSet(ContentSet contentSet) throws
        SearchProviderException {

    ...

    ContentStreamUpdateRequest csur =
            generateCSURequest(contentSet.getIndexId(), contentSet);
    String indexId = contentSet.getIndexId();

    ConcurrentUpdateSolrServer server =
            serverPool.getUpdateServer(indexId);
    server.request(csur);

    ...
}

private ContentStreamUpdateRequest generateCSURequest(String indexId,
        ContentSet contentSet) throws IOException {
    ContentStreamUpdateRequest csur =
            new ContentStreamUpdateRequest(confStore.getExtractUrl());

    ModifiableSolrParams parameters = csur.getParams();
    if (parameters == null) {
        parameters = new ModifiableSolrParams();
    }

    parameters.set("literalsOverride", false);

    // map the Tika default content attribute to the attribute named 'fulltext'
    parameters.set("fmap.content",
            SearchSystemAttributeDef.FULLTEXT.getName());
    // create an empty content stream; this seems necessary for
    // ContentStreamUpdateRequest
    csur.addContentStream(new ImaContentStream());

    for (Content content : contentSet.getContentList()) {
        csur.addContentStream(new ImaContentStream(content));
        // for each content stream add additional attributes
        parameters.add("literal."
                + SearchSystemAttributeDef.CONTENT_ID.getName(),
                content.getBinaryObjectId().toString());
        parameters.add("literal."
                + SearchSystemAttributeDef.CONTENT_KEY.getName(),
                content.getContentKey());
        parameters.add("literal."
                + SearchSystemAttributeDef.FILE_NAME.getName(),
                content.getContentName());
        parameters.add("literal."
                + SearchSystemAttributeDef.MIME_TYPE.getName(),
                content.getMimeType());
    }

    parameters.set("literal.id", indexId);

    // adding some other attributes
    ...

    csur.setParams(parameters);

    return csur;
}

During debugging I can see that the method 'server.request(csur)' reads the
buffer of each ImaContentStream.
When I look at the Solr Catalina log I see that the attached files reach the
Solr servlet.

INFO: Releasing directory:/data/V-4-1/master0/data/index
Apr 25, 2013 5:48:07 AM org.apache.solr.update.processor.LogUpdateProcessor 
finish
INFO: [master0] webapp=/solr-4-1 path=/update/extract 
params={literal.searchconnectortest15_c8150e41_cc49_4a .. 
literal.id=26afa5dc-40ad-442a-ac79-0e7880c06aa1 .
{add=[26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910940958720), 
26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910971367424), 
26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910976610304), 
26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910983950336), 
26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910989193216), 
26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910995484672)]} 0 58


But only the last one in the content list gets indexed.


My schema.xml has the following field definitions:

<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="content" type="text_general" indexed="false" stored="true"
       multiValued="true"/>

<field name="contentkey" type="string" indexed="true" stored="true"
       multiValued="true"/>
<field name="contentid" type="string" indexed="true" stored="true"
       multiValued="true"/>
<field name="contentfilename" type="string" indexed="true" stored="true"
       multiValued="true"/>
<field name="contentmimetype" type="string" indexed="true" stored="true"
       multiValued="true"/>

<field name="fulltext" type="text_general" indexed="true" stored="true"
       multiValued="true"/>


I'm using the Tika-based ExtractingRequestHandler, which can extract binary files.



  <requestHandler name="/update/extract"
                  startup="lazy"
                  class="solr.extraction.ExtractingRequestHandler">
    <lst name="defaults">
      <str name="lowernames">true</str>
      <str name="uprefix">ignored_</str>

      <!-- capture link hrefs but ignore div attributes -->
      <str name="captureAttr">true</str>
      <str name="fmap.a">links</str>
      <str name="fmap.div">ignored_</str>
    </lst>
  </requestHandler>

Is it possible to index multiple files with the same id?
Is it necessary to implement my own RequestHandler?

With best regards Mark





Solr DIH - Small index still take time?

2013-05-23 Thread Spadez
Hi,

This is the situation, I have two sources of data in my dataimport handler,
one is huge, the other is tiny:

Source A: 10-20 records
Source B: 50,000,000 records

I was wondering what happens if I were to run the DIH just on Source A every 10
minutes, and only run the DIH on Source B every 24 hours.

Would running my DIH on Source A be extremely quick, because the data we are
importing is small, or would it still be time-consuming, because it would
have to rebuild the index of the entire Solr index (i.e. 50,000,010 records)?

Thank you!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-DIH-Small-index-still-take-time-tp4065582.html


Question about Coordination factor

2013-05-23 Thread Kazuaki Hiraga
Hello Folks,

Sorry, my last email was a bit messy, so I am sending it again.

I have a question about coordination factor to ensure my understanding
of this value is correct.

If I have documents that contain some keywords like the following:
  Doc1: A, B, C
  Doc2: A, C
  Doc3: B, C

And my query is "A OR B OR C OR D". In this case, the coord factor value
for each document will be the following:
 Doc1: 3/4
 Doc2: 2/4
 Doc3: 2/4

In the same fashion, the respective coord factor values are the following
if I have the query "C OR D":
 Doc1: 1/2
 Doc2: 1/2
 Doc3: 1/2

Is this correct, or did I miss something?

Please correct me if I am wrong.

Regards,
Kazuaki


Re: Question about Coordination factor

2013-05-23 Thread Anshum Gupta
This looks correct.
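
For reference, Lucene's DefaultSimilarity computes this factor as the number
of matching query terms over the total number of query terms:

    coord(q, d) = overlap / maxOverlap

so for "A OR B OR C OR D", Doc1 matches 3 of the 4 terms, giving 3/4.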


On Thu, May 23, 2013 at 7:37 AM, Kazuaki Hiraga kazuaki.hir...@gmail.comwrote:

 Hello Folks,

 [...]




-- 

Anshum Gupta
http://www.anshumgupta.net


Re: Can anyone explain this Solr query behavior?

2013-05-23 Thread Erick Erickson
Please post the results of adding &debug=query to the URL.
That'll tell us what the query parser spits out, which is much
easier to analyze.

Best
Erick

On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju
shan...@ebrary.com wrote:
 This query returns 0 documents: q=(+Title:() +Classification:()
 +Contributors:() +text:())

 This returns 1 document: q=doc-id:3000

 And this returns 631580 documents when I was expecting 0: q=doc-id:3000
 AND (+Title:() +Classification:() +Contributors:() +text:())

 Am I missing something here? Can someone please explain? I am using Solr
 4.2.1

 Thanks
 -Shankar


Re: fq facet on double and non-indexed field

2013-05-23 Thread Erick Erickson
bq: So cant we do fq on non-indexed field

No. By definition the fq clause is a search and
you can only search on indexed fields.

Best
Erick

On Wed, May 22, 2013 at 5:08 PM, gpssolr2020 psgoms...@gmail.com wrote:
 Hi

 I am trying to apply filtering on a non-indexed double field, but it's not
 returning any results. So can't we do fq on a non-indexed field?

 can not use FieldCache on a field which is neither indexed nor has doc
 values: EXCH_RT_AMT
 </str>
 <int name="code">400</int>

 We are using Solr4.2.

 Thanks.



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/fq-facet-on-double-and-non-indexed-field-tp4065457.html


Re: Approach to apply full index from master to slaves?

2013-05-23 Thread Erick Erickson
What's your maxWarmingSearchers value?

About warming queries: they may be _adding_ to your problem.
I'd first try removing many of them, especially if you have
your cache autowarm settings very high; try 16 or so.

Autowarming is all about pre-loading the caches etc., but you
reach diminishing returns pretty quickly.

And what are all the threads doing?

Best
Erick

On Wed, May 22, 2013 at 11:14 PM, William Bell billnb...@gmail.com wrote:
 We have a 3GB index. We index on the master and then replicate to the
 slaves.

 But the issue is that after the slaves switch over we get deadlocking, the
 number of threads increases to 500, and most times the Solr instance just
 plain locks up.

 We tried adding a bunch of warming queries, but we still have a major
 performance hit and same issues.

 Are there any other tweaks and recommendations? Are others experiencing
 this?

 --
 Bill Bell
 billnb...@gmail.com
 cell 720-256-8076


hook to know when a DOC is committed.

2013-05-23 Thread Fredrik Rødland
I need to know when a document is committed in SOLR - i.e. is searchable.

Does anyone have a solution for how to do this?

I'm aware of three methods of creating hooks for knowing when a doc is added or
a commit is performed, but the doc(id) does not seem to be included for the
commit hooks (naturally, I guess):

A. subclass DirectUpdateHandler2 and override commit and/or addDoc
B. subclass UpdateRequestProcessor (and include it in the update-chain) and 
override processAdd and/or processCommit
C. implement SolrEventListener and implement postCommit and/or postSoftCommit
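
As an illustration of option C, a minimal sketch (note it fires per
commit/searcher, not per document, so the doc id is indeed not available here):

import org.apache.solr.common.util.NamedList;
import org.apache.solr.core.SolrEventListener;
import org.apache.solr.search.SolrIndexSearcher;

public class CommitNotifier implements SolrEventListener {
    @Override public void init(NamedList args) {}

    @Override public void postCommit() {
        // fired after a hard commit; no document ids are passed in
    }

    @Override public void postSoftCommit() {
        // fired after a soft commit
    }

    @Override public void newSearcher(SolrIndexSearcher newSearcher,
                                      SolrIndexSearcher currentSearcher) {
        // fired when a new searcher is registered, i.e. docs become visible
    }
}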

The use-case is to let other parts of a system know that a document is 
searchable without having to create a poller which has to have state on 
when/how it polls.

Any ideas or tricks out there?


Fredrik


--
Fredrik Rødland   Mail:fredrik.rodl...@finn.no
FINN.no   Cell:+47 99 21 98 17
  Twitter: @fredrikr
Oslo, NORWAY  Web: http://about.me/fmr



Re: search filter

2013-05-23 Thread Gora Mohanty
On 23 May 2013 11:19, Kamal Palei palei.ka...@gmail.com wrote:
 HI Rafał Kuć
 I tried both fq=Salary:[5+TO+10]+OR+Salary:0 and fq=Salary:[5 TO 10]
 OR Salary:0; in both cases I retrieved 0 results.
[...]

Please try the suggested filter query from the
Solr admin interface, or by typing it directly
into the browser URL bar. My guess is that
there is still some issue with your Drupal/Solr
integration.
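
For instance, the '+' signs in the first branch of the Drupal snippet look like
URL-encoding artifacts; since the client encodes parameters itself, literal
spaces should be used (a sketch, untested):

if ($include_0_salary == 1) {
    $conditions['fq'][0] = 'salary:[' . $min_ctc . ' TO ' . $max_ctc
        . '] OR salary:0';
}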

Regards,
Gora


Re: OPENNLP current patch compiling problem for 4.x branch

2013-05-23 Thread Erick Erickson
by definition, there is no LUCENE_44 constant in a 4.3
distro! Just change it to LUCENE_43 (or whatever you find
in the Version class that suits your needs) or try this on a
4.x checkout.
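
I.e., in FilterPayloadsFilter.java that would be something like (untested):

    super(Version.LUCENE_43, input);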

Best
Erick

On Thu, May 23, 2013 at 2:08 AM, Patrick Mi
patrick...@touchpointgroup.com wrote:
 Hi,

 I checked out from here
 http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_3_0 and
 downloaded the latest patch LUCENE-2899-current.patch.

 Applied the patch ok but when I did 'ant compile' I got the following error:


 ==
 [javac]
 /home/lucene_solr_4_3_0/lucene/analysis/opennlp/src/java/org/apache/lucene/a
 nalysis/opennlp/FilterPayloadsFilter.java:43: error
 r: cannot find symbol
 [javac] super(Version.LUCENE_44, input);
 [javac]  ^
 [javac]   symbol:   variable LUCENE_44
 [javac]   location: class Version
 [javac] 1 error
 ==

 Compiled it on trunk without problem.

 Is this patch supposed to work for 4.X?

 Regards,
 Patrick



Re: Solr 4.3: node is seen as active in Zk while in recovery mode + endless recovery

2013-05-23 Thread Erick Erickson
Tangential to the issue you raise is that this is a huge tlog. It indicates that
you aren't doing a hard commit (openSearcher=false) very often. That
operation will truncate your tlog which should speed recovery/startup.
You're also chewing up some memory with a tlog that size since pointers
to the tlog are kept for each document.
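
A typical hard-commit setting in solrconfig.xml looks like this (the values
are illustrative):

<autoCommit>
  <maxDocs>25000</maxDocs>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>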

This comment doesn't address your comment about the change to
ZkController, I'll leave that to someone who knows the code.

Best
Erick

On Thu, May 23, 2013 at 3:14 AM, AlexeyK lex.kudi...@gmail.com wrote:
 A small correction: it's not an endless loop, but painfully slow processing,
 which includes running a delete query and then an insertion. Each document from
 the tlog takes tens of seconds to process (more than 100 times slower than
 during the normal insertion process).



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Solr-4-3-node-is-seen-as-active-in-Zk-while-in-recovery-mode-endless-recovery-tp4065549p4065551.html


Re: hook to know when a DOC is committed.

2013-05-23 Thread Jack Krupansky
A poller really is the most sensible, practical, and easiest route to go. If
you add the versions=true parameter to your update request and have the
transaction log enabled, the update response will have the version numbers
for each document id; the poller can then also tell whether an update has been
committed.


Also, with soft commit, documents should be visible much more rapidly.

Do you have some other, unmentioned requirement that you feel is biasing you 
against a sensible poller? Clue us in as to the nature of such a 
requirement.


-- Jack Krupansky

-Original Message- 
From: Fredrik Rødland

Sent: Thursday, May 23, 2013 7:53 AM
To: solr-user@lucene.apache.org
Subject: hook to know when a DOC is committed.

[...]



Re: Solr DIH - Small index still take time?

2013-05-23 Thread Alexandre Rafalovitch
That should work. Just watch out for (i.e., set the value of)
preImportDeleteQuery. Otherwise, when you do a full import you may
accidentally delete items from the other set.
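
A sketch of what that could look like in data-config.xml (the entity names,
queries, and the "source" marker field are made up):

<document>
  <entity name="sourceA" preImportDeleteQuery="source:A"
          query="SELECT id, 'A' AS source, ... FROM small_table"/>
  <entity name="sourceB" preImportDeleteQuery="source:B"
          query="SELECT id, 'B' AS source, ... FROM big_table"/>
</document>

A full-import with &entity=sourceA should then only touch Source A's documents.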

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Thu, May 23, 2013 at 6:25 AM, Spadez james_will...@hotmail.com wrote:
 Hi,

 [...]


Re: index multiple files into one index entity

2013-05-23 Thread Erick Erickson
I just skimmed your post, but I'm responding to the last bit.

If you have uniqueKey defined as id in schema.xml then
no, you cannot have multiple documents with the same ID.
Whenever a new doc comes in it replaces the old doc with that ID.

You can remove the uniqueKey definition and do what you want,
but there are very few Solr installations with no uniqueKey and
it's probably a better idea to make your id's truly unique.

Best
Erick

On Thu, May 23, 2013 at 6:14 AM,  mark.ka...@t-systems.com wrote:
 Hello solr team,

 I want to index multiple files into one Solr index entity, with the same id.
 We are using Solr 4.1.

 [...]

 Is it possible to index multiple files with the same id?
 Is it necessary to implement my own RequestHandler?

Re: hook to know when a DOC is committed.

2013-05-23 Thread Fredrik Rødland
On 23 May 2013, at 14:05, Jack Krupansky j...@basetechnology.com wrote:

Hi Jack,

thanks for your answer.

 A poller really is the most sensible, practical, and easiest route to go. If
 you add the versions=true parameter to your update request and have the
 transaction log enabled, the update response will have the version numbers for
 each document id; the poller can then also tell whether an update has been
 committed.

The poller will still have to retry before advertising a doc as searchable,
won't it?

 Do you have some other, unmentioned requirement that you feel is biasing you 
 against a sensible poller? Clue us in as to the nature of such a requirement.

My plan was to link Solr with our already established high-volume
messaging system, so that each time a document becomes searchable a message is
broadcast on a given channel.

Our system consists of approx. 10 indexes and 8 replications of each of these,
so keeping track of all of them with pollers would require a whole bunch of
logic. Having a push-based system would facilitate knowing where & when a
document is searchable quite a lot.



regards,


Fredrik


--
Fredrik Rødland   Mail:fredrik.rodl...@finn.no
FINN.no   Cell:+47 99 21 98 17
  Twitter: @fredrikr
Oslo, NORWAY  Web: http://about.me/fmr



Re: fq facet on double and non-indexed field

2013-05-23 Thread gpssolr2020
Thanks Erick..


I assume we can't do q on a non-indexed field either.

What is the difference between q and fq, other than caching?



Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/fq-facet-on-double-and-non-indexed-field-tp4065457p4065604.html


Solr 4.3 fails to load MySQL driver

2013-05-23 Thread Christian Köhler


Hi,

in my attempt to migrate from 3.6.x to 4.3.0 I stumbled upon an issue
loading the MySQL driver from the [instance]/lib dir:


Caused by: java.lang.ClassNotFoundException: 
org.apache.solr.handler.dataimport.DataImportHandler

 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
 at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:266)
 at 
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:448)

... 18 more

To narrow it down, I use the plain example configuration with the
following changes:


- Added a dataimport requestHandler to example/conf/solrconfig.xml
  (copied from a working Solr 3.6.x)
- Created example/conf/data-config.xml with
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" ...>
  and an SQL statement (both copied from a working Solr 3.6.x)
- Placed the current driver mysql-connector-java-5.1.25-bin.jar in
  example/lib

To my knowledge the lib dir is included in the path automatically. To
make sure, I tried to:


- add <lib dir="./lib"/> explicitly to solrconfig.xml
- add an absolute path to solrconfig.xml
- change solr.xml to use <solr persistent="true" sharedLib="lib">

All to no avail.

System Info:
- OpenJDK Runtime Environment 1.7.0_19
- Solr 4.3.0
- mysql-connector-java-5.1.25-bin.jar

The same configuration runs fine with Solr 3.6.x on the very same machine.

Any help is appreciated!
Cheers
Chris



--
Christian Köhler



Re: hook to know when a DOC is committed.

2013-05-23 Thread Jack Krupansky
Yes, by definition, a poller retries. But with a sensible default for the
initial poll and retry (possibly an initial delay tuned to match the average
update/commit time) coupled with traditional exponential backoff, that
should not be a problem at all. In other words, an average request would not
require a retry.


Even so, do you feel that there is some sort of problem with retry? If so, 
please state what it is.


Again, if you utilize soft commit, the time to commit will be significantly 
reduced.


Or, just go ahead and force a commit on every update where the delay of a poll
request is not acceptable. But I'd recommend the tuned poller.


"would require a whole bunch of logic" - and you think the commit hooks and
your push-model implementation (on both the Solr and client side) will be
less logic?!!


-- Jack Krupansky

-Original Message- 
From: Fredrik Rødland

Sent: Thursday, May 23, 2013 8:18 AM
To: solr-user@lucene.apache.org
Subject: Re: hook to know when a DOC is committed.

On 23 May 2013, at 14:05, Jack Krupansky j...@basetechnology.com wrote:

[...]



Bug in spellcheck.alternativeTermCount

2013-05-23 Thread Rounak Jain
I was playing around with spellcheck.alternativeTermCount and noticed that
if it is set to zero, Solr gives an exception with certain queries. Maybe
the value isn't supposed to be zero, but I don't think an exception is the
expected behaviour.

Rounak


Restaurant availability from database

2013-05-23 Thread rajh
Hi,

I am building a website that lists restaurant information, and I would also
like to include the availability information.

I've created a custom ValueSourceParser and ValueSource that retrieve the
availability information from a MySQL database. An example query is as
follows.

http://localhost:8983/solr/collection1/select?q=restaurant_id:*&fl=*,available:availability(2013-05-23,
2, 1700, 2359)

This results in a pseudo (boolean) field "available" per document result, and
this works as expected. But my problem is that I also need the total number
of available restaurants.

Is there a way to count the number of available restaurants over the whole
result set? I tried the stats component, but it doesn't seem to work with
pseudo fields.

Thanks in advance,

Ronald





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Restaurant-availability-from-database-tp4065609.html


Re: Solr 4.3 fails to load MySQL driver

2013-05-23 Thread Jack Krupansky
Check the Solr log on startup - it will explicitly state which lib
directories/files will be used. Make sure they agree with where the DIH jars
reside. Keep in mind that the directory structure of Solr changed - use the
<lib> directives from the 4.3 solrconfig.xml.


Try to use DIH in the standard Solr 4.3 example first. Then mimic that in 
your customization.
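
In the stock 4.3 example, the lib directives look roughly like this (a sketch;
the exact relative paths depend on your directory layout):

<lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
<lib dir="./lib" regex="mysql-connector-java-.*\.jar" />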


-- Jack Krupansky

-Original Message- 
From: Christian Köhler

Sent: Thursday, May 23, 2013 8:25 AM
To: solr-user@lucene.apache.org
Subject: Solr 4.3 fails to load MySQL driver


Hi,

in my attempt to migrate for m 3.6.x to 4.3.0 I stumbled upon an issue
loading the MySQL driver from the [instance]/lib dir:

Caused by: java.lang.ClassNotFoundException:
org.apache.solr.handler.dataimport.DataImportHandler
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
 at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:266)
 at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:448)
... 18 more

To narrow it down, I use the plain example configuration with the
following changes:

- Add a dataimport requestHandler to example/conf/solrconfig.xml
  (copied from a working solr 3.6.x)
- Created example/conf/data-config.xml with
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" ...
  and SQL statement (both copied from a working solr 3.6.x)
- placed the current driver mysql-connector-java-5.1.25-bin.jar in
  example/lib

To my knowledge the lib dir is included automatically in the path. To
make sure I tried to:

- add <lib dir="./lib" /> explicitly to solrconfig.xml
- add an absolute path to solrconfig.xml
- changed solr.xml to use <solr persistent="true" sharedLib="lib">

All to no avail.

System Info:
- OpenJDK Runtime Environment 1.7.0_19
- Solr 4.3.0
- mysql-connector-java-5.1.25-bin.jar

The same configuration runs fine with solr 3.6.x on the very same machine.

Any help is appreciated!
Cheers
Chris



--
Christian Köhler 



Shardsplitting

2013-05-23 Thread Arkadi Colson

Hi

When having a collection with 3 shards en 2 replica's for each shard and 
I want to split shard1. Does it matter where to start the splitshard 
command in the cloud or should it be started on the master of that shard?


BR,
Arkadi




Re: Solr 4.3: node is seen as active in Zk while in recovery mode + endless recovery

2013-05-23 Thread Jan Høydahl
Huge tlogs seem to be a common problem. Should we make the tlog flush automatically 
once it reaches a certain file size? This could be configurable on the <updateLog> tag.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

23. mai 2013 kl. 14:03 skrev Erick Erickson erickerick...@gmail.com:

 Tangential to the issue you raise is that this is a huge tlog. It indicates 
 that
 you aren't doing a hard commit (openSearcher=false) very often. That
 operation will truncate your tlog which should speed recovery/startup.
 You're also chewing up some memory with a tlog that size since pointers
 to the tlog are kept for each document.
 
 This comment doesn't address your comment about the change to
 ZkController, I'll leave that to someone who knows the code.
 
 Best
 Erick
 
 On Thu, May 23, 2013 at 3:14 AM, AlexeyK lex.kudi...@gmail.com wrote:
a small correction: it's not an endless loop, but painfully slow processing,
which includes running a delete query and then an insertion. Each document from
the tlog takes tens of seconds to process (more than 100 times slower than
during the normal insertion process)
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Solr-4-3-node-is-seen-as-active-in-Zk-while-in-recovery-mode-endless-recovery-tp4065549p4065551.html
 Sent from the Solr - User mailing list archive at Nabble.com.
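
A minimal solrconfig.xml sketch of the hard-commit setting Erick describes
(the interval is illustrative; openSearcher=false keeps the hard commit from
opening a new searcher, so it only flushes segments and truncates the tlog):

  <autoCommit>
    <maxTime>60000</maxTime>          <!-- hard commit at most every 60s -->
    <openSearcher>false</openSearcher>
  </autoCommit>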



Re: Regular expression in solr

2013-05-23 Thread Erik Hatcher
Regex expressions work on individual terms.  Positional information is 
irrelevant when it comes to regex matching - it's not matching across terms*.

The syntax allowed is documented here 
https://lucene.apache.org/core/4_3_0/core/org/apache/lucene/util/automaton/RegExp.html
 - it's not quite the full standard syntax.  ^ and $ aren't mentioned there.  
The beginning of the regex implicitly starts at the beginning of the term.

So whatever constitutes a term is the granularity of what matches.  string 
fields operate on the entire string.  A text field that is analyzed will regex 
match on the individual terms that emerge from the index-time analysis process.

Erik

* Though with the surround query parser you can do proximity matching using 
wildcarded terms in sophisticated ways.
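
To illustrate the term granularity (the field names below are hypothetical):
on a string field the regex must match the entire stored value, while on a
tokenized text field it is tested against each indexed term:

  name_s:/Regular expressions.*/    (string field: the whole stored value must match)
  name_txt:/regular.*/              (text field: any single indexed term may match)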

On May 22, 2013, at 16:42 , Lance Norskog wrote:

 If the indexed data includes positions, it should be possible to implement ^ 
 and $ as the first and last positions.
 
 On 05/22/2013 04:08 AM, Oussama Jilal wrote:
 There is no ^ or $ in the solr regex since the regular expression will match 
 tokens (not the complete indexed text). So the results you get will basically 
 depend on how you index: if you use the regex on a tokenized field 
 and that is not what you want, try using a copy field which is not tokenized 
 and then apply the regex to that one.
 
 On 05/22/2013 11:53 AM, Stéphane Habett Roux wrote:
 I just can't get the $ endpoint to work.
 
 I am not sure but I heard it works with the Java Regex engine (a little 
 obvious if it is true ...), so any Java regex tutorial would help you.
 
 On 05/22/2013 11:42 AM, Sagar Chaturvedi wrote:
 Yes, it works for me too. But many times result is not as expected. Is 
 there some guide on use of regex in solr?
 
 -Original Message-
 From: Oussama Jilal [mailto:jilal.ouss...@gmail.com]
 Sent: Wednesday, May 22, 2013 4:00 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Regular expression in solr
 
 I don't think so, it always worked for me without anything special, just 
 try it and see :)
 
 On 05/22/2013 11:26 AM, Sagar Chaturvedi wrote:
 @Oussama Thank you for your reply. Is it as simple as that? I mean no 
 additional settings required?
 
 -Original Message-
 From: Oussama Jilal [mailto:jilal.ouss...@gmail.com]
 Sent: Wednesday, May 22, 2013 3:37 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Regular expression in solr
 
 You can write a regular expression query like this (you need to specify 
 the regex between slashes / ) :
 
 fieldName:/[rR]egular.*/
 
 On 05/22/2013 10:51 AM, Sagar Chaturvedi wrote:
 Hi,
 
 How do we search based upon regular expressions in solr?
 
 Regards,
 Sagar
 
 
 

AW: index multiple files into one index entity

2013-05-23 Thread Mark.Kappe
Hello Erick,
Thank you for your fast answer.

Maybe I didn't explain my question clearly.
I want to index many files into one index entity. I want the same behavior as 
any other multivalued field, which can be indexed under one unique id.
So I think every ContentStreamUpdateRequest represents one index entity, doesn't 
it? And with each addContentStream I add one file to this entity.

Thank you and with best Regards
Mark




-Ursprüngliche Nachricht-
Von: Erick Erickson [mailto:erickerick...@gmail.com] 
Gesendet: Donnerstag, 23. Mai 2013 14:11
An: solr-user@lucene.apache.org
Betreff: Re: index multiple files into one index entity

I just skimmed your post, but I'm responding to the last bit.

If you have uniqueKey defined as id in schema.xml then no, you cannot have 
multiple documents with the same ID.
Whenever a new doc comes in it replaces the old doc with that ID.

You can remove the uniqueKey definition and do what you want, but there are 
very few Solr installations with no uniqueKey and it's probably a better idea 
to make your id's truly unique.

Best
Erick
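
For reference, the schema.xml declaration Erick refers to is the stock one:

  <uniqueKey>id</uniqueKey>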

On Thu, May 23, 2013 at 6:14 AM,  mark.ka...@t-systems.com wrote:
 Hello solr team,

 I want to index multiple files into one solr index entity, with the 
 same id. We are using solr 4.1


 I try it with following source fragment:

 public void addContentSet(ContentSet contentSet) throws 
 SearchProviderException {

 ...

 ContentStreamUpdateRequest csur = 
 generateCSURequest(contentSet.getIndexId(), contentSet);
 String indexId = contentSet.getIndexId();

 ConcurrentUpdateSolrServer server = 
 serverPool.getUpdateServer(indexId);
 server.request(csur);

 ...
 }

 private ContentStreamUpdateRequest generateCSURequest(String indexId, 
 ContentSet contentSet)
 throws IOException {
 ContentStreamUpdateRequest csur = new 
 ContentStreamUpdateRequest(confStore.getExtractUrl());

 ModifiableSolrParams parameters = csur.getParams();
 if (parameters == null) {
 parameters = new ModifiableSolrParams();
 }

 parameters.set("literalsOverride", false);

 // maps the tika default content attribute to the attribute named 
 // 'fulltext'
 parameters.set("fmap.content", 
 SearchSystemAttributeDef.FULLTEXT.getName());
 // create an empty content stream; this seems necessary for 
 // ContentStreamUpdateRequest
 csur.addContentStream(new ImaContentStream());

 for (Content content : contentSet.getContentList()) {
 csur.addContentStream(new ImaContentStream(content));
 // for each content stream add additional attributes
 parameters.add("literal." + 
 SearchSystemAttributeDef.CONTENT_ID.getName(), 
 content.getBinaryObjectId().toString());
 parameters.add("literal." + 
 SearchSystemAttributeDef.CONTENT_KEY.getName(), content.getContentKey());
 parameters.add("literal." + 
 SearchSystemAttributeDef.FILE_NAME.getName(), content.getContentName());
 parameters.add("literal." + 
 SearchSystemAttributeDef.MIME_TYPE.getName(), content.getMimeType());
 }

 parameters.set("literal.id", indexId);

 // adding some other attributes
 ...

 csur.setParams(parameters);

 return csur;
 }

 During debugging I can see that the method 'server.request(csur)' reads the 
 buffer for each ImaContentStream.
 When I look at the solr catalina log I see that the attached files reach the 
 solr servlet.

 INFO: Releasing directory:/data/V-4-1/master0/data/index
 Apr 25, 2013 5:48:07 AM 
 org.apache.solr.update.processor.LogUpdateProcessor finish
 INFO: [master0] webapp=/solr-4-1 path=/update/extract 
 params={literal.searchconnectortest15_c8150e41_cc49_4a .. 
 literal.id=26afa5dc-40ad-442a-ac79-0e7880c06aa1 .
 {add=[26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910940958720), 
 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910971367424), 
 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910976610304), 
 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910983950336), 
 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910989193216), 
 26afa5dc-40ad-442a-ac79-0e7880c06aa1 (1433265910995484672)]} 0 58


 But only the last file in the content list is indexed.


 My schema.xml has the following field definitions:

 <field name="id" type="string" indexed="true" stored="true" 
 required="true" />
 <field name="content" type="text_general" indexed="false" 
 stored="true" multiValued="true"/>

 <field name="contentkey" type="string" indexed="true" stored="true" 
 multiValued="true"/>
 <field name="contentid" type="string" indexed="true" stored="true" 
 multiValued="true"/>
 <field name="contentfilename" type="string" indexed="true" stored="true" 
 multiValued="true"/>
 <field name="contentmimetype" type="string" indexed="true" 
 stored="true" multiValued="true"/>

 <field name="fulltext" type="text_general" 

Re: Solr 4.3: node is seen as active in Zk while in recovery mode + endless recovery

2013-05-23 Thread AlexeyK
the hard commit is set to about 20 minutes, while the RAM buffer is 256MB. 
We will add more frequent hard commits without refreshing the searcher, thanks
for the tip.

from what I understood from the code, for each 'add' command there is a test
for a 'delete by query'. If there is an older DBQ, it's run after the 'add'
operation if its version > the 'add' version.
in my case, there are a lot of documents to be inserted, and a single large
DBQ. My question is: shouldn't this be done in bulk? Why is it necessary to
run the DBQ after each insertion? Suppose there are 1000 insertions: it's
run 1000 times.





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-4-3-node-is-seen-as-active-in-Zk-while-in-recovery-mode-endless-recovery-tp4065549p4065628.html
Sent from the Solr - User mailing list archive at Nabble.com.


Broken pipe

2013-05-23 Thread Arkadi Colson

Any idea why I got a Broken pipe?

INFO  - 2013-05-23 13:37:19.881; org.apache.solr.core.SolrCore; 
[messages_shard3_replica1] webapp=/solr path=/select/ 
params={sort=score+desc&fl=id,smsc_module,smsc_modulekey,smsc_userid,smsc_ssid,smsc_description,smsc_description_ngram,smsc_content,smsc_content_ngram,smsc_courseid,smsc_lastdate,score,metadata_stream_size,metadata_stream_source_info,metadata_stream_name,metadata_stream_content_type,last_modified,author,title,subject&debugQuery=true&defaultOperator=AND&indent=on&start=0&q=(smsc_content:banaan+||+smsc_content_ngram:banaan+||+smsc_description:banaan+||+smsc_description_ngram:banaan)+%26%26+(smsc_lastdate:[2000-04-23T15:14:40Z+TO+2013-05-23T15:14:40Z])+%26%26+(smsc_ssid:9)&collection=messages&wt=xml&rows=50&version=2.2} 
hits=119 status=0 QTime=81108
ERROR - 2013-05-23 13:37:19.892; org.apache.solr.common.SolrException; 
null:ClientAbortException: java.net.SocketException: Broken pipe
at 
org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:406)

at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:342)
at 
org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:431)
at 
org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:419)
at 
org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:91)

at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:282)
at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:125)
at java.io.OutputStreamWriter.write(OutputStreamWriter.java:207)
at org.apache.solr.util.FastWriter.flush(FastWriter.java:141)
at org.apache.solr.util.FastWriter.flushBuffer(FastWriter.java:155)
at 
org.apache.solr.response.TextResponseWriter.close(TextResponseWriter.java:85)
at 
org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:41)
at 
org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:644)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:372)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at 
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1008)
at 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
at 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

at java.lang.Thread.run(Thread.java:722)
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at 
org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOutputBuffer.java:215)

at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:480)
at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:366)
at 
org.apache.coyote.http11.InternalOutputBuffer$OutputStreamOutputBuffer.doWrite(InternalOutputBuffer.java:240)
at 
org.apache.coyote.http11.filters.ChunkedOutputFilter.doWrite(ChunkedOutputFilter.java:117)
at 
org.apache.coyote.http11.AbstractOutputBuffer.doWrite(AbstractOutputBuffer.java:192)

at org.apache.coyote.Response.doWrite(Response.java:505)
at 
org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:401)

... 30 more

ERROR - 2013-05-23 13:37:19.893; org.apache.solr.common.SolrException; 
null:ClientAbortException: java.net.SocketException: Broken pipe
at 
org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:406)

at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:342)
at 
org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:431)
at 
org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:419)
at 

Re: Distributed query: strange behavior.

2013-05-23 Thread Shawn Heisey
On 5/23/2013 1:51 AM, Luis Cappa Banda wrote:
 I've queried each Solr shard server one by one and the total number of
 documents is correct. However, when I change rows parameter from 10 to 100
 the total numFound of documents change:

I've seen this problem on the list before and the cause has been
determined each time to be caused by documents with the same uniqueKey
value appearing in more than one shard.

What I think happens here:

With rows=10, you get the top ten docs from each of the three shards,
and each shard sends its numFound for that query to the core that's
coordinating the search.  The coordinator adds up numFound, looks
through those thirty docs, and arranges them according to the requested
sort order, returning only the top 10.  In this case, there happen to be
no duplicates.

With rows=100, you get a total of 300 docs.  This time, duplicates are
found and removed by the coordinator.  I think that the coordinator
adjusts the total numFound by the number of duplicate documents it
removed, in an attempt to be more accurate.

I don't know if adjusting numFound when duplicates are found in a
sharded query is the right thing to do, I'll leave that for smarter
people.  Perhaps Solr should return a message with the results saying
that duplicates were found, and if a config option is not enabled, the
server should throw an exception and return a 4xx HTTP error code.  One
idea for a config parameter name would be allowShardDuplicates, but
something better can probably be found.

Thanks,
Shawn



AW: Broken pipe

2013-05-23 Thread André Widhani
This usually happens when the client sending the request to Solr has given up 
waiting for the response (terminated the connection).

In your example, we see that the Solr query time is 81 seconds. Probably the 
client issuing the request has a time-out of maybe 30 or 60 seconds.

André
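
If the client talks to Solr via SolrJ, a minimal sketch of raising the socket
timeout past the longest expected query (the URL and value are illustrative):

  // SolrJ 4.x: socket read timeout in milliseconds
  HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/messages");
  server.setSoTimeout(120000);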


Von: Arkadi Colson [ark...@smartbit.be]
Gesendet: Donnerstag, 23. Mai 2013 15:40
An: solr-user@lucene.apache.org
Betreff: Broken pipe

Any idea why I got a Broken pipe?


Re: Solr 4.3 fails to load MySQL driver

2013-05-23 Thread Shawn Heisey
On 5/23/2013 6:25 AM, Christian Köhler wrote:
 in my attempt to migrate for m 3.6.x to 4.3.0 I stumbled upon an issue
 loading the MySQL driver from the [instance]/lib dir:
 
 Caused by: java.lang.ClassNotFoundException:
 org.apache.solr.handler.dataimport.DataImportHandler

The best thing to do is take the lib directives out of solrconfig.xml
and put your extra jars in ${solr.solr.home}/lib, where solr.solr.home
is the directory where solr.xml lives.  NB: There might be two solr.xml
files in your setup; if there are, one of them tells your
servlet container how to start solr, while the correct file tells solr
about cores.

Normally, you can set up another global lib directory, absolute or
relative to solr.solr.home, with the sharedLib attribute in solr.xml,
but that doesn't work in 4.3.0 - only ${solr.solr.home}/lib works in
that specific version.  Here's the bug report:

https://issues.apache.org/jira/browse/SOLR-4791

I discovered another glitch last night in the 4.4 development version
and filed a bug report, but I've been informed that I've been doing it
wrong for the last couple of years:

https://issues.apache.org/jira/browse/SOLR-4852

Thanks,
Shawn



Problem with document routing with Solr 4.2.1

2013-05-23 Thread Jean-Sebastien Vachon
Hi All,

I just started indexing data in my brand new Solr Cloud running on 4.2.1.
Since I am a big user of the grouping feature, I need to route my documents on 
the proper shard.
Following the instruction found here:
http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+SolrCloud

I set my document id to something like this  'fieldA!id' where fieldA is the 
key I want to use to distribute my documents.
(All documents with the same value for fieldA will be sent to the same shard).

When I query my index, I can see that the number of documents increase but 
there are no fields at all in the index.

http://10.0.5.211:8201/solr/Current/select?q=*:*

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">11</int>
    <lst name="params">
      <str name="q">*:*</str>
    </lst>
  </lst>
  <result name="response" numFound="26318" start="0" maxScore="1.0"/>
</response>

Specifying fields in the 'fl' parameter does nothing.

What am I doing wrong?
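
For context, a minimal SolrJ sketch of the composite-id scheme described above
(ZooKeeper host, collection, field names and values are all illustrative):

  CloudSolrServer cloudServer = new CloudSolrServer("zkhost:2181");
  cloudServer.setDefaultCollection("Current");
  SolrInputDocument doc = new SolrInputDocument();
  doc.addField("id", "fieldAvalue!12345");   // the part before '!' picks the shard
  cloudServer.add(doc);
  cloudServer.commit();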


Re: Problem with document routing with Solr 4.2.1

2013-05-23 Thread Shalin Shekhar Mangar
That's strange. The default value of rows param is 10 so you should be
getting 10 results back unless your StandardRequestHandler config in
solrconfig has set rows to 0 or if none of your fields are stored.


On Thu, May 23, 2013 at 7:40 PM, Jean-Sebastien Vachon 
jean-sebastien.vac...@wantedanalytics.com wrote:

 Hi All,

 I just started indexing data in my brand new Solr Cloud running on 4.2.1.
 Since I am a big user of the grouping feature, I need to route my
 documents on the proper shard.
 Following the instruction found here:

 http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+SolrCloud

 I set my document id to something like this  'fieldA!id' where fieldA is
 the key I want to use to distribute my documents.
 (All documents with the same value for fieldA will be sent to the same
 shard).

 When I query my index, I can see that the number of documents increase but
 there are no fields at all in the index.

 http://10.0.5.211:8201/solr/Current/select?q=*:*

  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">11</int>
      <lst name="params">
        <str name="q">*:*</str>
      </lst>
    </lst>
    <result name="response" numFound="26318" start="0" maxScore="1.0"/>
  </response>

 Specifying fields in the 'fl' parameter does nothing.

 What am I doing wrong?




-- 
Regards,
Shalin Shekhar Mangar.


RE: Bug in spellcheck.alternativeTermCount

2013-05-23 Thread Dyer, James
Can you give instructions on how to reproduce problem?

James Dyer
Ingram Content Group
(615) 213-4311


-Original Message-
From: Rounak Jain [mailto:rouna...@gmail.com] 
Sent: Thursday, May 23, 2013 7:36 AM
To: solr-user@lucene.apache.org
Subject: Bug in spellcheck.alternativeTermCount

I was playing around with spellcheck.alternativeTermCount and noticed that
if it is set to zero, Solr gives an exception with certain queries. Maybe
the value isn't supposed to be zero, but I don't think an exception is the
expected behaviour.

Rounak
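
A request of the following shape exercises the parameter in question (the
handler name and query are illustrative, based on the stock /spell
configuration in the example solrconfig.xml):

  http://localhost:8983/solr/spell?q=delll&spellcheck=true&spellcheck.alternativeTermCount=0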



Re: Broken pipe

2013-05-23 Thread Alexandre Rafalovitch
Also happens (same reason) if you are behind a smart load-balancer and
it decides to time out and fail over.

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Thu, May 23, 2013 at 9:59 AM, André Widhani andre.widh...@digicol.de wrote:
 This usually happens when the client sending the request to Solr has given up 
 waiting for the response (terminated the connection).

 In your example, we see that the Solr query time is 81 seconds. Probably the 
 client issuing the request has a time-out of maybe 30 or 60 seconds.

 André

 
 Von: Arkadi Colson [ark...@smartbit.be]
 Gesendet: Donnerstag, 23. Mai 2013 15:40
 An: solr-user@lucene.apache.org
 Betreff: Broken pipe

 Any idea why I got a Broken pipe?


Re: Restaurant availability from database

2013-05-23 Thread Alexandre Rafalovitch
Check out Gilt's presentation. It might give you some ideas, including
possibly on refactoring your entities around 'availability' as a
document:
http://www.lucenerevolution.org/sites/default/files/Personalized%20Search%20on%20the%20Largest%20Flash%20Sale%20Site%20in%20America.pdf

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Thu, May 23, 2013 at 8:36 AM, rajh ron...@trimm.nl wrote:
 Hi,

 I am building a website that lists restaurant information, and I would also
 like to include availability information.

 I've created a custom ValueSourceParser and ValueSource that retrieve the
 availability information from a MySQL database. An example query is as
 follows.

 http://localhost:8983/solr/collection1/select?q=restaurant_id:*&fl=*,available:availability(2013-05-23,
 2, 1700, 2359)

 This results in a pseudo (boolean) field available per result document, and
 this works as expected. But my problem is that I also need the total number
 of available restaurants.

 Is there a way to count the number of available restaurants over the whole
 result set? I tried the stats component, but it doesn't seem to work with
 pseudo fields.

 Thanks in advance,

 Ronald





 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Restaurant-availability-from-database-tp4065609.html
 Sent from the Solr - User mailing list archive at Nabble.com.


RE: Problem with document routing with Solr 4.2.1

2013-05-23 Thread Jean-Sebastien Vachon
I know. If I stop routing the documents and simply use a standard 'id' field 
then I get my fields back. 
I forgot to tell you how the collection was created.

http://localhost:8201/solr/admin/collections?action=CREATE&name=Current&numShards=15&replicationFactor=3&maxShardsPerNode=9

Since I am using the numShards parameter, composite routing should be 
working... unless I misunderstood something.

-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] 
Sent: May-23-13 10:27 AM
To: solr-user@lucene.apache.org
Subject: Re: Problem with document routing with Solr 4.2.1

That's strange. The default value of rows param is 10 so you should be 
getting 10 results back unless your StandardRequestHandler config in solrconfig 
has set rows to 0 or if none of your fields are stored.


On Thu, May 23, 2013 at 7:40 PM, Jean-Sebastien Vachon  
jean-sebastien.vac...@wantedanalytics.com wrote:

 Hi All,

 I just started indexing data in my brand new Solr Cloud running on 4.2.1.
 Since I am a big user of the grouping feature, I need to route my 
 documents on the proper shard.
 Following the instruction found here:

 http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+So
 lrCloud

 I set my document id to something like this  'fieldA!id' where fieldA 
 is the key I want to use to distribute my documents.
 (All documents with the same value for fieldA will be sent to the same 
 shard).

 When I query my index, I can see that the number of documents increase 
 but there are no fields at all in the index.

 http://10.0.5.211:8201/solr/Current/select?q=*:*

  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">11</int>
      <lst name="params">
        <str name="q">*:*</str>
      </lst>
    </lst>
    <result name="response" numFound="26318" start="0" maxScore="1.0"/> 
  </response>

 Specifying fields in the 'fl' parameter does nothing.

 What am I doing wrong?




--
Regards,
Shalin Shekhar Mangar.



Core admin action CREATE fails for existing core

2013-05-23 Thread André Widhani
It seems to me that the behavior of the Core admin action CREATE has changed 
when going from Solr 4.1 to 4.3.

With 4.1, I could re-configure an existing core (changing path/name to 
solrconfig.xml for example). In 4.3, I get an error message:

  SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 
'core-tex69b6iom1djrbzmlmg83-index2': Core with name 
'core-tex69b6iom1djrbzmlmg83-index2' already exists.

Is this change intended?

André



Re: Shardsplitting

2013-05-23 Thread Shalin Shekhar Mangar
Hi Arkadi,

It does not matter where you invoke that command because ultimately that
command is executed by the Overseer node. That being said, shard splitting
has some bugs whose fixes will be released with Solr 4.3.1 so I'd suggest
that you wait until then to use this feature.


On Thu, May 23, 2013 at 6:09 PM, Arkadi Colson ark...@smartbit.be wrote:

 Hi

 When having a collection with 3 shards en 2 replica's for each shard and I
 want to split shard1. Does it matter where to start the splitshard command
 in the cloud or should it be started on the master of that shard?

 BR,
 Arkadi





-- 
Regards,
Shalin Shekhar Mangar.


Re: OPENNLP current patch compiling problem for 4.x branch

2013-05-23 Thread Steve Rowe
Hi Patrick,

I think you should check out and apply the patch to branch_4x, rather than the 
lucene_solr_4_3_0 tag:

http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x

Steve

On May 23, 2013, at 2:08 AM, Patrick Mi patrick...@touchpointgroup.com wrote:

 Hi,
 
 I checked out from here
 http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_3_0 and
 downloaded the latest patch LUCENE-2899-current.patch.
 
 Applied the patch ok but when I did 'ant compile' I got the following error:
 
 
 ==
[javac]
 /home/lucene_solr_4_3_0/lucene/analysis/opennlp/src/java/org/apache/lucene/a
 nalysis/opennlp/FilterPayloadsFilter.java:43: error
 r: cannot find symbol
[javac] super(Version.LUCENE_44, input);
[javac]  ^
[javac]   symbol:   variable LUCENE_44
[javac]   location: class Version
[javac] 1 error
 ==
 
 Compiled it on trunk without problem.
 
 Is this patch supposed to work for 4.X?
 
 Regards,
 Patrick 
 



RE: Problem with document routing with Solr 4.2.1

2013-05-23 Thread Jean-Sebastien Vachon
If it helps: adding distrib=false or shard.keys= gives back results.
results.


-Original Message-
From: Jean-Sebastien Vachon [mailto:jean-sebastien.vac...@wantedanalytics.com] 
Sent: May-23-13 10:39 AM
To: solr-user@lucene.apache.org
Subject: RE: Problem with document routing with Solr 4.2.1

I know. If a stop routing the documents and simply use a standard 'id' field 
then I am getting back my fields. 
I forgot to tell you how the collection was created.

http://localhost:8201/solr/admin/collections?action=CREATE&name=Current&numShards=15&replicationFactor=3&maxShardsPerNode=9

Since I am using the numShards parameter, composite routing should be 
working... unless I misunderstood something.

-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
Sent: May-23-13 10:27 AM
To: solr-user@lucene.apache.org
Subject: Re: Problem with document routing with Solr 4.2.1

That's strange. The default value of rows param is 10 so you should be 
getting 10 results back unless your StandardRequestHandler config in solrconfig 
has set rows to 0 or if none of your fields are stored.


On Thu, May 23, 2013 at 7:40 PM, Jean-Sebastien Vachon  
jean-sebastien.vac...@wantedanalytics.com wrote:

 Hi All,

 I just started indexing data in my brand new Solr Cloud running on 4.2.1.
 Since I am a big user of the grouping feature, I need to route my 
 documents on the proper shard.
 Following the instruction found here:

 http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+So
 lrCloud

 I set my document id to something like this  'fieldA!id' where fieldA 
 is the key I want to use to distribute my documents.
 (All documents with the same value for fieldA will be sent to the same 
 shard).

 When I query my index, I can see that the number of documents increase 
 but there are no fields at all in the index.

 http://10.0.5.211:8201/solr/Current/select?q=*:*

  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">11</int>
      <lst name="params">
        <str name="q">*:*</str>
      </lst>
    </lst>
    <result name="response" numFound="26318" start="0" maxScore="1.0"/> 
  </response>

 Specifying fields in the 'fl' parameter does nothing.

 What am I doing wrong?




--
Regards,
Shalin Shekhar Mangar.



Re: Core admin action CREATE fails for existing core

2013-05-23 Thread Mark Miller
Yes, this did change - it's actually a protection for a previous change though.

There was a time when you did a core reload by just making a new core with the 
same name and closing the old core - that is no longer really supported though 
- the proper way to do this is to use SolrCore#reload, and that has been the 
case for all of 4.x release if I remember right. I supported making this change 
to force people who might still be doing what is likely quite a buggy operation 
to switch to the correct code.

Sorry about the inconvenience.

- Mark
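
For reference, the CoreAdmin request that exercises that reload path (host and
port are illustrative; the core name is taken from the error above):

  http://localhost:8983/solr/admin/cores?action=RELOAD&core=core-tex69b6iom1djrbzmlmg83-index2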

On May 23, 2013, at 10:45 AM, André Widhani andre.widh...@digicol.de wrote:

 It seems to me that the behavior of the Core admin action CREATE has 
 changed when going from Solr 4.1 to 4.3.
 
 With 4.1, I could re-configure an existing core (changing path/name to 
 solrconfig.xml for example). In 4.3, I get an error message:
 
  SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 
 'core-tex69b6iom1djrbzmlmg83-index2': Core with name 
 'core-tex69b6iom1djrbzmlmg83-index2' already exists.
 
 Is this change intended?
 
 André
 



Re: fq facet on double and non-indexed field

2013-05-23 Thread Raymond Wiker
On May 23, 2013, at 14:25 , gpssolr2020 psgoms...@gmail.com wrote:
 Thanks Erick..
 
 
 I suppose we can't do q on a non-indexed field either.
 
 What is the difference between q and fq, other than caching?
 
 
 
 Thanks.


How do you expect to search on a field that is non-indexed (and thus 
non-searchable)?

RE: .skip.autorecovery=Y + restart solr after crash + losing many documents

2013-05-23 Thread Gilles Comeau
Hi Otis, 

Thank you for your reply.  I'm in the middle of that upgrade and will report 
back when testing is complete.   I'd like to put together a nice set of reproducible 
steps so I'm not just ranting. :)   

Regards,

Gilles

-Original Message-
From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] 
Sent: 20 May 2013 04:29
To: solr-user@lucene.apache.org
Subject: Re: .skip.autorecovery=Y + restart solr after crash + losing many 
documents

Hi Gilles,

Could you upgrade to 4.3.0 and see if you can reproduce?

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Mon, May 13, 2013 at 5:26 PM, Gilles Comeau gilles.com...@polecat.co wrote:
 Hi all,

 We write to two same-named cores in the same collection for redundancy, and 
 are not taking advantage of the full benefits of solr cloud replication.

 We use solrcloud.skip.autorecovery=true so that Solr doesn't try to sync the 
 indexes when it starts up.

 However, we find that if the core is not optimized prior to shutting it down 
 (in a crash situation), we can lose all of the data after starting up.   The 
 files are written to disk, but we can lose a full 24 hours worth of data as 
 they are all removed when we start SOLR.  (I don't think it is a commit issue)

 If we optimize before shutting down, we never lose any data.   Sadly, 
 sometimes SOLR is in a state where optimizing is not an option.

 Can anyone think of why that might be?   Is there any special configuration 
 you need if you want to write directly to two cores rather than use 
 replication?   Version 4.0, this used to work in our 4.0 nightly build, but 
 broke when we migrated to 4.0 production.(until we test and migrate to 
 the replication setup - it won't be too long and I'm a bit embarrassed to be 
 asking this question!)

 Regards,

 Gilles



Re: Core admin action CREATE fails for existing core

2013-05-23 Thread Alan Woodward
I think the wiki needs to be updated to reflect this?  
http://wiki.apache.org/solr/CoreAdmin

If somebody adds me as an editor (AlanWoodward), I'll do it.

Alan Woodward
www.flax.co.uk


On 23 May 2013, at 16:43, Mark Miller wrote:

 Yes, this did change - it's actually a protection for a previous change 
 though.
 
 There was a time when you did a core reload by just making a new core with 
 the same name and closing the old core - that is no longer really supported 
 though - the proper way to do this is to use SolrCore#reload, and that has 
 been the case for all of 4.x release if I remember right. I supported making 
 this change to force people who might still be doing what is likely quite a 
 buggy operation to switch to the correct code.
 
 Sorry about the inconvenience.
 
 - Mark
 
 On May 23, 2013, at 10:45 AM, André Widhani andre.widh...@digicol.de wrote:
 
 It seems to me that the behavior of the Core admin action CREATE has 
 changed when going from Solr 4.1 to 4.3.
 
 With 4.1, I could re-configure an existing core (changing path/name to 
 solrconfig.xml for example). In 4.3, I get an error message:
 
 SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 
 'core-tex69b6iom1djrbzmlmg83-index2': Core with name 
 'core-tex69b6iom1djrbzmlmg83-index2' already exists.
 
 Is this change intended?
 
 André
 
 



Re: Core admin action CREATE fails for existing core

2013-05-23 Thread Steve Rowe
Alan, I've added AlanWoodward to the Solr AdminGroup page.

On May 23, 2013, at 12:29 PM, Alan Woodward a...@flax.co.uk wrote:

 I think the wiki needs to be updated to reflect this?  
 http://wiki.apache.org/solr/CoreAdmin
 
 If somebody adds me as an editor (AlanWoodward), I'll do it.
 
 Alan Woodward
 www.flax.co.uk
 
 
 On 23 May 2013, at 16:43, Mark Miller wrote:
 
 Yes, this did change - it's actually a protection for a previous change 
 though.
 
 There was a time when you did a core reload by just making a new core with 
 the same name and closing the old core - that is no longer really supported 
 though - the proper way to do this is to use SolrCore#reload, and that has 
 been the case for all of 4.x release if I remember right. I supported making 
 this change to force people who might still be doing what is likely quite a 
 buggy operation to switch to the correct code.
 
 Sorry about the inconvenience.
 
 - Mark
 
 On May 23, 2013, at 10:45 AM, André Widhani andre.widh...@digicol.de wrote:
 
 It seems to me that the behavior of the Core admin action CREATE has 
 changed when going from Solr 4.1 to 4.3.
 
 With 4.1, I could re-configure an existing core (changing path/name to 
 solrconfig.xml for example). In 4.3, I get an error message:
 
 SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 
 'core-tex69b6iom1djrbzmlmg83-index2': Core with name 
 'core-tex69b6iom1djrbzmlmg83-index2' already exists.
 
 Is this change intended?
 
 André
 
 
 



Re: Restaurant availability from database

2013-05-23 Thread rajh
Thank you for your answer.

Do you mean I should index the availability data as a document in Solr?
Because the availability data in our databases is around 6,509,972 records
and contains the availability per number of seats and per 15 minutes. I also
tried this method, and as far as I know it's only possible to join the
availability documents and not to include that information per result
document.

An example API response (created from the Solr response):
{
    "restaurants": [
        {
            "id": 13906,
            "name": "Allerlei",
            "zipcode": "6511DP",
            "house_number": 59,
            "available": true
        },
        {
            "id": 13907,
            "name": "Voorbeeld",
            "zipcode": "6512DP",
            "house_number": 39,
            "available": false
        }
    ],
    "resultCount": 12156,
    "resultCountAvailable": 55
}

I'm currently hacking around the problem by executing the search again with
a very high value for the rows parameter and counting the number of
available restaurants on the backend, but this causes a big performance
impact (as expected).




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Restaurant-availability-from-database-tp4065609p4065710.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Core admin action CREATE fails for existing core

2013-05-23 Thread Alan Woodward
Thanks!

Alan Woodward
www.flax.co.uk


On 23 May 2013, at 17:38, Steve Rowe wrote:

 Alan, I've added AlanWoodward to the Solr AdminGroup page.
 
 On May 23, 2013, at 12:29 PM, Alan Woodward a...@flax.co.uk wrote:
 
 I think the wiki needs to be updated to reflect this?  
 http://wiki.apache.org/solr/CoreAdmin
 
 If somebody adds me as an editor (AlanWoodward), I'll do it.
 
 Alan Woodward
 www.flax.co.uk
 
 
 On 23 May 2013, at 16:43, Mark Miller wrote:
 
 Yes, this did change - it's actually a protection for a previous change 
 though.
 
 There was a time when you did a core reload by just making a new core with 
 the same name and closing the old core - that is no longer really supported 
 though - the proper way to do this is to use SolrCore#reload, and that has 
 been the case for all of 4.x release if I remember right. I supported 
 making this change to force people who might still be doing what is likely 
 quite a buggy operation to switch to the correct code.
 
 Sorry about the inconvenience.
 
 - Mark
 
 On May 23, 2013, at 10:45 AM, André Widhani andre.widh...@digicol.de 
 wrote:
 
 It seems to me that the behavior of the Core admin action CREATE has 
 changed when going from Solr 4.1 to 4.3.
 
 With 4.1, I could re-configure an existing core (changing path/name to 
 solrconfig.xml for example). In 4.3, I get an error message:
 
 SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 
 'core-tex69b6iom1djrbzmlmg83-index2': Core with name 
 'core-tex69b6iom1djrbzmlmg83-index2' already exists.
 
 Is this change intended?
 
 André
 
 
 
 



Re: Solr 4.3 fails to load MySQL driver

2013-05-23 Thread Christian Köhler - ganzgraph gmbh

Hi,

thanx for pointing this out to me.

1152 [coreLoadExecutor-3-thread-1] INFO  org.apache.solr.core.SolrConfig 
 – Adding specified lib dirs to ClassLoader
org.apache.solr.core.SolrResourceLoader  – Adding 
'file:/home/christian/zfmk/solr/solr-4.3.0/example/lib/mysql-connector-java-5.1.25-bin.jar' 
to classloader


The mysql-connector-java DOES get loaded, but is not available to
org.apache.solr.core.SolrResourceLoader.findClass

Has the syntax for configuring a dataimport handler changed?

solrconfig.xml:
---
  <requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
  </requestHandler>

data-config.xml:

<dataConfig>
  <dataSource type="JdbcDataSource"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/koehler_zfmk"
user="my_user"
password="secret"/>

  <document name="content">
  <entity name="rawidentificationid"
query="SELECT * FROM foobar;">
  </entity>
  </document>
</dataConfig>

I use this configuration successfully with 3.6

Regards
Chris


Am 23.05.2013 14:39, schrieb Jack Krupansky:

Check the Solr log on startup - it will explicitly state which lib
directories/files will be used. Make sure they agree with where the DIH
jars reside. Keep in mind that the directory structure of Solr changed -
use the lib from 4.3 solrconfig.

Try to use DIH in the standard Solr 4.3 example first. Then mimic that
in your customization.

-- Jack Krupansky

-Original Message- From: Christian Köhler
Sent: Thursday, May 23, 2013 8:25 AM
To: solr-user@lucene.apache.org
Subject: Solr 4.3 fails to load MySQL driver


Hi,

in my attempt to migrate from 3.6.x to 4.3.0 I stumbled upon an issue
loading the MySQL driver from the [instance]/lib dir:

Caused by: java.lang.ClassNotFoundException:
org.apache.solr.handler.dataimport.DataImportHandler
  at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
  at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:266)
  at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:448)

 ... 18 more

To narrow it down, I use the plain example configuration with the
following changes:

- Add a dataimport requestHandler to example/conf/solrconfig.xml
   (copied from a working solr 3.6.x)
- Created example/conf/data-config.xml with
   <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" ...
   and SQL statement (both copied from a working solr 3.6.x)
- placed the current driver mysql-connector-java-5.1.25-bin.jar in
   example/lib

As to my knowledge the lib dir is included automatically to the path. To
make sure I tried to:

- add <lib dir="./lib" /> explicitly to solrconfig.xml
- add an absolute path to solrconfig.xml
- changed solr.xml to use <solr persistent="true" sharedLib="lib">

All to no avail.

System Info:
- OpenJDK Runtime Environmentm 1.7.0_19
- Solr 4.3.0
- mysql-connector-java-5.1.25-bin.jar

The same configuration run fine with a solr 3.6.x on the very same machine.

Any help is appreciated!
Cheers
Chris






--
Christian Köhler

ganzgraph gmbh
Bornheimer Straße 37
53111 Bonn

koeh...@ganzgraph.de
http://www.ganzgraph.de/

Tel.: +49-(0)228-227 99 400
Fax : +49-(0)228-227 99 409

Geschäftsführer: Christian Köhler, Thorsten Orth
Unternehmenssitz: Bonn
Handelsregister-Nummer: HRB 19066 beim Amtsgericht: Bonn
UstId-Nr: DE 280482111


Re: Solr 4.3 fails to load MySQL driver

2013-05-23 Thread Chris Hostetter

: in my attempt to migrate from 3.6.x to 4.3.0 I stumbled upon an issue loading
: the MySQL driver from the [instance]/lib dir:
: 
: Caused by: java.lang.ClassNotFoundException:
: org.apache.solr.handler.dataimport.DataImportHandler

one of us is mistaken about what that error means.  you say it means that 
the MySQL driver isn't being loaded, but nothing in your mail suggests 
to me that there is a problem loading the MySQL driver.  what i see is 
that Solr can't seem to load the DIH class, suggesting that the 
dataimporthandler jar is not getting loaded.  

There may or may not also be a problem loading the MySQL driver, but 
nothing is even going to attempt to do so unless Solr can successfully 
construct an instance of the DataImportHandler.

So unless there are more details to your error that start mentioning the 
MySql classes, i would check your lib settings for loading the DIH jars 
and make sure those are right.


-Hoss
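
As a point of comparison, the stock 4.x example solrconfig.xml loads the DIH
jar with a directive along these lines (the relative path is an assumption
that depends on where the distribution is unpacked, relative to the core's
instance dir):

  <lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />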


Re: Fast faceting over large number of distinct terms

2013-05-23 Thread David Larochelle
Interesting solution. My concern is how to select the most frequent terms
in the story_text field in a way that would make sense to the user. Only
including the X most common non-stopword terms in a document could easily
cause important patterns to be missed. There's a similar issue with only
returning counts for terms in the top N documents matching a particular
query.

Also, is there an efficient way to add up term counts on the client side? I
thought of using the TermVectorComponent to get document level frequency
counts and then using something like Hadoop to add them up. However, I
couldn't find any documentation on using the results of a solr query to
feed a map reduce operation.
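
As a data point for the client-side counting route: the stock 4.x example solrconfig ships a lazy /tvrh handler wired to the TermVectorComponent, so a sketch of pulling per-document term frequencies for a query's matches (handler name and parameters as in that example config) would be:

http://localhost:8983/solr/tvrh?q=story_text:iraq&fl=id&tv.fl=story_text&tv.tf=true&rows=100&wt=json

Each returned document then carries a termVectors section with raw tf values that a client, or a Hadoop mapper walking the paged responses, can sum.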

--

David


On Wed, May 22, 2013 at 11:12 PM, Otis Gospodnetic 
otis.gospodne...@gmail.com wrote:

 Here's a possibility:

 At index time extract important terms (and/or phrases) from this
 story_text and store top N of them in a separate field (which will be
 much smaller/shorter).  Then facet on that.  Or just retrieve it and
 manually parse and count in the client if that turns out to be faster.
 I did this in the previous decade before Solr was available and it
 worked well.  I limited my counting to top N (200?) hits.

 Otis
 --
 Solr & ElasticSearch Support
 http://sematext.com/





 On Wed, May 22, 2013 at 10:54 PM, David Larochelle
 dlaroche...@cyber.law.harvard.edu wrote:
  The goal of the system is to obtain data that can be used to generate
 word
  clouds so that users can quickly get a sense of the aggregate contents of
  all documents matching a particular query. For example, a user might want
  to see a word cloud of all documents discussing 'Iraq' in particular
  newspapers.
 
  Faceting on story_text gives counts of individual words rather than
 entire
  text strings. I think this is because of the tokenization that happens
  automatically as part of the text_general type. I'm happy to look at
  alternatives to faceting but I wasn't able to find one that
  provided aggregate word counts for just the documents matching a
 particular
  query rather than an individual documents  or the entire index.
 
  --
 
  David
 
 
  On Wed, May 22, 2013 at 10:32 PM, Brendan Grainger 
  brendan.grain...@gmail.com wrote:
 
  Hi David,
 
  Out of interest, what are you trying to accomplish by faceting over the
  story_text field? Is it generally the case that the story_text field
 will
  contain values that are repeated or categorize your documents somehow?
   From your description: story_text is used to store free-form text
  obtained by crawling newspapers and blogs, it doesn't seem that way, so
  I'm not sure faceting is what you want in this situation.
 
  Cheers,
  Brendan
 
 
  On Wed, May 22, 2013 at 9:49 PM, David Larochelle 
  dlaroche...@cyber.law.harvard.edu wrote:
 
   I'm trying to quickly obtain cumulative word frequency counts over all
   documents matching a particular query.
  
   I'm running in Solr 4.3.0 on a machine with 16GB of ram. My index is
 2.5
  GB
   and has around ~350,000 documents.
  
   My schema includes the following fields:
  
    <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
    <field name="media_id" type="int" indexed="true" stored="true" required="true" multiValued="false" />
    <field name="story_text" type="text_general" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true" />
  
  
    story_text is used to store free-form text obtained by crawling
   newspapers
   and blogs.
  
   Running faceted searches with the fc or fcs methods fails with the
 error
   Too many values for UnInvertedField faceting on field story_text
  
  
 
 http://localhost:8983/solr/query?q=id:106714828_6621&facet=true&facet.limit=10&facet.pivot=publish_date,story_text&rows=0&facet.method=fcs
  
   Running faceted search with the 'enum' method succeeds but takes a
 very
   long time.
  
  
 
 http://localhost:8983/solr/query?q=includes:foobar&facet=true&facet.limit=100&facet.pivot=media_id,includes&facet.method=enum&rows=0
   
  
 
 http://localhost:8983/solr/query?q=includes:mccain&facet=true&facet.limit=100&facet.pivot=media_id,includes&facet.method=enum&rows=0
   
  
   The frustrating thing is even if the query only returns a few hundred
   documents, it still takes 10 minutes or longer to get the cumulative
 word
   count results.
  
   Eventually we're hoping to build a system that will return results in
 a
  few
   seconds and scale to hundreds of millions of documents.
   Is there anyway to get this level of performance out of Solr/Lucene?
  
   Thanks,
  
   David
  
 
 
 
  --
  Brendan Grainger
  www.kuripai.com
 



Re: Upgrading from SOLR 3.5 to 4.2.1 Results.

2013-05-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
Actually, it's pretty high end for most users. Rishi, you can post
the real h/w details and our typical deployment:
No. of CPUs per node
No. of disks per host
VMs per host
GC params
No. of cores per instance

Noble Paul
Sent from phone
On 21 May 2013 01:47, Rishi Easwaran rishi.easwa...@aol.com wrote:

 No, we just upgraded to 4.2.1.
 With the size of our complex and the effort required to apply our patches and
 roll out, our upgrades are not that frequent.






 -Original Message-
 From: Noureddine Bouhlel nouredd...@ecotour.com
 To: solr-user solr-user@lucene.apache.org
 Sent: Mon, May 20, 2013 3:36 pm
 Subject: Re: Upgrading from SOLR 3.5 to 4.2.1 Results.


 Hi Rishi,

 Have you done any tests with Solr 4.3 ?

 Regards,


 Cordialement,

 BOUHLEL Noureddine



 On 17 May 2013 21:29, Rishi Easwaran rishi.easwa...@aol.com wrote:

 
 
  Hi All,
 
  It's Friday 3:00 pm, warm and sunny outside, and it was a good week. Figured
  I'd share some good news.
  I work for AOL mail team and we use SOLR for our mail search backend.
  We have been using it since pre-SOLR 1.4 and strong supporters of SOLR
  community.
  We deal with millions of indexes and billions of requests a day across our
  complex.
  We finished full rollout of SOLR 4.2.1 into our production last week.
 
  Some key highlights:
  - ~75% Reduction in Search response times
  - ~50% Reduction in SOLR disk busy, which in turn helped with ~90%
  Reduction in errors
  - Garbage collection total stop reduction by over 50% moving application
  throughput into the 99.8% - 99.9% range
  - ~15% reduction in CPU usage
 
  We did not tune our application moving from 3.5 to 4.2.1 nor update java.
  For the most part it was a binary upgrade, with patches for our special
  use case.
 
  Now going forward we are looking at prototyping SOLR Cloud for our search
  system, upgrade java and tomcat, tune our application further. Lots of
 fun
  stuff :)
 
  Have a great weekend everyone.
  Thanks,
 
  Rishi.
 
 
 
 
 





Re: Solr 4.3 fails to load MySQL driver

2013-05-23 Thread Christian Köhler

Hi


one of us is mistaken about what that error means.  you say it means that
the MySQL driver isn't being loaded, but nothing in your mail suggests
to me that there is a problem loading the MySQL driver.  what i see is
that Solr can't seem to load the DIH class, suggesting that the
dataimporthandler jar is not getting loaded.


I corrected myself in my last mail: the MySQL driver IS loaded (thanx 
for pointing out to me where to look).



There may or may not also be a problem loading the MySQL driver, but


I only SUSPECT the MySQL driver of being the culprit for the 
dataimporthandler jar not getting loaded. Not sure!


 MySql classes, i would check your lib settings for loading the DIH
 jars

I am not using DIH. IMHO it's just the plain example code in
solr-4.3.0/example/solr/collection1/ that is being called.

I include the full trace back to clarify my problem (hopefully)

Cheers
Chris


/home/solr-4.3.0/example# java -jar start.jar
0[main] INFO  org.eclipse.jetty.server.Server  – jetty-8.1.8.v20121106
19   [main] INFO  org.eclipse.jetty.deploy.providers.ScanningAppProvider 
 – Deployment monitor /home/solr/solr-4.3.0/example/contexts at interval 0
24   [main] INFO  org.eclipse.jetty.deploy.DeploymentManager  – 
Deployable added: 
/home/solr/solr-4.3.0/example/contexts/solr-jetty-context.xml
653  [main] INFO  org.eclipse.jetty.webapp.StandardDescriptorProcessor 
– NO JSP Support for /solr, did not find 
org.apache.jasper.servlet.JspServlet

Null identity service, trying login service: null
Finding identity service: null
674  [main] INFO  org.eclipse.jetty.server.handler.ContextHandler  – 
started 
o.e.j.w.WebAppContext{/solr,file:/home/solr/solr-4.3.0/example/solr-webapp/webapp/},/home/solr/solr-4.3.0/example/webapps/solr.war
674  [main] INFO  org.eclipse.jetty.server.handler.ContextHandler  – 
started 
o.e.j.w.WebAppContext{/solr,file:/home/solr/solr-4.3.0/example/solr-webapp/webapp/},/home/solr/solr-4.3.0/example/webapps/solr.war
688  [main] INFO  org.apache.solr.servlet.SolrDispatchFilter  – 
SolrDispatchFilter.init()
703  [main] INFO  org.apache.solr.core.SolrResourceLoader  – JNDI not 
configured for solr (NoInitialContextEx)
704  [main] INFO  org.apache.solr.core.SolrResourceLoader  – solr home 
defaulted to 'solr/' (could not find system property or JNDI)
713  [main] INFO  org.apache.solr.core.CoreContainer  – looking for solr 
config file: /home/solr/solr-4.3.0/example/solr/solr.xml
715  [main] INFO  org.apache.solr.core.CoreContainer  – New 
CoreContainer 1857140958
716  [main] INFO  org.apache.solr.core.CoreContainer  – Loading 
CoreContainer using Solr Home: 'solr/'
716  [main] INFO  org.apache.solr.core.SolrResourceLoader  – new 
SolrResourceLoader for directory: 'solr/'
962  [main] INFO  org.apache.solr.core.CoreContainer  – loading shared 
library: /home/solr/solr-4.3.0/example/solr/lib
962  [main] ERROR org.apache.solr.core.SolrResourceLoader  – Can't find 
(or read) file to add to classloader: solr/lib
971  [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting 
socketTimeout to: 0
973  [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting 
urlScheme to: http://
973  [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting 
connTimeout to: 0
974  [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting 
maxConnectionsPerHost to: 20
974  [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting 
corePoolSize to: 0
974  [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting 
maximumPoolSize to: 2147483647
974  [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting 
maxThreadIdleTime to: 5
974  [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting 
sizeOfQueue to: -1
975  [main] INFO 
org.apache.solr.handler.component.HttpShardHandlerFactory  – Setting 
fairnessPolicy to: false
980  [main] INFO  org.apache.solr.client.solrj.impl.HttpClientUtil  – 
Creating new http client, 
config:maxConnectionsPerHost=20maxConnections=1socketTimeout=0connTimeout=0retry=false
1073 [main] INFO  org.apache.solr.core.CoreContainer  – Registering Log 
Listener
1087 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.CoreContainer  – Creating SolrCore 'collection1' 
using instanceDir: solr/collection1
1088 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader  – new SolrResourceLoader for 
directory: 'solr/collection1/'
1143 [coreLoadExecutor-3-thread-1] INFO  org.apache.solr.core.SolrConfig 
 – Adding specified lib dirs to ClassLoader
1144 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader  – Adding 
'file:/home/solr/solr-4.3.0/example/lib/jetty-util-8.1.8.v20121106.jar' 
to classloader
1144 [coreLoadExecutor-3-thread-1] INFO 
org.apache.solr.core.SolrResourceLoader  – Adding 
'file:/home/solr/solr-4.3.0/example/lib/servlet-api-3.0.jar' to 

RE: Problem with document routing with Solr 4.2.1

2013-05-23 Thread Jean-Sebastien Vachon
I must add that shard.keys= does not return anything on two of my nodes. But 
that is to be expected since I'm using a replication factor of 3 on a cloud of 
5 servers

-Original Message-
From: Jean-Sebastien Vachon [mailto:jean-sebastien.vac...@wantedanalytics.com] 
Sent: May-23-13 11:27 AM
To: solr-user@lucene.apache.org
Subject: RE: Problem with document routing with Solr 4.2.1

If that can help.. adding distrib=false or shard.keys= is giving back 
results.


-Original Message-
From: Jean-Sebastien Vachon [mailto:jean-sebastien.vac...@wantedanalytics.com]
Sent: May-23-13 10:39 AM
To: solr-user@lucene.apache.org
Subject: RE: Problem with document routing with Solr 4.2.1

I know. If I stop routing the documents and simply use a standard 'id' field, 
then I get back my fields. 
I forgot to tell you how the collection was created.

http://localhost:8201/solr/admin/collections?action=CREATE&name=Current&numShards=15&replicationFactor=3&maxShardsPerNode=9

Since I am using the numshards parameter then composite routing should be 
working... unless I misunderstood something
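
For reference, my understanding of the flow (key and id values below are made up): the routing prefix is embedded in the uniqueKey at index time, and the same prefix in shard.keys narrows a query to that shard:

<doc>
  <field name="id">companyA!12345</field>
  ...
</doc>

http://localhost:8201/solr/Current/select?q=*:*&shard.keys=companyA!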

-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
Sent: May-23-13 10:27 AM
To: solr-user@lucene.apache.org
Subject: Re: Problem with document routing with Solr 4.2.1

That's strange. The default value of rows param is 10 so you should be 
getting 10 results back unless your StandardRequestHandler config in solrconfig 
has set rows to 0 or if none of your fields are stored.


On Thu, May 23, 2013 at 7:40 PM, Jean-Sebastien Vachon  
jean-sebastien.vac...@wantedanalytics.com wrote:

 Hi All,

 I just started indexing data in my brand new Solr Cloud running on 4.2.1.
 Since I am a big user of the grouping feature, I need to route my 
 documents on the proper shard.
 Following the instruction found here:

 http://docs.lucidworks.com/display/solr/Shards+and+Indexing+Data+in+So
 lrCloud

 I set my document id to something like this  'fieldA!id' where fieldA 
 is the key I want to use to distribute my documents.
 (All documents with the same value for fieldA will be sent to the same 
 shard).

 When I query my index, I can see that the number of documents increase 
 but there are no fields at all in the index.

 http://10.0.5.211:8201/solr/Current/select?q=*:*

 response
   lst name=responseHeader
   int name=status0/int
   int name=QTime11/int
   lst name=params
   str name=q*:*/str
   /lst
   /lst
   result name=response numFound=26318 start=0 maxScore=1.0/ 
 /response

 Specifying fields in the 'fl' parameter does nothing.

 What am I doing wrong?




--
Regards,
Shalin Shekhar Mangar.



Re: Solr 4.3 fails to load MySQL driver

2013-05-23 Thread Chris Hostetter

: I only SUSPECT the MySQL driver of being the culprit for the dataimporthandler
: jar not getting loaded. Not sure!

the dataimporthandler *class* is not getting loaded because the 
dataimporthandler *jar* is not getting loaded.

:  MySql classes, i would check your lib settings for loading the DIH
:  jars
: 
: I am not using DIH. IMHO its just the plain example code in
: solr-4.3.0/example/solr/collection1/ that is being called.

i'm totally confused ... DIH == DataImportHandler ... it's just an 
acronym, you say you aren't using DIH, but you are having a problem 
loading DIH, so DIH is used in your configs.

: I include the full trace back to clarify my problem (hopefully)

...

: org.apache.solr.core.SolrResourceLoader  – new SolrResourceLoader for
: directory: 'solr/collection1/'
: 1143 [coreLoadExecutor-3-thread-1] INFO  org.apache.solr.core.SolrConfig  –
: Adding specified lib dirs to ClassLoader
: 1144 [coreLoadExecutor-3-thread-1] INFO
: org.apache.solr.core.SolrResourceLoader  – Adding
: 'file:/home/solr/solr-4.3.0/example/lib/jetty-util-8.1.8.v20121106.jar' to
: classloader

...ok, for starters this makes no sense, and may be the cause of 
some problems.  you apparently have your collection1 configs set up to load 
all of the classes from the /home/solr/solr-4.3.0/example/lib 
directory as part of the collection1 classloader.

you really don't want to do that.  It will most likeley cause you all 
sorts of problems, even if it's unrelated to the current problem.


Second, note in particular all of the lines that look like that line above 
-- specifically lines that say org.apache.solr.core.SolrResourceLoader - 
Adding ... to classloader.  besides the ones referring to 
/home/solr/solr-4.3.0/example/lib/ (which is almost certainly not what you 
want) you then have a bunch referring to contrib/extraction and 
contrib/langid, and contrib/velocity -- all of which is great, those 
plugins and their dependencies are now available to use.

but no where does it ever say anything about adding 
contrib/dataimporthandler jars to the classloader.

which means your config isn't setup to load any of the dataimporthandler 
jars as plugins

which means when it's done loading plugins, and it starts to initialize 
things like RequestHandlers, and it finds a reference to the 
DataImportHandler, it doesn't know what that means...

: Caused by: java.lang.ClassNotFoundException:
: org.apache.solr.handler.dataimport.DataImportHandler


if you look at the 4.3 DIH examples, you'll note that 
the only solrconfig.xml files that mention DataImportHandler also 
include lib directives like the following in order to load 
dataimporthandler as a plugin...


  <lib dir="../../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
...
   <requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">


-Hoss

AW: Core admin action CREATE fails for existing core

2013-05-23 Thread André Widhani
Mark, Alan,

thanks for explaining and updating the wiki.

When reloading the core using action=CREATE with Solr 4.1 I could specify the 
path to schema and config. In fact I used this to reconfigure the core to use a 
specific one of two prepared config files depending on some external index 
state (instead of making changes to one and the same config file). 
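
For illustration, the 4.1-era reload-via-CREATE and a plain RELOAD look roughly like this (host and core name are placeholders):

http://localhost:8983/solr/admin/cores?action=CREATE&name=core1&instanceDir=/etc/opt/dcx/solr/&config=/etc/opt/dcx/solr/conf/solrconfig-a.xml&schema=/etc/opt/dcx/solr/conf/schema.xml
http://localhost:8983/solr/admin/cores?action=RELOAD&core=core1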

action=RELOAD does not understand the corresponding request parameters schema 
and config (which is why I used CREATE, not RELOAD in the first place). So 
the functionality to switch to a different config file for an existing core is 
no longer there, I guess?

Thanks,
André


From: Alan Woodward [a...@flax.co.uk]
Sent: Thursday, 23 May 2013 18:43
To: solr-user@lucene.apache.org
Subject: Re: Core admin action CREATE fails for existing core

Thanks!

Alan Woodward
www.flax.co.uk


On 23 May 2013, at 17:38, Steve Rowe wrote:

 Alan, I've added AlanWoodward to the Solr AdminGroup page.

 On May 23, 2013, at 12:29 PM, Alan Woodward a...@flax.co.uk wrote:

 I think the wiki needs to be updated to reflect this?  
 http://wiki.apache.org/solr/CoreAdmin

 If somebody adds me as an editor (AlanWoodward), I'll do it.

 Alan Woodward
 www.flax.co.uk


 On 23 May 2013, at 16:43, Mark Miller wrote:

 Yes, this did change - it's actually a protection for a previous change 
 though.

 There was a time when you did a core reload by just making a new core with 
 the same name and closing the old core - that is no longer really supported 
 though - the proper way to do this is to use SolrCore#reload, and that has 
 been the case for all of 4.x release if I remember right. I supported 
 making this change to force people who might still be doing what is likely 
 quite a buggy operation to switch to the correct code.

 Sorry about the inconvenience.

 - Mark

 On May 23, 2013, at 10:45 AM, André Widhani andre.widh...@digicol.de 
 wrote:

 It seems to me that the behavior of the Core admin action CREATE has 
 changed when going from Solr 4.1 to 4.3.

 With 4.1, I could re-configure an existing core (changing path/name to 
 solrconfig.xml for example). In 4.3, I get an error message:

 SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 
 'core-tex69b6iom1djrbzmlmg83-index2': Core with name 
 'core-tex69b6iom1djrbzmlmg83-index2' already exists.

 Is this change intended?

 André







Re: Core admin action CREATE fails for existing core

2013-05-23 Thread Mark Miller
You're right - that does seem to be a new limitation. Could you create a JIRA 
issue for it?

It would be fairly simple to add another reload method that also took the name 
of a new solrconfig/schema file.

- Mark

On May 23, 2013, at 4:11 PM, André Widhani andre.widh...@digicol.de wrote:

 Mark, Alan,
 
 thanks for explaining and updating the wiki.
 
 When reloading the core using action=CREATE with Solr 4.1 I could specify the 
 path to schema and config. In fact I used this to reconfigure the core to use 
 a specific one of two prepared config files depending on some external index 
 state (instead of making changes to one and the same config file). 
 
 action=RELOAD does not understand the corresponding request parameters 
 schema and config (which is why I used CREATE, not RELOAD in the first 
 place). So the functionality to switch to a different config file for an 
 existing core is no longer there, I guess?
 
 Thanks,
 André
 
 
 From: Alan Woodward [a...@flax.co.uk]
 Sent: Thursday, 23 May 2013 18:43
 To: solr-user@lucene.apache.org
 Subject: Re: Core admin action CREATE fails for existing core
 
 Thanks!
 
 Alan Woodward
 www.flax.co.uk
 
 
 On 23 May 2013, at 17:38, Steve Rowe wrote:
 
 Alan, I've added AlanWoodward to the Solr AdminGroup page.
 
 On May 23, 2013, at 12:29 PM, Alan Woodward a...@flax.co.uk wrote:
 
 I think the wiki needs to be updated to reflect this?  
 http://wiki.apache.org/solr/CoreAdmin
 
 If somebody adds me as an editor (AlanWoodward), I'll do it.
 
 Alan Woodward
 www.flax.co.uk
 
 
 On 23 May 2013, at 16:43, Mark Miller wrote:
 
 Yes, this did change - it's actually a protection for a previous change 
 though.
 
 There was a time when you did a core reload by just making a new core with 
 the same name and closing the old core - that is no longer really 
 supported though - the proper way to do this is to use SolrCore#reload, 
 and that has been the case for all of 4.x release if I remember right. I 
 supported making this change to force people who might still be doing what 
 is likely quite a buggy operation to switch to the correct code.
 
 Sorry about the inconvenience.
 
 - Mark
 
 On May 23, 2013, at 10:45 AM, André Widhani andre.widh...@digicol.de 
 wrote:
 
 It seems to me that the behavior of the Core admin action CREATE has 
 changed when going from Solr 4.1 to 4.3.
 
 With 4.1, I could re-configure an existing core (changing path/name to 
 solrconfig.xml for example). In 4.3, I get an error message:
 
 SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 
 'core-tex69b6iom1djrbzmlmg83-index2': Core with name 
 'core-tex69b6iom1djrbzmlmg83-index2' already exists.
 
 Is this change intended?
 
 André
 
 
 
 
 



Re: Solr 4.3 fails to load MySQL driver

2013-05-23 Thread Christian Köhler - ganzgraph gmbh

Hi,


i'm totally confused ... DIH == DataImportHandler ... it's just an
acronym, you say you aren't using DIH, but you are having a problem
loading DIH, so DIH is used in your configs.



sorry for the confusion. I was just trying to say:
I use the example code from
solr-4.3.0/example/solr
and not from
solr-4.3.0/example/example-DIH



...ok, for starters this makes no sense, and may be the cause of
some problems.  you aparently have your collection1 configs setup to load
all of the classes from the /home/solr/solr-4.3.0/example/example/lib
directory as part of the collection1 classloader.

you really don't want to do that.  It will most likeley cause you all
sorts of problems, even if it's unrelated to the current problem.


For Solr 3.6 it was recommended to place the MySQL driver in
solr_3.6.2/example/lib/. This dir is loaded by default in 3.6 (as I did not 
add any additional lib dirs). That's why I did this in 4.3 as well. 
What's the best practice for placing third-party libs?


I added example/lib/ to collection1/conf/solrconfig.xml as a <lib> dir.
Without this, the MySQL driver is not loaded, according to the
"org.apache.solr.core.SolrResourceLoader – Adding xxx" messages


but no where does it ever say anything about adding
contrib/dataimporthandler jars to the classloader.


collection1/conf/solrconfig.xml has the following lib dirs by default:
  <lib dir="../../../contrib/extraction/lib" regex=".*\.jar" />
  <lib dir="../../../dist/" regex="solr-cell-\d.*\.jar" />

  <lib dir="../../../contrib/clustering/lib/" regex=".*\.jar" />
  <lib dir="../../../dist/" regex="solr-clustering-\d.*\.jar" />

  <lib dir="../../../contrib/langid/lib/" regex=".*\.jar" />
  <lib dir="../../../dist/" regex="solr-langid-\d.*\.jar" />

  <lib dir="../../../contrib/velocity/lib" regex=".*\.jar" />
  <lib dir="../../../dist/" regex="solr-velocity-\d.*\.jar" />

Looks the same to me as in 3.6.



which means your config isn't setup to load any of hte dataimporthandler
jars as plugins


That means I have to configure the dataimporthandler manually in 4.3? If 
yes, this is the root of all problems ...





which means when it's done loading plugins, and it starts to initialize
things like RequestHandlers, and it finds a refrence to the
DataImportHandler, it doesn't know what that means...

: Caused by: java.lang.ClassNotFoundException:
: org.apache.solr.handler.dataimport.DataImportHandler


if you look at the 4.3 DIH examples, you'll note that
the only solrconfig.xml files that mention DataImportHandler also
include lib directives like the following in order to load
dataimporthandler as a plugin...


   <lib dir="../../../../dist/" regex="solr-dataimporthandler-.*\.jar" />


included this ... to no avail.


<requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">



  <requestHandler name="/dataimport"
class="org.apache.solr.handler.dataimport.DataImportHandler">

still does not load.
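
One thing worth double-checking here: that snippet comes from the example-DIH configs, whose core directories sit one level deeper than example/solr/collection1. The stock lib directives in this solrconfig.xml resolve dist/ at ../../../dist/, so for collection1 the DIH directive presumably needs the same depth:

  <lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />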

Regards
Chris


Core admin action CREATE fails to persist some settings in solr.xml with Solr 4.3

2013-05-23 Thread André Widhani
When I create a core with Core admin handler using these request parameters:

action=CREATE
name=core-tex69bbum21ctk1kq6lmkir-index3
schema=/etc/opt/dcx/solr/conf/schema.xml
instanceDir=/etc/opt/dcx/solr/
config=/etc/opt/dcx/solr/conf/solrconfig.xml
dataDir=/var/opt/dcx/solr/core-tex69bbum21ctk1kq6lmkir-index3

in Solr 4.1, solr.xml would have the following entry:

<core schema="/etc/opt/dcx/solr/conf/schema.xml" loadOnStartup="true" 
instanceDir="/etc/opt/dcx/solr/" transient="false" 
name="core-tex69bbum21ctk1kq6lmkir-index3" 
config="/etc/opt/dcx/solr/conf/solrconfig.xml" 
dataDir="/var/opt/dcx/solr/core-tex69bbum21ctk1kq6lmkir-index3/" 
collection="core-tex69bbum21ctk1kq6lmkir-index3"/>

while in Solr 4.3 schema, config and dataDir will be missing:

<core loadOnStartup="true" instanceDir="/etc/opt/dcx/solr/" 
transient="false" name="core-tex69bbum21ctk1kq6lmkir-index3" 
collection="core-tex69bbum21ctk1kq6lmkir-index3"/>

The new core would use the settings specified during CREATE, but after a Solr 
restart they are lost (fall back to some defaults), as they are not persisted 
in solr.xml.

Is this a bug or am I doing something wrong here?

André


AW: Core admin action CREATE fails for existing core

2013-05-23 Thread André Widhani
Ok - yes, will do so tomorrow.

Thanks,
André


From: Mark Miller [markrmil...@gmail.com]
Sent: Thursday, 23 May 2013 22:46
To: solr-user@lucene.apache.org
Subject: Re: Core admin action CREATE fails for existing core

You're right - that does seem to be a new limitation. Could you create a JIRA 
issue for it?

It would be fairly simple to add another reload method that also took the name 
of a new solrconfig/schema file.

- Mark

On May 23, 2013, at 4:11 PM, André Widhani andre.widh...@digicol.de wrote:

 Mark, Alan,

 thanks for explaining and updating the wiki.

 When reloading the core using action=CREATE with Solr 4.1 I could specify the 
 path to schema and config. In fact I used this to reconfigure the core to use 
 a specific one of two prepared config files depending on some external index 
 state (instead of making changes to one and the same config file).

 action=RELOAD does not understand the corresponding request parameters 
 schema and config (which is why I used CREATE, not RELOAD in the first 
 place). So the functionality to switch to a different config file for an 
 existing core is no longer there, I guess?

 Thanks,
 André

 
 From: Alan Woodward [a...@flax.co.uk]
 Sent: Thursday, 23 May 2013 18:43
 To: solr-user@lucene.apache.org
 Subject: Re: Core admin action CREATE fails for existing core

 Thanks!

 Alan Woodward
 www.flax.co.uk


 On 23 May 2013, at 17:38, Steve Rowe wrote:

 Alan, I've added AlanWoodward to the Solr AdminGroup page.

 On May 23, 2013, at 12:29 PM, Alan Woodward a...@flax.co.uk wrote:

 I think the wiki needs to be updated to reflect this?  
 http://wiki.apache.org/solr/CoreAdmin

 If somebody adds me as an editor (AlanWoodward), I'll do it.

 Alan Woodward
 www.flax.co.uk


 On 23 May 2013, at 16:43, Mark Miller wrote:

 Yes, this did change - it's actually a protection for a previous change 
 though.

 There was a time when you did a core reload by just making a new core with 
 the same name and closing the old core - that is no longer really 
 supported though - the proper way to do this is to use SolrCore#reload, 
 and that has been the case for all of 4.x release if I remember right. I 
 supported making this change to force people who might still be doing what 
 is likely quite a buggy operation to switch to the correct code.

 Sorry about the inconvenience.

 - Mark

 On May 23, 2013, at 10:45 AM, André Widhani andre.widh...@digicol.de 
 wrote:

 It seems to me that the behavior of the Core admin action CREATE has 
 changed when going from Solr 4.1 to 4.3.

 With 4.1, I could re-configure an existing core (changing path/name to 
 solrconfig.xml for example). In 4.3, I get an error message:

 SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 
 'core-tex69b6iom1djrbzmlmg83-index2': Core with name 
 'core-tex69b6iom1djrbzmlmg83-index2' already exists.

 Is this change intended?

 André








Warning: no uniqueKey specified in schema.

2013-05-23 Thread O. Olson
Hi,

I just downloaded Apache Solr 4.3.0 from 
http://lucene.apache.org/solr/. I
then got into the /example directory and started Solr with: 

 java -Djava.util.logging.config.file=etc/logging.properties
 -Dsolr.solr.home=./example-DIH/solr/ -jar start.jar

I have not made any changes at this point and I get the following Warning:
no uniqueKey specified in schema. 

I have no clue why this error occurs because the schema.xml has
<uniqueKey>id</uniqueKey>. Isn’t this correctly defined? I have not changed
the examples in any way, just ran them. I would like to add that if I use
the normal Solr (not the one with the DataImportHandler): 

 java -Djava.util.logging.config.file=etc/logging.properties -jar start.jar

This warning does not occur here. I’d appreciate any clues on why this
warning occurs in the example-DIH.

Thank you,
O. O.






Re: Warning: no uniqueKey specified in schema.

2013-05-23 Thread Shawn Heisey

On 5/23/2013 3:50 PM, O. Olson wrote:

I just downloaded Apache Solr 4.3.0 from 
http://lucene.apache.org/solr/. I
then got into the /example directory and started Solr with:


java -Djava.util.logging.config.file=etc/logging.properties
-Dsolr.solr.home=./example-DIH/solr/ -jar start.jar


I have not made any changes at this point and I get the following Warning:
no uniqueKey specified in schema.


One of the cores defined in example-DIH, specifically the one named 
tika, does not have uniqueKey in its schema.


example/example-DIH/solr/tika/conf/schema.xml
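
If your own documents do carry a unique id field, declaring it in that schema makes the warning go away (the field name below is just a placeholder for whatever your data actually uses):

<uniqueKey>id</uniqueKey>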

Thanks,
Shawn



Re: Restaurant availability from database

2013-05-23 Thread Amit Nithian
Hossman did a presentation on something similar to this using spatial data
at a Solr meetup some months ago.

http://people.apache.org/~hossman/spatial-for-non-spatial-meetup-20130117/

May be helpful to you.


On Thu, May 23, 2013 at 9:40 AM, rajh ron...@trimm.nl wrote:

 Thank you for your answer.

 Do you mean I should index the availability data as a document in Solr?
 Because the availability data in our databases is around 6,509,972 records
 and contains the availability per number of seats and per 15 minutes. I
 also
 tried this method, and as far as I know it's only possible to join the
 availability documents and not to include that information per result
 document.

 An example API response (created from the Solr response):
 {
 restaurants: [
 {
 id: 13906,
 name: Allerlei,
 zipcode: 6511DP,
 house_number: 59,
 available: true
 },
 {
 id: 13907,
 name: Voorbeeld,
 zipcode: 6512DP,
 house_number: 39,
 available: false
 }
 ],
 resultCount: 12156,
 resultCountAvailable: 55,
 }

 I'm currently hacking around the problem by executing the search again with
 a very high value for the rows parameter and counting the number of
 available restaurants on the backend, but this causes a big performance
 impact (as expected).
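
 One possible way around the high-rows hack, sketched here with made-up field
 names, is to let Solr count the available subset itself with a facet.query
 over the join, so the count rides along with the normal page of results:

 http://localhost:8983/solr/select?q=zipcode:6511*&rows=10&facet=true&facet.query={!join from=restaurant_id to=id}available:true

 numFound then gives resultCount and the facet.query count gives
 resultCountAvailable, without fetching every row.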







Note on The Book

2013-05-23 Thread Jack Krupansky
To those of you who may have heard about the Lucene/Solr book that I and two 
others are writing on Lucene and Solr, some bad and good news. The bad news: 
The book contract with O’Reilly has been canceled. The good news: I’m going to 
proceed with self-publishing (possibly on Lulu or even Amazon) a somewhat 
reduced scope Solr-only Reference Guide (with hints of Lucene). The scope of 
the previous effort was too great, even for O’Reilly – a book larger than 800 
pages (or even 600) that was heavy on reference and lighter on “guide” just 
wasn’t fitting in with their traditional “guide” model. In truth, Solr is just 
too complex for a simple guide that covers it all, let alone Lucene as well.

I’ll announce more details in the coming weeks, but I expect to publish an 
e-book-only version of the book, focused on Solr reference (and plenty of guide 
as well), possibly on Lulu, plus eventually publish 4-8 individual print 
volumes for people who really want the paper. One model I may pursue is to 
offer the current, incomplete, raw, rough, draft as a $7.99 e-book, with the 
promise of updates every two weeks or a month as new and revised content and 
new releases of Solr become available. Maybe the individual e-book volumes 
would be $2 or $3. These are just preliminary ideas. Feel free to let me know 
what seems reasonable or excessive.

For paper: Do people really want perfect bound, or would you prefer spiral 
bound that lies flat and folds back easily? I suppose we could offer both – 
which should be considered “premium”?

I’ll announce more details next week. The immediate goal will be to get the 
“raw rough draft” available to everyone ASAP.

For those of you who have been early reviewers – your effort will not have been 
in vain. I have all your comments and will address them over the next month or 
two or three.

Just for some clarity, the existing Solr Wiki and even the recent contribution 
of the LucidWorks Solr Reference to Apache really are still great contributions 
to general knowledge about Solr, but the book is intended to go much deeper 
into detail, especially with loads of examples and a lot more narrative guide. 
For example, the book has a complete list of the analyzer filters, each with a 
clean one-liner description. Ditto for every parameter (although I would note 
that the LucidWorks Solr Reference does a decent job of that as well.) Maybe, 
eventually, everything in the book COULD (and will) be integrated into the 
standard Solr doc, but until then, a single, integrated reference really is 
sorely needed. And, the book has a lot of narrative guide and walking through 
examples as well. Over time, I’m sure both will evolve. And just to be clear, 
the book is not a simple repurposing of the Solr wiki content – EVERY 
description of everything has been written fresh, from scratch. So, for 
example, analyzer filters get both short one-liner summary descriptions as well 
as more detailed descriptions, plus formal attribute specifications and 
numerous examples, including sample input and outputs (the LucidWorks Solr 
Reference does a better job with examples as well.)

The book has been written in parallel with branch_4x and that will continue.

-- Jack Krupansky

Re: Can anyone explain this Solr query behavior?

2013-05-23 Thread Shankar Sundararaju
Hi Erick,

Here's the output after turning on the debug flag:

*q=text:()&debug=query*

yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">17</int>
<lst name="params">
<str name="indent">true</str>
<str name="q">text:()</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="0" start="0" maxScore="0.0"></result>
<lst name="debug">
<str name="rawquerystring">text:()</str>
<str name="querystring">text:()</str>
<str name="parsedquery">(+())/no_coord</str>
<str name="parsedquery_toString">+()</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*q=doc-id:3000&debug=query*

yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">17</int>
<lst name="params">
<str name="q">doc-id:3000</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="1" start="0" maxScore="11.682044">
<doc>
  :
  :
</doc>
</result>
<lst name="debug">
<str name="rawquerystring">doc-id:3000</str>
<str name="querystring">doc-id:3000</str>
<str name="parsedquery">(+doc-id:3000)/no_coord</str>
<str name="parsedquery_toString">+doc-id:`#8;#0;#0;#23;8</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*q=doc-id:3000 AND text:()&debug=query*

  yields

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">23</int>
<lst name="params">
<str name="q">doc-id:3000 AND text:()</str>
<str name="debug">query</str>
</lst>
</lst>
<result name="response" numFound="631647" start="0" maxScore="8.056607">
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
<doc>
 :
</doc>
</result>
<lst name="debug">
<str name="rawquerystring">doc-id:3000 AND text:()</str>
<str name="querystring">doc-id:3000 AND text:()</str>
<str name="parsedquery">
(+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))))/no_coord
</str>
<str name="parsedquery_toString">
+(doc-id:`#8;#0;#0;#23;8 (Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))
</str>
<str name="QParser">ExtendedDismaxQParser</str>
<null name="altquerystring"/>
<null name="boost_queries"/>
<arr name="parsed_boost_queries"/>
<null name="boostfuncs"/>
</lst>
</response>

*solrconfig.xml:*
<requestHandler name="/select" class="solr.SearchHandler">
 <lst name="defaults">
   <str name="echoParams">explicit</str>
   <int name="rows">10</int>
   <str name="df">text</str>
   <str name="defType">edismax</str>
   <str name="qf">text^1.0 Title^3.0 Classification^2.0 Contributors^2.0 Publisher^2.0</str>
 </lst>

*schema.xml:*
<field name="text" type="my_text" indexed="true" stored="false" required="false"/>

<dynamicField name="*" type="my_text" indexed="true" stored="true" multiValued="false"/>

<fieldType name="my_text" class="solr.TextField">
  <analyzer type="index" class="MyAnalyzer"/>
  <analyzer type="query" class="MyAnalyzer"/>
  <analyzer type="multiterm" class="MyAnalyzer"/>
</fieldType>
*Note:* MyAnalyzer, among a few other customizations, uses WhitespaceTokenizer
and LowerCaseFilter

Thanks a lot.

-Shankar


On Thu, May 23, 2013 at 4:34 AM, Erick Erickson erickerick...@gmail.comwrote:

 Please post the results of adding debug=query to the URL.
 That'll tell us what the query parser spits out which is much
 easier to analyze.

 Best
 Erick

 On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju
 shan...@ebrary.com wrote:
  This query returns 0 documents: *q=(+Title:() +Classification:()
  +Contributors:() +text:())*
 
  This returns 1 document: *q=doc-id:3000*
 
  And this returns 631580 documents when I was expecting 0: *q=doc-id:3000
  AND (+Title:() +Classification:() +Contributors:() +text:())*
 
  Am I missing something here? Can someone please explain? I am using Solr
  4.2.1
 
  Thanks
  -Shankar




-- 
Regards,
*Shankar Sundararaju
*Sr. Software Architect
ebrary, a ProQuest company
410 Cambridge Avenue, Palo Alto, CA 94306 USA
shan...@ebrary.com | www.ebrary.com | 650-475-8776 (w) | 408-426-3057 (c)


FW: howto: get the value from a multivalued field?

2013-05-23 Thread world hello




hi, all - 
how can I retrieve the value out of a multivalued field in a customized 
function query? I want to implement a function query whose first parameter is a 
multi-valued field, from which values are retrieved and manipulated. 
however, I used code like the following but get the exception "can not use 
FieldCache on multivalued field":

public ValueSource parse(FunctionQParser fp) throws ParseException {
    try {
        ValueSource vs = fp.parseValueSource();
        ...
    } catch (...) {
    }
}
Thanks.
- Frank


  

Re: howto: get the value from a multivalued field?

2013-05-23 Thread Jack Krupansky
Yeah, you can't do that. You'll need to keep a copy of whichever value from 
the multi-valued field you wish to be considered the value in a separate, 
non-multi-valued field. Possibly using an update processor, such as one of:


FirstFieldValueUpdateProcessorFactory, LastFieldValueUpdateProcessorFactory, 
MaxFieldValueUpdateProcessorFactory, MinFieldValueUpdateProcessorFactory
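
For instance, a minimal chain for that copy-then-trim pattern might look like this in solrconfig.xml (the authors/first_author field names are hypothetical):

<updateRequestProcessorChain name="first-author">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <!-- copy every value of the multi-valued field into a scratch field -->
    <str name="source">authors</str>
    <str name="dest">first_author</str>
  </processor>
  <processor class="solr.FirstFieldValueUpdateProcessorFactory">
    <!-- then keep only the first value, leaving it single-valued -->
    <str name="fieldName">first_author</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

Select it with update.chain=first-author on updates, and point the function query at first_author, which no longer trips the FieldCache restriction.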


-- Jack Krupansky

-Original Message- 
From: world hello

Sent: Thursday, May 23, 2013 7:50 PM
To: solr-user@lucene.apache.org
Subject: FW: howto: get the value from a multivalued field?





hi, all -
how can I retrieve the value out of a multivalued field in a customized 
function query?I want to implement a function query whose first parameter is 
a multi-value fileld, from which values are retrieved and manipulated.
however, I used the code but get exceptions - can not use FieldCache on 
multivalued field
/public ValueSource parse(FunctionQParser fp) 
throws ParseException {
   try { ValueSource vs = fp.parseValueSource();   } 
catch (...)   {   }

Thanks.
- Frank





RE: howto: get the value from a multivalued field?

2013-05-23 Thread world hello
thanks, jack. 
could you please give more details on using update processors?
Thanks.
- frank

 From: j...@basetechnology.com
 To: solr-user@lucene.apache.org
 Subject: Re: howto: get the value from a multivalued field?
 Date: Thu, 23 May 2013 20:06:34 -0400
 
 Yeah, you can't do that. You'll need to keep a copy of whichever value from 
 the multi-valued field you wish to be considered the value in a separate, 
 non-multi-valued field. Possibly using an update processor, such as one of:
 
 FirstFieldValueUpdateProcessorFactory, LastFieldValueUpdateProcessorFactory, 
 MaxFieldValueUpdateProcessorFactory, MinFieldValueUpdateProcessorFactory
 
 -- Jack Krupansky
 
 -Original Message- 
 From: world hello
 Sent: Thursday, May 23, 2013 7:50 PM
 To: solr-user@lucene.apache.org
 Subject: FW: howto: get the value from a multivalued field?
 
 
 
 
 
 hi, all -
 how can I retrieve the value out of a multivalued field in a customized 
 function query?I want to implement a function query whose first parameter is 
 a multi-value fileld, from which values are retrieved and manipulated.
 however, I used the code but get exceptions - can not use FieldCache on 
 multivalued field
 /public ValueSource parse(FunctionQParser fp) 
 throws ParseException {
 try { ValueSource vs = fp.parseValueSource();   } 
 catch (...)   {   }
 Thanks.
 - Frank
 
  
 
  

Re: Can anyone explain this Solr query behavior?

2013-05-23 Thread Upayavira
(+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 |
Title:and^3.0/no_coord

You're using edismax, not lucene. So AND is being considered as a search
term, not an operator, and the word 'and' probably exists in 631580
documents.

Why is it triggering dismax? Probably because field:() is not valid
syntax, so edismax is dropping to dismax because it isn't a valid lucene
query.

What do you expect text:() to do?

If you want to match any docs that have a value in the text field, use
q=text:[* TO *]

To match docs that *don't* have a value in the text field: q=-text:[* TO
*]

Upayavira

On Fri, May 24, 2013, at 12:23 AM, Shankar Sundararaju wrote:
 Hi Erick,
 
 Here's the output after turning on the debug flag:
 
 *q=text:()debug=query*
 
 yields
 
 response
 lst name=responseHeader
 int name=status0/int
 int name=QTime17/int
 lst name=params
 str name=indenttrue/str
 str name=qtext:()/str
 str name=debugquery/str
 /lst
 /lst
 result name=response numFound=0 start=0 maxScore=0.0/result
 lst name=debug
 str name=rawquerystringtext:()/str
 str name=querystringtext:()/str
 str name=parsedquery(+())/no_coord/str
 str name=parsedquery_toString+()/str
 str name=QParserExtendedDismaxQParser/str
 null name=altquerystring/
 null name=boost_queries/
 arr name=parsed_boost_queries/
 null name=boostfuncs/
 /lst
 /response
 
 *q=doc-id:3000debug=query*
 
 yields
 
 response
 lst name=responseHeader
 int name=status0/int
 int name=QTime17/int
 lst name=params
 str name=qdoc-id:3000/str
 str name=debugquery/str
 /lst
 /lst
 result name=response numFound=1 start=0 maxScore=11.682044
 doc
   :
   :
 /doc
 /result
 lst name=debug
 str name=rawquerystringdoc-id:3000/str
 str name=querystringdoc-id:3000/str
 str name=parsedquery(+doc-id:3000)/no_coord/str
 str name=parsedquery_toString+doc-id:`#8;#0;#0;#23;8/str
 str name=QParserExtendedDismaxQParser/str
 null name=altquerystring/
 null name=boost_queries/
 arr name=parsed_boost_queries/
 null name=boostfuncs/
 /lst
 /response
 
 *q=doc-id:3000 AND text:()debug=query*
 
   yields
 
 response
 lst name=responseHeader
 int name=status0/int
 int name=QTime23/int
 lst name=params
 str name=qdoc-id:3000 AND text:()/str
 str name=debugquery/str
 /lst
 /lst
 result name=response numFound=631647 start=0 maxScore=8.056607
 doc
  :
 /doc
  :
 /doc
 doc
  :
 /doc
 doc
  :
 /doc
 doc
  :
 /doc
 doc
  :
 /doc
 /result
 lst name=debug
 str name=rawquerystringdoc-id:3000 AND text:()/str
 str name=querystringdoc-id:3000 AND text:()/str
 str name=parsedquery
 (+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
 Classification:and^2.0 | Contributors:and^2.0 |
 Title:and^3.0/no_coord
 /str
 str name=parsedquery_toString
 +(doc-id:`#8;#0;#0;#23;8 (Publisher:and^2.0 | text:and |
 Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))
 /str
 str name=QParserExtendedDismaxQParser/str
 null name=altquerystring/
 null name=boost_queries/
 arr name=parsed_boost_queries/
 null name=boostfuncs/
 /lst
 /response
 
 *solrconfig.xml:*
 requestHandler name=/select class=solr.SearchHandler
  lst name=defaults
str name=echoParamsexplicit/str
int name=rows10/int
str name=dftext/str
str name=defTypeedismax/str
str name=qftext^1.0 Title^3.0 Classification^2.0
 Contributors^2.0 Publisher^2.0/str
  /lst
 
 *schema.xml:*
 field name=text type=my_text indexed=true stored=false required=
 false/*
 *
 dynamicField name=* type=my_text indexed=true stored=true
 multiValued=false/
 fieldType name=my_text class=solr.TextField analyzer type=index
 class=MyAnalyzer/ analyzer type=query class=MyAnalyzer/
 analyzer
 type=multiterm class=MyAnalyzer/ /fieldType
 *
 *
 *Note:* MyAnalyzer among few other customizations, uses
 WhitespaceTokenizer
 and LoweCaseFilter
 
 Thanks a lot.
 
 -Shankar
 
 
 On Thu, May 23, 2013 at 4:34 AM, Erick Erickson
 erickerick...@gmail.comwrote:
 
  Please post the results of adding debug=query to the URL.
  That'll tell us what the query parser spits out which is much
  easier to analyze.
 
  Best
  Erick
 
  On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju
  shan...@ebrary.com wrote:
   This query returns 0 documents: *q=(+Title:() +Classification:()
   +Contributors:() +text:())*
  
   This returns 1 document: *q=doc-id:3000*
  
   And this returns 631580 documents when I was expecting 0: *q=doc-id:3000
   AND (+Title:() +Classification:() +Contributors:() +text:())*
  
   Am I missing something here? Can someone please explain? I am using Solr
   4.2.1
  
   Thanks
   -Shankar
 
 
 
 
 -- 
 Regards,
 *Shankar Sundararaju
 *Sr. Software Architect
 ebrary, a ProQuest company
 410 Cambridge Avenue, Palo Alto, CA 94306 USA
 shan...@ebrary.com | www.ebrary.com | 650-475-8776 (w) | 408-426-3057 (c)


Re: Can anyone explain this Solr query behavior?

2013-05-23 Thread Jack Krupansky
Okay... sorry I wasn't paying close enough attention. What is happening is 
that the empty parentheses are illegal in Lucene query syntax:


 <str name="msg">org.apache.solr.search.SyntaxError: Cannot parse 'id:* AND 
text:()': Encountered " ")" ") "" at line 1, column 15.

Was expecting one of:
    <NOT> ...
    "+" ...
    "-" ...
    <BAREOPER> ...
    "(" ...
    "*" ...
    <QUOTED> ...
    <TERM> ...
    <PREFIXTERM> ...
    <WILDTERM> ...
    <REGEXPTERM> ...
    "[" ...
    "{" ...
    <LPARAMS> ...
    <NUMBER> ...
    <TERM> ...
    "*" ...
    </str>
 <int name="code">400</int>

Edismax traps such errors and then escapes the query so that Lucene will 
no longer throw an error. In this case, it puts quotes around the AND 
operator, which is why you see "and" included in the parsed query as if it 
were a term. And I believe it turns text:() into text:"()", which makes 
the original Lucene error go away, but the "()" analyzes to nothing and 
generates no term in the query.


So, fix your syntax error and the anomaly should go away.

-- Jack Krupansky

-Original Message- 
From: Shankar Sundararaju

Sent: Thursday, May 23, 2013 7:23 PM
To: solr-user@lucene.apache.org
Subject: Re: Can anyone explain this Solr query behavior?

Hi Erick,

Here's the output after turning on the debug flag:

*q=text:()debug=query*

   yields

response
lst name=responseHeader
int name=status0/int
int name=QTime17/int
lst name=params
str name=indenttrue/str
str name=qtext:()/str
str name=debugquery/str
/lst
/lst
result name=response numFound=0 start=0 maxScore=0.0/result
lst name=debug
str name=rawquerystringtext:()/str
str name=querystringtext:()/str
str name=parsedquery(+())/no_coord/str
str name=parsedquery_toString+()/str
str name=QParserExtendedDismaxQParser/str
null name=altquerystring/
null name=boost_queries/
arr name=parsed_boost_queries/
null name=boostfuncs/
/lst
/response

*q=doc-id:3000debug=query*

   yields

response
lst name=responseHeader
int name=status0/int
int name=QTime17/int
lst name=params
str name=qdoc-id:3000/str
str name=debugquery/str
/lst
/lst
result name=response numFound=1 start=0 maxScore=11.682044
doc
 :
 :
/doc
/result
lst name=debug
str name=rawquerystringdoc-id:3000/str
str name=querystringdoc-id:3000/str
str name=parsedquery(+doc-id:3000)/no_coord/str
str name=parsedquery_toString+doc-id:`#8;#0;#0;#23;8/str
str name=QParserExtendedDismaxQParser/str
null name=altquerystring/
null name=boost_queries/
arr name=parsed_boost_queries/
null name=boostfuncs/
/lst
/response

*q=doc-id:3000 AND text:()debug=query*

 yields

response
lst name=responseHeader
int name=status0/int
int name=QTime23/int
lst name=params
str name=qdoc-id:3000 AND text:()/str
str name=debugquery/str
/lst
/lst
result name=response numFound=631647 start=0 maxScore=8.056607
doc
:
/doc
:
/doc
doc
:
/doc
doc
:
/doc
doc
:
/doc
doc
:
/doc
/result
lst name=debug
str name=rawquerystringdoc-id:3000 AND text:()/str
str name=querystringdoc-id:3000 AND text:()/str
str name=parsedquery
(+(doc-id:3000 DisjunctionMaxQuery((Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0/no_coord
/str
str name=parsedquery_toString
+(doc-id:`#8;#0;#0;#23;8 (Publisher:and^2.0 | text:and |
Classification:and^2.0 | Contributors:and^2.0 | Title:and^3.0))
/str
str name=QParserExtendedDismaxQParser/str
null name=altquerystring/
null name=boost_queries/
arr name=parsed_boost_queries/
null name=boostfuncs/
/lst
/response

*solrconfig.xml:*
requestHandler name=/select class=solr.SearchHandler
lst name=defaults
  str name=echoParamsexplicit/str
  int name=rows10/int
  str name=dftext/str
  str name=defTypeedismax/str
  str name=qftext^1.0 Title^3.0 Classification^2.0
Contributors^2.0 Publisher^2.0/str
/lst

*schema.xml:*
field name=text type=my_text indexed=true stored=false required=
false/*
*
dynamicField name=* type=my_text indexed=true stored=true
multiValued=false/
fieldType name=my_text class=solr.TextField analyzer type=index
class=MyAnalyzer/ analyzer type=query class=MyAnalyzer/ analyzer
type=multiterm class=MyAnalyzer/ /fieldType
*
*
*Note:* MyAnalyzer among few other customizations, uses WhitespaceTokenizer
and LoweCaseFilter

Thanks a lot.

-Shankar


On Thu, May 23, 2013 at 4:34 AM, Erick Erickson 
erickerick...@gmail.comwrote:



Please post the results of adding debug=query to the URL.
That'll tell us what the query parser spits out which is much
easier to analyze.

Best
Erick

On Wed, May 22, 2013 at 12:16 PM, Shankar Sundararaju
shan...@ebrary.com wrote:
 This query returns 0 documents: *q=(+Title:() +Classification:()
 +Contributors:() +text:())*

 This returns 1 document: *q=doc-id:3000*

 And this returns 631580 documents when I was expecting 0: *q=doc-id:3000
 AND (+Title:() +Classification:() +Contributors:() +text:())*

 Am I missing something here? Can someone please explain? I am using Solr
 4.2.1

 Thanks
 -Shankar





--
Regards,
*Shankar Sundararaju
*Sr. Software 

RE: Question about Coordination factor

2013-05-23 Thread Kazuaki Hiraga
Thank you for your comment.

Due to historical reasons, our organization uses a trunk version of Solr 4.0, 
which is a bit old and unofficial. And edismax always returns 1/2 as the 
coordination value, so I wanted to make sure what this value should look like. 
This will be a good reason to upgrade our system to Solr 4.3 or a later version.
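
(For reference, Lucene's DefaultSimilarity computes the factor as

coord(q, d) = overlap / maxOverlap

where overlap is the number of query clauses matching the document and maxOverlap is the total number of clauses in the query.)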

Thank you very much!

-Kazu


 From: ans...@anshumgupta.net
 Date: Thu, 23 May 2013 16:58:46 +0530
 Subject: Re: Question about Coordination factor
 To: solr-user@lucene.apache.org

 This looks correct.


 On Thu, May 23, 2013 at 7:37 AM, Kazuaki Hiraga 
 kazuaki.hir...@gmail.comwrote:

 Hello Folks,

 Sorry, my last email was a bit messy, so I am sending it again.

 I have a question about coordination factor to ensure my understanding
 of this value is correct.

 If I have documents that contain some keywords like the following:
 Doc1: A, B, C
 Doc2: A, C
 Doc3: B, C

 And my query is A OR B OR C OR D. In this case, Coord factor value
 for each documents will be the following:
 Doc1: 3/4
 Doc2: 2/4
 Doc3: 2/4

 In the same fashion, respective value of coord factor is the following
 if I have a query C OR D:
 Doc1: 1/2
 Doc2: 1/2
 Doc3: 1/2

 Is this correct? or Did I miss something?

 Please correct me if I am wrong.

 Regards,
 Kazuaki




 --

 Anshum Gupta
 http://www.anshumgupta.net