Re: Multi-select faceting is not working when facet fields are configured in default request handler.

2013-02-07 Thread Jan Høydahl
If you want to override facet.field through the query, you have to override ALL 
facet.field's defined as defaults in the request handler, else those other facets 
are gone.

You say "But it's not working." without specifying WHAT is not working.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

6. feb. 2013 kl. 15:47 skrev manivanann manimail...@gmail.com:

 Hi solr-user,
 
   In my work I have to do multi-facet select. We have already configured
 facet fields globally in the default request handler (solrconfig.xml). For
 multi-facet select I have done the query with an exclusion filter. But it's
 not working. The following is my query.
 
 http://192.168.101.141:8080/solr/select?q=digital+camera&rows=0&facet=on&fq={!tag=Br}Brands:canon&facet.field={!ex=Br}Brands
 
 But if I try after removing all the facet fields from my request handler in
 solrconfig.xml, then the above query works fine.
 
 Please can you give me a solution? Will this multi-select faceting work with
 the current configuration, or do I have to remove all the facet fields from
 my request handler and send them dynamically through the query when I do
 multi-select faceting?
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Multi-select-faceting-is-not-working-when-facet-fields-are-configured-in-default-request-handler-tp4038768.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Eject a node from SolrCloud

2013-02-07 Thread yriveiro
Hi,

Is there any way to eject a node from a Solr cluster?

If I shut down a node in the cluster, ZooKeeper tags the node as down. 

Thanks

/Yago



-
Best regards
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Eject-a-node-from-SolrCloud-tp4038950.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Eject a node from SolrCloud

2013-02-07 Thread Tomás Fernández Löbbe
Yes, currently the only option is to shut down the node. Maybe not the
cleanest way to remove a node. See this jira too:
https://issues.apache.org/jira/browse/SOLR-3512


On Thu, Feb 7, 2013 at 7:20 AM, yriveiro yago.rive...@gmail.com wrote:

 Hi,

 Exists any way to eject a node from a solr cluster?

 If I shutdown a node in the cluster, the zookeeper tag the node as down.

 Thanks

 /Yago



 -
 Best regards
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Eject-a-node-from-SolrCloud-tp4038950.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Advanced Search Option in Solr corresponding to DtSearch options

2013-02-07 Thread Alan Woodward
Hi Soumyanayan,

We developed a parser that converts dtSearch queries to Lucene queries, with 
some Solr integration - see 
http://www.flax.co.uk/blog/2012/04/24/dtsolr-an-open-source-replacement-for-the-dtsearch-closed-source-search-engine/

At the moment it relies on an unreleased version of Lucene/Solr, because we 
needed to get some extra data from the index that wasn't available in trunk for 
our use case, but it can probably be tweaked to just use vanilla Solr. Feel 
free to contact me for more details!

Alan Woodward
www.flax.co.uk


On 6 Feb 2013, at 16:09, Soumyanayan Kar wrote:

 Hi,
 
 
 
 We are replacing the search and indexing module in an application from
 DtSearch to Solr using solrnet as the .net Solr client library.
 
 
 
 We are relatively new to Solr/Lucene and would need some help/direction to
 understand the more advanced search options in Solr.
 
 
 
 The current application supports the following search options using
 DtSearch:
 
 
 
 1) Word(s) or phrase
 
 2) Exact words or phrases
 
 3) Not these words or phrases
 
 4) One or more of the words (A OR B OR C)
 
 5) Proximity of a word within n words of another word
 
 6) Numeric range - From - To
 
 7) Options:
 
 . Stemming (search* finds searching or searches)
 
 . Synonym (search finds seek or look)
 
 . Fuzzy within n letters (p%arts finds paris)
 
 . Phonic homonyms (#Smith also finds Smithe and Smythe)
 
 
 
 As an example, here is the search query that gets generated to be posted to
 DtSearch for the use case below:
 
 1.   Search Phrase:  generic collection
 
 2.   Exact Phrase: linq
 
 3.   Not these words: sql
 
 4.   One or more of these words:  ICollection or ArrayList or
 Hashtable
 
 5.   Proximity: csharp within 4 words of language
 
 6.   Options:
 
 a.  Stemming
 
 b.  Synonym
 
 c.   Fuzzy within 2 letters
 
 d.  Phonic homonyms
 
 
 
 Search Query: generic* collection* generic collection #generic #collection
 g%%eneric c%%ollection linq  -sql ICollection OR ArrayList OR Hashtable
 csharp w/4 language
 
 
 
 We have been able to do simple searches (single-term search in file
 content) with highlighting in Solr. Now we need to replace the options above
 with Solr/Lucene equivalents.
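 For orientation, here is a rough, partial mapping of some dtSearch operators
 to standard Lucene query syntax. Treat these as starting points to verify
 against your field analyzers, not a complete or exact translation:
 
 ```
 csharp w/4 language   ->  "csharp language"~4   (proximity / phrase slop)
 -sql                  ->  -sql                  (exclusion)
 A OR B OR C           ->  A OR B OR C           (same)
 p%arts (fuzzy)        ->  parts~1               (fuzzy, edit distance)
 search* (stemming)    ->  use a stemming filter in the field's analyzer
 #Smith (phonic)       ->  use a phonetic filter (e.g. DoubleMetaphone)
 numeric range         ->  field:[10 TO 20]      (range query)
 ```
 
 Stemming, synonyms, and phonic matching are analyzer features in Solr rather
 than query-syntax operators, so they are configured per field in schema.xml.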
 
 
 
 Can anybody provide some direction on what/where we should be looking?
 
 
 
 Thanks & Regards,
 
 
 
 Soumya.
 
 
 
 
 



Re: Multi-select faceting is not working when facet fields are configured in default request handler.

2013-02-07 Thread Erik Hatcher
How was your facet.field defined in the request handler?  My guess is it needs 
to be moved to an "appends" section. 
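As a rough sketch of that suggestion (handler name and the second facet field 
are placeholders, not taken from the original configuration): parameters in an 
"appends" section are merged with the request's parameters instead of being 
replaced by them, so per-request facet.field values add to the configured ones:

```xml
<!-- Hypothetical handler definition; only "Brands" comes from the question. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="facet">on</str>
  </lst>
  <!-- appends: merged with request params rather than overridden by them -->
  <lst name="appends">
    <str name="facet.field">Brands</str>
    <str name="facet.field">Category</str>
  </lst>
</requestHandler>
```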

Erik

On Feb 7, 2013, at 4:11, Jan Høydahl jan@cominvent.com wrote:

 If you want to override facet.field through the query, you have to override 
 ALL facet.field's defined as default in reqeust handler, else those other 
 facets are gone.
 
 You say But it's not working. without specifying WHAT is not working.
 
 --
 Jan Høydahl, search solution architect
 Cominvent AS - www.cominvent.com
 Solr Training - www.solrtraining.com
 
 6. feb. 2013 kl. 15:47 skrev manivanann manimail...@gmail.com:
 
 Hi solr-user,
 
  In my work i have to do multi facet select. we have already configured
 facet fields  globally in default request handler(solrconfig.xml). For multi
 facet select i have done the query with exclusion filter. But it's not
 working. The following is my query.
 
 http://192.168.101.141:8080/solr/select?q=digital+camera&rows=0&facet=on&fq={!tag=Br}Brands:canon&facet.field={!ex=Br}Brands
 
 But if i try after removing all the facet fields from my request hander in
 solrconfig.xml, then above query is working fine.
 
 please can you give me a solution. This multi-select faceting will work with
 the current implementation or i have to remove all the facet from my request
 handler means dynamically i have to send the facet fields through query when
 i do multi-select faceting.
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Multi-select-faceting-is-not-working-when-facet-fields-are-configured-in-default-request-handler-tp4038768.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 


how-to configure mysql pool connection on Solr Server

2013-02-07 Thread Miguel

Hi

 I need to configure a MySQL connection pool on the Solr server for use in a 
custom plugin. I saw the DataImportHandler wiki: 
http://wiki.apache.org/solr/DataImportHandler , but it seems that 
DataImportHandler opens the connection when the handler is called and closes 
it when the import finishes, and I need to keep a pool open to reuse 
connections whenever I need them.


I have not found in the Apache Solr documentation how to define a connection 
pool to the DB for reuse in any Solr class.

Any ideas?

thanks



Re: Maximum Number of Records In Index

2013-02-07 Thread Mikhail Khludnev
Rafal,

What about docnums? Aren't they limited by int32?
07.02.2013 15:33 пользователь Rafał Kuć r@solr.pl написал:

 Hello!

 Practically there is no limit in how many documents can be stored in a
 single index. In your case, as you are using Solr from 2011, there is
 a limitation regarding the number of unique terms per Lucene segment
 (
 http://lucene.apache.org/core/old_versioned_docs/versions/3_0_0/fileformats.html#Limitations
 ).
 However I don't think you've hit that. Solr by itself doesn't remove
 documents unless told to do so.

 It's hard to guess what the reason can be and, as you said, you see
 updates coming to your handler. Maybe new documents have the same
 identifiers as the ones that are already indexed? As I said, this
 is only a guess and we would need more information. Are there
 any exceptions in the logs? Do you run delete commands? Are your
 index files changing? How do you run commit?
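 One quick check for the duplicate-identifier guess: a match-all query with
 rows=0 returns only the document count (host, port, and core path below are
 placeholders):

 ```
 http://localhost:8983/solr/select?q=*:*&rows=0
 ```

 The numFound value in the response is the current document count. If it
 stays flat while updates keep arriving, the adds are overwriting existing
 uniqueKey values rather than creating new documents.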

 --
 Regards,
  Rafał Kuć
  Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

  I have searched this forum but not yet found a definitive answer; I think
  the answer is "there is no limit, it depends on server specification". But
  nevertheless I will say what I have seen and then ask the questions.

  From scratch (November 2011) I have set up our SOLR which contains data
 from
  various sources, since March 2012 , the number of indexed records (unique
  ID's) reached 13.5 million , which was to be expected. However for the
 last
  8 months the number of records in the index has not gone above 13.5
 million,
  yet looking at the request handler outputs I can safely say at least
  anywhere from 50 thousand to 100 thousand records are being indexed
 daily.
  So I am assuming that earlier records are being removed, and I do not
 want
  that.

  Question: If there is a limit to the number of records the index can
 store
  where do I find this and change it?
  Question: If there is no limit does anyone have any idea why for the last
  months the number has not gone beyond 13.5 million, I can safely say
 that at
  least 90% are new records.

  thanks

  macroman



  --
  View this message in context:
 
 http://lucene.472066.n3.nabble.com/Maximum-Number-of-Records-In-Index-tp4038961.html
  Sent from the Solr - User mailing list archive at Nabble.com.




Re: solr file based spell suggestions

2013-02-07 Thread Jack Krupansky
Changing "x" to "y" (e.g., "s2" to "sII") is not a function of spell check 
or suggestion.


Synonyms are a closer match, but can be difficult to configure properly. 
Good luck.


You may be better off preprocessing the query at the application level and 
then generating the appropriate boolean logic, such as: (s2 OR sII).
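A minimal sketch of that application-level preprocessing, with a hypothetical 
variant table (only the s2/sII pair comes from the question; everything else 
is illustrative):

```python
# Hypothetical variant table; in practice this would be loaded from a file
# or database maintained by the application.
VARIANTS = {"s2": ["s2", "sII"]}

def expand_query(q: str) -> str:
    """Rewrite each whitespace-separated term that has known variants
    into a parenthesized OR group; other terms pass through unchanged."""
    out = []
    for term in q.split():
        forms = VARIANTS.get(term.lower())
        out.append("(" + " OR ".join(forms) + ")" if forms else term)
    return " ".join(out)

print(expand_query("samsung s2 case"))
```

The expanded string is then sent to Solr as the q parameter, so the index 
itself needs no special handling.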


-- Jack Krupansky

-Original Message- 
From: Rohan Thakur

Sent: Thursday, February 07, 2013 8:24 AM
To: solr-user@lucene.apache.org
Subject: solr file based spell suggestions

hi all

I wanted to know how I can apply a file-based dictionary for spell
suggestions, such that if I search for "s2" the query would also take it as
"sII", which represents the same thing in my indexed field... but in search
it can also be interpreted as "s2". Please help, anyone...

thanks
regards
Rohan 



Re: Maximum Number of Records In Index

2013-02-07 Thread Rafał Kuć
Hello!

Right, my bad - ids are still using int32. However, that still
gives us 2,147,483,648 possible identifiers per single index,
which is nowhere near the 13.5 million mentioned in the first mail.

-- 
Regards,
 Rafał Kuć
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

 Rafal,

 What about docnums, don't they are limited by int32 ?
 07.02.2013 15:33 пользователь Rafał Kuć r@solr.pl написал:

 Hello!

 Practically there is no limit in how many documents can be stored in a
 single index. In your case, as you are using Solr from 2011, there is
 a limitation regarding the number of unique terms per Lucene segment
 (
 http://lucene.apache.org/core/old_versioned_docs/versions/3_0_0/fileformats.html#Limitations
 ).
 However I don't think you've hit that. Solr by itself doesn't remove
 documents unless told to do so.

 Its hard to guess what can be the reason and as you said, you see
 updates coming to your handler. Maybe new documents have the same
 identifiers that the ones that are already indexed ? As I said, this
 is only a guess and we would need to have more information. Are there
 any exceptions in the logs ? Do you run delete command ? Are your
 index files changed ? How do you run commit ?

 --
 Regards,
  Rafał Kuć
  Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

  I have searched this forum but not yet found a definitive answer, I
 think the
  answer is There is No Limit depends on server specification. But never
 the
  less I will say what I have seen and then ask the questions.

  From scratch (November 2011) I have set up our SOLR which contains data
 from
  various sources, since March 2012 , the number of indexed records (unique
  ID's) reached 13.5 million , which was to be expected. However for the
 last
  8 months the number of records in the index has not gone above 13.5
 million,
  yet looking at the request handler outputs I can safely say at least
  anywhere from 50 thousand to 100 thousand records are being indexed
 daily.
  So I am assuming that earlier records are being removed, and I do not
 want
  that.

  Question: If there is a limit to the number of records the index can
 store
  where do I find this and change it?
  Question: If there is no limit does anyone have any idea why for the last
  months the number has not gone beyond 13.5 million, I can safely say
 that at
  least 90% are new records.

  thanks

  macroman



  --
  View this message in context:
 
 http://lucene.472066.n3.nabble.com/Maximum-Number-of-Records-In-Index-tp4038961.html
  Sent from the Solr - User mailing list archive at Nabble.com.





Re: Eject a node from SolrCloud

2013-02-07 Thread Mark Miller
You can unload the core for that node and it will be removed from ZooKeeper. 
You can add it back later if you leave its state on disk and recreate the core.
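For illustration, the unload goes through the CoreAdmin API; host, port, and 
core name below are placeholders:

```
http://localhost:8983/solr/admin/cores?action=UNLOAD&core=mycore
```

In Solr 4.x this removes the core from the node (and its registration in 
ZooKeeper in cloud mode) while leaving the data directory on disk, unless you 
also pass deleteIndex=true.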

- Mark

On Feb 7, 2013, at 5:20 AM, yriveiro yago.rive...@gmail.com wrote:

 Hi,
 
 Exists any way to eject a node from a solr cluster?
 
 If I shutdown a node in the cluster, the zookeeper tag the node as down. 
 
 Thanks
 
 /Yago
 
 
 
 -
 Best regards
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Eject-a-node-from-SolrCloud-tp4038950.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Calculate score according to another indexed field

2013-02-07 Thread Pragyanshis Pattanaik



Hi,

My schema is like below
<fields>
  <field name="ProductId" type="int" indexed="true" stored="true" />
  <field name="ProductName" type="string" indexed="true" stored="true" required="true" />
  <field name="ProductDesription" type="string" indexed="true" stored="true" required="true" />
  <field name="ProductRating" type="int" indexed="true" stored="true" required="true" />
</fields>

The product name will be passed as the q parameter to Solr.
Is there a way to affect the score on the basis of ProductRating, which is not 
passed as a query parameter?


Or do I need to go into the Solr source code and change the ranking algorithm?

Please Guide me.

Thanks in advance

  

Search a Phrase

2013-02-07 Thread Pragyanshis Pattanaik



Hi,

My schema is like below

<fields>
  <field name="ProductId" type="int" indexed="true" stored="true" />
  <field name="ProductName" type="text_general" indexed="true" stored="true" required="true" />
  <field name="ProductDesription" type="string" indexed="true" stored="true" required="true" />
  <field name="Product Rating" type="int" indexed="true" stored="true" required="true" />
  <field name="Product Feedback" type="text_general" indexed="true" stored="true" required="true" />
</fields>

and my text_general field is like below

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

How can I search a phrase ("Good Microwave") over the ProductDesription and 
Product Feedback fields?
Here some documents might contain only "Good" and some might contain only 
"Microwave".

How can I get all documents that contain "Good" or "Microwave" or "Good 
Microwave" if I pass "Good Microwave" as the q parameter?
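One hedged sketch of such a query, assuming the edismax query parser and 
field names without spaces (Solr field names containing spaces are hard to 
reference in query parameters, so "ProductFeedback" here is an assumed 
rename): a plain two-word q with mm=1 matches documents containing either 
word, and pf boosts documents where both appear as a phrase:

```
http://localhost:8983/solr/select?defType=edismax&q=Good+Microwave&qf=ProductDesription+ProductFeedback&pf=ProductDesription+ProductFeedback&mm=1
```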



Thanks in advance


  

Re: Multi-select faceting is not working when facet fields are configured in default request handler.

2013-02-07 Thread Alexandre Rafalovitch
I think it would still fail because of the 'tag' exclusions. Whatever
facets are defined on the handler will not take the 'tag' exclusions
into account.

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Thu, Feb 7, 2013 at 7:17 AM, Erik Hatcher erik.hatc...@gmail.com wrote:

 How was your facet.field defined in the request handler?   My guess is it
 needs to be moved to an appends section.

 Erik

 On Feb 7, 2013, at 4:11, Jan Høydahl jan@cominvent.com wrote:

  If you want to override facet.field through the query, you have to
 override ALL facet.field's defined as default in reqeust handler, else
 those other facets are gone.
 
  You say But it's not working. without specifying WHAT is not working.
 
  --
  Jan Høydahl, search solution architect
  Cominvent AS - www.cominvent.com
  Solr Training - www.solrtraining.com
 
  6. feb. 2013 kl. 15:47 skrev manivanann manimail...@gmail.com:
 
  Hi solr-user,
 
   In my work i have to do multi facet select. we have already configured
  facet fields  globally in default request handler(solrconfig.xml). For
 multi
  facet select i have done the query with exclusion filter. But it's not
  working. The following is my query.
 
 
  http://192.168.101.141:8080/solr/select?q=digital+camera&rows=0&facet=on&fq={!tag=Br}Brands:canon&facet.field={!ex=Br}Brands
 
  But if i try after removing all the facet fields from my request hander
 in
  solrconfig.xml, then above query is working fine.
 
  please can you give me a solution. This multi-select faceting will work
 with
  the current implementation or i have to remove all the facet from my
 request
  handler means dynamically i have to send the facet fields through query
 when
  i do multi-select faceting.
 
 
 
  --
  View this message in context:
 http://lucene.472066.n3.nabble.com/Multi-select-faceting-is-not-working-when-facet-fields-are-configured-in-default-request-handler-tp4038768.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 



Re: SolrCore#getIndexDir() contract change between 3.6 and 4.1?

2013-02-07 Thread Gregg Donovan
Thanks, Mark. I created SOLR-4413 [1] for it.

I'm not sure what the best fix is since it looks like a lot of the work at
that time went into refactoring SolrIndexSearcher to use DirectoryFactory
everywhere and index.properties doesn't make much sense when an FSDirectory
is not used...

Anyway, I'll follow up in JIRA.

--Gregg

[1] https://issues.apache.org/jira/browse/SOLR-4413

On Wed, Feb 6, 2013 at 8:42 PM, Mark Miller markrmil...@gmail.com wrote:

 Thanks Gregg - can you file a JIRA issue?

 - Mark

 On Feb 6, 2013, at 5:57 PM, Gregg Donovan gregg...@gmail.com wrote:

  Mark-
 
  You're right that SolrCore#getIndexDir() did not directly read
  index.properties in 3.6. In 3.6, it gets it indirectly from what is
 passed
  to the constructor of SolrIndexSearcher. Here's SolrCore#getIndexDir() in
  3.6:
 
   public String getIndexDir() {
     synchronized (searcherLock) {
       if (_searcher == null)
         return dataDir + "index/";
       SolrIndexSearcher searcher = _searcher.get();
       return searcher.getIndexDir() == null ? dataDir + "index/"
           : searcher.getIndexDir();
     }
   }
 
  In 3.6 the only time I see a new SolrIndexSearcher created without the
  results of SolrCore#getNewIndexDir() getting passed in somehow would be
 if
  SolrCore#newSearcher(String, boolean) is called manually before any other
  SolrIndexSearcher. Otherwise, it looks like getNewIndexDir() is getting
  passed to new SolrIndexSearcher which is then reflected back
  in SolrCore#getIndexDir().
 
  So, in 3.6 we had been able to rely on SolrCore#getIndexDir() giving us
  either the index referenced in index.properties OR dataDir + "index/"
  if index.properties was missing. In 4.1, it always gives us
  dataDir + "index/".
 
  Here's the comment in 3.6 on SolrCore#getNewIndexDir() that I think you
  were referring to. The comment is unchanged in 4.1:
 
   /**
    * Returns the indexdir as given in index.properties. If index.properties
    * exists in dataDir and there is a property <i>index</i> available and it
    * points to a valid directory in dataDir, that is returned; else
    * dataDir/index is returned. Only called for creating new indexSearchers
    * and indexwriters. Use the getIndexDir() method to know the active
    * index directory
    *
    * @return the indexdir as given in index.properties
    */
   public String getNewIndexDir() {
 
  "Use the getIndexDir() method to know the active index directory" is the
  behavior that we were reliant on. Since it's now hardcoded to dataDir +
  "index/", it doesn't always return the active index directory.
 
  --Gregg
 
  On Wed, Feb 6, 2013 at 5:13 PM, Mark Miller markrmil...@gmail.com
 wrote:
 
 
  On Feb 6, 2013, at 4:23 PM, Gregg Donovan gregg...@gmail.com wrote:
 
  code we had that relied on the 3.6 behavior of SolrCore#getIndexDir()
 is
  not working the same way.
 
  Can you be very specific about the different behavior that you are
 seeing?
  What exactly where you seeing and counting on and what are you seeing
 now?
 
  - Mark




Not condition not working for Korean search

2013-02-07 Thread Cool Techi
Hi,

I am no Korean expert and am finding it difficult to fix this. My client is 
searching with the following query, but the NOT condition doesn't seem to be 
working:

(stnostem:((옵티머스 OR 엘지 스마트폰) AND NOT (옵티머스 프라임 OR 프라임)))

The search results (XML attached) include documents containing the 
NOT-condition keywords. How can this be fixed?

Regards,
Ayush
  

AnalyzingSuggester returning index value instead of field value?

2013-02-07 Thread Sebastian Saip
I'm looking into a way to implement an autosuggest for my particular needs
(I'm doing a startsWith search that should retrieve the full name, which
may have accents - however, I want to search with/without accents and in
any upper/lowercase for comfort).

Here's part of my configuration: http://pastebin.com/20vSGJ1a

So I have a name "Têst Námè" and I query for "test", "tést", "TÈST", or
similar. This gives me back "test name" as a suggestion, which looks like
the indexed form rather than the actual value.

Furthermore, when I fed the document without index-analyzers, then added
the index-analyzers, restarted without refeeding and queried, it returned
the right value (so this seems to retrieve the indexed form rather than the
actual stored value?)

Or maybe I just configured it the wrong way :?
There's not really much documentation about this yet :(

BR Sebastian Saip


Re: Maximum Number of Records In Index

2013-02-07 Thread Mikhail Khludnev
Actually, I have a dream to exceed those two billion. It seems possible
to move to VInt in the file format and change int docnums to longs in the
Lucene API. Does anyone know whether it's possible?
And this question is not so esoteric if we are talking about SolrCloud,
which can hold more than 2bn docs in a few smaller shards. Any experience?


On Thu, Feb 7, 2013 at 5:46 PM, Rafał Kuć r@solr.pl wrote:

 Hello!

 Right, my bad - ids are still using int32. However, that still
 gives us 2,147,483,648 possible identifiers right per single index,
 which is not close to the 13,5 millions mentioned in the first mail.

 --
 Regards,
  Rafał Kuć
  Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

  Rafal,

  What about docnums, don't they are limited by int32 ?
  07.02.2013 15:33 пользователь Rafał Kuć r@solr.pl написал:

  Hello!
 
  Practically there is no limit in how many documents can be stored in a
  single index. In your case, as you are using Solr from 2011, there is
  a limitation regarding the number of unique terms per Lucene segment
  (
 
 http://lucene.apache.org/core/old_versioned_docs/versions/3_0_0/fileformats.html#Limitations
  ).
  However I don't think you've hit that. Solr by itself doesn't remove
  documents unless told to do so.
 
  Its hard to guess what can be the reason and as you said, you see
  updates coming to your handler. Maybe new documents have the same
  identifiers that the ones that are already indexed ? As I said, this
  is only a guess and we would need to have more information. Are there
  any exceptions in the logs ? Do you run delete command ? Are your
  index files changed ? How do you run commit ?
 
  --
  Regards,
   Rafał Kuć
   Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch -
 ElasticSearch
 
   I have searched this forum but not yet found a definitive answer, I
  think the
   answer is There is No Limit depends on server specification. But
 never
  the
   less I will say what I have seen and then ask the questions.
 
   From scratch (November 2011) I have set up our SOLR which contains
 data
  from
   various sources, since March 2012 , the number of indexed records
 (unique
   ID's) reached 13.5 million , which was to be expected. However for the
  last
   8 months the number of records in the index has not gone above 13.5
  million,
   yet looking at the request handler outputs I can safely say at least
   anywhere from 50 thousand to 100 thousand records are being indexed
  daily.
   So I am assuming that earlier records are being removed, and I do not
  want
   that.
 
   Question: If there is a limit to the number of records the index can
  store
   where do I find this and change it?
   Question: If there is no limit does anyone have any idea why for the
 last
   months the number has not gone beyond 13.5 million, I can safely say
  that at
   least 90% are new records.
 
   thanks
 
   macroman
 
 
 
   --
   View this message in context:
  
 
 http://lucene.472066.n3.nabble.com/Maximum-Number-of-Records-In-Index-tp4038961.html
   Sent from the Solr - User mailing list archive at Nabble.com.
 
 




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: AnalyzingSuggester returning index value instead of field value?

2013-02-07 Thread Michael McCandless
I'm not very familiar with how AnalyzingSuggester works inside Solr
... if you try this directly with the Lucene APIs does it still
happen?

Hmm maybe one idea: if you remove whitespace from your suggestion does
it work?  I wonder if there's a whitespace / multi-token issue ... if
so then maybe see how TestPhraseSuggestions.java (in Solr) does this?

Mike McCandless

http://blog.mikemccandless.com

On Thu, Feb 7, 2013 at 9:48 AM, Sebastian Saip sebastian.s...@gmail.com wrote:
 I'm looking into a way to implement an autosuggest and for my special needs
 (I'm doing a startsWith-search that should retrieve the full name, which
 may have accents - However, I want to search with/without accents and in
 any upper/lowercase for comfort)

 Here's part of my configuration: http://pastebin.com/20vSGJ1a

 So I have a name=Têst Námè and I query for test, tést, TÈST, or
 similiar. This gives me back test name as a suggestion, which looks like
 the index, rather than the actual value.

 Furthermore, when I fed the document without index-analyzers, then added
 the index-analyzers, restarted without refeeding and queried, it returned
 the right value (so this seems to retrieve the index, rather than the
 actual stored value?)

 Or maybe I just configured it the wrong way :?
 Theres not really much documentation about this yet :(

 BR Sebastian Saip


Re: how-to configure mysql pool connection on Solr Server

2013-02-07 Thread elyograg
 Hi

   I need configure a mysql pool connection on Solr Server for using on
 custom plugin. I saw DataImportHandler wiki:
 http://wiki.apache.org/solr/DataImportHandler , but it's seems that
 DataImportHandler open the  connection when handler is calling and close
 when finish import and I need keep opening pool to reuse connections
 whenever I need them.

 I not found on documentation of Apache Solr how-to define a pools
 connection to DB for reusing them on whatever class of solr.

The dataimport handler is a module in the contrib directory. It is not
part of the core of Solr ... from version 3.1.0, you have to add a jar to
make it work, plus a jar for your JDBC driver.

There is no other database functionality in Solr. Your plugin will have to
manage its own database connections. There might be connection pooling
available from the servlet container that Solr runs in, such as Tomcat,
Jetty, Glassfish, etc. I don't know a lot about that.

Thanks,
Shawn





Re: Calculate score according to another indexed field

2013-02-07 Thread Jonas Birgander

On 2013-02-07 14:58, Pragyanshis Pattanaik wrote:

Hi,


Hi,


My schema is like below
fields
 field name=ProductId type=int indexed=true stored=true /
 field name=ProductName type=string indexed=true stored=true 
required=true /
 field name=ProductDesription type=string indexed=true stored=true 
required=true /
 field name=ProductRating type=int indexed=true stored=true 
required=true /
/fields

Product name will be passed as q parameter to solr.
Is there a way to affect score on the basis of ProductRating which is not 
passed as query parameter ?


You can use a boost function to achieve this.
There are examples in the wiki: 
http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_change_the_score_of_a_document_based_on_the_.2Avalue.2A_of_a_field_.28say.2C_.22popularity.22.29


A quick example:
defType=dismax&qf=text&q=supervillians&bf=sqrt(ProductRating)


Regards,
--
Jonas Birgander


Re: AnalyzingSuggester returning index value instead of field value?

2013-02-07 Thread Sebastian Saip
It's the same with whitespace removed unfortunately - still getting back
"testname" then.
I'm not quite sure how to test this via the Lucene API - in particular, how
to define the KeywordTokenizer with ASCII+LowerCase filters - so I can't test
this atm :/

BR Sebastian Saip


On 7 February 2013 16:19, Michael McCandless luc...@mikemccandless.comwrote:

 I'm not very familiar with how AnalyzingSuggester works inside Solr
 ... if you try this directly with the Lucene APIs does it still
 happen?

 Hmm maybe one idea: if you remove whitespace from your suggestion does
 it work?  I wonder if there's a whitespace / multi-token issue ... if
 so then maybe see how TestPhraseSuggestions.java (in Solr) does this?

 Mike McCandless

 http://blog.mikemccandless.com

 On Thu, Feb 7, 2013 at 9:48 AM, Sebastian Saip sebastian.s...@gmail.com
 wrote:
  I'm looking into a way to implement an autosuggest and for my special
 needs
  (I'm doing a startsWith-search that should retrieve the full name,
 which
  may have accents - However, I want to search with/without accents and in
  any upper/lowercase for comfort)
 
  Here's part of my configuration: http://pastebin.com/20vSGJ1a
 
  So I have a name=Têst Námè and I query for test, tést, TÈST, or
  similiar. This gives me back test name as a suggestion, which looks
 like
  the index, rather than the actual value.
 
  Furthermore, when I fed the document without index-analyzers, then added
  the index-analyzers, restarted without refeeding and queried, it returned
  the right value (so this seems to retrieve the index, rather than the
  actual stored value?)
 
  Or maybe I just configured it the wrong way :?
  Theres not really much documentation about this yet :(
 
  BR Sebastian Saip



Re: how-to configure mysql pool connection on Solr Server

2013-02-07 Thread Michael Della Bitta
Hello Miguel,

If you set up a JNDI datasource in your servlet container, you can use
that as your database config. Then you just need to use a pooling
datasource:

http://wiki.apache.org/solr/DataImportHandlerFaq#How_do_I_use_a_JNDI_DataSource.3F
http://dev.mysql.com/tech-resources/articles/connection_pooling_with_connectorj.html
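
As a concrete sketch of those two pieces - assuming Tomcat and a hypothetical pool named jdbc/mysolrdb; adjust names, driver and pool settings to your setup:

```xml
<!-- Tomcat context.xml: the container manages the pooled DataSource -->
<Resource name="jdbc/mysolrdb" auth="Container" type="javax.sql.DataSource"
          driverClassName="com.mysql.jdbc.Driver"
          url="jdbc:mysql://localhost:3306/mydb"
          username="solr" password="secret"
          maxActive="10" maxIdle="4"/>

<!-- data-config.xml: DIH references the pool by JNDI name
     instead of opening raw JDBC connections itself -->
<dataSource type="JdbcDataSource" jndiName="java:comp/env/jdbc/mysolrdb"/>
```

A custom plugin can borrow connections from the same pool by looking up java:comp/env/jdbc/mysolrdb via JNDI.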


Michael Della Bitta


Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Thu, Feb 7, 2013 at 7:20 AM, Miguel
miguel.valen...@juntadeandalucia.es wrote:
 Hi

 I need to configure a MySQL connection pool on the Solr server for use in a
 custom plugin. I saw the DataImportHandler wiki:
 http://wiki.apache.org/solr/DataImportHandler , but it seems that
 DataImportHandler opens the connection when the handler is called and closes it
 when the import finishes, and I need to keep a pool open so connections can be
 reused whenever I need them.

 I have not found in the Apache Solr documentation how to define a connection
 pool to the DB for reuse in any Solr class.
 Any ideas?

 thanks



Re: IP Address as number

2013-02-07 Thread Isaac Hebsh
Small addition:
To support querying, I probably have to implement an analyzer (query time)...
Can an analyzer be configured on a numeric (i.e. non-text) field?


On Thu, Feb 7, 2013 at 6:48 PM, Isaac Hebsh isaac.he...@gmail.com wrote:

 Hi.

 I have to index field which contains an IP address.
 Users want to query this field using RANGE queries. to support this, the
 IP is stored as its DWORD value (assume it is IPv4...). On the other side,
 users supply the IP addresses textually (xxx.xxx.xxx.xxx).

 I can write a new field type, extending TrieLongField, which will change the
 textual representation to a numeric one.
 But what about stored field retrieval? I want to return the textual
 form... maybe a search component which changes the stored fields?

 Has anyone encountered this need before?
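
Whatever form the custom field type takes, its core is the dotted-quad/long conversion. A minimal, self-contained sketch of just that conversion (class and method names are illustrative, not part of any Solr API):

```java
public class IpConvert {
    // Convert a dotted-quad IPv4 string to its unsigned 32-bit value,
    // held in a long so 255.255.255.255 stays positive.
    public static long ipToLong(String ip) {
        long value = 0;
        for (String octet : ip.split("\\.")) {
            value = (value << 8) | Integer.parseInt(octet);
        }
        return value;
    }

    // Convert the numeric form back to dotted-quad for display,
    // i.e. what a stored-field rewriting component would return.
    public static String longToIp(long value) {
        return String.format("%d.%d.%d.%d",
                (value >> 24) & 0xff, (value >> 16) & 0xff,
                (value >> 8) & 0xff, value & 0xff);
    }

    public static void main(String[] args) {
        System.out.println(ipToLong("192.168.101.141"));   // prints 3232261517
        System.out.println(longToIp(3232261517L));         // prints 192.168.101.141
    }
}
```

Indexing the long value makes numeric range queries (e.g. [ipToLong(a) TO ipToLong(b)]) behave as users expect.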



Re: Best way to perform search on several fields

2013-02-07 Thread Mikhail Khludnev
http://wiki.apache.org/solr/ExtendedDisMax
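
The per-field boosting asked about below is what edismax's qf parameter does - one query string searched across several fields, each with its own boost. A sketch with made-up field names and boosts:

```
q=digital camera&defType=edismax&qf=title^5 description^2 keywords^0.5
```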


On Thu, Feb 7, 2013 at 6:53 PM, Pragyanshis Pattanaik 
pragyans...@outlook.com wrote:

 Thanks for the reply

 But I need to boost each field, so the first approach might not be
 applicable here, right?

  Date: Thu, 7 Feb 2013 06:08:25 -0800
  From: marc.sturl...@gmail.com
  To: solr-user@lucene.apache.org
  Subject: Re: Best way to perform search on several fields
 
  If you don't care about giving different boosts depending on the field
 that
  matches, the approach of copying all field in just one is going to be
  faster.
 
 
 
  --
  View this message in context:
 http://lucene.472066.n3.nabble.com/Best-way-to-perform-search-on-several-fields-tp4038996p4039002.html
  Sent from the Solr - User mailing list archive at Nabble.com.





-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: What is the graceful shutdown API for Solrj embedded?

2013-02-07 Thread Ali, Saqib
Hello Alex,

I asked a similar question on server fault:
http://serverfault.com/a/474442/156440


On Wed, Feb 6, 2013 at 7:05 PM, Alexandre Rafalovitch arafa...@gmail.comwrote:

 Hello,

 When I CTRL-C the example Solr, it prints a bunch of graceful shutdown
 messages.  I assume it shuts down safe and without corruption issues.

 When I do that to Solrj (embedded, not remote), it just drops dead.

 I found CoreContainer.shutdown(), which looks about right and does
 terminate Solrj but it prints out a completely different set of messages.

 Is CoreContainer.shutdown() the right method for Solrj (4.1)? Is there more
 than just one call?

 And what happens if you just Ctrl-C Solrj instance? Wiki says nothing about
 shutdown, so I can imagine a lot of people probably think it is ok to just
 kill it. Is there a danger of corruption?

 Regards,
 Alex.
 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all at
 once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)



Re: AnalyzingSuggester returning index value instead of field value?

2013-02-07 Thread Sebastian Saip
The solution, as pointed out on
http://stackoverflow.com/questions/14732713/solr-autosuggest-with-diacritics/14743278
,
is not to use a copyField but instead use the AnalyzingSuggester on the
StrField directly.

Cheers!
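
For completeness, that setup can be sketched in solrconfig.xml roughly as follows - the field and type names here are placeholders; the point is that "field" is the stored StrField itself, while all the analysis (keyword tokenizing, lowercasing, ASCII folding) lives only in the suggestAnalyzerFieldType:

```xml
<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingLookupFactory</str>
    <!-- the stored StrField holding the original values, e.g. "Têst Námè" -->
    <str name="field">name</str>
    <!-- analysis for matching test / tést / TÈST happens via this type -->
    <str name="suggestAnalyzerFieldType">text_suggest</str>
  </lst>
</searchComponent>
```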


On 7 February 2013 17:30, Sebastian Saip sebastian.s...@gmail.com wrote:

 It's the same with whitespace removed unfortunately - still getting back
 testname then.
 I'm not quite sure how to test this via the Lucene API - in particular,
 how to define the KeywordTokenizer with ASCII+LowerCase, so I can't test
 this atm :/

 BR Sebastian Saip


 On 7 February 2013 16:19, Michael McCandless luc...@mikemccandless.comwrote:

 I'm not very familiar with how AnalyzingSuggester works inside Solr
 ... if you try this directly with the Lucene APIs does it still
 happen?

 Hmm maybe one idea: if you remove whitespace from your suggestion does
 it work?  I wonder if there's a whitespace / multi-token issue ... if
 so then maybe see how TestPhraseSuggestions.java (in Solr) does this?

 Mike McCandless

 http://blog.mikemccandless.com

 On Thu, Feb 7, 2013 at 9:48 AM, Sebastian Saip sebastian.s...@gmail.com
 wrote:
  I'm looking into a way to implement an autosuggest and for my special
 needs
  (I'm doing a startsWith-search that should retrieve the full name,
 which
  may have accents - However, I want to search with/without accents and in
  any upper/lowercase for comfort)
 
  Here's part of my configuration: http://pastebin.com/20vSGJ1a
 
  So I have a name="Têst Námè" and I query for "test", "tést", "TÈST", or
  similar. This gives me back "test name" as a suggestion, which looks like
  the index, rather than the actual value.
 
  Furthermore, when I fed the document without index-analyzers, then added
  the index-analyzers, restarted without refeeding and queried, it
 returned
  the right value (so this seems to retrieve the index, rather than the
  actual stored value?)
 
  Or maybe I just configured it the wrong way :?
   There's not really much documentation about this yet :(
 
  BR Sebastian Saip





Re: AnalyzingSuggester returning index value instead of field value?

2013-02-07 Thread Alexandre Rafalovitch
Glad it helped. :-)

Now, if you could write this up as a full example and explanation, I am
sure the Solr community would benefit from it as well. If you don't have your
own blog, I would be happy to guest-host it, as would, I am sure, at least a
couple more people/organizations.

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Thu, Feb 7, 2013 at 12:44 PM, Sebastian Saip sebastian.s...@gmail.comwrote:

 The solution, as pointed out on

 http://stackoverflow.com/questions/14732713/solr-autosuggest-with-diacritics/14743278
 ,
 is not to use a copyField but instead use the AnalyzingSuggester on the
 StrField directly.

 Cheers!


 On 7 February 2013 17:30, Sebastian Saip sebastian.s...@gmail.com wrote:

  It's the same with whitespace removed unfortunately - still getting back
  testname then.
  I'm not quite sure how to test this via the Lucene API - in particular,
  how to define the KeywordTokenizer with ASCII+LowerCase, so I can't test
  this atm :/
 
  BR Sebastian Saip
 
 
  On 7 February 2013 16:19, Michael McCandless luc...@mikemccandless.com
 wrote:
 
  I'm not very familiar with how AnalyzingSuggester works inside Solr
  ... if you try this directly with the Lucene APIs does it still
  happen?
 
  Hmm maybe one idea: if you remove whitespace from your suggestion does
  it work?  I wonder if there's a whitespace / multi-token issue ... if
  so then maybe see how TestPhraseSuggestions.java (in Solr) does this?
 
  Mike McCandless
 
  http://blog.mikemccandless.com
 
  On Thu, Feb 7, 2013 at 9:48 AM, Sebastian Saip 
 sebastian.s...@gmail.com
  wrote:
   I'm looking into a way to implement an autosuggest and for my special
  needs
   (I'm doing a startsWith-search that should retrieve the full name,
  which
   may have accents - However, I want to search with/without accents and
 in
   any upper/lowercase for comfort)
  
   Here's part of my configuration: http://pastebin.com/20vSGJ1a
  
   So I have a name="Têst Námè" and I query for "test", "tést", "TÈST", or
   similar. This gives me back "test name" as a suggestion, which looks like
   the index, rather than the actual value.
  
   Furthermore, when I fed the document without index-analyzers, then
 added
   the index-analyzers, restarted without refeeding and queried, it
  returned
   the right value (so this seems to retrieve the index, rather than the
   actual stored value?)
  
   Or maybe I just configured it the wrong way :?
    There's not really much documentation about this yet :(
  
   BR Sebastian Saip
 
 
 



Can you call the elevation component in another requesthandler?

2013-02-07 Thread eShard
Good day,
I got my elevation component working with the /elevate handler. 
However, I would like to add the elevation component to my main search
handler which is currently /query.
so I can have one handler return everything (elevated items with regular
search results; i.e. one-stop shopping, so to speak).
This is what I tried:
<requestHandler name="/query" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="wt">xml</str>
    <str name="indent">true</str>
    <str name="df">text</str>
  </lst>
  <arr name="last-components">
    <str>elevator</str>
    <str>manifoldCFSecurity</str>
  </arr>
</requestHandler>

I also tried it in first-components as well.
Is there any way to combine these? Otherwise the UI will have to make
separate ajax calls and we're trying to minimize that.
Thanks,

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Can-you-call-the-elevation-component-in-another-requesthandler-tp4039054.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: What is the graceful shutdown API for Solrj embedded?

2013-02-07 Thread Alexandre Rafalovitch
Thanks, but it is not quite the same. I am talking about SolrJ, where Solr
is hosted within an application, not in a servlet container.

Regards,
  Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Thu, Feb 7, 2013 at 12:07 PM, Ali, Saqib docbook@gmail.com wrote:

 Hello Alex,

 I asked a similar question on server fault:
 http://serverfault.com/a/474442/156440


 On Wed, Feb 6, 2013 at 7:05 PM, Alexandre Rafalovitch arafa...@gmail.com
 wrote:

  Hello,
 
  When I CTRL-C the example Solr, it prints a bunch of graceful shutdown
  messages.  I assume it shuts down safe and without corruption issues.
 
  When I do that to Solrj (embedded, not remote), it just drops dead.
 
  I found CoreContainer.shutdown(), which looks about right and does
  terminate Solrj but it prints out a completely different set of messages.
 
  Is CoreContainer.shutdown() the right method for Solrj (4.1)? Is there
 more
  than just one call?
 
  And what happens if you just Ctrl-C Solrj instance? Wiki says nothing
 about
  shutdown, so I can imagine a lot of people probably think it is ok to
 just
  kill it. Is there a danger of corruption?
 
  Regards,
  Alex.
  Personal blog: http://blog.outerthoughts.com/
  LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
  - Time is the quality of nature that keeps events from happening all at
  once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)
 



Re: What is the graceful shutdown API for Solrj embedded?

2013-02-07 Thread Ahmet Arslan
Hi,

I think yes, CoreContainer.shutdown() is the right method for the embedded Solr
server.

But I believe the embedded Solr server is not preferred anymore since the
javabin codec, and it is not as well tested as the HTTP server.


--- On Thu, 2/7/13, Alexandre Rafalovitch arafa...@gmail.com wrote:

 From: Alexandre Rafalovitch arafa...@gmail.com
 Subject: Re: What is the graceful shutdown API for Solrj embedded?
 To: solr-user@lucene.apache.org
 Date: Thursday, February 7, 2013, 8:01 PM
 Thanks, but it is not quite the same.
 I am talking about SolrJ, where Solr
 is hosted within an application, not in a servlet
 container.
 
 Regards,
   Alex.
 
 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from
 happening all at
 once. Lately, it doesn't seem to be working. 
 (Anonymous  - via GTD book)
 
 
 On Thu, Feb 7, 2013 at 12:07 PM, Ali, Saqib docbook@gmail.com
 wrote:
 
  Hello Alex,
 
  I asked a similar question on server fault:
  http://serverfault.com/a/474442/156440
 
 
  On Wed, Feb 6, 2013 at 7:05 PM, Alexandre Rafalovitch
 arafa...@gmail.com
  wrote:
 
   Hello,
  
   When I CTRL-C the example Solr, it prints a bunch
 of graceful shutdown
   messages.  I assume it shuts down safe and
 without corruption issues.
  
   When I do that to Solrj (embedded, not remote), it
 just drops dead.
  
   I found CoreContainer.shutdown(), which looks
 about right and does
   terminate Solrj but it prints out a completely
 different set of messages.
  
   Is CoreContainer.shutdown() the right method for
 Solrj (4.1)? Is there
  more
   than just one call?
  
   And what happens if you just Ctrl-C Solrj
 instance? Wiki says nothing
  about
   shutdown, so I can imagine a lot of people
 probably think it is ok to
  just
   kill it. Is there a danger of corruption?
  
   Regards,
       Alex.
   Personal blog: http://blog.outerthoughts.com/
   LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
   - Time is the quality of nature that keeps events
 from happening all at
   once. Lately, it doesn't seem to be working. 
 (Anonymous  - via GTD book)
  
 



Re: Can you call the elevation component in another requesthandler?

2013-02-07 Thread eShard
Update:
OK - if I search for "gangnam style" in the /query handler by itself, elevation
works!
If I search with "gangnam style" and/or something else, the elevation component
doesn't work, but the rest of the query does.

here's the examples:
works:
/query?q=gangnam+style&fl=*,[elevated]&wt=xml&start=0&rows=50&debugQuery=true&dismax=true

elevation fails:
/query?q=gangnam+style+OR+title%3A*White*&fl=*,[elevated]&wt=xml&start=0&rows=50&debugQuery=true&dismax=true

So I guess I have to do separate queries at this point.
Is there a way to combine these 2 request handlers?

Thanks,

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Can-you-call-the-elevation-component-in-another-requesthandler-tp4039054p4039076.html
Sent from the Solr - User mailing list archive at Nabble.com.
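
One thing worth checking before splitting into two queries: QueryElevationComponent matches the entire q string (after analysis by its configured queryFieldType) against the text attribute of an entry in elevate.xml, so "gangnam style" can match an entry while "gangnam style OR title:*White*" is a different string and matches nothing. A sketch of the relevant elevate.xml entry (the doc id is a placeholder for a real uniqueKey value):

```xml
<elevate>
  <query text="gangnam style">
    <doc id="video-123"/>
  </query>
</elevate>
```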


Solrj how to use TrieDoubleField

2013-02-07 Thread dm_tim
Howdy,

I have a Solr implementation that allows me to do a geospatial search and
I'm trying to replicate it using the solrj libs. The schema.xml that I'm
using looks like this:
<schema name="v3_geo" version="1.1">
  <types>
    <fieldtype name="string" class="solr.StrField" sortMissingLast="true"
               omitNorms="true"/>
    <fieldType name="int" class="solr.TrieIntField" precisionStep="0"
               positionIncrementGap="0"/>
    <fieldType name="float" class="solr.TrieFloatField" precisionStep="0"
               positionIncrementGap="0"/>
    <fieldType name="long" class="solr.TrieLongField" precisionStep="0"
               positionIncrementGap="0"/>
    <fieldType name="date" class="solr.TrieDateField" precisionStep="0"
               positionIncrementGap="0"/>
    <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8"
               omitNorms="true" positionIncrementGap="0"/>
    <fieldType name="location" class="solr.LatLonType"
               subFieldSuffix="_coordinate"/>
    <fieldType name="text_en" class="solr.TextField"
               positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
    <fieldType name="text_general" class="solr.TextField"
               positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
  </types>

  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="cid" type="long" indexed="true" stored="true" required="true"/>
    <field name="lang" type="string" indexed="true" stored="true" required="true"/>
    <field name="file_version" type="int" indexed="true" stored="true" required="true"/>
    <field name="name" type="text_general" indexed="true" stored="true" required="true"/>
    <field name="loc" type="location" indexed="true" stored="true" required="true"/>
    <field name="created" type="date" indexed="false" stored="true"/>
    <field name="last_modified" type="date" indexed="true" stored="true"/>
    <field name="version" type="long" indexed="true" stored="true"/>
    <field name="_version_" type="long" indexed="true" stored="true" multiValued="false"/>
    <dynamicField name="*_coordinate" type="tdouble" indexed="true"
                  stored="false" multiValued="false"/>
  </fields>

  <uniqueKey>id</uniqueKey>

  <defaultSearchField>name</defaultSearchField>

  <solrQueryParser defaultOperator="AND"/>
</schema>

And it works perfectly. Now I'm trying to write code to create an index
using the same fields. I have previously created other indexes just fine (by
creating an Analyzer and an IndexWriter and writing Document objects) and I
will be reusing the same Analyzer I used before. The problem specifically
lies in creating the field (a TrieDoubleField) for the lat/lon data. Is
there an example of that somewhere that I could plagiarize?

Regards,

Tim



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solrj-how-to-use-TrieDoubleField-tp4039083.html
Sent from the Solr - User mailing list archive at Nabble.com.
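
If the end goal is an index this schema can serve, an alternative to hand-building Lucene Documents is to send SolrInputDocuments through SolrJ and let Solr's own field types do the work - LatLonType takes a "lat,lon" string and derives the tdouble *_coordinate subfields itself. A sketch under that assumption (server URL and field values are made up):

```java
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class GeoIndexExample {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/v3_geo");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "42");
        doc.addField("cid", 1001L);
        doc.addField("lang", "en");
        doc.addField("file_version", 1);
        doc.addField("name", "coffee shop");
        // LatLonType accepts "lat,lon"; Solr populates the
        // loc_0_coordinate / loc_1_coordinate tdouble subfields.
        doc.addField("loc", "37.7749,-122.4194");

        server.add(doc);
        server.commit();
        server.shutdown();
    }
}
```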


Sorting grouped results by numfound

2013-02-07 Thread Tony Paloma
For some reason I thought Solr 4.0 brought us the ability to sort groups by 
numFound when using field collapsing, but now I can't find any info about how 
to do that. Did I make it all up? Might I be able to fake it using pseudo 
fields somehow?

Thanks,
Tony


Schema/config changes and re-indexing in a SolrCloud setup

2013-02-07 Thread Steffen Elberg Godskesen

Hi Solr community

I'm in the process of getting my mind set straight on SolrCloud; more 
specifically: trying to design a feasible workflow for a use-case where we 
currently use master/slave replication. First, the use case:

We want to
  1. separate indexing workload from query workload
  2. deploying config and/or schema changes without interrupting queries

Currently we do (1) with a straightforward master/slave replication setup: N 
master shards that handle updates and N slave shards replicating from these. In 
this setup we can do (2) by temporarily stopping replication, deploying the new 
configuration/schema to the master shards, possibly re-indexing, switching queries 
to go to the master shards, re-enabling replication, and - when replication has 
finished - switching queries back to the slave shards.

So... introducing SolrCloud. We would really like to utilize SolrCloud, 
especially for the added fault-tolerance and simpler distributed indexing, but 
I'm a bit puzzled on how to achieve something similar to the above.

Re (1): Am I right in thinking that a given update is sent to every replica of 
the shard to which it belongs for analysis and indexing? And that there is no 
immediate way to separate indexing from queries within a collection? 

Re (2): Deploying new schema/config should be as simple as uploading to 
ZooKeeper and reloading cores. Right? So for the case where the new 
config/schema is compatible with the index we're good. For the other case, I 
think we could do it by: Create a new collection, upload the new config/schema 
to zookeeper, index into the new collection, switch queries to the new 
collection, delete the old collection. Would this be the way to go? Or is there 
a simpler way that I cannot see?


Just to bring the scale of our operation into it: Our index is approx. 200 
million documents, with a total index size around 0.5TB. The normal flow of 
updates is in the order of a few million/day, but we will frequently (say on a 
weekly basis) need to re-index all or large parts of our documents. Either due 
to schema changes or re-processing of the original data.


Sorry for dumping my brain on you, but any input you might have on this, will 
be highly appreciated.

Regards, 

-- 
Steffen Elberg Godskesen
Programmer
DTU Library
---
Technical University of Denmark
Technical Information Center of Denmark
Anker Engelunds Vej 1
PO Box 777
Building 101D
2800 Kgs. Lyngby
s...@dtic.dtu.dk
http://www.dtic.dtu.dk/





Re: What is the graceful shutdown API for Solrj embedded?

2013-02-07 Thread Alexandre Rafalovitch
Looks like SolrServer.shutdown() is the more standard approach. For
embedded server, it just calls container.shutdown() anyway.

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Thu, Feb 7, 2013 at 1:58 PM, Ahmet Arslan iori...@yahoo.com wrote:

 Hi,

 I think yes CoreContainer.shutdown() is the right method for embedded solr
 server.

 But i believe embedded solr server is not preferred anymore after javabin
 codec. And it is not well tested compared to http server.


 --- On Thu, 2/7/13, Alexandre Rafalovitch arafa...@gmail.com wrote:

  From: Alexandre Rafalovitch arafa...@gmail.com
  Subject: Re: What is the graceful shutdown API for Solrj embedded?
  To: solr-user@lucene.apache.org
  Date: Thursday, February 7, 2013, 8:01 PM
  Thanks, but it is not quite the same.
  I am talking about SolrJ, where Solr
  is hosted within an application, not in a servlet
  container.
 
  Regards,
Alex.
 
  Personal blog: http://blog.outerthoughts.com/
  LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
  - Time is the quality of nature that keeps events from
  happening all at
  once. Lately, it doesn't seem to be working.
  (Anonymous  - via GTD book)
 
 
  On Thu, Feb 7, 2013 at 12:07 PM, Ali, Saqib docbook@gmail.com
  wrote:
 
   Hello Alex,
  
   I asked a similar question on server fault:
   http://serverfault.com/a/474442/156440
  
  
   On Wed, Feb 6, 2013 at 7:05 PM, Alexandre Rafalovitch
  arafa...@gmail.com
   wrote:
  
Hello,
   
When I CTRL-C the example Solr, it prints a bunch
  of graceful shutdown
messages.  I assume it shuts down safe and
  without corruption issues.
   
When I do that to Solrj (embedded, not remote), it
  just drops dead.
   
I found CoreContainer.shutdown(), which looks
  about right and does
terminate Solrj but it prints out a completely
  different set of messages.
   
Is CoreContainer.shutdown() the right method for
  Solrj (4.1)? Is there
   more
than just one call?
   
And what happens if you just Ctrl-C Solrj
  instance? Wiki says nothing
   about
shutdown, so I can imagine a lot of people
  probably think it is ok to
   just
kill it. Is there a danger of corruption?
   
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events
  from happening all at
once. Lately, it doesn't seem to be working.
  (Anonymous  - via GTD book)
   
  
 



Re: Schema/config changes and re-indexing in a SolrCloud setup

2013-02-07 Thread Mark Miller

On Feb 7, 2013, at 5:23 PM, Steffen Elberg Godskesen 
steffen.godske...@gmail.com wrote:

 Re (1): Am I right in thinking that a given update is sent to every replica 
 of the shard to which it belongs for analysis and indexing? And that there is 
 no immediate way to separate indexing from queries within a collection?

Right - this is required not only for NRT, but also for our 
durability/reliability promises.

If you want to separate out indexing, it's probably best to build the indexes 
offline, move them to the box and merge or place them into each Solr node. 

You might want to continue using the old master-slave architecture as one 
option.

With a small amount of dev, having some polling replication for the index side 
and using SolrCloud for the search side might be possible, though not 
necessarily a perfect marriage.

- Mark

 
 
 Re (2): Deploying new schema/config should be as simple as uploading to 
 ZooKeeper and reloading cores. Right? So for the case where the new 
 config/schema is compatible with the index we're good. For the other case, I 
 think we could do it by: Create a new collection, upload the new 
 config/schema to zookeeper, index into the new collection, switch queries to 
 the new collection, delete the old collection. Would this be the way to go? 
 Or is there a simpler way that I cannot see?

Yes, you can just deploy new config and reload. Reindexing is a different 
story. You have a variety of options - one would be to create a new cluster 
with the reindexed data and then flip over. I don't think any of that is really 
specific to SolrCloud vs Master-Slave.

- Mark
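
For the compatible-config case, the upload-and-reload cycle can be sketched as follows (ZK host, paths and collection name are examples):

```
# push the edited config set to ZooKeeper
cloud-scripts/zkcli.sh -zkhost zk1:2181 -cmd upconfig \
    -confdir /path/to/conf -confname myconf

# reload all cores of the collection via the Collections API
curl 'http://solr1:8983/solr/admin/collections?action=RELOAD&name=mycollection'
```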



Re: SolrCloud new zookeper node on different ip/ replicate between two clasters

2013-02-07 Thread Jan Høydahl
You should run replicated ZK: 
http://zookeeper.apache.org/doc/trunk/zookeeperStarted.html#sc_RunningReplicatedZooKeeper
Give Solr the list of all ZK's and you're good to go

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

7. feb. 2013 kl. 21:55 skrev mizayah miza...@gmail.com:

 Hey,
 
 I just wonder, is there any way to tell a Solr node that my ZooKeeper instance
 is down and a new one is on another IP?
 
 What I want to achieve is to have one ZooKeeper instance for a cluster of Solr
 nodes. When a failure occurs I will set up a new ZooKeeper instance. But do I
 have to restart all of the Solr nodes? Or is there a way to inform them that
 ZooKeeper has a new IP, and make them reconnect to it?
 
 
 --
 The other thing I wanted to ask is how to replicate between two clusters.
 Imagine that I have two clusters of Solr in different zones, let's say for
 failover purposes.
 
 Having them all in one single cluster covered by ZooKeeper will make a leader
 appear in a zone where I probably don't want to have one. Can I somehow specify
 an order for the nodes I want to become leader in case the earlier leader dies?
 
 
 
 
 
 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/SolrCloud-new-zookeper-node-on-different-ip-replicate-between-two-clasters-tp4039101.html
 Sent from the Solr - User mailing list archive at Nabble.com.



LocalParam tag does not work when is placed in brackets

2013-02-07 Thread Karol Sikora

Hi all,

I'm struggling with strange local params behaviour when {!tag=something}
is inside a bracketed term.

This case:
facet.field={!ex=d0feea8}category&fq={!tag=d0feea8}category:5 AND
type:DOCUMENT

works as I expected, but:
facet.field={!ex=d0feea8}category&fq=({!tag=d0feea8}category:5 OR
otherField:otherValue) AND type:DOCUMENT

does not.
Is there a special syntax for dealing with such cases?

Thanks in advance for your help.

--
 
Karol Sikora




RE: LocalParam tag does not work when is placed in brackets

2013-02-07 Thread Michael Ryan
I'm pretty sure the local params have to be at the very start of the query. But 
you should be able to do this with nested queries. Try this...

fq=_query_:"{!tag=d0feea8}category:\"5\" OR otherField:\"otherValue\" AND 
type:DOCUMENT"

-Michael

-Original Message-
From: Karol Sikora [mailto:karol.sik...@laboratorium.ee] 
Sent: Thursday, February 07, 2013 7:51 PM
To: solr-user@lucene.apache.org
Subject: LocalParam tag does not work when is placed in brackets

Hi all,

I'm struggling with strange local params behaviour when {!tag=something} is 
inside a bracketed term.
This case:
facet.field={!ex=d0feea8}category&fq={!tag=d0feea8}category:5 AND 
type:DOCUMENT works as I expected, but:
facet.field={!ex=d0feea8}category&fq=({!tag=d0feea8}category:5 OR
otherField:otherValue) AND type:DOCUMENT does not.
Is there a special syntax for dealing with such cases?

Thanks in advance for your help.

-- 
  
Karol Sikora



Grouping results - set document return count not group.limit

2013-02-07 Thread Rajani Maski
Hi all,

Is there any parameter which will set the number of documents returned
after applying grouping on results? Like we have query.setRows for results
without grouping?


I know all the below parameters apply to grouping, but they will not
limit the number of documents returned. We want to get all the results belonging
to each group (group.limit=-1) and display only 20 records at a time (the
documents returned should be limited to a given integer). Is there any param for this?

- rows (integer): The number of groups to return. The default value is 10.
- start (integer): Specifies an initial offset for the list of groups.
- group.limit (integer): Specifies the number of results to return for each group. The default value is 1.


Re: Updating data

2013-02-07 Thread anurag.jain
This was my previous data:

[
   {
      "id": 1,
      "movie_name": "Twelve Monkeys",
      "genre": [
         "Children's",
         "Comedy",
         "Drama",
         "Sci-Fi"
      ],
      "release_year": 1995,
      "url": "http://us.imdb.com/M/title-exact?Twelve%20Monkeys%20(1995)",
      "rating": 8.1,
      "total_viewed": 432121
   },
   {
      "id": 2,
      "movie_name": "Toy Story",
      "genre": [
         "Children's",
         "Comedy",
         "Crime"
      ],
      "release_year": 1995,
      "url": "http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)",
      "rating": 7.1,
      "total_viewed": 5423
   },
   {
      "id": 3,
      "movie_name": "Copycat",
      "genre": [
         "Children's",
         "Comedy",
         "Drama",
         "Horror"
      ],
      "release_year": 1998,
      "url": "http://us.imdb.com/M/title-exact?Copycat%20(1998)",
      "rating": 2.1,
      "total_viewed": 54323
   },
   {
      "id": 4,
      "movie_name": "Crumb",
      "genre": [
         "Comedy",
         "Drama",
         "Sci-Fi"
      ],
      "release_year": 1998,
      "url": "http://us.imdb.com/M/title-exact?Crumb%20(1998)",
      "rating": 4.1,
      "total_viewed": 5123
   },
   {
      "id": 5,
      "movie_name": "Young Guns",
      "genre": [
         "Comedy",
         "Drama",
         "Sci-Fi",
         "War"
      ],
      "release_year": 2012,
      "url": "http://us.imdb.com/M/title-exact?Young%20Guns%20(1988)",
      "rating": 9.1,
      "total_viewed": 524323
   }
]


I also want to add:

[
   { "id": 1, "is_good": 1 },
   { "id": 2, "is_good": 1 },
   { "id": 3, "is_good": 0 },
   { "id": 4, "is_good": 0 },
   { "id": 5, "is_good": 1 }
]


and here is my schema.xml part... 


<field name="id" type="string" indexed="true" stored="true" required="true"
       multiValued="false"/>
<field name="movie_name" type="lowercase" indexed="true" stored="true"
       multiValued="false"/>
<field name="genre" type="comaSplit" indexed="true" stored="true"
       multiValued="true"/>
<field name="release_year" type="string" indexed="true" stored="true"
       multiValued="false"/>
<field name="url" type="string" indexed="false" stored="true"
       multiValued="false"/>
<field name="rating" type="float" indexed="true" stored="true"
       multiValued="false"/>
<field name="total_viewed" type="int" indexed="true" stored="true"
       multiValued="false"/>
<field name="is_good" type="int" indexed="true" stored="true"
       multiValued="false"/>



When I try to update with the 2nd data set, it erases the first data.

Please reply.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Updating-data-tp4038492p4039178.html
Sent from the Solr - User mailing list archive at Nabble.com.
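
Regarding the erased fields above: adding a document with an existing id replaces the whole document. Solr 4's atomic updates cover this case - assuming updateLog is enabled in solrconfig.xml and (as in this schema) the fields are stored, a "set" on one field leaves the others intact. A sketch:

```
curl 'http://localhost:8983/solr/update?commit=true' \
     -H 'Content-Type: application/json' \
     -d '[{"id": "1", "is_good": {"set": 1}},
          {"id": "2", "is_good": {"set": 1}},
          {"id": "3", "is_good": {"set": 0}}]'
```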


Re: Grouping results - set document return count not group.limit

2013-02-07 Thread Prakhar Birla
Hi Rajani,

I recently tried to solve a similar problem to the one you have. (I think)
Solr doesn't support a param to achieve this because, if we were to limit
the number of documents returned, then to get the next result set the starting
offset of each group would be different, based on the number of
documents per group in the first page.

My problem was a little more complex as I had to limit the number of
documents differently per group and paginate them together. We solved this
by using Solr 4.0 with a patch from JIRA (
https://issues.apache.org/jira/browse/SOLR-1093) which allowed execution of
multiple queries in parallel threads along with a few enhancements that
have not been made public yet by the company I work for.
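To make the distinction concrete: in a grouped query, rows caps the number of groups, not the flat total of documents. A sketch of building such a query on the client side (the grouping field name here is hypothetical):

```python
from urllib.parse import urlencode

# Grouped query: "rows" limits the number of GROUPS returned, while
# group.limit caps documents per group; there is no stock parameter
# that caps the flat total of documents across all groups.
params = {
    "q": "*:*",
    "group": "true",
    "group.field": "genre",   # hypothetical grouping field
    "group.limit": -1,        # return all documents within each group
    "rows": 20,               # at most 20 groups, not 20 documents
}
query_string = urlencode(params)
print(query_string)
```

With group.limit=-1 and rows=20 you get up to 20 groups of unbounded size, which is exactly why a fixed per-page document count cannot be expressed with the stock parameters.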

On 8 February 2013 10:13, Rajani Maski rajinima...@gmail.com wrote:

 Hi all,

 Is there any parameter which will set the number of documents returned
 after applying grouping on results? Like query.setRows does for results
 without grouping?



 I know all the parameters below apply to grouping, but they will not
 limit the number of documents returned. We want to get all the results
 belonging to each group (group.limit=-1) and display only 20 records at a
 time (the number of documents returned should be limited to a given
 integer). Is there any param to get this?

 rows (integer): The number of groups to return. The default value is 10.
 start (integer): Specifies an initial offset for the list of groups.
 group.limit (integer): Specifies the number of results to return for each
 group. The default value is 1.




-- 
Regards,
Prakhar Birla


Re: OR OR OR

2013-02-07 Thread Prakhar Birla
It is most likely giving zero results because your default query operator
is set to AND.

try this:

fq={!lucene q.op=OR}institute_name:(xyz sfsda sdfsaf)
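When sending that filter from client code, the whole local-params expression ({!lucene q.op=OR}...) must be URL-encoded as a single fq value; a sketch, reusing the field and values from the question (host and port are placeholders):

```python
from urllib.parse import urlencode

# The {!lucene q.op=OR} local params override the default query
# operator for this one filter query only, leaving q untouched.
params = {
    "q": "digital camera",
    "fq": "{!lucene q.op=OR}institute_name:(xyz sfsda sdfsaf)",
}
url = "http://localhost:8080/solr/select?" + urlencode(params)
print(url)
```

Alternatively, q.op=OR can be set as a default in the request handler, but the local-params form keeps the override scoped to the single filter.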

On 8 February 2013 12:26, anurag.jain anurag.k...@gmail.com wrote:

 It's not working with that style; it is giving me a zero response.



 --




-- 
Regards,
Prakhar Birla
+91 9739868086


Re: Trying to understand soft vs hard commit vs transaction log

2013-02-07 Thread Shawn Heisey

On 2/7/2013 9:29 PM, Alexandre Rafalovitch wrote:

Hello,

What actually happens when using soft (as opposed to hard) commit?

I understand the very high-level picture somewhat (documents become available
faster, but you may lose them on power loss).
I don't care about low-level implementation details.

But I am trying to understand what is happening on the medium level of
details.

For example what are stages of a document if we are using all available
transaction log, soft commit, hard commit options? It feels like there is
three stages:
*) Uncommitted (soft or hard): accessible only via direct real-time get?
*) Soft-committed: accessible through all search operations? (but not on
disk? but where is it? in memory?)
*) Hard-committed: all the same as soft-committed but it is now on disk

Similarly, in the performance section of the Wiki, it says: A commit
(including a soft commit) will free up almost all heap memory - why would a
soft commit free up heap memory? I thought it was not flushed to disk.

Also, with soft commits and the transaction log enabled, doesn't the
transaction log allow replaying/recovering the latest state after a crash? I
believe that's what a transaction log does for a database. If not, how does
one recover, if at all?

And where does openSearcher=false fit into that? Does it cause
inconsistent results somehow?

I am missing something, but I am not sure what or where. Any pointers in the
right direction would be appreciated.


Let's see if I can answer your questions without giving you incorrect 
information.


New indexed content is not searchable until you open a new searcher, 
regardless of the type of commit that you do.


A hard commit will close the current transaction log and start a new 
one.  It will also instruct the Directory implementation to flush to 
disk.  If you specify openSearcher=false, then the content that has just 
been committed will NOT be searchable, as discussed in the previous 
paragraph.  The existing searcher will remain open and continue to serve 
queries against the same index data.


A soft commit does not flush the new content to disk, but it does open a 
new searcher.  I'm sure that the amount of memory available for caching 
this content is not large, so it's possible that if you do a lot of 
indexing with soft commits and your hard commits are too infrequent, 
you'll end up flushing part of the cached data to disk anyway.  I'd love 
to hear from a committer about this, because I could be wrong.


There's a caveat with that 'flush to disk' operation -- the default 
Directory implementation in the Solr example config, which is 
NRTCachingDirectoryFactory, will cache the last few megabytes of indexed 
data and not flush it to disk even with a hard commit.  If your commits 
are small, then the net result is similar to a soft commit.  If the 
server or Solr were to crash, the transaction logs would be replayed on 
Solr startup, recovering that last few megabytes.  The transaction log 
may also recover documents that were soft committed, but I'm not 100% 
sure about that.


To take full advantage of NRT functionality, you can commit as often as 
you like with soft commits.  On some reasonable interval, say every one 
to fifteen minutes, you can issue a hard commit with openSearcher set to 
false, to flush things to disk and cycle through transaction logs before 
they get huge.  Solr will keep a few of the transaction logs around, and 
if they are huge, it can take a long time to replay them.  You'll want 
to choose a hard commit interval that doesn't create giant transaction logs.
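The interval strategy described above is typically wired up declaratively in solrconfig.xml rather than issued from clients; a sketch with example values (tune maxTime to your indexing load):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flush to disk and roll the transaction log,
       but keep the current searcher (openSearcher=false), so new
       content is NOT yet visible to queries. -->
  <autoCommit>
    <maxTime>300000</maxTime>          <!-- every 5 minutes -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: open a new searcher for near-real-time
       visibility, without forcing a flush to disk. -->
  <autoSoftCommit>
    <maxTime>2000</maxTime>            <!-- every 2 seconds -->
  </autoSoftCommit>
  <!-- Transaction log, replayed at startup after a crash. -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>
</updateHandler>
```

With this shape, soft commits keep search results fresh while the infrequent hard commits keep individual transaction logs from growing huge.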


If any of the info I've given here is wrong, someone should correct me!

Thanks,
Shawn