Re: Local Solr and Webserver-Solr act differently (and treated like or)

2013-10-17 Thread Stavros Delsiavas
Unfortunately, I don't really know what stopwords are. I would like it
not to ignore any words of my query.

How/Where can I change this stopwords-behaviour?


Am 16.10.2013 23:45, schrieb Jack Krupansky:
So, the stopwords.txt file is different between the two systems - the 
first has stop words but the second does not. Did you expect stop 
words to be removed, or not?


-- Jack Krupansky

-Original Message- From: Stavros Delsiavas
Sent: Wednesday, October 16, 2013 5:02 PM
To: solr-user@lucene.apache.org
Subject: Re: Local Solr and Webserver-Solr act differently (and 
treated like or)


Okay I understand,

here's the rawquerystring. It was at about line 3000:

<lst name="debug">
 <str name="rawquerystring">title:(into AND the AND wild*)</str>
 <str name="querystring">title:(into AND the AND wild*)</str>
 <str name="parsedquery">+title:wild*</str>
 <str name="parsedquery_toString">+title:wild*</str>

At this place the debug output DOES differ from the one on my local
system. But I don't understand why...
This is the local debug output:

<lst name="debug">
  <str name="rawquerystring">title:(into AND the AND wild*)</str>
  <str name="querystring">title:(into AND the AND wild*)</str>
  <str name="parsedquery">+title:into +title:the +title:wild*</str>
  <str name="parsedquery_toString">+title:into +title:the
+title:wild*</str>

Why is that? Any ideas?




Am 16.10.2013 21:03, schrieb Shawn Heisey:

On 10/16/2013 4:46 AM, Stavros Delisavas wrote:

My local solr gives me:
http://pastebin.com/Q6d9dFmZ

and my webserver this:
http://pastebin.com/q87WEjVA

I copied only the first few hundred lines (of more than 8000) because
the webserver output was too big even for pastebin.



On 16.10.2013 12:27, Erik Hatcher wrote:
What does the debug output from debugQuery=true say between the
two?

What's really needed here is the first part of the debug section,
which has rawquerystring, querystring, parsedquery, and
parsedquery_toString.  The info from your local solr has this part, but
what you pasted from the webserver one didn't include those parts,
because it's further down than the first few hundred lines.

Thanks,
Shawn







Re: SolrCloud Performance Issue

2013-10-17 Thread primoz . skale
Query result cache hit rate might be low due to using NOW in bf. NOW is always
translated to the current time, and that of course changes from millisecond to
millisecond... :)
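
For example, rounding NOW to a coarser granularity with Solr date math keeps
the boost function textually identical within the rounding window, so repeated
queries can actually hit the query result cache. A hedged sketch against the
bf in the config quoted below, assuming day-level freshness is acceptable:

<!-- NOW/DAY instead of NOW: the boost is stable for a whole day, so the
     query string no longer changes from request to request -->
<str name="bf">recip(ms(NOW/DAY,PublishDate),3.16e-11,1,1)^2.0</str>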

Primoz



From:   Shamik Bandopadhyay sham...@gmail.com
To: solr-user@lucene.apache.org
Date:   17.10.2013 00:14
Subject:    SolrCloud Performance Issue



Hi,

  I'm in the process of transitioning to SolrCloud from a conventional
Master-Slave model. I'm using Solr 4.4 and have set up 2 shards with 1
replica each. I have a 3-node ZooKeeper ensemble. All the nodes are running on
AWS EC2 instances. Shards are on m1.xlarge and share a ZooKeeper instance
(mounted on a separate volume). 6 GB of memory is allocated to each Solr
instance.

I have around 10 million documents in the index. With the previous standalone
model, the queries averaged around 100 ms. The SolrCloud query responses have
been abysmal so far. The query response time is over 1000 ms, often reaching
2000 ms. I expected some surge due to additional servers, network
latency, etc., but this difference is really baffling. The hardware is
similar in both cases, except for the fact that a couple of the SolrCloud
nodes are sharing ZooKeeper as well. m1.xlarge I/O is high, so that shouldn't
be a bottleneck either.

The other difference from the old setup is that I'm using the new
CloudSolrServer class, which has the 3 ZooKeeper references for load
balancing. But I don't think it has any major impact, as the queries
executed from the Solr admin query panel confirm the slowness.

Here is some of my configuration:

<autoCommit>
  <maxTime>3</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>


<maxBooleanClauses>1024</maxBooleanClauses>


<filterCache class="solr.FastLRUCache" size="16384" initialSize="4096"
             autowarmCount="4096"/>

<queryResultCache class="solr.LRUCache" size="16384" initialSize="8192"
                  autowarmCount="4096"/>

<documentCache class="solr.LRUCache" size="32768" initialSize="16384"
               autowarmCount="0"/>

<fieldValueCache class="solr.FastLRUCache" size="16384"
                 autowarmCount="8192" showItems="4096"/>

<enableLazyFieldLoading>true</enableLazyFieldLoading>

<queryResultWindowSize>200</queryResultWindowSize>

<queryResultMaxDocsCached>400</queryResultMaxDocsCached>



<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">line</str></lst>
    <lst><str name="q">xref</str></lst>
    <lst><str name="q">draw</str></lst>
  </arr>
</listener>
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">line</str></lst>
    <lst><str name="q">draw</str></lst>
    <lst><str name="q">line</str><str name="fq">language:english</str></lst>
    <lst><str name="q">line</str><str name="fq">Source2:documentation</str></lst>
    <lst><str name="q">line</str><str name="fq">Source2:CloudHelp</str></lst>
    <lst><str name="q">draw</str><str name="fq">language:english</str></lst>
    <lst><str name="q">draw</str><str name="fq">Source2:documentation</str></lst>
    <lst><str name="q">draw</str><str name="fq">Source2:CloudHelp</str></lst>
  </arr>
</listener>

<maxWarmingSearchers>2</maxWarmingSearchers>


The custom request handler :

<requestHandler name="/adskcloudhelp" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="wt">velocity</str>
    <str name="v.template">browse</str>
    <str name="v.contentType">text/html;charset=UTF-8</str>
    <str name="v.layout">layout</str>
    <str name="v.channel">cloudhelp</str>

    <str name="defType">edismax</str>
    <str name="q.alt">*:*</str>
    <str name="rows">15</str>
    <str name="fl">id,url,Description,Source2,text,filetype,title,LastUpdateDate,PublishDate,ViewCount,TotalMessageCount,Solution,LastPostAuthor,Author,Duration,AuthorUrl,ThumbnailUrl,TopicId,score</str>
    <str name="qf">text^1.5 title^2 IndexTerm^.9 keywords^1.2 ADSKCommandSrch^2 ADSKContextId^1</str>
    <str name="bq">Source2:CloudHelp^3 Source2:youtube^0.85</str>
    <str name="bf">recip(ms(NOW,PublishDate),3.16e-11,1,1)^2.0</str>
    <str name="df">text</str>

    <str name="facet">on</str>
    <str name="facet.mincount">1</str>
    <str name="facet.limit">100</str>
    <str name="facet.field">language</str>
    <str name="facet.field">Source2</str>
    <str name="facet.field">DocumentationBook</str>
    <str name="facet.field">ADSKProductDisplay</str>
    <str name="facet.field">audience</str>

    <str 

Re: Local Solr and Webserver-Solr act differently (and treated like or)

2013-10-17 Thread Upayavira
Stopwords are small words such as "and", "the" or "is", that we might
choose to exclude from our documents and queries because they are such
common terms. Once you have stripped stop words from your above query,
all that is left is the word "wild", or so is being suggested.

Somewhere in your config, close to solrconfig.xml, you will find a file
called something like stopwords.txt. Compare these files between your
two systems.

Upayavira

On Thu, Oct 17, 2013, at 07:18 AM, Stavros Delsiavas wrote:
 Unfortunately, I don't really know what stopwords are. I would like it
 not to ignore any words of my query.
 How/Where can I change this stopwords-behaviour?
 
 
 Am 16.10.2013 23:45, schrieb Jack Krupansky:
  So, the stopwords.txt file is different between the two systems - the 
  first has stop words but the second does not. Did you expect stop 
  words to be removed, or not?
 
  -- Jack Krupansky
 
  -Original Message- From: Stavros Delsiavas
  Sent: Wednesday, October 16, 2013 5:02 PM
  To: solr-user@lucene.apache.org
  Subject: Re: Local Solr and Webserver-Solr act differently (and 
  treated like or)
 
  Okay I understand,
 
  here's the rawquerystring. It was at about line 3000:
 
  <lst name="debug">
   <str name="rawquerystring">title:(into AND the AND wild*)</str>
   <str name="querystring">title:(into AND the AND wild*)</str>
   <str name="parsedquery">+title:wild*</str>
   <str name="parsedquery_toString">+title:wild*</str>
 
  At this place the debug output DOES differ from the one on my local
  system. But I don't understand why...
  This is the local debug output:
 
  <lst name="debug">
    <str name="rawquerystring">title:(into AND the AND wild*)</str>
    <str name="querystring">title:(into AND the AND wild*)</str>
    <str name="parsedquery">+title:into +title:the +title:wild*</str>
    <str name="parsedquery_toString">+title:into +title:the
  +title:wild*</str>
 
  Why is that? Any ideas?
 
 
 
 
  Am 16.10.2013 21:03, schrieb Shawn Heisey:
  On 10/16/2013 4:46 AM, Stavros Delisavas wrote:
  My local solr gives me:
  http://pastebin.com/Q6d9dFmZ
 
  and my webserver this:
  http://pastebin.com/q87WEjVA
 
   I copied only the first few hundred lines (of more than 8000) because
   the webserver output was too big even for pastebin.
 
 
 
  On 16.10.2013 12:27, Erik Hatcher wrote:
   What does the debug output from debugQuery=true say between the
   two?
  What's really needed here is the first part of the debug section,
  which has rawquerystring, querystring, parsedquery, and
  parsedquery_toString.  The info from your local solr has this part, but
  what you pasted from the webserver one didn't include those parts,
  because it's further down than the first few hundred lines.
 
  Thanks,
  Shawn
 
 
 


Re: Local Solr and Webserver-Solr act differently (and treated like or)

2013-10-17 Thread Stavros Delisavas
Thank you,
I found the file with the stopwords and noticed that my local file is
empty (comments only) and the one on my webserver has a big list of
English stopwords. That seems to be the problem.

I think in general it is a good idea to use stopwords for random
searches, but it is not useful in my special case. Is there a way to
(de)activate stopwords query-wise? For example, I would like to ignore
stopwords when searching in titles, but use them when users do a
fulltext search on whole articles, etc.

Thanks again,
Stavros


On 17.10.2013 09:13, Upayavira wrote:
 Stopwords are small words such as "and", "the" or "is", that we might
 choose to exclude from our documents and queries because they are such
 common terms. Once you have stripped stop words from your above query,
 all that is left is the word "wild", or so is being suggested.

 Somewhere in your config, close to solrconfig.xml, you will find a file
 called something like stopwords.txt. Compare these files between your
 two systems.

 Upayavira

 On Thu, Oct 17, 2013, at 07:18 AM, Stavros Delsiavas wrote:
 Unfortunately, I don't really know what stopwords are. I would like it
 not to ignore any words of my query.
 How/Where can I change this stopwords-behaviour?


 Am 16.10.2013 23:45, schrieb Jack Krupansky:
 So, the stopwords.txt file is different between the two systems - the 
 first has stop words but the second does not. Did you expect stop 
 words to be removed, or not?

 -- Jack Krupansky

 -Original Message- From: Stavros Delsiavas
 Sent: Wednesday, October 16, 2013 5:02 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Local Solr and Webserver-Solr act differently (and 
 treated like or)

 Okay I understand,

 here's the rawquerystring. It was at about line 3000:

 <lst name="debug">
  <str name="rawquerystring">title:(into AND the AND wild*)</str>
  <str name="querystring">title:(into AND the AND wild*)</str>
  <str name="parsedquery">+title:wild*</str>
  <str name="parsedquery_toString">+title:wild*</str>

 At this place the debug output DOES differ from the one on my local
 system. But I don't understand why...
 This is the local debug output:

 <lst name="debug">
   <str name="rawquerystring">title:(into AND the AND wild*)</str>
   <str name="querystring">title:(into AND the AND wild*)</str>
   <str name="parsedquery">+title:into +title:the +title:wild*</str>
   <str name="parsedquery_toString">+title:into +title:the
 +title:wild*</str>

 Why is that? Any ideas?




 Am 16.10.2013 21:03, schrieb Shawn Heisey:
 On 10/16/2013 4:46 AM, Stavros Delisavas wrote:
 My local solr gives me:
 http://pastebin.com/Q6d9dFmZ

 and my webserver this:
 http://pastebin.com/q87WEjVA

  I copied only the first few hundred lines (of more than 8000) because
  the webserver output was too big even for pastebin.



 On 16.10.2013 12:27, Erik Hatcher wrote:
  What does the debug output from debugQuery=true say between the
  two?
 What's really needed here is the first part of the debug section,
 which has rawquerystring, querystring, parsedquery, and
 parsedquery_toString.  The info from your local solr has this part, but
 what you pasted from the webserver one didn't include those parts,
 because it's further down than the first few hundred lines.

 Thanks,
 Shawn




Change config set for a collection

2013-10-17 Thread michael.boom
The question was also asked some 10 months ago in
http://lucene.472066.n3.nabble.com/SolrCloud-4-1-change-config-set-for-a-collection-td4037456.html,
and the answer then was negative, but here it goes again; maybe now it's
different.

Is it possible to change the config set of a collection to another one
(stored in ZooKeeper) using the Collection API? If not, is it possible to do
it using zkCli?

Also, how can somebody check which config set a collection is using?
Thanks!
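
(A hedged sketch of the zkCli route, assuming the Solr 4.x cloud-scripts
ZkCLI; the zkhost, collection, and config names are placeholders:)

# re-link an existing collection to a different config set in ZooKeeper,
# then reload the collection so the cores pick up the new config
./zkcli.sh -zkhost zk1:2181 -cmd linkconfig -collection mycollection -confname newconfig

# check which config set a collection is linked to: its znode holds a
# small JSON blob containing a configName entry
./zkcli.sh -zkhost zk1:2181 -cmd get /collections/mycollection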



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Change-config-set-for-a-collection-tp4096032.html
Sent from the Solr - User mailing list archive at Nabble.com.


measure result set quality

2013-10-17 Thread Alvaro Cabrerizo
Hi,

Imagine the following situation. You have a corpus of documents and a list of
queries extracted from a production environment. The corpus hasn't been
manually annotated with relevant/non-relevant tags for every query. Then you
configure various Solr instances, changing the schema (adding synonyms,
stopwords...). After indexing, you prepare and execute the test over the
different schema configurations. How do you compare the quality of your
search results in order to decide which schema is better?

Regards.


A few questions about solr and tika

2013-10-17 Thread wonder
Hello everyone! Please tell me how and where to set Tika options in
Solr. Where is the Tika config? I want to know how I can eliminate
response attributes that I don't need (such as links or images). Also, I am
interested in how I can get and index only the metadata for several file formats.


Status of wiki documentation on grouping under distributed search

2013-10-17 Thread Jackson, Andrew
On the SolrCloud wiki page (https://wiki.apache.org/solr/SolrCloud), I
found this statement:

 

The Grouping feature only works if groups are in the same shard. You
must use the custom sharding feature to use the Grouping feature.

 

However, the Distributed Search page
(https://wiki.apache.org/solr/DistributedSearch) implies that grouping
largely works, and the actual grouping page (or rather Field Collapsing:
http://wiki.apache.org/solr/FieldCollapsing) goes into much more detail,
outlining limitations on specific features.

 

Am I right in assuming that the statement on the SolrCloud page is out
of date? I'm happy to replace it with some text that links to
https://wiki.apache.org/solr/DistributedSearch#Distributed_Searching_Limitations
if that makes more sense?

 

Similarly, on the Distributed Search page, we find:

 

Doesn't support MoreLikeThis -- (see
https://issues.apache.org/jira/browse/SOLR-788) 

 

Looking at the issue, it seems this has been (largely?) resolved since
Solr 4.1 and 5.0. Can I update the text to reflect that?

 

Thanks for your time.

 

Best wishes,

Andy Jackson 

 

--

Dr Andrew N Jackson

Web Archiving Technical Lead

The British Library

 

Tel: 01937 546602

Mobile: 07765 897948

Web: www.webarchive.org.uk http://www.webarchive.org.uk/ 

Twitter: @UKWebArchive

 



Re: Timeout Errors while using Collections API

2013-10-17 Thread Grzegorz Sobczyk
On 16 October 2013 11:48, RadhaJayalakshmi
rlakshminaraya...@inautix.co.in wrote:

 Hi,
 My setup is
 Zookeeper ensemble - running with 3 nodes
 Tomcats - 9 Tomcat instances are brought up, by registering with
 zookeeper.

 Steps:
 1) I uploaded the Solr configuration (db_data_config, solrconfig, schema
 XMLs) into ZooKeeper
 2) Now, I am trying to create a collection with the collection API like
 below:


 http://miadevuser001.albridge.com:7021/solr/admin/collections?action=CREATE&name=Schwab_InvACC_Coll&numShards=1&replicationFactor=2&createNodeSet=localhost:7034_solr,localhost:7036_solr&collection.configName=InvestorAccountDomainConfig

 Now, when I execute this command, I am getting the following error:
 <response><lst name="responseHeader"><int name="status">500</int><int
 name="QTime">60015</int></lst><lst name="error"><str
 name="msg">createcollection the collection time out:60s</str><str
 name="trace">org.apache.solr.common.SolrException: createcollection the
 collection time out:60s
 at

 org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:175)
 at

 org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:156)
 at

 org.apache.solr.handler.admin.CollectionsHandler.handleCreateAction(CollectionsHandler.java:290)
 at

 org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:112)
 at

 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at

 org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:611)
 at

 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:218)
 at

 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
 at

 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
 at

 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
 at

 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
 at

 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
 at

 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
 at

 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
 at
 org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
 at

 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
 at
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
 at

 org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
 at

 org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
 at

 org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
 at

 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at

 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)
 </str><int name="code">500</int></lst></response>

 Now, after I got this error, I am not able to do any operation on these
 instances with the collection API. It is repeatedly giving the same timeout
 error.
 This setup was working fine 5 mins back. Suddenly it started throwing these
 exceptions. Any ideas, please?






 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Timeout-Errors-while-using-Collections-API-tp4095852.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Grzegorz Sobczyk


Re: Timeout Errors while using Collections API

2013-10-17 Thread Grzegorz Sobczyk
Sorry for the previous spam (something ate my message)

I have the same problem but with reload action
ENV:
 - 3x Solr 4.2.1 with 4 cores each
 - ZK

Before error I have:
- 14, 2013 5:25:36 AM CollectionsHandler handleReloadAction INFO: Reloading
Collection : name=productsaction=RELOAD
- hundreds of (with the same timestamp) 14, 2013 5:25:36 AM
DistributedQueue$LatchChildWatcher process INFO: Watcher fired on path:
/overseer/collection-queue-work state: SyncConnected type
NodeChildrenChanged
- 13 times (from 2013 5:25:39 to 5:25:45):
-- 14, 2013 5:25:39 AM SolrDispatchFilter handleAdminRequest INFO: [admin]
webapp=null path=/admin/cores params={action=STATUS&wt=ruby} status=0
QTime=2
-- 14, 2013 5:25:39 AM SolrDispatchFilter handleAdminRequest INFO: [admin]
webapp=null path=/admin/cores params={action=STATUS&wt=ruby} status=0
QTime=1
-- 14, 2013 5:25:39 AM SolrCore execute INFO: [forum] webapp=/solr
path=/admin/mbeans params={stats=true&wt=ruby} status=0 QTime=2
-- 14, 2013 5:25:39 AM SolrCore execute INFO: [knowledge] webapp=/solr
path=/admin/mbeans params={stats=true&wt=ruby} status=0 QTime=2
-- 14, 2013 5:25:39 AM SolrCore execute INFO: [products] webapp=/solr
path=/admin/mbeans params={stats=true&wt=ruby} status=0 QTime=2
-- 14, 2013 5:25:39 AM SolrCore execute INFO: [shops] webapp=/solr
path=/admin/mbeans params={stats=true&wt=ruby} status=0 QTime=1
- 14, 2013 5:26:21 AM SolrCore execute INFO: [products] webapp=/solr
path=/select/ params={q=solrpingquery} hits=0 status=0 QTime=0
- 14, 2013 5:26:36 AM DistributedQueue$LatchChildWatcher process INFO:
Watcher fired on path: /overseer/collection-queue-work/qnr-000806
state: SyncConnected type NodeDeleted
- 14, 2013 5:26:36 AM SolrException log SEVERE:
org.apache.solr.common.SolrException: reloadcollection the collection time
out:60s
at
org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:162)
at
org.apache.solr.handler.admin.CollectionsHandler.handleReloadAction(CollectionsHandler.java:184)
at
org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:120)

What are the possibilities of such behaviour? When is this error thrown?
Does anybody have the same issue?


On 17 October 2013 13:08, Grzegorz Sobczyk gsobc...@gmail.com wrote:



 On 16 October 2013 11:48, RadhaJayalakshmi 
 rlakshminaraya...@inautix.co.in wrote:

 Hi,
 My setup is
 Zookeeper ensemble - running with 3 nodes
  Tomcats - 9 Tomcat instances are brought up, by registering with
  zookeeper.

  Steps:
  1) I uploaded the Solr configuration (db_data_config, solrconfig, schema
  XMLs) into ZooKeeper
  2) Now, I am trying to create a collection with the collection API like
 below:


  http://miadevuser001.albridge.com:7021/solr/admin/collections?action=CREATE&name=Schwab_InvACC_Coll&numShards=1&replicationFactor=2&createNodeSet=localhost:7034_solr,localhost:7036_solr&collection.configName=InvestorAccountDomainConfig

 Now, when I execute this command, I am getting the following error:
 <response><lst name="responseHeader"><int name="status">500</int><int
 name="QTime">60015</int></lst><lst name="error"><str
 name="msg">createcollection the collection time out:60s</str><str
 name="trace">org.apache.solr.common.SolrException: createcollection the
 collection time out:60s
 at

 org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:175)
 at

 org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:156)
 at

 org.apache.solr.handler.admin.CollectionsHandler.handleCreateAction(CollectionsHandler.java:290)
 at

 org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:112)
 at

 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at

 org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:611)
 at

 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:218)
 at

 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
 at

 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
 at

 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
 at

 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
 at

 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
 at

 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
 at

 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
 at
 org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
 at

 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
 at

 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
 at

 

Re: SolrCloud on SSL

2013-10-17 Thread Christopher Gross
Tim, if a separate VLAN was an option, I wouldn't be trying to use SSL.

-- Chris


On Wed, Oct 16, 2013 at 7:27 PM, Tim Vaillancourt t...@elementspace.com wrote:

 Not important, but I'm also curious why you would want SSL on Solr (adds
 overhead, complexity, harder-to-troubleshoot, etc)?

 To avoid the overhead, could you put Solr on a separate VLAN (with ACLs to
 client servers)?

 Cheers,

 Tim


 On 12 October 2013 17:30, Shawn Heisey s...@elyograg.org wrote:

  On 10/11/2013 9:38 AM, Christopher Gross wrote:
   On Fri, Oct 11, 2013 at 11:08 AM, Shawn Heisey s...@elyograg.org
  wrote:
  
   On 10/11/2013 8:17 AM, Christopher Gross wrote: 
   Is there a spot in a Solr configuration that I can set this up to use
   HTTPS?
  
   From what I can tell, not yet.
  
   https://issues.apache.org/jira/browse/SOLR-3854
   https://issues.apache.org/jira/browse/SOLR-4407
   https://issues.apache.org/jira/browse/SOLR-4470
  
  
   Dang.
 
  Christopher,
 
  I was just looking through Solr source code for a completely different
  issue, and it seems that there *IS* a way to do this in your
 configuration.
 
   If you were to use "https://hostname" or "https://ipaddress" as the
  host parameter in your solr.xml file on each machine, it should do
  what you want.  The parameter is described here, but not the behavior
  that I have discovered:
 
  http://wiki.apache.org/solr/SolrCloud#SolrCloud_Instance_Params
 
  Boring details: In the org.apache.solr.cloud package, there is a
  ZkController class.  The getHostAddress method is where I discovered
  that you can do this.
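  
   (A minimal legacy-style solr.xml sketch of that idea; untested, and the
   hostname, port, and core values are placeholders:)
  
   <solr persistent="true">
     <!-- host given as a full https:// base, per the behavior described above -->
     <cores adminPath="/admin/cores"
            host="https://solr1.example.com"
            hostPort="8443"
            hostContext="solr">
       <core name="collection1" instanceDir="collection1"/>
     </cores>
   </solr>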
 
  If you could try this out and confirm that it works, I will get the wiki
  page updated and look into the Solr reference guide as well.
 
  Thanks,
  Shawn
 
 



RE: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID

2013-10-17 Thread Akkinepalli, Bharat (ELS-CON)
Thanks Shalin.

Regards,
Bharat Akkinepalli

-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] 
Sent: Thursday, October 17, 2013 1:18 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue with 
Commits after deleting documents using Delete by ID

Thanks Bharat. This is a bug. I've opened LUCENE-5289.

https://issues.apache.org/jira/browse/LUCENE-5289


On Wed, Oct 16, 2013 at 9:35 PM, Akkinepalli, Bharat (ELS-CON)  
b.akkinepa...@elsevier.com wrote:

 Hi Shalin,
 I am not sure why the log line "No uncommitted changes" appears.
 The data is available in Solr at the time I perform the delete.

 Please find below the steps I have performed:
  1. Inserted a document in master (with id=change.me.1)
  2. Issued a commit on master
  3. Triggered replication on slave
  4. Ensured that the document was replicated successfully
  5. Issued a delete by ID
  6. Issued a commit on master
  7. Replication did NOT happen

 The logs are as follows:
 Master - http://pastebin.com/265CtCEp
 Slave - http://pastebin.com/Qx0xLwmK

 Regards,
 Bharat Akkinepalli.

 -Original Message-
 From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
 Sent: Wednesday, October 16, 2013 11:28 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue 
 with Commits after deleting documents using Delete by ID

 The only delete I see in the master logs is:

 INFO  - 2013-10-11 14:06:54.793;
 org.apache.solr.update.processor.LogUpdateProcessor; [annotation] 
 webapp=/solr path=/update params={} 
 {delete=[change.me(-1448623278425899008)]}
 0 60

 When you commit, we have the following:

 INFO  - 2013-10-11 14:07:03.809;
 org.apache.solr.update.DirectUpdateHandler2; start 
 commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDele
 tes=false,softCommit=false,prepareCommit=false}
 INFO  - 2013-10-11 14:07:03.813;
 org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes.
 Skipping IW.commit.

 That suggests that the id you are trying to delete never existed in 
 the first place and hence there was nothing to commit. Hence 
 replication was not triggered. Am I missing something?


 On Wed, Oct 16, 2013 at 5:06 PM, Akkinepalli, Bharat (ELS-CON)  
 b.akkinepa...@elsevier.com wrote:

  Hi Otis,
   Did you get a chance to look into the logs?  Please let me know if
   you need more information.  Thank you.
 
  Regards,
  Bharat Akkinepalli
 
  -Original Message-
  From: Akkinepalli, Bharat (ELS-CON)
  [mailto:b.akkinepa...@elsevier.com]
  Sent: Friday, October 11, 2013 2:16 PM
  To: solr-user@lucene.apache.org
  Subject: RE: Solr 4.4 - Master/Slave configuration - Replication 
  Issue with Commits after deleting documents using Delete by ID
 
  Hi Otis,
  Thanks for the response.  The log files can be found here.
 
  MasterLog : http://pastebin.com/DPLKMPcF Slave Log:
  http://pastebin.com/DX9sV6Jx
 
   One more point worth mentioning here is that when we issue the
   commit with expungeDeletes=true, then the delete-by-id replication
   is successful, i.e.
   http://localhost:8983/solr/annotation/update?commit=true&expungeDeletes=true
 
  Regards,
  Bharat Akkinepalli
 
  -Original Message-
  From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
  Sent: Wednesday, October 09, 2013 6:35 PM
  To: solr-user@lucene.apache.org
  Subject: Re: Solr 4.4 - Master/Slave configuration - Replication 
  Issue with Commits after deleting documents using Delete by ID
 
  Bharat,
 
  Can you look at the logs on the Master when you issue the delete and 
  the subsequent commits and share that?
 
  Otis
  --
   Solr & ElasticSearch Support -- http://sematext.com/
   Performance Monitoring -- http://sematext.com/spm
 
 
 
  On Tue, Oct 8, 2013 at 3:57 PM, Akkinepalli, Bharat (ELS-CON)  
  b.akkinepa...@elsevier.com wrote:
   Hi,
   We have recently migrated from Solr 3.6 to Solr 4.4.  We are using 
   the
  Master/Slave configuration in Solr 4.4 (not Solr Cloud).  We have 
  noticed the following behavior/defect.
  
   Configuration:
   ===
  
   1.   The Hard Commit and Soft Commit are disabled in the
  configuration (we control the commits from the application)
  
   2.   We have 1 Master and 2 Slaves configured and the pollInterval
  is configured to 10 Minutes.
  
    3.   The Master is configured to have the replicateAfter as "commit"
   and "startup"
  
   Steps to reproduce the problem:
   ==
  
   1.   Delete a document in Solr  (using delete by id).  URL -
  http://localhost:8983/solr/annotation/update with body as 
   <delete><id>change.me</id></delete>
  
   2.   Issue a commit in Master (
  http://localhost:8983/solr/annotation/update?commit=true).
  
   3.   The replication of the DELETE WILL NOT happen.  The master and
   slave have the same index version.
  
   4.   If we try to issue another commit in Master, we see that it
  replicates 

Solr errors

2013-10-17 Thread wonder

Hello everyone! Please tell me why Solr freezes when I add this file:
http://yadi.sk/d/dy-RtcHXB7KZU
The response from the server never comes.
curl 
"http://localhost:8085/solr/myCollection/update/extract?literal.id=doc1&literal.fileName=asu&uprefix=attr_&commit=true" 
-F myfile=@/media/PENDRIVE/Out/www-http/159/8696_6_5_5535.mp3


Second question:
When I add this file
http://yadi.sk/d/OpLW2JTTB7Ms4
Solr returns:
wonder@wonder:~$ curl 
"http://localhost:8085/solr/myCollection/update/extract?literal.id=doc1&literal.fileName=asu&uprefix=attr_&commit=true" 
-F myfile=@/media/PENDRIVE/Out/www-http/152/8696_6_5_5528.jpeg


<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="error"><str name="msg">java.lang.NoClassDefFoundError: 
com/adobe/xmp/XMPException</str><str 
name="trace">java.lang.RuntimeException: java.lang.NoClassDefFoundError: 
com/adobe/xmp/XMPException
at 
org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:673)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:383)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1489)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:517)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:138)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:540)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:213)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1097)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:446)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:175)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1031)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:136)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:200)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:109)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:317)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)

at org.eclipse.jetty.server.Server.handle(Server.java:445)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:269)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:229)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.run(AbstractConnection.java:358)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:601)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:532)

at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.NoClassDefFoundError: com/adobe/xmp/XMPException
at 
com.drew.imaging.jpeg.JpegMetadataReader.extractMetadataFromJpegSegmentReader(JpegMetadataReader.java:112)
at 
com.drew.imaging.jpeg.JpegMetadataReader.readMetadata(JpegMetadataReader.java:71)
at 
org.apache.tika.parser.image.ImageMetadataExtractor.parseJpeg(ImageMetadataExtractor.java:91)

at org.apache.tika.parser.jpeg.JpegParser.parse(JpegParser.java:56)
at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
at 
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
at 
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
at 
org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at 
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)

at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)

... 23 more
Caused by: java.lang.ClassNotFoundException: com.adobe.xmp.XMPException
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
... 37 more
</str><int name="code">500</int></lst>
</response>




ExtractRequestHandler, skipping errors

2013-10-17 Thread Roland Everaert
Hi,

I helped a customer deploy Solr+ManifoldCF and everything is going
quite smoothly, but every time Solr raises an exception, the
ManifoldCF job feeding Solr aborts. I would like to know if it is
possible to configure the ExtractRequestHandler to ignore errors, as
seems to be possible with the DataImportHandler and entity processors.

I know that it is possible to configure the ExtractRequestHandler to ignore
Tika exceptions (we already do that), but the errors that now stop the
MCF jobs are generated by Solr itself.

While it is interesting to have such an option in Solr, I plan to post to the
ManifoldCF mailing list anyway, to know if it is possible to configure
ManifoldCF to be less picky about Solr errors.


Regards,


Roland.


Re: Solr errors

2013-10-17 Thread wonder

Does anybody know how to index files in zip archives?



Re: A few questions about solr and tika

2013-10-17 Thread wonder

Thanks for the answer. If I don't want to store or index certain fields, I do:
<field name="links" type="string" indexed="false" stored="false"
multiValued="true"/><!-- remove extra TIKA fields -->
<field name="link" type="string" indexed="false" stored="false"
multiValued="true"/><!-- remove extra TIKA fields -->
<field name="img" type="string" indexed="false" stored="false"
multiValued="true"/><!-- remove extra TIKA fields -->
<field name="iframe" type="string" indexed="false" stored="false"
multiValued="true"/><!-- remove extra TIKA fields -->
<field name="area" type="string" indexed="false" stored="false"
multiValued="true"/><!-- remove extra TIKA fields -->
<field name="map" type="string" indexed="false" stored="false"
multiValued="true"/><!-- remove extra TIKA fields -->
<field name="pragma" type="string" indexed="false" stored="false"
multiValued="true"/><!-- remove extra TIKA fields -->
<field name="expires" type="string" indexed="false" stored="false"
multiValued="true"/><!-- remove extra TIKA fields -->
<field name="keywords" type="string" indexed="false" stored="false"
multiValued="true"/><!-- remove extra TIKA fields -->
<field name="stream_source_info" type="string" indexed="false"
stored="false" multiValued="true"/><!-- remove extra TIKA fields -->
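
An alternative sketch: rather than declaring every unwanted field, map all
unknown Tika fields onto an ignored dynamic field via uprefix. This assumes
the stock "ignored" type from the example schema and the standard extract
handler; the names shown are the example defaults, not from this thread:

<!-- schema.xml -->
<fieldType name="ignored" stored="false" indexed="false" multiValued="true" class="solr.StrField"/>
<dynamicField name="ignored_*" type="ignored"/>

<!-- solrconfig.xml -->
<requestHandler name="/update/extract" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="uprefix">ignored_</str>
  </lst>
</requestHandler>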


The other questions are still open for me.


17.10.2013 14:26, primoz.sk...@policija.si wrote:

Why don't you check these:

- Content extraction with Apache Tika (
http://www.youtube.com/watch?v=ifgFjAeTOws)
- ExtractingRequestHandler (
http://wiki.apache.org/solr/ExtractingRequestHandler)
- Uploading Data with Solr Cell using Apache Tika (
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika
)

Primož



From:   wonder a-wonde...@rambler.ru
To: solr-user@lucene.apache.org
Date:   17.10.2013 12:23
Subject:    A few questions about solr and tika



Hello everyone! Please tell me how and where to set Tika options in
Solr. Where is the Tika config? I want to know how I can eliminate
response attributes that I don't need (such as links or images). Also I am
interested in how I can get and index only the metadata for several file formats.






Re: Regarding Solr Cloud issue...

2013-10-17 Thread Chris
Wow, thanks for all that. I just upgraded, linked my plugins & it seems fine
so far, but I have run into another issue.

While adding a document to the SolrCloud it says -
org.apache.solr.common.SolrException: Unknown document router
'{name=compositeId}'

in the clusterstate.json i can see -

 "shard5":{
     "range":"4ccc-7fff",
     "state":"active",
     "replicas":{"core_node4":{
         "state":"active",
         "base_url":"http://64.251.14.47:1984/solr",
         "core":"web_shard5_replica1",
         "node_name":"64.251.14.47:1984_solr",
         "leader":"true"}}},
 "maxShardsPerNode":"2",
 "router":{"name":"compositeId"},
 "replicationFactor":"1",

I am using this to add -

    CloudSolrServer solrCoreCloud = new CloudSolrServer(cloudURL);
    solrCoreCloud.setDefaultCollection("web");
    UpdateResponse up = solrCoreCloud.addBean(resultItem);
    UpdateResponse upr = solrCoreCloud.commit();

Please advise.





On Wed, Oct 16, 2013 at 9:49 PM, Shawn Heisey s...@elyograg.org wrote:

 On 10/16/2013 4:51 AM, Chris wrote:
  Also, is there any easy way of upgrading to 4.5 without having to change most
  of my plugins & configuration files?

 Upgrading is something that should be done carefully.  If you can, it's
 always recommended that you try it out on dev hardware with your real
 index data beforehand, so you can deal with any problems that arise
 without causing problems for your production cluster.  Upgrading
 SolrCloud is particularly tricky, because for a while you will be
 running different versions on different machines in your cluster.

 If you're using your own custom software to go with Solr, or you're
 using third-party plugins that aren't included in the Solr download,
 upgrading might take more effort than usual.  Also, if you are doing
 anything in your config/schema that changes the format of the Lucene
 index, you may find that it can't be upgraded without completely
 rebuilding the index.  Examples of this are changing the postings format
 or docValues format.  This is a very nasty complication with SolrCloud,
 because those configurations affect the entire cluster.  In that case,
 the whole index may need to be rebuilt without custom formats before
 upgrading is attempted.

 If you don't have any of the complications mentioned in the preceding
 paragraph, upgrading is usually a very simple process:

 *) Shut down Solr.
 *) Delete the extracted WAR file directory.
 *) Replace solr.war with the new war from dist/ in the download.
 **) Usually it must actually be named solr.war, which means renaming it.
 *) Delete and replace other jars copied from the download.
 *) Change luceneMatchVersion in all solrconfig.xml files. **
 *) Start Solr back up.

 ** With SolrCloud, you can't actually change the luceneMatchVersion
 until all of your servers have been upgraded.
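
 (A hedged shell sketch of the WAR-swap steps above, on a stock Jetty
 example layout; the paths and version number are placeholders:)

 cd /opt/solr/example
 rm -rf solr-webapp/webapp              # the extracted WAR directory
 cp ~/solr-4.5.0/dist/solr-4.5.0.war webapps/solr.war   # must be named solr.war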

 A full reindex is strongly recommended.  With SolrCloud, it normally
 needs to wait until all servers are upgraded.  In situations where it
 won't work at all without a reindex, upgrading SolrCloud can be very
 challenging.

 It's strongly recommended that you look over CHANGES.txt and compare the
 new example config/schema with the example from the old version, to see
 if there are any changes that you might want to incorporate into your
 own config.  As with luceneMatchVersion, if you're running SolrCloud,
 those changes might need to wait until you're fully upgraded.

 Side note: When upgrading to a new minor version, config changes aren't
 normally required.  They will usually be required when upgrading major
 versions, such as 3.x to 4.x.

 If you *do* have custom plugins that aren't included in the Solr
 download, you may have to recompile them for the new version, or wait
 for the vendor to create a new version before you upgrade.

 This is only the tip of the iceberg, but a lot of the rest of it depends
 greatly on your configurations.

 Thanks,
 Shawn




Re: Local Solr and Webserver-Solr act differently (and treated like or)

2013-10-17 Thread Jack Krupansky
The default Solr stopwords.txt file is empty, so SOMEBODY created that 
non-empty stop words file.


The StopFilterFactory token filter in the field type analyzer controls stop 
word processing. You can remove that step entirely, or different field types 
can reference different stop word files, or some field type analyzers can 
use the stop filter and some would not have it. This does mean that you 
would have to use different field types for fields that want different stop 
word processing.
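
A minimal schema.xml sketch of that approach; the type and field names here 
are illustrative, not from this thread:

<!-- title search: no stopword removal -->
<fieldType name="text_nostop" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- fulltext search: stopwords removed -->
<fieldType name="text_stop" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="title" type="text_nostop" indexed="true" stored="true"/>
<field name="body"  type="text_stop"   indexed="true" stored="true"/>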


-- Jack Krupansky

-Original Message- 
From: Stavros Delisavas

Sent: Thursday, October 17, 2013 3:27 AM
To: solr-user@lucene.apache.org
Subject: Re: Local Solr and Webserver-Solr act differently (and treated 
like or)


Thank you,
I found the file with the stopwords and noticed that my local file is
empty (comments only) and the one on my webserver has a big list of
English stopwords. That seems to be the problem.

I think in general it is a good idea to use stopwords for random
searches, but it is not useful in my special case. Is there a way to
(de)activate stopwords query-wise? For example, I would like to ignore
stopwords when searching in titles, but use them when users do a
fulltext search on whole articles, etc.

Thanks again,
Stavros


On 17.10.2013 09:13, Upayavira wrote:

Stopwords are small words such as "and", "the" or "is", that we might
choose to exclude from our documents and queries because they are such
common terms. Once you have stripped stop words from your above query,
all that is left is the word "wild", or so is being suggested.

Somewhere in your config, close to solrconfig.xml, you will find a file
called something like stopwords.txt. Compare these files between your
two systems.

Upayavira

On Thu, Oct 17, 2013, at 07:18 AM, Stavros Delsiavas wrote:

Unfortunately, I don't really know what stopwords are. I would like it
not to ignore any words of my query.
How/Where can I change this stopwords-behaviour?


Am 16.10.2013 23:45, schrieb Jack Krupansky:

So, the stopwords.txt file is different between the two systems - the
first has stop words but the second does not. Did you expect stop
words to be removed, or not?

-- Jack Krupansky

-Original Message- From: Stavros Delsiavas
Sent: Wednesday, October 16, 2013 5:02 PM
To: solr-user@lucene.apache.org
Subject: Re: Local Solr and Webserver-Solr act differently (and
treated like or)

Okay I understand,

here's the rawquerystring. It was at about line 3000:

<lst name="debug">
 <str name="rawquerystring">title:(into AND the AND wild*)</str>
 <str name="querystring">title:(into AND the AND wild*)</str>
 <str name="parsedquery">+title:wild*</str>
 <str name="parsedquery_toString">+title:wild*</str>

At this place the debug output DOES differ from the one on my local
system. But I don't understand why...
This is the local debug output:

<lst name="debug">
  <str name="rawquerystring">title:(into AND the AND wild*)</str>
  <str name="querystring">title:(into AND the AND wild*)</str>
  <str name="parsedquery">+title:into +title:the +title:wild*</str>
  <str name="parsedquery_toString">+title:into +title:the
+title:wild*</str>

Why is that? Any ideas?




Am 16.10.2013 21:03, schrieb Shawn Heisey:

On 10/16/2013 4:46 AM, Stavros Delisavas wrote:

My local solr gives me:
http://pastebin.com/Q6d9dFmZ

and my webserver this:
http://pastebin.com/q87WEjVA

I copied only the first few hundred lines (of more than 8000) because
the webserver output was too big even for pastebin.



On 16.10.2013 12:27, Erik Hatcher wrote:

What does the debug output from debugQuery=true say between the
two?

What's really needed here is the first part of the debug section,
which has rawquerystring, querystring, parsedquery, and
parsedquery_toString.  The info from your local solr has this part, but
what you pasted from the webserver one didn't include those parts,
because it's further down than the first few hundred lines.

Thanks,
Shawn





Re: Solr errors

2013-10-17 Thread Roland Everaert
Even though I haven't tested it myself, you can use Tika; it is able to extract
documents from zip archives and index them, but of course it depends on the
file types in the archive.

Regards,


Roland.


On Thu, Oct 17, 2013 at 2:36 PM, wonder a-wonde...@rambler.ru wrote:

 Does anybody know how index files in zip archives?




Re: Solr errors

2013-10-17 Thread wonder
Thanks for the answer. Yes, Tika extracts, but does not index the content.
Here is the Solr response:

...
"content": ["  9118_xmessengereu_v18ximpda.jar dimonvideo.ru.txt  "],
...
None of these files' contents are in the index.
Any ideas?
17.10.2013 17:20, Roland Everaert wrote:

Even though I haven't tested it myself, you can use Tika; it is able to extract
documents from zip archives and index them, but of course it depends on the
file types in the archive.




Re: Regarding Solr Cloud issue...

2013-10-17 Thread Chris
I am also trying with something like -

java -Durl=http://domainname.com:1981/solr/web/update -Dtype=application/json
-jar /solr4RA/example1/exampledocs/post.jar
/root/Desktop/web/*.json

but it is giving this error -

19:06:22 ERROR SolrCore org.apache.solr.common.SolrException: Unknown
command: subDomain [12]

org.apache.solr.common.SolrException: Unknown command: subDomain [12]
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:152)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:101)
at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:65)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at 
org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
at 
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at 
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:722)
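
For reference, Solr's JSON loader treats each top-level object key as an
update command (add, delete, commit, ...), so posting a bare document object
makes its first field name, here subDomain, look like an unknown command. A
hedged sketch of one accepted request shape, with illustrative values; each
document is an element of a top-level array:

[
  {"id": "1", "subDomain": "example.com"},
  {"id": "2", "subDomain": "example.org"}
]

(The explicit command form {"add":{"doc":{...}}} is accepted as well.)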




On Thu, Oct 17, 2013 at 6:31 PM, Chris christu...@gmail.com wrote:

 Wow, thanks for all that. I just upgraded, linked my plugins & it seems
 fine so far, but I have run into another issue.

 While adding a document to the SolrCloud it says -
 org.apache.solr.common.SolrException: Unknown document router
 '{name=compositeId}'

 in the clusterstate.json i can see -

  "shard5":{
      "range":"4ccc-7fff",
      "state":"active",
      "replicas":{"core_node4":{
          "state":"active",
          "base_url":"http://64.251.14.47:1984/solr",
          "core":"web_shard5_replica1",
          "node_name":"64.251.14.47:1984_solr",
          "leader":"true"}}},
  "maxShardsPerNode":"2",
  "router":{"name":"compositeId"},
  "replicationFactor":"1",

 I am using this to add -


   CloudSolrServer solrCoreCloud = new CloudSolrServer(cloudURL);
   solrCoreCloud.setDefaultCollection("web");
   UpdateResponse up = solrCoreCloud.addBean(resultItem);
   UpdateResponse upr = solrCoreCloud.commit();

 Please advise.





 On Wed, Oct 16, 2013 at 9:49 PM, Shawn Heisey s...@elyograg.org wrote:

 On 10/16/2013 4:51 AM, Chris wrote:
  Also, is there any easy way of upgrading to 4.5 without having to change most
  of my plugins & configuration files?

 Upgrading is something that should be 

Re: Solr errors

2013-10-17 Thread Roland Everaert
I have just find this JIRA report, which could explain your problem:

https://issues.apache.org/jira/browse/SOLR-2416


Regards,

Roland.



On Thu, Oct 17, 2013 at 3:30 PM, wonder a-wonde...@rambler.ru wrote:

 Thanks for answer. Yes Tika extract, but not index content. Here is the
 solr response
 ...
 content: [  9118_xmessengereu_v18ximpda.**jar dimonvideo.ru.txt  ],
 ...
 There are not any of this files in index.
 Any ideas?
 17.10.2013 17:20, Roland Everaert ?:

  Even if I don't test it myself, you can use Tika, it is able to extract
 document from zip archives and index them, but of course it depends of the
 file type in the archive.





Re: Timeout Errors while using Collections API

2013-10-17 Thread Mark Miller
There was a reload bug in SolrCloud that was fixed in 4.4 - 
https://issues.apache.org/jira/browse/SOLR-4805

Mark

On Oct 17, 2013, at 7:18 AM, Grzegorz Sobczyk gsobc...@gmail.com wrote:

 Sorry for the previous spam (something ate my message)
 
 I have the same problem but with reload action
 ENV:
 - 3x Solr 4.2.1 with 4 cores each
 - ZK
 
 Before error I have:
 - 14, 2013 5:25:36 AM CollectionsHandler handleReloadAction INFO: Reloading
 Collection : name=productsaction=RELOAD
 - hundreds of (with the same timestamp) 14, 2013 5:25:36 AM
 DistributedQueue$LatchChildWatcher process INFO: Watcher fired on path:
 /overseer/collection-queue-work state: SyncConnected type
 NodeChildrenChanged
 - 13 times (from 2013 5:25:39 to 5:25:45):
 -- 14, 2013 5:25:39 AM SolrDispatchFilter handleAdminRequest INFO: [admin]
 webapp=null path=/admin/cores params={action=STATUS&wt=ruby} status=0
 QTime=2
 -- 14, 2013 5:25:39 AM SolrDispatchFilter handleAdminRequest INFO: [admin]
 webapp=null path=/admin/cores params={action=STATUS&wt=ruby} status=0
 QTime=1
 -- 14, 2013 5:25:39 AM SolrCore execute INFO: [forum] webapp=/solr
 path=/admin/mbeans params={stats=true&wt=ruby} status=0 QTime=2
 -- 14, 2013 5:25:39 AM SolrCore execute INFO: [knowledge] webapp=/solr
 path=/admin/mbeans params={stats=true&wt=ruby} status=0 QTime=2
 -- 14, 2013 5:25:39 AM SolrCore execute INFO: [products] webapp=/solr
 path=/admin/mbeans params={stats=true&wt=ruby} status=0 QTime=2
 -- 14, 2013 5:25:39 AM SolrCore execute INFO: [shops] webapp=/solr
 path=/admin/mbeans params={stats=true&wt=ruby} status=0 QTime=1
 - 14, 2013 5:26:21 AM SolrCore execute INFO: [products] webapp=/solr
 path=/select/ params={q=solrpingquery} hits=0 status=0 QTime=0
 - 14, 2013 5:26:36 AM DistributedQueue$LatchChildWatcher process INFO:
 Watcher fired on path: /overseer/collection-queue-work/qnr-000806
 state: SyncConnected type NodeDeleted
 - 14, 2013 5:26:36 AM SolrException log SEVERE:
 org.apache.solr.common.SolrException: reloadcollection the collection time
 out:60s
 at
 org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:162)
 at
 org.apache.solr.handler.admin.CollectionsHandler.handleReloadAction(CollectionsHandler.java:184)
 at
 org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:120)
 
  What are the possibilities of such behaviour? When is this error thrown?
  Does anybody have the same issue?
 
 
 On 17 October 2013 13:08, Grzegorz Sobczyk gsobc...@gmail.com wrote:
 
 
 
 On 16 October 2013 11:48, RadhaJayalakshmi 
 rlakshminaraya...@inautix.co.in wrote:
 
 Hi,
 My setup is
 Zookeeper ensemble - running with 3 nodes
  Tomcats - 9 Tomcat instances are brought up, by registering with
  zookeeper.

  Steps:
  1) I uploaded the Solr configuration (db_data_config, solrconfig, schema
  XMLs) into ZooKeeper
  2) Now, I am trying to create a collection with the collection API like
 below:
 
 
  http://miadevuser001.albridge.com:7021/solr/admin/collections?action=CREATE&name=Schwab_InvACC_Coll&numShards=1&replicationFactor=2&createNodeSet=localhost:7034_solr,localhost:7036_solr&collection.configName=InvestorAccountDomainConfig
 
  Now, when I execute this command, I am getting the following error:
  <response><lst name="responseHeader"><int name="status">500</int><int
  name="QTime">60015</int></lst><lst name="error"><str
  name="msg">createcollection the collection time out:60s</str><str
  name="trace">org.apache.solr.common.SolrException: createcollection the
  collection time out:60s
at
 
 org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:175)
at
 
 org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:156)
at
 
 org.apache.solr.handler.admin.CollectionsHandler.handleCreateAction(CollectionsHandler.java:290)
at
 
 org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:112)
at
 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at
 
 org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:611)
at
 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:218)
at
 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
at
 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at
 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at
 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at
 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at
 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at
 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at
 

Re: limiting deep pagination

2013-10-17 Thread Peter Keegan
Yes, right now this constraint could be implemented in either the web app
or Solr. I see now that many of the QTimes on these queries are <10 ms
(probably due to caching), so I'm a bit less concerned.
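If it stays in the web app, the guard is only a few lines - a minimal sketch,
with illustrative limits (not anything Solr defines):

    // reject requests that page too deep; tune the limits per deployment
    private static final int MAX_ROWS = 1000;
    private static final int MAX_START_PLUS_ROWS = 10000;

    static void checkPaging(int start, int rows) {
        if (rows > MAX_ROWS || start + rows > MAX_START_PLUS_ROWS) {
            throw new IllegalArgumentException(
                "pagination too deep: start=" + start + " rows=" + rows);
        }
    }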


On Wed, Oct 16, 2013 at 2:13 AM, Furkan KAMACI furkankam...@gmail.com wrote:

 I just wonder: don't you implement a custom API that interacts with
 Solr and limits such kinds of requests? (I know that you are asking about
 how to do that in Solr but I handle such situations at my custom search
 APIs and want to learn what fellows do)


 On Wednesday, 9 October 2013, Michael Sokolov
 msoko...@safaribooksonline.com wrote:
  On 10/8/13 6:51 PM, Peter Keegan wrote:
 
  Is there a way to configure Solr 'defaults/appends/invariants' such that
  the product of the 'start' and 'rows' parameters doesn't exceed a given
  value? This would be to prevent deep pagination.  Or would this require
 a
  custom requestHandler?
 
  Peter
 
  Just wondering -- isn't it the sum that you should be concerned about
 rather than the product?  Actually I think what we usually do is limit both
 independently, with slightly different concerns, since e.g. start=1,
 rows=1000 causes memory problems if you have large fields in your results,
 whereas start=1000, rows=1 may not actually be a problem
 
  -Mike
 



Re: ExtractRequestHandler, skipping errors

2013-10-17 Thread Koji Sekiguchi

Hi Roland,

(13/10/17 20:44), Roland Everaert wrote:

Hi,

I helped a customer deploy solr+manifoldCF and everything is going
quite smoothly, but every time solr raises an exception, the
manifoldcf job feeding
solr aborts. I would like to know if it is possible to configure the
ExtractRequestHandler to ignore errors like it seems to be possible with
dataimporthandler and entity processors.

I know that it is possible to configure the ExtractRequestHandler to ignore
tika exceptions (we already do that) but the errors that now stop the
mcf jobs are generated by
solr itself.

While it would be interesting to have such an option in solr, I plan to post to the
manifoldcf mailing list, anyway, to ask if it is possible to configure
manifoldcf to be less picky about solr errors.



ignoreTikaException flag might help you?

https://issues.apache.org/jira/browse/SOLR-2480
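As a default on the extract handler in solrconfig.xml it would look something
like this (handler name illustrative):

<requestHandler name="/update/extract"
                class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="ignoreTikaException">true</str>
  </lst>
</requestHandler>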

koji
--
http://soleami.com/blog/automatically-acquiring-synonym-knowledge-from-wikipedia.html


Re: SolrCloud Performance Issue

2013-10-17 Thread shamik
Thanks Primoz, I was suspecting that too. But then, it's hard to imagine that
the query cache alone is contributing to the big performance hit. The same
setting applies to the old configuration, and that works pretty well even
with a low query cache hit rate.
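For reference, when a bf really does use NOW, the usual mitigation is date
rounding, so the function value (and the cache entry) only changes once per
window - e.g. (field name illustrative):

bf=recip(ms(NOW/HOUR,publish_date),3.16e-11,1,1)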



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Performance-Issue-tp4095971p4096123.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Change config set for a collection

2013-10-17 Thread Shawn Heisey
On 10/17/2013 2:36 AM, michael.boom wrote:
 The question was also asked some 10 months ago in
 http://lucene.472066.n3.nabble.com/SolrCloud-4-1-change-config-set-for-a-collection-td4037456.html,
 and then the answer was negative, but here it goes again, maybe now it's
 different.
 
 Is it possible to change the config set of a collection using the Collection
 API to another one (stored in zookeeper)? If not, is it possible to do it
 using zkCli ?
 
 Also how can somebody check which config set a collection is using ?
 Thanks!

The zkcli command linkconfig should take care of that.  You'd need to
reload the collection after making the change.  If you're using a
version prior to 4.4, reloading doesn't work, you need to restart Solr
completely.
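For example (zkhost and names illustrative):

sh zkcli.sh -zkhost zk1:2181 -cmd linkconfig -collection coll1 -confname newconf
http://localhost:8983/solr/admin/collections?action=RELOAD&name=coll1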

You can see what config a collection is using with the Cloud->Tree
section of the admin UI.  Open /collections and click on the collection.
 At the bottom of the right-hand window, it has a small JSON string with
configName in it.  I don't know of a way to easily get this
information from Solr with a program.  If your program is Java, you
could very likely grab the zookeeper object from CloudSolrServer and
find it that way, but I have no idea how to write that code.
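A rough, untested sketch against the 4.x SolrJ API (zkhost and collection
name illustrative, exception handling omitted):

import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.cloud.SolrZkClient;

CloudSolrServer server = new CloudSolrServer("zk1:2181");
server.connect();
SolrZkClient zk = server.getZkStateReader().getZkClient();
// the /collections/<name> znode holds the small JSON blob with configName
byte[] data = zk.getData("/collections/coll1", null, null, true);
System.out.println(new String(data, "UTF-8"));
server.shutdown();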

Thanks,
Shawn



Re: Timeout Errors while using Collections API

2013-10-17 Thread Grzegorz Sobczyk
Thanks, I'll try upgrading.


On 17 October 2013 15:55, Mark Miller markrmil...@gmail.com wrote:

 There was a reload bug in SolrCloud that was fixed in 4.4 -
 https://issues.apache.org/jira/browse/SOLR-4805

 Mark

 On Oct 17, 2013, at 7:18 AM, Grzegorz Sobczyk gsobc...@gmail.com wrote:

  Sorry for previous spam (something eat my message)
 
  I have the same problem but with reload action
  ENV:
  - 3x Solr 4.2.1 with 4 cores each
  - ZK
 
  Before error I have:
  - 14, 2013 5:25:36 AM CollectionsHandler handleReloadAction INFO:
 Reloading
Collection : name=products&action=RELOAD
  - hundreds of (with the same timestamp) 14, 2013 5:25:36 AM
  DistributedQueue$LatchChildWatcher process INFO: Watcher fired on path:
  /overseer/collection-queue-work state: SyncConnected type
  NodeChildrenChanged
  - 13 times (from 2013 5:25:39 to 5:25:45):
  -- 14, 2013 5:25:39 AM SolrDispatchFilter handleAdminRequest INFO:
 [admin]
webapp=null path=/admin/cores params={action=STATUS&wt=ruby} status=0
  QTime=2
  -- 14, 2013 5:25:39 AM SolrDispatchFilter handleAdminRequest INFO:
 [admin]
webapp=null path=/admin/cores params={action=STATUS&wt=ruby} status=0
  QTime=1
  -- 14, 2013 5:25:39 AM SolrCore execute INFO: [forum] webapp=/solr
path=/admin/mbeans params={stats=true&wt=ruby} status=0 QTime=2
  -- 14, 2013 5:25:39 AM SolrCore execute INFO: [knowledge] webapp=/solr
path=/admin/mbeans params={stats=true&wt=ruby} status=0 QTime=2
  -- 14, 2013 5:25:39 AM SolrCore execute INFO: [products] webapp=/solr
path=/admin/mbeans params={stats=true&wt=ruby} status=0 QTime=2
  -- 14, 2013 5:25:39 AM SolrCore execute INFO: [shops] webapp=/solr
path=/admin/mbeans params={stats=true&wt=ruby} status=0 QTime=1
  - 14, 2013 5:26:21 AM SolrCore execute INFO: [products] webapp=/solr
  path=/select/ params={q=solrpingquery} hits=0 status=0 QTime=0
  - 14, 2013 5:26:36 AM DistributedQueue$LatchChildWatcher process INFO:
  Watcher fired on path: /overseer/collection-queue-work/qnr-000806
  state: SyncConnected type NodeDeleted
  - 14, 2013 5:26:36 AM SolrException log SEVERE:
  org.apache.solr.common.SolrException: reloadcollection the collection
 time
  out:60s
  at
 
 org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:162)
  at
 
 org.apache.solr.handler.admin.CollectionsHandler.handleReloadAction(CollectionsHandler.java:184)
  at
 
 org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:120)
 
 What are the possible causes of such behaviour? When is this error thrown?
 Does anybody have the same issue?
 
 
  On 17 October 2013 13:08, Grzegorz Sobczyk gsobc...@gmail.com wrote:
 
 
 
  On 16 October 2013 11:48, RadhaJayalakshmi 
  rlakshminaraya...@inautix.co.in wrote:
 
  Hi,
  My setup is
  Zookeeper ensemble - running with 3 nodes
Tomcats - 9 Tomcat instances are brought up, by registering with
  zookeeper.
 
  Steps :
  1) I uploaded the solr configuration like db_data_config, solrconfig,
  schema
xmls into zookeeper
  2)  Now, i am trying to create a collection with the collection API
 like
  below:
 
 
 
http://miadevuser001.albridge.com:7021/solr/admin/collections?action=CREATE&name=Schwab_InvACC_Coll&numShards=1&replicationFactor=2&createNodeSet=localhost:7034_solr,localhost:7036_solr&collection.configName=InvestorAccountDomainConfig
 
  Now, when i execute this command, i am getting the following error:
<response><lst name="responseHeader"><int name="status">500</int>
<int name="QTime">60015</int></lst><lst name="error">
<str name="msg">createcollection the collection time out:60s</str>
<str name="trace">org.apache.solr.common.SolrException: createcollection the
collection time out:60s
 at
 
 
 org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:175)
 at
 
 
 org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:156)
 at
 
 
 org.apache.solr.handler.admin.CollectionsHandler.handleCreateAction(CollectionsHandler.java:290)
 at
 
 
 org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:112)
 at
 
 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at
 
 
 org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:611)
 at
 
 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:218)
 at
 
 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
 at
 
 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
 at
 
 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
 at
 
 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
 at
 
 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
 at
 
 
 

Re: Change config set for a collection

2013-10-17 Thread michael.boom
Thank you, Shawn!

linkconfig - that's exactly what I was looking for!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Change-config-set-for-a-collection-tp4096032p4096134.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Change config set for a collection

2013-10-17 Thread Garth Grimm
But if you're working with multiple configs in zookeeper, be aware that 4.5 
currently has an issue creating multiple collections in a cloud that has 
multiple configs.  It's targeted to be fixed whenever 4.5.1 comes out.

https://issues.apache.org/jira/i#browse/SOLR-5306


-Original Message-
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Thursday, October 17, 2013 10:24 AM
To: solr-user@lucene.apache.org
Subject: Re: Change config set for a collection

On 10/17/2013 2:36 AM, michael.boom wrote:
 The question was also asked some 10 months ago in 
 http://lucene.472066.n3.nabble.com/SolrCloud-4-1-change-config-set-for
 -a-collection-td4037456.html, and then the answer was negative, but 
 here it goes again, maybe now it's different.
 
 Is it possible to change the config set of a collection using the 
 Collection API to another one (stored in zookeeper)? If not, is it 
 possible to do it using zkCli ?
 
 Also how can somebody check which config set a collection is using ?
 Thanks!

The zkcli command linkconfig should take care of that.  You'd need to reload 
the collection after making the change.  If you're using a version prior to 
4.4, reloading doesn't work, you need to restart Solr completely.

You can see what config a collection is using with the Cloud->Tree section of 
the admin UI.  Open /collections and click on the collection.
 At the bottom of the right-hand window, it has a small JSON string with 
configName in it.  I don't know of a way to easily get this information from 
Solr with a program.  If your program is Java, you could very likely grab the 
zookeeper object from CloudSolrServer and find it that way, but I have no idea 
how to write that code.

Thanks,
Shawn



Chegg is looking for a search engineer

2013-10-17 Thread Walter Underwood
I work at Chegg.com and I really like it, but we have more search work than I 
can do by myself, so we are hiring a senior software engineer for search. Most 
of our search services are on Solr. 

http://www.chegg.com/jobs/listings/?jvi=oAQGXfwN,Job

If you'd like to know a lot more about Chegg's business, you can read the S1 
that we filed recently in preparation for an IPO.

wunder
--
Walter Underwood
wun...@wunderwood.org
Search Guy
Chegg.com



RE: Change config set for a collection

2013-10-17 Thread michael.boom
Thanks Garth!

Yes, indeed, I know that issue.
I had set up my SolrCloud using 4.5.0 and then encountered this problem, so
I rolled back to 4.4.0



-
Thanks,
Michael
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Change-config-set-for-a-collection-tp4096032p4096136.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Switching indexes

2013-10-17 Thread Christopher Gross
OK, super confused now.

http://index1:8080/solr/admin/cores?action=CREATE&name=test2&collection=test2&numshards=1&replicationFactor=3

Nets me this:
<response>
<lst name="responseHeader">
<int name="status">400</int>
<int name="QTime">15007</int>
</lst>
<lst name="error">
<str name="msg">Error CREATEing SolrCore 'test2': Could not find configName
for collection test2 found:[xxx, xxx, , x, xx]</str>
<int name="code">400</int>
</lst>
</response>

For that node (test2), in my solr data directory, I have a folder with the
conf files and an existing data dir (copied the index from another
location).

Right now it seems like the only way that I can add in a collection is to
load the configs into zookeeper, stop tomcat, add it to the solr.xml file,
and restart tomcat.

Is there a primer that I'm missing for how to do this?

Thanks.


-- Chris


On Wed, Oct 16, 2013 at 2:59 PM, Christopher Gross cogr...@gmail.com wrote:

 Thanks Shawn, the explanations help bring me forward to the SolrCloud
 mentality.

 So it sounds like going forward that I should have a more complicated name
 (ex: coll1-20131015) aliased to coll1, to make it easier to switch in the
 future.

 Now, if I already have an index (copied from one location to another), it
 sounds like I should just remove my existing (bad/old data) coll1, create
 the replicated one (calling it coll1-date), then alias coll1 to that
 one.

 This type of information would have been awesome to know before I got
 started, but I can make do with what I've got going now.

 Thanks again!


 -- Chris


 On Wed, Oct 16, 2013 at 2:40 PM, Shawn Heisey s...@elyograg.org wrote:

 On 10/16/2013 11:51 AM, Christopher Gross wrote:
  Ok, so I think I was confusing the terminology (still in a 3.X mindset I
  guess.)
 
From the Cloud->Tree, I do see that I have collections for what I was
  calling core1, core2, etc.
 
  So, to redo the above,
  Servers: index1, index2, index3
  Collections: (on each) coll1, coll2
  Collection (core?) on index1: coll1new
 
  Each Collection has 1 shard (too small to make sharding worthwhile).
 
  So should I run something like this:
 
http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=col11new
 
  Or will I need coll1new to be on each of the index1, index2 and index3
  instances of Solr?

 I don't think you can create an alias if a collection already exists
 with that name - so having a collection named core1 means you wouldn't
 want an alias named core1.  I could be wrong, but just to keep things
 clean, I wouldn't recommend it, even if it's possible.

 That CREATEALIAS command will only work if coll1new shows up in
 /collections and shows green on the cloud graph.  If it does, and you're
 using an alias name that doesn't already exist as a collection, then
 you're good.

 Whether coll1new is living on one server, two servers, or all three
 servers doesn't matter for CREATEALIAS, or for most other
 collection-related topics.  Any query or update can be sent to any
 server in the cloud and it will be routed to the correct place according
 to the clusterstate.

 Where things live and how many replicas there are *does* matter for a
 discussion about redundancy.  Generally speaking, you're going to want
 your shards to have at least two replicas, so that if a Solr instance
 goes down, or is taken down for maintenance, your cloud remains fully
 operational.  In your situation, you probably want three replicas - so
 each collection lives on all three servers.

 So my general advice:

 Decide what name you want your application to use, make sure none of
 your existing collections are using that name, and set up an alias with
 that name pointing to whichever collection is current.  Then change your
 application configurations or code to point at the alias instead of
 directly at the collection.

 When you want to do your reindex, first create a new collection using
 the collections API.  Index to that new collection.  When it's ready to
 go, use CREATEALIAS to update the alias, and your application will start
 using the new index.
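Concretely, that switch is just two Collections API calls (collection and
config names illustrative):

http://index1:8080/solr/admin/collections?action=CREATE&name=coll1-20131018&numShards=1&replicationFactor=3&collection.configName=myconf
http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=coll1-20131018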

 Thanks,
 Shawn





Re: Switching indexes

2013-10-17 Thread Christopher Gross
Also, when I make an alias:
http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=test1-alias&collections=test1

I get a pretty useless response:
<response><lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int></lst></response>

So I'm not sure if it is made.  I tried going to:
http://index1:8080/solr/test1-alias/select?q=*:*
but that didn't work.  How do I use an alias when it gets made?


-- Chris


On Thu, Oct 17, 2013 at 2:51 PM, Christopher Gross cogr...@gmail.com wrote:

 OK, super confused now.


 http://index1:8080/solr/admin/cores?action=CREATE&name=test2&collection=test2&numshards=1&replicationFactor=3

 Nets me this:
 <response>
 <lst name="responseHeader">
 <int name="status">400</int>
 <int name="QTime">15007</int>
 </lst>
 <lst name="error">
 <str name="msg">Error CREATEing SolrCore 'test2': Could not find
 configName for collection test2 found:[xxx, xxx, , x, xx]</str>
 <int name="code">400</int>
 </lst>
 </response>

 For that node (test2), in my solr data directory, I have a folder with the
 conf files and an existing data dir (copied the index from another
 location).

 Right now it seems like the only way that I can add in a collection is to
 load the configs into zookeeper, stop tomcat, add it to the solr.xml file,
 and restart tomcat.

 Is there a primer that I'm missing for how to do this?

 Thanks.


 -- Chris


 On Wed, Oct 16, 2013 at 2:59 PM, Christopher Gross cogr...@gmail.com wrote:

 Thanks Shawn, the explanations help bring me forward to the SolrCloud
 mentality.

 So it sounds like going forward that I should have a more complicated
 name (ex: coll1-20131015) aliased to coll1, to make it easier to switch in
 the future.

 Now, if I already have an index (copied from one location to another), it
 sounds like I should just remove my existing (bad/old data) coll1, create
 the replicated one (calling it coll1-date), then alias coll1 to that
 one.

 This type of information would have been awesome to know before I got
 started, but I can make do with what I've got going now.

 Thanks again!


 -- Chris


 On Wed, Oct 16, 2013 at 2:40 PM, Shawn Heisey s...@elyograg.org wrote:

 On 10/16/2013 11:51 AM, Christopher Gross wrote:
  Ok, so I think I was confusing the terminology (still in a 3.X mindset
 I
  guess.)
 
From the Cloud->Tree, I do see that I have collections for what I was
  calling core1, core2, etc.
 
  So, to redo the above,
  Servers: index1, index2, index3
  Collections: (on each) coll1, coll2
  Collection (core?) on index1: coll1new
 
  Each Collection has 1 shard (too small to make sharding worthwhile).
 
  So should I run something like this:
 
http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=col11new
 
  Or will I need coll1new to be on each of the index1, index2 and index3
  instances of Solr?

 I don't think you can create an alias if a collection already exists
 with that name - so having a collection named core1 means you wouldn't
 want an alias named core1.  I could be wrong, but just to keep things
 clean, I wouldn't recommend it, even if it's possible.

 That CREATEALIAS command will only work if coll1new shows up in
 /collections and shows green on the cloud graph.  If it does, and you're
 using an alias name that doesn't already exist as a collection, then
 you're good.

 Whether coll1new is living on one server, two servers, or all three
 servers doesn't matter for CREATEALIAS, or for most other
 collection-related topics.  Any query or update can be sent to any
 server in the cloud and it will be routed to the correct place according
 to the clusterstate.

 Where things live and how many replicas there are *does* matter for a
 discussion about redundancy.  Generally speaking, you're going to want
 your shards to have at least two replicas, so that if a Solr instance
 goes down, or is taken down for maintenance, your cloud remains fully
 operational.  In your situation, you probably want three replicas - so
 each collection lives on all three servers.

 So my general advice:

 Decide what name you want your application to use, make sure none of
 your existing collections are using that name, and set up an alias with
 that name pointing to whichever collection is current.  Then change your
 application configurations or code to point at the alias instead of
 directly at the collection.

 When you want to do your reindex, first create a new collection using
 the collections API.  Index to that new collection.  When it's ready to
 go, use CREATEALIAS to update the alias, and your application will start
 using the new index.

 Thanks,
 Shawn






RE: Switching indexes

2013-10-17 Thread Garth Grimm
Go to the admin screen for Cloud/Tree, and then click the node for 
aliases.json.  To the lower right, you should see something like:

{"collection":{"AdWorksQuery":"AdWorks"}}

Or access the Zookeeper instance, and do a 'get /aliases.json'.

-Original Message-
From: Christopher Gross [mailto:cogr...@gmail.com] 
Sent: Thursday, October 17, 2013 2:40 PM
To: solr-user
Subject: Re: Switching indexes

Also, when I make an alias:
http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=test1-alias&collections=test1

I get a pretty useless response:
<response><lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int></lst></response>

So I'm not sure if it is made.  I tried going to:
http://index1:8080/solr/test1-alias/select?q=*:*
but that didn't work.  How do I use an alias when it gets made?


-- Chris


On Thu, Oct 17, 2013 at 2:51 PM, Christopher Gross cogr...@gmail.com wrote:

 OK, super confused now.


 http://index1:8080/solr/admin/cores?action=CREATE&name=test2&collection=test2&numshards=1&replicationFactor=3

 Nets me this:
 <response>
 <lst name="responseHeader">
 <int name="status">400</int>
 <int name="QTime">15007</int>
 </lst>
 <lst name="error">
 <str name="msg">Error CREATEing SolrCore 'test2': Could not find
 configName for collection test2 found:[xxx, xxx, , x,
 xx]</str> <int name="code">400</int> </lst> </response>

 For that node (test2), in my solr data directory, I have a folder with 
 the conf files and an existing data dir (copied the index from another 
 location).

 Right now it seems like the only way that I can add in a collection is 
 to load the configs into zookeeper, stop tomcat, add it to the 
 solr.xml file, and restart tomcat.

 Is there a primer that I'm missing for how to do this?

 Thanks.


 -- Chris


 On Wed, Oct 16, 2013 at 2:59 PM, Christopher Gross cogr...@gmail.com wrote:

 Thanks Shawn, the explanations help bring me forward to the SolrCloud
 mentality.

 So it sounds like going forward that I should have a more complicated 
 name (ex: coll1-20131015) aliased to coll1, to make it easier to 
 switch in the future.

 Now, if I already have an index (copied from one location to 
 another), it sounds like I should just remove my existing (bad/old 
 data) coll1, create the replicated one (calling it coll1-date), 
 then alias coll1 to that one.

 This type of information would have been awesome to know before I got 
 started, but I can make do with what I've got going now.

 Thanks again!


 -- Chris


 On Wed, Oct 16, 2013 at 2:40 PM, Shawn Heisey s...@elyograg.org wrote:

 On 10/16/2013 11:51 AM, Christopher Gross wrote:
  Ok, so I think I was confusing the terminology (still in a 3.X 
  mindset
 I
  guess.)
 
From the Cloud->Tree, I do see that I have collections for what 
  I was calling core1, core2, etc.
 
  So, to redo the above,
  Servers: index1, index2, index3
  Collections: (on each) coll1, coll2 Collection (core?) on index1: 
  coll1new
 
  Each Collection has 1 shard (too small to make sharding worthwhile).
 
  So should I run something like this:
 
 http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=col11new
 
  Or will I need coll1new to be on each of the index1, index2 and 
  index3 instances of Solr?

 I don't think you can create an alias if a collection already exists 
 with that name - so having a collection named core1 means you 
 wouldn't want an alias named core1.  I could be wrong, but just to 
 keep things clean, I wouldn't recommend it, even if it's possible.

 That CREATEALIAS command will only work if coll1new shows up in 
 /collections and shows green on the cloud graph.  If it does, and 
 you're using an alias name that doesn't already exist as a 
 collection, then you're good.

 Whether coll1new is living on one server, two servers, or all three 
 servers doesn't matter for CREATEALIAS, or for most other 
 collection-related topics.  Any query or update can be sent to any 
 server in the cloud and it will be routed to the correct place 
 according to the clusterstate.

 Where things live and how many replicas there are *does* matter for 
 a discussion about redundancy.  Generally speaking, you're going to 
 want your shards to have at least two replicas, so that if a Solr 
 instance goes down, or is taken down for maintenance, your cloud 
 remains fully operational.  In your situation, you probably want 
 three replicas - so each collection lives on all three servers.

 So my general advice:

 Decide what name you want your application to use, make sure none of 
 your existing collections are using that name, and set up an alias 
 with that name pointing to whichever collection is current.  Then 
 change your application configurations or code to point at the alias 
 instead of directly at the collection.

 When you want to do your reindex, first create a new collection 
 using the collections API.  Index to that new collection.  When it's 
 ready to go, use CREATEALIAS to update the alias, and your 
 application will start 

Re: Switching indexes

2013-10-17 Thread Christopher Gross
I can't find it in the Admin->Cloud->Tree part of the UI.

Trying to get the file:
[zk: localhost:2181(CONNECTED) 0] get /aliases.json
Node does not exist: /aliases.json

So it didn't stick -- I'm guessing.  I don't see an error message regarding
the alias in my logs either.  Anywhere else I should look?

-- Chris


On Thu, Oct 17, 2013 at 3:50 PM, Garth Grimm 
garthgr...@averyranchconsulting.com wrote:

 Go to the admin screen for Cloud/Tree, and then click the node for
 aliases.json.  To the lower right, you should see something like:

 {"collection":{"AdWorksQuery":"AdWorks"}}

 Or access the Zookeeper instance, and do a 'get /aliases.json'.

 -Original Message-
 From: Christopher Gross [mailto:cogr...@gmail.com]
 Sent: Thursday, October 17, 2013 2:40 PM
 To: solr-user
 Subject: Re: Switching indexes

 Also, when I make an alias:

 http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=test1-alias&collections=test1

 I get a pretty useless response:
 <response><lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int></lst></response>

 So I'm not sure if it is made.  I tried going to:
 http://index1:8080/solr/test1-alias/select?q=*:*
 but that didn't work.  How do I use an alias when it gets made?


 -- Chris


 On Thu, Oct 17, 2013 at 2:51 PM, Christopher Gross cogr...@gmail.com
 wrote:

  OK, super confused now.
 
 
  http://index1:8080/solr/admin/cores?action=CREATE&name=test2&collection=test2&numshards=1&replicationFactor=3
 
  Nets me this:
  <response>
  <lst name="responseHeader">
  <int name="status">400</int>
  <int name="QTime">15007</int>
  </lst>
  <lst name="error">
  <str name="msg">Error CREATEing SolrCore 'test2': Could not find
  configName for collection test2 found:[xxx, xxx, , x,
  xx]</str> <int name="code">400</int> </lst> </response>
 
  For that node (test2), in my solr data directory, I have a folder with
  the conf files and an existing data dir (copied the index from another
  location).
 
  Right now it seems like the only way that I can add in a collection is
  to load the configs into zookeeper, stop tomcat, add it to the
  solr.xml file, and restart tomcat.
 
  Is there a primer that I'm missing for how to do this?
 
  Thanks.
 
 
  -- Chris
 
 
  On Wed, Oct 16, 2013 at 2:59 PM, Christopher Gross cogr...@gmail.com
 wrote:
 
  Thanks Shawn, the explanations help bring me forward to the SolrCloud
  mentality.
 
  So it sounds like going forward that I should have a more complicated
  name (ex: coll1-20131015) aliased to coll1, to make it easier to
  switch in the future.
 
  Now, if I already have an index (copied from one location to
  another), it sounds like I should just remove my existing (bad/old
  data) coll1, create the replicated one (calling it coll1-date),
  then alias coll1 to that one.
 
  This type of information would have been awesome to know before I got
  started, but I can make do with what I've got going now.
 
  Thanks again!
 
 
  -- Chris
 
 
  On Wed, Oct 16, 2013 at 2:40 PM, Shawn Heisey s...@elyograg.org
 wrote:
 
  On 10/16/2013 11:51 AM, Christopher Gross wrote:
   Ok, so I think I was confusing the terminology (still in a 3.X
   mindset
  I
   guess.)
  
From the Cloud->Tree, I do see that I have collections for what
   I was calling core1, core2, etc.
  
   So, to redo the above,
   Servers: index1, index2, index3
   Collections: (on each) coll1, coll2 Collection (core?) on index1:
   coll1new
  
   Each Collection has 1 shard (too small to make sharding worthwhile).
  
   So should I run something like this:
  
  http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=col11new
  
   Or will I need coll1new to be on each of the index1, index2 and
   index3 instances of Solr?
 
  I don't think you can create an alias if a collection already exists
  with that name - so having a collection named core1 means you
  wouldn't want an alias named core1.  I could be wrong, but just to
  keep things clean, I wouldn't recommend it, even if it's possible.
 
  That CREATEALIAS command will only work if coll1new shows up in
  /collections and shows green on the cloud graph.  If it does, and
  you're using an alias name that doesn't already exist as a
  collection, then you're good.
 
  Whether coll1new is living on one server, two servers, or all three
  servers doesn't matter for CREATEALIAS, or for most other
  collection-related topics.  Any query or update can be sent to any
  server in the cloud and it will be routed to the correct place
  according to the clusterstate.
 
  Where things live and how many replicas there are *does* matter for
  a discussion about redundancy.  Generally speaking, you're going to
  want your shards to have at least two replicas, so that if a Solr
  instance goes down, or is taken down for maintenance, your cloud
  remains fully operational.  In your situation, you probably want
  three replicas - so each collection lives on all three servers.
 
  So my general advice:
 
  Decide what name you want your 

Check if dynamic columns exists and query else ignore

2013-10-17 Thread Utkarsh Sengar
I'm trying to do this:

if (US_offers_i exists):
   fq=US_offers_i:[1 TO *]
else:
   fq=offers_count:[1 TO *]

Where:
US_offers_i is a dynamic field containing an int
offers_count is a static field containing an int.

I have tried this so far but it doesn't work:

http://solr_server/solr/col1/select?q=iphone+5s
&fq=if(exist(US_offers_i),US_offers_i:[1 TO *], offers_count:[1 TO *])

Also, is there a heavy performance penalty for this condition? I am
planning to use this for all my queries.
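Since Solr does have exists() and if() functions, one untested possibility is
to push the conditional into a function range query:

fq={!frange l=1}if(exists(US_offers_i),US_offers_i,offers_count)

though a function-based filter like this is evaluated per document, so it
will cost more than a plain cached range filter.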

-- 
Thanks,
-Utkarsh


Re: Switching indexes

2013-10-17 Thread Michael Della Bitta
 load the configs into zookeeper,
Yes.

 stop tomcat, add it to the solr.xml file,
and restart tomcat.

To your CREATE URL, add the parameter collection.configName=blah

http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collections_API
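For this thread's example, that would be something like (config name
illustrative):

http://index1:8080/solr/admin/cores?action=CREATE&name=test2&collection=test2&numShards=1&replicationFactor=3&collection.configName=myconf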

Michael Della Bitta

Applications Developer

o: +1 646 532 3062  | c: +1 917 477 7906

appinions inc.

“The Science of Influence Marketing”

18 East 41st Street

New York, NY 10017

t: @appinions https://twitter.com/Appinions | g+:
plus.google.com/appinions https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts
w: appinions.com http://www.appinions.com/


On Thu, Oct 17, 2013 at 2:51 PM, Christopher Gross cogr...@gmail.com wrote:

 OK, super confused now.


 http://index1:8080/solr/admin/cores?action=CREATE&name=test2&collection=test2&numshards=1&replicationFactor=3

 Nets me this:
 <response>
 <lst name="responseHeader">
 <int name="status">400</int>
 <int name="QTime">15007</int>
 </lst>
 <lst name="error">
 <str name="msg">Error CREATEing SolrCore 'test2': Could not find configName
 for collection test2 found:[xxx, xxx, , x, xx]</str>
 <int name="code">400</int>
 </lst>
 </response>

 For that node (test2), in my solr data directory, I have a folder with the
 conf files and an existing data dir (copied the index from another
 location).

 Right now it seems like the only way that I can add in a collection is to
 load the configs into zookeeper, stop tomcat, add it to the solr.xml file,
 and restart tomcat.

 Is there a primer that I'm missing for how to do this?

 Thanks.


 -- Chris


 On Wed, Oct 16, 2013 at 2:59 PM, Christopher Gross cogr...@gmail.com
 wrote:

  Thanks Shawn, the explanations help bring me forward to the SolrCloud
  mentality.
 
  So it sounds like going forward that I should have a more complicated
 name
  (ex: coll1-20131015) aliased to coll1, to make it easier to switch in the
  future.
 
  Now, if I already have an index (copied from one location to another), it
  sounds like I should just remove my existing (bad/old data) coll1, create
  the replicated one (calling it coll1-date), then alias coll1 to that
  one.
 
  This type of information would have been awesome to know before I got
  started, but I can make do with what I've got going now.
 
  Thanks again!
 
 
  -- Chris
 
 
  On Wed, Oct 16, 2013 at 2:40 PM, Shawn Heisey s...@elyograg.org wrote:
 
  On 10/16/2013 11:51 AM, Christopher Gross wrote:
   Ok, so I think I was confusing the terminology (still in a 3.X
 mindset I
   guess.)
  
From the Cloud->Tree, I do see that I have collections for what I
 was
   calling core1, core2, etc.
  
   So, to redo the above,
   Servers: index1, index2, index3
   Collections: (on each) coll1, coll2
   Collection (core?) on index1: coll1new
  
   Each Collection has 1 shard (too small to make sharding worthwhile).
  
   So should I run something like this:
  
 
 http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=col11new
  
   Or will I need coll1new to be on each of the index1, index2 and index3
   instances of Solr?
 
  I don't think you can create an alias if a collection already exists
  with that name - so having a collection named core1 means you wouldn't
  want an alias named core1.  I could be wrong, but just to keep things
  clean, I wouldn't recommend it, even if it's possible.
 
  That CREATEALIAS command will only work if coll1new shows up in
  /collections and shows green on the cloud graph.  If it does, and you're
  using an alias name that doesn't already exist as a collection, then
  you're good.
 
  Whether coll1new is living on one server, two servers, or all three
  servers doesn't matter for CREATEALIAS, or for most other
  collection-related topics.  Any query or update can be sent to any
  server in the cloud and it will be routed to the correct place according
  to the clusterstate.
 
  Where things live and how many replicas there are *does* matter for a
  discussion about redundancy.  Generally speaking, you're going to want
  your shards to have at least two replicas, so that if a Solr instance
  goes down, or is taken down for maintenance, your cloud remains fully
  operational.  In your situation, you probably want three replicas - so
  each collection lives on all three servers.
 
  So my general advice:
 
  Decide what name you want your application to use, make sure none of
  your existing collections are using that name, and set up an alias with
  that name pointing to whichever collection is current.  Then change your
  application configurations or code to point at the alias instead of
  directly at the collection.
 
  When you want to do your reindex, first create a new collection using
  the collections API.  Index to that new collection.  When it's ready to
  go, use CREATEALIAS to update the alias, and your application will start
  using the new index.
 
  Thanks,
  Shawn
 
 
 



Re: Skipping caches on a /select

2013-10-17 Thread Tim Vaillancourt

Thanks Yonik,

Does cache=false apply to all caches? The docs make it sound like it 
is for filterCache only, but I could be misunderstanding.


When I force a commit and perform a /select query many times with 
cache=false, I notice my query gets cached still, my guess is in the 
queryResultCache. At first the query takes 500ms+, then all subsequent 
requests take 0-1ms. I'll confirm this queryResultCache assumption today.


Cheers,

Tim

On 16/10/13 06:33 PM, Yonik Seeley wrote:

On Wed, Oct 16, 2013 at 6:18 PM, Tim Vaillancourt t...@elementspace.com  wrote:

I am debugging some /select queries on my Solr tier and would like to see
if there is a way to tell Solr to skip the caches on a given /select query
if it happens to ALREADY be in the cache. Live queries are being inserted
and read from the caches, but I want my debug queries to bypass the cache
entirely.

I do know about the cache=false param (that causes the results of a
select to not be INSERTED in to the cache), but what I am looking for
instead is a way to tell Solr to not read the cache at all, even if there
actually is a cached result for my query.

Yeah, cache=false for q or fq should already not use the cache at
all (read or write).

-Yonik


Re: Skipping caches on a /select

2013-10-17 Thread Yonik Seeley
There isn't a global cache=false... it's a local param that can be
applied to any fq or q parameter independently.
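For example (field names and terms illustrative):

/select?q={!cache=false}text:foo&fq={!cache=false}inStock:true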

-Yonik


On Thu, Oct 17, 2013 at 4:39 PM, Tim Vaillancourt t...@elementspace.com wrote:
 Thanks Yonik,

 Does cache=false apply to all caches? The docs make it sound like it is
 for filterCache only, but I could be misunderstanding.

 When I force a commit and perform a /select query many times with
 cache=false, I notice my query gets cached still, my guess is in the
 queryResultCache. At first the query takes 500ms+, then all subsequent
 requests take 0-1ms. I'll confirm this queryResultCache assumption today.

 Cheers,

 Tim


 On 16/10/13 06:33 PM, Yonik Seeley wrote:

 On Wed, Oct 16, 2013 at 6:18 PM, Tim Vaillancourt t...@elementspace.com
 wrote:

 I am debugging some /select queries on my Solr tier and would like to see
 if there is a way to tell Solr to skip the caches on a given /select
 query
 if it happens to ALREADY be in the cache. Live queries are being inserted
 and read from the caches, but I want my debug queries to bypass the cache
 entirely.

 I do know about the cache=false param (that causes the results of a
 select to not be INSERTED in to the cache), but what I am looking for
 instead is a way to tell Solr to not read the cache at all, even if there
 actually is a cached result for my query.

 Yeah, cache=false for q or fq should already not use the cache at
 all (read or write).

 -Yonik


Re: Skipping caches on a /select

2013-10-17 Thread Chris Hostetter


: Does cache=false apply to all caches? The docs make it sound like it is for
: filterCache only, but I could be misunderstanding.

it's per *query* -- not per cache, or per request...

 /select?q={!cache=true}foo&fq={!cache=false}bar&fq={!cache=true}baz

...should cause 1 lookup/insert in the filterCache (baz) and 1 
lookup/insert into the queryResultCache (for the main query with its 
associated filters & pagination)



-Hoss


Re: Skipping caches on a /select

2013-10-17 Thread Tim Vaillancourt


  
  
Awesome, this makes a lot of sense now. Thanks a lot guys.

Currently the only mention of this setting in the docs is under
filterQuery on the "SolrCaching" page as:

"Solr3.4: Adding the
localParam flag of {!cache=false} to a query will prevent
the filterCache from being consulted for that query."

I will update the docs sometime soon to reflect that this can apply
to any query (q or fq).

Cheers,

Tim

On 17/10/13 01:44 PM, Chris Hostetter wrote:

  

: Does "cache=false" apply to all caches? The docs make it sound like it is for
: filterCache only, but I could be misunderstanding.

it's per *query* -- not per cache, or per request...

 /select?q={!cache=true}foo&fq={!cache=false}bar&fq={!cache=true}baz

...should cause 1 lookup/insert in the filterCache (baz) and 1 
lookup/insert into the queryResultCache (for the main query with its 
associated filters & pagination)



-Hoss


  



solrconfig.xml carrot2 params

2013-10-17 Thread youknow...@heroicefforts.net
Would someone help me out with the syntax for setting Tokenizer.documentFields 
in the ClusteringComponent engine definition in solrconfig.xml?  Carrot2 is 
expecting a Collection of Strings.  There's no schema definition for this XML 
file and a big TODO on the Wiki wrt init params.  Every permutation I have 
tried results in an error stating:  Cannot set java.util.Collection field ... 
to java.lang.String.
-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

Re: Skipping caches on a /select

2013-10-17 Thread Bill Bell
But a global one on a qt would be awesome!!!

Bill Bell
Sent from mobile


 On Oct 17, 2013, at 2:43 PM, Yonik Seeley ysee...@gmail.com wrote:
 
 There isn't a global  cache=false... it's a local param that can be
 applied to any fq or q parameter independently.
 
 -Yonik
 
 
 On Thu, Oct 17, 2013 at 4:39 PM, Tim Vaillancourt t...@elementspace.com 
 wrote:
 Thanks Yonik,
 
 Does cache=false apply to all caches? The docs make it sound like it is
 for filterCache only, but I could be misunderstanding.
 
 When I force a commit and perform a /select query many times with
 cache=false, I notice my query gets cached still, my guess is in the
 queryResultCache. At first the query takes 500ms+, then all subsequent
 requests take 0-1ms. I'll confirm this queryResultCache assumption today.
 
 Cheers,
 
 Tim
 
 
 On 16/10/13 06:33 PM, Yonik Seeley wrote:
 
 On Wed, Oct 16, 2013 at 6:18 PM, Tim Vaillancourt t...@elementspace.com
 wrote:
 
 I am debugging some /select queries on my Solr tier and would like to see
 if there is a way to tell Solr to skip the caches on a given /select
 query
 if it happens to ALREADY be in the cache. Live queries are being inserted
 and read from the caches, but I want my debug queries to bypass the cache
 entirely.
 
 I do know about the cache=false param (that causes the results of a
 select to not be INSERTED in to the cache), but what I am looking for
 instead is a way to tell Solr to not read the cache at all, even if there
 actually is a cached result for my query.
 
 Yeah, cache=false for q or fq should already not use the cache at
 all (read or write).
 
 -Yonik


Re: Switching indexes

2013-10-17 Thread Shawn Heisey

On 10/17/2013 12:51 PM, Christopher Gross wrote:

OK, super confused now.

http://index1:8080/solr/admin/cores?action=CREATE&name=test2&collection=test2&numshards=1&replicationFactor=3

Nets me this:
<response>
<lst name="responseHeader">
<int name="status">400</int>
<int name="QTime">15007</int>
</lst>
<lst name="error">
<str name="msg">Error CREATEing SolrCore 'test2': Could not find configName
for collection test2 found:[xxx, xxx, , x, xx]</str>
<int name="code">400</int>
</lst>
</response>

For that node (test2), in my solr data directory, I have a folder with the
conf files and an existing data dir (copied the index from another
location).

Right now it seems like the only way that I can add in a collection is to
load the configs into zookeeper, stop tomcat, add it to the solr.xml file,
and restart tomcat.


The config does need to be loaded into zookeeper.  That's how SolrCloud 
works.


Because you have existing collections, you're going to have at least one 
config set already uploaded; you may be able to use that directly.  You 
don't need to stop anything, though.  Michael Della Bitta's response 
indicates the part you're missing on your create URL - the 
collection.configName parameter.


The basic way to get things done with collections is this:

1) Upload one or more named config sets to zookeeper.  This can be done 
with zkcli and its upconfig command, or with the bootstrap startup 
options that are intended to be used once.


2) Create the collection, referencing the proper collection.configName.
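A typical sequence, with illustrative host, path and names:

sh zkcli.sh -zkhost zk1:2181 -cmd upconfig -confdir /path/to/conf -confname myconf
http://index1:8080/solr/admin/collections?action=CREATE&name=coll1&numShards=1&replicationFactor=3&collection.configName=myconf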

You can have many collections that all share one config name.  You can 
also change which config an existing collection uses with the zkcli 
linkconfig command, followed by a collection reload.  If you upload a 
new configuration with an existing name, a collection reload (or Solr 
restart) is required to use the new config.


For uploading configs, I find zkcli to be a lot cleaner than the 
bootstrap options - it doesn't require stopping Solr or giving it 
different startup options.  Actually, it doesn't even require Solr to be 
started - it talks only to zookeeper, and we strongly recommend 
standalone zookeeper, not the zk server that can be run embedded in Solr.


Thanks,
Shawn



Re: SolrDocumentList - bitwise operation

2013-10-17 Thread Michael Tyler
Hi,

   Regrets, I was confused with bit-sets. I already have Shawn's suggested
approach in the system.  I want to try other ways and test performance.

How can I use a join? I have 2 different solr indexes.
localhost:8080/solr_1/select?q=content:test&fl=id,name,type
localhost:8081/solr_1_1/select?q=text:test&fl=id

After getting the results - join by id.

How do I do this? Please suggest other ways to do this; the current
method is taking a lot of time.
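For reference, a client-side intersection by id can at least be a simple hash
join rather than a nested loop - a rough sketch, assuming the uniqueKey field
is named "id" (java.util and SolrJ imports assumed):

Set<Object> ids = new HashSet<Object>();
for (SolrDocument d : results2) {
    ids.add(d.get("id"));
}
SolrDocumentList joined = new SolrDocumentList();
for (SolrDocument d : results1) {
    // keep only documents whose id occurs in both result sets
    if (ids.contains(d.get("id"))) {
        joined.add(d);
    }
}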

Thanks
Michael.

On Tue, Oct 15, 2013 at 11:41 PM, Erick Erickson erickerick...@gmail.com wrote:

 Why do you think a bitset would help? Bitsets have
 a bit set on for every document that matches
 based on the _internal_ Lucene document ID, it
 has nothing to do with the uniqueKey you have
 defined. Nor does it have anything to do with the
 foreign key relationship.

 So either I don't understand the problem at all or
 pursuing bitsets is a red herring.

 You might be substantially faster by sorting the
 results and then doing a skip-list sort of thing.

 FWIW,
 Erick


 On Mon, Oct 14, 2013 at 1:47 PM, Michael Tyler
 michaeltyler1...@gmail.comwrote:

  Hi Shawn,
 
  This is a time-consuming operation. I already have this in my
 application.
  I was pondering whether I can get bit sets from both the solr indexes,
  bitset.and() them, and then retrieve only those matched? I don't know how I
  retrieve the bitset - wanted to try this and test the performance.
 
 
  Regards
  Michael
 
 
  On Sun, Oct 13, 2013 at 8:54 PM, Shawn Heisey s...@elyograg.org wrote:
 
   On 10/13/2013 8:34 AM, Michael Tyler wrote:
Hello,
   
I have 2 different solr indexes returning 2 different sets of
SolrDocumentList. Doc Id is the foreign key relation.
   
After obtaining them, I want to perform AND operation between them
  and
then return results to user. Can you tell me how do I get this? I am
   using
solr 4.3
   
 SolrDocumentList results1 = responseA.getResults();
 SolrDocumentList results2 = responseB.getResults();
   
results1  : d1, d2, d3
results2  :  d1,d2, d4
  
The SolrDocumentList class extends ArrayList<SolrDocument>, which means
   that it inherits all ArrayList functionality.  Unfortunately, there's
 no
   built-in way of eliminating duplicates with a java List.  It's very
 easy
   to combine the two results into another object, but that object will
   contain both of the d1 and both of the d2 SolrDocument objects.
  
   The following code is a reasonably fast way to handle this.  It assumes
   that results1 is the list that should win when there are duplicates, so
   it gets added first.  It assumes that the uniqueKey field is named id
   and that it contains a String value.  If these are incorrect
   assumptions, you can adjust the code accordingly.
  
   SolrDocumentList results1 = responseA.getResults();
   SolrDocumentList results2 = responseB.getResults();
List<SolrDocumentList> tmpList = new ArrayList<SolrDocumentList>();
   tmpList.add(results1);
   tmpList.add(results2);
  
Set<String> tmpSet = new HashSet<String>();
   SolrDocumentList newList = new SolrDocumentList();
   for (SolrDocumentList l : tmpList)
   {
   for (SolrDocument d : l)
   {
String id = (String) d.get("id");
   if (tmpSet.contains(id)) {
   continue;
   }
   tmpSet.add(id);
   newList.add(d);
   }
   }
  
   Thanks,
   Shawn
  
  
 



Re: Different document types in different collections OR same collection without sharing fields?

2013-10-17 Thread shrikanth k
Hi,

 Logically, maintenance will be easy, as both collections are in
different folders.
 Next, even with separate fields in one collection, if no field list is
mentioned at search time then the results will be a combination of both
domains. If that is reliably taken care of at the search/query level, that
should be fine. Otherwise, with 2 collections, the search word can easily be
queried against the intended collection, with or without a field list.

regards,
Shrikanth



On Wed, Oct 16, 2013 at 4:32 PM, user 01 user...@gmail.com wrote:

 @Shrikanth: how do you manage multiple redundant configurations (isn't it?)?
 I thought the indexes would be separate when fields aren't shared. I don't need
 to import any data or do re-indexing, if those are the only benefits of
 separate collections.  I just index when a request comes / a new item is added
 to the DB.


 On Wed, Oct 16, 2013 at 4:12 PM, shrikanth k jconsult.s...@gmail.com
 wrote:

  Hi,
 
  Please refer to the link below for clarification on fields having null
 values.
 
 
 
 http://stackoverflow.com/questions/7332122/solr-what-are-the-default-values-for-fields-which-does-not-have-a-default-value
 
  logically it is better to have different collections for different domain
  data. Having 2 collections will improve the overall performance.
 
  Currently I am holding 2 collections for different domain data. It eases
  importing data and re-indexing.
 
 
  regards,
  Shrikanth
 
 
 
  On Wed, Oct 16, 2013 at 3:48 PM, user 01 user...@gmail.com wrote:
 
   Can some expert users please leave a comment on this ?
  
  
   On Sun, Oct 6, 2013 at 2:54 AM, user 01 user...@gmail.com wrote:
  
 Using a single node Solr instance, I need to search for, let's say,
electronics items & grocery items. But I never want to search both of
   them
together. When I search for electronics I don't expect a grocery item
   ever
& vice versa.
   
Should I be defining both these document types within a single
  schema.xml
or should I use a different collection for each of these
 two(maintaining
separate schema.xml & solrconfig.xml for each of the two)?
   
I believe that if I add both to a single collection, without sharing
fields among these two document types, I should be equally good as
separating them in two collections (in terms of performance & all), as
   their
indexes/filter caches would be totally independent of each other when
   they
don't share fields?
   
   
Also posted at SO: http://stackoverflow.com/q/19202882/530153
   
  
 
 
 
  --
 




--


Re: SolrCloud Performance Issue

2013-10-17 Thread shamik
I tried commenting out NOW in bq, but it didn't make any difference in
performance. I do see a minor entry in the query/filter cache rate, which is a
meager 0.02.

I'm really struggling to figure out the bottleneck. Any known pain points I
should be checking?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Performance-Issue-tp4095971p4096277.html
Sent from the Solr - User mailing list archive at Nabble.com.