Re: solr4.1 createNodeSet requires ip addresses?

2013-02-18 Thread Charton, Andre
Hi,

I created a ticket and tried to describe it here:
https://issues.apache.org/jira/browse/SOLR-4471

Actually, search speed, RAM and memory usage on Solr 4.x compared with 3.6
look good; only the network is blocked by the full index copy from the slave.

André

On 16.02.13 03:25, Mark Miller markrmil...@gmail.com wrote:

For 4.2, I'll try and put in
https://issues.apache.org/jira/browse/SOLR-4078 soon.

Not sure about the behavior you're seeing - you might want to file a JIRA
issue.

- Mark

On Feb 15, 2013, at 8:17 PM, Gary Yngve gary.yn...@gmail.com wrote:

 Hi all,
 
 I've been unable to get the collections create API to work with
 createNodeSet containing hostnames, both localhost and external
hostnames.
 I've only been able to get it working when using explicit IP addresses.
 
 It looks like zk stores the IP addresses in the clusterstate.json and
 live_nodes.  Is it possible that Solr Cloud is not doing any hostname
 resolving but just looking for an explicit match with createNodeSet?
This
 is kind of annoying, in that I am working with EC2 instances and
consider
 it pretty lame to need to use elastic IPs for internal use.  I'm hacking
 around it now (looking up the eth0 inet addr on each machine), but I'm
not
 happy about it.
 
 Has anyone else found a better solution?
 
 The reason I want to specify explicit nodes for collections is so I can
 have just one zk ensemble managing collections across different
 environments that will go up and down independently of each other.
 
 Thanks,
 Gary




SpellCheck - Ignore list of words

2013-02-18 Thread Hemant Verma
Hi All

I have a use case where I have a list of words on which I don't want to
perform spellcheck, similar to the way stemming ignores the words listed in
the protwords.txt file.
Any idea how this can be solved?

Thanks
Hemant





tlog file questions

2013-02-18 Thread giovanni.bricc...@banzai.it

Hi

I have some questions about tlog files and how they are managed.

I'm using DIH to do incremental data loading; once a day I do a full
refresh.


These are the request parameters:

/dataimport?command=full-import&commit=true
/dataimport?command=delta-import&commit=true&optimize=false

I was expecting all the old tlog files to be removed when a delta/full
import completes, but I see that these files remain; actually, only the
oldest files get removed.

Am I using the wrong parameters? Is there a different parameter to
trigger the hard commit?
Are there configuration parameters to control the number of tlog
files to keep? Unfortunately I have very little space on my disks and I
need to double-check space consumption.


I'm using Solr 4.

Thank you


Custom shard key, shard partitioning

2013-02-18 Thread Markus Jelsma
Hi,

By default SolrCloud partitions records by the hash of the uniqueKey field, but
we want to do some tests and partition the records by a signed integer field
while keeping the current uniqueKey unique. I've scanned through several issues
concerning distributed indexing, custom hashing, shard policies etc., but I have
not found any concise examples, documentation, or even a blog post on this matter.

How do we set up shard partitioning via a field other than the default
uniqueKey?

According to some older resolved issue, CloudSolrServer should be cloud-aware
and send updates to the leader of the correct shards; how does it know this?
Must we set up the same partitioning in the SolrServer client as well? If so,
how? The apidocs do not reveal a lot when I look through them.

I probably totally missed an issue or discussion or wiki page.

Thanks,
Markus


Re: Custom Solr FunctionQuery Error

2013-02-18 Thread Álvaro
Hi!

Although more than a year has passed, could I ask you, Parvin, what your
final approach was?

I have to deal with a similar problem
(http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-td4040200.html),
maybe a bit more difficult because it's a per-user score customization, but I
would probably find your solution helpful.

Thanks!
Álvaro





RE: get filterCache in Component

2013-02-18 Thread Markus Jelsma
Chris, Mihhail,

I'd like to avoid issuing a query and spare the cycles. In SOLR-4280 I only
look for the smallest DocSet by iterating over them. I would tend to think that's
cheaper than getDocSet() and perhaps cacheDocSet().

If I were to add the non-user caches to the cacheMap and create a separate issue
for that, would that break things? Would it be really bad?

Thanks,
Markus

 
 
-Original message-
 From:Mikhail Khludnev mkhlud...@griddynamics.com
 Sent: Fri 15-Feb-2013 21:17
 To: solr-user solr-user@lucene.apache.org
 Subject: Re: get filterCache in Component
 
 Markus,
 
 I wonder why you need access to it. I thought that the current searcher's
 methods (getDocSet(), cacheDocSet()) are enough to do everything. Anyway,
 if you wish: I just looked in the code and see that it's available via
 core.getInfoRegistry().get("filterCache"). It can lead to some problems,
 but should work.
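
 For example, a rough sketch from inside a component (the cast and the
 null check are my additions, untested):

 SolrCore core = rb.req.getCore();
 SolrCache<Query,DocSet> filterCache =
     (SolrCache<Query,DocSet>) core.getInfoRegistry().get("filterCache");
 if (filterCache != null) {
     // the registry entry is the live cache, e.g. filterCache.size()
 }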
 
 
 On Fri, Feb 15, 2013 at 4:30 PM, Markus Jelsma
 markus.jel...@openindex.io wrote:
 
  Hi,
 
  I need to get the filterCache for SOLR-4280. I can create a new issue
  patching SolrIndexSearcher and adding the missing caches (non-user caches)
  to the cacheMap so they can be returned using getCache(String), but I'm not
  sure this is intended. It does work, but is this the right path?
 
  https://issues.apache.org/jira/browse/SOLR-4280
 
  Thanks,
  Markus
 
 
  -Original message-
   From:Markus Jelsma markus.jel...@openindex.io
   Sent: Thu 14-Feb-2013 13:18
   To: solr-user@lucene.apache.org
   Subject: get filterCache in Component
  
   Hi,
  
   We need to get the filterCache in a Component but
  SolrIndexSearcher.getCache(String name) does not return it. It seems the
  filterCache is not added to cacheMap and can therefore not be returned.
  
   SolrCache<Query,DocSet> filterCache =
  rb.req.getSearcher().getCache("filterCache");
  
   Will always return null. Can we get the filterCache via other means or
  should it be added to the cacheMap so getCache can return it?
  
   Thanks,
   Markus
  
 
 
 
 
 -- 
 Sincerely yours
 Mikhail Khludnev
 Principal Engineer,
 Grid Dynamics
 
 http://www.griddynamics.com
  mkhlud...@griddynamics.com
 


Re: Custom shard key, shard partitioning

2013-02-18 Thread Marcin Rzewucki
Hi,

I was able to implement custom hashing with the use of a _shard_ field. It
contains the name of the shard a document should go to. Works fine. Maybe
there's some other method to do the same with the use of solrconfig.xml,
but I have not found any docs about it so far.
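
For example, a sketch of a document as I post it (the id, shard name and
title are just placeholders):

{
  "id": "doc-123",
  "_shard_": "shard2",
  "title": "some title"
}

The document then ends up on the shard named in the _shard_ field.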

Regards.


On 18 February 2013 13:34, Markus Jelsma markus.jel...@openindex.io wrote:

 Hi,

 By default SolrCloud partitions records by the hash of the uniqueKey field,
 but we want to do some tests and partition the records by a signed integer
 field while keeping the current uniqueKey unique. I've scanned through several
 issues concerning distributed indexing, custom hashing, shard policies etc.,
 but I have not found any concise examples, documentation, or even a blog post
 on this matter.

 How do we set up shard partitioning via a field other than the default
 uniqueKey?

 According to some older resolved issue, CloudSolrServer should be cloud-aware
 and send updates to the leader of the correct shards; how does it know this?
 Must we set up the same partitioning in the SolrServer client as well? If so,
 how? The apidocs do not reveal a lot when I look through them.

 I probably totally missed an issue or discussion or wiki page.

 Thanks,
 Markus



Re: Updating data

2013-02-18 Thread anurag.jain
Hi,

I've got a problem.

The problem is I have this JSON file:

[
  {
    "id": 5,
    "is_good": {"add": 1}
  },
  {
    "id": 1,
    "is_good": {"add": 1}
  },
  {
    "id": 2,
    "is_good": {"add": 1}
  },
  {
    "id": 3,
    "is_good": {"add": 1}
  }
]


Now, due to Tomcat being stopped, only one of the docs (id:5) was added to Solr.

When I try to post this file again for the update, it gives me a multivalued-field
error, because id:5 was already updated. Due to this, the remaining ids are not
being updated in Solr. I have 25 lakh (2.5 million) docs in the JSON file. Please
give me some idea.





Re: tlog file questions

2013-02-18 Thread Shawn Heisey

On 2/18/2013 4:57 AM, giovanni.bricc...@banzai.it wrote:

I have some questions about tlog files and how they are managed.

I'm using DIH to do incremental data loading; once a day I do a full
refresh.

These are the request parameters:

/dataimport?command=full-import&commit=true
/dataimport?command=delta-import&commit=true&optimize=false

I was expecting all the old tlog files to be removed when a delta/full
import completes, but I see that these files remain; actually, only the
oldest files get removed.

Am I using the wrong parameters? Is there a different parameter to
trigger the hard commit?
Are there configuration parameters to control the number of tlog
files to keep? Unfortunately I have very little space on my disks and I
need to double-check space consumption.


Your best option is to turn on autoCommit with openSearcher set to 
false.  I use a maxDocs of 25000 and a maxTime of 300000 (five minutes). 
 Every 25000 docs, Solr does a hard commit, but because openSearcher is 
false, it does not change the index at all from the perspective of a 
client.  You would need to choose values appropriate for your installation.


<!-- the default high-performance update handler -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>25000</maxDocs>
    <maxTime>300000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <updateLog />
</updateHandler>

The hard commit does one important thing here - it closes the current 
tlog and starts a new one.  Solr does not keep very many tlogs around, 
but if you do a full-import without any commits, the tlog will contain 
every single document you have.


I actually do my index rebuilds in a build core and swap it to live when 
the rebuild is fully complete, but I have double-checked the docs 
available from a client, and they do not change until the full-import is 
done.


Another thing - I would use optimize=false on the full-import and the 
delta-import.  The only real reason to do an optimize in a modern Solr 
version is to purge deleted documents.  If you are doing a new 
full-import every day, then you don't have to worry about that, because 
the new index will not contain any deleted documents.  It's true that an 
optimized index does slightly outperform one with many segments of 
varying sizes, but generally speaking the huge I/O overhead during the 
optimize is very detrimental to performance.


Thanks,
Shawn



SEVERE RecoveryStrategy Recovery failed - trying again... (9)

2013-02-18 Thread Cool Techi
I am seeing the following error in my Admin console and the core/ cloud status 
is taking forever to load.

SEVERE RecoveryStrategy Recovery failed - trying again... (9)

What causes this and how can I recover from this mode?

Regards,
Rohit

  

Re: SpellCheck - Ignore list of words

2013-02-18 Thread Erick Erickson
The 4.x-based spellcheck process just looks in the index and enumerates the
terms; there's no special sidecar index. So you'd probably have to create
a different field that contains only the words you want to be returned
as possibilities.
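
For instance, a rough sketch of a field type that keeps an ignore list out of
such a field (the type name and file name are just placeholders):

<fieldType name="spell_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="spellcheck-ignore.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

A copyField into a field of this type would keep the listed words out of the
terms the spellchecker enumerates.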

Best
Erick


On Mon, Feb 18, 2013 at 5:06 AM, Hemant Verma hemantverm...@gmail.com wrote:

 Hi All

 I have a use case where I have a list of words on which I don't want to
 perform spellcheck, similar to the way stemming ignores the words listed in
 the protwords.txt file.
 Any idea how this can be solved?

 Thanks
 Hemant






Errors during index optimization on solrcloud

2013-02-18 Thread adm1n
Hi,

I'm running SolrCloud (Solr4) with 1 core, 8 shards and ZooKeeper.
My index is updated every minute, so I'm running an optimization once a
day.
Every time during the optimization there is an error:
SEVERE: shard update error StdNode: http://host:port/solr/core_name/
SEVERE: shard update error StdNode:
http://host:port/solr/core_name/:org.apache.solr.common.SolrException:
Server at http://host:port/solr/core_name/ returned non ok status:503,
message:Service Unavailable

Any ideas what causes this error and how to avoid it?


thanks.





Re: SEVERE RecoveryStrategy Recovery failed - trying again... (9)

2013-02-18 Thread Mark Miller
We need to see more of your logs to determine why - there should be some 
exceptions logged.

- Mark

On Feb 18, 2013, at 9:47 AM, Cool Techi cooltec...@outlook.com wrote:

 I am seeing the following error in my Admin console and the core/ cloud 
 status is taking forever to load.
 
 SEVERE RecoveryStrategy Recovery failed - trying again... (9)
 
 What causes this and how can I recover from this mode?
 
 Regards,
 Rohit
 
 



Re: Custom shard key, shard partitioning

2013-02-18 Thread Mark Miller
Yeah, I think we are missing some docs on this…

I think the info is in here: https://issues.apache.org/jira/browse/SOLR-2592

But it's not so easy to pick out - I'd been considering going through and
writing up some wiki doc for that feature (unless I'm somehow missing it), but
I've just been too busy with other stuff.

Concerning CloudSolrServer, there is a JIRA issue to make it hash and send
updates to the right leader, but currently it still doesn't - it just favors
leaders in general over non-leaders.

- Mark

On Feb 18, 2013, at 7:34 AM, Markus Jelsma markus.jel...@openindex.io wrote:

 Hi,
 
 By default SolrCloud partitions records by the hash of the uniqueKey field,
 but we want to do some tests and partition the records by a signed integer
 field while keeping the current uniqueKey unique. I've scanned through several
 issues concerning distributed indexing, custom hashing, shard policies etc.,
 but I have not found any concise examples, documentation, or even a blog post
 on this matter.
 
 How do we set up shard partitioning via a field other than the default
 uniqueKey?
 
 According to some older resolved issue, CloudSolrServer should be cloud-aware
 and send updates to the leader of the correct shards; how does it know this?
 Must we set up the same partitioning in the SolrServer client as well? If so,
 how? The apidocs do not reveal a lot when I look through them.
 
 I probably totally missed an issue or discussion or wiki page.
 
 Thanks,
 Markus



Re: Errors during index optimization on solrcloud

2013-02-18 Thread Mark Miller
Not sure - any other errors? An optimize once a day is a very heavy operation 
by the way! Be sure the gains are worth the pain you pay.

- Mark

On Feb 18, 2013, at 10:04 AM, adm1n evgeni.evg...@gmail.com wrote:

 Hi,
 
 I'm running SolrCloud (Solr4) with 1 core, 8 shards and ZooKeeper.
 My index is updated every minute, so I'm running an optimization once a
 day.
 Every time during the optimization there is an error:
 SEVERE: shard update error StdNode: http://host:port/solr/core_name/
 SEVERE: shard update error StdNode:
 http://host:port/solr/core_name/:org.apache.solr.common.SolrException:
 Server at http://host:port/solr/core_name/ returned non ok status:503,
 message:Service Unavailable
 
 Any ideas what causes this error and how to avoid it?
 
 
 thanks.
 
 
 



Re: How can i instruct the Solr/ Solr Cell to output the original HTML document which was fed to it.?

2013-02-18 Thread Jack Krupansky
Look at HTMLStripCharFilter: it accepts HTML as its source text, preserves
all the HTML tags in the stored value, but strips the HTML tags off for
tokenization into terms. So you can search for the actual text terms, but
the HTML will still be in the returned field value for highlighting.


See:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.HTMLStripCharFilterFactory
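
For example, a minimal field type sketch (the names are placeholders):

<fieldType name="text_html" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

The stored value keeps the raw HTML; only the indexed terms have the tags
stripped.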

-- Jack Krupansky

-Original Message- 
From: Divyanand Tiwari

Sent: Monday, February 18, 2013 7:28 AM
To: solr-user@lucene.apache.org
Subject: How can i instruct the Solr/ Solr Cell to output the original HTML 
document which was fed to it.?


Hi everyone, I am new to Solr and have not found a way to get back
the original HTML document with hits highlighted in it. What
configuration do I need, and where, to instruct Solr Cell/Tika so that it
does not strip the tags of the HTML document in the content field?

Any support would be greatly appreciated.

Awaiting your quick reply.

Thank you!!!
--
Regards,
Divyanand Tiwari 



Re: SpellCheck - Ignore list of words

2013-02-18 Thread Jack Krupansky

1. Create a copy of the field and add the exception list to it.

2. Or, add a second spell checker to your spellcheck search component that 
is a FileBasedSpellChecker with the exceptions in a simple text file. Then 
reference both spellcheckers with spellcheck.dictionary, with the 
FileBasedSpellChecker as the first.
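
For example, a rough solrconfig.xml sketch (the names, field and file are
placeholders):

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">file</str>
    <str name="classname">solr.FileBasedSpellChecker</str>
    <str name="sourceLocation">spellings.txt</str>
    <str name="characterEncoding">UTF-8</str>
    <str name="spellcheckIndexDir">./spellcheckerFile</str>
  </lst>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
  </lst>
</searchComponent>

and then on the request: spellcheck.dictionary=file&spellcheck.dictionary=default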


-- Jack Krupansky

-Original Message- 
From: Hemant Verma

Sent: Monday, February 18, 2013 2:06 AM
To: solr-user@lucene.apache.org
Subject: SpellCheck - Ignore list of words

Hi All

I have a use case where I have a list of words on which I don't want to
perform spellcheck, similar to the way stemming ignores the words listed in
the protwords.txt file.
Any idea how this can be solved?

Thanks
Hemant






Re: Errors during index optimization on solrcloud

2013-02-18 Thread Mark Miller
I think it's best to tweak the merge parameters instead and amortize the cost of
keeping the number of segments down. Deletes will be naturally expunged as
documents come in and segments are merged. For 90% of use cases, this is the
best way to go IMO. Even if you just want to get rid of deletes, look into
expungeDeletes - it merges just what's needed to get rid of deletes, which may
not always mean a full optimize down to one segment.
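
For example, a commit with expungeDeletes (host and core name are placeholders):

http://host:port/solr/core_name/update?commit=true&expungeDeletes=true

or, in an XML update message: <commit expungeDeletes="true"/>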

My advice on optimize would be to do it only when you are not going to get
updates very often or for a long time. Otherwise it's best just to tune the
merge parameters and avoid optimize altogether. It's usually premature
optimization that leads to the overuse of optimize, and it's usually
unnecessary and quite costly.

- Mark

On Feb 18, 2013, at 11:12 AM, adm1n evgeni.evg...@gmail.com wrote:

 Thanks for your response.
 
 No, nothing else. Only those errors.
 
 By the way, what is the best practice for the optimization process - should it
 be done periodically (for example, cron-based), or should it depend on the
 difference between the maxDoc and numDocs counts?
 
 
 



Re: Updating data

2013-02-18 Thread Jack Krupansky

Use "set" instead of "add".

See:
http://wiki.apache.org/solr/UpdateJSON#Atomic_Updates
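
For example, your file rewritten with "set" (a sketch, assuming is_good is a
single-valued field):

[
  {"id": 5, "is_good": {"set": 1}},
  {"id": 1, "is_good": {"set": 1}},
  {"id": 2, "is_good": {"set": 1}},
  {"id": 3, "is_good": {"set": 1}}
]

"set" replaces the existing value, so re-posting the file is harmless even for
docs that were already updated.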

-- Jack Krupansky

-Original Message- 
From: anurag.jain

Sent: Monday, February 18, 2013 6:09 AM
To: solr-user@lucene.apache.org
Subject: Re: Updating data

Hi,

I've got a problem.

The problem is I have this JSON file:

[
  {
    "id": 5,
    "is_good": {"add": 1}
  },
  {
    "id": 1,
    "is_good": {"add": 1}
  },
  {
    "id": 2,
    "is_good": {"add": 1}
  },
  {
    "id": 3,
    "is_good": {"add": 1}
  }
]


Now, due to Tomcat being stopped, only one of the docs (id:5) was added to Solr.

When I try to post this file again for the update, it gives me a multivalued-field
error, because id:5 was already updated. Due to this, the remaining ids are not
being updated in Solr. I have 25 lakh (2.5 million) docs in the JSON file. Please
give me some idea.






Re: Solr search – Tika extracted text from PDF not return highlighting snippet

2013-02-18 Thread tuxdna
I am replying to this post because I am also facing a very similar issue.

I am indexing documents stored in a blob field of a MySQL database. I
have described the whole setup in the following blog post:

http://tuxdna.wordpress.com/2013/02/04/indexing-the-documents-stored-in-a-database-using-apache-solr-and-apache-tika/

Basically, the blob content is fetched from the database and then parsed
by Tika and converted into text. All the fields in the database table get
indexed properly except the blob field (which was processed by Tika). It
doesn't show up in the Solr schema browser; there are no terms against the
text field.

I tried some permutations and combinations of the fields (in
db-data-config.xml and schema.xml) and got it working. I now have two fields,
"text" and "text1", where "text" is indexed + stored and "text1" is
neither. However, if I remove "text1" from the configuration, I am back to the
same problem, i.e. the field doesn't get indexed.

I don't understand how the above workaround is working. Can anyone give me
pointers on where I can explore further to understand this behaviour? Is it
solvable using copyField?

NOTE: I have described the configuration files and setup in the link above.

Thanks in advance! :)

/tuxdna






RequestHandler init failure

2013-02-18 Thread Mingfeng Yang
When trying to use SolrEntityProcessor to do a data import from another Solr
index (Solr 4.1), I added the following to solrconfig.xml:

<requestHandler name="/data"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

and created a new file data-config.xml with:

<dataConfig>
  <document>
    <entity name="sep" processor="SolrEntityProcessor"
        url="http://wolf:1Xnbdoq@myserver:8995/solr/" query="*:*"
        fl="id,md5_text,title,text"/>
  </document>
</dataConfig>


I got the following errors:

org.apache.solr.common.SolrException: RequestHandler init failure
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:794)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:607)
at
org.apache.solr.core.CoreContainer.createFromZk(CoreContainer.java:949)
at
org.apache.solr.core.CoreContainer.create(CoreContainer.java:1031)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.solr.common.SolrException: RequestHandler init failure
at
org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:168)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:731)
... 13 more
Caused by: org.apache.solr.common.SolrException: Error loading class
'org.apache.solr.handler.dataimport.DataImportHandler'
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:438)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:507)
at
org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:581)
at
org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:154)
... 14 more
Caused by: java.lang.ClassNotFoundException:
org.apache.solr.handler.dataimport.DataImportHandler
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at
org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:422)
... 17 more
Feb 18, 2013 7:24:43 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: Unable to create core:
collection1
at
org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1654)
at
org.apache.solr.core.CoreContainer.create(CoreContainer.java:1039)
at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)

I assume that's because the jar file for the DataImportHandler is not
included in the default Solr 4.1 distribution. Where can I find it?

Thanks
Ming


Re: Conditional Field Search without affecting score.

2013-02-18 Thread adityab
Thanks Erick,

Is this what you are pointing me to?

http://.../solr/select?q=if(exist(title.3),(title.3:xyz),(title.0:xyz))

I believe I should be able to use boost along with proximity too.




Re: RequestHandler init failure

2013-02-18 Thread Mingfeng Yang
Found it myself. It's here:
http://mirrors.ibiblio.org/maven2/org/apache/solr/solr-dataimporthandler/4.1.0/

Download and move the jar file to the solr-webapp/webapp/WEB-INF/lib
directory, and the errors are all gone.
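
For example, roughly what I did (the paths are from my setup and may differ
in yours):

cd example/solr-webapp/webapp/WEB-INF/lib
wget http://mirrors.ibiblio.org/maven2/org/apache/solr/solr-dataimporthandler/4.1.0/solr-dataimporthandler-4.1.0.jar

then restart Solr.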

Ming


On Mon, Feb 18, 2013 at 11:52 AM, Mingfeng Yang mfy...@wisewindow.com wrote:

 When trying to use SolrEntityProcessor to do a data import from another Solr
 index (Solr 4.1), I added the following to solrconfig.xml:

 <requestHandler name="/data"
     class="org.apache.solr.handler.dataimport.DataImportHandler">
   <lst name="defaults">
     <str name="config">data-config.xml</str>
   </lst>
 </requestHandler>

 and created a new file data-config.xml with:

 <dataConfig>
   <document>
     <entity name="sep" processor="SolrEntityProcessor"
         url="http://wolf:1Xnbdoq@myserver:8995/solr/" query="*:*"
         fl="id,md5_text,title,text"/>
   </document>
 </dataConfig>


 I got the following errors:

 org.apache.solr.common.SolrException: RequestHandler init failure
 at org.apache.solr.core.SolrCore.<init>(SolrCore.java:794)
 at org.apache.solr.core.SolrCore.<init>(SolrCore.java:607)
 at
 org.apache.solr.core.CoreContainer.createFromZk(CoreContainer.java:949)
 at
 org.apache.solr.core.CoreContainer.create(CoreContainer.java:1031)
 at
 org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
 at
 org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
 at
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
 Caused by: org.apache.solr.common.SolrException: RequestHandler init
 failure
 at
 org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:168)
 at org.apache.solr.core.SolrCore.<init>(SolrCore.java:731)
 ... 13 more
 Caused by: org.apache.solr.common.SolrException: Error loading class
 'org.apache.solr.handler.dataimport.DataImportHandler'
 at
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:438)
 at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:507)
 at
 org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:581)
 at
 org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:154)
 ... 14 more
 Caused by: java.lang.ClassNotFoundException:
 org.apache.solr.handler.dataimport.DataImportHandler
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
 at
 java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:247)
 at
 org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:422)
 ... 17 more
 Feb 18, 2013 7:24:43 PM org.apache.solr.common.SolrException log
 SEVERE: null:org.apache.solr.common.SolrException: Unable to create core:
 collection1
 at
 org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1654)
 at
 org.apache.solr.core.CoreContainer.create(CoreContainer.java:1039)
 at
 org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)

 I assume that's because the jar file for the DataImportHandler is not
 included in the default Solr 4.1 distribution. Where can I find it?

 Thanks
 Ming



Re: Reloading config to zookeeper

2013-02-18 Thread mshirman
I hope my question is somewhat relevant to the discussion. 

I'm relatively new to ZK/SolrCloud, and I have a new environment configured
with a ZK ensemble (3 nodes) running with SolrCloud. Things are running,
yet I'm puzzled since I can't find the Solr config data on the ZooKeeper nodes.
What is the default location?

Thanks in advance!

/michael





Re: Reloading config to zookeeper

2013-02-18 Thread Timothy Potter
@Marcin - Maybe I misunderstood your process, but I don't think you
need to reload the collection on each node if you use the expanded
collections admin API, i.e. the following will propagate the reload
across your cluster for you:

http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection

See 
http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collections_API

On Mon, Feb 18, 2013 at 1:13 PM, mshirman mshir...@gmail.com wrote:
 I hope my question is somewhat relevant to the discussion.

 I'm relatively new to ZK/SolrCloud, and I have a new environment configured
 with a ZK ensemble (3 nodes) running with SolrCloud. Things are running,
 yet I'm puzzled since I can't find the Solr config data on the ZooKeeper nodes.
 What is the default location?

 Thanks in advance!

 /michael





RE: Is it possible to manually select a shard leader in a running SolrCloud?

2013-02-18 Thread Vaillancourt, Tim
Hey all,

I feel that having to unload the leader core to force an election is hacky, and
as far as I know it would still leave which node becomes the leader to chance,
i.e. I cannot guarantee that NodeX becomes the leader 100% of the time.

Also, this imposes additional load temporarily.

Is there a way to force the winner of the election, and if not, is there a
known feature request for this?

Cheers,

Tim Vaillancourt

-Original Message-
From: Joseph Dale [mailto:joey.d...@gmail.com] 
Sent: Sunday, February 03, 2013 7:42 AM
To: solr-user@lucene.apache.org
Subject: Re: Is it possible to manually select a shard leader in a running 
SolrCloud?

With SolrCloud, all cores are collections. The collections API is just a
wrapper that calls the core API a million times with one command.

Send a request to /solr/admin/cores?action=CREATE&name=core1&collection=core1&shard=1

Basically you're creating the shard again after the leader props have gone out.
Solr will check ZK, find a core meeting that description, and simply get a
copy of the index from the leader of that shard.


On Feb 3, 2013, at 10:37 AM, Brett Hoerner br...@bretthoerner.com wrote:

 What is the inverse I'd use to re-create/load a core on another
 machine but make sure it's also known to SolrCloud as a shard?
 
 
 On Sat, Feb 2, 2013 at 4:01 PM, Joseph Dale joey.d...@gmail.com wrote:
 
 
 To be more clear, let's say bob is the leader of core1. On bob, do a
 /admin/cores?action=unload&name=core1. This removes the core/shard
 from bob, giving the other servers a chance to grab the leader props.
 
 -Joey
 
 On Feb 2, 2013, at 11:27 AM, Brett Hoerner br...@bretthoerner.com wrote:
 
 Hi,
 
 I have a 5 server cluster running 1 collection with 20 shards,
 replication
 factor of 2.
 
 Earlier this week I had to do a rolling restart across the cluster, 
 this worked great and the cluster stayed up the whole time. The 
 problem is
 that
 the last node I restarted is now the leader of 0 shards, and is just 
 holding replicas.
 
 I've noticed this node has abnormally high load average, while the 
 other nodes (who have the same number of shards, but more leaders on 
 average)
 are
 fine.
 
 First, I'm wondering if that load could be related to being a 5x
 replica and 0x leader?
 
 Second, I was wondering if I could somehow flag single shards to
 re-elect a
 leader (or force a leader) so that I could more evenly distribute 
 how
 many
 leader shards each physical server has running?
 
 Thanks.
 
 




Japanese mm parameter in Solr3.6.2 generated lots of results with big performance hit

2013-02-18 Thread kirpakaro
In Solr 3.6.1, using the text_ja field generated a huge number of results, which
degraded performance significantly. Queries that were taking 15ms
have gone up to 400ms, and there is another issue: it is not honoring the rows
parameter. The output results are not capped at the number of documents
requested with rows=100, but are a lot more.

Has anyone experienced this issue, and what is the solution to improve
the performance? Putting the index into RAM and cache did not have a
significant impact on the performance.

Thanks.





SolrCloud configuration in a zookeeper node

2013-02-18 Thread mshirman

I'm relatively new to ZK/SolrCloud, and I have a new environment configured
with a ZK ensemble (3 nodes) running with SolrCloud. Things are running,
yet I'm puzzled since I can't find the Solr config data on the ZooKeeper nodes.

What is the default location? 

Thank you in advance!

/michael





Re: SolrCloud configuration in a zookeeper node

2013-02-18 Thread Timothy Potter
/configs/collectionName

You should be able to see this from the Solr admin console as well:
Cloud > Tree > configs > collectionName
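
For example, with ZooKeeper's own CLI (the host/port are placeholders):

zkCli.sh -server localhost:2181
ls /configs
ls /configs/collectionName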

Cheers,
Tim

On Mon, Feb 18, 2013 at 4:23 PM, mshirman mshir...@gmail.com wrote:

 I'm relatively new to ZK/SolrCloud, and I have a new environment configured
 with a ZK ensemble (3 nodes) running with SolrCloud. Things are running,
 yet I'm puzzled since I can't find the Solr config data on the ZooKeeper nodes.

 What is the default location?

 Thank you in advance!

 /michael





Re: Japanese mm parameter in Solr3.6.2 generated lots of results with big performance hit

2013-02-18 Thread Jack Krupansky

Maybe you need to turn on autoGeneratePhraseQueries=true on your field type.
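
For example, a sketch of where the attribute goes (the analyzer chain shown is
just illustrative):

<fieldType name="text_ja" class="solr.TextField"
           autoGeneratePhraseQueries="true" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>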

And turn on debugQuery=true on your query to see what actually gets
generated.


Show us a typical query - the rows parameter should always work, unless 
it's written wrong.


-- Jack Krupansky

-Original Message- 
From: kirpakaro

Sent: Monday, February 18, 2013 2:39 PM
To: solr-user@lucene.apache.org
Subject: Japanese mm parameter in Solr3.6.2 generated lots of results with 
big performance hit


In Solr 3.6.1, using the text_ja field generated a huge number of results, which
degraded performance significantly. Queries that were taking 15ms
have gone up to 400ms, and there is another issue: it is not honoring the rows
parameter. The output results are not capped at the number of documents
requested with rows=100, but are a lot more.

   Has anyone experienced this issue, and what is the solution to improve
the performance? Putting the index into RAM and cache did not have a
significant impact on the performance.

Thanks.






RE: SEVERE RecoveryStrategy Recovery failed - trying again... (9)

2013-02-18 Thread Cool Techi
There is no error I can see in the logs. My shards are divided over three
machines; the cloud runs fine when I don't bring up one of the nodes, but the
moment I start that particular node, the cloud stops responding.

Feb 19, 2013 5:22:22 AM 
org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener 
newSearcher
INFO: Loading spell index for spellchecker: default
Feb 19, 2013 5:22:22 AM 
org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener 
newSearcher
INFO: Loading spell index for spellchecker: wordbreak
Feb 19, 2013 5:22:22 AM org.apache.solr.core.SolrCore registerSearcher
INFO: [cmn] Registered new searcher Searcher@3b47788d 
main{StandardDirectoryReader(segments_1dvf:1488121 _2acm(4.1):C13967428/87404 
_62w6(4.1):C259989/31792 _8ehw(4.1):C405062/57136 _8um4(4.1):C228434/26526 
_a0i1(4.1):C171825/43653 _bgu3(4.1):C315311/30246 _ao6h(4.1):C176468/44702 
_b7uu(4.1):C97823/27124 _bjzb(4.1):C77280/8476 _bra3(4.1):C142681/21340 
_bzpo(4.1):C198058/23506 _c0jh(4.1):C18201/8171 _c307(4.1):C37984/5305 
_c2e0(4.1):C22300/9788 _c1o6(4.1):C23523/8630 _c3hl(4.1):C12034/2871 
_c3kw(4.1):C5821/971 _c3l6(4.1):C1106 _c3lh(4.1):C707/1 _c3lu(4.1):C509/2 
_c3mf(4.1):C482/1 _c3m5(4.1):C374/2 _c3mc(4.1):C164/2 _c3mh(4.1):C64/3 
_c3mi(4.1):C49 _c3mj(4.1):C25 _c3mk(4.1):C12)}
Feb 19, 2013 5:22:22 AM org.apache.solr.cloud.ZkController publish
INFO: publishing core=cmn state=down
Feb 19, 2013 5:22:22 AM org.apache.solr.cloud.ZkController publish
INFO: numShards not found on descriptor - reading it from system property
Feb 19, 2013 5:22:22 AM org.apache.solr.core.CoreContainer registerCore
INFO: registering core: cmn
Feb 19, 2013 5:22:22 AM org.apache.solr.cloud.ZkController register
INFO: Register replica - core:cmn address:http://10.0.0.205:8080/solr 
collection:cmn shard:shard2
Feb 19, 2013 5:22:22 AM org.apache.solr.client.solrj.impl.HttpClientUtil 
createClient
INFO: Creating new http client, 
config:maxConnections=1&maxConnectionsPerHost=20&connTimeout=3&socketTimeout=3&retry=false


Regards,
Ayush

 Subject: Re: SEVERE RecoveryStrategy Recovery failed - trying again...
 (9)
 From: markrmil...@gmail.com
 Date: Mon, 18 Feb 2013 10:21:53 -0500
 To: solr-user@lucene.apache.org
 
 We need to see more of your logs to determine why - there should be some 
 exceptions logged.
 
 - Mark
 
 On Feb 18, 2013, at 9:47 AM, Cool Techi cooltec...@outlook.com wrote:
 
  I am seeing the following error in my Admin console and the core/ cloud 
  status is taking forever to load.
  
  SEVERE RecoveryStrategy Recovery failed - trying again... (9)
  
  What causes this and how can I recover from this mode?
  
  Regards,
  Rohit
  

 
  

TIMESTAMP

2013-02-18 Thread anurag.jain
Hi all,

I have a JSON file in which there is a field named last_login whose value is
a timestamp.

I want to store that value as a timestamp and do not want to change the field
type.

Now the question is how to store the timestamp so that when I need output in
datetime format it gives datetime format, and whenever I need output in
timestamp format it gives timestamp format.

Please reply; it is very urgent. I have to do this task today.





Re: How can i instruct the Solr/ Solr Cell to output the original HTML document which was fed to it.?

2013-02-18 Thread Divyanand Tiwari
Thank you for replying, sir!

I have two queries related to this:

1) In this case, which request handler do I have to use? The
'ExtractingRequestHandler' by default strips the HTML content, and the
default handler 'UpdateRequestHandler' does not accept HTML content.

2) How can I 'Extract' & 'Index' the META information in the HTML document
separately?

Awaiting your reply.
Thank you!!!


Re: How can i instruct the Solr/ Solr Cell to output the original HTML document which was fed to it.?

2013-02-18 Thread Jack Krupansky
Use the standard update handler and pass the entire HTML page as literal
text in a Solr XML document for the field that has the HTML strip filter,
but be sure to escape the HTML syntax (angle brackets, ampersands, etc.).
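
For example, a sketch of such a document (the field names are placeholders):

<add>
  <doc>
    <field name="id">page-1</field>
    <field name="content">&lt;html&gt;&lt;body&gt;&lt;h1&gt;Hello&lt;/h1&gt;&lt;/body&gt;&lt;/html&gt;</field>
  </doc>
</add>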


You'll have to process meta information yourself.

-- Jack Krupansky

-Original Message- 
From: Divyanand Tiwari

Sent: Monday, February 18, 2013 10:52 PM
To: solr-user@lucene.apache.org
Subject: Re: How can i instruct the Solr/ Solr Cell to output the original 
HTML document which was fed to it.?


Thank you for replying, sir!

I have two queries related with this -

1) In this case, which request handler do I have to use? The
'ExtractingRequestHandler' by default strips the HTML content, and the
default handler 'UpdateRequestHandler' does not accept HTML content.

2) How can I 'Extract' & 'Index' the META information in the HTML document
separately?

Awaiting your reply
Thank you!!! 



Re: JMX generation number is wrong

2013-02-18 Thread Aristedes Maniatis

Should I log a defect in Jira for this?

Ari Maniatis


On 14/02/13 6:50pm, Aristedes Maniatis wrote:

I'm trying to monitor the state of a master-slave Solr4.1 cluster. I can easily 
get the generation number of the slaves using JMX like this:

 solr/{corename}/org.apache.solr.handler.ReplicationHandler/generation

That works fine. However, on the master this number is always 1, which makes
it rather hard to check whether the slaves are lagging behind.

Is this a defect in the JMX properties in Solr and I should file a Jira?


Ari




--
--
Aristedes Maniatis
GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A