Re: More on topic of Meta-search/Federated Search with Solr

2013-08-27 Thread Paul Libbrecht
Dan, if you're bound to federated search then I would say that you need to work on the service guarantees of each of the nodes and, maybe, create strategies to cope with bad nodes. paul On 26 August 2013 at 22:57, Dan Davis wrote: First answer: My employer is a library and do not have

Magento solr Invalid Date String:'false'

2013-08-27 Thread Nikesh12
We are getting the message below during Solr indexing, which is run by a cron job in Magento. Aug 12, 2013 8:06:15 AM org.apache.solr.update.processor.LogUpdateProcessor finish INFO: {add=[24P1602]} 0 1 Aug 12, 2013 8:06:16 AM org.apache.solr.common.SolrException log SEVERE:

Re: More on topic of Meta-search/Federated Search with Solr

2013-08-27 Thread Bernd Fehling
Years ago, when Federated Search was a buzzword, we did some development and testing with Lucene, FAST Search, Google and several other search engines regarding Federated Search in a library context. The results can be found here: http://pub.uni-bielefeld.de/download/2516631/2516644 Some minor parts

Re: ERROR org.apache.solr.update.CommitTracker – auto commit error...:org.apache.solr.common.SolrException: Error opening new searcher

2013-08-27 Thread zhaoxin
Thanks, Shawn ! it's ok now -- View this message in context: http://lucene.472066.n3.nabble.com/ERROR-org-apache-solr-update-CommitTracker-auto-commit-error-org-apache-solr-common-SolrException-Err-tp4086576p4086763.html Sent from the Solr - User mailing list archive at Nabble.com.

Can we use CloudSolrServer for searching data

2013-08-27 Thread Dharmendra Jaiswal
Hello, I am using the multi-core mechanism with Solr 4.4.0, and each core is dedicated to a particular client (each core is a collection). If we search data from SiteA, it will provide search results from CoreA, and if we search data from SiteB, it will provide search results from CoreB and similar

Re: SolrCloud: no timing when no result in distributed mode

2013-08-27 Thread Elodie Sannier
Hello, I'm using the 4.4.0 version but I still have the problem. Should I create a JIRA issue for it ? Elodie On 06/21/2013 02:54 PM, Elodie Sannier wrote: Hello, I am using SolrCloud 4.2.1 with two shards, with the debugQuery=true parameter, when a query does not return documents then the

Concat 2 fields in another field

2013-08-27 Thread Alok Bhandari
Hello all , I am using solr 4.x , I have a requirement where I need to have a field which holds data from 2 fields concatenated using _. So for example I have 2 fields firstName and lastName , I want a third field which should hold firstName_lastName. Is there any existing concatenating component

Re: Concat 2 fields in another field

2013-08-27 Thread Rafał Kuć
Hello! You don't have to write custom component - you can use ScriptUpdateProcessor - http://wiki.apache.org/solr/ScriptUpdateProcessor -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Hello all , I am using solr 4.x , I have a requirement where I

Re: Concat 2 fields in another field

2013-08-27 Thread Alok Bhandari
Thanks for the reply. But I don't want to introduce any scripting in my code, so I'd like to know whether there is a Java component available for this.

RE: Concat 2 fields in another field

2013-08-27 Thread Markus Jelsma
You may be more interested in the ConcatFieldUpdateProcessorFactory: http://lucene.apache.org/solr/4_1_0/solr-core/org/apache/solr/update/processor/ConcatFieldUpdateProcessorFactory.html -Original message- From:Alok Bhandari alokomprakashbhand...@gmail.com Sent: Tuesday 27th August

Re: Concat 2 fields in another field

2013-08-27 Thread Federico Chiacchiaretta
Hi, we do the same thing using an update request processor chain, this is the snippet from solrconfig.xml: <updateRequestProcessorChain name="concatenation"> <processor class="solr.CloneFieldUpdateProcessorFactory"> <str name="source">firstname</str> <str name="dest">concatfield</str> </processor> <processor

Re: Concat 2 fields in another field

2013-08-27 Thread Jack Krupansky
I have additional examples in the two most recent early access releases of my book - variations on using the existing update processors. -- Jack Krupansky -Original Message- From: Federico Chiacchiaretta Sent: Tuesday, August 27, 2013 8:39 AM To: solr-user@lucene.apache.org Subject:

Re: Magento solr Invalid Date String:'false'

2013-08-27 Thread Jack Krupansky
Invalid Date String:'false' That's correct, "false" is not a valid date in Solr. Solr uses ISO format: YYYY-MM-DDThh:mm:ss[.ttt]Z. You obviously have some issue with whatever software is feeding data into Solr. Nothing we can do to help you there, other than to tell you to make sure you feed
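
A minimal sketch of a client-side guard for the point above, assuming the feeding software can be patched before documents reach Solr. The helper name to_solr_date is hypothetical, and treating naive datetimes as UTC is an assumption. It only checks the shape of a string, not whether the calendar date exists.

```python
from datetime import datetime, timezone
import re

# Shape of the date form described above: YYYY-MM-DDThh:mm:ss[.ttt]Z
SOLR_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$")

def to_solr_date(value):
    """Return a Solr-style UTC date string, or None if the value is unusable.

    Hypothetical helper; not part of Solr or Magento.
    """
    if isinstance(value, datetime):
        if value.tzinfo is None:
            # Assumption: naive datetimes are already expressed in UTC.
            value = value.replace(tzinfo=timezone.utc)
        return value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    if isinstance(value, str) and SOLR_DATE.match(value):
        return value
    # Anything else (for example the string 'false') is rejected, so the
    # caller can leave the date field out of the document entirely.
    return None
```

A caller would drop the field when the helper returns None instead of sending the raw value.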

Re: Concat 2 fields in another field

2013-08-27 Thread Bill Bell
If for search just copyField into a multivalued field Or do it on indexing using DIH or code. A rhino script works too. Bill Bell Sent from mobile On Aug 27, 2013, at 7:15 AM, Jack Krupansky j...@basetechnology.com wrote: I have additional examples in the two most recent early access
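
As a rough illustration of the "do it on indexing using ... code" option, here is a sketch that builds the combined value on the client before the document is sent to Solr. The function name and the destination field name fullName are hypothetical; firstName, lastName and the "_" separator come from the original question.

```python
def add_concat_field(doc, sources=("firstName", "lastName"), dest="fullName", sep="_"):
    """Return a copy of doc with dest set to the source values joined by sep.

    Hypothetical helper: missing or empty source fields are skipped, and
    dest is not added at all when no source field has a value.
    """
    parts = [str(doc[name]) for name in sources if doc.get(name)]
    out = dict(doc)  # leave the caller's document untouched
    if parts:
        out[dest] = sep.join(parts)
    return out
```

The update processor chain mentioned elsewhere in the thread does the same job on the server side, which avoids changing every indexing client.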

Re: Solr 4.2.1 update to 4.3/4.4 problem

2013-08-27 Thread Bill Bell
Index and query <analyzer type="index"> Bill Bell Sent from mobile On Aug 26, 2013, at 5:42 AM, skorrapa korrapati.sus...@gmail.com wrote: I have also re-indexed the data and tried. And also tried with the below <fieldType name="string_lower_case" class="solr.TextField" sortMissingLast="true"

Re: Solr cloud hash range set to null after recovery from index corruption

2013-08-27 Thread Rikke Willer
Hi again, a follow-up on this: I ended up fixing it by uploading a new version of clusterstate.json to Zookeeper with the missing hash ranges set (they were easily deducible since they were sorted by shard name). I still don't know what the correct solution to handle index corruption (where

Transaction log on-disk guarantees

2013-08-27 Thread Sandro Zbinden
Dear Solr users, We are using the Solr soft commit feature and we are worried about what happens after we restart the Solr server. Can we activate the transaction log to have on-disk guarantees and then use the Solr soft commit feature? Thanks and best regards, Sandro Zbinden

Re: Transaction log on-disk guarantees

2013-08-27 Thread Mark Miller
On Aug 27, 2013, at 11:08 AM, Sandro Zbinden zbin...@imagic.ch wrote: Can we activate the transaction log to have on disk guarantees and then use the solr soft commit feature ? Yes you can. If you only have a single node (no replication), you probably want to turn on fsync via the config.

Multiple replicas for specific shard

2013-08-27 Thread maephisto
Hi! Imagine the following configuration: a SolrCloud cluster with 3 shards, a replication factor of 2 and 6 nodes. Now, if I add one more node to the cluster, ZK will automatically assign a shard replica to it. My question is, can I influence which of the shards is replicated on the new

Re: Multiple replicas for specific shard

2013-08-27 Thread Keith Duntz
I think you could do this by specifying the shard id of each core in solr.xml. Something like... <?xml version="1.0" encoding="UTF-8" ?> <solr persistent="true"> <cores adminPath="/admin/cores" defaultCoreName="core" host="${host:}" hostPort="${jetty.port:}" hostContext="${hostContext:}"

AW: Transaction log on-disk guarantees

2013-08-27 Thread Sandro Zbinden
Hey Mark, Thank you very much for the quick answer. We have a single-node environment. I tried to find the fsync option but was not successful. Ended up in the UpdateLog class :-) How do I enable fsync in the solrconfig.xml? Besides that: If the Solr soft commit feature has an on-disk guarantee

Re: Transaction log on-disk guarantees

2013-08-27 Thread Erick Erickson
Here's a blog I wrote up a bit ago: http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ Hmmm, unfortunately it doesn't say anything about how to set the fsync option, but do you really care? Soft commits flush to the op system, so a JVM

Re: Transaction log on-disk guarantees

2013-08-27 Thread Mark Miller
On Aug 27, 2013, at 11:54 AM, Erick Erickson erickerick...@gmail.com wrote: Soft commits flush to the op system, so a JVM crash/termination shouldn't affect it anyway. A soft commit is not a hard commit, so there are not guarantees like this. It searches committed and non committed segments

Re: Transaction log on-disk guarantees

2013-08-27 Thread Erick Erickson
I updated the SolrCloud page here: http://wiki.apache.org/solr/SolrCloud to include how to change this option. Bh, "Soft commits flush to the op system" should have read "Hard commits flush the transaction log to the op system". Blame it on just getting back from the dentist, that was just

AW: Transaction log on-disk guarantees

2013-08-27 Thread Sandro Zbinden
@Mark Do you know how I can set the syncLevel to fsync in the solrconfig.xml I can't find in the default solrconfig.xml https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/collection1/conf/solrconfig.xml The blog posts at

Re: Transaction log on-disk guarantees

2013-08-27 Thread Erick Erickson
Well, that blog post is, at best, an estimate based on disk head seek times so take it with a large grain of salt, I probably shouldn't even have put that in the post. But for a single node, it's probably not all that noticeable. Erick On Tue, Aug 27, 2013 at 12:20 PM, Sandro Zbinden

Shard splitting error: cannot uncache file=_1.nvm

2013-08-27 Thread Greg Preston
I haven't been able to successfully split a shard with Solr 4.4.0 If I have an empty index, or all documents would go to one side of the split, I hit SOLR-5144. But if I avoid that case, I consistently get this error: 290391 [qtp243983770-60] INFO

Re: Transaction log on-disk guarantees

2013-08-27 Thread Jack Krupansky
You missed the wiki update that went by a short while ago: <updateLog> <str name="dir">${solr.data.dir:}</str> + <!-- if you want to take control of the synchronization you may specify the syncLevel as one of the + following where ''flush'' is the default. fsync will reduce

SOLR 4.2.1 - High Resident Memory Usage

2013-08-27 Thread vsilgalis
We have a 2 shard SOLRCloud implementation with 6 servers in production. We have allocated 24GB to each server and are using JVM max memory settings of -Xmx14336 on each of the servers. We are using the same embedded jetty that SOLR comes with. The JVM side of things looks like what I'd expect

Re: Transaction log on-disk guarantees

2013-08-27 Thread SandroZbinden
Hey Jack, Thanks a lot. I just googled for fsync and syncLevel instead of searching in the Solr wiki. Won't happen again. Here is the link to the Solr wiki page that describes how to set the syncLevel: http://wiki.apache.org/solr/SolrCloud?highlight=%28fsync%29

Solr 4.2 Regular expression, returning only matched substring

2013-08-27 Thread Jai
Hi, is it possible to get only the matched substring of a text/string type field in the response? I am trying to search with a regular expression and facet on the different strings (substrings of the field) that match this regular expression. For example, if I write a regular expression to match email,

Re: No documents found for some queries with special chars like m&m

2013-08-27 Thread Utkarsh Sengar
Thanks for the info. 1. http://SERVER/solr/prodinfo/select?q=o%27reilly&wt=json&indent=true&debugQuery=true return: { "responseHeader":{ "status":0, "QTime":16, "params":{ "debugQuery":"true", "indent":"true", "q":"o'reilly", "wt":"json"}},

Re: SOLR 4.2.1 - High Resident Memory Usage

2013-08-27 Thread Shawn Heisey
On 8/27/2013 11:56 AM, vsilgalis wrote: We have a 2 shard SOLRCloud implementation with 6 servers in production. We have allocated 24GB to each server and are using JVM max memory settings of -Xmx14336 on each of the servers. We are using the same embedded jetty that SOLR comes with. The JVM

Re: Transaction log on-disk guarantees

2013-08-27 Thread Erick Erickson
Well, when you originally googled it wasn't there G, I just put it in after reading your post and realizing that it wasn't documented. Erick On Tue, Aug 27, 2013 at 2:13 PM, SandroZbinden zbin...@imagic.ch wrote: Hey Jack Thanks a lot. I just googled for fsync and syncLevel instead of

Re: Transaction log on-disk guarantees

2013-08-27 Thread Jack Krupansky
And here I was just about to give Mark credit for updating the wiki! -- Jack Krupansky -Original Message- From: Erick Erickson Sent: Tuesday, August 27, 2013 4:24 PM To: solr-user@lucene.apache.org Subject: Re: Transaction log on-disk guarantees Well, when you originally googled it

Re: No documents found for some queries with special chars like m&m

2013-08-27 Thread Utkarsh Sengar
Yup, the query o'reilly worked after adding WDF to the index analyser. Although m&m or m\&m doesn't work. Field analysis for m&m says: ST m, m WDF m, m ST m, m WDF m, m So essentially & is ignored during the index or the query. My guess is, the standard tokenizer is the problem. As the

Re: SOLR 4.2.1 - High Resident Memory Usage

2013-08-27 Thread vsilgalis
thanks for the quick reply. I made to rule out what I could around how Linux is handling this stuff. Yes I'm using the default swappiness setting of 60, but at this point it looks like the machine is swapping now because of low memory. Here is the vmstat and free -m results:

Re: SOLR 4.2.1 - High Resident Memory Usage

2013-08-27 Thread Shawn Heisey
On 8/27/2013 3:32 PM, vsilgalis wrote: thanks for the quick reply. I made to rule out what I could around how Linux is handling this stuff. Yes I'm using the default swappiness setting of 60, but at this point it looks like the machine is swapping now because of low memory. Here is the vmstat

Re: Solr 4.2 Regular expression, returning only matched substring

2013-08-27 Thread Erick Erickson
You can facet by arbitrary query, does that work? See facet.query... Best Erick On Tue, Aug 27, 2013 at 2:31 PM, Jai jai4l...@gmail.com wrote: Hi, is it possible to get only the matched substring of a text/string type field in response. i am trying to search with regular expression and
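
A small sketch of what a facet.query request could look like, built with the Python standard library only. The base URL, the field name email and the regular expressions are assumptions for illustration; facet, facet.query, rows and wt are standard Solr request parameters. Note that each facet.query returns a document count for that query and does not return the matched substring itself.

```python
from urllib.parse import urlencode

def facet_query_url(base_url, main_query, facet_queries):
    """Build a select URL with one facet.query parameter per query string."""
    params = [
        ("q", main_query),
        ("rows", "0"),      # only the facet counts are of interest here
        ("wt", "json"),
        ("facet", "true"),
    ]
    # facet.query may be repeated; each one yields its own count.
    params += [("facet.query", fq) for fq in facet_queries]
    return base_url + "?" + urlencode(params)

# Hypothetical host, core, field and patterns.
url = facet_query_url(
    "http://localhost:8983/solr/collection1/select",
    "*:*",
    [r"email:/.*@example\.com/", r"email:/.*@example\.org/"],
)
```

urlencode takes care of percent-encoding the special characters in the regular expressions.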

Re: No documents found for some queries with special chars like m&m

2013-08-27 Thread Erick Erickson
bq: Is there a way I can make m&m index as one string AND also keep StandardTokenizerFactory since I need it for other searches. In a word, no. You get one and only one tokenizer per field. But there are lots of options: Use a different tokenizer, possibly one of the regex ones. Fake it with

Re: SOLR 4.2.1 - High Resident Memory Usage

2013-08-27 Thread Erick Erickson
Ok, this whole topic usually gives me heartburn. So I'll just point out an interesting blog on this from Mike McCandless: http://blog.mikemccandless.com/2011/04/just-say-no-to-swapping.html At least tuning swappiness to 0 will tell you whether it's real or phantom. Of course I'd be trying it on a

ICUTokenizer class not found with Solr 4.4

2013-08-27 Thread Tom Burton-West
Hello all, According to the README.txt in solr-4.4.0/solr/example/solr/collection1, all we have to do is create a collection1/lib directory and put whatever jars we want in there. .. /lib. If it exists, Solr will load any Jars found in this directory and use them to resolve any

Re: SOLR 4.2.1 - High Resident Memory Usage

2013-08-27 Thread vsilgalis
dash: http://lucene.472066.n3.nabble.com/file/n4086902/solr_dash1.png JVM section: http://lucene.472066.n3.nabble.com/file/n4086902/solr_dash2.png ps output: http://lucene.472066.n3.nabble.com/file/n4086902/solr_ps_out.png Erick that may be one of the ways I approach this, I just want to

Re: ICUTokenizer class not found with Solr 4.4

2013-08-27 Thread Shawn Heisey
On 8/27/2013 4:29 PM, Tom Burton-West wrote: According to the README.txt in solr-4.4.0/solr/example/solr/collection1, all we have to do is create a collection1/lib directory and put whatever jars we want in there. .. /lib. If it exists, Solr will load any Jars found in this

Re: SOLR 4.2.1 - High Resident Memory Usage

2013-08-27 Thread Shawn Heisey
On 8/27/2013 4:17 PM, Erick Erickson wrote: Ok, this whole topic usually gives me heartburn. So I'll just point out an interesting blog on this from Mike McCandless: http://blog.mikemccandless.com/2011/04/just-say-no-to-swapping.html At least tuning swappiness to 0 will tell you whether it's

Re: SOLR 4.2.1 - High Resident Memory Usage

2013-08-27 Thread Shawn Heisey
On 8/27/2013 4:48 PM, vsilgalis wrote: dash: http://lucene.472066.n3.nabble.com/file/n4086902/solr_dash1.png JVM section: http://lucene.472066.n3.nabble.com/file/n4086902/solr_dash2.png ps output: http://lucene.472066.n3.nabble.com/file/n4086902/solr_ps_out.png Erick that may be one of the

RE: SOLR 4.2.1 - High Resident Memory Usage

2013-08-27 Thread Markus Jelsma
Hi -Original message- From:Shawn Heisey s...@elyograg.org Sent: Wednesday 28th August 2013 0:50 To: solr-user@lucene.apache.org Subject: Re: SOLR 4.2.1 - High Resident Memory Usage On 8/27/2013 4:17 PM, Erick Erickson wrote: Ok, this whole topic usually gives me heartburn. So

Re: ICUTokenizer class not found with Solr 4.4

2013-08-27 Thread Naomi Dushay
Hi Tom, Sorry - I was meeting with the East-Asia librarians … Perhaps you are missing the following from your solrconfig: <lib dir="/home/blacklight/solr-home/lib" /> (this is the top of my solrconfig.xml: <config> <!-- NOTE: various comments and unused configuration possibilities have been

Re: ICUTokenizer class not found with Solr 4.4

2013-08-27 Thread Shawn Heisey
On 8/27/2013 5:11 PM, Naomi Dushay wrote: Perhaps you are missing the following from your solrconfig <lib dir="/home/blacklight/solr-home/lib" /> I ran into this issue (I'm the one that filed SOLR-4852) and I am not using blacklight. I am only using what can be found in a Solr download, plus

Re: Can a data import handler grab all pages of an RSS feed?

2013-08-27 Thread Alexandre Rafalovitch
Have you tried using $hasMore and $nextUrl? You can inject it with a custom transformer. It is not documented very well, but is mentioned on the Wiki. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality

Re: No documents found for some queries with special chars like m&m

2013-08-27 Thread Utkarsh Sengar
Use a different tokenizer, possibly one of the regex ones. fake it with phrase queries. Take a really good look at the various filter combinations. It's possible that WhitespaceTokenizer and WordDelimiterFilterFactory might be able to do good things. Will try to play with these two

Re: SOLR 4.2.1 - High Resident Memory Usage

2013-08-27 Thread vsilgalis
http://lucene.472066.n3.nabble.com/file/n4086923/huge.png That doesn't seem to be a problem. Markus, are you saying that I should plan on resident memory being at least double my heap size? I haven't run into issues around this before but then again I don't know everything. Is this a rule of

Re: Filter cache pollution during sharded edismax queries

2013-08-27 Thread Otis Gospodnetic
Hi Ken, JIRA is kind of stuffed. I'd imagine showing more proof on the ML may be more effective. Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Tue, Aug 27, 2013 at 4:32 AM, Ken Krugler kkrugler_li...@transpac.com wrote: Hi

Adding weight to location of the string found

2013-08-27 Thread zseml
In Solr syntax, is there a way to add weight to the result found based on the location of the string that it's found? For instance, if I'm searching these strings for Hello: Hello World World Hello ...I'd like the first result to be the first one in my search results. Additionally, is there a