Solr phonetics with spelling

2015-03-10 Thread Ashish Mukherjee
Hello, Couple of questions related to phonetics - 1. If I enable the phonetic filter in managed-schema file for a particular field, how does it affect the spell handler? 2. What is the meaning of the inject attribute within analyzer in managed-schema? The documentation is not very clear about

Re: SolrCloud: Chroot error

2015-03-10 Thread Aman Tandon
Thanks Shawn, I tried it with a single string but still no success. So currently I am running it without chroot and it is working fine. With Regards Aman Tandon On Mon, Mar 9, 2015 at 9:46 PM, Shawn Heisey apa...@elyograg.org wrote: On 3/9/2015 10:03 AM, Aman Tandon wrote: Thanks for

Re: unusually high 4.10.2 vs 4.3.1 RAM consumption

2015-03-10 Thread Dmitry Kan
For the sake of completeness, just wanted to confirm these params had a positive effect: -Dsolr.solr.home=cores -Xmx12000m -Djava.awt.headless=true -XX:+UseParNewGC -XX:+ExplicitGCInvokesConcurrent -XX:+UseConcMarkSweepGC -XX:MaxTenuringThreshold=8 -XX:CMSInitiatingOccupancyFraction=40

Re: Solrcloud Index corruption

2015-03-10 Thread Martin de Vries
Hi, this _sounds_ like you somehow don't have indexed=true set for the field in question. We investigated a lot more. The CheckIndex tool didn't find any error. We now think the following happened: - We changed the schema two months ago: we changed a field to indexed=true. We reloaded the

Re: SolrCloud: Chroot error

2015-03-10 Thread Shawn Heisey
On 3/10/2015 6:10 AM, Aman Tandon wrote: Thanks Shawn, I tried it with single string but still no success. So currently i am running it without chroot and it is working fine. That brings up something for me or you to try. I wonder if perhaps there is a bug that will prevent the directory

Re: Solr 5: data_driven_schema_config's solrconfig causing error

2015-03-10 Thread Steve Rowe
Hi Aman, The stack trace shows that the AddSchemaFieldsUpdateProcessorFactory specified in data_driven_schema_configs’s solrconfig.xml expects the “booleans” field type to exist. Solr 5’s data_driven_schema_configs includes the “booleans” field type:

RE: Solr phonetics with spelling

2015-03-10 Thread Dyer, James
Ashish, I would not recommend using spellcheck against a phonetic-analyzed field. Instead, you can use copyField to create a separate field that is lightly analyzed and use the copy for spelling. James Dyer Ingram Content Group -Original Message- From: Ashish Mukherjee

Re: how to change configurations in solrcloud setup

2015-03-10 Thread Nitin Solanki
Hi Aman, You can apply configuration on solr cloud by using this command - sudo path_of_solr/solr_folder_name/example/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd upconfig -confdir path_of_solr/solr_folder_name/example/solr/collection1/conf -confname default and

Re: Chaining components in request handler

2015-03-10 Thread Alexandre Rafalovitch
Ok. Components then. Defined in solrconfig.xml. You can prepend/append/replace the standard list. Try that and see if that's enough. Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/ On 10 March 2015 at 14:03, Ashish Mukherjee

Re: Solr TCP layer

2015-03-10 Thread Saumitra Srivastav
Thanks everyone for the responses. My motivation for TCP comes from a very heavy indexing pipeline where even the smallest optimization matters. I am working on a machine data parser which feeds data into Cassandra and Solr and we have SLAs based on how fast we can make data available in both

Chaining components in request handler

2015-03-10 Thread Ashish Mukherjee
Hello, I would like to create a request handler which chains components in a particular sequence to return the result, similar to a Unix pipe. e.g. Component 1 -> result1 -> Component 2 -> result2, where result2 is the final result returned. Component 1 may be a standard component, Component 2 may be out of

Re: Cores and and ranking (search quality)

2015-03-10 Thread Shawn Heisey
On 3/10/2015 11:17 AM, johnmu...@aol.com wrote: If I have two cores, one core has 10 docs another has 100,000 docs. I then submit two docs that are 100% identical (with the exception of the unique-ID fields, which is stored but not indexed) one to each core. The question is, during

Re: Solr 5.0.0 - Multiple instances sharing Solr server *read-only* dir

2015-03-10 Thread Damien Dykman
Thanks Timothy for the pointer to the Jira ticket. That's exactly it :-) Erick, the main reason why I would run multiple instances on the same machine is to simulate a multi node environment. But beyond that, I like the idea of being able to clearly separate the server dir and the data dirs. That

Re: Parsing cluster result's docs

2015-03-10 Thread Erick Erickson
You can get some fields back besides ID, see the carrot.title and carrot.snippet params. I don't know a good way to get the full underlying documents though. Best, Erick On Mon, Mar 9, 2015 at 9:33 AM, Jorge Luis Lazo jorgeluis1...@gmail.com wrote: Hi, I have a Solr instance using the

Re: Solrcloud Index corruption

2015-03-10 Thread Erick Erickson
Ahhh, ok. When you reloaded the cores, did you do it core-by-core? I can see how something could get dropped in that case. However, if you used the Collections API and two cores mysteriously failed to reload that would be a bug. Assuming the replicas in question were up and running at the time

Re: Cores and and ranking (search quality)

2015-03-10 Thread johnmunir
Thanks Erick for trying to help, I really appreciate it. Unfortunately, I'm still stuck. There are times one must know the inner workings and behavior of the software to make a design decision, and this is one of them. If I knew the inner workings of Solr, I would not be asking. In addition,

Re: Chaining components in request handler

2015-03-10 Thread Alexandre Rafalovitch
Is that during indexing or during query phase? Indexing has UpdateRequestProcessors (e.g. http://www.solr-start.com/info/update-request-processors/ ) Query has Components (e.g. Faceting, MoreLikeThis, etc.) Or something different? Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs

Re: Field Rename in SOLR

2015-03-10 Thread Erick Erickson
What do you mean by "rename field"? It _looks_ like you're trying to get the results into a doc from your document and changing its name _in the results_. I.e. you have ProductName in your document, but want to see Name_en-US in your output. My guess is that the hyphen is the problem. Does it work if

Solr 5 upgrade

2015-03-10 Thread richardg
Ubuntu 14.04.02 Trying to install solr 5 following this: https://cwiki.apache.org/confluence/display/solr/Upgrading+a+Solr+4.x+Cluster+to+Solr+5.0 I keep getting this script requires extracting a war file with either the jar or unzip utility, please install these utilities or contact your

Num docs, block join, and dupes?

2015-03-10 Thread Timothy Potter
Before I open a JIRA, I wanted to put this out to solicit feedback on what I'm seeing and what Solr should be doing. So I've indexed the following 8 docs into a 2-shard collection (Solr 4.8'ish - internal custom branch roughly based on 4.8) ... notice that the 3 grand-children of 2-1 have dup'd

Re: unusually high 4.10.2 vs 4.3.1 RAM consumption

2015-03-10 Thread Erick Erickson
Thanks for letting us know! Erick On Tue, Mar 10, 2015 at 5:20 AM, Dmitry Kan solrexp...@gmail.com wrote: For the sake of completeness, just wanted to confirm these params had a positive effect: -Dsolr.solr.home=cores -Xmx12000m -Djava.awt.headless=true -XX:+UseParNewGC

Re: Solr 5.0.0 - Multiple instances sharing Solr server *read-only* dir

2015-03-10 Thread Timothy Potter
I think the next step here is to ship Solr with the war already extracted so that Jetty doesn't need to extract it on first startup - https://issues.apache.org/jira/browse/SOLR-7227 On Tue, Mar 10, 2015 at 10:15 AM, Erick Erickson erickerick...@gmail.com wrote: If I'm understanding your problem

Re: Solr TCP layer

2015-03-10 Thread Erick Erickson
Just to pile on: I admire your bravery! I'll add to the other comments only by saying that _before_ you start down this path, you really need to articulate the benefit/cost analysis. To gain a little more communications efficiency will be a pretty hard sell due to the reasons Shawn outlined. This

Re: Solr 5.0.0 - Multiple instances sharing Solr server *read-only* dir

2015-03-10 Thread Erick Erickson
If I'm understanding your problem correctly, I think you want the -d option, then all the -s guys would be under that. Just to check, though, why are you running multiple Solrs? There are sometimes very good reasons, just checking that you're not making things more difficult than necessary

Re: Cores and and ranking (search quality)

2015-03-10 Thread johnmunir
Thanks Walter. The design decision I'm trying to solve is this: using multiple cores, will my ranking be impacted vs. using single core? I have records to index and each record can be grouped into object-types, such as object-A, object-B, object-C, etc. I have a total of 30 (maybe more)

Re: Invalid Date String:'1992-07-10T17'

2015-03-10 Thread Chris Hostetter
: is a syntactically significant character to the query parser, so it's getting confused by it in the text of your query. you're seeing the same problem as if you tried to search for foo:bar in the yak field using q=yak:foo:bar you either need to backslash escape the : characters, or wrap the

Re: Cores and and ranking (search quality)

2015-03-10 Thread Walter Underwood
If the documents are distributed randomly across shards/cores, then the statistics will be similar in each core and the results will be similar. If the documents are distributed semantically (say, by topic or type), the statistics of each core will be skewed towards that set of documents and
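As a rough illustration of why per-core statistics matter, the sketch below uses the classic Lucene TF-IDF inverse document frequency, idf = 1 + ln(numDocs / (docFreq + 1)). The core sizes (10 and 100,000 documents) come from the question in this thread; the document frequency of 1 is an invented value for the example, and the similarity actually configured on a given core may differ.

```python
import math


def classic_idf(doc_freq: int, num_docs: int) -> float:
    """Classic Lucene TF-IDF inverse document frequency for one term."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Assumption for illustration: the query term occurs in exactly one
# document in each core. Only the core size differs.
small_core = classic_idf(doc_freq=1, num_docs=10)
large_core = classic_idf(doc_freq=1, num_docs=100_000)

print(f"idf in the 10-doc core:      {small_core:.3f}")
print(f"idf in the 100,000-doc core: {large_core:.3f}")
```

The same term in an otherwise identical document gets a different weight in each core, so scores from separate cores are not directly comparable.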

Re: Num docs, block join, and dupes?

2015-03-10 Thread Jessica Mallet
We've seen this as well. Before we understood the cause, it seemed very bizarre that hitting different nodes would yield different numFound, as well as using different rows=N (since the proxying node only de-dupes the documents that are returned in the response). I think consistency and
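The effect described above can be reproduced with a toy model. This is a simplified sketch, not Solr's actual merge code: it assumes each shard reports its own hit count, that the merging node sums those counts, and that it only subtracts duplicates it sees among the top `rows` documents each shard returns. The function name and the shard contents are hypothetical.

```python
def merge_shard_responses(shard_ids, rows):
    """Toy merge: sum per-shard hit counts, de-dupe only the returned docs."""
    num_found = sum(len(ids) for ids in shard_ids)
    seen = set()
    merged = []
    for ids in shard_ids:
        # Each shard only sends back its top `rows` documents.
        for doc_id in ids[:rows]:
            if doc_id in seen:
                num_found -= 1  # duplicate noticed, so the count is corrected
                continue
            seen.add(doc_id)
            merged.append(doc_id)
    return num_found, merged[:rows]


# Hypothetical data: ids "b" and "c" exist on both shards.
shards = [["a", "b", "c"], ["c", "b", "d"]]

for rows in (1, 2, 3):
    num_found, docs = merge_shard_responses(shards, rows)
    print(f"rows={rows}: numFound={num_found}, docs={docs}")
```

In this model, duplicates that fall outside the returned window are never seen by the merging node, which is why the reported count changes with rows=N.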

Re: Solr TCP layer

2015-03-10 Thread Walter Underwood
I would strongly recommend taking a look at HTTP/2. It might not be fast enough for you, but it is fast enough for Google and there are already implementations. http://http2.github.io/faq/ wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) On Mar 10,

Re: Solr TCP layer

2015-03-10 Thread Erick Erickson
Saumitra: We certainly don't mean to be overly discouraging, so have at it! There has been some talk of using Netty in the future as we pull the war-file distribution out of the distro. Now, I have no technical clue about the merits vs. TCP. But that's another possibility you might want to put

Invalid Date String:'1992-07-10T17'

2015-03-10 Thread Mirko Torrisi
Hi all, I am very new with Solr (and Lucene) and I use the last version of it. I do not understand why I obtain this: Exception in thread main org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr/Collection1: Invalid Date

RE: Invalid Date String:'1992-07-10T17'

2015-03-10 Thread Ryan, Michael F. (LNG-DAY)
You'll need to wrap the date in quotes, since it contains a colon: String a = "speechDate:\"1992-07-10T17:33:18Z\""; -Michael -Original Message- From: Mirko Torrisi [mailto:mirko.torr...@ucdconnect.ie] Sent: Tuesday, March 10, 2015 3:34 PM To: solr-user@lucene.apache.org Subject: Invalid
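The two fixes suggested in this thread, quoting the value or backslash-escaping the colons, can be sketched as plain string handling. The helper names below are hypothetical and only build the query text; they are not part of any Solr client library, and they handle only the characters relevant to this case.

```python
def quote_value(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes inside it."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def escape_colons(value: str) -> str:
    """Backslash-escape every ':' so the query parser reads it as literal text."""
    return value.replace(":", "\\:")


date = "1992-07-10T17:33:18Z"

# Field name taken from the thread; both forms keep the parser from
# treating the colons inside the date as field separators.
quoted_query = "speechDate:" + quote_value(date)
escaped_query = "speechDate:" + escape_colons(date)

print(quoted_query)
print(escaped_query)
```

Quoting is usually the simpler choice for a single literal value; escaping is useful when the value has to remain an unquoted term.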

Re: Cores and and ranking (search quality)

2015-03-10 Thread Walter Underwood
On Mar 10, 2015, at 10:17 AM, johnmu...@aol.com wrote: If I have two cores, one core has 10 docs another has 100,000 docs. I then submit two docs that are 100% identical (with the exception of the unique-ID fields, which is stored but not indexed) one to each core. The question is,

Re: Solr TCP layer

2015-03-10 Thread Shawn Heisey
On 3/10/2015 12:13 PM, Saumitra Srivastav wrote: Now we want to do the same with Solr. I do realize that this is going to be a lot of work, but if it's something that will reap benefit in the long run, then so be it. Datastax provides a netty based layer in their enterprise version which

Re: Num docs, block join, and dupes?

2015-03-10 Thread Mikhail Khludnev
On Tue, Mar 10, 2015 at 7:09 PM, Timothy Potter thelabd...@gmail.com wrote: So I guess my question is why doesn't the non-distrib query do de-duping? Tim, that behavior is by design. The special _root_ field is used as a delete term when a block update is applied, i.e. in case of block,

Import Feed rss delta-import

2015-03-10 Thread Ednardo
Hi, How do I create a DataImportHandler using delta-import for rss feeds? Thanks!! -- View this message in context: http://lucene.472066.n3.nabble.com/Import-Feed-rss-delta-import-tp4192257.html Sent from the Solr - User mailing list archive at Nabble.com.

Solr 5: data_driven_schema_config's solrconfig causing error

2015-03-10 Thread Aman Tandon
Hi, For the sake of using the new schema.xml and solrconfig.xml with solr 5, I put my old required field type fields names (being used with solr 4.8.1) in the schema.xml given in *basic_configs* configurations setting given in solrconfig.xml present in *data_driven_schema_configs* and I put

Re: Import Feed rss delta-import

2015-03-10 Thread Alexandre Rafalovitch
I don't think you can since you can't query RSS normally. You just do full import and override on ids. Regards, Alex On 10 Mar 2015 7:16 pm, Ednardo ednardomart...@gmail.com wrote: Hi, How do I create a DataImportHandler using delta-import for rss feeds? Thanks!! -- View this