Re: Modifying date format when using TrieDateField.

2014-08-14 Thread Modassar Ather
Thanks Erick for your inputs. Regards, Modassar On Tue, Aug 12, 2014 at 8:32 PM, Erick Erickson erickerick...@gmail.com wrote: The response will always be the full specification, so you'll have yyyy-MM-dd'T'HH:mm:ss format. If you want the user to just see the yyyy-MM-dd you could use a
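
One client-side option, sketched here only as an illustration and not necessarily what the reply above goes on to recommend: parse the full timestamp that Solr returns and re-format it for display. The sample value and the helper name `to_display_date` are hypothetical, and the sketch assumes the usual UTC form with a trailing `Z` and no fractional seconds.

```python
from datetime import datetime

# Assumed shape of a date value as returned by Solr (UTC, trailing "Z",
# no fractional seconds). Adjust if your responses include milliseconds.
SOLR_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def to_display_date(solr_value: str) -> str:
    """Hypothetical helper: reduce a full Solr timestamp to yyyy-MM-dd."""
    return datetime.strptime(solr_value, SOLR_FORMAT).strftime("%Y-%m-%d")


# Sample value invented for illustration.
print(to_display_date("2014-08-12T20:32:00Z"))
```

The indexed value stays a full timestamp, so range queries and sorting keep working; only the presentation changes.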

Re: Help Required

2014-08-14 Thread Dmitry Kan
Thanks a lot, Shawn! Dmitry On Wed, Aug 13, 2014 at 4:22 PM, Shawn Heisey s...@elyograg.org wrote: On 8/13/2014 5:11 AM, Dmitry Kan wrote: OK, thanks. Can you please add my user name to the Contributor group? username: DmitryKan You are added. Edit away! Thanks, Shawn --

RE: Replication Issue with Repeater Please help

2014-08-14 Thread waqas sarwar
Date: Wed, 13 Aug 2014 07:19:58 -0600 From: s...@elyograg.org To: solr-user@lucene.apache.org Subject: Re: Replication Issue with Repeater Please help On 8/13/2014 12:49 AM, waqas sarwar wrote: Hi, I'm using Solr. I need a little bit of assistance from you. I am a bit stuck with

structuring

2014-08-14 Thread Oded Sofer
Hello, I am trying to index structured data, a kind of event log (user, client, serverIP, time, etc.). I would like to enable specific field search (e.g., user = John Smith) and free text search (e.g., John Smith). I've tried to index each field separately and all strings together in another

Question

2014-08-14 Thread Oded Sofer
Hello, We are implementing SolrCloud; we expect around 200 million documents per node and 160-200 nodes. I looked at other references, and it seems we are not the first to work with such volume. The indexing itself will be done locally (no distribution, each node-server indexes its own). The

Re: Question

2014-08-14 Thread Jack Krupansky
1. Better to target a max of 100 million docs per node, unless you do a POC that more docs really does work well for you. 2. Sounds like you don't have enough memory, either heap or system memory. Increase your heap first. Then more system memory. 3. Document examples of a simple query, facet
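
To put point 1 in numbers, here is a rough back-of-the-envelope sketch. The figures of 200 million docs per node and 160-200 nodes come from the original question, and the ceiling of 100 million docs per node is the target suggested above; the helper name `nodes_needed` is made up for illustration, and real capacity depends on document size, queries, and hardware, which only a proof of concept can establish.

```python
DOCS_PER_NODE_PLANNED = 200_000_000  # from the original question
NODES_PLANNED = (160, 200)           # low and high end from the question
DOCS_PER_NODE_TARGET = 100_000_000   # ceiling suggested in the reply


def nodes_needed(total_docs: int, docs_per_node: int) -> int:
    """Hypothetical helper: integer ceiling of total_docs / docs_per_node."""
    return -(-total_docs // docs_per_node)


for nodes in NODES_PLANNED:
    total = nodes * DOCS_PER_NODE_PLANNED
    needed = nodes_needed(total, DOCS_PER_NODE_TARGET)
    print(f"{nodes} nodes x 200M docs = {total:,} docs; "
          f"at 100M docs per node that is {needed} nodes")
```

In other words, holding to the suggested ceiling would roughly double the planned node count unless a proof of concept shows the denser layout performs acceptably.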

Re: Replication Issue with Repeater Please help

2014-08-14 Thread Shawn Heisey
On 8/14/2014 2:09 AM, waqas sarwar wrote: Thanks Shawn. What I got is that circular replication is totally impossible and Solr fails in a distributed environment. Then why does the Solr documentation say to configure a REPEATER for distributed architecture, because a REPEATER behaves like master-slave at a

Random OOM Exceptions

2014-08-14 Thread Scott Rankin
Hello all, I'm running a Solr setup and am getting occasional periods where memory usage and GC just spike out of nowhere (unrelated to traffic). I'm hoping someone can shed some light. Here's the setup: - Solr 4.3.1, Oracle JDK 1.7.0_51 64 bit on CentOS 6.5 - We have 2 Solr servers, one

BlendedInfixSuggester index write.lock failures on core reload

2014-08-14 Thread Zisis Tachtsidis
Hi all, I'm using Solr 4.9.0 and have set up a spellcheck component for returning suggestions. The configuration inside my solr.SpellCheckComponent is as follows. <str name="classname">org.apache.solr.spelling.suggest.Suggester</str> <str

RE: Updates to index not available immediately as index scales, even with autoSoftCommit at 1 second

2014-08-14 Thread cwhit
Thanks for the explanation. This makes a lot of sense to me... I'm wondering if there's a way to get the best of both worlds. Can throwing more hardware at the index give real time updates + a large LRU cache? Would we be CPU bound at this point? -- View this message in context:

RE: Solr cloud performance degradation with billions of documents

2014-08-14 Thread Wilburn, Scott
Erick, Thanks for your suggestion to look into MapReduceIndexerTool, I'm looking into that now. I agree what I am trying to do is a tall order, and the more I hear from all of your comments, the more I am convinced that lack of memory is my biggest problem. I'm going to work on increasing the

Re: Random OOM Exceptions

2014-08-14 Thread Shawn Heisey
On 8/14/2014 7:46 AM, Scott Rankin wrote: I'm running a Solr setup and am getting occasional periods where memory usage and GC just spike out of nowhere (unrelated to traffic). I'm hoping someone can shed some light. Here's the setup: - Solr 4.3.1, Oracle JDK 1.7.0_51 64 bit on CentOS 6.5

Re: Solr cloud performance degradation with billions of documents

2014-08-14 Thread Jack Krupansky
You're using the term cloud again. Maybe that's the cause of your misunderstanding - SolrCloud probably should have been named SolrCluster since that's what it really is, a cluster rather than a cloud. The term cloud conjures up images of vast, unlimited numbers of nodes, thousands, tens of

Re: structuring

2014-08-14 Thread Erick Erickson
You haven't given us anything to go on here, details matter. What is your field definition? How often do you commit? What is the query that's slow? What have you tried? You might want to review: http://wiki.apache.org/solr/UsingMailingLists Best, Erick On Thu, Aug 14, 2014 at 2:06 AM, Oded

Re: Random OOM Exceptions

2014-08-14 Thread Scott Rankin
On 8/14/14, 11:22 AM, Shawn Heisey s...@elyograg.org wrote: On 8/14/2014 7:46 AM, Scott Rankin wrote: I'm running a Solr setup and am getting occasional periods where memory usage and GC just spike out of nowhere (unrelated to traffic). I'm hoping someone can shed some light. Here's the

ComplexPhraseQuery and Date ranges

2014-08-14 Thread Bryan Bende
Does anyone know if it is possible to get date ranges working with the ComplexPhraseQueryParser? I'm using Solr 4.8.1 and seeing the same behavior described in this post: http://stackoverflow.com/questions/19402268/solr-4-2-1-and-solr-1604-complexphrase-and-date-range-queries-do-not-work-toge I

Re: Updates to index not available immediately as index scales, even with autoSoftCommit at 1 second

2014-08-14 Thread Erick Erickson
Oh, what a lovely anti-pattern! Every second, you're throwing away your filterCache. And then firing up to 4096 autowarm queries at your index on the filterCache. And this doesn't even consider the other caches! And this will get worse with time after restarts if my scenario is accurate. Having
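
A rough arithmetic sketch of why this combination hurts. The 1-second soft commit interval comes from the thread subject and the figure of up to 4096 autowarm queries from the reply above; the per-query warming cost of 5 ms is purely a hypothetical number chosen for illustration, as are the helper names.

```python
SOFT_COMMIT_INTERVAL_S = 1.0  # autoSoftCommit interval from the thread subject
AUTOWARM_COUNT = 4096         # upper bound on autowarm queries named in the reply
ASSUMED_MS_PER_QUERY = 5.0    # hypothetical cost of replaying one cached filter


def warm_time_seconds(autowarm_count: int, ms_per_query: float) -> float:
    """Time to replay every autowarm query once, in seconds."""
    return autowarm_count * ms_per_query / 1000.0


def max_ms_per_query(autowarm_count: int, interval_s: float) -> float:
    """Per-query budget (ms) for warming to finish inside one commit interval."""
    return interval_s * 1000.0 / autowarm_count


warm = warm_time_seconds(AUTOWARM_COUNT, ASSUMED_MS_PER_QUERY)
budget = max_ms_per_query(AUTOWARM_COUNT, SOFT_COMMIT_INTERVAL_S)
print(f"warming takes about {warm:.2f} s per new searcher")
print(f"budget to finish within the interval: {budget:.3f} ms per query")
```

Under that assumed cost, warming takes many times longer than the interval between commits, so each new searcher is superseded before its warmed cache can pay off. Lowering the autowarm count or lengthening the soft commit interval both shrink that gap.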

Re: Random OOM Exceptions

2014-08-14 Thread Shawn Heisey
On 8/14/2014 10:06 AM, Scott Rankin wrote: My question was actually more about what in Solr might cause the server to suddenly go from a very consistent heap size of 300-400 MB to over 2 GB in a matter of minutes with no changes in traffic. I get why the VM is crashing, I just don’t know why

RE: Solr cloud performance degradation with billions of documents

2014-08-14 Thread Toke Eskildsen
Wilburn, Scott [scott.wilb...@verizonwireless.com.INVALID] wrote: Thanks for your suggestion to look into MapReduceIndexerTool, I'm looking into that now. I agree what I am trying to do is a tall order, and the more I hear from all of your comments, the more I am convinced that lack of

Re: Solr cloud performance degradation with billions of documents

2014-08-14 Thread Erick Erickson
You are absolutely on the bleeding edge. I know of a couple of projects that are at that scale, but 1) they aren't being done on just a few nodes. As Jack says, this scale for SolrCloud is not common and there are no OOB templates to follow. 2) AFAIK, the projects I'm talking about aren't in

RE: Solr cloud performance degradation with billions of documents

2014-08-14 Thread Wilburn, Scott
Thanks, Jack. I'd like to stay away from a terminology debate, since it is clear you know what I am talking about. But just to give my opinion, I prefer the term 'cloud' because it differentiates it from the term 'cluster', which refers to the Hadoop environment which I am running it on. I

Re: ComplexPhraseQuery and Date ranges

2014-08-14 Thread Erick Erickson
That code is relatively new in the code base; I know of nothing that has been done to make it work with date range queries. Sounds like a JIRA to me. Best, Erick On Thu, Aug 14, 2014 at 9:41 AM, Bryan Bende bbe...@gmail.com wrote: Does anyone know if it is possible to get date ranges working

Re: Random OOM Exceptions

2014-08-14 Thread Erick Erickson
bq: I just don’t know why Solr is suddenly going nuts. Hmmm, as Shawn says, hard to say at this remove. But I've personally doubled the memory requirements for Solr on the _same_ index by altering the query to a pathological one. Something like q=*:*&facet.field=whatever where the field whatever
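
For anyone who wants to reproduce this kind of request against a test instance, the query string can be assembled as below. This is only a sketch: `whatever` is the placeholder field name from the message above, `facet=true` is an assumption added here because faceting normally has to be switched on explicitly, and no host or core is named because the thread does not give one.

```python
from urllib.parse import urlencode

# Parameters for a match-all query faceted on a single field.
params = {
    "q": "*:*",                 # match every document
    "facet": "true",            # assumption: not shown in the message above
    "facet.field": "whatever",  # placeholder field name from the message
}

# safe="*:" keeps the match-all query readable instead of percent-encoding it.
query_string = urlencode(params, safe="*:")
print(query_string)
```

Appending the result to a select handler URL on a non-production copy of the index, while watching heap usage, is one way to check whether a particular field is the expensive one.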

Spell check collation

2014-08-14 Thread Corey Gerhardt
Solr 4.6 Current settings for my handler: <str name="defType">edismax</str> <str name="spellcheck.maxResultsForSuggest">5</str> <str name="spellcheck.maxCollations">3</str> <str name="spellcheck.maxCollationTries">30</str> <str name="qf">BUS_BUSINESS_NAME_PHRASE</str> <str name="spellcheck.count">10</str> <str

RE: Spell check collation

2014-08-14 Thread Dyer, James
DirectSolrSpellChecker defaults with a minimum term length of 4. So you'd need to bring this down with <int name="minQueryLength">1</int>. But you might not like the results from this. See:

Re: Random OOM Exceptions

2014-08-14 Thread François Schiettecatte
I would also get some metrics when SOLR is doing nothing, the JVM does do work in the background and looking at the memory graph in VisualVM will show a nice sawtooth. François On Aug 14, 2014, at 1:16 PM, Erick Erickson erickerick...@gmail.com wrote: bq: I just don’t know why Solr is

RE: Solr cloud performance degradation with billions of documents

2014-08-14 Thread Toke Eskildsen
Erick Erickson [erickerick...@gmail.com] wrote: Solr requires holding large parts of the index in memory. For the entire corpus. At once. That requirement is under the assumption that one must have the lowest possible latency at each individual box. You might as well argue for the fastest

Ubuntu 14.04 Tomcat 7.0.52 Solr 4.9 - org.apache.solr.common.SolrException: Invalid chunk header

2014-08-14 Thread mark12345
I seem to be frequently getting the following exception in my Solr 4.9 logs: org.apache.solr.common.SolrException: Invalid chunk header. These exceptions still continue to happen even if I throttle my Solr requests. Does anyone have any suggestions on how to address or work around this issue? I have

Re: ICUTokenizer acting very strangely with oriental characters

2014-08-14 Thread Steve Rowe
On Aug 13, 2014, at 1:53 PM, Shawn Heisey s...@elyograg.org wrote: On 8/12/2014 9:13 PM, Steve Rowe wrote: In the table below, the IsSameS (is same script) and SBreak? (script break = not IsSameS) decisions are based on what I mentioned in my previous message, and the WBreak (word break)