Re: java.lang.OutOfMemoryError: Requested array size exceeds VM limit

2014-07-25 Thread Greg Walters
Would you include the entire stack trace for your OOM message? Are you seeing this on the client or server side? Thanks, Greg On Jul 25, 2014, at 10:21 AM, Ameya Aware ameya.aw...@gmail.com wrote: Hi, I am in the process of indexing a lot of documents but after around 9 documents I am

Re: java.lang.OutOfMemoryError: Requested array size exceeds VM limit

2014-07-25 Thread Greg Walters
(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) Thanks, Ameya On Fri, Jul 25, 2014 at 12:36 PM, Greg Walters greg.walt...@answers.com wrote

Re: SOLR Cloud Rebuild core

2014-06-16 Thread Greg Walters
Plus one to Shawn's method below. I'm using that method in production right now and on occasion there's only a small blip of queued queries while the alias is in the process of swapping. Thanks, Greg On Jun 15, 2014, at 10:36 AM, Shawn Heisey s...@elyograg.org wrote: On 6/14/2014 1:29 PM,

Short hangs when doing collection alias updates

2014-05-15 Thread Greg Walters
Good day list members. I've got a couple solr clusters in cloud mode that make use of collection aliases for offline indexing that we rotate into when indexing is complete. Every time we rotate we see a huge jump in response time, a couple timeouts and a jump in threads. You can see the
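A minimal sketch of what such an alias rotation request might look like, assuming the swap is done with the Collections API `CREATEALIAS` action (re-issuing it for an existing alias repoints that alias). The host, alias, and collection names are hypothetical; the message above does not say which ones are in use.

```python
from urllib.parse import urlencode


def create_alias_url(solr_base: str, alias: str, collection: str) -> str:
    """Build a Collections API CREATEALIAS request URL.

    The alias is pointed at `collection`; if the alias already exists,
    the same request moves it, so a rotation is a single HTTP call.
    """
    query = urlencode({
        "action": "CREATEALIAS",
        "name": alias,
        "collections": collection,
    })
    return f"{solr_base.rstrip('/')}/admin/collections?{query}"


# Hypothetical host, alias, and freshly built offline collection.
print(create_alias_url("http://localhost:8983/solr", "live", "products_20140515"))
```

The function only builds the URL; sending it (for example with `curl`) is what actually triggers the swap.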

Re: Easises way to insatll solr cloud with tomcat

2014-05-15 Thread Greg Walters
While solr can run under tomcat, the (strongly) recommended container is the jetty that comes with solr. In my experience it's possible to just deploy the solr.war to tomcat like any other J2EE app but it runs better under the included jetty. Thanks, Greg On May 14, 2014, at 9:39 AM, Matt

Re: Too many documents Exception

2014-05-12 Thread Greg Walters
Looks like you've hit an internal limitation of Lucene, see http://lucene.apache.org/core/3_0_3/fileformats.html#Limitations: When referring to term numbers, Lucene's current implementation uses a Java int to hold the term index, which means the maximum number of unique terms in any
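For a sense of scale, here is the ceiling a signed 32-bit Java `int` imposes, as a small sketch. The exact effective limit inside Lucene is not stated in the preview above and may be slightly lower, so treat `fits_in_java_int` as an illustrative helper only.

```python
# A signed 32-bit Java int spans -2**31 .. 2**31 - 1.
JAVA_INT_MAX = 2**31 - 1


def fits_in_java_int(count: int) -> bool:
    """Return True if a non-negative count can be held in a Java int."""
    return 0 <= count <= JAVA_INT_MAX


print(JAVA_INT_MAX)  # 2147483647
```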

Re: Physical Files v. Reported Index Size

2014-05-12 Thread Greg Walters
See which index directory is actually in use by catting the index.properties file, verify nothing is using the others via lsof and you're safe to delete them. Thanks, Greg On May 6, 2014, at 10:34 PM, Darrell Burgan darrell.bur...@infor.com wrote: Hello all, I’m trying to reconcile what I’m
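The check described above can be sketched roughly as follows. This assumes `index.properties` holds an `index=<directory name>` line and that, when the file is absent, the plain `index` directory is the live one. It only lists candidates and deletes nothing, so the `lsof` verification still has to be done by hand.

```python
from pathlib import Path
from typing import List


def unused_index_dirs(data_dir: str) -> List[str]:
    """List index directories under data_dir that index.properties does not name."""
    data = Path(data_dir)
    active = "index"  # assumed default when index.properties is absent
    props = data / "index.properties"
    if props.is_file():
        for line in props.read_text().splitlines():
            line = line.strip()
            if line.startswith("#") or "=" not in line:
                continue  # skip comments and malformed lines
            key, value = line.split("=", 1)
            if key.strip() == "index":
                active = value.strip()
    candidates = [
        p.name
        for p in data.iterdir()
        if p.is_dir() and (p.name == "index" or p.name.startswith("index."))
    ]
    return sorted(name for name in candidates if name != active)
```

Calling `unused_index_dirs("/path/to/core/data")` (a hypothetical path) returns the directory names to verify with `lsof` before removing them.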

Re: Solr data directory contains index backups

2014-04-29 Thread Greg Walters
None that I'm aware of. A bit of googling shows the accepted solution to be an external script via cron or something similar. I think I saw an issue open on Apache's Jira about this but can't find it now. Thanks, Greg On Apr 25, 2014, at 4:37 PM, solr2020 psgoms...@gmail.com wrote: Thanks

Re: Solr data directory contains index backups

2014-04-23 Thread Greg Walters
In the data/ directory there's frequently a .properties file that contains the name of the index. directory that solr is currently using. Most of the time you can safely delete all of the index. directories that aren't in the properties file. As an extra verification step you might want

Re: No route to host

2014-04-09 Thread Greg Walters
This doesn't look like a Solr-specific issue. Be sure to check your routes and your firewall. I've seen firewalls refuse packets and return a special flag that results in a no route to host error. Thanks, Greg On Apr 9, 2014, at 3:28 PM, Rallavagu rallav...@gmail.com wrote: All, I see the

Re: Logging which client connected to Solr

2014-03-27 Thread Greg Walters
We do something similar and include the server's hostname in solr's response. To accomplish this you'll have to write a class that extends org.apache.solr.servlet.SolrDispatchFilter and put your custom class in place as the SolrRequestFilter in solr's web.xml. Thanks, Greg On Mar 27, 2014, at

Re: Solr4.7 No live SolrServers available to handle this request

2014-03-24 Thread Greg Walters
think this could be fixed pretty easily; see SOLR-5985 for my suggestion. -Mike On 03/21/2014 10:20 AM, Greg Walters wrote: Broken pipe errors are generally caused by unexpected disconnections and are sometimes hard to track down. Given the stack traces you've provided it's hard

Re: Solr4.7 No live SolrServers available to handle this request

2014-03-21 Thread Greg Walters
, maxShardsPerNode:2, router:{name:compositeId}, replicationFactor:2}} On 03/20/2014 09:44 AM, Greg Walters wrote: Sathya, I assume you're using Solr Cloud. Please provide your clusterstate.json while you're seeing this issue and check your logs for any exceptions. With no information from

Re: solr cloud distributed optimize() becomes serialized

2014-03-21 Thread Greg Walters
I've seen this on 4.6. Thanks, Greg On Mar 20, 2014, at 11:58 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: That's not right. Which Solr versions are you on (question for both William and Chris)? On Fri, Mar 21, 2014 at 8:07 AM, William Bell billnb...@gmail.com wrote: Yeah.

Re: SolrCell and indexing HTML

2014-03-21 Thread Greg Walters
I've never tried indexing via groovy or using solrCell but I think you might be working a bit too low level in solrj if you're just adding documents. You might try checking out https://wiki.apache.org/solr/Solrj#Adding_Data_to_Solr and I might be way off base :) Thanks, Greg On Mar 21, 2014,

Re: Solr4.7 No live SolrServers available to handle this request

2014-03-20 Thread Greg Walters
Sathya, I assume you're using Solr Cloud. Please provide your clusterstate.json while you're seeing this issue and check your logs for any exceptions. With no information from you it's hard to troubleshoot any issues! Thanks, Greg On Mar 20, 2014, at 12:44 AM, Sathya

Re: More heap usage in Solr during indexing

2014-03-17 Thread Greg Walters
Is your JVM running out of RAM (actual exceptions) or is the used heap just reaching 16G prior to a garbage collection? If it's the latter then that is expected behavior and is how Java's garbage collection works. Thanks, Greg On Mar 17, 2014, at 1:26 PM, solr2020 psgoms...@gmail.com wrote:

Re: More heap usage in Solr during indexing

2014-03-17 Thread Greg Walters
It's entirely possible that you're seeing higher memory usage while indexing due to more objects being created and abandoned. Another thing to consider could be your commit settings. Perhaps http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html can answer some of your

Re: PROBLEM SOLRJ

2014-03-14 Thread Greg Walters
Hello, You shouldn't include the # as part of the url nor should the collection be specified directly like that either. Check out https://wiki.apache.org/solr/Solrj#HttpSolrServer for an example. Thanks, Greg On Mar 14, 2014, at 3:00 AM, Ángel Miralles angel.miralles.e...@juntadeandalucia.es

Re: Empty string in tfloat type field

2014-03-14 Thread Greg Walters
Ravi, You must not have gotten Ahmet Arslan's reply the last time you posted this question three days ago. Allow me to quote: ** Hi Ravi, How about RemoveBlankFieldUpdateProcessorFactory ? https://lucene.apache.org/solr/4_0_0/solr-core/org/apache/solr/update/processor/ Ahmet ** Thanks,

Re: More Maintenance Releases?

2014-03-12 Thread Greg Walters
Furkan, This list tends to eat attachments. Could you post it somewhere like imgur? Thanks, Greg On Mar 12, 2014, at 2:19 PM, Furkan KAMACI furkankam...@gmail.com wrote: Hi; I've attached the chart that I've prepared as I mentioned at e-mail. Thanks; Furkan KAMACI 2014-03-12 21:17

Re: zkHost configuration

2014-03-11 Thread Greg Walters
It's used for failover and if you've got ZooKeeper running on a separate machine(s) you need a way to tell Solr where to look. Thanks, Greg On Mar 11, 2014, at 10:11 AM, Oliver Schrenk oliver.schr...@gmail.com wrote: Hi, I was wondering why there is the need to fully specify all zookeeper

Re: to reduce indexing time

2014-03-05 Thread Greg Walters
It doesn't sound like you have much of an understanding of java's garbage collection. You might read http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html to get a better understanding of how it works and why you're seeing different levels of memory utilization at any

Re: Solr Permgen Exceptions when creating/removing cores

2014-03-03 Thread Greg Walters
Josh, You've mentioned a couple of times that you've got PermGen set to 512M but then you say you're running with -XX:MaxPermSize=64M. These two statements are contradictory so are you *sure* that you're running with 512M of PermGen? Assuming you're on a *nix box can you provide `ps` output

Re: Solr 4.5.0 replication numDocs larger in slave

2014-03-03 Thread Greg Walters
I just ran into an issue similar to this that affected document scores on distributed searches. You might try doing an optimize and purging your deleted documents while no indexing is being done then checking your counts. Once I optimized all my indexes the document counts on all of my cores

Re: Solr cloud: Faceting issue on text field

2014-02-26 Thread Greg Walters
IIRC faceting uses copious amounts of memory; have you checked for GC activity while the query is running? Thanks, Greg On Feb 26, 2014, at 1:06 PM, David Miller davthehac...@gmail.com wrote: Hi, I am encountering an issue where Solr nodes go down when trying to obtain facets on a text

Re: Solr cloud: Faceting issue on text field

2014-02-26 Thread Greg Walters
is that, the query was returning only a single document, but the facet still seems to be having the issue. So, it should be technically possible to get facets on text field over 200-300 million docs at a decent speed, right? Regards, On Wed, Feb 26, 2014 at 2:13 PM, Greg

Re: SolrCloud: How to replicate shard of another machine for failover?

2014-02-25 Thread Greg Walters
Oliver, You'll probably have better luck not supplying CLI arguments and creating your collection via the collections api (https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-CreateaCollection). Try removing -DnumShards and setting the -Dcollection.configName to

Re: in XML Node Getting Error

2014-02-21 Thread Greg Walters
Ravi, What's the error you're getting? Thanks, Greg On Feb 21, 2014, at 11:08 AM, EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) external.ravi.tamin...@us.bosch.com wrote: Hi, I am getting Error if any of the field in the xml file has as value. How can I fix this issue FYI

Re: Preventing multiple on-deck searchers without causing failed commits

2014-02-19 Thread Greg Walters
I believe that there's a configuration option that'll make on-deck searchers be used if they're needed even if they're not fully warmed yet. You might try that option and see if it doesn't solve your 503 errors. Thanks, Greg On Feb 18, 2014, at 9:05 PM, Erick Erickson erickerick...@gmail.com

Re: Preventing multiple on-deck searchers without causing failed commits

2014-02-19 Thread Greg Walters
A quick peek at the code (branch_4x, SolrCore.java, starting at line 1647) seems to confirm this. It seems my understanding of that option was wrong! Thanks for correcting me Shawn. Greg On Feb 19, 2014, at 11:19 AM, Shawn Heisey s...@elyograg.org wrote: On 2/19/2014 8:59 AM, Greg

Re: Could not connect or ping a core after import a big data into it...

2014-02-14 Thread Greg Walters
You should check your server logs for error messages during startup related to loading that core. Feel free to post them here if you can't parse them. Thanks, Greg On Feb 14, 2014, at 10:14 AM, Eric_Peng sagittariuse...@gmail.com wrote: Need help, Thx in advance. About import a big XML

Re: Solr4 performance

2014-02-12 Thread Greg Walters
Shital, Take a look at http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html as it's a pretty decent explanation of memory mapped files. I don't believe that the default configuration for solr is to use MMapDirectory but even if it does my understanding is that the entire

Re: SolrCloudServer questions

2014-02-03 Thread Greg Walters
be beneficial when bulk uploading? On Fri, Jan 31, 2014 at 11:05 AM, Mark Miller markrmil...@gmail.com wrote: On Jan 31, 2014, at 1:56 PM, Greg Walters greg.walt...@answers.com wrote: I'm assuming you mean CloudSolrServer here. If I'm wrong please ignore my response. -updatesToLeaders

Re: need help in understating solr cloud stats data

2014-02-03 Thread Greg Walters
I've had some issues monitoring Solr with the per-core mbeans and ended up writing a custom request handler that gets loaded then registers itself as an mbean. When called it polls all the per-core mbeans then adds or averages them where appropriate before returning the requested value. I'm not

Re: need help in understating solr cloud stats data

2014-02-03 Thread Greg Walters
currently expect you to aggregate in the monitoring layer and it’s a lot to ask IMO. - Mark http://about.me/markrmiller On Feb 3, 2014, at 10:49 AM, Greg Walters greg.walt...@answers.com wrote: I've had some issues monitoring Solr with the per-core mbeans and ended up writing

Re: SolrCloudServer questions

2014-01-31 Thread Greg Walters
I'm assuming you mean CloudSolrServer here. If I'm wrong please ignore my response. -updatesToLeaders Only send documents to shard leaders while indexing. This saves cross-talk between slaves and leaders which results in more efficient document routing. shutdownLBHttpSolrServer

Re: solrcloud shards backup/restoration

2014-01-24 Thread Greg Walters
We've managed some success restoring existing/backed up indexes into solr cloud and even building the indexes offline and dumping the lucene files into the directories that solr expects. The general steps we follow are: 1) Round up your files. It doesn't matter if you pull from a master or

Re: solr cloud + hdfs issue

2014-01-21 Thread Greg Walters
You can configure the Solr client to use a replication factor of 1 for hdfs and then let Solr replicate for you if you want to avoid this. What is solr's behavior if the lucene files underneath it suddenly disappear? Will a core that's running and can't access its files in the case of a HDFS

Re: Setting leaderVoteWait for auto discovered cores

2014-01-21 Thread Greg Walters
Allow me to quote Mark via StackOverflow: ** In solr.xml, add a cores attribute of leaderVoteWait=0. It defaults to 180000 (3 minutes). This is simply to protect against starting the cluster with an old node - you don't want it to become the leader before other nodes get to participate in the

Re: Questions about integrateing SolrCloud with HDFS

2013-12-26 Thread Greg Walters
YouPeng, While I'm unable to help you with the issue that you're seeing I did want to comment here and say that I have previously brought up the same goal that you're trying to accomplish on this mailing list but received no feedback or input. I think it makes sense that Solr should not try to

Re: Questions about integrateing SolrCloud with HDFS

2013-12-26 Thread Greg Walters
Mark, I'd be happy to but some clarification first; should this issue be about creating cores with overlapping names and the stack trace that YouPeng initially described, Solr's behavior when storing data on HDFS or YouPeng's other thread (Maybe a bug for solr 4.6 when create a new core) that

Re: email datasource connect timeout issue

2013-12-20 Thread Greg Walters
Xie, Based on the error message: ** Caused by: javax.mail.MessagingException: Connection timed out; nested exception is: java.net.ConnectException: Connection timed out at com.sun.mail.imap.IMAPStore.protocolConnect(IMAPStore.java:571) at

Re: installing a 3rd party index

2013-12-13 Thread Greg Walters
Christian, I literally did this 10 minutes ago for an internal example. You need to issue a RELOAD for your index to open a new searcher using the updated files. Here's an example showing how I did it: ** Creating a new collection Greg-Walters-MacBook-Pro:SolrUpload greg.walters$ curl http

Re: starting up solr automatically

2013-12-05 Thread Greg Walters
www.flax.co.uk On 4 Dec 2013, at 21:26, Greg Walters wrote: I almost forgot, you'll need a file to setup the environment a bit too: ** JAVA_HOME=/usr/java/default JAVA_OPTIONS=-Xmx15g \ -Xms15g \ -XX:+PrintGCApplicationStoppedTime \ -XX:+PrintGCDateStamps \ -XX:+PrintGCDetails \ -XX

Re: starting up solr automatically

2013-12-05 Thread Greg Walters
=\ -Dsolr.solr.home=/home/ec2-user/solr/solr-4.5.1/example/solr/ \ -Xms1g \ -Djetty.port=8983 \ -Dcollection.configName=collection1 \ $JAVA_OPTIONS Any ideas what I should check? Eric P thanks in advance On Thu, Dec 5, 2013 at 11:28 AM, Greg Walters greg.walt...@answers.com wrote

Re: starting up solr automatically

2013-12-05 Thread Greg Walters
) at org.eclipse.jetty.start.Config.getActiveClasspath(Config.java:388) at org.eclipse.jetty.start.Main.start(Main.java:509) at org.eclipse.jetty.start.Main.main(Main.java:96) On Thu, Dec 5, 2013 at 2:28 PM, Greg Walters greg.walt...@answers.com wrote: Eric, If you're using the script from the gist I

Re: Programmatically upload configuration into ZooKeeper

2013-12-04 Thread Greg Walters
Hi Artem, This question (or one very like it) has been asked on this list before so there's some prior art you could modify to suit your needs. Taken from Timothy Potter thelabd...@gmail.com: ** public static void updateClusterstateJsonInZk(CloudSolrServer cloudSolrServer, CommandLine cli)

Re: starting up solr automatically

2013-12-04 Thread Greg Walters
I found the instructions and scripts on that page to be unclear and/or not work. Here's the script I've been using for solr 4.5.1: https://gist.github.com/gregwalters/7795791 Do note that you'll have to change a couple of paths to get things working correctly. Thanks, Greg On Dec 4, 2013, at

Re: starting up solr automatically

2013-12-04 Thread Greg Walters
/lib/answers/atlascloud/solr45/ JETTY_USER=tomcat JETTY_LOGS=/var/lib/answers/atlascloud/solr45/logs ** On Dec 4, 2013, at 3:21 PM, Greg Walters greg.walt...@answers.com wrote: I found the instructions and scripts on that page to be unclear and/or not work. Here's the script I've been using

Re: safe to delete old index

2013-10-31 Thread Greg Walters
Hi Chris, In my experience it is safe to delete older indexes like that. You might want to check if the index is in use prior to deleting it via the `lsof` command on linux or the equivalent on other platforms. I've found that most times, if the index isn't the one specified in

Re: safe to delete old index

2013-10-31 Thread Greg Walters
* You might want to check that the index is NOT in use * (It's still early and dark here!) Greg On Oct 31, 2013, at 9:57 AM, Greg Walters greg.walt...@answers.com wrote: Hi Chris, In my experience it is safe to delete older indexes like that. You might want to check if the index is in use

RE: Solr cloud shard goes down after SocketException in another shard

2013-09-12 Thread Greg Walters
Neoman, Make sure that solr08-prod (or the elected leader at any time) isn't doing a stop-the-world garbage collection that takes long enough that the zookeeper connection times out. I've seen that in my cluster when I didn't have parallel GC enabled and my zkClientTimeout in solr.xml was too

RE: Solr cloud shard goes down after SocketException in another shard

2013-09-12 Thread Greg Walters
Neoman, I've got ours set at 45 seconds: <int name="zkClientTimeout">${zkClientTimeout:45000}</int> -Original Message- From: neoman [mailto:harira...@gmail.com] Sent: Thursday, September 12, 2013 9:33 AM To: solr-user@lucene.apache.org Subject: Re: Solr cloud shard goes down after
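The `${zkClientTimeout:45000}` form reads as "use the `zkClientTimeout` system property if it is set, otherwise 45000 ms". The sketch below mimics that lookup for illustration; it is not Solr's own substitution code, and `resolve` is a hypothetical helper name.

```python
import re

# Matches ${name} or ${name:default}.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(text: str, props: dict) -> str:
    """Replace ${name:default} placeholders using props, falling back to the default."""

    def _sub(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        raise KeyError(name)  # no value and no default supplied

    return _PLACEHOLDER.sub(_sub, text)


# No property set: the 45 second default applies.
print(resolve("${zkClientTimeout:45000}", {}))
# Property supplied (e.g. via -DzkClientTimeout=30000): it wins over the default.
print(resolve("${zkClientTimeout:45000}", {"zkClientTimeout": "30000"}))
```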

RE: Distributing lucene segments across multiple disks.

2013-09-11 Thread Greg Walters
Why not use some form of RAID for your index store? You'd get the performance benefit of multiple disks without the complexity of managing them via solr. Thanks, Greg -Original Message- From: Deepak Konidena [mailto:deepakk...@gmail.com] Sent: Wednesday, September 11, 2013 2:07 PM

RE: Distributing lucene segments across multiple disks.

2013-09-11 Thread Greg Walters
as a replacement for Solr or making Solr work with RAID? Could you elaborate more on the latter, if that's what you meant? We make use of solr's advanced text processing features which would be hard to replicate just using RAID. -Deepak On Wed, Sep 11, 2013 at 12:11 PM, Greg Walters gwalt...@sherpaanalytics.com

RE: Distributing lucene segments across multiple disks.

2013-09-11 Thread Greg Walters
Deepak, It might be a bit outside what you're willing to consider but you can make a raid out of your spinning disks then use your SSD(s) as a dm-cache device to accelerate reads and writes to the raid device. If you're putting lucene indexes on a mixed bag of disks and ssd's without any type

RE: Solr Cloud hangs when replicating updates

2013-09-04 Thread Greg Walters
Kevin, Take a look at http://lucene.472066.n3.nabble.com/updating-docs-in-solr-cloud-hangs-td4067388.html and https://issues.apache.org/jira/browse/SOLR-4816. I had the same issue that you're reporting for a while then I applied the patch from SOLR-4816 to my clients and the problems went

RE: SolrCloud 4.x hangs under high update volume

2013-09-04 Thread Greg Walters
Tim, Take a look at http://lucene.472066.n3.nabble.com/updating-docs-in-solr-cloud-hangs-td4067388.html and https://issues.apache.org/jira/browse/SOLR-4816. I had the same issue that you're reporting for a while then I applied the patch from SOLR-4816 to my clients and the problems went away.

RE: coordination factor in between query terms

2013-08-28 Thread Greg Walters
Just boost the term you want to show up higher in your results. http://wiki.apache.org/solr/SolrRelevancyCookbook#Boosting_Ranking_Terms - Greg -Original Message- From: anirudh...@gmail.com [mailto:anirudh...@gmail.com] On Behalf Of Anirudha Jadhav Sent: Wednesday, August 28, 2013 3:36

RE: Caused by: java.net.SocketException: Connection reset by peer: socket write error solr querying

2013-08-26 Thread Greg Walters
AnilJayanti, Have you checked your entire stack from the client all the way to solr along with anything between them? Your timeout values should match everywhere and if there's something between the client and server that'll time out before either the client or server does it'll cause that

RE: how to integrate solr with HDFS HA

2013-08-23 Thread Greg Walters
Finally something I can help with! I went through the same problems you're having a short while ago. Check out https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS for most of the information you need and be sure to check the comments on the page as well. Here's an example

RE: Caused by: java.net.SocketException: Connection reset by peer: socket write error solr querying

2013-08-23 Thread Greg Walters
If you're using the bundled jetty that comes with the download, check the etc/jetty.xml property for maxIdleTime and set it appropriately. I get that error when operations take longer than the property is set to and time out. Do note that the property is specified in milliseconds! Thanks, Greg
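As a quick way to see what the property is currently set to, something like the sketch below could be used. It assumes `maxIdleTime` appears as a `<Set name="maxIdleTime">` element with a literal number, and the `SAMPLE` fragment is a hypothetical, trimmed-down stand-in for a real `etc/jetty.xml`.

```python
import xml.etree.ElementTree as ET
from typing import List


def max_idle_times_ms(jetty_xml: str) -> List[int]:
    """Return every literal maxIdleTime value (milliseconds) found in the XML."""
    root = ET.fromstring(jetty_xml)
    values = []
    for el in root.iter("Set"):
        text = (el.text or "").strip()
        if el.get("name") == "maxIdleTime" and text.isdigit():
            values.append(int(text))
    return values


# Hypothetical fragment in Jetty's XML configuration syntax.
SAMPLE = """<Configure id="Server" class="org.eclipse.jetty.server.Server">
  <Call name="addConnector">
    <Arg>
      <New class="org.eclipse.jetty.server.bio.SocketConnector">
        <Set name="port">8983</Set>
        <Set name="maxIdleTime">50000</Set>
      </New>
    </Arg>
  </Call>
</Configure>"""

for ms in max_idle_times_ms(SAMPLE):
    # The value is in milliseconds, so divide by 1000 for seconds.
    print(f"{ms} ms = {ms / 1000:.0f} s")
```

Printing the value in seconds alongside the raw number makes it harder to misread a 50 second timeout as 50000 seconds.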

RE: updating docs in solr cloud hangs

2013-08-22 Thread Greg Walters
Thanks, Erick that's exactly the clarification/confirmation I was looking for! Greg

RE: problems running solr 4.4 with HDFS HA

2013-08-07 Thread Greg Walters
Hi Mark, Setting <str name="solr.hdfs.confdir"> properly in my solrconfig.xml did it. Thanks! Greg Walters | Operations Team 530 Maryville Center Drive, Suite 250 St. Louis, Missouri 63141 t. 314.225.2745 | c. 314.225.2797 gwalt...@sherpaanalytics.com www.sherpaanalytics.com

Data duplication using Cloud+HDFS+Mirroring

2013-08-07 Thread Greg Walters
While testing Solr's new ability to store data and transaction directories in HDFS I added an additional core to one of my testing servers that was configured as a backup (active but not leader) core for a shard elsewhere. It looks like this extra core copies the data into its own directory

problems running solr 4.4 with HDFS HA

2013-08-06 Thread Greg Walters
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value> </property> ** How do I get HA working? Thanks, Greg Greg Walters | Operations Team 530 Maryville Center Drive, Suite 250 St. Louis, Missouri 63141 t. 314.225.2745 | c. 314.225.2797 gwalt...@sherpaanalytics.com