Out Of Memory =( Too many cores on one server?

2012-11-16 Thread stockii
Hello. If my server runs for a while, I get OOM problems. I think the problem is that I am running too many cores on one server with too many documents. This is my server concept: 14 cores. 1 with 30 million docs, 1 with 22 million docs, 1 with a growing 25 million docs, 1 with 67 million

RE: cores shards and disks in SolrCloud

2012-11-16 Thread Toke Eskildsen
On Fri, 2012-11-16 at 02:18 +0100, Buttler, David wrote: Obviously, I could replicate the data so that I wouldn't lose any documents while I replace my disk, but since I am already storing the original data in HDFS, (with a 3x replication), adding additional replication for solr eats into my

Re: Out Of Memory =( Too many cores on one server?

2012-11-16 Thread Bernd Fehling
I guess you should give JVM more memory. When starting to find a good value for -Xmx I oversized and set it to Xmx20G and Xms20G. Then I monitored the system and saw that JVM is between 5G and 10G (java7 with G1 GC). Now it is finally set to Xmx11G and Xms11G for my system with 1 core and 38

Re: CloudSolrServer and LBHttpSolrServer: setting BinaryResponseParser and BinaryRequestWriter.

2012-11-16 Thread Sandopolus
There is a way to make CloudSolrServer use LBHttpSolrServer with the BinaryRequestWriter that is quite simple, as I have had to work around this very problem. Create a new class which extends LBHttpSolrServer (call it BinaryLBHttpSolrServer or something like that). This class will need to set up

Re: how make a suggester?

2012-11-16 Thread iwo
Many thanks Otis! Yes, I searched and read most of the posts found about this topic on the mailing. I'm finding the best way at this moment :-) now, I begin to develop and test - Complicare è facile, semplificare é difficile. Complicated is easy, simple is hard. quote:

Re: admin query showing unstored fields

2012-11-16 Thread Upayavira
Er, it can't. What are you seeing that seems wrong? Upayavira On Fri, Nov 16, 2012, at 10:13 AM, Reik Schatz wrote: This might be a silly question but if I search *.* in the admin tool, how can it show me the full document including all the fields that are set to stored=false or that don't

Re: admin query showing unstored fields

2012-11-16 Thread Reik Schatz
I did this test. Here is my schema.xml (setting stored=false explicitly though it should be default): <schema name="minimal" version="1.1"> <types> <fieldType name="string" class="solr.StrField" /> <fieldType name="score" class="solr.TrieFloatField" precisionStep="32" omitNorms="true"

Report exception: too many close count

2012-11-16 Thread Markus Jelsma
Hi, I stumbled upon SOLR-4037 again and this time restarting with a clean Zookeeper gave a very interesting error log: 2012-11-16 11:05:51,876 INFO [solr.core.SolrCore] - [Thread-4] - : Closing SolrCoreState 2012-11-16 11:05:51,876 INFO [solr.update.DefaultSolrCoreState] - [Thread-4] - :

Re: Neary text search system with solr.

2012-11-16 Thread alu
Thank you iorixxx san! Sorry, my English is very poor! Here is a concrete example. ::schema.xml . <fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.CJKWidthFilterFactory"/>

ZooKeeper and single file update

2012-11-16 Thread Marcin Rzewucki
Hi, It happens a lot of times that I need to update just 1 file in ZooKeeper. I'm using zkcli.sh and -upconfig for the whole directory with configuration files. I wonder, if it is possible to update a single file in ZooKeeper. Do you have any ideas ? Thanks!

Re: consistency in SolrCloud replication

2012-11-16 Thread Bill Au
Yes, my original question is about search. And Mark did answer it in his original reply. I am guessing that the replicas are updated sequentially, so the newly added documents will be available in some replicas before others. I want to know where SolrCloud stands in terms of CAP. Bill On

RE: consistency in SolrCloud replication

2012-11-16 Thread Markus Jelsma
Solr provides availability and is tolerant to partitioning, so that leaves consistency. It is eventually consistent. -Original message- From:Bill Au bill.w...@gmail.com Sent: Fri 16-Nov-2012 15:00 To: solr-user@lucene.apache.org Subject: Re: consistency in SolrCloud replication

RE: DIH nested entities don't work

2012-11-16 Thread mroosendaal
Hi, You are correct about not wanting to index everything every day; however, for this PoC I need a 'bootstrap' mechanism which basically does what Endeca does. The 'defaultRowPrefetch' in the solrconfig.xml does not seem to take effect; I'll have a closer look. Regarding the long time, it appeared that one

How to format xml for Solr import

2012-11-16 Thread Spadez
Hi, I was wondering if someone could show me an example XML file to use for importing into Solr. Basically I have the following information that I am trying to import into Solr: Title Description Keyword Description Source Location Name Location Co-ordinates URL Time I've never worked with XML before

Re: admin query showing unstored fields

2012-11-16 Thread Jack Krupansky
Is there any chance that you had added the document and then changed the schema to have stored=false? Changing the schema doesn't affect the existing index/stored values. Also, what release are you using? -- Jack Krupansky -Original Message- From: Reik Schatz Sent: Friday, November

Reduce QueryComponent prepare time

2012-11-16 Thread Markus Jelsma
Hi, We're seeing high prepare times for the QueryComponent, obviously due to the vast number of fields and queries. It's common to have a prepare time of 70-80ms while the process times drop significantly due to warmed searchers, OS cache etc. The prepare time is a recurring issue and I'd hope

Re: How to format xml for Solr import

2012-11-16 Thread Marcin Rzewucki
Hi, You can prepare the following structure: <add> <doc> <field name="solrfieldname1">value1</field> <field name="solrfieldname2">value2</field> </doc> </add> You can find sample files in the Solr package (example/exampledocs/ dir) along with the post.sh script, which might be useful for you. Regards. On 16
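
The add/doc/field structure described above can also be generated rather than written by hand. The following is a minimal Python sketch, assuming documents are available as plain dictionaries; the helper name build_add_xml is hypothetical, and the field names are the placeholders from the message, not fields from any real schema.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Build a Solr <add> update message from a list of {field: value} dicts.

    Hypothetical helper for illustration; field names must match the schema.
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            # One <field name="..."> element per field; text is escaped by ElementTree.
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


xml_payload = build_add_xml(
    [{"solrfieldname1": "value1", "solrfieldname2": "value2"}]
)
print(xml_payload)
```

Letting ElementTree serialize the message means characters such as & and < in titles or descriptions are escaped automatically, which is a common source of errors in hand-written import files.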

Re: admin query showing unstored fields

2012-11-16 Thread Reik Schatz
I am using Solr 4.0 (new admin interface) and I am sure I don't have anything left in my index because I empty the data directory every time before testing. On Fri, Nov 16, 2012 at 3:39 PM, Jack Krupansky j...@basetechnology.com wrote: Is there any chance that you had added the document and

Re: admin query showing unstored fields

2012-11-16 Thread Erik Hatcher
Did you also restart Solr after changing things? On Nov 16, 2012, at 10:04, Reik Schatz reik.sch...@gmail.com wrote: I am using Solr 4.0 (new admin interface) and I am sure I don't have anything left in my index because I empty the data directory every time before testing. On Fri, Nov

Re: High Slave CPU Intermittently After Replication

2012-11-16 Thread richardg
We tried using the mergeFactor setting, but our CPU load and slow query time issues were more widespread. Optimizing the index always alleviated the issue, which is why we are using it now. Our index is 2 GB when optimized and would balloon to over 4 GB, so we thought the issue was that it was getting too big. I

Re: consistency in SolrCloud replication

2012-11-16 Thread Mark Miller
I want to know where SolrCloud stands in terms of CAP. SolrCloud is a CP system. In the face of partitions, SolrCloud favors consistency over availability (mostly concerning writes). The system is eventually consistent, but should become consistent with a pretty low latency, unlike many cases

Re: admin query showing unstored fields

2012-11-16 Thread Reik Schatz
Erik, yes I did. I am stopping Tomcat, rm the index dir inside the data folder. Change the schema.xml inside the conf dir. Restart Tomcat and run my little class which adds a single SolrInputDocument to my core. Then I check via admin. So I take it that it shouldn't be displaying the contents of a

Re: zkcli issues

2012-11-16 Thread Mark Miller
I *think* I tested the script on Windows once way back. Anyway, the code itself should not be OS specific. One thing you might want to check if you are copying Unix command-line stuff: I think Windows separates classpath entries with ; rather than :, so you likely need to change that. You'd
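
The separator difference mentioned above can be illustrated with a short Python sketch. The function name join_classpath and the entry names are hypothetical placeholders, not the actual paths the zkcli script uses.

```python
import os


def join_classpath(entries, windows):
    """Join classpath entries with the separator the target OS expects.

    Java uses ';' between classpath entries on Windows and ':' on Unix-like systems.
    """
    separator = ";" if windows else ":"
    return separator.join(entries)


# Placeholder entries; substitute the real library directories.
entries = ["lib-a/*", "lib-b/*"]

unix_classpath = join_classpath(entries, windows=False)
windows_classpath = join_classpath(entries, windows=True)

# os.pathsep holds the separator for the platform this script runs on.
local_classpath = os.pathsep.join(entries)
print(unix_classpath, windows_classpath, local_classpath)
```

A command copied from a Unix example with : between entries is read by Java on Windows as a single malformed entry, which is one way to end up with a "Could not find or load main class" error.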

Re: Out Of Memory =( Too many cores on one server?

2012-11-16 Thread Vadim Kisselmann
Hi, your JVM needs more RAM. My setup works well with 10 cores and 300 million docs, Xmx8GB Xms8GB, 16GB for OS. But as Bernd mentioned, the memory consumption depends on the number of fields and the fieldCache. Best Regards Vadim 2012/11/16 Bernd Fehling bernd.fehl...@uni-bielefeld.de: I

Solr Admin Page authentication

2012-11-16 Thread Marcin Rzewucki
Hi, Does anybody know if SOLR supports Admin Page authentication ? I'm using Jetty from the latest solr package. I added security option to start.ini: OPTIONS=Server,webapp,security and in configuration file I have (according to Jetty documentation): !--

Re: admin query showing unstored fields

2012-11-16 Thread Jack Krupansky
I just noticed that you are using an old schema version, 1.1. Any reason for that? This suggests that you had an existing, old, Solr that you migrated to 4.0. What Solr release was it? Was the stored attribute working as you expected before you upgraded to Solr 4.0? I don't know the specific

Re: admin query showing unstored fields

2012-11-16 Thread Jack Krupansky
Also, go to the Schema Browser in Solr Admin and confirm that the players field is listed as Index: (unstored field). Select a stored field just to see how they should differ. -- Jack Krupansky -Original Message- From: Jack Krupansky Sent: Friday, November 16, 2012 8:09 AM To:

error opening index solr 4.0 with lukeall-4.0.0-ALPHA.jar

2012-11-16 Thread Miguel Ángel Martín
Hi all: I can't open an index created with Solr 4.0 using Luke version lukeall-4.0.0-ALPHA.jar. I get the error: Format version is not supported (resource: NIOFSIndexInput(path=/Users/desa/data/index/_2.tvx)): 1 (needs to be between 0 and 0) at

Re: zkcli issues

2012-11-16 Thread Nick Chase
I agree that it *shouldn't* be OS specific. :) Anyway, thanks for the suggestion, but that's not it. I get the same error with the script right out of the box: Error: Could not find or load main class

Re: Solr Admin Page authentication

2012-11-16 Thread Michael Long
It doesn't... you would have to do this with jetty or tomcat. But I noticed with 4.0 it no longer lives under /admin but rather /solr...and that means you can't just password-protect it without password-protecting all of solr. If I am wrong, please let me know...I would love to protect it

Re: admin query showing unstored fields

2012-11-16 Thread Reik Schatz
Hi Jack, I just did some testing again and can confirm it works! I am new to Solr so by reading through http://wiki.apache.org/solr/SchemaXml#Fields I was under the impression that not specifying the stored attribute at all, would default to stored=false - which apparently is not the case. So

Re: Solr Admin Page authentication

2012-11-16 Thread Alexandre Rafalovitch
The UI is under Solr, but actual operations I think are still under /solr/admin and /solr/corename/admin : <requestHandler name="/admin/" class="solr.admin.AdminHandlers" /> I wonder if it is possible to protect those resources and whether the browser will pop-up the authentication on first access

Re: Solr Admin Page authentication

2012-11-16 Thread Stefan Matheis
Alex On Friday, November 16, 2012 at 5:44 PM, Alexandre Rafalovitch wrote: I wonder if it is possible to protect those resources and whether the browser will pop-up the authentication on first access (even if from AJAX call). Or it might be possible to have a fake resource loading from that

RE: DIH nested entities don't work

2012-11-16 Thread Dyer, James
Maarten, Here is a sample set-up that lets you build your caches in parallel and then index off the caches in a subsequent step. See below for the solrconfig.xml snippet and the text of the 4 data-config.xml files. In this example it builds a cache for the parent also, but this is not

Re: zkcli issues

2012-11-16 Thread Mark Miller
Still looks like a classpath issue to me then. If that didn't help, something is still off. Either the war has not been exploded yet, your classpath entries point to the wrong places, or the classpath is not being specified right. Those would be my guesses. Given you are using the out of the

inconsistent number of results returned in solr cloud

2012-11-16 Thread Buttler, David
Hi all, I buried an issue in my last post, so let me pop it up. I have a cluster with 10 collections on it. The first collection I loaded works perfectly. But every subsequent collection returns an inconsistent number of results for each query. The queries can be simply *:*, or more complex

RE: Custom Solr indexer/searcher

2012-11-16 Thread Scott Smith
Thanks for the suggestions. I'll take a look at these things. -Original Message- From: Mikhail Khludnev [mailto:mkhlud...@griddynamics.com] Sent: Thursday, November 15, 2012 11:54 PM To: solr-user@lucene.apache.org Subject: Re: Custom Solr indexer/searcher Scott, It sounds like you

Re: inconsistent number of results returned in solr cloud

2012-11-16 Thread Mark Miller
How did you do the final commit? Can you try a lone commit (with openSearcher=true) and see if that affects things? Trying to determine if this is a known issue or not. - Mark On Nov 16, 2012, at 1:34 PM, Buttler, David buttl...@llnl.gov wrote: Hi all, I buried an issue in my last post, so
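
A lone commit of the kind suggested above is an update request that carries no documents, only commit parameters. The Python sketch below only builds the URL; the function name commit_url and the host and collection name in the example are assumptions for illustration.

```python
from urllib.parse import urlencode


def commit_url(core_url, open_searcher=True):
    """Build the URL for a commit-only request against a core's /update handler.

    core_url is assumed to look like http://host:port/solr/<collection>.
    """
    params = {
        "commit": "true",
        # openSearcher=true asks Solr to open a new searcher so the
        # committed documents become visible to queries.
        "openSearcher": "true" if open_searcher else "false",
    }
    return core_url.rstrip("/") + "/update?" + urlencode(params)


# Hypothetical host and collection name.
url = commit_url("http://localhost:8983/solr/collection1")
print(url)
```

The resulting URL can be requested with curl or any HTTP client; comparing result counts before and after the request shows whether the commit changes what the searchers see.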

Re: admin query showing unstored fields

2012-11-16 Thread Jack Krupansky
So, you're all set? Maybe you could suggest where in the wiki the text could be clarified to help others avoid the same confusion. -- Jack Krupansky -Original Message- From: Reik Schatz Sent: Friday, November 16, 2012 8:35 AM To: solr-user@lucene.apache.org Subject: Re: admin query

Solr/Lucene Tokenizers - cannot get the behavior I need

2012-11-16 Thread Shawn Heisey
I cannot seem to get the combination of behaviors that I want from the tokenizer/filter combinations in Solr. Right now I am using WhitespaceTokenizer. This does not split on punctuation, which is the behavior I want, because I do this myself later. I use WordDelimiterFilter with

Re: Solr/Lucene Tokenizers - cannot get the behavior I need

2012-11-16 Thread Jack Krupansky
Generally, you don't need the preserveOriginal attribute for WDF. Generate both the word parts and the concatenated terms, and queries should work fine without the original. The separated terms will be indexed as a sequence, and the split/separated terms will generate a phrase query that

Re: Faceting Question

2012-11-16 Thread Jamie Johnson
Thanks, I'll take a look. Do you happen to know if it works with dates? On Thu, Nov 15, 2012 at 7:28 AM, Alexey Serba ase...@gmail.com wrote: Seems like pivot faceting is what you are looking for ( http://wiki.apache.org/solr/SimpleFacetParameters#Pivot_.28ie_Decision_Tree.29_Faceting )

Re: Solr/Lucene Tokenizers - cannot get the behavior I need

2012-11-16 Thread Shawn Heisey
On 11/16/2012 12:36 PM, Jack Krupansky wrote: Generally, you don't need the preserveOriginal attribute for WDF. Generate both the word parts and the concatenated terms, and queries should work fine without the original. The separated terms will be indexed as a sequence, and the split/separated

RE: inconsistent number of results returned in solr cloud

2012-11-16 Thread Buttler, David
My typical way of adding documents is through SolrJ, where I commit after every batch of documents (where the batch size is configurable). I have now tried committing several times from the command line (curl), with and without openSearcher=true. It does not affect anything. Dave

Re: SolrCloud and Optimize

2012-11-16 Thread Walter Underwood
An optimize is really a forced merge. Solr does continuous partial merging automatically, and has done so since version 1. Do not call optimize unless you really, really know what you are doing. wunder On Nov 16, 2012, at 11:44 AM, shreejay wrote: I just stumbled on this post. Does

Architecture Question

2012-11-16 Thread Cool Techi
Hi, I am not sure if this is the right forum for this question, but it would be great if I could be pointed in the right direction. We have been using a combination of MySQL and Solr for all our company's full-text and query needs. But as our customers have grown, so has the amount of data and

Re: zkcli issues

2012-11-16 Thread Jeevanandam Madanagopal
Hello Nick - I have executed the steps on Windows 7 and successfully uploaded the Solr Configuration to ZooKeeper :) Log lines (Config upload): C:\Users\jeeva\for-nick>dir solr-lib-cli Volume in drive C has no label. Volume Serial Number is 94F3-63A7 Directory of

Re: Architecture Question

2012-11-16 Thread Otis Gospodnetic
Hello, I am not sure if this is the right forum for this question, but it would be great if I could be pointed in the right direction. We have been using a combination of MySQL and Solr for all our company's full-text and query needs. But as our customers have grown, so has the amount of data

Re: BM25 model for solr 4?

2012-11-16 Thread Otis Gospodnetic
Hi Floyd, I don't think there is a general answer to that question. You would have to test it with your corpus/index and your queries. If you have that and if you can have 2 indices, one using BM25 and the other using VSM or anything else you want to compare, you would want to do some A/B

Solr 4:How to call a updateRequestProcessorChain during the /dataimport?

2012-11-16 Thread srinalluri
I have a new updateRequestProcessorChain called 'bodychain'. (Please note CountFieldValuesUpdateProcessorFactory is new in Solr 4). I want to call this bodychain during the dataimport. <updateRequestProcessorChain name="bodychain"> <processor

Bash Script to start delta import handler

2012-11-16 Thread Spadez
Hey guys, I am after a bash script (or python script) which I can use to trigger a delta import of XML files via CRON. After a bit of digging and modification I have this: Can I get any feedback on this? Is there a better way of doing it? Any optimisations or improvements would be most
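
The script from the original message is not preserved in this listing. As a rough Python sketch of the idea, a cron job can request the DataImportHandler URL with command=delta-import. The function names, the /dataimport handler path, and the host and core name below are assumptions; the handler path must match whatever is registered in solrconfig.xml.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def delta_import_url(core_url, handler="/dataimport", clean=False, commit=True):
    """Build the DataImportHandler URL that starts a delta import."""
    params = {
        "command": "delta-import",
        "clean": str(clean).lower(),    # keep existing documents by default
        "commit": str(commit).lower(),  # commit when the import finishes
    }
    return core_url.rstrip("/") + handler + "?" + urlencode(params)


def trigger_delta_import(core_url, timeout=30):
    """Send the request and return the HTTP status code (needs a running Solr)."""
    with urlopen(delta_import_url(core_url), timeout=timeout) as response:
        return response.status


# Hypothetical host and core name; only the URL is built here.
import_url = delta_import_url("http://localhost:8983/solr/collection1")
print(import_url)
```

A crontab entry would then run a script that calls trigger_delta_import at the desired interval. The request returns as soon as the import has started, so checking completion requires a separate status request to the same handler.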

Re: Solr Admin Page authentication

2012-11-16 Thread Marcin Rzewucki
Hi, Yes, I'm trying to add authentication to Jetty (for solr4), according to this wiki page: http://wiki.apache.org/solr/SolrSecurity Does it work for you ? On 16 November 2012 17:32, Michael Long ml...@bizjournals.com wrote: It doesn't... you would have to do this with jetty or tomcat. But I

Re: Solr filter using data from the database

2012-11-16 Thread Walter Underwood
Create an HTTP call backed by the database to fetch the list of valid vendors. Mark that response cacheable until the next refresh. Use an HTTP cache in case the database is temporarily unavailable. You don't really need a custom filter, you can list all the valid vendors in the filter query.
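
Listing the valid vendors in the filter query, as suggested above, amounts to building one fq clause from the cached vendor list. This is a minimal Python sketch; the function name vendor_filter_query and the field name vendor_id are hypothetical, since the thread's actual schema is not shown.

```python
def vendor_filter_query(vendor_ids, field="vendor_id"):
    """Build a filter query (fq) that matches any of the given vendors.

    Values are quoted as phrases; backslashes and double quotes are escaped.
    """
    if not vendor_ids:
        raise ValueError("at least one vendor is required")
    quoted = [
        '"' + v.replace("\\", "\\\\").replace('"', '\\"') + '"'
        for v in vendor_ids
    ]
    return field + ":(" + " OR ".join(quoted) + ")"


# Hypothetical vendor identifiers fetched from the cached HTTP response.
fq = vendor_filter_query(["acme", "globex"])
print(fq)
```

Because the clause only changes when the vendor list is refreshed, Solr can reuse the cached filter between refreshes. A very long vendor list may run into the maxBooleanClauses limit configured in solrconfig.xml.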

Re: Highlighting and storage overhead

2012-11-16 Thread Otis Gospodnetic
Hello, I prefer individual fields because this allows one to apply different query boosting and other nice (e)dismax things on different fields. With a catch-all field you lose that. Yes, to have highlighting you need to store fields you want to use for highlighting. See

Re: Solr filter using data from the database

2012-11-16 Thread Otis Gospodnetic
Hi, I'm actually not sure what Wunder is suggesting, but here is another way. Have an external app that talks to the DB either on demand or every N minutes/hours. When it talks to the DB it gets all merchants whose visibility flag was changed one way or the other since the last time the app

Re: Highlighting and storage overhead

2012-11-16 Thread Rajarshi Guha
Thanks for the pointer. If I were to use (e)dismax, is it possible to identify the field(s) that matched the query (irrespective of whether the fields are stored or not)? On Fri, Nov 16, 2012 at 9:10 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hello, I prefer individual fields

RE: Architecture Question

2012-11-16 Thread Cool Techi
Hi Otis, Thanks for your reply, just wanted to check what NoSql structure would be best suited to store data and use the least amount of memory, since for most of my work Solr would be sufficient and I want to store data just in case we want to reindex and as a backup. Regards, Ayush Date:

Question about Solr Cloud

2012-11-16 Thread Cool Techi
Hi, I have just started working with SolrCloud and have a few questions about it. 1) In the start script we provide the following; what is the purpose of providing this? -Dbootstrap_confdir=./solr/collection1/conf Since we don't yet have a config in ZooKeeper, this parameter