Re: Solr 4.0 - disappointing results sharding on 1 machine

2012-09-20 Thread Yonik Seeley
Depends on where the bottlenecks are I guess. On a single system, increasing shards decreases throughput (this isn't specific to Solr). The increased parallelism *can* decrease latency to the degree that the parts that were parallelized outweigh the overhead. Going from one shard to two shards

Re: Understanding fieldCache SUBREADER insanity

2012-09-19 Thread Yonik Seeley
The other thing to realize is that it's only insanity if it's unexpected or not-by-design (so the term is rather mis-named). It's more for core developers - if you are just using Solr without custom plugins, don't worry about it. -Yonik http://lucidworks.com On Wed, Sep 19, 2012 at 3:27 PM,

Re: Nodes cannot recover and become unavailable

2012-09-19 Thread Yonik Seeley
On Wed, Sep 19, 2012 at 4:25 PM, Mark Miller markrmil...@gmail.com wrote: bq. I believe there were some changes made to the clusterstate.json recently that are not backwards compatible. Indeed - I think yonik committed something the other day - we prob should send an email out about this.

Re: Understanding fieldCache SUBREADER insanity

2012-09-19 Thread Yonik Seeley
already-optimized, single-segment index That part is interesting... if true, then the type of insanity you saw should be impossible, and either the insanity detection or something else is broken. -Yonik http://lucidworks.com

SolrCloud clusterstate.json layout changes

2012-09-19 Thread Yonik Seeley
Folks, Some changes have been committed in the past few days related to SOLR-3815 as part of the groundwork for SOLR-3755 (shard splitting). The resulting clusterstate.json now looks like the following: {collection1:{ shard1:{ range:8000-,

Re: SolrCloud clusterstate.json layout changes

2012-09-19 Thread Yonik Seeley
On Wed, Sep 19, 2012 at 5:27 PM, Yonik Seeley yo...@lucidworks.com wrote: Folks, Some changes have been committed in the past few days related to SOLR-3815 as part of the groundwork for SOLR-3755 (shard splitting). The resulting clusterstate.json now looks like the following: {collection1

Re: SOLR memory usage jump in JVM

2012-09-18 Thread Yonik Seeley
On Tue, Sep 18, 2012 at 7:45 AM, Bernd Fehling bernd.fehl...@uni-bielefeld.de wrote: I used GC in different situations and tried back and forth. Yes, it reduces the used heap memory, but not by 5GB. Even so that GC from jconsole (or jvisualvm) is Full GC. Whatever Full GC means ;-) In the past

Re: FilterCache Memory consumption high

2012-09-17 Thread Yonik Seeley
On Mon, Sep 17, 2012 at 3:44 PM, Mike Schultz mike.schu...@gmail.com wrote: So I'm figuring 3MB per entry. With CacheSize=512 I expect something like 1.5GB of RAM, but with the server in steady state after 1/2 hour, it is 7GB larger than without the cache. Heap size and memory use aren't
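
The estimate quoted above (about 3MB per entry, cache size 512) can be checked with a few lines of arithmetic. This is only a sketch: it assumes every cached filter is a full bitset with one bit per document, and the helper names are made up for illustration.

```python
# Back-of-the-envelope filterCache sizing using the figures quoted above.
# Assumption: each cached filter is a full bitset of roughly 3 MB.
MB = 1024 * 1024


def filter_cache_bytes(entry_bytes: int, cache_size: int) -> int:
    """Upper bound on the heap held by a completely full filterCache."""
    return entry_bytes * cache_size


def docs_covered_by_bitset(entry_bytes: int) -> int:
    """A bitset spends one bit per document, so bytes * 8 = documents."""
    return entry_bytes * 8


if __name__ == "__main__":
    total = filter_cache_bytes(3 * MB, 512)
    print(f"full cache: {total / (1024 * MB):.2f} GB")
    print(f"docs per 3 MB bitset: {docs_covered_by_bitset(3 * MB):,}")
```

Under these assumptions a full cache accounts for about 1.5GB, so a 7GB difference would have to come from somewhere else.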

Re: [Solr4 beta] error 503 on commit

2012-09-12 Thread Yonik Seeley
On Tue, Sep 11, 2012 at 10:52 AM, Radim Kolar h...@filez.com wrote: After investigating more, here is the tomcat log herebelow. It is indeed the same problem: exceeded limit of maxWarmingSearchers=2,. could not be solr able to close oldest warming searcher and replace it by new one? That

Re: solr.StrField with stored=true useless or bad?

2012-09-11 Thread Yonik Seeley
On Tue, Sep 11, 2012 at 7:03 PM, sy...@web.de wrote: The purpose of stored=true is to store the raw string data besides the analyzed/transformed data for displaying purposes. This is fine for an analyzed solr.TextField, but for an StrField both values are the same. So is there any reason

Re: Unexpected results in Solr 4 Pivot Faceting

2012-09-07 Thread Yonik Seeley
On Fri, Sep 7, 2012 at 9:39 AM, Erik Hatcher erik.hatc...@gmail.com wrote: A trie field probably doesn't work properly, as it indexes multiple terms per value and you'd get odd values. I don't know about pivot faceting, but all of the other types of faceting take this into account (hence

Re: Solr 4.0alpha: edismax complaints on certain characters

2012-09-06 Thread Yonik Seeley
I believe this is caused by the regex support in https://issues.apache.org/jira/browse/LUCENE-2039 It certainly seems wrong to interpret a slash in the middle of the word as the start of a regex, so I've reopened the issue. -Yonik http://lucidworks.com On Thu, Sep 6, 2012 at 9:34 AM, Alexandre

Re: UnInvertedField limitations

2012-09-06 Thread Yonik Seeley
It's actually limited to 24 bits to point to the term list in a byte[], but there are 256 different arrays, so the maximum capacity is 4B bytes of un-inverted terms, but each bucket is limited to 4B/256 so the real limit can come in at a little less due to luck. From the comments: * There is
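
The capacity figures above follow directly from the pointer width and the number of arrays. A minimal sketch of the arithmetic (the constant names are illustrative and are not taken from the Solr source):

```python
# Capacity arithmetic for the limits described above.
POINTER_BITS = 24   # bits available to address a position inside one byte[]
NUM_ARRAYS = 256    # number of separate byte[] arrays ("buckets")

per_array_bytes = 1 << POINTER_BITS          # addressable bytes per array
total_bytes = per_array_bytes * NUM_ARRAYS   # theoretical maximum overall

if __name__ == "__main__":
    print(f"per array: {per_array_bytes:,} bytes")
    print(f"total:     {total_bytes:,} bytes")
```

Because each array is capped separately, an uneven distribution of terms can fill one array before the overall total is reached, which is the "luck" mentioned above.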

Re: Injest pauses

2012-08-29 Thread Yonik Seeley
On Wed, Aug 29, 2012 at 11:58 AM, Voth, Brad (GE Corporate) brad.v...@ge.com wrote: Anyone know the actual status of SOLR-2565, it looks to be marked as resolved in 4.* but I am still seeing long pauses during commits using 4.* SOLR-2565 is definitely committed - adds are no longer blocked by

Re: Ordering of fields

2012-08-29 Thread Yonik Seeley
In 4.0 you can use the def function with pseudo-fields (returning function results as doc field values) http://wiki.apache.org/solr/FunctionQuery#def fl=a,b,c:def(myfield,10) -Yonik http://lucidworks.com On Wed, Aug 29, 2012 at 2:39 PM, Rohit Harchandani rhar...@gmail.com wrote: Hi all, Is
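
A request using the `fl` value above could be built like this. It is a sketch only: the host, port and `q=*:*` query are assumptions for illustration, and nothing is sent over the network.

```python
from urllib.parse import urlencode, parse_qs

# Assumed base URL for a local Solr instance; adjust for a real deployment.
BASE_URL = "http://localhost:8983/solr/select"


def build_select_url(fl: str, q: str = "*:*") -> str:
    """Return a select URL with the field list properly URL-encoded."""
    return BASE_URL + "?" + urlencode({"q": q, "fl": fl})


if __name__ == "__main__":
    # c is a pseudo-field: the value of myfield, or 10 when myfield is missing.
    url = build_select_url("a,b,c:def(myfield,10)")
    print(url)
    # Decoding the query string gives back the original fl value.
    print(parse_qs(url.split("?", 1)[1])["fl"][0])
```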

Re: Sort on dynamic field

2012-08-16 Thread Yonik Seeley
On Thu, Aug 16, 2012 at 8:00 AM, Peter Kirk p...@alpha-solutions.dk wrote: Hi, a question about sorting and dynamic fields in Solr Specification Version: 3.6.0.2012.04.06.11.34.07. I have a field defined like <dynamicField name="*_int" type="int" indexed="true" stored="true" multiValued="false"/>

Re: Tlog vs. buffer + softcommit.

2012-08-10 Thread Yonik Seeley
On Fri, Aug 10, 2012 at 11:19 AM, Bing Hua bh...@cornell.edu wrote: Thanks for the information. It definitely helps a lot. There're numDeletesToKeep = 1000; numRecordsToKeep = 100; in UpdateLog so this should probably be what you're referring to. However when I was doing indexing the total

Re: Tuning caching of geofilt queries

2012-08-10 Thread Yonik Seeley
On Fri, Aug 10, 2012 at 1:47 PM, David Smiley (@MITRE.org) dsmi...@mitre.org wrote: Information I've read vary on exactly what is the accuracy of float vs double but at a kilometer there's no question a double is overkill. Back of the envelope: 23 mantissa bits + 1 implied bit == 24 effective
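
The 24-effective-bit estimate above can be checked numerically by measuring the gap between adjacent single-precision values near the largest coordinate magnitudes. This is a sketch under stated assumptions: the figure of roughly 111 km per degree is an approximation for latitude, and `float32_ulp` is a helper written for this illustration.

```python
import struct

KM_PER_DEGREE = 111.0  # rough approximation, assumed for illustration


def float32_ulp(x: float) -> float:
    """Gap between x (as a float32) and the next larger float32, for x > 0."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    current = struct.unpack("<f", struct.pack("<I", bits))[0]
    following = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return following - current


if __name__ == "__main__":
    for degrees in (90.0, 180.0):
        ulp = float32_ulp(degrees)
        meters = ulp * KM_PER_DEGREE * 1000
        print(f"near {degrees} degrees: step {ulp:.3e} deg, about {meters:.2f} m")
```

Under these assumptions the worst-case step of a float near 180 degrees is on the order of a couple of meters, far below a kilometer.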

Re: Documentation on the new updateLog transaction log feature?

2012-08-10 Thread Yonik Seeley
On Fri, Aug 10, 2012 at 2:31 PM, David Smiley (@MITRE.org) dsmi...@mitre.org wrote: Is there any documentation on the updateLog transaction log feature in Solr 4? Not much beyond what's in solrconfig.xml I started a quick prototype using Solr 4 alpha with a fairly structured schema; no big

Re: null:java.lang.RuntimeException: [was class java.net.SocketTimeoutException] null

2012-08-09 Thread Yonik Seeley
On Thu, Aug 9, 2012 at 10:11 AM, Markus Jelsma markus.jel...@openindex.io wrote: I've increased the connection time out on all 10 Tomcats from 1000ms to 5000ms. Indexing a larger amount of batches seems to run fine now. This, however, does not really answer the issue. What is exactly timing

Re: Tlog vs. buffer + softcommit.

2012-08-09 Thread Yonik Seeley
On Thu, Aug 9, 2012 at 5:39 PM, Bing Hua bh...@cornell.edu wrote: I'm a bit confused with the purpose of Transaction Logs (Update Logs) in Solr. My understanding is, update request comes in, first the new item is put in RAM buffer as well as T-Log. After a soft commit happens, the new item

Re: Recovery problem in solrcloud

2012-08-08 Thread Yonik Seeley
Stack trace looks normal - it's just a multi-term query instantiating a bitset. The memory is being taken up somewhere else. How many documents are in your index? Can you get a heap dump or use some other memory profiler to see what's taking up the space? if I stop query more then ten minutes,

Re: Syntax for parameter substitution in function queries?

2012-08-07 Thread Yonik Seeley
On Tue, Aug 7, 2012 at 3:01 PM, Timothy Hill timothy.d.h...@gmail.com wrote: Hello, all ... According to http://wiki.apache.org/solr/FunctionQuery/#What_is_a_Function.3F, it is possible under Solr 4.0 to perform parameter substitutions within function queries. However, I can't get the

Re: null:java.lang.RuntimeException: [was class java.net.SocketTimeoutException] null

2012-08-07 Thread Yonik Seeley
Could this be just a simple case of a socket timeout? Can you raise the timeout on request threads in Tomcat? It's a lot easier to reproduce/diagnose stuff like this when people use the stock jetty server shipped with Solr. -Yonik http://lucidimagination.com On Tue, Aug 7, 2012 at 5:39 PM,

Re: Urgent: Facetable but not Searchable Field

2012-08-01 Thread Yonik Seeley
On Wed, Aug 1, 2012 at 7:58 AM, jayakeerthi s mail2keer...@gmail.com wrote: We have a requirement, where we need to implement 2 fields as Facetable, but the values of the fields should not be Searchable. The user fields uf feature of the edismax parser may work for you:

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-24 Thread Yonik Seeley
On Tue, Jul 24, 2012 at 8:24 AM, Nagendra Nagarajayya nnagaraja...@transaxtions.com wrote: SolrIndexSearcher is a heavy object with caches, etc. As I've said, the caches are configurable, and it's trivial to disable all caching (to the point where the cache objects are not even created). The

Re: [Announce] Solr 4.0-ALPHA with RankingAlgorithm 1.4.4 with Realtime NRT available for download

2012-07-23 Thread Yonik Seeley
On Mon, Jul 23, 2012 at 11:37 AM, Nagendra Nagarajayya nnagaraja...@transaxtions.com wrote: Realtime NRT algorithm enables NRT functionality in Solr by not closing the Searcher object and so is very fast. I am in the process of contributing the algorithm back to Apache Solr as a patch. Since

Re: SOLR 4 Alpha Out Of Mem Err

2012-07-18 Thread Yonik Seeley
I think what makes the most sense is to limit the number of connections to another host. A host only has so many CPU resources, and beyond a certain point throughput would start to suffer anyway (and then only make the problem worse). It also makes sense in that a client could generate documents

Re: Error 404 on every request

2012-07-17 Thread Yonik Seeley
On Tue, Jul 17, 2012 at 6:01 AM, Nils Abegg nils.ab...@ffuf.de wrote: I have installed the 4.0 Alpha with the built-in Jetty server on Ubuntu Server 12.04… I followed this tutorial to set it up: http://kingstonlabs.blogspot.de/2012/06/installing-solr-36-on-ubuntu-1204.html Instead of trying to

Re: Computed fields - can I put a function in fl?

2012-07-16 Thread Yonik Seeley
On Mon, Jul 16, 2012 at 4:43 AM, maurizio1976 maurizio.picc...@gmail.com wrote: Yes, sorry Just a typo. I meant q=*:*&fq=&start=0&rows=10&qt=&wt=&explainOther=&fl=product:(if(show_product:true, product, ) thanks Functions normally derive their values from the fieldCache... there isn't currently

Re: SOLR 4 Alpha Out Of Mem Err

2012-07-15 Thread Yonik Seeley
Do you have the following hard autoCommit in your config (as the stock server does)? <autoCommit> <maxTime>15000</maxTime> <openSearcher>false</openSearcher> </autoCommit> This is now fairly important since Solr now tracks information on every uncommitted document added. At some

Re: SOLR 4 Alpha Out Of Mem Err

2012-07-15 Thread Yonik Seeley
On Sun, Jul 15, 2012 at 11:52 AM, Nick Koton nick.ko...@gmail.com wrote: Do you have the following hard autoCommit in your config (as the stock server does)? <autoCommit> <maxTime>15000</maxTime> <openSearcher>false</openSearcher> </autoCommit> I have tried with and without that

Re: SOLR 4 Alpha Out Of Mem Err

2012-07-15 Thread Yonik Seeley
On Sun, Jul 15, 2012 at 12:52 PM, Jack Krupansky j...@basetechnology.com wrote: Maybe your rate of update is so high that the commit never gets a chance to run. I don't believe that is possible. If it is, it should be fixed. -Yonik http://lucidimagination.com

Re: Is it possible to alias a facet field?

2012-07-14 Thread Yonik Seeley
On Sat, Jul 14, 2012 at 10:12 AM, Jamie Johnson jej2...@gmail.com wrote: So this got me close facet.field=testfield&facet.field=%7B!key=mylabel%7Dtestfield&f.mylabel.limit=1 but the limit on the alias didn't seem to work. Is this expected? Per-field params don't currently look under the

Re: Updating documents

2012-07-13 Thread Yonik Seeley
On Fri, Jul 13, 2012 at 1:41 PM, Jonatan Fournier jonatan.fourn...@gmail.com wrote: On Fri, Jul 13, 2012 at 12:57 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Thu, Jul 12, 2012 at 3:20 PM, Jonatan Fournier jonatan.fourn...@gmail.com wrote: Is there a flag for: if document does

Re: Updating documents

2012-07-13 Thread Yonik Seeley
On Fri, Jul 13, 2012 at 3:50 PM, Jonatan Fournier jonatan.fourn...@gmail.com wrote: On Thu, Jul 12, 2012 at 3:20 PM, Jonatan Fournier jonatan.fourn...@gmail.com wrote: But later on when I want to append cat3 to the field by doing this: mv_f:{add:cat3}, ... I end up with something like this

Re: Updating documents

2012-07-12 Thread Yonik Seeley
On Thu, Jul 12, 2012 at 12:38 PM, Jonatan Fournier jonatan.fourn...@gmail.com wrote: On Thu, Jul 12, 2012 at 11:05 AM, Erick Erickson The partial documents update that Jonatan references also requires that all the fields be stored. If my only fields with stored=false are copyField (e.g. I

Re: Updating documents

2012-07-12 Thread Yonik Seeley
On Thu, Jul 12, 2012 at 3:20 PM, Jonatan Fournier jonatan.fourn...@gmail.com wrote: Is there a flag for: if document does not exist, create it for me? Not currently, but it certainly makes sense. The implementation should be easy. The most difficult part is figuring out the best syntax to

Re: Solr 4.0 Alpha taking lot of CPU

2012-07-11 Thread Yonik Seeley
On Wed, Jul 11, 2012 at 8:11 PM, Pavitar Singh psi...@sprinklr.com wrote: We upgraded to Solr 4.0 Alpha and our CPU usage shot up to 400%. In profiling we are getting the following trace. That could either be good or bad. Higher CPU can mean higher concurrency. Have you benchmarked your indexing

Re: Nrt and caching

2012-07-07 Thread Yonik Seeley
On Sat, Jul 7, 2012 at 9:59 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Currently the caches are stored per-multiple-segments, meaning after each 'soft' commit, the cache(s) will be purged. Depends which caches. Some caches are per-segment, and some caches are top level. It's also a

Re: deleteById commitWithin question

2012-07-05 Thread Yonik Seeley
On Thu, Jul 5, 2012 at 4:29 PM, Jamie Johnson jej2...@gmail.com wrote: I am running off of a snapshot taken 5/3/2012 of solr 4.0 and am noticing some issues around deleteById when a commitWithin parameter is included using SolrJ, specifically commit isn't executed.  If I later just call commit

Re: SolrCloud cache warming issues

2012-06-27 Thread Yonik Seeley
On Tue, Jun 26, 2012 at 6:53 AM, Markus Jelsma markus.jel...@openindex.io wrote: Why would the documentCache not be populated via firstSearcher warming queries with a non-zero value for rows? Solr streams documents (the stored fields) returned to the user (so very large result sets can be

Re: SolrCloud cache warming issues

2012-06-27 Thread Yonik Seeley
On Wed, Jun 27, 2012 at 12:23 PM, Erik Hatcher erik.hatc...@gmail.com wrote: On Jun 27, 2012, at 12:01 , Yonik Seeley wrote: On Tue, Jun 26, 2012 at 6:53 AM, Markus Jelsma markus.jel...@openindex.io wrote: Why would the documentCache not be populated via firstSearcher warming queries

Re: Trying to avoid filtering on score, as I'm told that's bad

2012-06-27 Thread Yonik Seeley
On Wed, Jun 27, 2012 at 6:50 PM, mcb thestreet...@gmail.com wrote: I have a function query that returns miles as a score along two points: q={!func}sub(sum(geodist(OriginCoordinates,39,-105),geodist(DestinationCoordinates,36,-97),Mileage),1000) The issue that I'm having now now my results

Re: How to update one field without losing the others?

2012-06-16 Thread Yonik Seeley
Atomic update is a very new feature coming in 4.0 (i.e. grab a recent nightly build to try it out). It's not documented yet, but here's the JIRA issue:

Re: [Announce] Solr 3.6 with RankingAlgorithm 1.4.2 - NRT support

2012-05-27 Thread Yonik Seeley
On Sun, May 27, 2012 at 11:57 AM, Radim Kolar h...@filez.com wrote: but i see RankingAlgorithm has fantastic results too and looking at its reference page it even powers sites like oracle.com and ebay.com. What reference page are you referring to? -Yonik http://lucidimagination.com

Re: [Announce] Solr 3.6 with RankingAlgorithm 1.4.2 - NRT support

2012-05-27 Thread Yonik Seeley
On Sun, May 27, 2012 at 12:42 PM, Radim Kolar h...@filez.com wrote: What reference page are you referring to? http://tgels.com/wiki/en/Sites_using/downloaded_RankingAlgorithm_or_Solr-RA Ah, ok sites using/downloaded So someone with a .oracle email / domain checked it out - that certainly

Re: What is the docs number in Solr explain query results for fieldnorm?

2012-05-25 Thread Yonik Seeley
On Fri, May 25, 2012 at 2:13 PM, Tom Burton-West tburt...@umich.edu wrote: The explain (debugQuery) shows the following for fieldnorm:  0.625 = fieldNorm(field=ocr, doc=16624) What does the doc=16624 mean? It's the internal document id (i.e. it's debugging info and doesn't affect scoring)

Re: How many doc/doc in the XML source file before indexing?

2012-05-24 Thread Yonik Seeley
On Thu, May 24, 2012 at 7:29 AM, Michael Kuhlmann k...@solarier.de wrote: However, I doubt it. I've not been too deeply into the UpdateHandler yet, but I think it first needs to parse the complete XML file before it starts to index. Solr's update handlers all stream (XML, JSON, CSV), reading

Re: Update JSON not working for me

2012-05-16 Thread Yonik Seeley
On Wed, May 16, 2012 at 1:43 PM, rjain15 rjai...@gmail.com wrote: http://localhost:8983/solr/select?q=title:monsters&wt=json&indent=true Try switching title:monsters to name:monsters https://issues.apache.org/jira/browse/SOLR-2598 Looks like the data was changed to use the name field instead and

Re: Update JSON not working for me

2012-05-16 Thread Yonik Seeley
On Wed, May 16, 2012 at 2:36 PM, rjain15 rjai...@gmail.com wrote: No. Changing to name:monsters didn't work OK, but you'll have to do that if you get the other part working. Here is my guess, the UpdateJSON is not adding any new documents to the existing index. If that's true, the most

Re: Update JSON not working for me

2012-05-16 Thread Yonik Seeley
On Wed, May 16, 2012 at 4:10 PM, rjain15 rjai...@gmail.com wrote: Hi Firstly, apologies for the long post, I changed the quote to double quote (and sometimes it is messy copying from DOS windows) Here is the command and the output on the Jetty Server Window. I am highlighting some important

Re: Problems with field names in solr functions

2012-05-14 Thread Yonik Seeley
In trunk, see: * SOLR-2335: New 'field(...)' function syntax for referring to complex field names (containing whitespace or special characters) in functions. The schema in trunk also specifies: <!-- field names should consist of alphanumeric or underscore characters only and not start

Re: Update JSON not working for me

2012-05-14 Thread Yonik Seeley
I think this may be due to https://issues.apache.org/jira/browse/SOLR-2857 JIRA is down right now so I can't check, but I thought the intent was to have some back compat. Try changing the URL from /update/json to just /update in the meantime -Yonik http://lucidimagination.com On Mon, May 14,

Re: Update JSON not working for me

2012-05-14 Thread Yonik Seeley
:58 PM, Yonik Seeley yo...@lucidimagination.com wrote: I think this may be due to https://issues.apache.org/jira/browse/SOLR-2857 JIRA is down right now so I can't check, but I thought the intent was to have some back compat. Try changing the URL from /update/json to just /update

Re: 1MB file to Zookeeper

2012-05-05 Thread Yonik Seeley
On Sat, May 5, 2012 at 8:39 AM, Jan Høydahl jan@cominvent.com wrote: support for CouchDb, Voldemort or whatever. Hmmm... Or Solr! -Yonik

Re: 1MB file to Zookeeper

2012-05-04 Thread Yonik Seeley
On Fri, May 4, 2012 at 12:50 PM, Mark Miller markrmil...@gmail.com wrote: And how should we detect if data is compressed when reading from ZooKeeper? I was thinking we could somehow use file extensions? eg synonyms.txt.gzip - then you can use different compression algs depending on the

Re: solr: how to change display name of a facet?

2012-05-03 Thread Yonik Seeley
On Thu, May 3, 2012 at 2:26 PM, okayndc bodymo...@gmail.com wrote: [...] I've experimented with this: <str name="facet.field">{!ex=dt key=Categories and Stuff}category</str> I'm not really sure what 'ex=dt' does but it's obvious that 'key' is the desired display name? If there are spaces in the

Re: access document by primary key

2012-05-03 Thread Yonik Seeley
On Thu, May 3, 2012 at 3:01 PM, Tomás Fernández Löbbe tomasflo...@gmail.com wrote: Is this still true? Assuming that I know that there hasn't been updates or that I don't care to see a different version of the document, are the term QP or the raw QP faster than the real-time get handler? Sort

Re: NPE when faceting

2012-05-01 Thread Yonik Seeley
Darn... looks likely that it's another bug from when part of UnInvertedField was refactored into Lucene. We really need some random tests that can catch bugs like these though - I'll see if I can reproduce. Can you open a JIRA issue for this? -Yonik lucenerevolution.com - Lucene/Solr Open Source

Re: commit fail

2012-04-28 Thread Yonik Seeley
On Sat, Apr 28, 2012 at 7:02 AM, mav.p...@holidaylettings.co.uk mav.p...@holidaylettings.co.uk wrote: Hi, This is what the thread dump looks like. Any ideas? Looks like the thread taking up CPU is in LukeRequestHandler 1062730578@qtp-1535043768-5' Id=16, RUNNABLE on lock=, total cpu

Re: Recovery - too many updates received since start

2012-04-27 Thread Yonik Seeley
On Tue, Apr 24, 2012 at 9:31 AM, Trym R. Møller t...@sigmat.dk wrote: Hi I experience that a Solr loses its connection with Zookeeper and re-establishes it. After Solr has reconnected to Zookeeper it begins to recover. It has been missing the connection approximately 10 seconds and meanwhile

Re: commit stops

2012-04-27 Thread Yonik Seeley
On Fri, Apr 27, 2012 at 9:18 AM, mav.p...@holidaylettings.co.uk mav.p...@holidaylettings.co.uk wrote: We have an index of about 3.5gb which seems to work fine until it suddenly stops accepting new commits. Users can still search on the front end but nothing new can be committed and it

Re: commit fail

2012-04-27 Thread Yonik Seeley
On Fri, Apr 27, 2012 at 8:23 PM, mav.p...@holidaylettings.co.uk mav.p...@holidaylettings.co.uk wrote: Hi again, This is the only log entry I can find, regarding the failed commits… Still timing out as far as the client is concerned and there is actually nothing happening on the server in

Re: embedded solr populating field of type LatLonType

2012-04-25 Thread Yonik Seeley
On Tue, Apr 24, 2012 at 4:05 PM, Jason Cunning jcunn...@ucar.edu wrote: My question is, what is the AppropriateJavaType for populating a solr field of type LatLonType? A String with both the lat and lon separated by a comma. Example: 12.34,56.78 -Yonik lucenerevolution.com - Lucene/Solr Open
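
The "lat,lon" string format described above can be produced with a small helper. This is a sketch: the function name and the range checks are illustrative assumptions, and the resulting string is what would be passed as the field value when the document is built.

```python
def latlon_value(lat: float, lon: float) -> str:
    """Format a coordinate pair as the comma-separated string described above."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    # Latitude first, then longitude, separated by a single comma.
    return f"{lat},{lon}"


if __name__ == "__main__":
    print(latlon_value(12.34, 56.78))
```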

Re: Title Boosting and IDF

2012-04-25 Thread Yonik Seeley
On Wed, Apr 25, 2012 at 9:24 PM, Walter Underwood wun...@wunderwood.org wrote: Interestingly, I worked at two different web search companies with two different completely different search engines, and one arrived at an 8X title boost and the other at a 7.5X title boost. So I consider 8X a

searcher leak on trunk after 2/1/2012

2012-04-22 Thread Yonik Seeley
Folks, If you're using a trunk version after 2/1/2012 in conjunction with the shipped solrconfig.xml (which uses openSearcher=false in an autoCommit by default), then you should upgrade to a new version. There's a searcher leak when openSearcher=false is used with a commit that leads to files not

Re: # open files with SolrCloud

2012-04-21 Thread Yonik Seeley
I can reproduce some kind of searcher leak issue here, even w/o SolrCloud, and I've opened https://issues.apache.org/jira/browse/SOLR-3392 -Yonik lucenerevolution.com - Lucene/Solr Open Source Search Conference. Boston May 7-10

Re: Solr Hanging

2012-04-19 Thread Yonik Seeley
On Thu, Apr 19, 2012 at 4:25 AM, Trym R. Møller t...@sigmat.dk wrote: Hi I am using Solr trunk and have 7 Solr instances running with 28 leaders and 28 replicas for a single collection. After indexing a while (a couple of days) the solrs start hanging and doing a thread dump on the jvm I see

Re: Distributed FacetComponent NullPointer Exception

2012-04-17 Thread Yonik Seeley
facet.field={!terms=$organization__terms}organization This is referring to another request parameter that Solr should have added (organization__terms) . Did you cut-n-paste all of the parameters below? -Yonik lucenerevolution.com - Lucene/Solr Open Source Search Conference. Boston May 7-10

Re: Problem with faceting on a boolean field

2012-04-17 Thread Yonik Seeley
On Tue, Apr 17, 2012 at 2:22 PM, Kissue Kissue kissue...@gmail.com wrote: Hi, I am faceting on a boolean field called usedItem. There are a total of 607601 items in the index and they all have value for usedItem set to false. However when i do a search for *:* and faceting on usedItem, the

Re: Changing precisionStep without a re-index

2012-04-16 Thread Yonik Seeley
On Mon, Apr 16, 2012 at 12:12 PM, Michael Ryan mr...@moreover.com wrote: Is it safe to change the precisionStep for a TrieField without doing a re-index? Not really - it changes what tokens are indexed for the numbers, and range queries won't work correctly. Sorting (FieldCache), function

Re: DeleteByQuery using xml commands in SolrCloud

2012-04-16 Thread Yonik Seeley
On Mon, Apr 16, 2012 at 4:13 PM, Jamie Johnson jej2...@gmail.com wrote: I tried to execute the following on my cluster, but it had no results.  Should this work? curl http://host:port/solr/collection1/update/?commit=true -H "Contenet-Type: text/xml" --data-binary '<delete><query>*:*</query></delete>'

Re: Can Solr solve this simple problem?

2012-04-16 Thread Yonik Seeley
2012/4/16 Tomás Fernández Löbbe tomasflo...@gmail.com: I'm wondering if Solr is the best tool for this kind of usage. Solr is a text search engine Well, Lucene is a full-text search library, but Solr has always been far more. Dating back to its first use in CNET, it was used as a browse engine

Re: solr 3.5 taking long to index

2012-04-15 Thread Yonik Seeley
On Thu, Apr 12, 2012 at 10:42 PM, Rohit ro...@in-rev.com wrote: The machine has a total RAM of around 46GB. My biggest concern is Solr index time gradually increasing and then the commit stops because of timeouts, our commit rate is very high, but I am not able to find the root cause of the

Re: It's hard to google on _val_

2012-04-15 Thread Yonik Seeley
On Sun, Apr 15, 2012 at 11:34 AM, Benson Margulies bimargul...@gmail.com wrote: So, I've been experimenting to learn how the _val_ participates in scores. It seems to me that http://wiki.apache.org/solr/FunctionQuery should explain the *effect* of including an _val_ term in an ordinary query,

Re: It's hard to google on _val_

2012-04-15 Thread Yonik Seeley
On Sun, Apr 15, 2012 at 12:14 PM, Yonik Seeley yo...@lucidimagination.com wrote: That's just because Lucene normalizes scores.  By default, this is really just multiplying scores by a magic constant (that by default is the inverse of the sum of squared weights) Sorry... I missed the square

Re: performance impact using string or float when querying ranges

2012-04-13 Thread Yonik Seeley
On Fri, Apr 13, 2012 at 8:11 AM, Erick Erickson erickerick...@gmail.com wrote: Well, I guess my first question is whether using strings is fast enough, in which case there's little reason to make your life more complex. But yes, range queries will be significantly faster with any of the Trie

Re: solr 3.4 with nTiers = 2: usage of ids param causes NullPointerException (NPE)

2012-04-12 Thread Yonik Seeley
On Wed, Apr 11, 2012 at 8:16 AM, Dmitry Kan dmitry@gmail.com wrote: We have a system with nTiers, that is: Solr front base --- Solr front -- shards Although the architecture had this in mind (multi-tier), all of the pieces are not yet in place to allow it. The errors you see are a direct

Re: I've broken delete in SolrCloud and I'm a bit clueless as to how

2012-04-12 Thread Yonik Seeley
On Thu, Apr 12, 2012 at 2:21 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : Please see the documentation: http://wiki.apache.org/solr/SolrCloud#Required_Config : : schema.xml : : You must have a _version_ field defined: : : <field name="_version_" type="long" indexed="true" stored="true"/>

Re: SOLR 4 autocommit - is it working as I think it should?

2012-04-11 Thread Yonik Seeley
On Wed, Apr 11, 2012 at 12:58 PM, vybe3142 vybe3...@gmail.com wrote: This morning, I've been looking at the autocommit functionality as defined in solrconfig.xml. By default, it appears that it should kick in 15 seconds after a new document has been added. I do see this event triggered via the

Re: SOLR issue - too many search queries

2012-04-10 Thread Yonik Seeley
On Tue, Apr 10, 2012 at 8:51 AM, arunssasidhar arunssasid...@gmail.com wrote: We have a PHP web application which is using SOLR for searching. The APP is using CURL to connect to the SOLR server and which run in a loop with thousands of predefined keywords. That will create thousands of

Re: SolrCloud replica and leader out of Sync somehow

2012-04-05 Thread Yonik Seeley
are not guaranteed to be the same across shards should the sorting use the uniqueId field as the tie breaker by default? On Tue, Mar 20, 2012 at 2:10 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Tue, Mar 20, 2012 at 2:02 PM, Jamie Johnson jej2...@gmail.com wrote: I'll try to dig for the JIRA.  Also

Re: Evaluating Solr

2012-04-04 Thread Yonik Seeley
On Wed, Apr 4, 2012 at 12:46 PM, Joseph Werner telco...@gmail.com wrote: For more routine changes, are record updates supported without the necessity to rebuild an index? For example if a description field for an item needs to be changed, am I correct in reading that the record need only be

Re: solrcloud is deleteByQuery stored in transactions and forwarded like other operations?

2012-04-04 Thread Yonik Seeley
On Wed, Apr 4, 2012 at 3:04 PM, Jamie Johnson jej2...@gmail.com wrote: Thanks Mark.  The delete by query is a very rare operation for us and I really don't have the liberty to update to current trunk right now. Do you happen to know about when the fix was made so I can see if we are before or

Re: Incremantally updating a VERY LARGE field - Is this possibe ?

2012-04-04 Thread Yonik Seeley
On Wed, Apr 4, 2012 at 3:14 PM, vybe3142 vybe3...@gmail.com wrote: Updating a single field is not possible in solr.  The whole record has to be rewritten. Unfortunate. Lucene allows it. I think you're mistaken - the same limitations apply to Lucene. -Yonik lucenerevolution.com - Lucene/Solr

Re: How do I use localparams/joins using SolrJ and/or the Admin GUI

2012-03-31 Thread Yonik Seeley
On Sat, Mar 31, 2012 at 11:50 AM, Erick Erickson erickerick...@gmail.com wrote: Try escaping the '+' with %2B (as I remember). Shouldn't that be the other way? The admin UI should do any necessary escaping, so those + chars should instead be spaces? -Yonik lucenerevolution.com - Lucene/Solr

Re: SOLR hangs - update timeout - please help

2012-03-29 Thread Yonik Seeley
On Thu, Mar 29, 2012 at 4:24 AM, Lance Norskog goks...@gmail.com wrote: 5-7 seconds- there's the problem. If you want to have documents visible for search within that time, you want to use the trunk and near-real-time search. A hard commit does several hard writes to the disk (with the fsync()

Re: SOLR hangs - update timeout - please help

2012-03-29 Thread Yonik Seeley
On Thu, Mar 29, 2012 at 1:50 PM, Rafal Gwizdala rafal.gwizd...@gmail.com wrote: Below i'm pasting the thread dump taken when the update was hung (it's also attached to the first message of this topic) Interesting... It looks like there's only one thread in solr code (the one generating the

Re: SOLR hangs - update timeout - please help

2012-03-29 Thread Yonik Seeley
Oops... my previous replies accidentally went off-list. I'll cut-n-paste below. OK, so it looks like there is probably no bug here - it's simply that commits can sometimes take a long time and updates were blocked during that time (and would have succeeded eventually except the jetty timeout was

Re: bbox query and range queries

2012-03-29 Thread Yonik Seeley
On Thu, Mar 29, 2012 at 6:20 PM, Alexandre Rocco alel...@gmail.com wrote: http://localhost:8984/solr/select?q=*:*&fq=local:[-23.6677,-46.7315 TO -23.6709,-46.7261] Range queries always need to be [lower_bound TO upper_bound]. Try http://localhost:8984/solr/select?q=*:*&fq=local:[-23.6709,-46.7315
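The [lower_bound TO upper_bound] rule above can be sketched as a small helper that accepts two arbitrary corner points and orders each axis so the smaller value comes first. This is only an illustration: the class and method names are hypothetical and not part of Solr, the field name `local` is taken from the query in this thread, and boxes that cross the 180-degree meridian are not handled.

```java
// Hypothetical helper: builds a lat,lon range filter with the lower bound first.
class BBoxRangeQuery {
    static String build(String field, double lat1, double lon1, double lat2, double lon2) {
        // Order each axis independently so the first point is the lower bound.
        double minLat = Math.min(lat1, lat2);
        double maxLat = Math.max(lat1, lat2);
        double minLon = Math.min(lon1, lon2);
        double maxLon = Math.max(lon1, lon2);
        return field + ":[" + minLat + "," + minLon + " TO " + maxLat + "," + maxLon + "]";
    }

    public static void main(String[] args) {
        // Corner points as given in the original (mis-ordered) query.
        System.out.println(build("local", -23.6677, -46.7315, -23.6709, -46.7261));
    }
}
```

The resulting string still needs URL encoding before it is placed in an fq parameter.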

Re: bbox query and range queries

2012-03-29 Thread Yonik Seeley
On Thu, Mar 29, 2012 at 6:44 PM, Alexandre Rocco alel...@gmail.com wrote: Yonik, Thanks for the heads-up. That one worked. Just trying to wrap around how it would work on a real case. To test this one I just got the coordinates from Google Maps and searched within the pair of coordinates as

Re: Optimizing in SolrCloud

2012-03-29 Thread Yonik Seeley
On Thu, Mar 29, 2012 at 7:15 PM, Jamie Johnson jej2...@gmail.com wrote: Thanks, does it matter that we are also making updates to documents at various times? Do the deleted documents get removed when doing a merge or does that only get done on an optimize? Yes, any merge removes documents that have

Re: NullPointException when Faceting

2012-03-29 Thread Yonik Seeley
On Thu, Mar 29, 2012 at 6:33 PM, Jamie Johnson jej2...@gmail.com wrote: I recently got this stack trace when trying to execute a facet based query on my index.  The error went away when I did an optimize but I was surprised to see it at all.  Can anyone shed some light on why this may have

Re: SolrCloud replica and leader out of Sync somehow

2012-03-20 Thread Yonik Seeley
On Tue, Mar 20, 2012 at 11:17 AM, Jamie Johnson jej2...@gmail.com wrote: ok, with my custom component out of the picture I still have the same issue.  Specifically, when sorting by score on a leader and replica I am getting different doc orderings.  Is this something anyone has seen? This is

Re: SolrCloud replica and leader out of Sync somehow

2012-03-20 Thread Yonik Seeley
On Tue, Mar 20, 2012 at 11:39 AM, Jamie Johnson jej2...@gmail.com wrote: Hmmm... OK, I don't see how it's possible for me to ensure that there are no ties. If a query were for *:* everything has a constant score, if the user requested 1 page then requested the next the results on the second

Re: Multi-valued polyfields - Do they exist in the wild ?

2012-03-20 Thread Yonik Seeley
On Tue, Mar 20, 2012 at 2:17 PM, ramdev.wud...@thomsonreuters.com wrote: Hi: We have been keen on using polyfields for a while. But we have been restricted from using them because they do not seem to support multi-values (yet). Poly-fields should support multi-values, it's more what uses

Re: Is there a way for SOLR / SOLRJ to index files directly bypassing HTTP streaming?

2012-03-19 Thread Yonik Seeley
On Mon, Mar 19, 2012 at 4:38 PM, vybe3142 vybe3...@gmail.com wrote: Okay, I added the javabin handler snippet to the solrconfig.xml file (actually shared across all cores). I got further (the request made it past tomcat and into SOLR) but haven't quite succeeded yet. Server trace: Mar 19,

Re: Is there a way for SOLR / SOLRJ to index files directly bypassing HTTP streaming?

2012-03-19 Thread Yonik Seeley
On Mon, Mar 19, 2012 at 5:48 PM, vybe3142 vybe3...@gmail.com wrote: Thanks for the response. No, the file is plain text. All I'm trying to do is index plain ASCII text files via a remote reference to their file paths. The XML update handler expects a specific format of XML. The json, CSV,

Re: 400 Error adding field 'tags'='[a,b,c]'

2012-03-13 Thread Yonik Seeley
Hmmm, this looks like it's generated by DocumentBuilder with the code catch( Exception ex ) { throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "ERROR: "+getID(doc, schema)+"Error adding field '" + field.getName() + "'='" +field.getValue()+"'", ex );
