Re: 400 Error adding field 'tags'='[a,b,c]'

2012-03-13 Thread Yonik Seeley
On Tue, Mar 13, 2012 at 7:18 PM, Yonik Seeley yo...@lucidimagination.com wrote: Hmmm, this looks like it's generated by DocumentBuilder with the code catch( Exception ex ) { throw new SolrException

Re: MISSING LICENSE

2012-03-12 Thread Yonik Seeley
Over-aggressive license checking code doesn't like jars in extraneous directories (like the work directory that the war is exploded into under exampleB). Delete exampleB and the build should work. -Yonik lucenerevolution.com - Lucene/Solr Open Source Search Conference. Boston May 7-10 On Mon,

Re: Too many values for UnInvertedField faceting on field topic

2012-03-01 Thread Yonik Seeley
On Thu, Mar 1, 2012 at 3:34 AM, Michael Jakl jakl.mich...@gmail.com wrote: The topic field holds roughly 5 values per doc, but I wasn't able to compute the correct number right now. How many unique values for that field in the whole index? If you have log output (or output from the stats page

Re: [SolrCloud] Too many open files - internal server error

2012-02-29 Thread Yonik Seeley
On Wed, Feb 29, 2012 at 10:32 AM, Markus Jelsma markus.jel...@openindex.io wrote: The Linux machines have proper settings for ulimit and friends, 32k open files allowed Maybe you can expand on this point. cat /proc/sys/fs/file-max cat /proc/sys/fs/nr_open Those take precedence over ulimit.
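A minimal Python sketch of the check suggested above: it reads the two kernel settings next to the per-process limit that ulimit reports, so all of them can be compared in one place. The function names are made up for this illustration, and the /proc paths exist only on Linux (elsewhere the sketch returns None for them).

```python
import resource
from pathlib import Path


def read_kernel_limit(path):
    """Return the integer stored in a /proc-style file, or None if unreadable."""
    try:
        return int(Path(path).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def open_file_limits(file_max_path="/proc/sys/fs/file-max",
                     nr_open_path="/proc/sys/fs/nr_open"):
    """Collect the per-process rlimit and the two kernel settings named above."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return {
        "ulimit_soft": soft,   # what `ulimit -n` shows for this process
        "ulimit_hard": hard,
        "file_max": read_kernel_limit(file_max_path),
        "nr_open": read_kernel_limit(nr_open_path),
    }


if __name__ == "__main__":
    for name, value in open_file_limits().items():
        print(f"{name}: {value}")
```

If file_max or nr_open comes out lower than the 32k configured through ulimit, that lower value is the one to look at first.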

Re: SolrCloud on Trunk

2012-02-29 Thread Yonik Seeley
On Thu, Mar 1, 2012 at 12:27 AM, Jamie Johnson jej2...@gmail.com wrote: Is there a ticket around doing this? Around splitting shards? The easiest thing to consider is just splitting a single shard in two reusing some of the existing buffering/replication mechanisms we have. 1) create two new

Re: Solr Performance Improvement and degradation Help

2012-02-26 Thread Yonik Seeley
On Sun, Feb 26, 2012 at 3:32 PM, Erick Erickson erickerick...@gmail.com wrote: Would you hypothesize that lazy field loading could be that much slower if a large fraction of fields were selected? If you actually use the lazy field later, it will cause an extra read for each field. If you don't

Re: upgrading Solr - org.apache.lucene.search.Filter and acceptDocs

2012-02-25 Thread Yonik Seeley
On Sat, Feb 25, 2012 at 3:16 PM, Jamie Johnson jej2...@gmail.com wrote: I'm trying to upgrade an application I have from an old snapshot of solr to the latest stable trunk and see that the constructor for Filter has changed, specifically there is another parameter named acceptDocs, the API

Re: upgrading Solr - org.apache.lucene.search.Filter and acceptDocs

2012-02-25 Thread Yonik Seeley
On Sat, Feb 25, 2012 at 3:37 PM, Jamie Johnson jej2...@gmail.com wrote:  I.e. just do if(!acceptDocs.get(doc)) return false; at the top? Yep, that should do it. -Yonik lucenerevolution.com - Lucene/Solr Open Source Search Conference. Boston May 7-10
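The check discussed above can be sketched in Python as a stand-in for the Java filter. This is not the Lucene API: accept_docs is modeled as a plain list of booleans, predicate is a hypothetical per-document test, and treating None as "no restriction" is an assumption of this sketch.

```python
def filter_matches(doc, accept_docs, predicate):
    """Return True only if the doc is accepted and passes the filter's own test."""
    # Same idea as `if (!acceptDocs.get(doc)) return false;` at the top.
    if accept_docs is not None and not accept_docs[doc]:
        return False
    return predicate(doc)


if __name__ == "__main__":
    accept = [True, False, True]          # doc 1 is not accepted (e.g. deleted)
    is_even = lambda doc: doc % 2 == 0    # hypothetical filter logic
    print([filter_matches(d, accept, is_even) for d in range(3)])
```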

Re: Solr 4.0 Question

2012-02-25 Thread Yonik Seeley
On Sat, Feb 25, 2012 at 3:39 PM, Jamie Johnson jej2...@gmail.com wrote: Unfortunately, Apache Solr still uses this horrible code in a lot of places, leaving us with a major piece of work undone. Major parts of Solr’s facetting and filter caching need to be rewritten to work per atomic segment!

Re: Solr Transaction Log Question

2012-02-25 Thread Yonik Seeley
On Sat, Feb 25, 2012 at 11:30 PM, Jamie Johnson jej2...@gmail.com wrote: How large will the transaction log grow, and how long should it be kept around? We keep around enough logs to satisfy a minimum of 100 updates lookback. Unneeded log files are deleted automatically. When a hard commit is
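As a rough illustration of the "minimum of 100 updates lookback" rule, here is a toy Python model. It is not Solr's implementation: the function name and the list-of-counts input are invented for the example, and any other retention rules Solr may apply are ignored.

```python
def logs_to_keep(update_counts_newest_first, min_lookback=100):
    """Return how many of the newest log files are needed to cover min_lookback updates."""
    kept = 0
    covered = 0
    for count in update_counts_newest_first:
        if covered >= min_lookback:
            break  # older logs are no longer needed and could be deleted
        covered += count
        kept += 1
    return kept


if __name__ == "__main__":
    # Four log files, newest first, with the number of updates in each.
    print(logs_to_keep([60, 30, 20, 500]))
```

In this model the fourth log (500 updates) is unneeded because the three newest already cover 110 updates.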

Re: Unique key constraint and optimistic locking (versioning)

2012-02-24 Thread Yonik Seeley
On Fri, Feb 24, 2012 at 6:55 AM, Em mailformailingli...@yahoo.de wrote: However, regarding a versioning-system, one always has to keep in mind that an uncommited document is not guaranteed to be persisted in the index. We now have durability via an update log. With a recent nightly trunk build,

Re: Unique key constraint and optimistic locking (versioning)

2012-02-24 Thread Yonik Seeley
On Fri, Feb 24, 2012 at 9:04 AM, Per Steffensen st...@designware.dk wrote: Cool. We have a test doing exactly that - indexing 2000 documents into Solr, kill-9'ing Solr in the middle of the process, starting Solr again and checking that 2000 documents will eventually be searchable. It lights red

Re: Unique key constraint and optimistic locking (versioning)

2012-02-24 Thread Yonik Seeley
On Fri, Feb 24, 2012 at 8:59 AM, Per Steffensen st...@designware.dk wrote: We might make it outside Solr/Lucene but I hope to be able to convince my ProductOwner to make it as a Solr-feature contributing it back - especially if the Solr community agrees that it would be a nice and commonly

Re: Solr Performance Improvement and degradation Help

2012-02-24 Thread Yonik Seeley
On Fri, Feb 24, 2012 at 10:25 AM, naptowndev naptowndev...@gmail.com wrote: Our current config for that is as follows: documentCache class=*solr.LRUCache* size=*15000* initialSize=*15000*autowarmCount =*0* / It's the same for both instances I assume the asterisks are for emphasis and are

Re: Solr Performance Improvement and degradation Help

2012-02-24 Thread Yonik Seeley
On Fri, Feb 24, 2012 at 11:24 AM, naptowndev naptowndev...@gmail.com wrote: Another question I have is regarding solr.LRUCache vs. solr.FastLRUCache. Would there be reason to implement (or not implement) fastLRU on the documentcache? LRUCache can be faster if the hit rate is really low (i.e.

Re: Solr on netty

2012-02-22 Thread Yonik Seeley
On Wed, Feb 22, 2012 at 9:27 AM, prasenjit mukherjee prasen@gmail.com wrote: Is anybody aware of any effort regarding porting solr to a netty ( or any other async-io based framework ) based framework. Even on medium load ( 10 parallel clients )  with 16 shards performance seems to

Re: Problem parsing queries with forward slashes and multiple fields

2012-02-22 Thread Yonik Seeley
2012/2/22 Yury Kats yuryk...@yahoo.com: On 2/22/2012 12:25 PM, Yury Kats wrote: I'm running into a problem with queries that contain forward slashes and more than one field. For example, these queries work fine: fieldName:/a fieldName:/* But if I have two fields with similar syntax in

Re: lucene operators interfearing in edismax

2012-02-20 Thread Yonik Seeley
This should be fixed in trunk by LUCENE-2566 QueryParser: Unary operators +,-,! will not be treated as operators if they are followed by whitespace. -Yonik lucidimagination.com On Mon, Feb 20, 2012 at 2:09 PM, jmlucjav jmluc...@gmail.com wrote: Hi, I am using edismax with end user entered
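The whitespace rule described above can be sketched as a small Python check. It models only that one rule and is not the Lucene QueryParser: the helper name is hypothetical, it ignores what precedes the character, and treating an operator at the very end of the input as a non-operator is an assumption.

```python
UNARY_OPERATORS = "+-!"


def is_unary_operator(query, index):
    """Return True if query[index] is +, - or ! and is not followed by whitespace."""
    if query[index] not in UNARY_OPERATORS:
        return False
    following = query[index + 1:index + 2]  # empty string at end of input
    return following != "" and not following.isspace()


if __name__ == "__main__":
    for q, i in [("cats -dogs", 5), ("cats - dogs", 5)]:
        print(repr(q), is_unary_operator(q, i))
```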

Re: Exception importing multi-valued UUID field

2012-02-20 Thread Yonik Seeley
On Mon, Feb 20, 2012 at 7:26 PM, Greg Pelly gfpe...@gmail.com wrote: I exported a csv file from SOLR and made some changes, I then tried to reimport the file and got the exception below. It seems UUID field type can't import multi-values, I removed all of the multi-values and it imported

Re: distributed deletes working?

2012-02-17 Thread Yonik Seeley
On Fri, Feb 17, 2012 at 11:13 AM, Mark Miller markrmil...@gmail.com wrote: When exactly is this build from? Yeah... I just checked in a fix yesterday dealing with sync while indexing is going on. -Yonik lucidimagination.com

Re: distributed deletes working?

2012-02-17 Thread Yonik Seeley
On Fri, Feb 17, 2012 at 1:27 PM, Jamie Johnson jej2...@gmail.com wrote: I'm seeing the following.  Do I need a _version_ long field in my schema? Yep... versions are the way we keep things sane (shuffled updates to a replica can be correctly reordered, etc). -Yonik lucidimagination.com
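To show why a version per document lets shuffled updates be reordered, here is a toy Python replica that keeps only the highest version it has seen for each id. The class and method names are invented for this sketch, and Solr's real update handling is more involved.

```python
class Replica:
    """Toy replica: an update is applied only if its version is newer."""

    def __init__(self):
        self.docs = {}  # doc id -> (version, value)

    def apply(self, doc_id, version, value):
        current = self.docs.get(doc_id)
        if current is not None and current[0] >= version:
            return False  # stale update that arrived late; ignore it
        self.docs[doc_id] = (version, value)
        return True


if __name__ == "__main__":
    replica = Replica()
    replica.apply("doc1", 2, "newer text")   # arrives first
    replica.apply("doc1", 1, "older text")   # arrives second, out of order
    print(replica.docs["doc1"])
```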

Re: distributed deletes working?

2012-02-17 Thread Yonik Seeley
On Fri, Feb 17, 2012 at 1:38 PM, Jamie Johnson jej2...@gmail.com wrote: Something that didn't work though was if a node was down when a delete happened and then comes back up, that node still listed the id I deleted.  Is this currently supported? Yes, that should work fine. Are you still

Re: distributed deletes working?

2012-02-17 Thread Yonik Seeley
On Fri, Feb 17, 2012 at 2:07 PM, Jamie Johnson jej2...@gmail.com wrote: This was with the cloud-dev solrcloud-start.sh script (after that I've used solrcloud-start-existing.sh). Essentially I run ./solrcloud-start-existing.sh index docs kill 1 of the solr instances (using kill -9 on the pid)

Re: copyField: multivalued field to joined singlevalue field

2012-02-16 Thread Yonik Seeley
On Thu, Feb 16, 2012 at 11:35 AM, flyingeagle-de flyingeagle...@yahoo.de wrote: Hello, I want to copy all data from a multivalued field joined together in a single valued field. Is there any opportunity to do this by using solr-standards? There is not currently, but it certainly makes

Re: Using Solr for a rather busy Yellow Pages-type index - good idea or not really?

2012-02-16 Thread Yonik Seeley
On Thu, Feb 16, 2012 at 3:03 PM, Alexey Verkhovsky alexey.verkhov...@gmail.com wrote: 5. All Solr caching is switched off. But why? Because (a) I shouldn't need to cache documents, if they are all in memory anyway; You're making many assumptions about how Solr works internally. One

Re: Using Solr for a rather busy Yellow Pages-type index - good idea or not really?

2012-02-16 Thread Yonik Seeley
On Thu, Feb 16, 2012 at 4:06 PM, Alexey Verkhovsky alexey.verkhov...@gmail.com wrote: ly need ids, scores and total number of results out of Solr. Presentation of selected entities will have to include some write-heavy data (from RDBMS and/or memcached), therefore won't be Solr's business

Re: files left open?

2012-02-16 Thread Yonik Seeley
On Thu, Feb 16, 2012 at 5:56 PM, Paulo Magalhaes paulo.magalh...@gmail.com wrote: I was loading a big (60 million docs) csv in solr 4 when something odd happened. I got a solr error in the log saying that it could not write the file. du -s indicated I had used 30Gb of a 50Gb available but df

Re: Facet on TrieDateField field without including date

2012-02-15 Thread Yonik Seeley
On Wed, Feb 15, 2012 at 8:58 AM, Jamie Johnson jej2...@gmail.com wrote: I would like to be able to facet based on the time of day items are purchased across a date span.  I was hoping that I could do a query of something like date:[NOW-1WEEK TO NOW] and then specify I wanted facet broken into

Re: Facet on TrieDateField field without including date

2012-02-15 Thread Yonik Seeley
. -Yonik lucidimagination.com On Wed, Feb 15, 2012 at 9:16 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Wed, Feb 15, 2012 at 8:58 AM, Jamie Johnson jej2...@gmail.com wrote: I would like to be able to facet based on the time of day items are purchased across a date span.  I

Re: Solr 3.5 not starting on CentOS 6 or RHEL 5

2012-02-14 Thread Yonik Seeley
Perhaps this is some kind of vufind specific issue? The server (/example) bundled with solr unpacks the war in /example/work and not /tmp -Yonik lucidimagination.com On Mon, Feb 13, 2012 at 7:06 PM, Bernhardt, Russell (CIV) rgber...@nps.edu wrote: A software package we use recently upgraded to

Re: Improving performance for SOLR geo queries?

2012-02-12 Thread Yonik Seeley
On Thu, Feb 9, 2012 at 1:46 PM, Yonik Seeley yo...@lucidimagination.com wrote: One way to speed up numeric range queries (at the cost of increased index size) is to lower the precisionStep.  You could try changing this from 8 to 4 and then re-indexing to see how that affects your query speed
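The index-size cost mentioned above can be estimated with the rule of thumb from the Lucene numeric field documentation: a trie-encoded value is indexed as roughly ceil(bitsPerValue / precisionStep) terms. The sketch below assumes 64-bit values (doubles or longs); the function name is made up for the example.

```python
import math


def indexed_terms_per_value(bits_per_value, precision_step):
    """Approximate number of terms indexed for one numeric value."""
    if precision_step < 1:
        raise ValueError("precision_step must be at least 1 in this sketch")
    return math.ceil(bits_per_value / precision_step)


if __name__ == "__main__":
    for step in (8, 4):
        print(f"precisionStep={step}: {indexed_terms_per_value(64, step)} terms per value")
```

Under that assumption, going from 8 to 4 doubles the number of terms per value, which is where the larger index comes from.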

Re: (Old) SolrCloud Date Sorting issue

2012-02-10 Thread Yonik Seeley
On Fri, Feb 10, 2012 at 11:44 AM, Jamie Johnson jej2...@gmail.com wrote: Was there a fix recently to address sorting issues for Dates in solr cloud?  On my cluster I have a date field which when I sort across the cluster I get incorrect order executing the following query I get Yikes! There

Re: (Old) SolrCloud Date Sorting issue

2012-02-10 Thread Yonik Seeley
On Fri, Feb 10, 2012 at 2:48 PM, Jamie Johnson jej2...@gmail.com wrote: So looking at query component it appears to sort the entire doc list at the end of process, my component is defined after this query so the doclist that I get should be sorted, right?  To me this should mean that I can

new feature: advanced filter caching and post filtering

2012-02-10 Thread Yonik Seeley
Well, not super-new (it's in 3.4), but the spatial post-filtering is brand new in 4.0 as of today, and I don't think cache=false and post-filtering was really highlighted well before anyway. http://www.lucidimagination.com/blog/2012/02/10/advanced-filter-caching-in-solr/ -Yonik

Re: Improving performance for SOLR geo queries?

2012-02-09 Thread Yonik Seeley
2012/2/9 Matthias Käppler matth...@qype.com: arr name=filter_queries str{!bbox cache=false d=50 sfield=location_ll pt=54.1434,-0.452322}/str /arr arr name=parsed_filter_queries str WrappedQuery({!cache=false cost=0}+location_ll_0_coordinate:[53.69373983225355 TO 54.59306016774645]
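The latitude range in the parsed filter above can be reproduced with a short calculation: convert the distance d (in km) into an angle on a sphere and add or subtract it from the latitude of pt. The earth radius constant is an assumption of this sketch, and the longitude range and clamping near the poles are left out.

```python
import math

EARTH_MEAN_RADIUS_KM = 6371.0087714  # assumed mean radius, not taken from the thread


def latitude_band(lat_deg, distance_km, radius_km=EARTH_MEAN_RADIUS_KM):
    """Return (min_lat, max_lat) in degrees for a box reaching distance_km north and south."""
    delta_deg = math.degrees(distance_km / radius_km)
    return lat_deg - delta_deg, lat_deg + delta_deg


if __name__ == "__main__":
    low, high = latitude_band(54.1434, 50)
    print(f"location_ll_0_coordinate:[{low} TO {high}]")
```

With pt=54.1434,-0.452322 and d=50 this gives a band of about 53.6937 to 54.5931, in line with the range quoted in the message.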

Re: Performance degradation with distributed search

2012-02-06 Thread Yonik Seeley
On Mon, Feb 6, 2012 at 3:30 PM, oleole oleol...@gmail.com wrote: Thanks for your reply. Yeah that's the first thing I tried (adding fsv=true to the query) and it surprised me too. Could it be due to the many complex sortings we're using (20 sortings with dismax, and, or...)? Anything it can be

Re: Performance degradation with distributed search

2012-02-06 Thread Yonik Seeley
, Yonik Seeley yo...@lucidimagination.com wrote: On Mon, Feb 6, 2012 at 3:30 PM, oleole oleol...@gmail.com wrote: Thanks for your reply. Yeah that's the first thing I tried (adding fsv=true to the query) and it surprised me too. Could it be due to the many complex sortings we're using (20

Re: Performance degradation with distributed search

2012-02-06 Thread Yonik Seeley
that don't contain embedded relevancy queries, I would definitely not expect the degradation you are seeing - hence we should try to get to the bottom of this. -Yonik lucidimagination.com XJ On Mon, Feb 6, 2012 at 2:37 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Mon, Feb 6, 2012

Re: Performance degradation with distributed search

2012-02-04 Thread Yonik Seeley
On Sat, Feb 4, 2012 at 1:20 AM, XJ oleol...@gmail.com wrote: When I look into details (slow queries), I found some real issues that I need help with. For example, a query which takes 200ms with geo sharding, now timeout (2000ms) with distributed search. And each shard query (isShard=true)

Re: Solr Join query with fq not correctly filtering results?

2012-02-01 Thread Yonik Seeley
Thanks for your persistence in tracking this down Mike! I'm going to start looking into this now... -Yonik lucidimagination.com On Thu, Jan 26, 2012 at 11:06 PM, Mike Hugo m...@piragua.com wrote: I created issue https://issues.apache.org/jira/browse/SOLR-3062 for this problem.  I was able to

Re: SolrCloud on Trunk

2012-01-28 Thread Yonik Seeley
On Fri, Jan 27, 2012 at 11:46 PM, Jamie Johnson jej2...@gmail.com wrote: I just want to verify some of the features in regards to SolrCloud that are now on Trunk documents added to the cluster are automatically distributed amongst the available shards (I had seen that Yonik had ported the

Re: SolrCloud on Trunk

2012-01-28 Thread Yonik Seeley
On Sat, Jan 28, 2012 at 3:45 PM, Jamie Johnson jej2...@gmail.com wrote: Second question, I know there are discussion about storing the shard assignments in ZK (i.e. shard 1 is responsible for hashed values between 0 and 10, shard 2 is responsible for hashed values between 11 and 20, etc), this
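The range-based shard assignment described in the question can be sketched as a lookup over sorted upper bounds. This is only an illustration of the idea using the numbers from the message (0 to 10 and 11 to 20); the function name, the 1-based shard numbering and the lower bound of 0 are assumptions, and nothing here reflects how SolrCloud stores ranges in ZooKeeper.

```python
import bisect


def shard_for_hash(hash_value, upper_bounds):
    """Return the 1-based shard number whose range contains hash_value.

    upper_bounds is a sorted list of inclusive upper bounds, one per shard.
    """
    index = bisect.bisect_left(upper_bounds, hash_value)
    if hash_value < 0 or index == len(upper_bounds):
        raise ValueError(f"hash {hash_value} is outside all shard ranges")
    return index + 1


if __name__ == "__main__":
    bounds = [10, 20]  # shard 1: 0..10, shard 2: 11..20
    for h in (0, 10, 11, 20):
        print(h, "-> shard", shard_for_hash(h, bounds))
```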

Re: HTMLStripCharFilterFactory not working in Solr4?

2012-01-24 Thread Yonik Seeley
You can use LegacyHTMLStripCharFilterFactory to get the previous behavior. See https://issues.apache.org/jira/browse/LUCENE-3690 for more details. -Yonik http://www.lucidimagination.com On Tue, Jan 24, 2012 at 1:34 PM, Mike Hugo m...@piragua.com wrote: We recently updated to the latest build

Re: HTMLStripCharFilterFactory not working in Solr4?

2012-01-24 Thread Yonik Seeley
Oops, I didn't read carefully enough to see that you wanted those constructs entirely stripped out. Given that you're seeing numbers indexed, this strongly indicates an escaping bug in the SolrJ client that must have been introduced at some point. I'll see if I can reproduce it in a unit test.

Re: first time query is very slow

2012-01-19 Thread Yonik Seeley
On Wed, Jan 18, 2012 at 10:15 PM, gabriel shen xshco...@gmail.com wrote: Hi Yonik, The index I am querying against is 20 GB and contains 200,000 documents; some of the documents are quite big, and the schema contains more than 50 fields. The main content fields are defined as both stored and indexed,

Re: Solr hides some facet.fields when doing a distributed search over multiple shards

2012-01-18 Thread Yonik Seeley
On Wed, Jan 18, 2012 at 3:36 PM, Daniel Bruegge daniel.brue...@googlemail.com wrote: Hi, I have asked the question already over Stackoverflow ( http://stackoverflow.com/questions/8913654/solr-hides-some-facet-fields-when-doing-a-distributed-search), but maybe someone here can give me a hint

Re: first time query is very slow

2012-01-17 Thread Yonik Seeley
On Tue, Jan 17, 2012 at 9:39 AM, gabriel shen xshco...@gmail.com wrote: For those customers who unluckily send un-prewarmed query, they will suffer from bad response time, it is not too pleasant anyway. The warming caches part isn't about unique queries, but more about caches used for sorting

Re: Faceting Question

2012-01-14 Thread Yonik Seeley
On Sat, Jan 14, 2012 at 12:56 PM, Jamie Johnson jej2...@gmail.com wrote: I'm trying to figure out a way to execute a query which would allow me to say there were x documents over this period of time with type a, y documents over the same period of time with type b and z documents over the same

Re: Faceting Question

2012-01-14 Thread Yonik Seeley
On Sat, Jan 14, 2012 at 1:12 PM, Lee Carroll lee.a.carr...@googlemail.com wrote: if type is a field use field faceting with an fq q=datefield:[start TO end]&fq=type:(a b c)&facet.field=type Yep, that will work too. -Yonik http://www.lucidimagination.com
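A small Python sketch of building that request with proper URL encoding. The parameter names q, fq, facet and facet.field are standard Solr parameters; facet=true is added because Solr needs it to turn faceting on, although the quoted snippet does not show it. The function name and the field names are placeholders.

```python
from urllib.parse import urlencode


def facet_by_type_params(start, end, types, date_field="datefield", type_field="type"):
    """Build the query string: date range in q, type restriction in fq, facet on type."""
    return urlencode([
        ("q", f"{date_field}:[{start} TO {end}]"),
        ("fq", f"{type_field}:({' '.join(types)})"),
        ("facet", "true"),
        ("facet.field", type_field),
    ])


if __name__ == "__main__":
    print(facet_by_type_params("NOW-7DAYS", "NOW", ["a", "b", "c"]))
```

The facet counts for type then give the x, y and z numbers in a single request.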

Re: JSON XML response writer issues with short binary fields

2012-01-13 Thread Yonik Seeley
On Fri, Jan 13, 2012 at 4:04 PM, Ken Krugler kkrugler_li...@transpac.com wrote: I finally got around to looking at why short field values are returned as java.lang.Short:value. Both XMLWriter.writeVal() and TextResponseWriter.writeVal() are missing the check for (val instanceof Short), and

Re: JSON XML response writer issues with short binary fields

2012-01-13 Thread Yonik Seeley
-Yonik http://www.lucidimagination.com On Fri, Jan 13, 2012 at 4:22 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Fri, Jan 13, 2012 at 4:04 PM, Ken Krugler kkrugler_li...@transpac.com wrote: I finally got around to looking at why short field values are returned

Re: soft commit

2012-01-03 Thread Yonik Seeley
On Tue, Jan 3, 2012 at 4:36 PM, Erik Hatcher erik.hatc...@gmail.com wrote: As I understand it, the document and filter caches add value *intra* request such that it keeps additional work (like fetching stored fields from disk more than once) from occurring. Yep. Highlighting, multi-select

Re: soft commit

2012-01-03 Thread Yonik Seeley
On Tue, Jan 3, 2012 at 5:03 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Yikes. I'd love to see a test showing that un-inverted field cache (which is for ALL segments as a single unit) can be used efficiently with NRT / soft commit. Please stop being a troll. Solr has multiple

Re: soft commit

2012-01-02 Thread Yonik Seeley
On Mon, Jan 2, 2012 at 1:28 PM, Mark Miller markrmil...@gmail.com wrote: Right - in most NRT cases (very frequent soft commits), the cache should probably be disabled. Did you mean autowarm should be disabled (as it already is in the example config)? It still normally makes sense to have the

Re: soft commit

2012-01-02 Thread Yonik Seeley
On Mon, Jan 2, 2012 at 9:58 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: It still normally makes sense to have the caches enabled (esp filter and document caches). In the NRT case that statement is completely incorrect *shrug* To each their own. I stand by my statement. -Yonik

Re: TrieField precisionStep effect on non-range queries and sorting

2012-01-02 Thread Yonik Seeley
On Tue, Jan 3, 2012 at 12:36 AM, Michael Ryan mr...@moreover.com wrote: I was wondering... how does the TrieField precisionStep value affect the speed of non-range queries and sorting? I'm assuming that int (precisionStep=0) is no slower than tint (precisionStep=8) for these - is that

Re: Enabling realtime search in Solr 4.0

2011-12-29 Thread Yonik Seeley
On Thu, Dec 29, 2011 at 2:35 PM, Avner Levy av...@checkpoint.com wrote: Thanks Mark, I appreciate your help. I need the Solr index to be in sync with my database. This means that even if one record was added I need it to appear in the next search (including faceting). You could just add

Re: Poor performance on distributed search

2011-12-28 Thread Yonik Seeley
On Wed, Dec 28, 2011 at 5:47 AM, ku3ia dem...@gmail.com wrote: So, based on p.2) and on my previous research, I conclude that the more documents I want to retrieve, the slower the search is, and the main problem is the loop in the writeDocs method. Am I right? Can you advise something in this

Re: Custom Solr FunctionQuery Error

2011-12-28 Thread Yonik Seeley
On Wed, Dec 28, 2011 at 2:16 AM, Parvin Gasimzade parvin.gasimz...@gmail.com wrote: I have created custom Solr FunctionQuery in Solr 3.4. I extended ValueSourceParser, ValueSource, Query and QParserPlugin classes. Note that you only need a QParserPlugin implementation for top level query types,

Re: [Solr 3.5] Facets and stats become a lot slower during concurrent inserts

2011-12-27 Thread Yonik Seeley
On Tue, Dec 27, 2011 at 10:43 AM, Lyuba Romanchuk lyuba.romanc...@gmail.com wrote: I test facets and stats in Solr 3.5 and I see that queries are running a lot slower during inserts into index with more than 15M documents . Are you also doing commits (or have autocommit enabled)? The first time

Re: r1201855 broke stats.facet on long fields

2011-12-09 Thread Yonik Seeley
On Thu, Dec 8, 2011 at 6:16 PM, Chris Hostetter hossman_luc...@fucit.org wrote: Solr can not reasonably compute stats on a multivalued field Wasn't that added here? https://issues.apache.org/jira/browse/SOLR-1380 -Yonik http://www.lucidimagination.com

Re: IllegalStateException, response already committed - replication related

2011-12-08 Thread Yonik Seeley
On Thu, Dec 8, 2011 at 6:21 PM, Tom Lianza t...@wishpot.com wrote: We're seeing the same thing (though we're not using replication).  Based on the trace, it looks like it would happen when Solr's response is too slow for the client, and it's trying to send a response back to someone who's no

Re: To optimize or not - Solr vs Lucene

2011-12-06 Thread Yonik Seeley
On Tue, Dec 6, 2011 at 5:04 PM, Scott Smith ssm...@mainstreamdata.com wrote: If I read the 3.5 lucene javadocs, optimize() has been deprecated because it is rarely justified with the current lucene index implementation Its functionality is not being deprecated... it's just that the method is

Re: Configuring the Distributed

2011-12-05 Thread Yonik Seeley
On Mon, Dec 5, 2011 at 9:21 AM, Jamie Johnson jej2...@gmail.com wrote: What does the version field need to look like? It's in the example schema: <field name="_version_" type="long" indexed="true" stored="true"/> -Yonik http://www.lucidimagination.com

Re: Configuring the Distributed

2011-12-05 Thread Yonik Seeley
On Mon, Dec 5, 2011 at 1:29 PM, Jamie Johnson jej2...@gmail.com wrote: In this situation I don't think splitting one shard would help us we'd need to split every shard to reduce the load on the burdened systems right? Sure... but if you can split one, you can split them all :-) -Yonik

Re: Continuous update on progress of New SolrCloud Design work

2011-12-05 Thread Yonik Seeley
On Mon, Dec 5, 2011 at 6:23 AM, Per Steffensen st...@designware.dk wrote: Will it be possible to maintain a how-to-use section on http://wiki.apache.org/solr/NewSolrCloudDesign with examples, e.g. like to ones on http://wiki.apache.org/solr/SolrCloud, Yep, it was on my near-term todo list to

Re: Configuring the Distributed

2011-12-04 Thread Yonik Seeley
On Thu, Dec 1, 2011 at 3:39 PM, Mark Miller markrmil...@gmail.com wrote: On Thu, Dec 1, 2011 at 10:08 AM, Jamie Johnson jej2...@gmail.com wrote: I am currently looking at the latest solrcloud branch and was wondering if there was any documentation on configuring the

Re: Configuring the Distributed

2011-12-04 Thread Yonik Seeley
On Fri, Dec 2, 2011 at 10:48 AM, Mark Miller markrmil...@gmail.com wrote: You always want to use the distrib-update-chain. Eventually it will probably be part of the default chain and auto turn in zk mode. I'm working on this now... -Yonik http://www.lucidimagination.com

Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Yonik Seeley
On Wed, Nov 30, 2011 at 7:08 AM, Pawel Rog pawelro...@gmail.com wrote:        at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:702)        at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1144)        at

Re: Solr 3.5 very slow (performance)

2011-11-29 Thread Yonik Seeley
On Tue, Nov 29, 2011 at 12:25 PM, Pawel pawelmis...@gmail.com wrote: I built an index on Solr 1.4 some time ago (about 18 million documents, about 8GB). I need new features from a newer version of Solr, so I decided to upgrade from 1.4 to 3.5. * I created a new Solr master on new

Re: Painfully slow transfer speed from Solr

2011-11-21 Thread Yonik Seeley
On Tue, Nov 22, 2011 at 12:19 AM, Stephen Powis stephen.po...@pardot.com wrote: Just trying to get a better understanding of this. Wouldn't the indexes not being in the disk cache make the queries themselves slow as well (high qTime), not just fetching the results? What happens in

Re: Boosting is slow

2011-11-18 Thread Yonik Seeley
On Thu, Nov 17, 2011 at 2:59 PM, Brian Lamb brian.l...@journalexperts.com wrote: http://localhost:8983/solr/mycore/search/?q=test {!boost b=2} it is still really slow. Is there a different approach I should be taking? I just tried something similar to this (a non-boosted query vs a simple

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Yonik Seeley
On Thu, Nov 17, 2011 at 11:48 AM, Jak Akdemir jakde...@gmail.com wrote: 2) I am sure about delta-queries configured well. Full-Import is completed in 40 secs for 40 docs. And delta's are in 1 sec for 15 new records. Also I checked it. There is no problem in it. That's 10,000 docs/sec. If

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Yonik Seeley
On Thu, Nov 17, 2011 at 1:34 PM, Erick Erickson erickerick...@gmail.com wrote: Hmmm. It is suspicious that your index files change every second. Why is this suspicious? A soft commit still writes out some files currently... it just doesn't fsync them. -Yonik http://www.lucidimagination.com

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Yonik Seeley
On Thu, Nov 17, 2011 at 3:56 PM, Jak Akdemir jakde...@gmail.com wrote: Is it ok to see soft committed records after server restart, too? Yes... we currently have Jetty configured to call some cleanups on exit (such as closing the index writer). -Yonik http://www.lucidimagination.com

Re: [Solr-3.4] Norms file size is large in case of many unique indexed fields in index

2011-11-10 Thread Yonik Seeley
On Thu, Nov 10, 2011 at 7:42 AM, Ivan Hrytsyuk ihryts...@softserveinc.com wrote: For 5000 documents (every document has 2 unique fields, 2*5000=10000 unique fields in index), index size is 48.24 MB. You might be able to turn this around and encode the unique field information in a multi-valued
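A back-of-the-envelope check of the reported size, assuming norms in Lucene 3.x take one byte per document for every field that has norms, whether or not the document contains that field. With 5000 documents and 2 unique fields per document there are 2 * 5000 distinct fields. The function name is invented for this sketch.

```python
def norms_bytes(num_docs, num_fields_with_norms, bytes_per_norm=1):
    """Estimated norms size: one entry per (document, field) pair."""
    return num_docs * num_fields_with_norms * bytes_per_norm


if __name__ == "__main__":
    docs = 5000
    fields = 2 * docs  # every document contributes 2 field names nobody else uses
    size = norms_bytes(docs, fields)
    print(f"{size} bytes = {size / 2**20:.2f} MiB")
```

Under that assumption the norms alone come to roughly 47.7 MiB, which would account for nearly all of the 48.24 MB index.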

Re: when using group=true facet numbers are incorrect

2011-11-07 Thread Yonik Seeley
On Mon, Nov 7, 2011 at 8:55 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : I understand that's a valid thing for faceting to do, I was just wondering : if there's any way to get it to do the faceting on the groups returned. : Otherwise I guess I'll need to convince the UI people to just

Re: Default value for dynamic fields

2011-11-03 Thread Yonik Seeley
On Thu, Nov 3, 2011 at 12:59 PM, Milan Dobrota mi...@milandobrota.com wrote: Is there any way to define the default value for the dynamic fields in SOLR? I use some dynamic fields of type float with _val_ and if they haven't been created at index time, the value defaults to 0. I would want

Re: large scale indexing issues / single threaded bottleneck

2011-10-29 Thread Yonik Seeley
On Sat, Oct 29, 2011 at 6:35 AM, Michael McCandless luc...@mikemccandless.com wrote: I saw a mention somewhere that you can tell Solr not to use IW.addDocument (not IW.updateDocument) when you add a document if you are certain it's not replacing a previous document with the same ID Right -

Re: bbox issue

2011-10-28 Thread Yonik Seeley
=text indexed=false stored=true multiValued=true/ So should I have another for _latLon?  Would it look like: dynamicField name=*_latLon type=double indexed=true stored=true/ -- Chris On Fri, Oct 28, 2011 at 9:27 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Fri, Oct 28, 2011 at 8:42

Re: bbox issue

2011-10-27 Thread Yonik Seeley
On Thu, Oct 27, 2011 at 2:34 PM, Christopher Gross cogr...@gmail.com wrote: I'm using the geohash field to store points for my data.  When I do a bounding box like: localhost:8080/solr/select?q=point:[-45,-80%20TO%20-24,-39] I get a data point that falls outside the box: (-73.03358

Re: bbox issue

2011-10-27 Thread Yonik Seeley
On Thu, Oct 27, 2011 at 3:22 PM, Christopher Gross cogr...@gmail.com wrote: I can roll back and use the LatLon type -- but then I'm still concerned about the bounding box giving results outside the specified range. The implementation of things like bbox are intimately tied to the field type

Re: Too many values for UnInvertedField faceting on field autocompleteField

2011-10-26 Thread Yonik Seeley
On Wed, Oct 26, 2011 at 7:39 AM, Torsten Krah tk...@fachschaft.imn.htwk-leipzig.de wrote: I am getting this SolrException Too many values for UnInvertedField faceting on field autocompleteField. Already added facet.method=enum to my search handler definition but still this exception does

Re: joins and filter queries effecting scoring

2011-10-25 Thread Yonik Seeley
Can you give an example of the request (URL) you are sending to Solr? -Yonik http://www.lucidimagination.com On Mon, Oct 24, 2011 at 3:31 PM, Jason Toy jason...@gmail.com wrote: I have 2 types of docs, users and posts. I want to view all the docs that belong to certain users by joining posts

Re: java.net.SocketException: Too many open files

2011-10-25 Thread Yonik Seeley
On Tue, Oct 25, 2011 at 4:03 PM, Jonty Rhods jonty.rh...@gmail.com wrote: Hi, I am using solrj and for connection to server I am using instance of the solr server: SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/core0"); Are you reusing the server object for all

Re: NRT and replication

2011-10-14 Thread Yonik Seeley
On Fri, Oct 14, 2011 at 5:49 PM, Esteban Donato esteban.don...@gmail.com wrote:  I found soft commits very useful for NRT search requirements. However I couldn't figure out how replication works with this feature.  I mean, if I have N replicas of an index for load balancing purposes, when I

Re: Solr Cloud on solrcloud branch acting strange

2011-10-10 Thread Yonik Seeley
On Sun, Oct 9, 2011 at 11:30 PM, Jamie Johnson jej2...@gmail.com wrote: I'm doing some work on the solrcloud branch in SVN and am noticing some strange (but perhaps expected) behavior when executing queries. I have setup a simple 2 shard cluster, indexed 50 documents into each (verified by

Re: sorting using function query results are notin order

2011-10-04 Thread Yonik Seeley
Hmmm, try adding fl={!func}Count to make sure Count is an indexed field and function queries are getting the right values. -Yonik http://www.lucene-eurocon.com - The Lucene/Solr User Conference On Mon, Oct 3, 2011 at 3:42 PM, abhayd ajdabhol...@hotmail.com wrote: hi I am trying to sort

Re: solr 1.4 facet.limit behaviour in merging from several shards

2011-10-04 Thread Yonik Seeley
On Tue, Oct 4, 2011 at 7:13 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : OK, if SOLR-2403 being related to the bug I described, has been fixed in : SOLR 3.4 then we are safe, since we are in the process of migration. Is it : possible to verify this somehow? Is FacetComponent class is

Re: I think I've found a bug with filter queries and joins

2011-10-02 Thread Yonik Seeley
On Fri, Sep 30, 2011 at 11:32 AM, Jason Toy jason...@gmail.com wrote: I'm testing out the join functionality on the svn revision 1175424. I've found when I add a single filter query to a join it works fine, but when I do more than 1 filter query, the query does not return results. This single

Re: NewSolrCloudDesign question

2011-09-14 Thread Yonik Seeley
On Wed, Sep 14, 2011 at 10:17 AM, dar...@ontrenet.com wrote: Hi,  I am very excited to see this direction for Solr. I realize it's early still, but is there any thought as to what the target release date might be (this year? next?). We've started to work on the new functionality now, but

Re: searching for terms containing embedded spaces

2011-09-11 Thread Yonik Seeley
On Sun, Sep 11, 2011 at 12:56 PM, Mark juszczec mark.juszc...@gmail.com wrote: We've also tried making it create field:a\ b The first case just does not work and I'm unsure why. The second case ends up url encoding the \ and I'm unsure if that will cause it to be used in the query or not.
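On the URL-encoding worry above: a minimal sketch, using only Python's standard library rather than anything from Solr or SolrJ, showing that percent-encoding the backslash is reversible, so the escaped query text comes back unchanged once the server decodes the request. The variable names are illustrative. It says nothing about whether the query then matches, which still depends on how the field is analyzed.

```python
from urllib.parse import quote, unquote

# The query text as typed: field:a\ b (backslash escapes the space).
raw_query = "field:a\\ b"

# Percent-encode every reserved character, as an HTTP client would
# when placing the query in a URL parameter.
encoded = quote(raw_query, safe="")

# Decoding the parameter on the receiving side restores the original text.
decoded = unquote(encoded)

print(encoded)               # field%3Aa%5C%20b
print(decoded == raw_query)  # True
```

The backslash travels as %5C and the space as %20, so the encoding step by itself does not drop the escape.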

Re: searching for terms containing embedded spaces

2011-09-11 Thread Yonik Seeley
On Sun, Sep 11, 2011 at 1:15 PM, Mark juszczec mark.juszc...@gmail.com wrote: I am looking for a text string with a single, embedded space.  For the purposes of this example, it is a b and it's stored in the index in a field called field. Am I incorrect in assuming the query field:a b will

Re: searching for terms containing embedded spaces

2011-09-11 Thread Yonik Seeley
On Sun, Sep 11, 2011 at 1:39 PM, Mark juszczec mark.juszc...@gmail.com wrote: That's what I thought.  The problem is, it's not and I am unsure what is wrong. What is the fieldType definition for that field? Did you change it without re-indexing? -Yonik http://www.lucene-eurocon.com - The

Re: StreamingUpdateSolrServer#handleError

2011-09-06 Thread Yonik Seeley
On Tue, Sep 6, 2011 at 6:56 PM, simon mtnes...@gmail.com wrote: If you're batching the documents when you send them to Solr with the #add method, you may be out of luck - Solr doesn't do a very good job of reporting which document in a batch caused the failure. If you reverted to

Re: Analyzers and sorting with a custom analysis chain

2011-09-03 Thread Yonik Seeley
Are you able to share the source code for this CombiningFilter? This sounds like it should be a relatively simple filter. -Yonik http://www.lucene-eurocon.com - The Lucene/Solr User Conference

Re: Analyzers and sorting with a custom analysis chain

2011-09-02 Thread Yonik Seeley
On Fri, Sep 2, 2011 at 10:26 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote: I'm left with childrenshospitallosangeles as a single token resultant from the chain. So, when I go to sort the titles in Solr, I use sort=title_sort asc, and I am getting all kinds of weird

Re: Changing the DocCollector

2011-08-29 Thread Yonik Seeley
On Mon, Aug 29, 2011 at 12:24 PM, Jamie Johnson jej2...@gmail.com wrote: Is there any configuration that can be done to change the Doc Collector used in SolrIndexSearcher? The most generic way would be to use a post-filter (which can insert a custom collector into the chain).

Re: Changing the DocCollector

2011-08-29 Thread Yonik Seeley
On Mon, Aug 29, 2011 at 12:44 PM, Jamie Johnson jej2...@gmail.com wrote: Also I see that this is before sorting, is there a way to do something similar after sorting? If you want post-sorting, then you don't want anything based on Collector. A custom search component that runs after the query

Re: commas in synonyms.txt are not escaping

2011-08-28 Thread Yonik Seeley
Turns out this isn't a bug - I was just tripped up by the analysis changes to the example server. Gary, you are probably just hitting the same thing. The text fieldType is no longer used by any fields by default - for example the text field uses the text_general fieldType. This fieldType uses the
