Re: counter field

2012-04-06 Thread Manish Bafna
Yes, before indexing, we check whether that document is already in the index, because along with the document we also have metadata that needs to be appended. So, we have a few multivalued metadata fields, which we update if the same document is found again. On Fri,

Re: It cost some many memory with solrj 3.5 how to decrease it?

2012-04-06 Thread a sd
Studying the update behavior more closely, I logged the elapsed-time value of every UpdateResponse; the results were as follows: It seems that adding or updating one document generally takes almost 20 ms, so I labeled calls that spend less than 20 ms adding one doc as normal logs, and the others were

Re: Choosing tokenizer based on language of document

2012-04-06 Thread Dominique Bejean
Hi, Yes, I agree it is not an easy issue. Indexing all languages with the appropriate char filter, tokenizer and filters for each language is not possible without a new text type and new analyzer development. If you plan to index up to 10 different languages, I suggest one text field per

Re: A tool for frequent re-indexing...

2012-04-06 Thread Valeriy Felberg
I've implemented something like what is described in https://issues.apache.org/jira/browse/SOLR-3246. The idea is to add an update request processor at the end of the update chain in the core you want to copy. The processor converts the SolrInputDocument to XML (there is some utility method for doing

Re: Creating a query-able dictionary using Solr

2012-04-06 Thread Serdyn du Toit
Hi Joel, Not an advanced Solr user myself - only been looking at it for a while. Still, maybe you are looking to use a suggester? http://wiki.apache.org/solr/Suggester (the examples at the bottom of the page are very helpful) I haven't worked with PDF documents in Solr yet but the suggester

Re: A little onfusion with maxPosAsterisk

2012-04-06 Thread Dmitry Kan
Let's first figure out, why reversing a token is helpful for doing leading wildcard searches. I'll assume you refer to ReversedWildcardFilterFactory. If you have the query *foo, using a straightforward approach, you would need to scan through the entire dictionary of terms (which can be billions)
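
The idea in the message above can be sketched in a few lines. This is only a toy model in Python, not how Solr or ReversedWildcardFilterFactory is implemented: the sorted list stands in for the term dictionary, and the function names are made up for illustration. It shows why a leading-wildcard query such as *foo becomes a cheap prefix lookup once the terms are stored reversed.

```python
from bisect import bisect_left


def build_reversed_index(terms):
    """Return the terms reversed and sorted, standing in for a sorted term dictionary."""
    return sorted(term[::-1] for term in terms)


def leading_wildcard_lookup(reversed_index, suffix):
    """Find all terms matching *suffix by a prefix scan over the reversed terms."""
    prefix = suffix[::-1]
    # Binary search jumps straight to the first candidate instead of
    # scanning the whole dictionary.
    start = bisect_left(reversed_index, prefix)
    matches = []
    for reversed_term in reversed_index[start:]:
        if not reversed_term.startswith(prefix):
            break  # sorted order: no later entry can share the prefix
        matches.append(reversed_term[::-1])
    return matches


# Hypothetical sample terms, chosen only for this illustration.
terms = ["foo", "barfoo", "food", "snafoo", "bar"]
index = build_reversed_index(terms)
result = leading_wildcard_lookup(index, "foo")
```

Here `result` holds the terms ending in "foo" and excludes "food", and the scan touches only the matching entries plus one more.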

Re: SolrCloud replica and leader out of Sync somehow

2012-04-06 Thread Jamie Johnson
awesome Yonik. I'll indeed try this. Thanks! On Thu, Apr 5, 2012 at 10:20 AM, Yonik Seeley yo...@lucidimagination.com wrote: On Thu, Apr 5, 2012 at 12:19 AM, Jamie Johnson jej2...@gmail.com wrote: Not sure if this got lost in the shuffle, were there any thoughts on this? Sorting by id could

Re: Is there any performance cost of using lots of OR in the solr query

2012-04-06 Thread Shawn Heisey
On 4/5/2012 3:49 PM, Erick Erickson wrote: Of course putting more clauses in an OR query will have a performance cost, there's more work to do. OK, being a smart-alec aside, you will probably be fine with a few hundred clauses. The question is simply whether the performance hit is acceptable.

Re: counter field

2012-04-06 Thread Shawn Heisey
On 4/5/2012 1:53 AM, Manish Bafna wrote: Hi, Is it possible to define a field as a counter column which can be auto-incremented? Manish, where does your data come from? Can you add the autoincrement field to the data source? My data comes from MySQL, where the primary key is an

SolrCloud Zookeeper view does not work on latest snapshot

2012-04-06 Thread Jamie Johnson
I just downloaded the latest snapshot and fired it up to take a look around, and I'm getting the following error when looking at the Cloud view: Loading of undefined failed with HTTP-Status 404. The request I see going out is as follows: http://localhost:8501/solr/slice1_shard1/zookeeper?wt=json

Re: SolrCloud Zookeeper view does not work on latest snapshot

2012-04-06 Thread Jamie Johnson
I looked at our old system and indeed it used to make a call to /solr/zookeeper, not /solr/corename/zookeeper. I am making a change locally so I can run with this, but is this a bug or did I muck something up with my configuration? On Fri, Apr 6, 2012 at 9:33 AM, Jamie Johnson jej2...@gmail.com

Re: waitFlush and waitSearcher with SolrServer.add(docs, commitWithinMs)

2012-04-06 Thread Erick Erickson
You've got it. That's the post I was talking about, I was rushed and couldn't find it quickly... LucidWorks Enterprise uses a trunk version of Solr, so DWPT is in that code in 2.0. For Solr-only, you can just check out a trunk build. Best Erick On Thu, Apr 5, 2012 at 7:54 PM, Mike O'Leary

Re: Is there any performance cost of using lots of OR in the solr query

2012-04-06 Thread Erick Erickson
Shawn: Ahhh, so *that* was what your JIRA was about. Consider https://issues.apache.org/jira/browse/SOLR-2429 for your ACL calculations, that's what this was developed for. The basic idea is that you can write a custom filter that returns whether the document should be included in the

RE: upgrade solr from 1.4 to 3.5 not working

2012-04-06 Thread Robert Petersen
Note that I am trying to upgrade from the Lucid Imagination distribution of Solr 1.4, dunno if that makes a difference. We have an existing index of 11 million documents which I am trying to preserve in the upgrade process. -Original Message- From: Robert Petersen

RE: upgrade solr from 1.4 to 3.5 not working

2012-04-06 Thread Robert Petersen
OK I found in the tomcat documentation that I not only have to drop the war file into webapps but also have to delete the expanded version of the war that tomcat makes. Now tomcat doesn't find the velocity response writer which I seem to recall seeing some note about. I'll try to find that

Re: schema design question

2012-04-06 Thread Erick Erickson
I'd consider a field like associated_with_album, and a field that identifies the kind of record this is (track or album). Then you can form a query like -associated_with_album:true (where '-' is the Lucene NOT operator). And then group by kind to get separate groups of albums and tracks. Hope this
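
As a rough illustration of the suggestion above, the request parameters could be assembled like this. It is a sketch only: the field name `kind` is a hypothetical stand-in for "a field that identifies the kind of record", while `group` and `group.field` are Solr's result-grouping parameters.

```python
from urllib.parse import urlencode

# Parameters for a Solr select request (sketch; field names are assumptions).
params = {
    "q": "-associated_with_album:true",  # exclude records flagged as associated with an album
    "group": "true",                     # turn on result grouping
    "group.field": "kind",               # hypothetical field holding "track" or "album"
}

# URL-encoded query string that would be appended to the select handler's URL.
query_string = urlencode(params)
```

The follow-up in this thread points out that this excludes every track associated with an album, not only tracks whose album also matches the query.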

Re: A little onfusion with maxPosAsterisk

2012-04-06 Thread neosky
great! thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/A-little-onfusion-with-maxPosAsterisk-tp3889226p3890776.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: schema design question

2012-04-06 Thread Neal Tucker
Thanks, but I don't want to exclude all tracks that are associated with albums, I want to exclude tracks that are associated with albums *which match the query* (tracks and their associated albums may have different tags). I don't think your suggestion covers that. On Fri, Apr 6, 2012 at 9:35

SolrEntityProcessor Configuration Problem

2012-04-06 Thread michael . kroh
Dear all, I'm facing a problem with SolrEntityProcessor, when having it configured under a JDBC Datasource. My configuration looks like this: entity name=V_MARKET_STUDIES datasource=jdbc-2 query=select * from V_MARKET_STUDIES transformer=ClobTransformer field column=ID

solr analysis-extras configuration

2012-04-06 Thread N. Tucker
Hello, I'm running into an odd problem trying to use ICUTokenizer under a solr installation running under tomcat on ubuntu. It seems that all the appropriate JAR files are loaded: INFO: Adding 'file:/usr/share/solr/lib/lucene-stempel-3.5.0.jar' to classloader INFO: Adding

Re: SolrCloud Zookeeper view does not work on latest snapshot

2012-04-06 Thread Ryan McKinley
There have been a bunch of changes getting the zookeeper info and UI looking good. The info moved from being on the core to using a servlet at the root level. Note, it is not a request handler anymore, so the wt=XXX has no effect. It is always JSON. ryan On Fri, Apr 6, 2012 at 7:01 AM, Jamie

Re: Solr dismax not returning expected results

2012-04-06 Thread dboychuck
Adding autoGeneratePhraseQueries=true to my field definitions has solved the problem.

Re: solr analysis-extras configuration

2012-04-06 Thread N. Tucker
Further info: I can make this work if I stay out of tomcat -- I download a fresh solr binary distro, copy those five JARs from 'dist' and 'contrib' into example/solr/lib/, copy my solrconfig.xml and schema.xml, and run 'java -jar start.jar', and it works fine. But trying to add those same JARs to

Re: SolrCloud Zookeeper view does not work on latest snapshot

2012-04-06 Thread Jamie Johnson
Thanks Ryan. So to be clear this is a bug then? I went into the cloud.js and changed the url used to access this information so that it would work, wasn't sure if it was kosher or not. On 4/6/12, Ryan McKinley ryan...@gmail.com wrote: There have been a bunch of changes getting the zookeeper

Re: SolrEntityProcessor Configuration Problem

2012-04-06 Thread Lance Norskog
The SolrEntityProcessor resolves all of its parameters at start time, not for each query. This technique cannot work. I filed it: https://issues.apache.org/jira/browse/SOLR-3336 On Fri, Apr 6, 2012 at 11:13 AM, michael.k...@basf.com wrote: Dear all, I'm facing a problem with

Re: schema design question

2012-04-06 Thread Lance Norskog
(albums:query OR tracks:query) AND NOT(tracks:query - albums:query) Is this it? That last clause does sound like a join. How do you shard? Is it possible to put all associated albums and tracks in one shard? You can then do a join query against each shard and merge the output yourself. On Fri,

Re: solr analysis-extras configuration

2012-04-06 Thread Lance Norskog
Tomcat needs an explicit parameter somewhere to use UTF-8 text. It's on the wiki how to do this. On Fri, Apr 6, 2012 at 4:41 PM, N. Tucker ntucker-ml-solr-us...@august20th.com wrote: Further info: I can make this work if I stay out of tomcat -- I download a fresh solr binary distro, copy those