Re: Can solr-langid(Solr3.5.0) detect multiple languages in one text?

2012-03-13 Thread bing
Hi, Tanguy, >For the other implementation ( >http://code.google.com/p/language-detection/ ), it seems to be >performing a first pass on the input, and tries to separate Latin >characters from the others. If there's more non-Latin characters than >Latin ones, then it will process the non-Latin c
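
The first pass described in the quoted text (separating Latin characters from the rest and comparing the counts) can be sketched roughly as follows. This is only an illustration of the idea, not the code of the language-detection library; the function names and the use of Unicode character names to recognise Latin letters are assumptions made for the example.

```python
import unicodedata


def script_counts(text):
    """Count alphabetic characters as Latin or non-Latin (illustrative heuristic)."""
    latin = other = 0
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, spaces and punctuation
        # Assumption: a letter is "Latin" if its Unicode name starts with LATIN.
        if unicodedata.name(ch, "").startswith("LATIN"):
            latin += 1
        else:
            other += 1
    return latin, other


def dominant_script(text):
    """Return 'non-latin' only when non-Latin letters outnumber Latin ones."""
    latin, other = script_counts(text)
    return "non-latin" if other > latin else "latin"


print(dominant_script("Solr 是一个搜索引擎"))
```

A mixed-language document would still be reduced to a single dominant script by a pass like this, which is the limitation the thread is asking about.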

Re: index size with replication

2012-03-13 Thread Li Li
Optimize will generate new segments and delete the old ones. If your master also serves searches during indexing, the old files may still be held open by an old SolrIndexSearcher; they will be deleted later. So during indexing the index size may double, but a moment later the old index files will be deleted.

Re: 400 Error adding field 'tags'='[a,b,c]'

2012-03-13 Thread Memory Makers
left. the $20 under the tile with the hand prints. thx On Tuesday, March 13, 2012, jlark wrote: > Interestingly I'm getting this on other fields now. > > I have the field stored="true" /> > > which is copied to text > > and my text field is simply indexed="true" stored="true" /> > > I

Re: 400 Error adding field 'tags'='[a,b,c]'

2012-03-13 Thread jlark
Interestingly I'm getting this on other fields now. I have the field which is copied to text and my text field is simply I'm feeding my test document {"url" : "TestDoc2", "title" : "another test", "ptag":["a","b"],"name":"foo bar"}, and when I try to feed it I get: HTTP request sent, a

Re: Solr Monitoring / Stats

2012-03-13 Thread Jan Høydahl
And here is a page on how to wire Solr's JMX info into OpenNMS monitoring tool. Have not tried it, but as soon as a collector config is defined once I'd guess it could be re-used, maybe shipped with Solr. http://www.opennms.org/wiki/JMX_Collector -- Jan Høydahl, search solution architect Cominve

Re: solr 3.5 and indexing performance

2012-03-13 Thread Jan Høydahl
Hi, Thanks a lot for your detailed problem description. It definitely is an error. Would you be so kind to register it as a bug ticket, including your descriptions from this email? http://wiki.apache.org/solr/HowToContribute#JIRA_tips_.28our_issue.2BAC8-bug_tracker.29. Also please attach to th

Re: 400 Error adding field 'tags'='[a,b,c]'

2012-03-13 Thread jlark
Hey, thanks for the lightning quick responses. Unfortunately no stack trace anywhere I can see. Where would you recommend I look? I think this is a very bad bug. I changed my field name from tag to ptag and now it's working fine! Thanks, Alp -- View this message in context: http://lucene.4

Re: 400 Error adding field 'tags'='[a,b,c]'

2012-03-13 Thread Yonik Seeley
I just changed the exception handling in trunk to hopefully produce better error messages when you don't have the full stack trace. Shot in the dark: is tags the source for any copyField commands in the schema? If so, make sure the targets are also multi-valued. -Yonik lucenerevolution.com - Luc
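
Yonik's suggestion can be checked mechanically. The following is a minimal sketch, assuming a schema.xml in the usual Solr 3.x layout with explicit `<field>` and `<copyField>` elements; the function name and the example schema are made up for illustration, and the check ignores dynamic fields, wildcard sources and any multiValued default set on a field type.

```python
import xml.etree.ElementTree as ET


def single_valued_copy_targets(schema_xml):
    """Return (source, dest) pairs where a multiValued source is copied
    into a dest field that is not declared multiValued."""
    root = ET.fromstring(schema_xml)
    # Map each explicitly declared field to its multiValued flag.
    multi = {
        f.get("name"): f.get("multiValued", "false") == "true"
        for f in root.iter("field")
    }
    problems = []
    for cf in root.iter("copyField"):
        src, dest = cf.get("source"), cf.get("dest")
        if multi.get(src, False) and not multi.get(dest, False):
            problems.append((src, dest))
    return problems


# Hypothetical schema, not the poster's actual one.
EXAMPLE_SCHEMA = """
<schema name="example" version="1.4">
  <fields>
    <field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>
    <field name="text" type="text" indexed="true" stored="true"/>
  </fields>
  <copyField source="tags" dest="text"/>
</schema>
"""

print(single_valued_copy_targets(EXAMPLE_SCHEMA))
```

Any pair it reports is a candidate for adding multiValued="true" to the destination field.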

Re: 400 Error adding field 'tags'='[a,b,c]'

2012-03-13 Thread Yonik Seeley
Hmmm, this looks like it's generated by DocumentBuilder with the code catch( Exception ex ) { throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "ERROR: "+getID(doc, schema)+"Error adding field '" + field.getName() + "'='" +field.getValue()+"'", e

index size with replication

2012-03-13 Thread Mike Austin
I have a master with two slaves. For some reason, if I do an optimize on the master after indexing, the index doubles in size from 42 MB to 90 MB. However, when the slaves replicate they get the 42 MB index. Should the master and slaves always be the same size? Thanks, Mike

400 Error adding field 'tags'='[a,b,c]'

2012-03-13 Thread jlark
Hey Folks, I'm new to lucene/solr so pardon my lack of knowledge. I'm trying to feed some json to my solr instance through wget. I'm using the command wget 'http://localhost:8983/solr/update/json?commit=true' --post-file=itemsExported.json --header='Content-type:application/json' however the r
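
For comparison, the same POST can be built without wget. This is a sketch only: the URL and header are taken from the command above, while the helper name, the sample document and the top-level list-of-documents payload shape are assumptions for illustration.

```python
import json
import urllib.request


def build_update_request(docs, base_url="http://localhost:8983/solr"):
    """Build (but do not send) a JSON update request like the wget call above."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        base_url + "/update/json?commit=true",
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )


# Hypothetical document; 'tags' is sent as a JSON array, so the schema
# field receiving it must be multiValued.
req = build_update_request([{"url": "TestDoc1", "tags": ["a", "b", "c"]}])
print(req.get_method(), req.full_url)

# Sending it needs a running Solr instance:
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status)
```

Building the request separately from sending it makes it easy to inspect the exact body that Solr will receive when a 400 comes back.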

Are multi-value poly fields possible in Solr?

2012-03-13 Thread john.duprey
Are indexed multi-value poly fields possible in Solr? If so, where can I find an example? Thank you, -John

Re: Strange behavior with search on empty string and NOT

2012-03-13 Thread Lan
Would it be a good idea to have Solr throw a syntax error if an empty string query occurs? -- View this message in context: http://lucene.472066.n3.nabble.com/Strange-behavior-with-search-on-empty-string-and-NOT-tp3818023p3823572.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: schema design help

2012-03-13 Thread Gora Mohanty
On 13 March 2012 18:19, Abhishek tiwari wrote: [...] > My one more concern, > Though Establishments, Events, Movies are not related to each other, > > I have to make 3 search queries to their independent cores and club the > data to show, will that affect my relevancy? Yes, it will. The results f

Re: solritas 'timestamp' parameter in call to /terms

2012-03-13 Thread jmlucjav
I suspected it was to avoid caching, but I thought what was the harm of caching at http level taking place if it's just suggestions, I would say it would be even better. So I can remove it... thanks -- View this message in context: http://lucene.472066.n3.nabble.com/solritas-timestamp-parameter

Re: Solr Monitoring / Stats

2012-03-13 Thread Mark Miller
There are jmx plugins for most monitoring tools and solr exposes many jmx stats - and the jvm does the same. http://wiki.apache.org/solr/SolrJmx - Mark Miller lucidimagination.com On Mar 13, 2012, at 5:07 AM, Alex Leonhardt wrote: > Hi All, > > I was wondering if anyone knows of a free tool

Re: solritas 'timestamp' parameter in call to /terms

2012-03-13 Thread Erik Hatcher
It's just a function of the jquery suggest component being used, if I recall correctly, to ensure that HTTP caching doesn't get involved since the request changes by the timestamp for every request. I imagine it can be "safely" (at the risk of getting cached results, I suppose) removed.
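
The cache-busting effect Erik describes can be illustrated with a small sketch: adding a changing 'timestamp' value makes every request URL unique, so an HTTP cache never sees the same URL twice. The helper name, the host and the field name "name" are assumptions for the example; terms.fl and terms.prefix are the usual TermsComponent parameters.

```python
import time
from urllib.parse import urlencode


def terms_url(prefix, base="http://localhost:8983/solr/terms",
              bust_cache=True, now=None):
    """Build a /terms autocomplete URL, optionally with a cache-busting timestamp."""
    params = {"terms.fl": "name", "terms.prefix": prefix}  # "name" is a hypothetical field
    if bust_cache:
        current = now if now is not None else time.time()
        params["timestamp"] = int(current * 1000)  # milliseconds, changes per request
    return base + "?" + urlencode(params)


# With the timestamp, two requests for the same prefix get different URLs.
print(terms_url("so", now=1.0))
print(terms_url("so", now=2.0))
# Without it, the URL is stable and therefore cacheable.
print(terms_url("so", bust_cache=False))
```

Dropping the parameter only changes whether intermediaries may serve a cached response; Solr itself does not need it.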

Re: Replication with different schema

2012-03-13 Thread Erick Erickson
Why would you want to? This seems like an XY problem, see: http://people.apache.org/~hossman/#xyproblem See the "confFiles" section here: http://wiki.apache.org/solr/SolrReplication although it mentions solrconfig.xml, it might work with schema.xml. BUT: This strikes me as really, really dangerou

solritas 'timestamp' parameter in call to /terms

2012-03-13 Thread jmlucjav
Hi, I am studying Solritas, with its browse UI, which comes in the 3.5.0 example. I have noticed that the calls to /terms to get autocompletion terms have a 'timestamp' parameter. What is it for? I did not find any such param in the Solr docs. Can it safely be removed? thanks -- View this message i

Re: Replication :field wise

2012-03-13 Thread Erick Erickson
If by that you mean replicating just a single field of the index in a master/slave setup, no, you can't. Nor can you replicate only the changed fields in an index; Lucene isn't structured that way. Otherwise, can you provide more background on what you're hoping for here? Your question was rath

RE: solr 3.5 and indexing performance

2012-03-13 Thread Agnieszka Kukałowicz
Hi, I did some more tests for Hunspell in solr 3.4, 4.0: Solr 3.4, full import 489017 documents: StempelPolishStemFilterFactory - 2908 seconds, 168 docs/sec HunspellStemFilterFactory - 3922 seconds, 125 docs/sec Solr 4.0, full import 489017 documents: StempelPolishStemFilterFactory - 3016 sec
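
The docs/sec figures quoted for Solr 3.4 follow directly from the document count and the elapsed time. A short worked check, using only the numbers given above (the variable names are made up for the example):

```python
DOCS = 489017  # documents in the full import


def docs_per_sec(docs, seconds):
    """Average indexing throughput over a full import."""
    return docs / seconds


stempel = docs_per_sec(DOCS, 2908)   # StempelPolishStemFilterFactory, Solr 3.4
hunspell = docs_per_sec(DOCS, 3922)  # HunspellStemFilterFactory, Solr 3.4
slowdown = 3922 / 2908 - 1           # extra wall-clock time with Hunspell

print(round(stempel), round(hunspell), f"{slowdown:.0%}")
```

So in this Solr 3.4 run the Hunspell import took roughly a third longer than the Stempel one.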

RE: Using multiple DirectSolrSpellcheckers for a query

2012-03-13 Thread Dyer, James
Nalini, You're correct that "spellcheck.q" does not run through the SpellingQueryConverter, so the workaround I suggest might be half-baked. What if when using "maxCollationTries" to have it check the collations against the index, you also had the ability to override both "mm" and "qf"? Then

Re: Highlighting a font without bold or italic modes

2012-03-13 Thread Robert Muir
Google and Baidu highlight Chinese queries by making text red. On Mon, Mar 12, 2012 at 11:50 PM, Lance Norskog wrote: > How do you highlight terms in languages without boldface or italic > modes? Maybe raise the text size a couple of sizes just for that word? > > > -- > Lance Norskog > goks...@gm

Re: schema design help

2012-03-13 Thread Abhishek tiwari
Hi Gora, Thanks. One more concern: although Establishments, Events and Movies are not related to each other, I have to make 3 search queries to their independent cores and combine the data to show. Will that affect my relevancy? There is movie with title "Striker" and Establishment with title "Striker

Re: Trouble indexing word documents

2012-03-13 Thread rdancy
Yes, I do have that one, as well as a bunch of other jars. I moved the lucidworks-solr-cell-3.2.0_01.jar to my classpath, I also placed it in /contrib/extraction. I restarted Solr and tried to index the document again and this is the result: "myfile=@troubleshooting_performance.doc" Apache Tomcat/

Re: Can solr-langid(Solr3.5.0) detect multiple languages in one text?

2012-03-13 Thread Tanguy Moal
Hi all, I think that depending on the language detector implementation, things may vary... For Tika, it performs better with longer inputs than shorter ones (as it seems to depend on the probabilistic distribution of ngrams -- of different sizes -- to perform distance computations with precomput

Re: Solr Monitoring / Stats

2012-03-13 Thread Alex Leonhardt
Hi there, Yes, I know about that tool; however, we've decided that it's not optimal for us, so I'm looking for something freely available. Alex On 03/13/2012 09:15 AM, Rafał Kuć wrote: Hello Alex! Right now, SPM from Sematext is free to use so You can try that out :)

RE: solr 3.5 and indexing performance

2012-03-13 Thread Agnieszka Kukałowicz
Hi, Yes, I confirmed that without Hunspell indexing has normal speed. I did tests in Solr 4.0 with Hunspell and PolishStemmer. With StempelPolishStemFilterFactory the speed is normal. My schema is quite simple. For Hunspell I have one text field I copy 14 text fields to: ""

Re: Can solr-langid(Solr3.5.0) detect multiple languages in one text?

2012-03-13 Thread bing
Hi, Jan Høydahl, Forgot to mention, the identifier I use is an existing one wrapped in Solr 3.5.0, LangDetectLanguageIdentifier (http://wiki.apache.org/solr/LanguageDetection). For the language identifier, I looked into the sc, and found that the whole content of a text is parsed before detect

Re: Solr Monitoring / Stats

2012-03-13 Thread Rafał Kuć
Hello Alex! Right now, SPM from Sematext is free to use so you can try that out :) -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > Hi All, > I was wondering if anyone knows of a free tool to use to monitor > multiple Solr hosts under one roof ? I found some

Solr Monitoring / Stats

2012-03-13 Thread Alex Leonhardt
Hi All, I was wondering if anyone knows of a free tool to monitor multiple Solr hosts under one roof? I found some non-functioning cacti & munin trial implementations but would really like more direct statistics of the JVM itself + all Solr cores (e.g. requests/s, etc.). Does anyon

Re: Can solr-langid(Solr3.5.0) detect multiple languages in one text?

2012-03-13 Thread Jan Høydahl
Hi, Language detection cannot do that as of now. It would be a great improvement though. Language detectors are pluggable, perhaps if you know of a Java language detector which can do this we could plug it in? Or we could extend the current identifier with a capability of first splitting the te

Re: solr 3.5 and indexing performance

2012-03-13 Thread Jan Høydahl
Hi, Have you confirmed that disabling Hunspell in solrconfig gets you back to normal speed? What Hunspell configuration and dictionaries do you have? Can you share more about your environment and documents? Do you have a chance to run a profiler on your Solr instance? Try e.g. VisualVM and run t

Replication with different schema

2012-03-13 Thread syed kather
Team, Is it possible to do replication with a different schema in Solr? If not, how can I achieve this? Can anyone give an idea of how to do this? Thanks in advance. Thanks and Regards, S SYED ABDUL KATHER

Re: List of recommendation engines with solr

2012-03-13 Thread Paul Libbrecht
Just out of curiosity, does Mahout qualify as a recommender-engine, or is it rather a library for it with (potentially open-source) recommenders built on it, with a more specific purpose? The page: https://cwiki.apache.org/MAHOUT/powered-by-mahout.html does not seem to list many open-s

Re: Highlighting a font without bold or italic modes

2012-03-13 Thread Markus Jelsma
I would first attempt to underline or assign another colour if the scheme allows for it before increasing font size. On Mon, 12 Mar 2012 20:50:15 -0700, Lance Norskog wrote: How do you highlight terms in languages without boldface or italic modes? Maybe raise the text size a couple of sizes j