RE: Solr SpellCheck on Query Field

2012-11-09 Thread SolrCarinthia
Correct me if I am wrong, but wouldn't collation return alternate terms against the master dictionary field? So if I were to take a collated term and run a query for that term against a specific field (say First Name), I am not guaranteed to get back results since that term could actually have

RE: Skewed IDF in multi lingual index

2012-11-09 Thread Markus Jelsma
Robert, Tom, That's it indeed! Using maxDoc as the numerator, as opposed to docCount, yields very skewed results for an unevenly distributed multi-lingual index. We have one language dominating the other twenty so the dominating language contains no rare terms compared to the others. We're now
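
The skew described above can be illustrated with a small calculation. This is only a sketch: it assumes the classic Lucene-style formula idf = 1 + ln(N / (docFreq + 1)), and the document counts below are made-up numbers, not figures from the thread.

```python
import math


def idf(doc_freq, num_docs):
    """Classic Lucene-style IDF: 1 + ln(num_docs / (doc_freq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical index: 1,000,000 documents in total (maxDoc), of which only
# 1,000 have a value in a minority-language field (docCount for that field).
MAX_DOC = 1_000_000
DOC_COUNT_MINORITY = 1_000

# A term found in half of the minority-language documents is common within
# that language, yet it looks rare when measured against maxDoc.
doc_freq = 500
idf_max_doc = idf(doc_freq, MAX_DOC)
idf_doc_count = idf(doc_freq, DOC_COUNT_MINORITY)

print(f"idf with maxDoc as numerator:   {idf_max_doc:.2f}")
print(f"idf with docCount as numerator: {idf_doc_count:.2f}")
```

With maxDoc as the numerator, every term of a small language receives an inflated IDF, which matches the skew reported in the message.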

Re: Testing Solr Cloud with ZooKeeper

2012-11-09 Thread Erick Erickson
You have to have at least one node per shard running for SolrCloud to function. So when you bring down all nodes and start one, then you have some shards with no live nodes and SolrCloud goes into a wait state. Best, Erick On Thu, Nov 8, 2012 at 6:17 PM, darul daru...@gmail.com wrote: Is it

Re: NullPointerException when debugQuery=true

2012-11-09 Thread Erick Erickson
If this went away when you made your id field into a string type rather than analyzed then it's probably not worth a JIRA... Erick On Thu, Nov 8, 2012 at 11:39 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Looks like a bug. If Solr 4.0, maybe this needs to be in JIRA along with

Re: Solr4.0 / SolrCloud queries

2012-11-09 Thread Erick Erickson
You really should be careful about optimizing; it's generally not needed. And optimizing is almost always wrong when done after every N documents in a batch process. Do it at the very end or not at all. Optimize essentially rewrites the entire index into a single segment, so you're copying

Re: [SOLR-2549] DIH LineEntityProcessor support for delimited fixed-width files

2012-11-09 Thread zakaria benzidalmal
Hi James, Yes, it was this parameter that made the request fail. I've edited the patch and added the new version to JIRA. Thank you. 2012/11/7 Dyer, James james.d...@ingramcontent.com Try specifying the escape parameter. This is the character your file uses to escape delimiters occurring

Re: My latest solr blog post on Solr's PostFiltering

2012-11-09 Thread Erick Erickson
It's always good when someone writes up their experiences! But when I try to follow that link, I get to your Random Writings, but it tells me that the blog post doesn't exist... Erick On Thu, Nov 8, 2012 at 4:21 PM, Amit Nithian anith...@gmail.com wrote: Hey all, I wanted to thank those

Re: My latest solr blog post on Solr's PostFiltering

2012-11-09 Thread Dmitry Kan
I guess the URL should have been: http://hokiesuns.blogspot.com/2012/11/using-solrs-postfiltering-to-collect.html i.e. without 'and' at the end of it. -- Dmitry On Fri, Nov 9, 2012 at 12:03 PM, Erick Erickson erickerick...@gmail.com wrote: It's always good when someone writes up their

Patch Needed for Issue Solr-3790

2012-11-09 Thread mechravi25
Hi All, I'm using Solr version 3.6.1. For the issue given in the following URL, there is no patch file provided: https://issues.apache.org/jira/browse/SOLR-3790 Can you tell me if there is a patch file for the same? Also, we noticed that the below URL had the changes that had to be done to

Re: Testing Solr Cloud with ZooKeeper

2012-11-09 Thread darul
- Shards : 2 - ZooKeeper Cluster : 3 - One collection. Here is how I run it and my scenario case: In first console, I get first Node (first Shard) running on port 8983: In second console, I get second Node (second Shard) running on port 8984: Here I get just 2 nodes for my 2 shards

Re: Testing Solr Cloud with ZooKeeper

2012-11-09 Thread ku3ia
Hi, I have nearly the same problems with cloud state; see http://lucene.472066.n3.nabble.com/Replicated-zookeeper-td4018984.html

newSearcher event

2012-11-09 Thread Dzmitry Petrushenka
Hi All! Solr provides support for newSearcher events. But those are dispatched before the real searcher becomes the current one. Is it possible to add some code that would be called whenever the new searcher starts to serve requests? Thanks,

Re: Patch Needed for Issue Solr-3790

2012-11-09 Thread Koji Sekiguchi
(12/11/09 19:20), mechravi25 wrote: Hi All, I'm using Solr version 3.6.1. For the issue given in the following URL, there is no patch file provided: https://issues.apache.org/jira/browse/SOLR-3790 Can you tell me if there is a patch file for the same? Also, we noticed that the below URL had the

sort on wild card query not working in solr 3.6

2012-11-09 Thread Doug Kunzman
Hi - We are using SOLR 3.6 and have noticed that when the start parameter is a very large number SOLR's performance is rather slow. After looking at our schema I was hoping to speed up SOLR performance by using a sort order since it could be on an index column. This hasn't worked. I was

Re: Testing Solr Cloud with ZooKeeper

2012-11-09 Thread darul
Yes ku3ia, I read your thread yesterday and it looks like we have the same issue. I hope ApacheCon is nearly finished and an expert can resolve this. Thanks again to the Solr community, Jul

Re: sort on wild card query not working in solr 3.6

2012-11-09 Thread Ahmet Arslan
Hi Doug, Retrieval engines are not designed for deep paging (a very large start parameter). https://issues.apache.org/jira/browse/SOLR-1726 And your sort syntax is wrong: sort:id should be sort=id asc --- On Fri, 11/9/12, Doug Kunzman dkunz...@usgs.gov wrote: From: Doug Kunzman
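
The corrected sort syntax can be sketched as a request parameter rather than part of the query text. The snippet below is a minimal illustration, not code from the thread: the helper name `build_select_query` and the query `name:foo*` are hypothetical, and only the `sort=id asc` form comes from the reply above.

```python
from urllib.parse import urlencode


def build_select_query(q, sort_field, direction="asc", start=0, rows=10):
    """Build the query string for a Solr select request.

    The sort is passed as its own parameter ("<field> <direction>"),
    not embedded in q as "sort:<field>".
    """
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    params = {
        "q": q,
        "sort": f"{sort_field} {direction}",
        "start": start,
        "rows": rows,
    }
    return urlencode(params)


# Hypothetical wildcard query sorted by the id field.
query_string = build_select_query("name:foo*", "id")
print(query_string)
```

Keeping `start` small matters as well: as the reply notes, a very large start value means deep paging, which this sketch does nothing to make cheaper.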

Re: custom request handler

2012-11-09 Thread Lee Carroll
Hi Amit, I did not do this via a servlet filter as I wanted the Solr devs to be concerned with Solr config and keep them out of any concerns of the container. By specifying declarative data in a request handler that would be enough to produce a service URI for an application. Or have I missed a

SolrZKClient changed interface

2012-11-09 Thread Trym R. Møller
Hi, The constructor of SolrZKClient has changed, I expect in order to ensure cleanup of resources. The strategy is as follows: connManager = new ConnectionManager(...) try { ... } catch (Throwable e) { connManager.close(); throw new RuntimeException(); } try {

Distributed Search (shards) not working with /terms request handler

2012-11-09 Thread Daniel Baur
Hi all, I am using the /terms request handler defined in the default configuration with Solr 3.6.1: <requestHandler name="/terms" class="solr.SearchHandler" startup="lazy"> <lst name="defaults"> <bool name="terms">true</bool> </lst> <arr name="components"> <str>terms</str> </arr>

RE: DIH nested entities don't work

2012-11-09 Thread Dyer, James
Here are things I would try: - You need to package the patch from SOLR-2943 in your jar as well as SOLR-2613 (to get the class DIHCachePersistCacheProperties) - You need to specify cacheImpl, not persistCacheImpl - You are correct using persistCacheName and persistCacheBaseDir, contra the test

RE: Solr SpellCheck on Query Field

2012-11-09 Thread Dyer, James
What I'm saying is if you specify spellcheck.maxCollationTries, it will run the suggested query against the index for you and only return valid re-written queries. That is, a misspelled firstname will be replaced with a valid firstname; a misspelled lastname will be replaced with a valid
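
A request using this setting might be assembled as sketched below. This is an assumption-laden illustration: the field name `firstname`, the misspelled value, and the value 5 for the number of tries are hypothetical; `spellcheck`, `spellcheck.collate`, and `spellcheck.maxCollationTries` are standard Solr spellcheck parameters.

```python
from urllib.parse import urlencode


def spellcheck_params(user_query, max_collation_tries=5):
    """Request parameters asking Solr to verify collations against the index.

    With spellcheck.maxCollationTries > 0, Solr tests candidate collations
    and only returns re-written queries that would produce hits.
    """
    return {
        "q": user_query,
        "spellcheck": "true",
        "spellcheck.collate": "true",
        "spellcheck.maxCollationTries": max_collation_tries,
    }


# Hypothetical fielded query with a misspelled first name.
params = spellcheck_params("firstname:jhon")
query_string = urlencode(params)
print(query_string)
```

Because the collation is tested as a whole query, a suggestion is only returned if it matches in the field actually being searched, which addresses the concern raised at the start of this thread.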

Re: SolrZKClient changed interface

2012-11-09 Thread Per Steffensen
Hi Trym, I believe one of the reasons that they started throwing RuntimeExceptions instead of UnknownHostException, TimeoutException, etc. is that the method signature has changed to not have a throws clause. They probably do not want to deal with those checked exceptions. I'm not sure I completely

Re: Using AnalyzingQueryParser - Solr 4.0

2012-11-09 Thread balaji.gandhi
Hi Jack, We have an email field defined like this: <fieldType name="text_email" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter

Re: My latest solr blog post on Solr's PostFiltering

2012-11-09 Thread Amit Nithian
Oh weird. I'll post URLs on their own lines next time to clarify. Thanks guys and looking forward to any feedback! Cheers Amit On Fri, Nov 9, 2012 at 2:05 AM, Dmitry Kan dmitry@gmail.com wrote: I guess the url should have been:

Re: Error with SolrCloud

2012-11-09 Thread Tomás Fernández Löbbe
Are you sure you are pointing to the correct conf directory? Sounds like you are missing the collection name in the path (maybe it should be ../solr/YOURCOLLECTIONNAME/conf?) On Fri, Nov 9, 2012 at 1:58 PM, Carlos Alexandro Becker caarl...@gmail.com wrote: I started my JBoss server with the

Re: Error with SolrCloud

2012-11-09 Thread Carlos Alexandro Becker
Actually, I want to use it with multiple cores, and my app dynamically adds cores to Solr. So, my solr.xml looks like this: <?xml version="1.0" encoding="UTF-8" ?> <solr persistent="false"> <cores defaultCoreName="collection1" adminPath="/admin/cores" zkClientTimeout="${zkClientTimeout:15000}"

Re: custom request handler

2012-11-09 Thread Amit Nithian
Lee, I guess my question was if you are trying to prevent the big bad world from doing stuff they aren't supposed to in Solr, how are you going to prevent the big bad world from POSTing a delete all query? Or restrict them from hitting the admin console, looking at the schema.xml, solrconfig.xml.

Re: Using AnalyzingQueryParser - Solr 4.0

2012-11-09 Thread Jack Krupansky
Maybe you just want to use the whitespace tokenizer - the standard tokenizer treats the at-sign as if it were a space. See: http://lucene.apache.org/core/4_0_0/analyzers-common/org/apache/lucene/analysis/core/WhitespaceTokenizerFactory.html Or, you could use the classic tokenizer which does keep

Re: Solr4.0 / SolrCloud queries

2012-11-09 Thread shreejay
Thanks Erick. I will try optimizing after indexing everything. I was doing it after every batch since it was taking way too long to optimize (which was expected), but it was not finishing merging it into a smaller number of segments (1 segment). Instead of doing an optimize, I have now changed the

Re: Error with SolrCloud

2012-11-09 Thread Tomás Fernández Löbbe
I think you have to use either bootstrap_conf=true or bootstrap_confdir=/path/to/conf+collection.configName=foo (not both at the same time). If you use the first one, Solr will upload the configuration for all the cores that you have configured (with the name of the core as name of the

Re: Error with SolrCloud

2012-11-09 Thread Carlos Alexandro Becker
Hi Tomás, thanks for your help. I changed the start command to: JAVA_OPTS="-DzkRun -DnumShards=2 -Dbootstrap_conf=true -Xmx2048m -XX:MaxPermSize=512m" ./standalone.sh Then, I tried to add a new core like this: http://localhost:8080/ecm-indexer/admin/collections?action=CREATE&name=2&numShards=2

Re: Error with SolrCloud

2012-11-09 Thread Tomás Fernández Löbbe
Also, JBoss AS uses Tomcat, right? You may want to look at Mark Miller's comments here: http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201210.mbox/%3ccabcj++j+am6e0ghmm+hpzak5d0exrqhyxaxla6uutw1yqae...@mail.gmail.com%3E On Fri, Nov 9, 2012 at 4:30 PM, Tomás Fernández Löbbe

Re: Error with SolrCloud

2012-11-09 Thread Carlos Alexandro Becker
Hi, about the port, that's my mistake, I have the wrong port specified in solr.xml. But, now, I got the following error: 17:37:10,358 WARN [com.datasul.technology.webdesk.indexer.engine.IndexerSearchEngine] (http--0.0.0.0-8080-6) Fail uptading indexer synonyms/stopwords list. 17:37:10,378 INFO

4.0 query question

2012-11-09 Thread dm_tim
Howdy, I have a Solr query that is almost perfect: http://localhost:8080/apache-solr-4.0.0/v3_tag_core/select?q=tag%3A%22coat%22%5E4+%22coat%22+cid%3A136+&sort=score+desc&rows=10&fl=id+tag+cid+file_version+lang+score&wt=json&indent=true&debugQuery=true It's grabbing data that includes the fields: id,

Re: Error with SolrCloud

2012-11-09 Thread Tomás Fernández Löbbe
I thought it was possible to upload a new configuration when creating a new collection through the Collections API, but it looks like the CREATE action only takes: replicationFactor, name, collection.configName, numShards. I think this means that you'll have to use an existing configuration (already

Re: Error with SolrCloud

2012-11-09 Thread Carlos Alexandro Becker
Hm, OK, I've just left work now; next week I'll try what you suggest and give you feedback. Meanwhile, thank you very much for your help. On Fri, Nov 9, 2012 at 6:30 PM, Tomás Fernández Löbbe tomasflo...@gmail.com wrote: I thought it was possible to upload a new configuration when

Re: 4.0 query question

2012-11-09 Thread dm_tim
I think I may have found my answer but I'd like additional validation: I believe that I can add a function to my query to get only the highest values of 'file_version' like this - _val_:max(file_version, 1) I seem to be getting the results I want. Does this look correct? Regards, Tim

Re: Error with SolrCloud

2012-11-09 Thread Mark Miller
Yeah, if you want to use a new config set when you dynamically create a new collection, you must first upload the new config set. It's pretty easy using the cloud-scripts/zkcli.sh|bat scripts. If someone likes the idea of being able to point to a new config set to upload when using the

Re: SolrZKClient changed interface

2012-11-09 Thread Mark Miller
Please file a JIRA issue for this change. - Mark On Nov 9, 2012, at 8:41 AM, Trym R. Møller t...@sigmat.dk wrote: Hi The constructor of SolrZKClient has changed, I expect to ensure clean up of resources. The strategy is as follows: connManager = new ConnectionManager(...) try { ...

Re: Collections limit in SolrCloud aka best to use single index, SOLR-1293

2012-11-09 Thread Mark Miller
Have you looked at your logs? I think at around 1000 collections, the clusterstate.json node will become too large for zookeeper by default. It has a default limit of 1MB per node - you should be able to raise/override that limit with a sys prop or something when starting zookeeper. I can't

customize solr search/scoring for performance

2012-11-09 Thread jchen2000
Hi, we have 20 million short docs (about 60 terms, less than 1k in total bytes each) on each box, and we wanted to rank results based on how many terms got matched only. In particular we are only interested in top N with best scores (say a small number like 5). With some help from the forum
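
The ranking idea in this message (score equals the number of matched query terms, keep only the top N) can be sketched in plain Python. This is not Solr or Lucene code and says nothing about how it would be wired into Solr; the function name and the sample documents are hypothetical.

```python
import heapq


def top_n_by_matched_terms(query_terms, docs, n=5):
    """Return the n best (matched_term_count, doc_id) pairs.

    Term frequency and IDF are ignored; only the number of distinct query
    terms present in a document counts. Documents with no match are dropped.
    """
    query = set(query_terms)
    scored = ((len(query & set(terms)), doc_id) for doc_id, terms in docs.items())
    # nlargest keeps a bounded heap, so the full result list is never sorted.
    return heapq.nlargest(n, (pair for pair in scored if pair[0] > 0))


# Hypothetical short documents, keyed by document id.
docs = {
    "d1": ["red", "wool", "coat"],
    "d2": ["blue", "coat"],
    "d3": ["green", "hat"],
}
results = top_n_by_matched_terms(["red", "coat"], docs, n=2)
print(results)
```

Ties on the match count are broken by comparing the document ids, since the pairs are compared as tuples.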

Re: Solr4.0 / SolrCloud queries

2012-11-09 Thread Mark Miller
On Nov 9, 2012, at 1:20 PM, shreejay shreej...@gmail.com wrote: Instead of doing an optimize, I have now changed the Merge settings by keeping a maxBuffer = 960, a merge Factor = 40 and ConcurrentMergePolicy. Don't you mean ConcurrentMergeScheduler? Keep in mind that if you use the default

Re: Apache Nutch 1.5.1 + Apache Solr 4.0

2012-11-09 Thread John Whelan
Hi, A while back, I had the same 'problem'. After solving it for myself, I built and distributed a combination of Solr and Nutch into a pre-configured environment. While what I did was specific to Windows (I included Cygwin in the distribution, and a bunch of other stuff for easy administration