Re: Apache Lucene Eurocon 2012

2012-03-08 Thread Vadim Kisselmann
Hi Chris, thanks for your response. OK, we will wait :) Best Regards Vadim 2012/3/8 Chris Hostetter hossman_luc...@fucit.org : where and when is the next Eurocon scheduled? : I read something about Denmark and autumn 2012 (I don't know where *g*). I do not know where, but sometime in the

RE: Custom Sharding on solrcloud

2012-03-08 Thread Phil Hoy
Hi, If I remove the DistributedUpdateProcessorFactory I will have to manage a master/slave setup myself by updating solely to the master and replicating to any slave. I wonder whether it is possible to have distributed updates but confined to the subset of cores and replicas within a collection that

Re: indexing cpu utilization

2012-03-08 Thread Gora Mohanty
On 8 March 2012 15:39, gabriel shen xshco...@gmail.com wrote: Hi, I noticed that sequential indexing on 1 Solr core is only using 40% of our 8 virtual core CPU power. Why doesn't it use 100% of the power? Is there a way to increase the CPU utilization rate? [...] This is an open-ended question

Re: indexing cpu utilization

2012-03-08 Thread gabriel shen
Our indexing process is to add a bundle of Solr documents (for example 5000) to Solr each time, and we observed that before committing (which might be I/O bound) it constantly uses less than half the CPU capacity. It seems strange to us that it doesn't use full CPU power. As for RAM, I don't

Moving from Multiple webapps to Multi Cores -Solr 1.3

2012-03-08 Thread Sujatha Arun
Hello All, While prototyping a move from Solr multiple webapps to Solr multi-cores [version 1.3 for both], I am running into the following issues and questions. 1) We are primarily moving to multicore because we saw the PermGen memory increase each time we created a new Solr webapp, so

Re: How to exactly match fields which are multi-valued?

2012-03-08 Thread Erick Erickson
You haven't really given us much to go on here. Matches are just like a single valued field with the exception of the increment gap. Say one entry were large cat big dog in a multi-valued field. Say the next document indexed two values, large cat big dog And, say the increment gap were 100. The

Re: Question about Streaming Update Solr Server

2012-03-08 Thread Anderson vasconcelos
Could anyone reply to these questions? Thanks 2012/3/5 Anderson vasconcelos anderson.v...@gmail.com Hi I have some questions about StreamingUpdateSolrServer. 1) What is the queue size parameter? Is it the number of documents in each thread? 2) When I configured it like this

Re: solr geospatial / spatial4j

2012-03-08 Thread Erick Erickson
Yes, there are trunk nightly builds, see: https://builds.apache.org//view/S-Z/view/Solr/job/Solr-trunk/ But I don't think LSP is in trunk at this point, so that's not useful. The code branch is on (I think) http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_3795_ls_spatial_playground but

Re: How to limit the number of open searchers?

2012-03-08 Thread Erick Erickson
Ah, you're right. If your queries run across several commits you'll get multiple searchers open. I don't know of any good way to do what you want. I'm curious, why can't you do a master/slave setup? The other thing to think about would be the NRT stuff if you can run trunk. Best Erick On Wed,

Re: indexing cpu utilization

2012-03-08 Thread Tanguy Moal
How are you sending documents to Solr? If you push Solr input documents via HTTP (which is what SolrJ does), you could increase CPU consumption (and therefore reduce indexing time) by sending your update requests asynchronously, using multiple updating threads, to your single Solr core.
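The suggestion above (several updating threads feeding one core) might be sketched roughly as follows. This is only an illustration under stated assumptions: `send_batch` is a hypothetical stand-in for whatever actually posts a batch to the core's update handler, and the batch size and worker count are arbitrary values, not recommendations from this thread.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(docs, size):
    """Split the document list into consecutive batches of at most `size`."""
    return [docs[i:i + size] for i in range(0, len(docs), size)]


def send_batch(batch):
    """Hypothetical stand-in for one update request to the Solr core.

    A real client would POST the batch to the core's update handler here;
    this stub only reports how many documents it was given.
    """
    return len(batch)


def index_concurrently(docs, batch_size=500, workers=4, sender=send_batch):
    """Send batches from several threads and return the total docs sent."""
    batches = chunk(docs, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps up to `workers` requests in flight at once
        return sum(pool.map(sender, batches))


# Example: the 5000-document bundle mentioned earlier in the thread.
docs = [{"id": str(i)} for i in range(5000)]
total = index_concurrently(docs)
```

Whether this helps in practice depends on where the bottleneck is; if the client or the network is the limiting factor, more threads on the sending side will not raise CPU use on the Solr side.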

Re: Stemmer Question

2012-03-08 Thread Ahmet Arslan
I was previously using the PorterStemmer to do stemming and ran into an issue where it was overly aggressive with some words or abbreviations which I needed to stop.  I have recently switched to KStem and I believe the issue is less, but I was wondering still if there was a way to set a

Re: indexing cpu utilization

2012-03-08 Thread Gora Mohanty
On 8 March 2012 16:18, gabriel shen xshco...@gmail.com wrote: Our indexing process is to add a bundle of Solr documents (for example 5000) to Solr each time, and we observed that before committing (which might be I/O bound) it constantly uses less than half the CPU capacity, which sounds

Understanding update handler statistics

2012-03-08 Thread stetogias
Hi, Trying to understand the update handler statistics so I have this: commits : 2824 autocommit maxDocs : 1 autocommit maxTime : 1000ms autocommits : 41 optimizes : 822 rollbacks : 0 expungeDeletes : 0 docsPending : 0 adds : 0 deletesById : 0 deletesByQuery : 0 errors : 0 cumulative_adds :

Re: wildcard queries with edismax and lucene query parsers

2012-03-08 Thread Robert Stewart
Any help on this? I am really stuck on a client project. I need to know how scoring works with wildcard queries under SOLR 3.2. Thanks Bob On Mon, Mar 5, 2012 at 4:22 PM, Robert Stewart bstewart...@gmail.com wrote: How is scoring affected by wildcard queries?  Seems when I use a wildcard

Re: wildcard queries with edismax and lucene query parsers

2012-03-08 Thread Ahmet Arslan
WildcardQueries are wrapped into ConstantScoreQuery. I would create a copy field of these fields using the following field type. Then you can search on these copyFields (qf). With this approach you don't need to use the star operator. defType=edismax&q=grow&fl=title,score fieldType

Re: Understanding update handler statistics

2012-03-08 Thread Shawn Heisey
On 3/8/2012 7:02 AM, stetogias wrote: Hi, Trying to understand the update handler statistics so I have this: commits : 2824 autocommit maxDocs : 1 autocommit maxTime : 1000ms autocommits : 41 optimizes : 822 rollbacks : 0 expungeDeletes : 0 docsPending : 0 adds : 0 deletesById : 0

Re: wildcard queries with edismax and lucene query parsers

2012-03-08 Thread Robert Stewart
Ahmet, That is a great idea. I will try it. Thank you. On Thu, Mar 8, 2012 at 9:34 AM, Ahmet Arslan iori...@yahoo.com wrote: WildcardQueries are wrapped into ConstantScoreQuery. I would create a copy field of these fields using the following field type. Then you can search on these

Re: Stemmer Question

2012-03-08 Thread Jamie Johnson
Thanks the KeywordMarkerFilterFactory seems to be what I was looking for. I'm still wondering about keeping the unstemmed word as a token though. While I know that this would increase the index size slightly I wonder what the negative of doing such a thing would be? Just seems less destructive

Importing dynamicField data on the fly

2012-03-08 Thread Mark Beeby
Hello Everyone, I'm trying to work out how, if at all possible, dynamicFields can be imported from a dynamic data source through the DataImportHandler configurations. Currently the DataImportHandler configuration file requires me to name every single field I want to map in advance, but I do

Re: Stemmer Question

2012-03-08 Thread Ahmet Arslan
Thanks the KeywordMarkerFilterFactory seems to be what I was looking for.  I'm still wondering about keeping the unstemmed word as a token though.  While I know that this would increase the index size slightly I wonder what the negative of doing such a thing would be?  Just seems less

Re: solr geospatial / spatial4j

2012-03-08 Thread Ryan McKinley
On Wed, Mar 7, 2012 at 7:25 AM, Matt Mitchell goodie...@gmail.com wrote: Hi, I'm researching options for handling a better geospatial solution. I'm currently using Solr 3.5 for a read-only database, and the point/radius searches work great. But I'd like to start doing point in polygon

Re: Solr-Lucene compatibility

2012-03-08 Thread Chris Hostetter
: I have an app that writes Lucene indexes and is based on Lucene 2.3.0. : : Can I read those indexes using Solr 3.5.0 and perform a distributed search? : Or should I use a lower version of Solr, so that the index reader is : compatible with the index writer. a) Lucene 2.3.0 is pretty damn

Re: How to exactly match fields which are multi-valued?

2012-03-08 Thread Jonathan Rochkind
Well, if you really want EXACT exact, just use a KeywordTokenizer (ie, not tokenize at all). But then matches will really have to be EXACT, including punctuation, whitespace, diacritics, etc. But a query will only match if it 'exactly' matches one value in your multi-valued field. You could

Re: Filter facet_fields with Solr similar to stopwords

2012-03-08 Thread Chris Hostetter
: I am using a solr.StopFilterFactory in a query filter for a text_general : field (here: content). It works fine, when I query the field for the : stopword, then I am getting no results. ... : used in the text. What I am trying to achieve is, to also filter the : stopwords from the

Re: Retrieving multiple levels with hierarchical faceting in Solr

2012-03-08 Thread Chris Hostetter
: I've found a couple of discussions online that suggest I ought to be : able to set the prefix using local params: : : facet.field={!prefix=0;}foo : facet.field={!prefix=1_foovalue; key=bar}foo citation please? As far as I know that has never been implemented, but the idea was floated

Re: maxClauseCount Exception

2012-03-08 Thread Chris Hostetter
: I am suddenly getting a maxClauseCount exception for no reason. I am : using Solr 3.5. I have only 206 documents in my index. Unless things have changed the reason you are seeing this is because _highlighting_ a query (clause) like type_s:[*+TO+*] requires rewriting it into a giant boolean

Re:two solr instances using one index

2012-03-08 Thread C.Yunqin
The two Solr instances are used to provide failover. Can I define the priority of the two instances? -- Original -- From: My own mailbox 345804...@qq.com; Date: Thu, Mar 8, 2012 02:05 PM To: solr-user solr-user@lucene.apache.org; Subject: two solr instances

Re: indexing bigdata

2012-03-08 Thread Erick Erickson
Your question is really unanswerable, there are about a zillion factors that could influence the answer. I can index 5-7K docs/second so it's efficient. Others can index only a fraction of that. It all depends... Try it and see is about the only way to answer. Best Erick On Thu, Mar 8, 2012 at

Re: How to index doc file in solr?

2012-03-08 Thread Erick Erickson
Have you looked at ExtractingRequestHandler (aka Solr Cell)? SolrJ? Tika? Perhaps if you defined the problem a bit more we'd be able to give you more comprehensive answers Best Erick On Wed, Mar 7, 2012 at 12:14 AM, Rohan Ashok Kumbhar rohan_kumb...@infosys.com wrote: Hi, I would like to

Reporting tools

2012-03-08 Thread Donald Organ
Are there any reporting tools out there, so I can analyze search term frequency, filter frequency, etc.?

Re: Inconsistent Results with ZooKeeper Ensemble and Four SOLR Cloud Nodes

2012-03-08 Thread Matthew Parker
All, I recreated the cluster on my machine at home (Windows 7, Java 1.6.0.23, apache-solr-4.0-2012-02-29_09-07-30), sent some documents through Manifold using its crawler, and it looks like it's replicating fine once the documents are committed. This must be related to my environment somehow.

Re: Stemmer Question

2012-03-08 Thread Jamie Johnson
I'd be very interested to see how you did this if it is available. Does this seem like something useful to the community at large? On Thursday, March 8, 2012, Ahmet Arslan iori...@yahoo.com wrote: Thanks the KeywordMarkerFilterFactory seems to be what I was looking for. I'm still wondering

Re: indexing bigdata

2012-03-08 Thread Sharath Jagannath
OK, my bad. I should have put it in a better way. Is it a good idea to have all the 30M docs on a single instance, or should I consider a distributed set-up? I have synthesized the data, have configured the schema, and have made suitable changes to the config. Have tested out with a smaller data-set

Re: addBean method inserting multivalued values

2012-03-08 Thread Siddharth Gargate
I have not specified the multivalued attribute. <dynamicField name="*_i" type="integer" indexed="true" stored="true"/> I have different integer properties in my Java class, some are single integer values, some are integer arrays. What I want is if the setter method is expecting an integer then the field