Re: Provide value to uniqueID

2014-06-09 Thread Shalin Shekhar Mangar
You can specify the file name as the id by adding a TemplateTransformer on the entity x and specifying ${f.file} as the template value in the id field. For example: dataSource type=FileDataSource / document entity name=f processor=FileListEntityProcessor baseDir=F:\Work\Lucene\Solr\Solr

Re: Documents Added Not Available After Commit (Both Soft and Hard)

2014-06-09 Thread Shalin Shekhar Mangar
I think this may be the same bug as LUCENE-5289 which was fixed in 4.5.1. Can you upgrade to 4.5.1 and see if that solves the problem? On Fri, Jun 6, 2014 at 7:17 PM, Justin Sweeney justin.sweene...@gmail.com wrote: Hi, An application I am working on indexes documents to a Solr index. This

Re: Provide value to uniqueID

2014-06-09 Thread ienjreny
Thanks, it is working fine, but I had to change the following line: <field name="id" template="${f.file}" /> to <field column="id" template="${f.file}" /> On Mon, Jun 9, 2014 at 9:29 AM, Shalin Shekhar Mangar [via Lucene] ml-node+s472066n4140715...@n3.nabble.com wrote: You can specify the file name as
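The corrected configuration can be sketched by generating it programmatically. This is only an illustration: the entity name `f`, the `FileListEntityProcessor`, the `TemplateTransformer`, and the `column="id"` field come from the thread, while the `base_dir` and `file_pattern` values are hypothetical placeholders, and the thread's full configuration is not shown here.

```python
import xml.etree.ElementTree as ET


def build_data_config(base_dir, file_pattern):
    """Build a DataImportHandler config that uses the file name as the id."""
    root = ET.Element("dataConfig")
    ET.SubElement(root, "dataSource", {"type": "FileDataSource"})
    document = ET.SubElement(root, "document")
    entity = ET.SubElement(document, "entity", {
        "name": "f",
        "processor": "FileListEntityProcessor",
        "baseDir": base_dir,        # hypothetical placeholder path
        "fileName": file_pattern,   # hypothetical placeholder regex
        "transformer": "TemplateTransformer",
    })
    # The fix from the thread: the attribute is column="id", not name="id".
    ET.SubElement(entity, "field", {"column": "id", "template": "${f.file}"})
    return root


config = build_data_config("/path/to/files", r".*\.xml")
xml_text = ET.tostring(config, encoding="unicode")
print(xml_text)
```

Building the tree with ElementTree rather than string concatenation keeps the attribute quoting correct, which makes the `name` versus `column` difference easy to see in the output.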

Solr spellcheck - onlyMorePopular threshold?

2014-06-09 Thread Alistair
Hello all, I was wondering what the onlyMorePopular option for spellchecking uses as its threshold. Will it always pick the suggestion that returns the most results, or does it base its result on some threshold that can be configured? Thanks! Ali.

Re: slow performance on simple filter

2014-06-09 Thread mizayah
I'm really at a dead end. My index is 5.6 GB with about 8 million documents. The field I'm using for the filter is as simple as can be: <field name="class_name" type="string" indexed="true" stored="true" multiValued="false"/> Can other fields affect my search if I only run a filter query?

writing logs of a specific solr posting to a file

2014-06-09 Thread pshahukhal
Hi, I am using SimplePostTool to post XML files to Solr like: java -Durl=http://localhost:8080/solr/collection1/update -jar /var/lib/tomcat6/solr/collection1/dump/xmlinput/post.jar /var/lib/tomcat6/solr/collection1/dump/xmlinput/solr.xml When there are certain errors, the response

How Can I modify the DocList and DocSet in solr

2014-06-09 Thread Vishnu Mishra
I am using Solr 4.6 with sharding (distributed search). I have a situation where I would like to modify the Solr search result (DocList and DocSet) inside Solr's QueryComponent right after the following method is called from the process() method: searcher.search(result, cmd);

Re: SOLR Performance Benchmarking

2014-06-09 Thread Shalin Shekhar Mangar
To be of any help we'd need to know what your documents look like, what your queries look like, and what the specifications of your server are. How much heap is dedicated to Solr, and how much free memory is available for the OS file cache? You have to figure out the bottleneck. Is it CPU or RAM or Disk?

Large disjunction query practices

2014-06-09 Thread Joe Gresock
I'm wondering what the best practice for large disjunct queries in Solr is. A user wants to submit a query for several hundred thousand terms, like: (term1 OR term2 OR ... term500,000) I know it might be better to break this up into multiple queries that can be merged on the user's end, but I'm
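The splitting approach mentioned in this message could be sketched as follows. This is a minimal illustration, not a recommendation from the thread: the function names are hypothetical, the terms are assumed to be plain tokens that need no query escaping, and each result is assumed to be a dict with an `id` key.

```python
def batched_or_queries(field, terms, batch_size):
    """Yield one 'field:(t1 OR t2 ...)' query string per batch of terms."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(terms), batch_size):
        batch = terms[start:start + batch_size]
        yield "{}:({})".format(field, " OR ".join(batch))


def merge_results(result_lists):
    """Merge per-batch result lists, keeping the first copy of each id."""
    merged = {}
    for results in result_lists:
        for doc in results:
            merged.setdefault(doc["id"], doc)
    return list(merged.values())


queries = list(batched_or_queries("id", ["term1", "term2", "term3"], 2))
print(queries)
```

In practice the batch size would have to stay below the maxBooleanClauses setting in solrconfig.xml (1024 by default), and relevancy scores from separate batches are not directly comparable, so this only suits bulk retrieval.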

Re: Large disjunction query practices

2014-06-09 Thread Jack Krupansky
Are they expecting relevancy ranking or merely seeking to a bulk read of those documents? Please detail what the user is trying to accomplish with such a monster list of IDs. Generally, queries of more than a few dozen terms are a bad idea. If for no other reason than that if you need to

Re: How Can I modify the DocList and DocSet in solr

2014-06-09 Thread Alexandre Rafalovitch
Can you make a custom Component? They are pluggable. Regards, Alex On 09/06/2014 6:24 pm, Vishnu Mishra vdil...@gmail.com wrote: I am using Solr 4.6 with sharding (distributed search). I have a situation where I would like to modify the Solr search result (DocList and DocSet)

Re: Customizing Solr; Where to draw the line?

2014-06-09 Thread Jorge Luis Betancourt Gonzalez
I’d certainly go for the 2nd option. Depending on what you need, you won’t have to modify Solr itself; you can extend it using plugins. You’ll need to write different components depending on your specific requirements. I definitely recommend the talks from Trey Grainger,

Re: Solr Scale Toolkit Access Denied Error

2014-06-09 Thread Mark Gershman
Thanks, Tim. Worked like a charm. Appreciate your timely assistance. On Sat, Jun 7, 2014 at 9:13 PM, Timothy Potter thelabd...@gmail.com wrote: Hi Mark, Sorry for the trouble! I've now made the ami-1e6b9d76 AMI public; total oversight on my part :-(. Please try again. Thanks Hoss for

RE: Solr spellcheck - onlyMorePopular threshold?

2014-06-09 Thread Dyer, James
I believe it will return the terms that are most similar to the queried terms but have a greater term frequency than the queried terms. It doesn't actually care what the term frequencies are, only that they are greater than the frequencies of the terms you queried on. I do not know your use
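The behavior described in this reply can be sketched as a simple filter. This models only what the reply states (keep suggestions whose term frequency is greater than that of the queried term, then prefer the most similar); it is not Solr's actual implementation, and the function name, tuple layout, and sample numbers are made up for illustration.

```python
def only_more_popular(query_freq, candidates):
    """Return suggestion terms more frequent than the queried term.

    candidates is a list of (term, similarity, frequency) tuples.
    The result is ordered by similarity, highest first.
    """
    # No absolute threshold: a candidate only has to beat the query's frequency.
    more_popular = [c for c in candidates if c[2] > query_freq]
    ranked = sorted(more_popular, key=lambda c: c[1], reverse=True)
    return [term for term, _, _ in ranked]


suggestions = only_more_popular(
    10,
    [("solar", 0.80, 50), ("sole", 0.75, 5), ("solr", 0.90, 120)],
)
print(suggestions)
```

Under this reading there is no configurable cutoff: the queried term's own frequency is the only bar a suggestion has to clear.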

Collection communication internally

2014-06-09 Thread Vineet Mishra
Hi All, I was curious to know whether communication between multiple collections can be achieved, and if so, by what means. The use case is: having multiple collections, I need to query the first collection and use the unique ids it returns to query the second one (a foreign key relation). Now if the

Re: Any way to view lucene files

2014-06-09 Thread Aman Tandon
No, Anyways thanks Alex, but where is the luke jar? With Regards Aman Tandon On Mon, Jun 9, 2014 at 6:54 AM, Alexandre Rafalovitch arafa...@gmail.com wrote: Have you looked at: https://github.com/DmitryKey/luke Regards, Alex. Personal website: http://www.outerthoughts.com/ Current

Re: ANN: Solr Next

2014-06-09 Thread Yonik Seeley
On Tue, Jan 7, 2014 at 1:53 PM, Yonik Seeley ysee...@gmail.com wrote: [...] Next major feature: Native Code Optimizations. In addition to moving more large data structures off-heap(like UnInvertedField?), I am planning to implement native code optimizations for certain hotspots. Native code

Re: Any way to view lucene files

2014-06-09 Thread François Schiettecatte
Just click the 'Releases' link: https://github.com/DmitryKey/luke/releases François On Jun 9, 2014, at 10:43 AM, Aman Tandon amantandon...@gmail.com wrote: No, Anyways thanks Alex, but where is the luke jar? With Regards Aman Tandon On Mon, Jun 9, 2014 at 6:54 AM, Alexandre

Re: Setup a Solr Cloud on a set of powerful machines

2014-06-09 Thread Erick Erickson
Well, you've omitted information about the most precious resource for Solr, memory. That said, this question is impossible to answer in the abstract, see: http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/ Best, Erick On Sun, Jun 8, 2014 at

Re: Any way to view lucene files

2014-06-09 Thread Aman Tandon
Yeah, just got it, thanks François :) With Regards Aman Tandon On Mon, Jun 9, 2014 at 8:20 PM, François Schiettecatte fschietteca...@gmail.com wrote: Just click the 'Releases' link: https://github.com/DmitryKey/luke/releases François On Jun 9, 2014, at 10:43 AM, Aman Tandon

Re: Deepy nested structure

2014-06-09 Thread harikrishna
Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Deepy-nested-structure-tp4140397p4140802.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Deepy nested structure

2014-06-09 Thread harikrishna
thanks

Re: Collection communication internally

2014-06-09 Thread Erick Erickson
My first answer is don't do it that way :). Solr works best with flattened (de-normalized) data. If at all possible, you _really_ would be better off combining the two collections and flattening the data even though there would be more data. Whenever I see a question like this, I wonder if you're
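The flattening suggested here could look roughly like the sketch below, done at index time before documents are sent to Solr. All field names, the `parent_` prefix, and the sample records are hypothetical; the thread does not describe the actual schemas.

```python
def flatten(parents, children, foreign_key):
    """Copy each parent's fields onto the child documents that reference it."""
    parents_by_id = {parent["id"]: parent for parent in parents}
    flat_docs = []
    for child in children:
        doc = dict(child)
        parent = parents_by_id.get(child.get(foreign_key), {})
        for key, value in parent.items():
            if key != "id":
                # Prefix parent fields so they cannot clash with child fields.
                doc["parent_" + key] = value
        flat_docs.append(doc)
    return flat_docs


flat = flatten(
    [{"id": "p1", "name": "Acme"}],
    [{"id": "c1", "parent_id": "p1", "title": "Widget"}],
    "parent_id",
)
print(flat)
```

The parent data is duplicated on every child document, which is the "more data" trade-off the reply accepts in exchange for answering the query from a single collection.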

How to use group and facet?

2014-06-09 Thread Phi Hoang Hai
Dear Solr experts, I have 2 problems and need your help. 1) I have to group a list with group.limit=1&group.main=true&group.sort=Date desc (many groups, each containing only its newest element). Then, from that list of groups (each with 1 element), I want to filter in order to remove items (in groups) that do not match
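The grouping parameters in this question need to be joined with `&` and URL-encoded when sent to Solr. A minimal sketch follows; the `group.limit`, `group.main`, and `group.sort` values come from the message, while `q=*:*`, `group=true`, and the `item_id` grouping field are assumptions added to make a complete request.

```python
from urllib.parse import urlencode

params = [
    ("q", "*:*"),                 # assumed: match all documents
    ("group", "true"),            # assumed: grouping must be switched on
    ("group.field", "item_id"),   # hypothetical grouping field
    ("group.limit", "1"),
    ("group.main", "true"),
    ("group.sort", "Date desc"),
]

# urlencode inserts the '&' separators and escapes the space in "Date desc".
query_string = urlencode(params)
print(query_string)
```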

Re: Setup a Solr Cloud on a set of powerful machines

2014-06-09 Thread Shawn Heisey
On 6/8/2014 4:17 PM, shushuai zhu wrote: I would like to get some advice to setup a Solr Cloud on a set of powerful machines. The average size of the documents handled by the Solr Cloud is about 0.5 KB, and the number of documents stored in Solr Cloud could reach billions. When indexing,

Re: SOLR Performance Benchmarking

2014-06-09 Thread Shawn Heisey
On 6/8/2014 12:09 PM, rashi gandhi wrote: I am using SolrMeter for performance benchmarking. I am able to successfully test my Solr setup up to 1000 queries per min while searching. But when I exceed this limit, say 1500 search queries per min, I get a Server Refused Connection error in Solr.

RE: COMMERCIAL: RE: SolrCloud: facet range option f.field.facet.mincount=1 omits buckets on response

2014-06-09 Thread Ronald Matamoros
Hi Chris, I created ticket https://issues.apache.org/jira/browse/SOLR-6154 and attached the data.xml and a PDF with instructions on how to replicate. Sending different updates to different ports was just how the Confluence tutorial laid out the steps; it does not affect the result of the

accessing individual elements of a multivalued field

2014-06-09 Thread kritarth.anand
Hi, prod: p, cat: catA, catB, catC; prod: q, cat: catB, catC, catD. My schema consists of documents with uid 'prod', and they can belong to multiple categories called 'cat', which are represented as a multivalued field. For a particular kind of query I need to access individual elements

solr4 optimization

2014-06-09 Thread Joshi, Shital
Hi, We have a SolrCloud cluster (5 shards and 2 replicas) on 10 boxes. On some of the boxes we have about 5 million deleted docs, and we have never run an optimization since the beginning. Does the number of deleted docs have anything to do with query performance? Should we consider optimization at all

SolrCloud collection create / delete failure

2014-06-09 Thread John Smodic
Hey guys, I'm trying to simply create collection foo in SolrCloud (to a collection that failed to create once due to a badly formatted schema). I try the following: createCollection foo - could not create a new core solr/foo_shard1_replica1 as another core is already defined there

Re: solr4 optimization

2014-06-09 Thread Otis Gospodnetic
Hi, I don't remember the last time I ran optimize. Sure, yes, things will work faster if you optimize an index and reduce the number of segments, but if you are regularly writing to that index and performance is OK, leave it to Lucene segment merges to purge deletes. Otis -- Performance Monitoring

Re: Setup a Solr Cloud on a set of powerful machines

2014-06-09 Thread Gili Nachum
the incoming document rate could be as high as 20k/second... That sounds like a lot of CPU eager indexing work, given the 128 CPU cores available, from indexing speed perspective: would you recommend having a similar number of solr cores created, or Solr does just a when several with a small

Re: writing logs of a specific solr posting to a file

2014-06-09 Thread Sameer Maggon
Check out the patch on the issue below. We hit the same issue and posted a patch. None of the committers have picked it up yet, but it would be good to get some feedback on it and get this into the next dot release. If it works for you, please vote it up.

Re: accessing individual elements of a multivalued field

2014-06-09 Thread Jack Krupansky
Not currently. You could have separate explicit fields for the categories such as cat_1, cat_2, etc. The data would need to be replicated (possibly using a copyField), but redundancy to facilitate access is a reasonable approach. -- Jack Krupansky -Original Message- From:
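The separate explicit fields suggested here could be produced at index time, before the document is sent to Solr. A minimal sketch, assuming documents are plain dicts and using the `cat_1`, `cat_2` naming from the reply; the function name is hypothetical.

```python
def add_positional_fields(doc, field):
    """Return a copy of doc with field_1, field_2, ... for each value."""
    expanded = dict(doc)
    for position, value in enumerate(doc.get(field, []), start=1):
        # The multivalued field is kept; each value is also stored separately.
        expanded["{}_{}".format(field, position)] = value
    return expanded


indexed_doc = add_positional_fields(
    {"prod": "p", "cat": ["catA", "catB", "catC"]}, "cat"
)
print(indexed_doc)
```

The original multivalued field stays in place for ordinary filtering, so the positional copies are pure redundancy added to make individual elements addressable.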

Re: accessing individual elements of a multivalued field

2014-06-09 Thread kritarth.anand
Thanks for the response Jack

How to simplify my query for appropriate scoring.

2014-06-09 Thread kritarth.anand
Hi all, I need help simplifying my query. The doc structure is as follows. docStructure: id A, cat: p, q, r; id B, cat: m, n, o; id C, cat: l, b, o. Given this structure, my job is to find documents whose cat ids belong to a list. Right now this is achieved in this fashion using OR of

Re: Integrate solr with openNLP

2014-06-09 Thread Vivekanand Ittigi
Hi Aman, Yeah, we are thinking the same. Using UIMA is better. And thanks to everyone. You guys really showed us the way (UIMA). We'll work on it. Thanks, Vivek On Fri, Jun 6, 2014 at 5:54 PM, Aman Tandon amantandon...@gmail.com wrote: Hi Vivek, As everybody in the mail list mentioned