SolrJ update

2015-08-06 Thread Henrique O. Santos
Hello all, I am using SolrJ to do an index update on one of my collections. This collection has a uniqueKey id field: <fields> <field name="id" type="string" indexed="true" stored="true"/> <field name="_version_" type="long" indexed="true" stored="true"/> <field name="name" type="string" indexed="true"

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Upayavira
On Thu, Aug 6, 2015, at 06:56 PM, Toke Eskildsen wrote: Upayavira u...@odoko.co.uk wrote: Also, attempting to facet across a large number of docs is going to take some time. Perhaps you might gain some performance benefit by sharding your index? One should be aware that distributed

Re: Limits in individual filter sub queries

2015-08-06 Thread Selvam
Dear Toke, Thanks for your input. In fact my scenario is much more complex; let me give you an example: q=*.*&fq=(country:india AND age:[25 TO 40] AND sex:male) OR (country:iran AND income:[5 TO 9]) You can see each subquery has different parameters, I may want to limit the first

Re: Limits in individual filter sub queries

2015-08-06 Thread Toke Eskildsen
On Thu, 2015-08-06 at 12:32 +0530, Selvam wrote: Good day, I wanted to run a filter query (fq), say, I need to run q=*.*&fq=(country:india) OR (country:iran)&limit=100 Now it may return me 100 records that might contain 70 Indians and 30 Iran records. Now how can I force to fetch 50 Indian and 50

Re: Clarification on WordDelimiterFilter.

2015-08-06 Thread Modassar Ather
Hi, Any suggestion will be really helpful. Kindly provide your inputs. Thanks, Modassar On Thu, Aug 6, 2015 at 2:06 PM, Modassar Ather modather1...@gmail.com wrote: I am using WordDelimiterFilter while indexing and searching both with the following attributes. Parser used is edismax. Solr

Re: How to do sorting instead of using bq

2015-08-06 Thread Upayavira
How do you know those boost values? Do they come from the outside? Could you put them in the index with the docs themselves? Then you can sort on a field in the doc. On Fri, Aug 7, 2015, at 04:40 AM, rachun wrote: Hi all, I'm trying to sort some docs which is about 200 or more docs. by using

How to do sorting instead of using bq

2015-08-06 Thread rachun
Hi all, I'm trying to sort some docs which is about 200 or more docs. by using bq like this.. *[bq] = product_id:L90094438^1 product_id:L90094438^3 product_id:L90094438^5 product_id:W27529923^123 product_id:W27529678^127 product_id:W27530909^133* *[sort] = score asc* The score that

Filtering documents using payloads

2015-08-06 Thread Jamie Johnson
I am attempting to put together a DocsAndPositionsEnum that can hide terms given the payload on the term. The idea is that if a term has a particular access control and the user does not I don't want it to be visible. I have based this off of

are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Bernd Fehling
Single Index Solr 4.10.4, optimized Index, 76M docs, 235GB index size. I was analysing my Solr logs and it turned out that I have some queries which are above 30 seconds qtime while normally the qtime is below 1 second. Looking closer at the queries it turned out that this is for

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Toke Eskildsen
On Thu, 2015-08-06 at 13:00 +0200, Bernd Fehling wrote: Single Index Solr 4.10.4, optimized Index, 76M docs, 235GB index size. I was analysing my solr logs and it turned out that I have some queries which are above 30 seconds qtime while normally the qtime is below 1 second. Looking closer

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Bernd Fehling
Thanks a lot, your statement makes me feel better :-) It feels like this behavior showed up after changing to docValues for sorting, because before the 99th percentile for qtime was at 550ms average and 1.4 seconds at max. So my assumption is that the inverted index on the sort fields (when _not_

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Upayavira
Typically such performance issues with faceting are to do with the time spent uninverting the index before calculating the facet counts. If you indexed the fields with docValues enabled, perhaps you could then use them for faceting, which might improve performance. If you are using a

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Bernd Fehling
On 06.08.2015 at 14:33, Upayavira wrote: Typically such performance issues with faceting are to do with the time spent uninverting the index before calculating the facet counts. If you indexed the fields with docValues enabled, perhaps you could then use them for faceting, which might

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Toke Eskildsen
On Thu, 2015-08-06 at 14:32 +0200, Bernd Fehling wrote: It feels like this behavior showed up after changing to docValues for sorting, because before the 99th percentile for qtime was at 550ms average and 1.4 seconds at max. DocValues have faster startup but come with a constant performance

Re: SolrCloud on 5.2.1 cluster state

2015-08-06 Thread Shawn Heisey
On 8/6/2015 6:50 AM, Suma Shivaprasad wrote: I was having issues since I am using a solr 4.8.1 client to talk to a 5.2.1 server. There is no API in ZKStateReader as well to add a collection to watch in the 4.8.1 API . I assume that would have caused the clusterstate.json to be updated?

Upload core.properties to ZooKeeper

2015-08-06 Thread marotosg
Hi, I am in the process of migrating my master/slave Solr infrastructure to SolrCloud. At the moment I have several cores inside a folder with this structure /MyCores /MyCores/Core1 /MyCores/Core1/conf /MyCores/Core1/core.properties /MyCores/Core2 /MyCores/Core2/conf

Re: Upload core.properties to ZooKeeper

2015-08-06 Thread Upayavira
Have you looked at the collections API? It has the ability to set properties against collections. I wonder if that'll achieve the same thing as adding them to core.properties? I've never used it myself, but wonder if it'll solve your issue. Upayavira On Thu, Aug 6, 2015, at 12:35 PM, marotosg

Re: SolrCloud on 5.2.1 cluster state

2015-08-06 Thread Suma Shivaprasad
I was having issues since I am using a solr 4.8.1 client to talk to a 5.2.1 server. There is no API in ZKStateReader as well to add a collection to watch in the 4.8.1 API . I assume that would have caused the clusterstate.json to be updated? Since I am using a third party library (which in turn

Re: how to extend JavaBinCodec and make it available in solrj api

2015-08-06 Thread Shalin Shekhar Mangar
What do you mean by a custom format? As long as your custom component is writing primitives or NamedList/SimpleOrderedMap or collections such as List/Map, any response writer should be able to handle them. On Wed, Aug 5, 2015 at 5:08 PM, Dmitry Kan solrexp...@gmail.com wrote: Hello, Solr:

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Toke Eskildsen
Upayavira u...@odoko.co.uk wrote: Also, attempting to facet across a large number of docs is going to take some time. Perhaps you might gain some performance benefit by sharding your index? One should be aware that distributed faceting on String fields has a non-trivial overhead: It is a

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Upayavira
Also, attempting to facet across a large number of docs is going to take some time. Perhaps you might gain some performance benefit by sharding your index? Upayavira On Thu, Aug 6, 2015, at 04:48 PM, Mikhail Khludnev wrote: On Thu, Aug 6, 2015 at 3:56 PM, Bernd Fehling

Re: SolrCloud on 5.2.1 cluster state

2015-08-06 Thread Suma Shivaprasad
Thanks for clarifying On Thu, Aug 6, 2015 at 6:43 PM, Shawn Heisey apa...@elyograg.org wrote: On 8/6/2015 6:50 AM, Suma Shivaprasad wrote: I was having issues since I am using a solr 4.8.1 client to talk to a 5.2.1 server. There is no API in ZKStateReader as well to add a collection to

Re: serious data loss bug in correlation with too much data after closed

2015-08-06 Thread adfel70
I have some docs that I know I've overwritten, but this is fine because this is caused by some duplicate docs with the same data and same id. I know of data loss because I know that a certain doc with a certain id should be in the index but it isn't. Upayavira wrote Are you adding all new documents?

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Erik Hatcher
Do you think an all_parameters - complete_response cache is possible? It could be initialized right before or during warmup and would not take too much memory. This is along the lines of Solr’s 304 capabilities. See

Re: Upload core.properties to ZooKeeper

2015-08-06 Thread Kevin Lee
You should be able to use user defined properties within core.properties. However, it sounds like you are uploading core.properties to Zookeeper. In SolrCloud, core.properties is not uploaded to Zookeeper. You place core.properties within your core’s top level directory and the cores are

Clarification on WordDelimiterFilter.

2015-08-06 Thread Modassar Ather
I am using WordDelimiterFilter while indexing and searching both with the following attributes. Parser used is edismax. Solr version is 5.2.1. <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1"

RE: Solr spell check not showing any suggestions for other language

2015-08-06 Thread talha
Same results :(. Used the following query: http://localhost:8983/solr/product_live/select?q=%E0%A6%B8%E0%A6%B9%E0%A6%97&wt=json&indent=true&spellcheck=true&spellcheck.q=%E0%A6%B8%E0%A6%B9%E0%A6%97
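The query in this message percent-encodes a non-ASCII search term, and the `&` separators between parameters appear to have been lost in the archive. As a minimal sketch (Python standard library only, not part of the original message), the term can be decoded and the request URL rebuilt like this. The host, core name and parameter names are taken from the message; the parameter order is an assumption.

```python
from urllib.parse import unquote, urlencode

# Percent-encoded term exactly as it appears in the message.
encoded_term = "%E0%A6%B8%E0%A6%B9%E0%A6%97"

# Decode the UTF-8 bytes back into the original characters.
term = unquote(encoded_term)

# Rebuild the query string; urlencode re-encodes the term and joins
# the parameters with "&". The order of parameters is assumed.
params = [
    ("q", term),
    ("wt", "json"),
    ("indent", "true"),
    ("spellcheck", "true"),
    ("spellcheck.q", term),
]
url = "http://localhost:8983/solr/product_live/select?" + urlencode(params)
print(url)
```

Decoding the term first makes it easier to check that the spellcheck dictionary field actually contains tokens in the same script.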

Copying index from one Solr cloud to other Solr cloud

2015-08-06 Thread 无线事业部―胡胜波11289
I have the same question. I tried to create a collection with the same name in another SolrCloud and copy the data folders to the new SolrCloud, but it does not work. By the way, my indexes are stored in HDFS. Can anybody help? Thanks, 胡胜波 同程网络科技股份有限公司

Limits in individual filter sub queries

2015-08-06 Thread Selvam
Hi All, Good day, I wanted to run a filter query (fq), say, I need to run q=*.*&fq=(country:india) OR (country:iran)&limit=100 Now it may return me 100 records that might contain 70 Indians and 30 Iran records. Now how can I force to fetch 50 Indian and 50 Iran records using a single SOLR query?
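One possible way to approach this question, not confirmed anywhere in the thread, is Solr's result grouping, which caps the number of documents returned per group. The sketch below only builds the query string with the Python standard library. It assumes the poster's `q=*.*` was meant as the match-all query `*:*`, that `country` is a single-valued indexed field, and it uses `group.limit` in place of the poster's `limit` parameter; the function name is hypothetical.

```python
from urllib.parse import urlencode, parse_qs

def build_grouped_query(per_country=50):
    """Build a query string asking for up to per_country docs per country."""
    params = {
        "q": "*:*",                              # assumed intent of q=*.*
        "fq": "country:india OR country:iran",   # filter from the message
        "group": "true",                         # enable result grouping
        "group.field": "country",                # one group per country value
        "group.limit": str(per_country),         # max docs within each group
        "wt": "json",
    }
    return urlencode(params)

query_string = build_grouped_query()
print(query_string)
```

For the more complex case where each sub-query has its own conditions, repeated `group.query` parameters might serve the same purpose, but that would need to be verified against the Solr version in use.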

Re: serious data loss bug in correlation with too much data after closed

2015-08-06 Thread Shawn Heisey
On 8/6/2015 8:31 AM, adfel70 wrote: Are you sure that this parameter concerns /update requests? On the one hand, it says that it specifies the max size of form data (application/x-www-form-urlencoded) sent via POST. You can use POST to pass request parameters not fitting into URL and on the

Re: serious data loss bug in correlation with too much data after closed

2015-08-06 Thread adfel70
Are you sure that this parameter concerns /update requests? On the one hand, it says that it specifies the max size of form data (application/x-www-form-urlencoded) sent via POST. You can use POST to pass request parameters not fitting into URL and on the other hand, I see that my bulks are as big

Schemaless mode and DIH

2015-08-06 Thread xavi jmlucjav
Hi, While working with DIH, I tried schemaless mode, and found out it does not work if you are indexing with DIH. I could not find any issue or reference to this in the mailing list, even if I found it a bit surprising nobody tried that combination so far. Has anybody tested this before? I

Re: Embedded Solr now deprecated?

2015-08-06 Thread Lukasz Salwinski
On 08/05/2015 08:34 PM, Ken Krugler wrote: Hi Shawn, We have a different use case than the ones you covered in your response to Robert (below), which I wanted to call out. We currently use the embedded server when building indexes as part of a Hadoop workflow. The results get copied to a

Re: Embedded Solr stopped to index after a while

2015-08-06 Thread Alexandre Rafalovitch
(shooting in the dark) What does your data directory look like? File sizes, etc. And which Operating System. 4GB is where the Windows FAT filesystem has a file size limit, but it really should not be that. Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Mikhail Khludnev
On Thu, Aug 6, 2015 at 3:56 PM, Bernd Fehling bernd.fehl...@uni-bielefeld.de wrote: On 06.08.2015 at 14:33, Upayavira wrote: Typically such performance issues with faceting are to do with the time spent uninverting the index before calculating the facet counts. If you indexed the

Embedded Solr stopped to index after a while

2015-08-06 Thread Aldric THOMAZO
Hello, I have an issue with embedded Solr, or a misconfiguration, but no clue how to resolve it. Solr stopped indexing a large set of data from a database after a while. It was running for many hours and when it reached a size of 4GB it stopped running although we are expecting about 40GB,

Functionality of post.jar

2015-08-06 Thread Aniruddh Sharma
Hi, I have a case where I have a csv file on my Unix file system and not in Hadoop file system. For example I have abc.xml in /home/cloudera/abc.xml on my Cloudera VMware. Now in Hadoop I go and I create a collection named test10 according to schema of abc.xml and using post.jar I post the