Re: DIH timezone offset

2014-05-20 Thread rulinma
good. -- View this message in context: http://lucene.472066.n3.nabble.com/DIH-timezone-offset-tp504958p4137077.html Sent from the Solr - User mailing list archive at Nabble.com.

How to optimize single shard only?

2014-05-20 Thread Marcin Rzewucki
Hi, Do you know how to optimize the index on a single shard only? I was trying to use optimize=true&waitFlush=true&shard.keys=myshard but it does not work - it optimizes all shards instead of just one. Kind regards.

Re: How to optimize single shard only?

2014-05-20 Thread Ahmet Arslan
Hi Marcin, just a guess: pass distrib=false? Ahmet On Tuesday, May 20, 2014 10:23 AM, Marcin Rzewucki mrzewu...@gmail.com wrote: Hi, Do you know how to optimize index on a single shard only ? I was trying to use optimize=true&waitFlush=true&shard.keys=myshard but it does not work - it
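A minimal sketch of what that suggestion could look like as a request URL. The host and core name below are hypothetical, and whether distrib=false really keeps the optimize on one core is, as Ahmet says, a guess that should be checked against your Solr version.

```python
from urllib.parse import urlencode

def optimize_url(core_url, distrib=False):
    """Build an update URL asking one core to optimize.

    core_url is the base URL of a single core (hypothetical in this example).
    """
    params = [
        ("optimize", "true"),
        ("distrib", "true" if distrib else "false"),
    ]
    return core_url.rstrip("/") + "/update?" + urlencode(params)

# Hypothetical core URL, for illustration only.
print(optimize_url("http://localhost:8983/solr/collection1_shard1_replica1"))
```

Sending the request directly to the core that hosts the shard, rather than to the collection, is the point of the sketch; the parameter names are the ones mentioned in the thread.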

Howto Search word which contains the character

2014-05-20 Thread heyyo
In Hebrew, words can contain the character " (e.g. דו"ח). I would like to know how to configure my schema.xml to be able to index and search those types of words correctly. If I search for this character " in the Solr query tool, I get this debug output: "debug": { "rawquerystring": "\"", "querystring": "\"",

Re: solr-user Digest of: get.100322

2014-05-20 Thread Jeongseok Son
Thank you for your reply! I also found docValues after sending an email and your suggestion seems the best solution for me. Now I'm configuring schema.xml to use docValues and have a question about docValuesFormat. According to this thread(

Re: trigger delete on nested documents

2014-05-20 Thread Thomas Scheffler
On 19.05.2014 19:25, Mikhail Khludnev wrote: Thomas, Vanilla way to override a block is to send it with the same unique-key (I guess it's id for your case, btw don't you have unique-key defined in the schema?), but it must have at least one child. It seems like analysis issue to me

the whole web instance hangs when optimize one core.

2014-05-20 Thread YouPeng Yang
Hi. I am using Solr 4.6. One of my cores contains 50 million docs, and when I just clicked the optimize button on the overview page of the core, the whole web instance hung; one symptom is that the DIH on another core hung as well. Is it a known problem or is something wrong with my env? Regards

Re: Howto Search word which contains the character

2014-05-20 Thread Ahmet Arslan
Hi, It is a special query parser character, so it needs to be escaped. http://lucene.apache.org/core/2_9_4/queryparsersyntax.html#Escaping%20Special%20Characters Ahmet On Tuesday, May 20, 2014 10:57 AM, heyyo lionel.enka...@gmail.com wrote: In hebrew words could contain the character " ex:
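A minimal sketch of client-side escaping, assuming the query is built in Python before being sent to Solr. The character set below follows the special characters listed in the Lucene query parser syntax page, plus `/`, which newer parser versions also treat as special; the function name is hypothetical.

```python
# Characters the classic Lucene query parser treats as syntax.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&/')

def escape_query_term(term):
    """Prefix every special query parser character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)

# A Hebrew abbreviation containing a double quote.
print(escape_query_term('דו"ח'))
```

Escaping only gets the character past the query parser; whether it survives analysis depends on the field type's tokenizer and filters.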

Re: How to optimize single shard only?

2014-05-20 Thread YouPeng Yang
Hi Marcin, thanks to your mail, now I know why my cloud hangs when I just click the optimize button on the overview page of the shard. 2014-05-20 15:25 GMT+08:00 Ahmet Arslan iori...@yahoo.com: Hi Marcin, just a guess, pass distrib=false ? Ahmet On Tuesday, May 20, 2014 10:23 AM,

Re: How to optimize single shard only?

2014-05-20 Thread YouPeng Yang
Hi, maybe you can try _router_=myshard? I will check the source code and let you know later. 2014-05-20 17:19 GMT+08:00 YouPeng Yang yypvsxf19870...@gmail.com: Hi Marcin Thanks to your mail,now I know why my cloud hangs when I just click the optimize button on the overview page of the shard.

Re: How to optimize single shard only?

2014-05-20 Thread Marcin Rzewucki
Well, it should not hang if everything is configured fine :) How many shards and how much memory do you have? Note that optimize rewrites the index, so you might need additional disk space for this process. Optimizing works fine; however, I'd like to be able to do it on a single shard as well. On 20 May 2014 11:19,

Re: How to optimize single shard only?

2014-05-20 Thread YouPeng Yang
Hi, my DIH work indeed hangs. I have only four shards, each with a master and a replica. Maybe the JVM memory size is too low: it was 3G, while each of my cores is almost 16GB. I have also found that the size of the master increased during the optimization (you can check on the overview page of

Re: How to optimize single shard only?

2014-05-20 Thread Marcin Rzewucki
As I wrote before, the index is being rewritten, so it grows during optimization and is reduced later. I guess there was an OOM in your case. On 20 May 2014 12:11, YouPeng Yang yypvsxf19870...@gmail.com wrote: Hi My DIH work indeed hangs, I have only four shards,each has a master and a

Solr Cloud Shards and Replica not reviving after restarting

2014-05-20 Thread Tim Burner
Hi Everyone, I have installed Solr Cloud 4.6.2 with external Zookeeper and Tomcat, having 3 shards with 2 replicas each. I tried indexing some documents, which went fine. After that I restarted my Tomcat, and now the shards are not coming up; I am getting a bunch of exceptions. First

Re: Howto Search word which contains the character

2014-05-20 Thread Jack Krupansky
It looks like it was escaped in the query, but the word delimiter filter will remove it and treat it as if it were white space. The types attribute for WDF can point to a file containing the types for various characters, so you could map a quote to ALPHA. The doc is sketchy, but there are

[ANNOUNCE] Apache Solr 4.8.1 released

2014-05-20 Thread Robert Muir
May 2014, Apache Solr™ 4.8.1 available The Lucene PMC is pleased to announce the release of Apache Solr 4.8.1 Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted

Re: trigger delete on nested documents

2014-05-20 Thread Jack Krupansky
To be clear, you cannot update a single document of a nested document in place - you must reindex the whole block, parent and all children. This is because this feature relies on the underlying Lucene block join feature that requires that the documents be contiguous, and updating a single child

Re: trigger delete on nested documents

2014-05-20 Thread Thomas Scheffler
On 20.05.2014 14:11, Jack Krupansky wrote: To be clear, you cannot update a single document of a nested document in place - you must reindex the whole block, parent and all children. This is because this feature relies on the underlying Lucene block join feature that requires that the

Vague Behavior while setting Solr Cloud

2014-05-20 Thread Tim Burner
Hi Everyone, I am trying to set up Solr Cloud following the blog post http://myjeeva.com/solrcloud-cluster-single-collection-deployment.html. If I complete the setup in one go, it seems to go fine. When the setup is complete and I try to restart Solr by restarting the Tomcat instance, it

Autoscaling Solr instances in AWS

2014-05-20 Thread Peter Keegan
We are running Solr 4.6.1 in AWS: - 2 Solr instances (1 shard, 1 leader, 1 replica) - 1 CloudSolrServer SolrJ client updating the index. - 3 Zookeepers The Solr instances are behind a load balancer and also in an auto scaling group. The ScaleUpPolicy will add up to 9 additional instances

Re: WordDelimiterFilterFactory and StandardTokenizer

2014-05-20 Thread Diego Fernandez
Great, thanks for the information! Right now we're using the StandardTokenizer types to filter out CJK characters with a custom filter. I'll test using MappingCharFilters, although I'm a little concerned with possible adverse scenarios. Diego Fernandez - 爱国 Software Engineer US GSS

Re: Error initializing QueryElevationComponent

2014-05-20 Thread Geepalem
Hi, I have changed & to &amp;. Now the core is getting initialized, but the document added in elevate.xml is not coming up as the top result. <elevate> <query text="*analog*"> <doc id="sitecore://master/{137f5eb3-eb84-4165-bef0-5be1fbbc3201}?lang=en&amp;ver=1"/> </query> </elevate> Also, why below query is not

Re: Issue paging when sorting on a Date field

2014-05-20 Thread Bryan Bende
This is using the solr.TrieDateField; it is the field type "date" from the example schema in Solr 4.6.1: <fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/> After further testing I was only able to reproduce this in a sharded replicated environment (numShards=3,

Re: WordDelimiterFilterFactory and StandardTokenizer

2014-05-20 Thread Ahmet Arslan
Hi Diego, Did you miss Shawn's response? His ICUTokenizerFactory solution is better than mine. By the way, what Solr version are you using? Does StandardTokenizer set the type attribute for CJK words? To filter out given types, you do not need a custom filter. Type Token filter serves exactly that

Re: WordDelimiterFilterFactory and StandardTokenizer

2014-05-20 Thread Diego Fernandez
Hey Ahmet, Yeah I had missed Shawn's response, I'll have to give that a try as well. As for the version, we're using 4.4. StandardTokenizer sets type for HANGUL, HIRAGANA, IDEOGRAPHIC, KATAKANA, and SOUTHEAST_ASIAN and you're right, we're using TypeTokenFilter to remove those. Diego

Odd interaction between {!tag..} and {!field}

2014-05-20 Thread Erick Erickson
not sure what to make of this... The presence of the {!tag} entry changes the filter query generated by the {!field...} tag. Note below that in one case the filter query is a phrase query, and in the other it's parsed with one term against the specified field and the other against the default

Re: Odd interaction between {!tag..} and {!field}

2014-05-20 Thread Chris Hostetter
: The presence of the {!tag} entry changes the filter query generated by : the {!field...} tag. Note below that in one case the filter query is a : phrase query, and in the other it's parsed with one term against the : specified field and the other against the default field. I think you are

Re: Odd interaction between {!tag..} and {!field}

2014-05-20 Thread Chris Hostetter
: when local params are embedded in a query being parsed by the : LuceneQParser, it applies them using the same scoping as other query : operators : : : fq: {!tag=name_name}{!field f=name}United States Think of that example in the context of this one -- the basics of

Re: solr-user Digest of: get.100322

2014-05-20 Thread Shawn Heisey
On 5/20/2014 2:01 AM, Jeongseok Son wrote: Though it uses only small amount of memory I'm worried about memory usage because I have to store so many documents. (32GB RAM / total 5B docs, sum of docs. of all cores) If you've only got 32GB of RAM and there are five billion docs on the system,

Re: Vague Behavior while setting Solr Cloud

2014-05-20 Thread Shawn Heisey
On 5/20/2014 7:10 AM, Tim Burner wrote: I am trying to setup Solr Cloud referring to the blog http://myjeeva.com/solrcloud-cluster-single-collection-deployment.html if I complete the set in one go, then it seems to be going fine. when the setup is complete and I am trying to restart Solr

Re: Issue paging when sorting on a Date field

2014-05-20 Thread Chris Hostetter
: So I think when I was paging through the results, if the query for page N : was handled by replica1 and page N+1 handled by replica2, and the page : boundary happened to be where the reversed rows were, this would produce : the behavior I was seeing where the last row from the previous page was
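One common way to make page boundaries stable in this situation, offered here as an assumption and not as what the thread concluded, is to add the unique key as a secondary sort so that documents with identical dates always come back in the same order. The toy example below mimics two replicas that hold the same rows in a different internal order; field names are hypothetical.

```python
def page(docs, start, rows):
    """Return one page, sorted by creation date with the unique id as tie-breaker."""
    ordered = sorted(docs, key=lambda d: (d["created"], d["id"]))
    return ordered[start:start + rows]

# Two replicas holding the same documents in a different internal order.
replica1 = [
    {"id": "a", "created": "2014-05-19T12:00:00Z"},
    {"id": "b", "created": "2014-05-19T12:00:00Z"},
]
replica2 = list(reversed(replica1))

first = page(replica1, 0, 1)   # page N served by replica1
second = page(replica2, 1, 1)  # page N+1 served by replica2
print([d["id"] for d in first + second])
```

In a Solr request the equivalent would be a sort parameter with two clauses, for example the date field followed by the unique key field.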

Re: Issue paging when sorting on a Date field

2014-05-20 Thread Shawn Heisey
On 5/19/2014 2:05 PM, Bryan Bende wrote: Using Solr 4.6.1 and in my schema I have a date field storing the time a document was added to Solr. I have a utility program which: - queries for all of the documents in the previous day sorted by create date - pages through the results keeping

Stemming for Chinese and Japanese

2014-05-20 Thread Geepalem
Hi, What filter should be used to implement stemming for Chinese and Japanese language field types? For English, I have used <filter class="solr.SnowballPorterFilterFactory" language="English"/> and it's working fine. Appreciate your help! Thanks, G. Naresh Kumar

Extensibility and code reuse: SOLR vs Lucene

2014-05-20 Thread Achim Domma
Hi, I have a project, where we need to do aggregations over facetted values. The stats component is not powerful enough anymore and the new statistic component seems not to be ready yet. I understand that it's not easy to create a general purpose component for this task. I decided to check

Re: Extensibility and code reuse: SOLR vs Lucene

2014-05-20 Thread Joel Bernstein
Achim, Solr can be extended to plug in custom analytics. The code snippet you mention is part of the framework which enables this. Here is how you do it: 1) Create a QParserPlugin that returns a Query that extends PostFilter. 2) Then implement the PostFilter api and return a DelegatingCollector

Re: Extensibility and code reuse: SOLR vs Lucene

2014-05-20 Thread Yonik Seeley
On Tue, May 20, 2014 at 6:01 PM, Achim Domma do...@procoders.net wrote: - I found several times code snippets like if (collector instanceof DelegatingCollector) { ((DelegatingCollector)collector).finish() } . Such code is considered bad practice in every OO language I know. Do I miss

Re: How to optimize single shard only?

2014-05-20 Thread Erick Erickson
Marcin is correct. The index size on disk will perhaps double. (triple in compound case). The reason is so you don't lose your index if the process is interrupted. Consider the case where you're optimizing to one segment. 1 All the current segments are copied into the new segment 2 The new
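As a rough worked example of that rule of thumb, the sketch below estimates the peak disk footprint during an optimize. The factors 2 and 3 come from the message above; the function name is hypothetical and the 16 GB figure is the core size mentioned earlier in this thread.

```python
def peak_disk_gb(index_gb, compound=False):
    """Rough upper bound on disk used while optimizing an index.

    Uses the rule of thumb from the thread: about 2x the index size,
    or about 3x when the compound file format is in use.
    """
    factor = 3 if compound else 2
    return index_gb * factor

print(peak_disk_gb(16))                  # non-compound case
print(peak_disk_gb(16, compound=True))   # compound case
```

For a 16 GB core this gives 32 GB, or 48 GB in the compound case, so the free space on the volume should be checked before triggering the optimize.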

Re: Solr Cloud Shards and Replica not reviving after restarting

2014-05-20 Thread Erick Erickson
First thing I'd look at is the log on the server. It's possible that you've changed the configuration such that Solr can't start. Shot in the dark, but that's where I'd start looking. Best, Erick On Tue, May 20, 2014 at 4:45 AM, Tim Burner imtimbur...@gmail.com wrote: Hi Everyone, I have

Re: Odd interaction between {!tag..} and {!field}

2014-05-20 Thread Erick Erickson
Thanks Chris! The query parsing stuff is something I keep stumbling over, but you may have noticed that! Erick On Tue, May 20, 2014 at 10:06 AM, Chris Hostetter hossman_luc...@fucit.org wrote: : when local params are embedded in a query being parsed by the : LuceneQParser, it applies them

solr-server refresh index

2014-05-20 Thread zzz
Hi I am using Solr on a 4 node CDH5 cluster (1 namenode, 3 datanodes). I am running the solr-server on the namenode, and the solr-indexer on each of the datanodes, alongside the hbase regionservers, for NRT indexing of a hbase table. The basics of the indexing seem to work - when I add records

Re: solr-server refresh index

2014-05-20 Thread Erick Erickson
Well, you can always make documents in Solr visible by issuing a hard commit or waiting for your hard commit (openSearcher=true) or soft commit interval to expire. But as far as the Cloudera product, you'd get much better answers by asking in Cloudera-specific forums. Here's a place to start...

Re: Solr performance: multiValued filed vs separate fields

2014-05-20 Thread rulinma
I think a multiValued field stores multiple copied values, so the index is bigger and querying is easier, but performance may be worse; it depends on how you use it.