Re: how to use HTMLStripCharFilter in solrJ?

2018-07-05 Thread Ahmet Arslan
Hi Arturas, Here are some things to try: 1) HTMLStripCharFilter stripper = new HTMLStripCharFilter(strReader.markSupported() ? strReader : new BufferedReader(strReader)); 2) Consider using the HTML strip update processor factory. 3) Create a custom Lucene analyzer using the HTML strip char filter

Re: coord in SolR 7

2018-02-18 Thread Ahmet Arslan
Hi Andreas, Can weak AND (WAND) be used in your use case? https://issues.apache.org/jira/browse/LUCENE-8135 Ahmet On Monday, February 12, 2018, 1:44:38 PM GMT+3, Moll, Dr. Andreas wrote: Hi, I try to upgrade our SolR installation from SolR 5 to 7. We use a customized

Re: Difference between UAX29URLEmailTokenizerFactory and ClassicTokenizerFactory

2017-11-24 Thread Ahmet Arslan
Hi Zheng, UAX29UET recognizes URLs and e-mails. It does not tokenize them; it keeps each one as a single token. StandardTokenizer produces two or more tokens for such an entity. Please try them on the analysis page and use whichever suits your requirements. Ahmet On Friday, November 24, 2017, 11:46:57

Re: get all tokens from TokenStream in my custom filter

2017-11-19 Thread Ahmet Arslan
blank and only last token is indexed . Ahmet i could not find peek or advance method :(   Please help me guys .  On Fri, Nov 17, 2017 at 10:10 PM, Ahmet Arslan <iori...@yahoo.com> wrote: Hi Kumar, If I am not wrong, I think there is method named something like peek(2) or advance(2).

Re: get all tokens from TokenStream in my custom filter

2017-11-17 Thread Ahmet Arslan
Hi Kumar, If I am not wrong, I think there is a method named something like peek(2) or advance(2). Some filters access tokens ahead and perform some logic. Ahmet On Wednesday, November 15, 2017, 10:50:55 PM GMT+3, kumar gaurav wrote: Hi I need to get full field value

Re: Keeping the index naturally ordered by some field

2017-10-01 Thread Ahmet Arslan
Hi Alex, Lucene has this capability (borrowed from Nutch) under the org.apache.lucene.index.sorter package. I think it has been integrated into Solr, but I could not find the Jira issue. Ahmet On Sunday, October 1, 2017, 10:22:45 AM GMT+3, alexpusch wrote: Hello,

Re: Help with Query/Function for conditional boost

2017-08-16 Thread Ahmet Arslan
Hi Shamik, I believe the 5-arg map function can be used here. Here is a link which may inspire you. http://www.cominvent.com/2012/01/25/super-flexible-autocomplete-with-solr/ Ahmet On Wednesday, August 16, 2017, 11:06:28 PM GMT+3, Shamik Bandopadhyay wrote: Hi,   I'm

Re: QueryParser changes query by itself

2017-08-15 Thread Ahmet Arslan
Hi Bernd, In LUCENE-3758, a new member field was added to the ComplexPhraseQuery class, but we didn't change its hashCode method accordingly. This caused anomalies in Solr, and Yonik found the bug and fixed hashCode. Your e-mail somehow reminded me of this. Could it be the QueryCache and hashCode

Re: RE: Comparison of Solr with Sharepoint Search

2017-08-14 Thread Ahmet Arslan
Hi, https://manifoldcf.apache.org is used to crawl content from SharePoint and index into Solr. Ahmet On Monday, August 14, 2017, 9:05:20 PM GMT+3, jmahuang wrote: Sir, Can SOLR search existing SharePoint document libraries and lists? Thanks! -- View this message

Re: Token "states" not getting lemmatized by Solr?

2017-08-10 Thread Ahmet Arslan
Hi Omer, Your analysis chain does not include a stem filter (lemmatizer). Assuming you are dealing with English text, you can use KStemFilterFactory or SnowballFilterFactory. Ahmet On Thursday, August 10, 2017, 9:33:08 PM GMT+3, OTH wrote: Hi, Regarding 'analysis

Re: Indexing a CSV that contains double quotes

2017-08-07 Thread Ahmet Arslan
don't understand, do you think you could clarify a little bit? Thanks, Devon O'Shaughnessy Developer/Analyst Upper Lakes Foods p: 800.879.1265 | ext: 4135 w: upperlakesfoods.com   From: Ahmet Arslan <iori...@yahoo.co

Re: Indexing a CSV that contains double quotes

2017-08-07 Thread Ahmet Arslan
Hi Devon, I think you need to supply encapsulator=" parameter-value pair. Ahmet On Monday, August 7, 2017, 7:57:45 PM GMT+3, O'Shaughnessy, Devon wrote:    Hello all, I'm pretty new at Solr, having only worked with in a couple weeks, and I'm guessing I'm having

Re: Highlighting words with special characters

2017-07-19 Thread Ahmet Arslan
Hi, Maybe the name of UAX29URLEmailTokenizer is deceiving you? It does *not* tokenize URLs and e-mails. It recognises them and emits each one as a single token. Ahmet On Wednesday, July 19, 2017, 12:00:05 PM GMT+3, Lasitha Wattaladeniya wrote: Update, I changed the

Re: Solr Analyzer for Vietnamese

2017-07-13 Thread Ahmet Arslan
Hi Eirik, I believe "icu tokenizer" does a decent job on text written in non-alphabets. Ahmet On Monday, May 22, 2017, 10:32:22 AM GMT+3, Eirik Hungnes wrote: Hi, There doesn't seem to be any Tokenizer / Analyzer for Vietnamese built in to Lucene at the moment.

Re: How to get field names of dynamic field

2017-04-14 Thread Ahmet Arslan
Hi Midas, LukeRequestHandler shows that information. Ahmet On Friday, April 14, 2017, 1:16:09 PM GMT+3, Midas A wrote: Actually , i am looking for APi On Fri, Apr 14, 2017 at 3:36 PM, Andrea Gazzarini wrote: > I can see those names in the "Schema 

Re: KeywordTokenizer and multiValued field

2017-04-12 Thread Ahmet Arslan
I don't understand the first option, what is each value? Keyword tokenizer emits single token, analogous to string type. On Wednesday, April 12, 2017, 7:45:52 PM GMT+3, Walter Underwood wrote: Does the KeywordTokenizer make each value into a unitary string or does it

Re: Filtering results by minimum relevancy score

2017-04-12 Thread Ahmet Arslan
Hi, I cannot find it. However, it should be something like q=hello&fq={!frange l=0.5}query($q) Ahmet On Wednesday, April 12, 2017, 10:07:54 PM GMT+3, Ahmet Arslan <iori...@yahoo.com.INVALID> wrote: Hi David, A function query named "query" returns the score for the given sub
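
The frange idea in this reply can be sketched as a request string built in plain Java. This is only a minimal sketch: the class and method names are made up for illustration, and placing the frange clause in an fq parameter is an assumption about the intended request, since the archived snippet appears to have lost some characters between the two clauses.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Solr or SolrJ): builds the parameters of a
// /select request that keeps only documents scoring at least minScore.
class MinScoreFilter {

    // Returns the URL-encoded query-string part of the request.
    static String buildParams(String userQuery, double minScore) {
        // {!frange l=...} keeps documents whose function value is >= l;
        // query($q) evaluates to the score of the main query.
        String fq = "{!frange l=" + minScore + "}query($q)";
        return "q=" + encode(userQuery) + "&fq=" + encode(fq);
    }

    // Percent-encodes a parameter value so that braces, '$' and '=' survive transport.
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(buildParams("hello", 0.5));
    }
}
```

Keep in mind that raw scores are not comparable across different queries, so a fixed threshold such as 0.5 only makes sense for a query shape you have inspected.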

Re: Filtering results by minimum relevancy score

2017-04-12 Thread Ahmet Arslan
Hi David, A function query named "query" returns the score for the given subquery. Combined with the frange query parser, this is possible. I tried it in the past. I am searching for the original post; I think it was Yonik's. https://cwiki.apache.org/confluence/display/solr/Function+Queries Ahmet

Re: Filtering results by minimum relevancy score

2017-04-10 Thread Ahmet Arslan
Hi, I remember that this is possible via the frange query parser, but I don't have the query string at hand. Ahmet On Monday, April 10, 2017, 9:00:09 PM GMT+3, David Kramer wrote: I’ve done quite a bit of searching on this.  Pretty much every page I find says it’s a bad

Re: How on EARTH do I remove 's in schema file?

2017-03-19 Thread Ahmet Arslan
Hi Donato, How about using ApostropheFilterFactory ? http://lucene.apache.org/core/6_4_2/analyzers-common/org/apache/lucene/analysis/tr/ApostropheFilter.html Ahmet On Sunday, March 19, 2017 4:08 PM, donato wrote: Then why is it not working? It doesn't make sense at

Re: Distinguish exact match from wildcard match

2017-03-02 Thread Ahmet Arslan
Hi, how about q=code_text:bolt*=code_text:bolt Ahmet On Thursday, March 2, 2017 4:41 PM, Сергей Твердохлеб wrote: Hi, is there way to separate exact match from wildcard match in solr response? e.g. there are two documents: {code_text:bolt} and {code_text:bolter}. When

Re: CPU Intensive Scoring Alternatives

2017-02-21 Thread Ahmet Arslan
Hi, The new default similarity is BM25. Maybe explicitly set the similarity to tf-idf and see how it goes? Ahmet On Tuesday, February 21, 2017 4:28 AM, Fuad Efendi wrote: Hello, Default TF-IDF performs poorly with the indexed 200 millions documents. Query "Michael Jackson" may run

Re: Stemming and accents

2017-02-10 Thread Ahmet Arslan
Hi, I have experimented before, and found that Snowball is sensitive to accents/diacritics. Please see for more details: http://www.sciencedirect.com/science/article/pii/S0306457315001053 Ahmet On Friday, February 10, 2017 11:27 AM, Dominique Bejean wrote: Hi,

Re: Dismax query special characters

2017-01-29 Thread Ahmet Arslan
Hi, I don't think dismax recognizes AND/OR. The special characters for dismax are +, - and quotes. In your example, the ampersand may be causing you trouble because of URL encoding. Ahmet On Sunday, January 29, 2017 12:17 AM, Jarosław Grązka wrote: Hi, Reading Solr

Re: Empty Highlight Problem - Solr 6.3.0

2016-12-24 Thread Ahmet Arslan
Hi, Did you try increasing hl.maxAnalyzedChars ? Ahmet On Friday, December 23, 2016 10:47 PM, Furkan KAMACI wrote: Hi All, I'm trying highlighter component at Solr 6.3. I have a problem when I index PDF files. I know that given keyword exists at result document (it

Re: Stemming with SOLR

2016-12-15 Thread Ahmet Arslan
Hi, KStemFilter returns legitimate English words, please use it. Ahmet On Thursday, December 15, 2016 6:17 PM, Lasitha Wattaladeniya wrote: Hello devs, I'm trying to develop this indexing and querying flow where it converts the words to its original form (lemmatization).

Re: Searching for a term which isn't a part of an expression

2016-12-15 Thread Ahmet Arslan
using a PostFilter or adding a SearchComponent to filter out the "bad" results, but obviously a true query-time support would be a lot better. On Wed, Dec 14, 2016 at 10:52 PM, Ahmet Arslan <iori...@yahoo.com.invalid> wrote: > Hi, > > Do you have a common list of

Re: Searching for a term which isn't a part of an expression

2016-12-14 Thread Ahmet Arslan
Hi, Do you have a common list of phrases that you want to prohibit partial match? You can index those phrases in a special way, for example, This is a new world hello_world hot_dog tap_water etc. ahmet On Wednesday, December 14, 2016 9:20 PM, deansg wrote: We would like to

Re: Unicode Character Problem

2016-12-10 Thread Ahmet Arslan
Hi Furkan, I am pretty sure this is a pdf extraction thing. Turkish characters caused us trouble in the past during extracting text from pdf files. You can confirm by performing manual copy-paste from original pdf file. Ahmet On Friday, December 9, 2016 8:44 PM, Furkan KAMACI

Re: Wildcard searches with space in TextField/StrField

2016-11-25 Thread Ahmet Arslan
Hi, You could try this: drop wildcard stuff altogether: 1) Employ edgengramfilter at index time. 2) Use plain searches at query time. Ahmet On Friday, November 25, 2016 4:59 PM, Sandeep Khanzode wrote: Hi All, Can someone please assist with this query?
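
To make the index-time side of this suggestion concrete, here is a small sketch in plain Java of the terms an edge n-gram filter would emit for one token. It imitates the idea only and does not call Lucene; the class name, method name and gram sizes are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of edge n-gramming; not the Lucene implementation.
class EdgeNGramSketch {

    // Returns the leading substrings of term whose length lies in [minGram, maxGram].
    static List<String> edgeNGrams(String term, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int upper = Math.min(maxGram, term.length());
        for (int len = minGram; len <= upper; len++) {
            grams.add(term.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Each printed gram would be an indexed term, so a plain query for
        // any of them finds the document without a wildcard.
        System.out.println(edgeNGrams("bolter", 2, 4));
    }
}
```

Because every prefix becomes its own indexed term, the query side stays a plain term lookup; the price is a larger index.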

Re: Problem with Han character in ICUFoldingFilter

2016-10-30 Thread Ahmet Arslan
Hi Eyal, ICUFoldingFilter uses http://site.icu-project.org under the hood. If you think there is a bug, it is better to ask its mailing list. Ahmet On Sunday, October 30, 2016 3:41 PM, "eyal.naam...@exlibrisgroup.com" wrote: Hi, I was wondering if anyone ran

Re: Solr 5.3.1 - Synonym is not working as expected

2016-10-25 Thread Ahmet Arslan
Hi, If your index is pure Chinese, I would do the expansion on query time only. Simply replace English query term with Chinese translations. Ahmet On Tuesday, October 25, 2016 12:30 PM, soundarya wrote: We are using Solr 5.3.1 version as our search engine. This

Re: Lowercase all characters in String

2016-10-11 Thread Ahmet Arslan
Hi, KeywordTokenizer and LowerCaseFilter should suffice. Optionally you can add TrimFilter too. Ahmet On Tuesday, October 11, 2016 5:24 PM, Zheng Lin Edwin Yeo wrote: Hi, Would like to find out, what is the best way to lowercase all the text, while preserving all the

Re: Preceding special characters in ClassicTokenizerFactory

2016-10-03 Thread Ahmet Arslan
Hi Andy, WordDelimiterFilter has a "types" option. There is an example file named wdftypes.txt in the source tree that preserves #hashtags and @mentions. If you follow this path, please use the whitespace tokenizer. Ahmet On Monday, October 3, 2016 9:52 PM, "Whelan, Andy"

Re: StrField with Wildcard Search

2016-09-08 Thread Ahmet Arslan
owever, the wildcard/fuzzy functionality will still be provided no matter the approach... SRK On Thursday, September 8, 2016 5:05 PM, Ahmet Arslan <iori...@yahoo.com.INVALID> wrote: Hi, EdgeNGram and Wildcard may be used to achieve the same goal: prefix search or starts with searc

Re: StrField with Wildcard Search

2016-09-08 Thread Ahmet Arslan
Hi, EdgeNGram and wildcard queries can be used to achieve the same goal: prefix (starts-with) search. A wildcard query enumerates the whole inverted index, so it may get slower for very large indexes; on the other hand, it requires no index-time manipulation. EdgeNGram does its magic at

Re: changed query parsing between 4.10.4 and 5.5.3?

2016-09-07 Thread Ahmet Arslan
Hi, The tilde in the former looks interesting. I think it is related to proximity search. What query parser is this? Ahmet On Wednesday, September 7, 2016 10:52 AM, Bernd Fehling wrote: Hi list, while going from SOLR 4.10.4 to 5.5.3 I noticed a change in

Re: Blank/Null value search in term filter

2016-09-05 Thread Ahmet Arslan
re any way out without making any configuration change. Please suggest. On 02-Sep-2016 9:37 PM, "Ahmet Arslan" <iori...@yahoo.com> wrote: > > > Hi Kishore, > > You can employ an impossible token value (say XX) for null values. > This can be done via default val

Re: Blank/Null value search in term filter

2016-09-02 Thread Ahmet Arslan
Hi Kishore, You can employ an impossible token value (say XX) for null values. This can be done via the default value update processor factory. You index some placeholder token for null values. fq={!terms f='queryField' separator='|'}A|XX would fetch docs with A or null values. Ahmet On Friday,
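
The filter from this reply can be assembled programmatically. The following is a minimal sketch in plain Java; the class and method names are hypothetical, and XX is the placeholder value taken from the message, which must match whatever token was indexed for null values.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of SolrJ): builds a {!terms} filter that
// optionally also matches documents indexed with the null placeholder.
class NullAwareTermsFilter {

    // Placeholder token assumed to be indexed in place of missing values.
    static final String NULL_PLACEHOLDER = "XX";

    static String buildFq(String field, List<String> values, boolean includeNulls) {
        StringBuilder fq = new StringBuilder("{!terms f='" + field + "' separator='|'}");
        fq.append(String.join("|", values));
        if (includeNulls) {
            if (!values.isEmpty()) {
                fq.append('|');
            }
            fq.append(NULL_PLACEHOLDER);
        }
        return fq.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildFq("queryField", Arrays.asList("A"), true));
    }
}
```

The separator must be a character that never occurs in the values themselves, otherwise a single value would be split into two terms.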

Re: Sorting non-english text

2016-08-25 Thread Ahmet Arslan
; query time. Does it mean, if for example, I update JVM patch-version, then already indexed documents whose indexed fields used CollationKeyAnalyzer needs to be re-indexed or else we cannot query them? Thanks, Vasu On Thu, Aug 25, 2016 at 7:59 PM, Ahmet Arslan <iori...@yahoo.com.

Re: Sorting non-english text

2016-08-25 Thread Ahmet Arslan
Hi Vasu, There is a field type or something like that (CollationKeyAnalyzer) for language specific sorting. Ahmet On Thursday, August 25, 2016 12:29 PM, Vasu Y wrote: Hi, I have a text field which can contain values (multiple tokens) in English; to support sorting, I had

Re: Wildcard search not working

2016-08-12 Thread Ahmet Arslan
with this schema? Respectively, what should I change to be able to correctly do wildcard searches? Many thanks for your time. Cheers, christian -- Christian Ribeaud Software Engineer (External) NIBR / WSJ-310.5.17 Novartis Campus CH-4056 Basel -Original Message----- From: Ahmet Arslan [mailto:iori...@

Re: Wildcard search not working

2016-08-11 Thread Ahmet Arslan
Hi Christian, The query r?che may not return at least the same number of matches as roche, depending on your analysis chain. The difference is that roche is analyzed but r?che is not. Wildcard queries are executed on the indexed/analyzed terms. For example, if roche is indexed/analyzed as roch, the
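
The mismatch described here can be reproduced without Solr. The sketch below, in plain Java, translates the two wildcard operators into a regular expression and tests it against an indexed term. It assumes, as in the message, that the analysis chain turned roche into roch; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical illustration: wildcard patterns are compared against the
// terms stored in the index, not against the original text.
class WildcardVsAnalyzedTerm {

    // Translates wildcard syntax (? = exactly one char, * = any run of chars) into a regex.
    static Pattern wildcardToRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // True when the whole indexed term matches the wildcard pattern.
    static boolean matchesIndexedTerm(String wildcard, String indexedTerm) {
        return wildcardToRegex(wildcard).matcher(indexedTerm).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesIndexedTerm("r?che", "roch"));  // stemmed term in the index
        System.out.println(matchesIndexedTerm("r?che", "roche")); // unstemmed term
    }
}
```

With a stemming chain, the pattern therefore has to be written against the stemmed form, or the wildcard search has to target a field without stemming.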

Re: Query optimization

2016-07-29 Thread Ahmet Arslan
Oops, I forgot the link: http://yonik.com/solr/paging-and-deep-paging/ On Friday, July 29, 2016 9:51 AM, Ahmet Arslan <iori...@yahoo.com> wrote: Hi Midas, Please search 'deep paging' over the documentation, mailing list, etc. Solr Deep Paging and Sorting Ahmet On Friday, July 29, 201

Re: Query optimization

2016-07-29 Thread Ahmet Arslan
Hi Midas, Please search 'deep paging' over the documentation, mailing list, etc. Solr Deep Paging and Sorting Ahmet On Friday, July 29, 2016 9:21 AM, Midas A wrote: please reply . On Fri, Jul 29, 2016 at 10:26 AM, Midas A wrote: > a) my index

Re: No need white space split

2016-07-25 Thread Ahmet Arslan
Hi, Maybe you can simply use the string field type? Or KeywordTokenizerFactory? Ahmet On Monday, July 25, 2016 4:38 PM, Shashi Roushan wrote: Hi All, I am Shashi. I am using Solr 6.1. I want to get result only when the hole word matched. Actually I want to avoid

Re: Find part of long query in shorter fields

2016-07-21 Thread Ahmet Arslan
Hi, If you want to disable operators altogether please use dismax instead of edismax. In dismax, only + and - unary operators are supported, if i am not wrong. I don't remember the situation of quotations for the phrase query. Ahmet On Tuesday, July 19, 2016 8:29 PM, CA

Re: Find part of long query in shorter fields

2016-07-16 Thread Ahmet Arslan
Hi Chantal, Please see https://issues.apache.org/jira/browse/LUCENE-7148 ahmet On Saturday, July 16, 2016 3:48 PM, CA wrote: Hello all, our index contains product offers from online shops. The fields we are indexing have all rather short values: the name of the

Re: Filter Query that matches all values of a field

2016-07-04 Thread Ahmet Arslan
Hi Vasu, This question appears occasionally in the mailing list. Please see https://issues.apache.org/jira/browse/LUCENE-7148 ahmet On Monday, July 4, 2016 9:10 PM, Vasu Y wrote: Hi, I have a single type field that can contain zero or more values (comma separated

Re: Data import handler in techproducts example

2016-07-02 Thread Ahmet Arslan
Hi Jonas, Search for the solr-dataimporthandler-*.jar and place it under a lib directory (same level as the solr.xml file) along with the MySQL JDBC driver (mysql-connector-java-*.jar). Please see: https://cwiki.apache.org/confluence/display/solr/Lib+Directives+in+SolrConfig On Saturday, July

Re: an advice: why not to add a searching model for mailing list

2016-07-02 Thread Ahmet Arslan
Hi Kent, There are already two search systems for the task: http://find.searchhub.org http://search-lucene.com Is this what you mean by saying 'search model'? Ahmet On Saturday, July 2, 2016 6:43 PM, Kent Mu wrote: hi all, I wonder why not do add a searching model

Re: Sorting & searching on the same field

2016-06-23 Thread Ahmet Arslan
Hi Jay, I don't think it can be combined. Mainly because: searching requires a tokenized field. Sorting requires a single value (token) to be meaningful. Ahmet On Thursday, June 23, 2016 7:43 PM, Jay Potharaju wrote: Hi, I would like to have 1 field that can used for

Re: How do we get terms suggestion from SuggestComponent?

2016-06-21 Thread Ahmet Arslan
Hi, With grams parameter of FreeTextLookupFactory, no? Ahmet On Tuesday, June 21, 2016 1:19 PM, solr2020 wrote: Thanks Ahmet. It is working fine. Now i would like to get suggestions for multiple terms. How do i get suggestions for multiple terms?

Re: How do we get terms suggestion from SuggestComponent?

2016-06-20 Thread Ahmet Arslan
Hi, I think : FreeTextLookupFactory DocumentDictionaryFactory 3 content Ahmet On Monday, June 20, 2016 3:51 PM, solr2020 wrote: Hi, I am using solr.SuggestComponent for auto suggestion, it works fine. But the problem is, it returns the whole field value as suggestion

Re: Phrase query proximity parameter doe not show up in parsed query string

2016-06-20 Thread Ahmet Arslan
Hi, I think synonym_edismax is not part of solr. Can you re-produce with the stock edismax? On Monday, June 20, 2016 12:34 PM, preeti kumari wrote: Hi All, My query looks like below : q=((_query_:"{!synonym_edismax qf='partnum' v='597871' bq='' mm=100

Re: Can someone explain about Sweetspot Similarity ?

2016-06-19 Thread Ahmet Arslan
Hi, Sweet spot is designed to punish too long or too short documents. Did you reindex? Can you see the mention of sweet spot in debugQuery=true response? Ahmet On Sunday, June 19, 2016 2:18 PM, dirmanhafiz wrote: Hi , Im Dirman and im trying experiment solr with

Re: Error when searching with special characters

2016-06-18 Thread Ahmet Arslan
efType=dismax or edismax. What could be the reason that it did not work with the default defType=lucene? Regards, Edwin On 18 June 2016 at 01:04, Ahmet Arslan <iori...@yahoo.com.invalid> wrote: > Hi, > > May be URL encoding issue? > By the way, I would use back slash to e

Re: Error when searching with special characters

2016-06-17 Thread Ahmet Arslan
Hi, May be URL encoding issue? By the way, I would use back slash to escape special characters. Ahmet On Friday, June 17, 2016 10:08 AM, Zheng Lin Edwin Yeo wrote: Hi, I encountered this error when I tried to search with special characters, like "&" and "#". {
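
Both points in this reply, backslash escaping and URL encoding, can be illustrated in a few lines of plain Java. This is a sketch under stated assumptions: the list of special characters is an approximation of the classic query parser's syntax characters rather than an authoritative list, and the class is hypothetical. If I remember correctly, SolrJ ships ClientUtils.escapeQueryChars for the escaping part, which would be preferable in real code.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: first escape query-syntax characters, then
// percent-encode the result for use in a URL.
class QueryEscaping {

    // Assumed set of characters that the query parser treats as syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    // Prefixes every syntax character (and whitespace) with a backslash.
    static String escape(String input) {
        StringBuilder escaped = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                escaped.append('\\');
            }
            escaped.append(c);
        }
        return escaped.toString();
    }

    // Builds a q parameter that is safe both for the parser and for the URL.
    static String toParam(String rawTerm) {
        return "q=" + URLEncoder.encode(escape(rawTerm), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(toParam("R&D"));
        System.out.println(toParam("C#"));
    }
}
```

The two layers solve different problems: an unencoded & ends the q parameter and an unencoded # starts the URL fragment before Solr ever sees the query, while the backslash only matters once the text reaches the query parser.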

Re: Stemming

2016-06-16 Thread Ahmet Arslan
Hi Jamal, Snowball requires a lowercase filter above it. This is documented in the javadocs; it is a small but important detail. Please use a lowercase filter after the whitespace tokenizer. Ahmet On Thursday, June 16, 2016 10:13 PM, "Jamal, Sarfaraz"

Re: wildcard search for string having spaces

2016-06-15 Thread Ahmet Arslan
Hi Roshan, I think there are two options: 1) escape the space q=abc\ p* 2) use prefix query parser q={!prefix f=my_string}abc p Ahmet On Wednesday, June 15, 2016 3:48 PM, Roshan Kamble wrote: Hello, I have below custom field type defined for solr 6.0.0
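
The two options from this reply can be written as small string builders. This is a sketch in plain Java with hypothetical names; my_string is the field name used in the message, and the resulting values still need URL encoding before they go into a request.

```java
// Hypothetical helper showing the two ways to run a prefix search when the
// prefix itself contains a space.
class PrefixWithSpace {

    // Option 1: escape each space with a backslash and append the star operator.
    static String escapedWildcard(String prefix) {
        return prefix.replace(" ", "\\ ") + "*";
    }

    // Option 2: hand the raw prefix to the prefix query parser, which takes
    // the text after the local params as-is and needs no escaping.
    static String prefixParser(String field, String prefix) {
        return "{!prefix f=" + field + "}" + prefix;
    }

    public static void main(String[] args) {
        System.out.println("q=" + escapedWildcard("abc p"));
        System.out.println("q=" + prefixParser("my_string", "abc p"));
    }
}
```

Option 2 is usually the less error-prone of the two, because nothing in the user input has to be escaped.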

Re: Question about multiple fq parameters

2016-06-09 Thread Ahmet Arslan
Hi Mikhail, Can you please explain what this mysterious op parameter is? How is it related to range queries issued on date fields? Thanks, Ahmet On Thursday, June 9, 2016 11:43 AM, Mikhail Khludnev wrote: Shawn, I found "op" at

Re: Scoring changes between 4.10 and 5.5

2016-06-09 Thread Ahmet Arslan
Hi, I wondered the same before and failed to decipher TFIDFSimilarity. Scoring looks like tf*idf*idf to me. I would appreciate it if someone could shed some light on this. Thanks, Ahmet On Friday, June 10, 2016 12:37 AM, Upayavira wrote: I've just done a very simple, single term

Re: Question about multiple fq parameters

2016-06-08 Thread Ahmet Arslan
What is the meaning of 'op=Intersects' here? On Thursday, June 9, 2016 12:20 AM, Mikhail Khludnev wrote: oh.. hold on. you might need the space in the later one ?=*=OR= {!field+f=DateB+op=Intersects v=$b} {!field+f=DateA+op=Intersects

Re: carrot2 label understanding(clustering)

2016-06-08 Thread Ahmet Arslan
Hi, This is search result clustering. Carrot2 also assigns labels to clusters. It automatically generates those labels. Ahmet On Wednesday, June 8, 2016 12:36 PM, Mugeesh Husain wrote: Hi, I have a few question regarding clustering , i check out this link

Re: Getting a list of matching terms and offsets

2016-06-05 Thread Ahmet Arslan
way. On Sun, Jun 5, 2016 at 11:30 AM Ahmet Arslan <iori...@yahoo.com.invalid> wrote: > Well debug query has the list of token that caused match. > If i am not mistaken i read an example about span query and spans thing. > It was listing the positions of the matches.

Re: Getting a list of matching terms and offsets

2016-06-05 Thread Ahmet Arslan
ecause the highlighter has to do just this in order to create snippets with accurate highlighting. Justin On Sun, Jun 5, 2016 at 9:09 AM Ahmet Arslan <iori...@yahoo.com.invalid> wrote: > Hi, > > May be org.apache.lucene.search.spans.TermSpans ? > > > > On Sunday, June 5, 20

Re: Getting a list of matching terms and offsets

2016-06-05 Thread Ahmet Arslan
Hi, May be org.apache.lucene.search.spans.TermSpans ? On Sunday, June 5, 2016 7:59 AM, Alexandre Rafalovitch wrote: It sounds like TermVector component's output: https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component Perhaps with additional flags

Re: debugging solr query

2016-05-27 Thread Ahmet Arslan
ents(!fieldA:abc) dont have values in the new columns? >>> >>> >>> Thanks >>> Jay >>> >>> On Tue, May 24, 2016 at 8:06 PM, Erick Erickson <erickerick...@gmail.com >>> > wrote: >>> >>>> Try adding debug=timing,

Re: How can Most Popular Search be implemented in Solr?

2016-05-27 Thread Ahmet Arslan
Hi, Solr does not explicitly save or maintain incoming queries. * Some people save queries at the UI side. * Some folks enable Solr logging and then extract useful query, numFound, QTime, etc information from logs: http://soleami.com * Others identify searches that return zero documents (missing

Re: how can we use multi term search along with stop words

2016-05-26 Thread Ahmet Arslan
ciates.com -Original Message- From: Siddhartha Singh Sandhu [mailto:sandhus...@gmail.com] Sent: Thursday, May 26, 2016 6:54 PM To: solr-user@lucene.apache.org; Ahmet Arslan Subject: Re: how can we use multi term search along with stop words Hi Preeti, You can use the analysis tool in the So

Re: sort by custom function of similarity score

2016-05-26 Thread Ahmet Arslan
Hi, Probably, using the 'query' function query, which returns the score of a given query. https://cwiki.apache.org/confluence/display/solr/Function+Queries#FunctionQueries-UsingFunctionQuery On Thursday, May 26, 2016 1:59 PM, aanilpala wrote: is it allowed to provide a

Re: how can we use multi term search along with stop words

2016-05-26 Thread Ahmet Arslan
Hi Bhat, What do you mean by multi term search? In your first e-mail, your example uses quotes, which means phrase/proximity search. ahmet On Thursday, May 26, 2016 11:49 AM, Preeti Bhat wrote: HI All, Sorry for asking the same question again, but could someone

Re: debugging solr query

2016-05-24 Thread Ahmet Arslan
Hi, Is it QueryComponent taking time? Or other components? Also make sure there is plenty of RAM for OS cache. Ahmet On Wednesday, May 25, 2016 1:47 AM, Jay Potharaju wrote: Hi, I am trying to debug solr performance problems on an old version of solr, 4.3.1. The

Re: highlight don't work if df not specified

2016-05-23 Thread Ahmet Arslan
org.apache.solr.common.SolrException"], "msg":"undefined field text", "code":400}} On Sun, May 22, 2016 at 5:34 PM, Ahmet Arslan <iori...@yahoo.com.invalid> wrote: > Hi, > > What happens when you increase hl.maxAnalyzedChars? > > O

Re: highlight don't work if df not specified

2016-05-22 Thread Ahmet Arslan
Hi, What happens when you increase hl.maxAnalyzedChars? OR hl.q=blah blah&hl.fl=normal_text,title Ahmet On Sunday, May 22, 2016 5:24 PM, michael solomon <micheal...@gmail.com> wrote: On Sun, May 22, 2016 at 5:18 PM, Ahmet Arslan <iori...@yahoo.com.invalid> wrote: > Hi, >

Re: highlight don't work if df not specified

2016-05-22 Thread Ahmet Arslan
Hi, Weird, are your fields stored? On Sunday, May 22, 2016 5:14 PM, michael solomon <micheal...@gmail.com> wrote: Thanks Ahmet, It was mistake in the question, sorry, in the quey I wrote it properly. On Sun, May 22, 2016 at 5:06 PM, Ahmet Arslan <iori...@yahoo.com.invalid>

Re: highlight don't work if df not specified

2016-05-22 Thread Ahmet Arslan
Hi, q=normal_text:"bla bla":"bla bla" should be q=+normal_text:"bla bla" +title:"bla bla" On Sunday, May 22, 2016 4:52 PM, michael solomon wrote: Hi, I'm I query multiple fields in solr: q=normal_text:"bla bla":"bla bla" I turn on the highlighting, but it doesn't

Re: How to use a regex search within a phrase query?

2016-05-22 Thread Ahmet Arslan
Hi Erez, I don't think it is possible to combine regex with phrase out-of-the-box. However, there is https://issues.apache.org/jira/browse/LUCENE-5205 for the task. Can't you define your query in terms of pure regex? something like /[0-9]{3} .* [0-9]{4}/ ahmet On Sunday, May 22, 2016 1:37

Re: indexing dovecot mailbox

2016-05-22 Thread Ahmet Arslan
chine1:/home/a.meyer/Postfach/cur # file 1461583672.Vfe03I1000f4M981621.bitmachine1:2,S 1461583672.Vfe03I1000f4M981621.bitmachine1:2,S: SMTP mail, ASCII text I can read them with the Midnight Commeander. Has it something to do with the file-ending not recognized? Andreas Ahmet Arslan

Re: indexing dovecot mailbox

2016-05-21 Thread Ahmet Arslan
, 2016 3:46 AM, Ahmet Arslan <iori...@yahoo.com.INVALID> wrote: Hi Meyer, Not sure what "mailbox of dovecot" is, but SimplePostTool can recognize certain file types. They (xml,json,...,log) are actually listed in the log msg in your email. Can you describe the format of the fil

Re: indexing dovecot mailbox

2016-05-21 Thread Ahmet Arslan
Hi Meyer, Not sure what "mailbox of dovecot" is, but SimplePostTool can recognize certain file types. They (xml,json,...,log) are actually listed in the log msg in your email. Can you describe the format of the files that you want to index? Are they text files? ahmet On Sunday, May 22, 2016

Re: Solrj 4.7.2 - slowing down over time

2016-05-19 Thread Ahmet Arslan
Hi, EmbeddedSolrServer bypasses the servlet container. Please see: http://find.searchhub.org/document/a88f669d38513a76 On Thursday, May 19, 2016 6:23 PM, Roman Slavik wrote: Hi Ahmet, thanks for your response, I appreciate it. I thought that EmbeddedSolrServer is just

Re: Solrj 4.7.2 - slowing down over time

2016-05-18 Thread Ahmet Arslan
Hi Roman, You said you were using EmbeddedSolrServer, also you mention Tomcat. I don't think it is healthy to use both. Also I wouldn't use EmbeddedSolrServer at all. It is rarely used and there can be hidden things there. Consider using jetty which is actually tested. Since you commit every

Re: Precision, Recall, ROC in solr

2016-05-18 Thread Ahmet Arslan
Hi Tentri, Evaluation in IR is primarily carried out with the traditional TREC-style (also referred to as the Cranfield paradigm) evaluation methodology. The evaluation methodology requires a document collection, a set of information needs (called topics or queries), and a set of query relevance judgments

Re: Filter query (fq) on comma seperated value does not work

2016-05-16 Thread Ahmet Arslan
e data as such... And now I was able to retrieve the expected results. But Still Can you help me out in achieving the results using the comma as you suggested. Thanks & Regards On Mon, May 16, 2016 at 5:50 PM, Ahmet Arslan <iori...@yahoo.com.invalid> wrote: > Hi, > > Its a

Re: easiest way to search parts of words

2016-05-16 Thread Ahmet Arslan
Hi Gates, There are two approaches: 1) Use a wildcard query with the star operator: q=consult* 2) Create an index with EdgeNGramFilterFactory and issue a regular search: q=consult. (2) will be faster at the cost of a bigger index size. You don't need to change anything for (1) if the execution time is

Re: Filter query (fq) on comma seperated value does not work

2016-05-16 Thread Ahmet Arslan
Hi, It's all about how you tokenize the category field. It looks like you are using a string type, which does not tokenize at all (i.e. the value is indexed verbatim). Please use a PatternTokenizer and configure it so that it splits on comma. Ahmet On Monday, May 16, 2016 2:11 PM, SRINI SOLR
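
The difference between the two field types can be imitated in plain Java: a string field keeps the whole value as one term, while a pattern-based tokenizer configured to split on commas produces one term per category. The sketch below does not use Lucene; the class name, method names and sample values are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of how the indexed terms differ.
class CommaTokenizerSketch {

    // Split on a comma, swallowing optional whitespace around it.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // A string field indexes the value verbatim as a single term.
    static List<String> asStringField(String value) {
        return Collections.singletonList(value);
    }

    // A comma-splitting tokenizer yields one term per category.
    static List<String> splitOnComma(String value) {
        return Arrays.asList(COMMA.split(value.trim()));
    }

    public static void main(String[] args) {
        String category = "books, music,film";
        System.out.println(asStringField(category)); // one term: fq=category:music finds nothing
        System.out.println(splitOnComma(category));  // three terms: fq=category:music matches
    }
}
```

A filter query on a single category can only match when that category exists as a term of its own, which is why the verbatim string value does not work here.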

Re: URL parameters combined with text param

2016-05-13 Thread Ahmet Arslan
no_coord +() ExtendedDismaxQParser [...] On 12/05/2016 17:06, Erick Erickson wrote: > Try adding &debug=query to your query and look at the parsed results. > This shows you exactly what Solr sees rather than what you think > it should. > > Best, &g

Re: URL parameters combined with text param

2016-05-12 Thread Ahmet Arslan
ost:8983/solr/my_core/select?q=hospital ) Kind regards, Bastien On 11/05/2016 16:06, Ahmet Arslan wrote: > Hi Bastien, > > Please use magic _query_ field, q=hospital AND _query_:"{!q.op=AND v=$a}" > > ahmet > > > On Wednesday, May 11, 2016 2:35 PM, Latard - MDPI

Re: Error

2016-05-11 Thread Ahmet Arslan
Hi Midas, It looks like you are committing too frequently, so cache warming cannot catch up. Either lower your commit rate, or disable cache auto warm (autowarmCount=0). You can also remove queries registered at newSearcher event if you have defined some. Ahmet On Wednesday, May 11, 2016 2:51

Re: URL parameters combined with text param

2016-05-11 Thread Ahmet Arslan
Hi Bastien, Please use magic _query_ field, q=hospital AND _query_:"{!q.op=AND v=$a}" ahmet On Wednesday, May 11, 2016 2:35 PM, Latard - MDPI AG wrote: Hi Everybody, Is there a way to pass only some of the data by reference and some others in the q param? e.g.:

Re: How to search string

2016-05-11 Thread Ahmet Arslan
Hi, You can be explicit about the field that you want to search on, e.g. q=product_name:(Garmin Class A) Or you can use the lucene query parser with the default field (df) parameter, e.g. q={!lucene df=product_name}Garmin Class A It's all about query parsers. Ahmet On Wednesday, May 11, 2016 9:12

Re: How to search in solr for words like %rek Dr%

2016-05-11 Thread Ahmet Arslan
Hi Thrinadh, Why don't you use plain wildcard search? There are two operators, star and question mark, for this purpose. Ahmet On Wednesday, May 11, 2016 4:31 AM, Thrinadh Kuppili wrote: Thank you, Yes i am aware that surround with quotes will result in match for space

Re: Facet ignoring repeated word

2016-05-10 Thread Ahmet Arslan
+1 to Toke's facet and stats combo! On Tuesday, May 10, 2016 11:21 AM, Toke Eskildsen wrote: On Fri, 2016-04-29 at 08:55 +, G, Rajesh wrote: > I am trying to implement word >

Re: how to find out how many times a word appears in a collection of documents?

2016-05-10 Thread Ahmet Arslan
t is marked for deletion. df values include deleted documents. Christian Fotache Tel: 0728.297.207 Fax: 0351.411.570 From: Ahmet Arslan <iori...@yahoo.com.INVALID> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>; "liviuchrist...@yahoo.com" <

Re: how to find out how many times a word appears in a collection of documents?

2016-05-10 Thread Ahmet Arslan
Hi Christian, Collection wide term statistics can be accessed via TermsComponent or LukeRequestHandler. Ahmet On Tuesday, May 10, 2016 1:26 PM, "liviuchrist...@yahoo.com.INVALID" wrote: Hi everyone, I need to "read" the solr/lucene index and see how many

Re: Facet ignoring repeated word

2016-05-09 Thread Ahmet Arslan
hments by anyone other than the intended person(s) is prohibited. -Original Message- From: G, Rajesh [mailto:r...@cebglobal.com] Sent: Friday, May 6, 2016 1:08 PM To: Ahmet Arslan <iori...@yahoo.com>; solr-user@lucene.apache.org Subject: RE: Facet ignoring repeated word Hi Ahmet,

Re: Filter queries & caching

2016-05-08 Thread Ahmet Arslan
Hi, As I understand it, it is useful in case you use an OR operator between two restricting clauses. Recall that multiple fq parameters mean an implicit AND. ahmet On Monday, May 9, 2016 4:02 AM, Jay Potharaju wrote: As mentioned above adding filter() will add the filter query to the

Re: Facet ignoring repeated word

2016-05-06 Thread Ahmet Arslan
ttachments by anyone other than the intended person(s) is prohibited. -Original Message- From: G, Rajesh [mailto:r...@cebglobal.com] Sent: Thursday, May 5, 2016 4:29 PM To: Ahmet Arslan <iori...@yahoo.com>; solr-user@lucene.apache.org; erickerick...@gmail.com Subject: RE:

Re: How to get all the docs whose field contain a specialized string?

2016-05-06 Thread Ahmet Arslan
Hi, It looks like brand_s is defined as string, which is not tokenized. Please do one of the following to retrieve "brand_s":"ibm hp" a) use a tokenized field type or b) issue a wildcard query of q=ibm* Ahmet On Friday, May 6, 2016 8:35 AM, 梦在远方 wrote: Hi, all I do
