Re: error reporting during indexing

2015-09-29 Thread Matteo Grolla
Hi Erik, it's a curiosity question. When I add a document it's buffered by Solr and can be (and apparently is) parsed to verify it matches the schema. But it's not written to a segment file until a commit is issued. If there is a problem writing the segment, e.g. a permission error, isn't this a case

ConcurrentUpdateSolrServer with timeout on flushing ?

2015-09-29 Thread gsus
Is there any possibility to use the SolrJ ConcurrentUpdateSolrServer to flush its queue on two conditions: the queue is full OR a timeout occurs? (e.g. no new documents for 2 minutes, so let's flush) Best regards
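The behaviour asked about here (flush when the queue is full OR when no document has arrived for a while) can be sketched independently of SolrJ. The following is a minimal, single-threaded illustration and not a ConcurrentUpdateSolrServer feature: `FlushingBuffer`, `poll`, and the `flush` callback are hypothetical names, and the callback stands in for whatever actually sends a batch to Solr.

```python
import time


class FlushingBuffer:
    """Buffers documents and flushes when full or when idle for too long.

    Hypothetical helper for illustration; not part of SolrJ.
    """

    def __init__(self, flush, max_size=100, idle_seconds=120.0,
                 clock=time.monotonic):
        self._flush = flush              # callback that receives a list of docs
        self._max_size = max_size        # condition 1: queue is full
        self._idle_seconds = idle_seconds  # condition 2: no new docs for this long
        self._clock = clock
        self._docs = []
        self._last_add = clock()

    def add(self, doc):
        """Queue a document; flush immediately if the queue is now full."""
        self._docs.append(doc)
        self._last_add = self._clock()
        if len(self._docs) >= self._max_size:
            self.flush()

    def poll(self):
        """Call periodically; flushes if documents are waiting and the
        buffer has been idle for at least idle_seconds."""
        idle = self._clock() - self._last_add
        if self._docs and idle >= self._idle_seconds:
            self.flush()

    def flush(self):
        """Hand the pending batch to the callback and clear the queue."""
        if self._docs:
            batch, self._docs = self._docs, []
            self._flush(batch)
```

In a real indexer `poll` would be driven by a timer or background thread, and the buffer would need a lock around `add`, `poll`, and `flush`; both are left out here to keep the sketch short.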

Re: MoreLikeThisHandler with multiple input documents

2015-09-29 Thread Upayavira
Let's take a step back. So, you have 3000 or so docs, and you want to know which documents are similar to these. Why do you want to know this? What feature do you need to build that will use that information? Knowing this may help us to arrive at the right technology for you. For example, you

Re: error reporting during indexing

2015-09-29 Thread Alessandro Benedetti
Hi Matteo, at this point I would suggest this article by Erick: https://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ If I am not wrong, when the document is indexed (simplifying): 1) The document is added to the current segment in memory 2) When a

Re: MoreLikeThisHandler with multiple input documents

2015-09-29 Thread Szűcs Roland
Hi Alessandro, My original goal was to get offline suggestions based on content similarity for every e-book we have. We wanted to run a bulk MoreLikeThis calculation in the evening, when the usage of our site is low and we submit a new e-book. Real-time MoreLikeThis can take a while as we

Re: MoreLikeThisHandler with multiple input documents

2015-09-29 Thread Upayavira
If MoreLikeThis is slow for large documents that are indexed, have you enabled term vectors on the similarity fields? Basically, what more like this does is this: * decide on what terms in the source doc are "interesting", and pick the 25 most interesting ones * build and execute a boolean query
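The two steps named above (pick the most "interesting" terms of the source document, then build a boolean query from them) can be sketched as follows. This is an illustration under stated assumptions, not Lucene's actual MoreLikeThis code: the tf*idf score used here is a simple stand-in for how "interesting" is decided, and `interesting_terms`, `build_boolean_query`, and their parameters are hypothetical names.

```python
import math
from collections import Counter


def interesting_terms(source_tokens, doc_freq, num_docs, max_terms=25):
    """Rank the source document's terms by a simple tf*idf score and
    return the top max_terms (assumed scoring, for illustration only)."""
    term_freq = Counter(source_tokens)
    scored = []
    for term, freq in term_freq.items():
        df = doc_freq.get(term, 0)
        if df == 0:
            continue  # term unknown to the index, cannot match anything
        idf = math.log(num_docs / df) + 1.0
        scored.append((freq * idf, term))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:max_terms]]


def build_boolean_query(field, terms):
    """Combine the selected terms into one OR query on a single field."""
    return " OR ".join(f"{field}:{term}" for term in terms)
```

With term vectors stored, the term frequencies of the source document can be read directly instead of re-analysing the stored text, which is why enabling them tends to matter for large documents.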

Re: MoreLikeThisHandler with multiple input documents

2015-09-29 Thread Alessandro Benedetti
Hi Roland, what is your exact requirement? Do you want to basically build a "description" for a set of documents and then find documents in the index similar to this description? By default, based on my experience (and on the code), this is the entry point for the Lucene More Like This: >

Re: MoreLikeThisHandler with multiple input documents

2015-09-29 Thread Szűcs Roland
Hello Upayavira, Thanks for dealing with my issue. I have already applied termVectors=true to all fields involved in the more like this calculation. I have just 3,000 documents, each of them represented by a relatively big term vector with more than 20,000 unique terms. If I run the more like

MoreLikeThisHandler with multiple input documents

2015-09-29 Thread Roland Szűcs
Hi all, Is it possible to feed multiple Solr ids to a MoreLikeThisHandler? false details title,content 4 title^12 content^1 2 10 true json true When I call this: http://localhost:8983/solr/bandwhu/mlt?q=id:8=id it works fine. Is there any way to have a kind of "bulk" call of more like

Re: highlighting

2015-09-29 Thread Upayavira
You can change the strings that are inserted into the text, and could place markers that you use to identify the start/end of highlighting elements. Does that work? Upayavira On Mon, Sep 28, 2015, at 09:55 PM, Mark Fenbers wrote: > Greetings! > > I have highlighting turned on in my Solr

Re: MoreLikeThisHandler with multiple input documents

2015-09-29 Thread Alessandro Benedetti
Hi Roland, you said "The main goal is that when a customer is on the product page". But if you are on a product page, I guess you have the product id. If you have the product id, you can simply execute the MLT request with the single doc id as input. Why do you need to calculate beforehand?

Re-label terms from a shard?

2015-09-29 Thread Dan Bolser
Hi, I'm using sharding 'off label' to integrate data from various remote sites running a common schema. One issue is that the remote sites sometimes use synonyms of the allowed terms in a given field. i.e. we specify that a certain field may only carry the values x, y, and z, but the remote

Re: MoreLikeThisHandler with multiple input documents

2015-09-29 Thread Szűcs Roland
Hello Upayavira, The main goal is that when a customer is on the product page of an e-book and somehow does not like it, I want to immediately offer her/him alternative e-books on the same topic. If I expect the customer to click on a button like "similar e-books" I lose half of them as

Re: Re-label terms from a shard?

2015-09-29 Thread Upayavira
On Tue, Sep 29, 2015, at 03:38 PM, Dan Bolser wrote: > Hi, > > I'm using sharding 'off label' to integrate data from various remote > sites > running a common schema. > > One issue is that the remote sites sometimes use synonyms of the allowed > terms in a given field. i.e. we specify that a

Re: Passing Basic Auth info to HttpSolrClient

2015-09-29 Thread Steven White
Hi, Re-posting to see if anyone can help. If my question is not clear, let me know. Thanks! Steve On Mon, Sep 28, 2015 at 5:15 PM, Steven White wrote: > Hi, > > I'm using HttpSolrClient to connect to Solr. Everything works until when > I enabled basic authentication

SolrCloud and HTTP caching using eTag/if-none-match

2015-09-29 Thread Arcadius Ahouansou
Hello. - Would you be kind enough to share your experience using SolrCloud with HTTP Caching to return 304 status as described in the wiki https://cwiki.apache.org/confluence/display/solr/RequestDispatcher+in+SolrConfig#RequestDispatcherinSolrConfig-httpCachingElement ? - Looking at the SolrJ

Re: How can I get a monotonically increasing field value for docs?

2015-09-29 Thread Gili Nachum
Hoss, Good point, didn't know about cursor mark when we designed this a year ago :( Small potato: I assume cursor mark breaks when the number of shards changes while keeping the original values doesn't, since the relative position is encoded per shard...But that's an edge case. Looking forward

Re: Using dynamically calculated value for sorting

2015-09-29 Thread Chris Hostetter
: sorting. We are planning to introduce discounts based on login credentials : and we have to dynamically calculate price (using base price in SOLR feed) : based on a specific discount returned by an API. Now after the discount is : calculated we want to sort based on the new price (discounted

Re: CloudSolrClient timeout settings

2015-09-29 Thread Arcadius Ahouansou
Thank you very much Shawn. Arcadius. On 29 September 2015 at 01:41, Shawn Heisey wrote: > On 9/28/2015 4:04 PM, Arcadius Ahouansou wrote: > > CloudSolrClient has zkClientTimeout/zkConnectTimeout for access to > > zookeeper. > > > > It would be handy to also have the

Solr 4.8 - Updating zkhost list in solr.xml without requiring a restart

2015-09-29 Thread pramodEbay
Hi, Is there an example which I could use to upload solr.xml to ZooKeeper, change zkhost entries on the fly, and have Solr instances be updated via ZooKeeper? This would save us from restarting each Solr node every time a new ZooKeeper host is added or deleted. We are on Solr 4.8. Thanks,

Re: solrcloud in an inconsistent state

2015-09-29 Thread Arcadius Ahouansou
Hello Renning. Sounds like https://issues.apache.org/jira/browse/SOLR-6246 A workaround that may not be very appealing is to create a new collection and to use aliases to point to it in your code/calls. Thanks. On 30 September 2015 at 01:44, r b wrote: > lately, my

Re: Can StandardTokenizerFactory works well for Chinese and English (Bilingual)?

2015-09-29 Thread Zheng Lin Edwin Yeo
Hi Charlie, I've checked that Paoding's code is written for Solr 3 and Solr 4 versions. It is not written for Solr 5, thus I was unable to use it in my Solr 5.x version. Have you tried to use HMMChineseTokenizer and JiebaTokenizer as well? Regards, Edwin On 25 September 2015 at 18:46, Charlie

Re: Solr 4.8 - Updating zkhost list in solr.xml without requiring a restart

2015-09-29 Thread Shawn Heisey
On 9/29/2015 5:59 PM, pramodEbay wrote: > Is there an example which I could use - to upload solr.xml in zookeeper and > change zkhost entries on the fly and have solr instances be updated via > zookeeper. This will prevent us from restarting each solr node every time a > new zookeeper host is

solrcloud in an inconsistent state

2015-09-29 Thread r b
lately, my workflow has been 1) make some config changes, 2) upload to zookeeper, 3) use collections API to reload config for the collection. this has been working pretty well. starting last week, i started using the AnalyzingInfixLookupFactory in a SuggestComponent (up until then, it was just

Re: Solr 4.8 - Updating zkhost list in solr.xml without requiring a restart

2015-09-29 Thread pramodmm
> Before we even think about upgrading the zookeeper functionality in > Solr, we must wait for the official 3.5 release from the zookeeper > project. Alpha (or Beta) software will not be included in Solr unless > it is the only way to fix a very serious bug. This is a new feature, > not a bug.

Re: firstSearcher cache warming with own QuerySenderListener

2015-09-29 Thread Chris Hostetter
You haven't really provided us enough info to make any meaningful suggestions. You've got at least 2 custom plugins -- but you don't give us any idea what the implementations of those plugins look like, or how you've configured them. Maybe there is a bug in your code? maybe it's

Re: Cost of having multiple search handlers?

2015-09-29 Thread Jeff Wartes
At the risk of going increasingly off-thread, yes, please do. I’ve been using this: https://dropwizard.github.io/metrics/3.1.0/manual/jetty/, which is convenient, but doesn’t even have request-handler-level resolution. Something I’ve started doing for issues that don’t seem likely to get pulled

Re: Using dynamically calculated value for sorting

2015-09-29 Thread Leonardo Foderaro
Hi, please take a look at Alba, a framework which simplifies the development of new Solr plugins. You can write a plugin (e.g. custom function to be used to boost/sort your docs or a custom Response Writer) in literally five lines of code. More specifically I think these two examples could be

Re: How to preserve 0 after decimal point?

2015-09-29 Thread bbarani
Thanks for your response.

Using dynamically calculated value for sorting

2015-09-29 Thread bbarani
Hi, We have a price field in our SOLR XML feed that we currently use for sorting. We are planning to introduce discounts based on login credentials and we have to dynamically calculate price (using base price in SOLR feed) based on a specific discount returned by an API. Now after the discount is

Re: How can I get a monotonically increasing field value for docs?

2015-09-29 Thread Chris Hostetter
You're basically re-implementing Solr's cursors. You can change your system of reading docs from the old collection to use... cursorMark=*&sort=timestamp+asc,id+asc ...and then instead of keeping track of the last timestamp & id values and constructing a filter, you can just keep track of the
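The cursor approach described above can be sketched as a small paging loop. The `cursorMark` and `sort` request parameters and the `nextCursorMark` response field are Solr's; `iterate_with_cursor` and the `fetch_page` callable are hypothetical and stand in for whatever client actually sends the request and returns the parsed JSON response.

```python
def iterate_with_cursor(fetch_page, sort="timestamp asc,id asc", rows=100):
    """Yield every document by following Solr's cursorMark.

    fetch_page is an assumed callable: it takes a dict of request
    parameters and returns the parsed JSON response as a dict.
    The sort must include the uniqueKey field (here assumed to be id).
    """
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        params = {"q": "*:*", "sort": sort, "rows": rows,
                  "cursorMark": cursor}
        response = fetch_page(params)
        for doc in response["response"]["docs"]:
            yield doc
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:
            break  # same mark returned twice: no more results
        cursor = next_cursor
```

The only state to keep between requests is the last `nextCursorMark` string, which replaces the hand-built filter on the last seen timestamp and id.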