Re: [poll] virtualization platform for SOLR

2015-10-01 Thread Upayavira
What are you trying to achieve by using virtualisation? If it is just code separation, consider using containers and Docker rather than fully fledged VMs. CPU is shared, but each container sees its own view of its file system. Upayavira On Thu, Oct 1, 2015, at 07:47 AM, Bernd Fehling wrote: >

Re: [poll] virtualization platform for SOLR

2015-10-01 Thread Toke Eskildsen
Bernd Fehling wrote: > unfortunately we have to run VMs, otherwise we would waste hardware. > I thought other solr users are in the same situation but it seems that > other users have tons of hardware available and we are the only ones > having to use VMs. We have ~5

Re: [poll] virtualization platform for SOLR

2015-10-01 Thread Bernd Fehling
Hi Upayavira, best would be to have 4 dedicated servers, 2 for indexing (masters) and 2 for searching (slaves). Always one is online and one is standby in case of hardware failure or update of OS, JAVA or even SOLR. But I only get 256GB RAM machines with many CPUs which I have to share with

RE: Cannot connect to a zookeeper 3.4.6 instance via zkCli.cmd

2015-10-01 Thread Adrian Liew
Hi all, The problem below was resolved by appropriately setting my server ip addresses to have the following for each zoo.cfg: server.1=10.0.0.4:2888:3888 server.2=10.0.0.5:2888:3888 server.3=10.0.0.6:2888:3888 as opposed to the following: server.1=10.0.0.4:2888:3888
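The `server.N=host:port:port` entries quoted above can be checked mechanically before restarting an ensemble. What follows is only a minimal Python sketch, assuming the three entries from the message; the names `ZOO_CFG`, `SERVER_LINE` and `parse_servers` are hypothetical and not part of ZooKeeper, and the two ports are labelled by their usual roles (quorum/peer port, then leader-election port).

```python
import re

# The three entries are copied from the message above; the rest is illustrative.
ZOO_CFG = """\
server.1=10.0.0.4:2888:3888
server.2=10.0.0.5:2888:3888
server.3=10.0.0.6:2888:3888
"""

# server.<id>=<host>:<peer port>:<election port>
SERVER_LINE = re.compile(r"^server\.(\d+)=([^:\s]+):(\d+):(\d+)$")

def parse_servers(text):
    """Return {server_id: (host, peer_port, election_port)} from zoo.cfg text."""
    servers = {}
    for line in text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if not match:
            continue  # skip blank lines, comments and unrelated settings
        server_id, host, peer, election = match.groups()
        servers[int(server_id)] = (host, int(peer), int(election))
    return servers

if __name__ == "__main__":
    for server_id, entry in sorted(parse_servers(ZOO_CFG).items()):
        print(server_id, *entry)
```

Comparing the parsed dictionary across the `zoo.cfg` files of all nodes is one simple way to confirm that every node lists the same ensemble.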

Re: Join with faceting and filtering

2015-10-01 Thread Mikhail Khludnev
1. I'd say it's a challenge. 2. Can't you do the opposite: filter active contracts, join them back to items, and facet then? q=(Description:colgate OR Categories:colgate OR Sellers:colgate)={!join from=ItemId to=ItemId fromIndex=Contracts)Active:true=SellersString 3. Note: there is a {!terms} QParser

Re: Join with faceting and filtering

2015-10-01 Thread Troy Edwards
I had missed a field in ContractItem index (ClientId) *ContractItem* ContractItemId - string ItemId - string ClientId - string ContractCode - string (facet and filter on this) Priority - integer (order by priority descending) Active - boolean (filter on this) 2) It appears that I cannot have

Solr vs Lucene

2015-10-01 Thread Mark Fenbers
Greetings! Being a newbie, I'm still mostly in the dark regarding where the line is between Solr and Lucene. The following code snippet is -- I think -- all Lucene and no Solr. It is a significantly modified version of some example code I found on the net. dir =

Re: highlighting

2015-10-01 Thread Mark Fenbers
Yeah, I thought about using markers, but then I'd have to search the text for the markers to determine the locations. This is a clunky way of getting the results I want, and it would save two steps if Solr merely had an option to return a start/length array (of what should be highlighted)

RE: [poll] virtualization platform for SOLR

2015-10-01 Thread Davis, Daniel (NIH/NLM) [C]
Shawn, Same answer as Bernd. We have a big VMware vCenter setup and NetApp. That's what we have to use. Even in a VM world, some advice persists - "local" disk is faster than network disk even if the "local" disk is virtual. NetApp disk is exported to VMware vCenter over

Re: Solr vs Lucene

2015-10-01 Thread Alexandre Rafalovitch
Hi Mark, Have you gone through a Solr tutorial yet? If/when you do, you will see you don't need to code any of this. It is configured as part of the web-facing total offering, which is tweaked by XML configuration files (or REST API calls). And most of the standard pipelines are already

Re: Create Collection in Solr Cloud using Solr 5.3.0 giving timeout issues

2015-10-01 Thread Shawn Heisey
On 10/1/2015 4:43 AM, Adrian Liew wrote: > E:\solr-5.3.0\bin>solr.cmd create_collection -c sitecore_core_index -n > sitecore_ > common_configs -shards 1 -replicationFactor 3 > > Connecting to ZooKeeper at 10.0.0.4:2181,10.0.0.5:2182,10.0.0.6:2183 ... > Re-using existing configuration directory

RE: Create Collection in Solr Cloud using Solr 5.3.0 giving timeout issues

2015-10-01 Thread Adrian Liew
Hi Shawn, Thanks for that. You did mention starting out with empty collections, and already I am experiencing timeout issues. Could this have to do with the hardware or server spec sizing itself? For example, lack of memory allocated, network issues etc. that can possibly cause this?

Re: Create Collection in Solr Cloud using Solr 5.3.0 giving timeout issues

2015-10-01 Thread Shawn Heisey
On 10/1/2015 9:26 AM, Adrian Liew wrote: > Thanks for that. You did mention starting out with empty collections, > and already I am experiencing timeout issues. Could this have to do with the > hardware or server spec sizing itself? For example, lack of memory allocated, > network issues

Re: Solr vs Lucene

2015-10-01 Thread Mark Fenbers
Yes, and I've spent numerous hours configuring and reconfiguring, and eventually even starting over, but still have not gotten it to work right. Even now, I'm getting bizarre results. For example, I query "NOTE: This is purely as an example." and I get back really bizarre suggestions,

Re: Solr vs Lucene

2015-10-01 Thread Alexandre Rafalovitch
Is that with Lucene or with Solr? Because Solr has several different spell-checker modules you can configure. I would recommend trying them first. And, frankly, I still don't know what your business case is. Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:

Class Loader issues

2015-10-01 Thread Firas Khasawneh
Hi all, I am trying to load the Jackson JSON library from the solr-5.3.1/contrib/clustering/lib directory. In solrconfig.xml I have the following entry: When I start Solr, I get the following warning: SolrResourceLoader No files added to classloader from lib: /dev/solr-5.3.1/contrib/clustering/lib

Facet queries blow out the filterCache

2015-10-01 Thread Jeff Wartes
I’m doing some fairly simple facet queries in a two-shard 5.3 SolrCloud index on fields like this:

PoolingClientConnectionManager

2015-10-01 Thread Rallavagu
Solr 4.6.1, single shard, cloud with 4 nodes. Solr is running on Tomcat configured with 200 threads for the thread pool. As Solr uses "org.apache.http.impl.conn.PoolingClientConnectionManager" for replication, my question is: do Solr threads use connections from the Tomcat thread pool, or do they create

Re: Using dynamically calculated value for sorting

2015-10-01 Thread bbarani
Thanks for your reply. The overall design has changed a little bit. Now I will be sending the SKU id (SKU id is in the SOLR document) to an external API and it will return a new price to me for that SKU based on some logic (I won't be calculating the new price). Once I get that value I need to use that

Re: PoolingClientConnectionManager

2015-10-01 Thread Rallavagu
Thanks for the response Andrea. Assuming that Solr has its own thread pool, it appears that "PoolingClientConnectionManager" has a maximum of 20 threads per host as default. Is there a way to increase this to handle heavy update traffic? Thanks. On 10/1/15 11:05 AM, Andrea Gazzarini

Re: PoolingClientConnectionManager

2015-10-01 Thread Shawn Heisey
On 10/1/2015 11:50 AM, Rallavagu wrote: > Solr 4.6.1, single shard, cloud with 4 nodes > > Solr is running on Tomcat configured with 200 threads for the thread pool. > As Solr uses > "org.apache.http.impl.conn.PoolingClientConnectionManager" for > replication, my question is: do Solr threads use

Re: Cloud Deployment Strategy... In the Cloud

2015-10-01 Thread Mark Miller
On Wed, Sep 30, 2015 at 10:36 AM Steve Davids wrote: > Our project built a custom "admin" webapp that we use for various O > activities so I went ahead and added the ability to upload a Zip > distribution which then uses SolrJ to forward the extracted contents to ZK, > this

Re: PoolingClientConnectionManager

2015-10-01 Thread Rallavagu
Thanks Shawn. This is good data. On 10/1/15 11:43 AM, Shawn Heisey wrote: On 10/1/2015 11:50 AM, Rallavagu wrote: Solr 4.6.1, single Shard, cloud with 4 nodes Solr is running on Tomcat configured with 200 threads for thread pool. As Solr uses

Re: PoolingClientConnectionManager

2015-10-01 Thread Andrea Gazzarini
Hi, I could be wrong, as your question is related to Solr internals (I believe the dev list is a better candidate for such questions). Anyway, my thoughts: unless you're within a JCA inbound component (and Solr isn't), the JEE specs say you shouldn't start new threads. For this reason,

Re: Solr vs Lucene

2015-10-01 Thread Walter Underwood
If you want a spell checker, don’t use a search engine. Use a spell checker. Something like aspell (http://aspell.net/) will be faster and better than Solr. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Oct 1, 2015, at 1:06

Re: PoolingClientConnectionManager

2015-10-01 Thread Rallavagu
Awesome. This is what I was looking for. Will try these. Thanks. On 10/1/15 1:31 PM, Shawn Heisey wrote: On 10/1/2015 12:39 PM, Rallavagu wrote: Thanks for the response Andrea. Assuming that Solr has its own thread pool, it appears that "PoolingClientConnectionManager" has a maximum of 20

Re: Solr vs Lucene

2015-10-01 Thread Mark Fenbers
This is with Solr. The Lucene approach (assuming that is what is in my Java code, shared previously) works flawlessly, albeit with fewer options, AFAIK. I'm not sure what you mean by "business case"... I'm wanting to spell-check user-supplied text in my Java app. The end-user then

Solr 4.7.2 Vs 5.3.0 Docs different for same query

2015-10-01 Thread Ravi Solr
We migrated from 4.7.2 to 5.3.0. I sourced the docs from the 4.7.2 core and indexed them into a 5.3.0 collection (data directories are different) via SolrEntityProcessor. Currently my production is all whack because of this issue. Do I have to go back and reindex everything again? Is there a quick fix for this

Re: Spam handling with ASF mailing lists

2015-10-01 Thread Gora Mohanty
> On 23 September 2015 at 21:10, Upayavira wrote: > > > If you have specific questions about spam handling, then I'd suggest you > > ask on the ASF infrastructure list, but generally, we can expect that > > there will be occasions when something that seems obviously spam gets >

Re: Find records with no values in solr.LatLongType field type

2015-10-01 Thread Erick Erickson
BTW, there's a JIRA for this; it's a bit clumsy to have to know how the coordinate fields are split up. And I like Ishan's idea too! On Wed, Sep 30, 2015 at 11:51 AM, Ishan Chattopadhyaya wrote: > There's also a function, exists(), which might work here, and

Re: Re-label terms from a shard?

2015-10-01 Thread Erick Erickson
Actually, I think there is an enum field type, see: https://issues.apache.org/jira/browse/SOLR-5084. Although the ability to retrofit the current setup is...er...fraught. You could always write a custom update processor (maybe a scriptupdateprocessor?) to transform synonyms into the "correct"

Re: PoolingClientConnectionManager

2015-10-01 Thread Shawn Heisey
On 10/1/2015 12:39 PM, Rallavagu wrote: > Thanks for the response Andrea. > > Assuming that Solr has its own thread pool, it appears that > "PoolingClientConnectionManager" has a maximum of 20 threads per host as > default. Is there a way to increase this to handle heavy > update traffic?

Zk and Solr Cloud

2015-10-01 Thread Rallavagu
Solr 4.6.1 single shard with 4 nodes. Zookeeper 3.4.5 ensemble of 3. See following errors in ZK and Solr and they are connected. When I see the following error in Zookeeper, unexpected error, closing socket connection and attempting reconnect java.io.IOException: Packet len11823809 is out of

Re: error reporting during indexing

2015-10-01 Thread Erick Erickson
bq: If there is a problem writing the segment, a permission error, Highly doubtful that this'll occur. When an IndexWriter is opened, the first thing that's (usually) done is write to the lock file to keep other Solr's from writing. That should fail right off the bat, far before any docs are

Re: Facet queries blow out the filterCache

2015-10-01 Thread Mikhail Khludnev
what if you set f.city.facet.limit=-1 ? On Thu, Oct 1, 2015 at 7:43 PM, Jeff Wartes wrote: > > I’m doing some fairly simple facet queries in a two-shard 5.3 SolrCloud > index on fields like this: > > docValues="true”/> > > that look something like this: >

Re: Facet queries blow out the filterCache

2015-10-01 Thread Jeff Wartes
No change, still shows an insert per-request. As does a simplified request with only the facet params "=city=true" It’s definitely facet related though, facet=false eliminates the insert. On 10/1/15, 1:50 PM, "Mikhail Khludnev" wrote: >what if you set

Re: highlighting

2015-10-01 Thread Koji Sekiguchi
Hi Mark, I think I saw a similar requirement recently on the mailing list. The feature sounds reasonable to me. > If not, how do I go about posting this as a feature request? JIRA can be used for the purpose, but there is no guarantee that the feature will be implemented. :( Koji On 2015/10/01 20:07,

Re: Facet queries blow out the filterCache

2015-10-01 Thread Jeff Wartes
It still inserts if I address the core directly and use distrib=false. I’ve got a few collections sharing the same config, so it’s surprisingly annoying to change solrconfig.xml right now, but it seemed pretty clear the query is the thing being cached, since the cache size only changes when the

Re: highlighting

2015-10-01 Thread Teague James
Hi everyone! Pardon if it's not proper etiquette to chime in, but that feature would solve some issues I have with my app for the same reason. We are using markers now and it is very clunky - particularly with phrases and certain special characters. I would love to see this feature too Mark!

Re: Facet queries blow out the filterCache

2015-10-01 Thread Mikhail Khludnev
Hm.. This option was useful for introspecting cache content: https://wiki.apache.org/solr/SolrCaching#showItems It might help you to find out the cause. I'm still blaming distributed requests; it's explained here: https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-Over-RequestParameters

Re: [poll] virtualization platform for SOLR

2015-10-01 Thread Bernd Fehling
Hi Shawn, unfortunately we have to run VMs, otherwise we would waste hardware. I thought other solr users are in the same situation but it seems that other users have tons of hardware available and we are the only ones having to use VMs. Right, bare metal is always better than any VM. As you

Re: Class Loader issues

2015-10-01 Thread Tomoko Uchida
Hi, Do you have (execute) permission for /dev/solr-5.3.1/contrib/clustering/lib ? I've seen the same warning when I did not have access permission to the library dir. Regards, Tomoko 2015-10-02 1:23 GMT+09:00 Firas Khasawneh : > Hi all, > > I am trying to load Jackson json

Re: Cannot connect to a zookeeper 3.4.6 instance via zkCli.cmd

2015-10-01 Thread Zheng Lin Edwin Yeo
Hi Adrian, What is your system setup like? By right it shouldn't be an issue if we use different ports. In fact, if the various ZooKeeper instances are running on a single machine, they have to be on different ports in order for it to work. Regards, Edwin On 1 October 2015 at 18:19,
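The point about ports can be illustrated with a small check on a client connect string: two instances on the same host cannot share a client port, while instances on different hosts may use any ports. This is a minimal Python sketch under that assumption; the helper names `parse_connect_string` and `find_port_conflicts` and the `localhost` examples are hypothetical and not part of ZooKeeper or Solr.

```python
from collections import Counter

def parse_connect_string(connect_string):
    """Split 'host:port,host:port,...' into a list of (host, port) tuples."""
    endpoints = []
    for item in connect_string.split(","):
        host, _, port = item.strip().rpartition(":")
        endpoints.append((host, int(port)))
    return endpoints

def find_port_conflicts(endpoints):
    """Return the (host, port) pairs that are listed more than once."""
    counts = Counter(endpoints)
    return sorted(endpoint for endpoint, n in counts.items() if n > 1)

if __name__ == "__main__":
    # Hypothetical single-machine ensemble: distinct ports are required.
    same_host = parse_connect_string("localhost:2181,localhost:2182,localhost:2183")
    print(find_port_conflicts(same_host))
    # Hypothetical misconfiguration: two instances claim the same host and port.
    clash = parse_connect_string("localhost:2181,localhost:2181,localhost:2183")
    print(find_port_conflicts(clash))
```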

Re: Keyword match distance rule issue

2015-10-01 Thread anil.vadhavane
Hello, We have tried the Analysis tool. Below is the screenshot of analysis tool. -- View this message in context: http://lucene.472066.n3.nabble.com/Keyword-match-distance-rule-issue-tp4231624p4232246.html Sent from the Solr -

Re: Zk and Solr Cloud

2015-10-01 Thread Shawn Heisey
On 10/1/2015 1:26 PM, Rallavagu wrote: > Solr 4.6.1 single shard with 4 nodes. Zookeeper 3.4.5 ensemble of 3. > > See following errors in ZK and Solr and they are connected. > > When I see the following error in Zookeeper, > > unexpected error, closing socket connection and attempting reconnect >

Re: Solr 4.7.2 Vs 5.3.0 Docs different for same query

2015-10-01 Thread Tomoko Uchida
Are you sure that you've indexed the same data to Solr 4.7.2 and 5.3.0? If so, I suspect that you have multiple shards and are sending the request to only one shard. (In that case, you might get partial results.) Can you share the HTTP request URL, the schema and the default search field? 2015-10-02 6:09 GMT+09:00 Ravi

Re: [poll] virtualization platform for SOLR

2015-10-01 Thread Bernd Fehling
Hi Toke, I don't get SSDs, only spinning drives. And as you mentioned, the impact of VMs is not that much if you use spinning drives. It is more the VM software that matters and that's why we use XEN and not KVM. With some tuning of sysctl for the VMs it performs well, but bare-metal is still

Cannot connect to a zookeeper 3.4.6 instance via zkCli.cmd

2015-10-01 Thread Adrian Liew
Hi there, Currently, I have set up an Azure virtual network to connect my Zookeeper clusters together with three Azure VMs. The VMs have internal IPs of 10.0.0.4, 10.0.0.5 and 10.0.0.6. I have also set up Solr 5.3.0, which runs in Solr Cloud mode connected to all three Zookeepers in an external