Re: Disable all caches in Solr

2014-07-08 Thread vidit.asthana
Thanks Chris. I understand this. But this test is to determine the *maximum* latency a query can have and hence I have disabled all caches. After disabling all caches in solrconfig, I was able to remove latency variation for a single query in most of the cases. But still *sort* queries are

Changing default behavior of solr for overwrite the whole document on uniquekey duplication

2014-07-08 Thread Ali Nazemian
Dears, Hi, According to my requirement I need to change the default behavior of Solr for overwriting the whole document on unique-key duplication. I am going to change it so that only part of the document (some fields) is overwritten and the other parts of the document (other fields) remain unchanged. First of all I

Solr irregularly having QTime 50000ms, stracing solr cures the problem

2014-07-08 Thread Harald Kirsch
Hi all, This is what happens when I run a regular wget query to log the current number of documents indexed: 2014-07-08:07:23:28 QTime=20 numFound=5720168 2014-07-08:07:24:28 QTime=12 numFound=5721126 2014-07-08:07:25:28 QTime=19 numFound=5721126 2014-07-08:07:27:18 QTime=50071
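A log in this shape can be scanned for outliers automatically. The following is only a sketch in Python: the 1000 ms threshold and the names `LINE_RE` and `slow_samples` are assumptions made for illustration, not part of the original post.

```python
import re

# Matches samples such as "2014-07-08:07:23:28 QTime=20 numFound=5720168".
# numFound is optional because a sample may be cut off after QTime.
LINE_RE = re.compile(r"(?P<ts>\S+) QTime=(?P<qtime>\d+)(?: numFound=(?P<found>\d+))?")


def slow_samples(log_text, threshold_ms=1000):
    """Return (timestamp, qtime) pairs whose QTime is at or above threshold_ms.

    threshold_ms is an assumed cut-off, chosen only for this illustration.
    """
    slow = []
    for match in LINE_RE.finditer(log_text):
        qtime = int(match.group("qtime"))
        if qtime >= threshold_ms:
            slow.append((match.group("ts"), qtime))
    return slow


log = (
    "2014-07-08:07:23:28 QTime=20 numFound=5720168 "
    "2014-07-08:07:24:28 QTime=12 numFound=5721126 "
    "2014-07-08:07:25:28 QTime=19 numFound=5721126 "
    "2014-07-08:07:27:18 QTime=50071"
)
outliers = slow_samples(log)
print(outliers)
```

Recording the timestamps of the slow samples makes it easier to line them up with GC logs, tcpdump captures, or disk activity from the same moments.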

Re: Changing default behavior of solr for overwrite the whole document on uniquekey duplication

2014-07-08 Thread Himanshu Mehrotra
Please look at https://wiki.apache.org/solr/Atomic_Updates This does what you want: just update the relevant fields. Thanks, Himanshu On Tue, Jul 8, 2014 at 1:09 PM, Ali Nazemian alinazem...@gmail.com wrote: Dears, Hi, According to my requirement I need to change the default behavior of Solr

Re: SOLR on hdfs

2014-07-08 Thread shlash
Hi all, I am new to Solr and HDFS. I am trying to index text content extracted from binary files like PDF, MS Office, etc., which are stored on HDFS (single node). So far I have Solr running on HDFS and have created the core, but I couldn't send the files to Solr for indexing. Can someone

Re: Solr irregularly having QTime 50000ms, stracing solr cures the problem

2014-07-08 Thread Heyde, Ralf
My first assumption: full GC. Can you please tell us about your JVM setup and maybe trace what happens in the JVMs? On Jul 8, 2014 9:54 AM, Harald Kirsch harald.kir...@raytion.com wrote: Hi all, This is what happens when I run a regular wget query to log the current number of documents indexed:

Re: Changing default behavior of solr for overwrite the whole document on uniquekey duplication

2014-07-08 Thread Ali Nazemian
Dear Himanshu, Hi, You misunderstood what I meant. I am not going to update some field. I am going to change what Solr does on duplication of the uniqueKey field. I don't want Solr to overwrite the whole document; I just want to overwrite some parts of the document. This situation does not come from user side

Re: Solr irregularly having QTime 50000ms, stracing solr cures the problem

2014-07-08 Thread Harald Kirsch
No, no full GC. The JVM does nothing during the outages, no CPU, no GC, as checked with jvisualvm and htop. Harald. On 08.07.2014 10:12, Heyde, Ralf wrote: My first assumption: full GC. Can you please tell us about your JVM setup and maybe trace what happens in the JVMs? On Jul 8, 2014 9:54

Parallel optimize of index on SolrCloud.

2014-07-08 Thread Modassar Ather
Hi, Need to optimize index created using CloudSolrServer APIs under SolrCloud setup of 3 instances on separate machines. Currently it optimizes sequentially if I invoke cloudSolrServer.optimize(). To make it parallel I tried making three separate HttpSolrServer instances and invoked

[Solr Schema API] SolrJ Access

2014-07-08 Thread Alessandro Benedetti
Hi guys, wondering if there is any proper way to access the Schema API via SolrJ. Of course it is possible to reach it in Java with a specific HTTP request, but this way, using SolrCloud for example, we become coupled to one specific instance (and we don't want that). Code Example :

Fwd: Language detection for solr 3.6.1

2014-07-08 Thread Alexandre Rafalovitch
-- Forwarded message -- From: Poornima Jay poornima...@rocketmail.com Date: Tue, Jul 8, 2014 at 5:03 PM Subject: Re: Language detection for solr 3.6.1 When i try to use solr-langid-3.6.1.jar file in my path /apache-tomcat-5.5.25/webapps/solr_multilangue_3.6_jar/WEB-INF/lib/ and

Re: Fwd: Language detection for solr 3.6.1

2014-07-08 Thread Poornima Jay
When i use solr-langid-3.5.0.jar file after reloading the core i am getting the below error  SEVERE: java.lang.NoClassDefFoundError: net/arnx/jsonic/JSONException Even after adding the solr-jsonic-3.5.0.jar file in the webapps folder. Thanks, Poornima On Tuesday, 8 July 2014 3:36 PM,

Re: Fwd: Language detection for solr 3.6.1

2014-07-08 Thread Alexandre Rafalovitch
I just realized you are not using Solr language detect libraries. You are using a third party one. You did mention that in your first message. I don't see that library integrated with Solr though, just as a standalone library. So, you can't just plug it in. Is there any reason you cannot use one

don't count facet on blank values

2014-07-08 Thread Aman Tandon
Hi, Is it possible not to count the facets for the blank values? e.g. cat: cats:[*,34324,* 10,8635, 20,8226, 50,5162, 30,759, 100,188, 40,13, 200,7] How is this possible? With Regards Aman Tandon

Re: Fwd: Language detection for solr 3.6.1

2014-07-08 Thread Poornima Jay
I'm using the Google library which I mentioned in my first mail, saying I'm using http://code.google.com/p/language-detection/. I have downloaded the jar file from the below url https://www.versioneye.com/java/org.apache.solr:solr-langid/3.6.1 Please let me know from where I need to

Re: don't count facet on blank values

2014-07-08 Thread Alexandre Rafalovitch
Do you need those values stored/indexed? If not, why not remove them before they hit Solr with appropriate UpdateRequestProcessor? Regards, Alex. Personal website: http://www.outerthoughts.com/ Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency On Tue, Jul 8,

Re: don't count facet on blank values

2014-07-08 Thread Gora Mohanty
On 8 July 2014 15:46, Aman Tandon amantandon...@gmail.com wrote: Hi, Is this possible to not to count the facets for the blank values? e.g. cat: [...] Either filter them out in the query, or remove them client-side when displaying the results. Regards, Gora
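The client-side option might look like the following Python sketch. It assumes the facet counts arrive as Solr's flat list of alternating values and counts; the function name `drop_blank_facets` is hypothetical, and the sample counts are taken from the question.

```python
def drop_blank_facets(flat_counts):
    """Drop blank facet values from a flat [value, count, value, count, ...] list.

    Returns (value, count) pairs; values that are None, empty, or made up
    only of whitespace are treated as blank (an assumption of this sketch).
    """
    pairs = zip(flat_counts[0::2], flat_counts[1::2])
    return [
        (value, count)
        for value, count in pairs
        if value is not None and str(value).strip() != ""
    ]


facets = ["", 34324, "10", 8635, "20", 8226, "50", 5162]
cleaned = drop_blank_facets(facets)
print(cleaned)
```

This leaves the index untouched, so the blank values remain available for any processing that depends on them.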

Re: don't count facet on blank values

2014-07-08 Thread Aman Tandon
@Alex, yes we need them to be indexed and stored, as we are doing some processing if fields are blank. @Gora Thanks, I will try this one. Thanks for your quick replies. With Regards Aman Tandon On Tue, Jul 8, 2014 at 3:53 PM, Gora Mohanty g...@mimirtech.com wrote: On 8 July 2014 15:46, Aman

Re: don't count facet on blank values

2014-07-08 Thread Alexandre Rafalovitch
Right, but the blank field and missing field are different things. Are they for you? If yes, then correct, you are stuck with getting them back. But if blank field is the same as missing/empty field, then you can pre-process and unify them. Regards, Alex. Personal website:

Re: Facets on Nested documents

2014-07-08 Thread Walter Liguori
Yes, I also have the same problem. In my case I have 2 types (parent and children) in a single collection and I want to retrieve only the parent with a facet on a children field. I've seen that this is possible via block join query (available from Solr 4.5). I have Solr 1.2 and I've thought about static facet

JOB: Solr / Elasticsearch engineer @ Sematext

2014-07-08 Thread Otis Gospodnetic
Hi, I think most people on this list have heard of Sematext http://sematext.com/, so I'll skip the company info, and just jump to the meat, which involves a lot of fun work with Solr and/or Elasticsearch: We have an opening for an engineer who knows either Elasticsearch or Solr or both and wants

[ANN] Solr Users Thailand - unofficial group

2014-07-08 Thread Alexandre Rafalovitch
Hello, A new Google Group has been recently started for Solr Users who want to discuss Solr in Thai or need to discuss Solr issues around Thai language (in Thai or English). https://groups.google.com/forum/#!forum/solr-user-thailand The group is monitored by the local Solr consultancy, one of

I need a replacement for the QueryElevation Component

2014-07-08 Thread eShard
Good morning to one and all, I'm using Solr 4.0 Final and I've been struggling mightily with the elevation component. It is too limited for our needs; it doesn't handle phrases very well and I need to have more than one doc with the same keyword or phrase. So, I need a better solution. One that

Re: don't count facet on blank values

2014-07-08 Thread Aman Tandon
No, both are the same for me. With Regards Aman Tandon On Tue, Jul 8, 2014 at 4:01 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: Right, but the blank field and missing field are different things. Are they for you? If yes, then correct, you are stuck with getting them back. But if blank

Slow inserts when using Solr Cloud

2014-07-08 Thread Ian Williams (NWIS - Applications Design)
Hi I'm encountering a surprisingly high increase in response times when I insert new documents into a SolrCloud, compared with a standalone Solr instance. I have a SolrCloud set up for test and evaluation purposes. I have four shards, each with a leader and a replica, distributed over four

RE: Exact Match first in the list.

2014-07-08 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Thanks shawn, I am already using the Boosting but the OR condition works for me as you mentioned. One question If I used in search field (TAGs) , it is returning lot of Fields but if try with the '( something like TAGs, it is getting less, why the ( ) are changing the results.? They won't

Re: I need a replacement for the QueryElevation Component

2014-07-08 Thread O. Klein
You can sponsor more than 1 document per keyword. <query text="AAA"> <doc id="A" /> <doc id="B" /> </query> And you might want to try <str name="queryFieldType">string</str> instead of another FieldType. I found that textFields remove whitespace and concatenated the tokens. Not sure if this is intended or

Re: Slow inserts when using Solr Cloud

2014-07-08 Thread Mark Miller
Updates are currently done locally before concurrently being sent to all replicas - so on a single update, you can expect 2x just from that. As for your results, it sounds like perhaps there is more overhead than we would like in the code that sends to replicas and forwards updates? Someone

Re: Parallel optimize of index on SolrCloud.

2014-07-08 Thread Walter Underwood
You probably do not need to force merge (mistakenly called optimize) your index. Solr does automatic merges, which work just fine. There are only a few situations where a forced merge is even a good idea. The most common one is a replicated (non-cloud) setup with a full reindex every night.

Re: Transparently rebalancing a Solr cluster without splitting or moving shards

2014-07-08 Thread Damien Dykman
Thanks for your suggestions and recommendations. If I understand correctly, the MIGRATE command does shard splitting (around the range of the split.key) and merging behind the scenes. Though, it's a bit difficult to properly monitor the actual migration, set the proper timeouts, know when to

SolrCloud delete replica

2014-07-08 Thread Arvin Barooni
Hi, I have an issue regarding collection delete. When a Solr node is down and I delete a collection, everything seems fine and it deletes the collection from the cluster state too. But when the dead node comes back it registers the collection again. Even when I delete the collection by

Re: Changing default behavior of solr for overwrite the whole document on uniquekey duplication

2014-07-08 Thread Chris Hostetter
I think you are misunderstanding what Himanshu is suggesting to you. You don't need to make lots of big changes to the internals of Solr's code to get what you want -- instead you can leverage the Atomic Updates & Optimistic Concurrency features of Solr to get the existing internal Solr to

Hypen in search keyword

2014-07-08 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
I have the below config for the field type text_general. But when I search with a keyword, e.g. 100-001, it gets 100-001, 100 in starting records ending with 001. I want to treat - as another character, not to split. <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">

Re: Hypen in search keyword

2014-07-08 Thread Jack Krupansky
The word delimiter filter has a types parameter where you specify a file that can map hyphen to alpha or numeric. There is an example in my e-book. -- Jack Krupansky -Original Message- From: EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) Sent: Tuesday, July 8, 2014 2:18

Re: Solr irregularly having QTime 50000ms, stracing solr cures the problem

2014-07-08 Thread Shawn Heisey
On 7/8/2014 1:53 AM, Harald Kirsch wrote: Hi all, This is what happens when I run a regular wget query to log the current number of documents indexed: 2014-07-08:07:23:28 QTime=20 numFound=5720168 2014-07-08:07:24:28 QTime=12 numFound=5721126 2014-07-08:07:25:28 QTime=19 numFound=5721126

Re: Solr irregularly having QTime 50000ms, stracing solr cures the problem

2014-07-08 Thread Walter Underwood
Local disks or shared network disks? --wunder On Jul 8, 2014, at 11:43 AM, Shawn Heisey s...@elyograg.org wrote: On 7/8/2014 1:53 AM, Harald Kirsch wrote: Hi all, This is what happens when I run a regular wget query to log the current number of documents indexed: 2014-07-08:07:23:28

SOLR Talk at AOL Dulles Campus.

2014-07-08 Thread Rishi Easwaran
All, There is a tech talk on AOL Dulles campus tomorrow. Do swing by if you can and share it with your colleagues and friends. www.meetup.com/Code-Brew/events/192361672/ There will be free food and beer served at this event :) Thanks, Rishi.

RE: [Solr Schema API] SolrJ Access

2014-07-08 Thread Cario, Elaine
Alessandro, I just got this to work myself: public static final String DEFINED_FIELDS_API = "/schema/fields"; public static final String DYNAMIC_FIELDS_API = "/schema/dynamicfields"; ... // just get a connection to Solr as usual (the factory is mine - it will use

Solr atomic updates question

2014-07-08 Thread Bill Au
Solr atomic update allows for changing only one or more fields of a document without having to re-index the entire document. But what about the case where I am sending in the entire document? In that case the whole document will be re-indexed anyway, right? So I assume that there will be no

What does getSearcher method of SolrQueryRequest means ?

2014-07-08 Thread Yossi Biton
Hello there, I'm using a project named LIRE for image retrieval based on the Solr platform. There is part of the code which I can't understand, so maybe you could help me. The project implements a request handler named lireq : public class LireRequestHandler extends RequestHandlerBase The search

Re: Solr irregularly having QTime 50000ms, stracing solr cures the problem

2014-07-08 Thread Steve McKay
Sure sounds like a socket bug, doesn't it? I turn to tcpdump when Solr starts behaving strangely in a socket-related way. Knowing exactly what's happening at the transport level is worth a month of guessing and poking. On Jul 8, 2014, at 3:53 AM, Harald Kirsch harald.kir...@raytion.com wrote:

Re: What does getSearcher method of SolrQueryRequest means ?

2014-07-08 Thread Yossi Biton
(Sorry - my mail was sent half ready) hashes is an array of hash values generated somehow from the image. So my question is what is the query being done in this part ? I tried to reconstruct it on my own, by constructing a select query with the hash values separated by OR but the results were

Re: Solr atomic updates question

2014-07-08 Thread Steve McKay
Atomic updates fetch the doc with RealTimeGet, apply the updates to the fetched doc, then reindex. Whether you use atomic updates or send the entire doc to Solr, it has to deleteById then add. The perf difference between the atomic updates and normal updates is likely minimal. Atomic updates

Re: Solr atomic updates question

2014-07-08 Thread Bill Au
Thanks for that under-the-cover explanation. I am not sure what you mean by mix atomic updates with regular field values. Can you give an example? Thanks. Bill On Tue, Jul 8, 2014 at 6:56 PM, Steve McKay st...@b.abbies.us wrote: Atomic updates fetch the doc with RealTimeGet, apply the

Re: Solr atomic updates question

2014-07-08 Thread Steve McKay
Take a look at this update XML: <add> <doc> <field name="employeeId">05991</field> <field name="employeeName">Steve McKay</field> <field name="office" update="set">Walla Walla</field> <field name="skills" update="add">Python</field> </doc> </add> Let's say employeeId is the key. If there's a fourth field,
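The same mix of plain field values and atomic modifiers can also be expressed as a JSON update body. Below is a sketch in Python that only builds the payload; the helper name `atomic_update_doc` and its parameters are made up for illustration, and the field names and values are the ones from the XML above.

```python
import json


def atomic_update_doc(key_field, key_value, plain=None, set_fields=None, add_fields=None):
    """Build one update document mixing plain values with set/add modifiers.

    Hypothetical helper: plain values are sent as-is, while modifier fields
    are wrapped as {"set": value} or {"add": value}.
    """
    doc = {key_field: key_value}
    doc.update(plain or {})
    for name, value in (set_fields or {}).items():
        doc[name] = {"set": value}
    for name, value in (add_fields or {}).items():
        doc[name] = {"add": value}
    return doc


doc = atomic_update_doc(
    "employeeId",
    "05991",
    plain={"employeeName": "Steve McKay"},
    set_fields={"office": "Walla Walla"},
    add_fields={"skills": "Python"},
)
payload = json.dumps([doc])  # a JSON array of update documents
print(payload)
```

The payload is only constructed here, not sent; posting it to an update handler with a JSON content type is left to whatever HTTP client is already in use.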

Re: Solr atomic updates question

2014-07-08 Thread Bill Au
I see what you mean now. Thanks for the example. It makes things very clear. I have been thinking about the explanation in the original response more. According to that, both a regular update with the entire doc and an atomic update involve a delete by id followed by an add. But both the Solr

fix wiki error

2014-07-08 Thread Susmit Shukla
The url for solr atomic update documentation should contain json at the end. Here is the page - https://wiki.apache.org/solr/UpdateJSON#Solr_4.0_Example curl http://localhost:8983/solr/update/*json* -H 'Content-type:application/json'

Re: fix wiki error

2014-07-08 Thread Alexandre Rafalovitch
Why do you think so? As of Solr 4, the CSV and JSON handlers have been unified in the general update handler and the /update/json is there for legacy reasons. The example should work. If it does not work for you, there might be a different reason. Regards, Alex. Personal website:

Add a new replica to SolrCloud

2014-07-08 Thread Varun Gupta
Hi, I am currently using Solr 4.7.2 and have a SolrCloud setup running on 2 servers with number of shards as 2, replication factor as 2 and max shards per node as 4. Now, I want to add another server to the SolrCloud as a replica. I can see Collection API to add a new replica but that was added in

Synchronising two masters

2014-07-08 Thread Prasi S
Hi, Our Solr setup consists of 2 Masters and 2 Slaves. The slaves would point to any one of the Masters through a load balancer and replicate the data. Master1 (M1) is the primary indexer. I send data to M1. In case M1 fails, I have a failover master, M2, and that would be indexing the data. The

Re: Parallel optimize of index on SolrCloud.

2014-07-08 Thread Modassar Ather
Thanks Walter for your inputs. Our use case and performance benchmark require us to invoke optimize. Here we see a chance of improvement in performance of optimize() if invoked in parallel. I found that if *distrib=false* is used, the optimization will happen in parallel. But I could not find
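One way to picture the parallel approach is the Python sketch below. It is not a recommendation to force merge, and it rests on assumptions: that each core accepts an update request carrying optimize=true together with distrib=false, and that the core URLs are known. The host names and the helpers `optimize_url`, `http_get` and `optimize_in_parallel` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen


def optimize_url(core_url):
    """Build a per-core optimize request URL that is not distributed further."""
    query = urlencode({"optimize": "true", "distrib": "false"})
    return f"{core_url.rstrip('/')}/update?{query}"


def http_get(url):
    """Default sender: issue the request and return the HTTP status code."""
    with urlopen(url) as response:
        return response.status


def optimize_in_parallel(core_urls, send=http_get):
    """Send one optimize request per core concurrently; map each URL to its result."""
    urls = [optimize_url(core_url) for core_url in core_urls]
    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        return dict(zip(urls, pool.map(send, urls)))


# Dry run with a stand-in sender, so nothing is sent over the network.
cores = ["http://host1:8983/solr/collection1", "http://host2:8983/solr/collection1"]
results = optimize_in_parallel(cores, send=lambda url: 200)
print(results)
```

Passing the sender as a parameter keeps the sketch testable without a cluster; dropping the `send` argument would issue real requests.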

Planning ahead for Solr Cloud and Scaling

2014-07-08 Thread Zane Rockenbaugh
I'm working on a product hosted with AWS that uses Elastic Beanstalk auto-scaling to good effect and we are trying to set up similar (more or less) runtime scaling support with Solr. I think I understand how to set this up, and wanted to check I was on the right track. We currently run 3 cores on

Re: Add a new replica to SolrCloud

2014-07-08 Thread Shalin Shekhar Mangar
Yes, you can just call a Core Admin CREATE on the new node with the collection name and optionally the shard name. On Wed, Jul 9, 2014 at 9:46 AM, Varun Gupta varun.vgu...@gmail.com wrote: Hi, I am currently using Solr 4.7.2 and have SolrCloud setup running on 2 servers with number of

Re: Add a new replica to SolrCloud

2014-07-08 Thread Himanshu Mehrotra
Yes, there is a way. On the node on which the replica needs to be created, hit curl 'http://localhost:8983/solr/admin/cores?action=CREATE&name=corename&collection=collection&shard=shardid' For example curl
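Building that Core Admin CREATE URL in code avoids losing the `&` separators between parameters. This is a minimal Python sketch; the helper name `core_create_url` is hypothetical, and the sample values (mycore, collection1, shard2) are only illustrative.

```python
from urllib.parse import urlencode


def core_create_url(base_url, core_name, collection, shard=None):
    """Build a Core Admin CREATE URL; the shard parameter is optional."""
    params = {"action": "CREATE", "name": core_name, "collection": collection}
    if shard is not None:
        params["shard"] = shard
    return f"{base_url.rstrip('/')}/admin/cores?{urlencode(params)}"


url = core_create_url("http://localhost:8983/solr", "mycore", "collection1", "shard2")
print(url)
```

When passing the result to curl, quote the whole URL so the shell does not interpret the `&` characters.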

Re: Parallel optimize of index on SolrCloud.

2014-07-08 Thread Walter Underwood
I seriously doubt that you are required to force merge. How much improvement? And is the big performance cost also OK? I have worked on search engines that do automatic merges and offer forced merges for over fifteen years. For all that time, forced merges have usually caused problems. Stop

Re: Parallel optimize of index on SolrCloud.

2014-07-08 Thread Modassar Ather
Our index has almost 100M documents running on SolrCloud of 3 shards and each shard has an index size of about 700GB (for the record, we are not using stored fields - our documents are pretty large). We perform a full indexing every weekend and during the week there are no updates made to the