Re: fs river - Error while reading content

2014-09-17 Thread David Pilato
Could you turn on debug? See https://github.com/dadoonet/fsriver#debug-mode Also, which versions are you using? -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 17 Sept. 2014 at 01:49, Preeti Jain itspre...@gmail.com wrote: Hi, I'm using elasticsearch version 1.0.1

Re: Sorting search results

2014-09-17 Thread Matej Zerovnik
Anyone? I'm using logstash to feed logs to Elasticsearch with the default schema template. The timestamp field looks like this: Sep 17 07:30:36. Should I sort by @timestamp, which looks like 2014-09-17T05:30:36.000Z? Or should I add another timestamp field with Unix time and sort by that? Matej On
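To illustrate the question above, here is a minimal Python sketch. It assumes each event carries both the syslog-style string and the ISO 8601 @timestamp quoted in the message; the second and third sample events are invented for illustration. It suggests why sorting on @timestamp is the safer choice: the raw string sorts alphabetically by month name, not chronologically.

```python
from datetime import datetime

# Only the first pair of values comes from the message; the rest are hypothetical.
events = [
    {"raw": "Sep 17 07:30:36", "@timestamp": "2014-09-17T05:30:36.000Z"},
    {"raw": "Oct 01 09:00:00", "@timestamp": "2014-10-01T07:00:00.000Z"},
    {"raw": "Aug 30 23:59:59", "@timestamp": "2014-08-30T21:59:59.000Z"},
]

def parse_iso(ts):
    """Parse the ISO 8601 UTC form shown in the message."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")

# Sorting on the raw string compares month names alphabetically.
by_raw = [e["raw"] for e in sorted(events, key=lambda e: e["raw"])]

# Sorting on the parsed @timestamp is chronological.
by_timestamp = [
    e["raw"] for e in sorted(events, key=lambda e: parse_iso(e["@timestamp"]))
]

# A search body that sorts on @timestamp, newest first.
search_body = {
    "query": {"match_all": {}},
    "sort": [{"@timestamp": {"order": "desc"}}],
}

print(by_raw)
print(by_timestamp)
```

In this sample the raw-string order places October before September, while the @timestamp order is chronological.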

Elasticsearch hackday next week

2014-09-17 Thread Charlie Hull
Hi all, We're holding a free hackday on Elasticsearch next week in Cambridge, UK: http://www.meetup.com/Enterprise-Search-Cambridge-UK/events/200674282/ as part of our Cambridge Search Meetup programme. Thanks to Elasticsearch.com we have free lunch, coffee snacks and a pile of Elasticsearch

Re: Error on adding the new field mapping in already existing mapping

2014-09-17 Thread Narinder Kaur
Thanks for the reply. Yes, you were right, the mapping was there. On Thursday, 11 September 2014 15:35:29 UTC+5:30, Narinder Kaur wrote: Hi, We already have a type in our system, *YsFact*. Now I needed to add a new field to this type, *username*. And username has following

Java API and aggregations result problem

2014-09-17 Thread lpouget
I have a problem with the Java API and aggregation results. The serializer (Jackson) produces this error: No serializer found for class org.elasticsearch.common.text.StringText and no properties discovered to create BeanSerializer for this query: { size: 0, aggregations: {

Predictive modeling using ELK?

2014-09-17 Thread Yu Watanabe
Hi all! I would like to ask a question of the people in this forum. I am currently proposing an ELK log centralization system to a customer and am looking for selling points for ELK. I am wondering whether predictive modeling can be an option with ELK. Is there anyone who is willing to share

TooManyClauses error in Query String query - trying to rewrite

2014-09-17 Thread Markos Fragkakis
Hello, My query_string query gives me a TooManyClauses exception, and I am trying to avoid it with the rewrite parameter. This is my original query: { query : { query_string : { query : +substance +ATT_UUID:* +filename:*pdf, default_field :

Change index_options for analyzed string field

2014-09-17 Thread Ivan Ji
Hi all, I am wondering: can I change the index_options to `freqs` or `docs` for an analyzed string field? If yes, what would happen if I configure the field like this? And what kinds of queries or operations need the index_options of an analyzed string field to be positions or even offsets? Any

Using ES as a primary datastore.

2014-09-17 Thread P Suman
Hello, We are planning to use ES as a primary datastore. Here is my use case: we receive a million transactions per day (all are inserts). Each transaction is around 500KB in size and has 10 fields, and we should be able to search on all 10 fields. We want to keep around 1 yr worth of data,
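As a rough sense of scale for the figures quoted above, the raw volume can be worked out as follows. This is only back-of-the-envelope arithmetic; it assumes decimal units and ignores replicas, index overhead, and compression, none of which are stated in the message.

```python
# Figures quoted in the message.
transactions_per_day = 1_000_000
transaction_size_kb = 500
retention_days = 365  # "around 1 yr"

# Assumption: decimal units (1 GB = 1,000,000 KB), no replicas, no index overhead.
daily_gb = transactions_per_day * transaction_size_kb / 1_000_000
yearly_tb = daily_gb * retention_days / 1_000

print(f"raw data per day:  {daily_gb:.0f} GB")
print(f"raw data per year: {yearly_tb:.1f} TB")
```

That comes to about 500 GB of raw data per day and about 182.5 TB per year before any replication.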

Elasticsearch script execution on missing field

2014-09-17 Thread Manoj
I am currently using ES version 0.19. For a feature requirement, I want to execute a script on a missing field through a terms facet. The curl request I tried is something like the code below: { query: { term: { content: deep } }, filter: { and: {

Floating point precision in response

2014-09-17 Thread Thomas
Hi, I have a quick question with regards the response of numeric values. I perform an aggregation with the sum aggregation and when I get back the response in a curl request the number is shown as follows: aggs:{ day_clicks:{ sum: { field : clicks } } } response

Re: Using ES as a primary datastore.

2014-09-17 Thread Mark Walkom
That's a lot of data, do you have a big budget, automation, monitoring? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 17 September 2014 20:41, P Suman papanaboina.su...@gmail.com wrote: Hello, We are planning

Re: Using ES as a primary datastore.

2014-09-17 Thread Thomas
Hi, First you have to calculate the volume you will keep in one shard. Then you have to break your total volume into the number of shards you will maintain, and then scale accordingly to a number of nodes, or at least grow your cluster as your volumes grow. It is difficult to
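The sizing steps described above could be sketched like this in Python. The function name and every input value are hypothetical; the message gives no concrete numbers, and real limits depend on hardware and query load.

```python
import math

def estimate_cluster(total_gb, max_shard_gb, replicas, shards_per_node):
    """Return (primary_shards, total_shards, nodes) for a rough capacity plan."""
    # Step 1: how many primary shards are needed at the chosen per-shard volume.
    primary = math.ceil(total_gb / max_shard_gb)
    # Step 2: every replica adds a full copy of each primary shard.
    total = primary * (1 + replicas)
    # Step 3: spread the shards over nodes.
    nodes = math.ceil(total / shards_per_node)
    return primary, total, nodes

# Hypothetical inputs: 10 TB of data, 50 GB per shard, 1 replica, 20 shards per node.
primary, total, nodes = estimate_cluster(10_000, 50, 1, 20)
print(primary, total, nodes)
```

Re-running the estimate as volumes grow gives a simple way to see when the cluster needs more nodes.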

Re: Elasticsearch script execution on missing field

2014-09-17 Thread Thomas
I think the correct way to check for a missing field is the following: doc['countryid'].empty == true Check also: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-scripting.html#_document_fields btw why such an old version of ES? Thomas On Wednesday, 17

Mapper Parsing Exception

2014-09-17 Thread Maria John Franklin
Hi Friends, How can I push dynamic data types into the same column in Elasticsearch? Example JSON: == data 1 == { test1:test, test2:test } == data 2 == { test1:test, test2:{ test3:test} } Thanks, Franklin

Re: Using ES as a primary datastore.

2014-09-17 Thread Alex Kamil
ES is a fantastic search engine but there is some risk http://aphyr.com/posts/317-call-me-maybe-elasticsearch of data loss, and a few other https://www.quora.com/Why-should-I-NOT-use-ElasticSearch-as-my-primary-datastore potential disadvantages which might or might not be relevant to you. You can

Re: Strange issue with 2 separate ELK servers

2014-09-17 Thread Kevin M
Thank you for the detailed information - changing the cluster name worked well. The plugins were also easy to install - thanks again! On Tuesday, September 16, 2014 4:00:20 PM UTC-4, Mark Walkom wrote: By default ES uses a discovery method that allows any node with the same cluster name to

Re: Mapper Parsing Exception

2014-09-17 Thread Maria John Franklin
Yes, you are right. But I have data like that. What can I do? On Wednesday, September 17, 2014 6:18:00 PM UTC+5:30, David Pilato wrote: You can't. If test2 is a string for the first doc, it can not be an object for the second doc. What are you trying to do? -- *David Pilato* |

Boosting a type

2014-09-17 Thread Ramy
I'm trying to query two types and want to boost one of them. I have tried it this way, but I wasn't successful. Can someone help? How can I boost a type? GET /my_index/my_type1,my_type2/_search { query: { match: { value.autocomplete: lorem ipsum }, function_score: {

Re: Mapper Parsing Exception

2014-09-17 Thread Maria John Franklin
Hi Pilato, That data comes from another site. I can't change the data values. Is there any option in Elasticsearch? On Wednesday, September 17, 2014 6:18:00 PM UTC+5:30, David Pilato wrote: You can't. If test2 is a string for the first doc, it can not be an object for the second doc. What

how to scale an ES deployment to millions of tenants with different data schemas

2014-09-17 Thread Ziv Shalev
Hi, We are considering using ES as a primary data source for a new project. Our data is generated by millions of different users, *each having a relatively small number of documents, yet each having a different data schema.* *We are considering several approaches:* * index per user - we are

POST to _update, and CompileException

2014-09-17 Thread Martin Bruse
Hello elasticsearch list! I try to update a document using the _update endpoint, but I can't make it work. The same code works fine when I run it in plain groovy. See http://pastie.org/9561774 and note that the code in the elagrov file is the same as in the script field in the POST, and yet

ElasticSearch spark esRDD not returning the aggregate values in aggregated query

2014-09-17 Thread siva pradeep
Hi, I have a query which filters the rows and then applies the aggregation. When I ran the query in Sense, it gave me the expected result. But when I run the same query using elasticsearch-spark_2.10, I get the rows filtered by the query but not the aggregation result. I am sure I am

RE: Using ES as a primary datastore.

2014-09-17 Thread Doug Turnbull
I'd also suggest checking out DataStax Enterprise -- a commercial flavor of Cassandra. It's Cassandra, so update rates and volume are its strong suit. It's intended as a primary data store. It has a Solr (another search engine) instance on each node that indexes the local data on that node,

Re: how to scale an ES deployment to millions of tenants with different data schemas

2014-09-17 Thread Itamar Syn-Hershko
First, you should really read this: http://aphyr.com/posts/317-call-me-maybe-elasticsearch regarding using ES as a single source of truth. Millions of indexes is not advisable, unless you plan on having millions of servers. Depending on index size and write frequency to them, you don't want to

Re: how to scale an ES deployment to millions of tenants with different data schemas

2014-09-17 Thread Ziv Shalev
Thanks for the prompt reply! One thing though - when using a single multi-tenant index, my concerns are not around the number of fields per doc (which is small, less than 50), but rather the fact that since each tenant has different fields, the accumulated number of fields in such an index will

Re: how to scale an ES deployment to millions of tenants with different data schemas

2014-09-17 Thread Itamar Syn-Hershko
This will still mean less overhead than having those distinct fields in discrete indexes. I wouldn't worry about that. -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Author of RavenDB in Action http://manning.com/synhershko/

Re: Java API and aggregations result problem

2014-09-17 Thread David Pilato
Why don't you use the Java client? If you are using it, why do you need to convert JSON responses to elasticsearch objects? I don't understand exactly what you are trying to do here. -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr On 17 September 2014 at

Re: Boosting a type

2014-09-17 Thread Ramy
I solved it this way: GET /my_index/my_type1,my_type2/_search { query: { function_score: { query: { match: { value.autocomplete: lorem ipsum } }, functions: [ { script_score: { lang: groovy,

Re: Java API and aggregations result problem

2014-09-17 Thread lpouget
I am using the Java client. I wanted to convert elasticsearch objects to JSON responses. The JSON result above was the expected result. I successfully converted it with Jackson by switching SerializationFeature.FAIL_ON_EMPTY_BEANS to false, which seems mandatory to serialize aggregation elasticsearch

Re: Java API and aggregations result problem

2014-09-17 Thread David Pilato
I think that if you do SearchResponse response = esClient.prepareSearch(person).setTypes(person).addAggregation(AggregationBuilders.terms(by_country).field(country)).execute().actionGet();

Re: Boosting a type

2014-09-17 Thread Ivan Brusic
I have yet to switch over to groovy, so I can't comment on where your current script is wrong (it looks good to me as well). However, you can use the standard function_score functions, which are easier to understand and do not rely on scripting (technically better performance).

Question about value_count aggregation

2014-09-17 Thread Sam Chrisinger
Hi All, I'm having trouble with a value_count aggregation that uses a script. For context I have an index of 100,000 dummy documents all with 1-word titles. My query looks like: { query: { match_all: {} }, filter: { match_all: {} }, aggs: { category_tokens: { value_count: {

Re: Yet another OOME: Java heap space thread :S

2014-09-17 Thread Chris Neal
Sorry to bump my own thread, but it's been a while and I was hoping to get some more eyes on this. I've since added a third node to the cluster to see if that helps, but it did not. I still see these OOMEs on merges on any of the three nodes in the cluster. I have also increased the shard count

Re: Term Aggregations and StopWords

2014-09-17 Thread André Morais
Hello! Still can't get the result I want: stop words not appearing in buckets. Further testing showed that: - if I filter aggregation with a query for one of the stop words, I get an empty result for aggregations; - the same analyzer is changing all :) and :( and replacing them with SMILE

Re: powerful cluster is not able to handle 1.5Tb of data, how to optimize?

2014-09-17 Thread Greg Murnane
I run 1.3TB of active indices on a single node (64 GB ram with 12GB heap size, and 15 small disks in a raid 5), with most of my messages quite small, which makes it look similar to your case in volume, although I have a significantly lower (about 5K) indexing rate. I suspect that the single

Re: powerful cluster is not able to handle 1.5Tb of data, how to optimize?

2014-09-17 Thread Pavel P
Thanks Greg, Your example should be useful. On Wednesday, 17 September 2014, Greg Murnane wrote: I run 1.3TB of active indices on a single node (64 GB ram with 12GB heap size, and 15 small disks in a raid 5), with most of my messages quite small, which makes it look similar to your

Re: Yet another OOME: Java heap space thread :S

2014-09-17 Thread joergpra...@gmail.com
I have a very similar cluster setup here (ES 1.3.2, 64G RAM, 3 nodes, Java 8, G1GC, ~100 shards, ~500g indexes on disk). This is the culprit: max_merged_segment: 15gb. I recommend max_merged_segment: 1gb. See also https://gist.github.com/jprante/10666960 (which also holds for ES 1.2 and ES 1.3 -
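For reference, the recommended value could be expressed as an index settings body along these lines. This is a sketch under an assumption: the message only says max_merged_segment, and the full key index.merge.policy.max_merged_segment is assumed here. Whether it can be changed on a live index or belongs in elasticsearch.yml should be checked against the documentation for the version in use.

```python
import json

# Assumption: the short name in the message refers to the index-level
# merge policy setting index.merge.policy.max_merged_segment.
settings_body = {
    "index": {
        "merge": {
            "policy": {
                "max_merged_segment": "1gb"  # value recommended in the message
            }
        }
    }
}

print(json.dumps(settings_body, indent=2))
```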

Re: Highly variable query performance with ES 1.3.2 (filter + aggregations)

2014-09-17 Thread Craig Wittenberg
Some new data. When I said that we upgraded to 1.3.2 recently, the cluster which exhibited this behavior had recently been loaded with a snapshot from the old 1.1.2 cluster. More importantly, I loaded that same 1.1.2 snapshot into a small cluster of 1.3.2 (just two nodes) and have been trying

Re: 'Shard Allocation' dashboard in Marvel 1.2 showing SearchParseExceptions

2014-09-17 Thread Sean Kipling
I am on Marvel 1.2.1 and had the same error (see below). The time on one of my nodes was off a fair bit. Fixed that and the problem went away. [2014-09-17 11:55:14,615][DEBUG][action.search.type ] [Pip the Troll] All shards failed for phase: [query_fetch] [2014-09-17

Re: Yet another OOME: Java heap space thread :S

2014-09-17 Thread Chris Neal
Thank you so very much for the reply! That makes sense. I will look at the gist as well, and make some changes to test. Again, thank you for your time. I will report back with some results! Chris On Wed, Sep 17, 2014 at 11:36 AM, joergpra...@gmail.com joergpra...@gmail.com wrote: I have

Re: Any zen discovery difference between master eligible and non eligible nodes?

2014-09-17 Thread David Pilato
A cluster needs a master node. You need at least one node which could be elected as master. -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 17 Sept. 2014 at 20:08, Jinyuan Zhou zhou.jiny...@gmail.com wrote: I have two nodes with this configuration

Tokenizer for html tags attributes

2014-09-17 Thread Jack
Hi. What I need to achieve is better indexing of HTML documents. I started with a first analyzer that strips HTML chars and works with text only, but almost half of my searches will be through HTML tags (and more - some specific HTML attributes). For example, I have an index with a content field

Re: Any zen discovery difference between master eligible and non eligible nodes?

2014-09-17 Thread Jinyuan Zhou
Sorry I didn't explain clearly. I have a cluster with 12 nodes. 10 are master eligible. 2 are client only (es.node.data=false, es.node.master=false). I have configured the minimum number of nodes to form a cluster as 6. All the boxes hosting the 12 ES instances are on the same rack. The problem is

Re: Any zen discovery difference between master eligible and non eligible nodes?

2014-09-17 Thread David Pilato
Did you set the same cluster name? -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 17 Sept. 2014 at 21:09, Jinyuan Zhou zhou.jiny...@gmail.com wrote: Sorry I didn't explain clearly. I have a cluster with 12 nodes. 10 are master eligible. 2 are client only

Rolling Restart an Elasticsearch Cluster with Ansible

2014-09-17 Thread Lance A. Brown
I've come up with what I think is a safe way to perform a rolling restart of an Elasticsearch cluster using Ansible handlers. Why is this needed? Even if you use a serial setting to limit the number of nodes processed at one time, Ansible will restart elasticsearch nodes and continue processing as soon

Re: Any zen discovery difference between master eligible and non eligible nodes?

2014-09-17 Thread Jinyuan Zhou
Yes. The cluster name is set to be the same on all 12 nodes. Jinyuan (Jack) Zhou On Wed, Sep 17, 2014 at 12:24 PM, David Pilato da...@pilato.fr wrote: Did you set the same cluster name? -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 17 Sept. 2014 at 21:09, Jinyuan Zhou

Re: Any zen discovery difference between master eligible and non eligible nodes?

2014-09-17 Thread David Pilato
Are you using multicast or unicast or a plugin for discovery? -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 17 Sept. 2014 at 21:49, Jinyuan Zhou zhou.jiny...@gmail.com wrote: yes. cluster name is set to be the same on all 12 nodes Jinyuan (Jack) Zhou On Wed, Sep

Re: Any zen discovery difference between master eligible and non eligible nodes?

2014-09-17 Thread Jinyuan Zhou
I am using unicast with out-of-the-box zen discovery. discovery.zen.ping.multicast.enabled: false -Des.discovery.zen.ping.unicast.hosts=myCommaSeparatedHostList Jinyuan (Jack) Zhou On Wed, Sep 17, 2014 at 1:08 PM, David Pilato da...@pilato.fr wrote: Are you using multicast or unicast or a

Re: Any zen discovery difference between master eligible and non eligible nodes?

2014-09-17 Thread David Pilato
When you start the client node, did you set the list of master nodes as unicast hosts? -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 17 Sept. 2014 at 22:13, Jinyuan Zhou zhou.jiny...@gmail.com wrote: I am using unicast with out of box zen discovery.

Re: Any zen discovery difference between master eligible and non eligible nodes?

2014-09-17 Thread Jinyuan Zhou
Yes. The settings are almost same for all nodes discovery.zen.ping.multicast.enabled: false # this is in elasticsearch.yml -Des.discovery.zen.ping.unicast.hosts=myCommaSeparatedHostList this is java -D parameter Jinyuan (Jack) Zhou On Wed, Sep 17, 2014 at 1:27 PM, David Pilato da...@pilato.fr

Aggregations: Possible to return additional fields?

2014-09-17 Thread James Addison
Is it possible to do a terms aggregation on a 'tags.id' field but also get back the associated 'tags.slug' and 'tags.name' fields? Here's how the documents are structured: { id: 42, name: Beer, tags: [{ id: 2, name: Hoppy, slug: hoppy }, { id: 2,

Re: New version of Kibana in the works?

2014-09-17 Thread Thibaut Labarre
We would definitely love even a rough ETA. Will it be October? End of the year? In 2015? Also will Kibana 4 be backwards compatible for panels? On Thursday, August 28, 2014 2:44:18 AM UTC-7, Julien Palard wrote: Hi On Thursday, 14 August 2014 18:04:24 UTC+2, Rashid Khan wrote: Yes there is a

Re: New version of Kibana in the works?

2014-09-17 Thread Rashid Khan
Unfortunately I can’t give you an ETA other than soon ;-) Initially backwards compatibility will not be available as there have been a large number of core changes. We’re looking at providing a compatibility layer to ease the transition, but there are some challenges there. On Thu, Aug 14,

ES JsonParseException

2014-09-17 Thread Foobar Geez
Hello, I am a newbie to ES and would appreciate any insights into the below issue (going crazy for the last couple of hours :/): I need to store the following string value into a field -- foo\bar -- with the literal backslash in it. curl -XPUT 'http://localhost:9200/test/test/test' -d ' {
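The request body above is truncated, so the exact cause of the exception is not visible here. As general background, a JSON parser treats a backslash inside a string as the start of an escape sequence, so a literal backslash has to be written as two backslashes. The Python sketch below shows that behavior; the field name "field" is a placeholder, not taken from the original request.

```python
import json

# Assumption: "field" is a placeholder name; the original body is truncated.
# A single backslash followed by "b" is read as the backspace escape.
single = json.loads(r'{"field": "foo\bar"}')

# A doubled backslash is read as one literal backslash.
double = json.loads(r'{"field": "foo\\bar"}')

print(repr(single["field"]))
print(repr(double["field"]))

# Letting a JSON library produce the body takes care of the escaping.
body = json.dumps({"field": "foo\\bar"})
print(body)
```

A backslash followed by a character that is not a valid JSON escape is rejected by parsers outright, which is one way a parse exception can arise.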

Re: powerful cluster is not able to handle 1.5Tb of data, how to optimize?

2014-09-17 Thread Otis Gospodnetic
Hi Pavel, When you open Kibana and things are slow, what's happening with your server? Is/are the CPUs maxed out for a minute? Do you see heavy disk IO? Swapping? You can use our SPM http://sematext.com/spm/ to see all this and various other ES metrics. Show/tell us what you see and people will