Re: char_filter for German

2015-03-12 Thread Krešimir Slugan
Where is this german_normalize filter coming from? It solves my problem completely and magically, but it's not documented anywhere (and it doesn't seem to be part of the ICU plugin either). What is also weird is that the filter cannot be used in a global context, e.g. it's not possible to try

ElasticSearch across multiple data center architecture design options

2015-03-12 Thread Alex
Hi all, We are planning to use ELK for our log analysis. We have multiple data centers. Since a cluster spanning data centers is not recommended, we are going to have one ES cluster per data center. Here are the three design options we have: 1. Use snapshot restore to replicate data

Re: weighted average scripted metric usage

2015-03-12 Thread bowlby
I have data that has a weighting field and I'd like to visualize the weighted average in Kibana4. Is it possible to visualize this query in Kibana4? On Friday, 9 January 2015 23:42:42 UTC+1, Kallin Nagelberg wrote: The current 1.4 docs mention that the scripted_metric aggregation is
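
For reference, the weighted average asked about here is sum(weight * value) / sum(weight). A minimal Python sketch of that arithmetic follows. The function name and the sample numbers are made up for illustration, and it says nothing about how Kibana 4 or scripted_metric would compute the value.

```python
def weighted_average(values, weights):
    """Return sum(w * v) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    # Each value contributes in proportion to its weight.
    return sum(v * w for v, w in zip(values, weights)) / total_weight


if __name__ == "__main__":
    # Hypothetical data: value 10 with weight 1, value 20 with weight 3.
    print(weighted_average([10, 20], [1, 3]))
```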

Re: Elasticsearch 1.4.4-1 with Shield 1.0.1 on CentOS 6.6 - Authentication issue when running as service vs bin/elasticsearch.

2015-03-12 Thread Jason Nagashima
Hi fmarchand, I ended up not pursuing Shield any further after finding out how much the licensing would cost, but this might solve your issue: http://stackoverflow.com/questions/28571868/elk-shield-auth-problems Hope that helps! Cheers, Jason On Wed, Mar 11, 2015 at 9:25 AM, fmarchand

Kibana with Hadoop directly?

2015-03-12 Thread KRRK2015
Hello, has anyone tried to get Kibana to work directly with Hadoop (without Elasticsearch in the middle)? If yes, how? Any references would help. Thanks.

problem in using JDBC plugin for Elasticsearch

2015-03-12 Thread Ali Lotfdar
Hello All, I am going to use this plugin for transferring some data from MySQL to Elasticsearch. I followed all the steps in https://github.com/jprante/elasticsearch-river-jdbc, but I encounter an error (the log is below). Plugin version: 4.0.10 ElasticSearch Version: 1.4.4 Thank to let me

Re: unable to create Elasticsearch cluster using multiple physical server

2015-03-12 Thread Gurunath pai
Thanks David, yes, both these systems are production systems and the transport layer uses an http connection. Is a secure connection possible among the physical systems? If yes, then any reference doc will be helpful. On Wednesday, 11 March 2015 16:14:35 UTC+5:30, Gurunath pai wrote: HI All, I am trying to

Re: unable to create Elasticsearch cluster using multiple physical server

2015-03-12 Thread Gurunath pai
Thanks David, yes, both these systems are production systems and the transport layer uses an http connection. Now my question is: is a secure connection possible among the physical systems? If yes, then any reference doc will be helpful. On Wednesday, 11 March 2015 16:14:35 UTC+5:30, Gurunath pai wrote: HI

Re: filtered has_child query?

2015-03-12 Thread asanderson
Actually, I do want only parent documents returned, but I want the filter to be applied to both parent and child documents. Is there a way to specify that the filter is to be applied before the query, so that this would be possible? If not, how would I rewrite the query to do this?

Is it possible to write my own filter ?

2015-03-12 Thread cornet . remi
Hi everyone, I need a filter that splits a word into two words when it ends with a suffix from a list (maybe a text file containing all the suffixes), but I can't find an existing filter doing that. Does anyone have a solution to this? If not, is there a way to write my own filter in Java and

Re: Sanitize a text for indexing

2015-03-12 Thread Itamar Syn-Hershko
See http://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-length-tokenfilter.html -- Itamar Syn-Hershko http://code972.com | @synhershko https://twitter.com/synhershko Freelance Developer Consultant Lucene.NET committer and PMC member On Thu, Mar 12, 2015 at 10:52 AM,

Rebuilding an index with zero downtime using aliases

2015-03-12 Thread mzrth_7810
Hey everyone, I have a question about rebuilding an index. After reading the elasticsearch guide and various topics here I've found that the best practice for rebuilding an index without any downtime is by using aliases. However, there are certain steps and processes around that, which I seek

Re: Need Urgent Hekp: new node not joining to existed cluster

2015-03-12 Thread phani . nadiminti
Hi Mark and mkBig, Thank you for your suggestions. * I disabled multicast and enabled the unicast properties and zen discovery. * I installed on the new node whatever plugins I previously had in the existing cluster, and it worked. Thanks phani

Re: unable to create Elasticsearch cluster using multiple physical server

2015-03-12 Thread David Pilato
If you meant how to secure communication between nodes, you could look at the Shield project. -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 12 March 2015 at 00:42, Gurunath pai pai.gurun...@gmail.com wrote: Thanks David, yes both these systems are of production and

Sanitize a text for indexing

2015-03-12 Thread Bernhard Berger
Hi, while indexing various comments from Facebook I sometimes get Exceptions: IllegalArgumentException: Document contains at least one immense term... Is it possible to sanitize a text for indexing in Elasticsearch so it doesn't throw these Exceptions? Maybe there is a Filter to remove

Re: Sanitize a text for indexing

2015-03-12 Thread Bernhard Berger
On 12.03.15 10:03, Itamar Syn-Hershko wrote: See http://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-length-tokenfilter.html Unfortunately the length token filter also doesn't filter out these immense terms. See my example from
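
One possible client-side workaround for the immense term exception discussed in this thread is to drop oversized tokens before the text is sent for indexing. The sketch below is only an illustration: the function name is hypothetical, it assumes whitespace-separated tokens, and it uses Lucene's 32766-byte limit on a single term (measured in UTF-8 bytes). It does not help for not_analyzed fields, where the whole value is a single term.

```python
MAX_TERM_BYTES = 32766  # Lucene's limit on one term, in UTF-8 bytes


def drop_immense_tokens(text, max_bytes=MAX_TERM_BYTES):
    """Remove whitespace-separated tokens whose UTF-8 size exceeds max_bytes."""
    kept = [
        token
        for token in text.split()
        # Measure the encoded size, not the character count.
        if len(token.encode("utf-8")) <= max_bytes
    ]
    return " ".join(kept)


if __name__ == "__main__":
    comment = "nice post " + "x" * 40000 + " thanks"
    print(drop_immense_tokens(comment))
```

Measuring encoded bytes rather than characters matters because multi-byte characters can push a token over the limit even when its character count looks safe.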

Re: Is it possible to write my own filter ?

2015-03-12 Thread Ivan Brusic
Off the top of my head, I cannot think of an existing filter that accomplishes that task. Creating a custom filter is easy. Simply create a Lucene filter and create a plug-in around it. Take a look at existing analysis plug-ins for inspiration.
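
To make the requested behaviour concrete, here is a small Python sketch of the splitting logic only. It is not a Lucene filter or a plug-in; the function name and the sample suffix list are hypothetical, and a real token filter would also have to handle positions and offsets.

```python
def split_suffix(word, suffixes):
    """Split word into [stem, suffix] if it ends with a listed suffix."""
    # Try longer suffixes first so that "ments" wins over "s".
    for suffix in sorted(suffixes, key=len, reverse=True):
        # Require a non-empty stem, so a bare suffix is left alone.
        if len(word) > len(suffix) and word.endswith(suffix):
            return [word[: -len(suffix)], suffix]
    return [word]


if __name__ == "__main__":
    print(split_suffix("payments", ["s", "ments"]))
```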

Query string order field

2015-03-12 Thread Dan
Hello, Is it possible to do a query string search on multiple fields, where I can determine the result order by field? When a string is found in field-A, it should be more important than a result in field-B [query] = Array ( [query_string] = Array (

Re: How ELK stores data

2015-03-12 Thread Austin Harmon
Hello, to add on to the searching historical data question, I know Elasticsearch uses JSON to index documents, but how do you get it to index the body of the document without copying and pasting the body into JSON? I assume there is a way to do this. I have used analyzers in my mapping but it

Should clause behaves like a must clause in filtered query

2015-03-12 Thread parq
Hello all, We have a single document in an index: $ curl -XGET "http://localhost:9200/test-cbx/bug/_search?q=*" gives us the following response

whitespace tokenizer not working as I'd expect

2015-03-12 Thread Craig Ching
Hi all, I'm trying to break up some strings to use in a full text search leaving the original field intact. I have created a full_text field that is populated from a name field using copy_to and an analyzer that looks like this: settings : { analysis: { char_filter :

Re: How can I change _score based on string lenght ?

2015-03-12 Thread Arnaud Coutant
Any ideas? On Monday, 9 March 2015 22:33:50 UTC+1, Arnaud Coutant wrote: Dear Members, When I get the result of my multi match request based on two words I get this: Iphone 6C OR Iphone 6C ARGENT. I would like these results to have the same score and then be ordered by cheapest price first (float

Re: Is it possible to write my own filter ?

2015-03-12 Thread David Pilato
I wonder if you could use a Pattern Tokenizer in that case? http://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern-tokenizer.html -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 12 March 2015 at 04:32, Ivan Brusic i...@brusic.com wrote: Off

Re: Kibana with Hadoop directly?

2015-03-12 Thread Costin Leau
On the bright side, you can use es-hadoop connector [1] to easily get data from Hadoop/HDFS to Elasticsearch and back whatever your Hadoop stack (Map/Reduce, Cascading, Pig, Hive, Spark, Storm). [1] https://www.elastic.co/products/hadoop On Fri, Mar 13, 2015 at 3:15 AM, aa...@definemg.com wrote:

ES isn't properly handling unicode ? advice for debugging this problem?

2015-03-12 Thread Kevin Burton
I have unit tests setup to test using transport client to write unicode data into ES and then read it back out. It's using the standard ElasticsearchIntegrationTest that ES recommends. I'm using MY JSON encoder... and then I write my JSON to the TransportClient, and read it back out, and it's

Re: Snapshot Scaling Problems

2015-03-12 Thread Andy Nemzek
Thank you guys for your thoughts here. This is really useful information. Again, we're creating daily indexes because that's what logstash does out of the box with the elasticsearch plugin, and this kind of tuning info isn't included with that plugin. Minimizing both the number of indexes

Re: elasticsearch-hadoop-hive exception when writing array<map<string,string>> column

2015-03-12 Thread Chen Wang
Costin, Thanks for your info. I am mapping array of maps to nested objects in ES, and in this specific case, the expected document in ES will look like { _id:customer_id, store_purchase:[{item_id:123, category:'pants', department:'clothes'}, ...] } so that I can do query like find all

Re: Please help to understand these Exceptions

2015-03-12 Thread Mark Walkom
The limit of a node is hard to definitively know as use cases vary so much, but from what I have seen 3TB on 3 nodes is pretty dense. On 12 March 2015 at 08:09, Chris Neal chris.n...@derbysoft.net wrote: Thank you Mark. May I ask what about my answers caused you to say definitely? :) I want

Re: elasticsearch-hadoop-hive exception when writing array<map<string,string>> column

2015-03-12 Thread Costin Leau
The exception occurs because you are trying to extract a field (the script parameters) from a complex type (array) and not a primitive. The issue with that (and why it's currently not supported) is because the internal structure of the complex type can get quite complex and its serialized, JSON

Re: snapshot and zone

2015-03-12 Thread Mark Walkom
Just use a local mount point; as long as the path is the same it doesn't matter. Also, we do not recommend cross DC clusters, as there are a lot of potential issues you can run into. On 12 March 2015 at 17:31, Foobar Geez foobarg...@gmail.com wrote: Hello, We use one ES cluster with 4 nodes

Re: Elasticsearch manifest.xml possible configuration issue.

2015-03-12 Thread Mark Walkom
Technically we don't support SmartOS, but please raise a Github issue anyway as it'd be interesting to look into more. On 12 March 2015 at 12:12, dj.hutch deanjhu...@gmail.com wrote: Hi All, I've been working with Elasticsearch on a Joyent SmartOS instance and discovered a possible issue

Re: Hive to elasticsearch Parsing exception.

2015-03-12 Thread Costin Leau
Likely the issue is caused by the fact that in your manual mapping, the NULL value is not actually mapped to null but to a string value. You should be able to get around it by converting NULL to a proper NULL value which es-hadoop can recognize; additionally you can 'translate' it to a

Re: Please help to understand these Exceptions

2015-03-12 Thread Chris Neal
Thank you Mark. May I ask what about my answers caused you to say definitely? :) I want to better understand capacity related items for ES for sure. Many thanks! Chris On Wed, Mar 11, 2015 at 2:13 PM, Mark Walkom markwal...@gmail.com wrote: Then you're definitely going to be seeing node

Re: Should clause behaves like a must clause in filtered query

2015-03-12 Thread parq
However, the following query returns the expected document: curl -XGET "http://localhost:9200/test-cbx/bug/_search" -d' { "query": { "filtered": { "query": { "bool": { "must": [ { "match": { "type":

Kibana4 + Apache server ?

2015-03-12 Thread Guillaume RICHAUD
Hi guys, I'm trying to install the latest ELK stack including Kibana4.0.1 on a virtual machine with CentOS 7 minimal. My aim is to access kibana via an Apache server (httpd) from my computer (because the centOS mini hasn't any gnome installed so it's all in command lines). I've got an

Aggregations across multiple indices

2015-03-12 Thread Christian Rohling
Hello Everyone, I am attempting to use aggregations to count the number of documents matching a given query across multiple indices. What I would like to do, is make those counts on distinct keys. Say I had following document in 2 different indices, aliased together. ``` { _index: myindex

Re: Kibana connection with ElasticSearch

2015-03-12 Thread aaron
Kibana is tightly coupled with features that are available in ElasticSearch. As those features change versions of Kibana change. For instance the latest version of Kibana requires that you are using 1.4.4. Unless more updates have changed that. If you are running a version that predates

Re: ElasticSearch across multiple data center architecture design options

2015-03-12 Thread aaron
Why not load balance multiple tribe nodes, if you need multiple. On Wednesday, March 11, 2015 at 9:41:39 AM UTC-6, Abigail wrote: Hi Mark, Thank you for your reply. Is there any existing approach for kibana to communicate with multiple tribe nodes? Or is it something we should implement

Re: Kibana4 + Apache server ?

2015-03-12 Thread aaron
The latest versions of Kibana are very different than the older versions. The old version was just a bunch of javascript that needed any old webserver to host the files. The new version is a full blown node.js application and as such does not use Apache at all, but requires node.js. It also

Re: ElasticSearch across multiple data center architecture design options

2015-03-12 Thread naye923
Yes, that is what I meant. Is there any reference for setting up load balancing for Kibana 4? Or is it easier with Kibana 3? On Thu, Mar 12, 2015 at 12:26 PM, aa...@definemg.com wrote: Why not load balance multiple tribe nodes, if you need multiple. On Wednesday, March 11, 2015 at 9:41:39

Re: best practice for rebuilding an index using aliases

2015-03-12 Thread aaron
I tried to reply earlier but seems Google lost that reply. My suggestion would be to create a v1_new index that has the same mappings as v1. When you are ready to migrate to v2, change indexing to go to v1_new, change searches to cover v1 and v1_new (alias or query string), copy v1 to v2,
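
For the alias part of this discussion, the usual building block is a single request to the _aliases endpoint that removes the alias from the old index and adds it to the new one in the same call. The Python sketch below only builds that request body. The alias name "myalias" is hypothetical, v1 and v2 are the index names from the message above, and actually sending the body (for example with an HTTP client) is left out.

```python
import json


def alias_swap_body(alias, old_index, new_index):
    """Build an _aliases request body that moves alias from old to new index."""
    return {
        "actions": [
            # Both actions are submitted in one request, so searches
            # through the alias never see a moment without an index.
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # The body would be sent with POST to /_aliases.
    print(json.dumps(alias_swap_body("myalias", "v1", "v2"), indent=2))
```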

Re: Oops! SearchPhaseExecutionException[Failed to execute phase [query], all shards failed]

2015-03-12 Thread Taylor Wood
I didn't get any help on this, but as an FYI for those who may have this issue and are just starting: digging deeper, it appears our system was created with 5 shards and 1 replica. Granted we are only using 1 node, so every day elasticsearch would create an index of 10 shards, 5 for the

Re: Snapshot Scaling Problems

2015-03-12 Thread aaron
With the low volume of ingest, and the long duration of history, I'd suggest you may want to trim back the number of shards per index from the default 5. Based on your 100 docs per day I'd say 1 shard per day. If you combined this with the other suggestion to increase the duration of an index,
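
One way to apply the 1 shard per index suggestion to indexes that logstash creates automatically is an index template. The Python sketch below builds such a template body under two assumptions that are not from the message above: the indexes are named logstash-*, and the cluster is a 1.x-era Elasticsearch, where the pattern key in a template is called "template". The template name in the comment is hypothetical.

```python
import json


def single_shard_template(pattern="logstash-*", shards=1):
    """Build an index template body that sets the primary shard count."""
    return {
        # Assumed 1.x-style key; it selects which new indexes are affected.
        "template": pattern,
        "settings": {"number_of_shards": shards},
    }


if __name__ == "__main__":
    # The body would be sent with PUT to /_template/<some_template_name>.
    print(json.dumps(single_shard_template(), indent=2))
```

A template only affects indexes created after it is stored; existing indexes keep their shard count.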

Re: Analyzers and JSON

2015-03-12 Thread Aaron Mefford
Take a look at Apache Tika http://tika.apache.org/. It will allow you to extract the contents of the documents for indexing; this is outside of the scope of the ElasticSearch indexing. A good tool to make these files downloadable is also out of scope, but I'll answer to what is in scope. You

Re: Dealing with spam in this forum

2015-03-12 Thread Gavin Seng
Hi, What is the current policy on this? I just tried creating 2 new posts ... they showed up for awhile ... and then disappeared. I thought that it could be because I did inline pictures ... so I tried reposting and got the same result. Not sure if they're in a to be moderated bucket ... or

Re: Analyzers and JSON

2015-03-12 Thread Austin Harmon
Okay, so I have a large amount of data, 2 TB, and it's all Microsoft Office documents, PDFs and emails. What is the best way to go about indexing the body of these documents so that the contents of the documents are searchable? I tried to use the php client but that isn't helping and I know there

Re: Aggregations across multiple indices

2015-03-12 Thread Karl Putland
you might look at http://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html#search-aggregations-metrics-cardinality-aggregation --K Karl Putland Senior Engineer *SimpleSignal* Anywhere: 303-242-8608

Re: OOME When large segments merge

2015-03-12 Thread Michael McCandless
Do you have many fields with norms enabled? Mike McCandless http://www.elastic.co On Thu, Mar 12, 2015 at 1:20 PM, Mark Greene m...@evertrue.com wrote: I've noticed periodically that data nodes in my cluster will run out of heap space when large segments start merging. I attached a

Re: char_filter for German

2015-03-12 Thread joergpra...@gmail.com
Yes, please upgrade Elasticsearch to use the official german normalizer. I added it to decompound plugin for convenience, it may be removed at any later time. Jörg On Wed, Mar 11, 2015 at 9:54 PM, Krešimir Slugan kresimir.slu...@gmail.com wrote: Thanks! I assume that german_normalize is

Re: Machine Learning / Decision Tree Learning with Elasticsearch

2015-03-12 Thread vineeth mohan
Hello Michael, There is Hadoop integration with Elasticsearch. With this integration, it can run against each feed in Elasticsearch in a highly optimized way. This gives you the opportunity to couple the Mahout library with Elasticsearch. I would advise this approach. Thanks Vineeth

Re: Issue with bettermap / kibana

2015-03-12 Thread squeaky jetpack
On Wednesday, 12 March 2014 13:31:12 UTC-7, Clinton Gormley wrote: This issue is fixed in master. cloudmade turned off public access, so we have switched to the mapquest servers. I'm still having this exact same issue:

Hive to elasticsearch Parsing exception.

2015-03-12 Thread P lva
Hello Everyone, I'm loading data from a Hive table (0.13) into Elasticsearch (1.4.4). With the auto create index option turned on, I don't face any problems and I can see all the data in ES. However, I get the following error when I create the index manually. Caused by:

Re: Should clause behaves like a must clause in filtered query

2015-03-12 Thread Les Barstow
should has a minimum_should_match of 1 when there is no must or must_not. With only a single should, that makes it act like must. On Thu, Mar 12, 2015 at 9:08 AM, parq p...@colourbox.com wrote: However, the following query returns the expected document, curl -XGET
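
The rule described in this reply can be illustrated with a toy Python model. This is not Elasticsearch or Lucene code; the function and the sample predicates are hypothetical, and the model covers only must and should clauses as the reply states the rule.

```python
def bool_matches(doc, must=(), should=()):
    """Toy model: does doc match a bool query with these clauses?"""
    # Every must clause has to match.
    if not all(clause(doc) for clause in must):
        return False
    # With no must clause, at least one should clause has to match,
    # which is why a lone should behaves like a must.
    required_should = 1 if should and not must else 0
    matched_should = sum(1 for clause in should if clause(doc))
    return matched_should >= required_should


if __name__ == "__main__":
    doc = {"type": "bug"}
    is_bug = lambda d: d.get("type") == "bug"
    is_closed = lambda d: d.get("status") == "closed"
    print(bool_matches(doc, should=[is_closed]))
    print(bool_matches(doc, must=[is_bug], should=[is_closed]))
```

In the model, the same failing should clause rejects the document when it stands alone but becomes optional once a must clause is present.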

OOME When large segments merge

2015-03-12 Thread Mark Greene
I've noticed periodically that data nodes in my cluster will run out of heap space when large segments start merging. I attached a screenshot of what marvel looked like leading up to the OOME on one of my data nodes. My question is, generally speaking, what knobs should I be turning to

Elasticsearch manifest.xml possible configuration issue.

2015-03-12 Thread dj.hutch
Hi All, I've been working with Elasticsearch on a Joyent SmartOS instance and discovered a possible issue with the java_opts value in the elasticsearch.xml file used to create the service. The line currently reads: propval name=java_opts type=astring value=-Djava.awt.headless=true -Xss256k

Elasticsearch with large amount of data

2015-03-12 Thread Jeferson Martins
Hi, I have 5 nodes of ElasticSearch with 4 CPUs, 8 Mbs of RAM. My index today has 1TB of data and grows by about 100GB per day. I configured 3 primary shards and 1 replica, but my Elasticsearch gets OutOfMemory every two days. Is there some configuration to resolve this problem?

snapshot and zone

2015-03-12 Thread Foobar Geez
Hello, We use one ES cluster with 4 nodes spread across 2 data centers (2 nodes/DC). Each DC is configured as a zone (via cluster.routing.allocation.awareness.attributes). I would like to use snapshot to backup indexes using type:fs. Per

Re: Elasticsearch with large amount of data

2015-03-12 Thread aaron
First going to assume you mean 8GBs of memory or I am very impressed that ElasticSearch runs at all. Second, when are you running out of memory? Do you run out of memory while indexing? Is it a specific document when indexing? Do you run out of memory when searching? Is it a specific

Re: ElasticSearch across multiple data center architecture design options

2015-03-12 Thread aaron
Perhaps you are misunderstanding me. ElasticSearch does not provide a load balancer for this purpose. You would use a typical HTTP load balancer, which could be anything from something as simple as Nginx to something costly like a NetScaler. Configuring such a load balancer I believe is

elasticsearch-hadoop-hive exception when writing array<map<string,string>> column

2015-03-12 Thread Chen Wang
Folks, I am using elasticsearch-hadoop-hive-2.1.0.Beta3.jar. I defined the external table as: CREATE EXTERNAL TABLE IF NOT EXISTS ${staging_table}( customer_id STRING, store_purchase ARRAY<MAP<STRING,STRING>>) ROW FORMAT SERDE 'org.elasticsearch.hadoop.hive.EsSerDe' STORED BY

Kibana build

2015-03-12 Thread Mohit Garg
I tried to build kibana using the instructions from https://github.com/elastic/kibana/blob/master/CONTRIBUTING.md. At the last step: grunt dev, I get the following error: Running dev task Running less:src (less) task FileError: 'lesshat.less' wasn't found in

Re: Oops! SearchPhaseExecutionException[Failed to execute phase [query], all shards failed]

2015-03-12 Thread aaron
You should be able to set the number of replicas for all previous indexes to 0. You cannot reduce the shard count once an index is created, or increase for that matter. You could reindex your shards. http://www.elastic.co/guide/en/elasticsearch/reference/current/indices-update-settings.html
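
The replica change mentioned here goes through the update settings API linked in the message. As a rough Python sketch, the helper below only assembles the method, path and body for such a request. The helper name is hypothetical, the default of _all (every index) is an assumption, and sending the request is left to whatever HTTP client is in use.

```python
import json


def zero_replicas_request(index="_all"):
    """Return (method, path, body) for setting number_of_replicas to 0."""
    path = "/%s/_settings" % index
    # Replica count can be changed on a live index; shard count cannot.
    body = {"index": {"number_of_replicas": 0}}
    return "PUT", path, body


if __name__ == "__main__":
    method, path, body = zero_replicas_request()
    print(method, path, json.dumps(body))
```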

Re: Kibana with Hadoop directly?

2015-03-12 Thread aaron
Kibana is very tightly integrated with ElasticSearch, to the point of requiring specific versions of ElasticSearch for a given version of Kibana. When you say Hadoop that really means nothing. Most of the Hadoop EcoSystem is not realtime. There are some exceptions like HBase, but their

now throttling indexing

2015-03-12 Thread Eric Jain
I set `indices.store.throttle.type: none` in the elasticsearch.yml, and yet this shows up in the logs: now throttling indexing: numMergesInFlight=5, maxNumMerges=4 stop throttling indexing: numMergesInFlight=3, maxNumMerges=4 Did I misunderstand the purpose of this setting?

Re: Shield with Java Client

2015-03-12 Thread Jettro Coenradie
Can you try to switch off client.transport.sniff? This might trigger another authority rule. I am not sure, but it is worth a try. On Tue, Mar 10, 2015 at 5:30 AM, Zsolt Bákonyi i...@netmango.net wrote: Dear Jettro. Can you help me, how could you do it? I try to communicate to Elasticsearch