Re: Data Loss

Ivan Brusic Fri, 14 Feb 2014 07:36:47 -0800

Jörg,

So if your shards are at most 5GB and you have 3x2 shards, then your data,
per index, is "only" 30GB?
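
For reference, here is the arithmetic behind that estimate as a rough sketch. It assumes "3x2" means 3 primary shards with 1 replica each (number_of_shards: 3 and number_of_replicas: 1 in the config quoted below) and treats 5 GB as the upper bound per shard. The function name is only for illustration.

```python
def max_index_size_gb(primaries, replicas, max_shard_gb):
    """Upper bound on disk used by one index, counting replica copies."""
    # Every primary shard exists once, plus once per replica.
    total_shards = primaries * (1 + replicas)
    return total_shards * max_shard_gb


# 3 primaries, 1 replica each, at most 5 GB per shard
print(max_index_size_gb(primaries=3, replicas=1, max_shard_gb=5))  # 30
```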


I don't know why, maybe because it is what I used in Lucene, but I always set
segments_per_tier and max_merge_at_once to the same value. I had them higher
than the default of 10, but I slowly reduced them back to the default. How do
you tune BM25? It is hard to debug since you cannot change similarities on the
fly.
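
To make "the same value" concrete, here is a minimal sketch that builds a settings body with both merge knobs set equal. The setting names are the ones from Jörg's config; the helper name is hypothetical, and how you apply the result (elasticsearch.yml or the index settings API) depends on your setup and version, so treat this as an illustration only.

```python
import json


def merge_policy_settings(value=10):
    """Build a settings body with both merge knobs set to the same value."""
    # 10 is the default mentioned above; the nesting mirrors the YAML config.
    return {
        "index": {
            "merge": {
                "policy": {
                    "segments_per_tier": value,
                    "max_merge_at_once": value,
                }
            }
        }
    }


print(json.dumps(merge_policy_settings(), indent=2))
```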

Cheers,

Ivan


On Thu, Feb 13, 2014 at 2:36 PM, [email protected] <
[email protected]> wrote:

> Here is the Elasticsearch config of my development ES 1.0 cluster (3
> nodes). I'm still experimenting with some settings.
>
> Server: HP DL 165 G7 (64GB RAM), 1TB RAID0 SAS, OS: RHEL 6.3, JVM: Java 8
> FCS
>
> Jörg
>
> bootstrap:
>   mlockall: true
>
> store:
>   index:
>     type: mmapfs
>
> network:
>   host: _local_
>
> transport:
>   ping_timeout: 30s
>   tcp:
>     port: 19300
>
> http:
>   port: 19200
>
> cluster:
>   name: "zbn-1.0"
>
> index:
>   codec:
>     bloom:
>       load: false
>   number_of_shards: 3
>   number_of_replicas: 1
>   merge:
>     policy:
>       max_merged_segment: 1gb
>       segments_per_tier: 24
>       max_merge_at_once: 4
>   similarity:
>     default:
>       type: BM25
>
> indices:
>   recovery:
>     concurrent_streams: 8
>
> gateway:
>   type: "local"
>   recover_after_nodes: 3
>   recover_after_time: 15m
>   expected_nodes: 3
>
> threadpool:
>   bulk:
>     type: fixed
>     size: 64
>     queue_size: 128
>
> discovery:
>   zen:
>     ping_timeout: 5s
>     minimum_master_nodes: 2
>
>
> On Thu, Feb 13, 2014 at 11:20 PM, [email protected] <
> [email protected]> wrote:
>
>> I'm keeping my shards small (1-5 GB). I do not disable shard allocation;
>> the recovery is fast enough (a few minutes).
>>
>> Gateway is local, the default.
>>
>> I keep backups on the file system, with plain cp -a or rsync, but also on
>> the data source side: I can reindex everything in a few hours, since I have
>> only ~100 million docs to index.
>>
>> Because I don't operate a shared file system, I cannot use ES 1.0
>> snapshot/restore yet.
>>
>> Jörg
>>
>>
>>
>> On Wed, Feb 12, 2014 at 11:27 PM, Mohit Anchlia
>> <[email protected]> wrote:
>>
>>>
>>> Do you have special settings for "disable shard allocation" and the
>>> "gateway" module? Would you be able to share those settings? There are
>>> numerous settings and I am trying to understand which ones make the
>>> most sense.
>>>
>>> How are you taking backups of data today? I believe the new incremental
>>> backup feature is slated for 1.0.
>>>
>>> If there are other best practices you can share, that would be helpful.
>>>
>>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/CAKdsXoH2v1Z25KnMx2dkQPWM-dhVmddM9nTYKpb_AG%2BO8cpG6Q%40mail.gmail.com
> .
>
> For more options, visit https://groups.google.com/groups/opt_out.
>

