Specifying a retention policy on the /write endpoint requires the rp query
parameter. The space between the RP and the measurement name is a convenience
that only works in the CLI.
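
For example, here is a minimal sketch of a write that targets a specific
retention policy over HTTP, using Python's standard library. The database
"SL" and RP "RP_debir_Test10hours" are taken from your output below; the
measurement, tag, and field are placeholders.

import urllib.parse
import urllib.request

line = 'measurementName,host=server01 value=0.64'  # placeholder point
params = urllib.parse.urlencode({'db': 'SL', 'rp': 'RP_debir_Test10hours'})
req = urllib.request.Request(
    'http://localhost:8086/write?' + params,
    data=line.encode('utf-8'),
    method='POST',
)
with urllib.request.urlopen(req) as resp:
    # A successful write returns HTTP 204 No Content.
    print(resp.status)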

It's difficult to precisely quantify these performance differences.
Ultimately, the best answer will come from profiling your data on your
operating system and your hardware.

That being said, there is a small CPU and memory overhead for each database. We
have users with hundreds of databases in production, but in general, prefer
the smallest number of databases that fits your needs.

You're more likely to run into disk contention with a larger number of
databases because there are more WAL and TSM files being managed. TSM files
in a single database and retention policy can be compacted together to
minimize disk reads; if they're spread out, you lose some compaction
benefits. Likewise, you'll be appending to a single WAL file per database.
You can batch writes to multiple measurements in a single request to
/write, but if your measurements are in separate databases and retention
policies, they necessarily become multiple requests to /write.
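
As a rough sketch of the batching point (same database and RP names as
above; the measurement names are made up for illustration):

import urllib.parse
import urllib.request

# Points for multiple measurements in the same database and RP can share
# one /write request by joining line-protocol lines with newlines.
batch = '\n'.join([
    'measurementOne,host=server01 value=0.64',
    'measurementTwo,host=server01 value=1.5',
])
params = urllib.parse.urlencode({'db': 'SL', 'rp': 'RP_debir_Test10hours'})
req = urllib.request.Request(
    'http://localhost:8086/write?' + params,
    data=batch.encode('utf-8'),
    method='POST',
)
urllib.request.urlopen(req)

# If those measurements lived in different databases or retention policies,
# each db/rp combination would need its own request like the one above.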

On Wed, Dec 21, 2016 at 10:11 AM, Jeffery K <[email protected]> wrote:

> I think I found the issue.
> I am using an HTTP client and, in the line protocol, specifying the insert
> as "RPNAME"."measurementName", but I actually need a space between the RP
> and the measurement name. I thought that since the query required
> RP.measurement, the insert was the same. My mistake.
> Do you know what the difference in the performance profile would be when
> splitting out these measurements into more shards like this?
>
>
> On Wednesday, December 21, 2016 at 12:02:42 PM UTC-5, Mark Rushakoff wrote:
>>
>> It looks like you probably are writing without providing the rp query
>> parameter [1] and so writes are going into the default RP (that is, the one
>> with default=true, which also happens to be named "default").
>>
>> TSM and WAL files are stored under 
>> $INFLUXDB_DIR/{data,wal}/<database_name>/<rp_name>.
>> If you're intending to use a unique retention policy per measurement,
>> you're increasing disk usage and shard-management overhead compared to
>> keeping multiple measurements in a single retention policy and database.
>>
>> You are correct that the replication factor has no effect in open source
>> InfluxDB.
>>
>> [1] https://docs.influxdata.com/influxdb/v1.1/tools/api/#write
>>
>> On Tuesday, December 20, 2016 at 10:05:13 PM UTC-8, Jeffery K wrote:
>>>
>>> Looking at other diagnostics in the GitHub issue list, here is the
>>> output of SHOW SHARDS. I'm confused why they all have the "default" RP. I
>>> have a retention policy for every measurement in my database.
>>>
>>>
>>> name: SL
>>> id   database  retention_policy  shard_group  start_time            end_time              expiry_time           owners
>>> --   --------  ----------------  -----------  ----------            --------              -----------           ------
>>> 110  SL        default           110          2015-10-26T00:00:00Z  2015-11-02T00:00:00Z  2015-11-02T00:00:00Z
>>> 86   SL        default           86           2016-11-14T00:00:00Z  2016-11-21T00:00:00Z  2016-11-21T00:00:00Z
>>> 88   SL        default           88           2016-11-21T00:00:00Z  2016-11-28T00:00:00Z  2016-11-28T00:00:00Z
>>> 95   SL        default           95           2016-11-28T00:00:00Z  2016-12-05T00:00:00Z  2016-12-05T00:00:00Z
>>> 104  SL        default           104          2016-12-05T00:00:00Z  2016-12-12T00:00:00Z  2016-12-12T00:00:00Z
>>> 107  SL        default           107          2016-12-12T00:00:00Z  2016-12-19T00:00:00Z  2016-12-19T00:00:00Z
>>> 115  SL        default           115          2016-12-19T00:00:00Z  2016-12-26T00:00:00Z  2016-12-26T00:00:00Z
>>>
>>>
>>>
>>> On Tuesday, December 20, 2016 at 6:35:35 PM UTC-5, Jeffery K wrote:
>>>>
>>>> I'm having an issue with the InfluxDB 1.1 release. I just implemented
>>>> retention policies, but they don't seem to be deleting the data. I believe
>>>> I understand that data won't be deleted until the shard group duration has
>>>> passed, which can vary by the RP duration (if not specified), but even
>>>> using small durations, my data is still not being deleted.
>>>>
>>>> I have the following retention policies:
>>>> Even the 10-hour retention policy still has all the data I've added to
>>>> it, which spans 5 days (collected live over the last 5 days). In fact, all
>>>> of these retention policies still have all the data that was inserted into
>>>> them. It seems like no deletes are running.
>>>> Does the replica number affect that? I set it to 2, just so that if this
>>>> is used in a cluster, the retention policy is already set up correctly for
>>>> 2, but it is currently not used in a cluster. My understanding is that the
>>>> replica number has no effect unless in a cluster, right?
>>>>
>>>> > show retention policies
>>>> name                         duration    shardGroupDuration  replicaN  default
>>>> ----                         --------    ------------------  --------  -------
>>>> default                      0s          168h0m0s            1         true
>>>> RP_debir_Test10hours         10h0m0s     1h0m0s              2         false
>>>> RP_debir_Test1day            24h0m0s     1h0m0s              2         false
>>>> RP_debir_Test2days           48h0m0s     24h0m0s             2         false
>>>> RP_debir_Test3days           72h0m0s     24h0m0s             2         false
>>>> RP_debir_Test1000datablocks  8333h20m0s  168h0m0s            2         false
>>>> RP_it-edmTest1000datablocks  8333h20m0s  168h0m0s            2         false
>>>> RP_negtest_Millies           336h0m0s    24h0m0s             2         false
>>>>
