What about on the /query endpoint? I tried something similar, but it didn't
seem to work. My measurement names and retention policies are named almost
identically, except the retention policy has "RP_" at the front and the
measurement has "Metric_" at the front.
This was my URL. It gave me an empty result ({"results":[{}]}), but with a
200 response code.
http://localhost:8086/query?db=jefferyk&epoch=ms&rp=RP_jefferyk_Windows+System&q=show+field+keys+FROM+%22Metric_jefferyk_Windows+System%22
In the database log, I can see it's selecting from the autogen retention
policy, despite my providing the rp on the query endpoint.
database console output:
[query] 2016/12/22 10:27:00 SELECT fieldKey, fieldType FROM jefferyk.autogen._fieldKeys WHERE _name = 'Metric_NEWPETRONAS_Windows System'
[httpd] 127.0.0.1 - - [22/Dec/2016:10:27:00 -0500] "GET /query?db=jefferyk&epoch=ms&q=show+field+keys+FROM+%22Metric_NEWPETRONAS_Windows+System%22&rp=RP_NEWPETRONAS_Windows+System HTTP/1.1" 200 17 "-" "Java/1.8.0_73" 14f22859-c85b-11e6-9276-000000000000 1001
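One thing I have not tried yet, so this is just a guess, is fully qualifying
the measurement in the query text itself instead of relying on the rp
parameter, the way a regular SELECT allows:

  SHOW FIELD KEYS FROM "jefferyk"."RP_jefferyk_Windows System"."Metric_jefferyk_Windows System"

(URL-encoded into the q parameter as before). I'm not sure whether SHOW FIELD
KEYS honours the fully qualified "db"."rp"."measurement" form the way SELECT
does, so treat that as an untested idea.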
On Wednesday, December 21, 2016 at 1:59:00 PM UTC-5, Mark Rushakoff wrote:
>
> Specifying a retention policy on the /write endpoint requires the rp query
> parameter. The space between the RP and measurement name only works in the
> CLI as a convenience.
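> Roughly (names below are just placeholders), a write aimed at a non-default
> retention policy looks like:
>
>   POST /write?db=mydb&rp=RP_my_policy
>   My\ Measurement,host=hostA someField=1i 1482421200000000000
>
> The rp value is only the retention policy name, and the body is plain line
> protocol, with any space in the measurement name backslash-escaped.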
>
> It's difficult to precisely quantify these performance differences.
> Ultimately, the best answer will come from profiling your data on your
> operating system and your hardware.
>
> That being said, there's a small CPU and memory overhead for each database.
> We have users with hundreds of databases in production, but in general,
> prefer the smallest number of databases that fits your needs.
>
> You're more likely to run into disk contention with a larger number of
> databases because there are more WAL and TSM files being managed. TSM files
> in a single database and retention policy can be compacted together to
> minimize disk reads; if they're spread out, you lose some compaction
> benefits. Likewise, you'll be appending to a single WAL file per database.
> You can batch writes to multiple measurements in a single request to
> /write, but if your measurements are in separate databases and retention
> policies, they must necessarily be multiple requests to /write.
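> As a rough sketch (again with made-up names), a single batched request
> within one database and retention policy could be:
>
>   POST /write?db=mydb&rp=RP_two_weeks
>   cpu,host=a usage=0.42 1482421200000000000
>   mem,host=a used=1024i 1482421200000000000
>
> but the same two points split across different databases or retention
> policies have to become two separate /write calls, since db and rp are
> per-request query parameters.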
>
> On Wed, Dec 21, 2016 at 10:11 AM, Jeffery K <[email protected]> wrote:
>
>> I think I found the issue.
>> I am using an HTTP client and, in the line protocol, was specifying the
>> insert as "RPNAME"."measurementName", but I actually need a space between
>> the RP and measurement name. I thought that since the query required
>> RP.measurement, the insert was the same. My mistake.
>> Do you know what the difference in the performance profile would be when
>> splitting out these measurements into more shards like this?
>>
>>
>> On Wednesday, December 21, 2016 at 12:02:42 PM UTC-5, Mark Rushakoff
>> wrote:
>>>
>>> It looks like you probably are writing without providing the rp query
>>> parameter [1] and so writes are going into the default RP (that is, the one
>>> with default=true, which also happens to be named "default").
>>>
>>> TSM and WAL files are stored under
>>> $INFLUXDB_DIR/{data,wal}/<database_name>/<rp_name>. If you're intending to
>>> use a unique retention policy per measurement, you're raising the disk
>>> usage and shard management overhead as compared to multiple measurements in
>>> a single retention policy and database.
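>>> Concretely, with the default Linux paths (adjust for your install), each
>>> retention policy gets its own subtree, one directory per shard:
>>>
>>>   /var/lib/influxdb/data/<database_name>/<rp_name>/<shard_id>/*.tsm
>>>   /var/lib/influxdb/wal/<database_name>/<rp_name>/<shard_id>/*.wal
>>>
>>> so one RP per measurement multiplies the number of shard directories, WAL
>>> segments, and TSM files the engine has to manage.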
>>>
>>> You are correct that the replication factor has no effect in open source
>>> InfluxDB.
>>>
>>> [1] https://docs.influxdata.com/influxdb/v1.1/tools/api/#write
>>>
>>> On Tuesday, December 20, 2016 at 10:05:13 PM UTC-8, Jeffery K wrote:
>>>>
>>>> In looking at other diagnostics in the github issue list, here is the
>>>> output of a show shards. I'm confused why they all have the "default" RP.
>>>> I
>>>> have a retention policy for every measurement in my database.
>>>>
>>>>
>>>> name: SL
>>>> id  database retention_policy shard_group start_time           end_time             expiry_time          owners
>>>> --  -------- ---------------- ----------- ----------           --------             -----------          ------
>>>> 110 SL       default          110         2015-10-26T00:00:00Z 2015-11-02T00:00:00Z 2015-11-02T00:00:00Z
>>>> 86  SL       default          86          2016-11-14T00:00:00Z 2016-11-21T00:00:00Z 2016-11-21T00:00:00Z
>>>> 88  SL       default          88          2016-11-21T00:00:00Z 2016-11-28T00:00:00Z 2016-11-28T00:00:00Z
>>>> 95  SL       default          95          2016-11-28T00:00:00Z 2016-12-05T00:00:00Z 2016-12-05T00:00:00Z
>>>> 104 SL       default          104         2016-12-05T00:00:00Z 2016-12-12T00:00:00Z 2016-12-12T00:00:00Z
>>>> 107 SL       default          107         2016-12-12T00:00:00Z 2016-12-19T00:00:00Z 2016-12-19T00:00:00Z
>>>> 115 SL       default          115         2016-12-19T00:00:00Z 2016-12-26T00:00:00Z 2016-12-26T00:00:00Z
>>>>
>>>>
>>>>
>>>> On Tuesday, December 20, 2016 at 6:35:35 PM UTC-5, Jeffery K wrote:
>>>>>
>>>>> I'm having an issue with the InfluxDB 1.1 release. I just implemented
>>>>> retention policies, but they don't seem to be deleting the data. I
>>>>> understand that data won't be deleted until the shard group duration has
>>>>> passed, which can vary with the RP duration (if not specified), but even
>>>>> with small durations, my data is still not being deleted.
>>>>>
>>>>> I have the following retention policies (output below). Even the 10-hour
>>>>> retention policy still has all the data I've added to it, which is 5
>>>>> days' worth (collected live over the last 5 days). In fact, all of these
>>>>> retention policies still have all the data that was inserted into them.
>>>>> It seems like no deletes are running.
>>>>> Does the replica number affect that? I set 2 just so that if this is
>>>>> used in a cluster, it already has the retention policy set up correctly
>>>>> for 2, but it is currently not used in a cluster. My understanding is
>>>>> that the replica number has no effect unless in a cluster, right?
>>>>>
>>>>> > show retention policies
>>>>> name                        duration   shardGroupDuration replicaN default
>>>>> ----                        --------   ------------------ -------- -------
>>>>> default                     0s         168h0m0s           1        true
>>>>> RP_debir_Test10hours        10h0m0s    1h0m0s             2        false
>>>>> RP_debir_Test1day           24h0m0s    1h0m0s             2        false
>>>>> RP_debir_Test2days          48h0m0s    24h0m0s            2        false
>>>>> RP_debir_Test3days          72h0m0s    24h0m0s            2        false
>>>>> RP_debir_Test1000datablocks 8333h20m0s 168h0m0s           2        false
>>>>> RP_it-edmTest1000datablocks 8333h20m0s 168h0m0s           2        false
>>>>> RP_negtest_Millies          336h0m0s   24h0m0s            2        false
>>>>>
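>>>>> For reference, the 10-hour policy above corresponds to a statement along
>>>>> the lines of (<your_database> is a placeholder for the actual database
>>>>> name):
>>>>>
>>>>>   CREATE RETENTION POLICY "RP_debir_Test10hours" ON <your_database> DURATION 10h REPLICATION 2 SHARD DURATION 1h
>>>>>
>>>>> My understanding is that expired data is only dropped a whole shard group
>>>>> at a time, so even for this policy nothing should go away until an entire
>>>>> 1h shard group has aged past the 10h boundary (plus whatever the
>>>>> retention enforcement check interval is).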