Hi Sean,

The data runs from 1838 to 2016, daily (sparse at times). We need to retain
all of it, hence the default retention policy.

Thanks,
Tanya

On 13 October 2016 at 06:26, Sean Beckett <s...@influxdb.com> wrote:

> Tanya, what range of time does your data cover? What are the retention
> policies on the database?
>
> On Tue, Oct 11, 2016 at 11:14 PM, Tanya Unterberger <
> tanya.unterber...@gmail.com> wrote:
>
>> Hi Sean,
>>
>> 1. Initially I killed the process
>> 2. At some point I restarted influxdb service
>> 3. Error logs show no errors
>> 4. I rebuilt the server, installed the latest rpm. Reimported the data
>> via scripts. Data goes in, but the server is unusable. Looks like indexing
>> might be stuffed. The size of the data in that database is 38M. Total size
>> of /var/lib/influxdb/data/ is 273M.
>> 5. CPU went berserk and doesn't come down
>> 6. A query like select count(blah) against the measurement that was batch
>> inserted (10k records at a time) is unusable and times out
>> 7. I need to import around 15 million records. How should I throttle that?
>>
>> At the moment I am pulling my hair out (not a pretty sight)
>>
>> Thanks a lot!
>> Tanya
>>
>> On 12 October 2016 at 06:11, Sean Beckett <s...@influxdb.com> wrote:
>>
>>>
>>>
>>> On Tue, Oct 11, 2016 at 12:11 AM, <tanya.unterber...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> It seems that the old issue might have surfaced again (#3349) in v1.0.
>>>>
>>>> I tried to insert a large number of records (3913595) via a script,
>>>> inserting 10000 rows at a time.
>>>>
>>>> After a while I received
>>>>
>>>> HTTP/1.1 500 Internal Server Error
>>>> Content-Type: application/json
>>>> Request-Id: ac8ebbbe-8f70-11e6-8ce7-000000000000
>>>> X-Influxdb-Version: 1.0.0
>>>> Date: Tue, 11 Oct 2016 05:12:02 GMT
>>>> Content-Length: 20
>>>>
>>>> {"error":"timeout"}
>>>> HTTP/1.1 100 Continue
>>>>
>>>> I killed the process, after which the whole box became pretty much
>>>> unresponsive.
>>>>
>>>
>>> Killed the InfluxDB process, or the batch writing script process?
>>>
>>>
>>>>
>>>> There is nothing in the logs (i.e. sudo ls /var/log/influxdb/ gives me
>>>> nothing) although the setting for http logging is true:
>>>>
>>>
>>> systemd OSes put the logs in a new place (yay!?). See
>>> http://docs.influxdata.com/influxdb/v1.0/administration/logs/#systemd
>>> for how to read the logs.
>>>
>>>
>>>>
>>>> [http]
>>>>   enabled = true
>>>>   bind-address = ":8086"
>>>>   auth-enabled = true
>>>>   log-enabled = true
>>>>
>>>> I tried to restart influx, but got the following error:
>>>>
>>>> Failed to connect to http://localhost:8086
>>>> Please check your connection settings and ensure 'influxd' is running.
>>>>
>>>
>>> The `influx` console is just a fancy wrapper on the API. That error
>>> doesn't mean much except that the HTTP listener in InfluxDB is not yet up
>>> and running.
>>>
>>>
>>>>
>>>> Although I can see that influxd is up and running:
>>>>
>>>> > systemctl | grep influx
>>>> influxdb.service  loaded active running  InfluxDB is an open-source, distributed, time series database
>>>>
>>>> What do I do now?
>>>>
>>>
>>> Check the logs as referenced above.
>>>
>>> The non-responsiveness on startup isn't surprising. It sounds like the
>>> system was overwhelmed with writes, which means that the WAL would have
>>> many points cached, waiting to be flushed to disk. On restart, InfluxDB
>>> won't accept new writes or queries until the cached ones in the WAL have
>>> persisted. For this reason, the HTTP listener is off until the WAL is
>>> flushed.
>>>
>>>
>>>>
>>>> I tried the same import over the weekend. The script eventually timed
>>>> out, but the result was the same: an unresponsive, unusable server.
>>>> We rebuilt the box and started again.
>>>>
>>>
>>> It sounds like the box is just overwhelmed. Did you get backoff messages
>>> from the writes before the crash? What are the machine specs?
>>>
>>>
>>>
>>>>
>>>> Perhaps it is worthwhile mentioning that the same measurement already
>>>> contained about 9 million records. Some of these records had the same
>>>> timestamp as the ones I tried to import, i.e. they should have been merged.
>>>>
>>>
>>> Overwriting points is much, much more expensive than posting new points.
>>> Each overwritten point triggers a tombstone record which must later be
>>> processed. This can trigger frequent compactions of the TSM files. With a
>>> high write load and frequent compactions, the system would encounter
>>> significant CPU pressure.
>>>
>>>
>>>>
>>>> Interestingly enough the same amount of data was fine when I forgot to
>>>> add precision in ms, i.e. all records were imported as nanoseconds, but in
>>>> fact they "lacked" 6 zeroes.
>>>>
>>>
>>> That would mean all points are going to the same shard. It is more
>>> resource intensive to load points across a wide range of time, since more
>>> shard files are involved. InfluxDB does best with sequential,
>>> chronologically ordered, unique points from the very recent past. The more
>>> the write operation differs from that, the lower the throughput.
>>>
>>>
>>>>
>>>> Please advise what kind of action I can take.
>>>>
>>>
>>> Look in the logs for errors. Throttle the writes. Don't overwrite more
>>> points than you have to.
>>>
>>>
>>>>
>>>> Thanks a lot!
>>>> Tanya
>>>>
>>>> --
>>>> Remember to include the InfluxDB version number with all issue reports
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "InfluxDB" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to influxdb+unsubscr...@googlegroups.com.
>>>> To post to this group, send email to influxdb@googlegroups.com.
>>>> Visit this group at https://groups.google.com/group/influxdb.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/influxdb/f4ebdb56-32f9-4fb6-88de-f7ef603c4262%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>
>>>
>>> --
>>> Sean Beckett
>>> Director of Support and Professional Services
>>> InfluxDB
>>>
>>
>
>
>
> --
> Sean Beckett
> Director of Support and Professional Services
> InfluxDB
>
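
On throttling the import of the remaining ~15 million records: a minimal
sketch in Python, assuming the import script already has some function that
posts one payload of line protocol and reports success. The names below
(batched, write_throttled, send) and the default batch size and pause are
illustrative assumptions, not InfluxDB APIs or recommended values. The idea is
only what Sean describes: send smaller batches, pause between them, and back
off when a write fails or times out instead of pushing harder.

```python
import time
from typing import Callable, Iterable, Iterator, List


def batched(lines: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` lines."""
    batch: List[str] = []
    for line in lines:
        batch.append(line)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def write_throttled(
    lines: Iterable[str],
    send: Callable[[str], bool],
    batch_size: int = 5000,
    pause_s: float = 1.0,
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send lines in batches, pausing after each batch and backing off on failure.

    `send` is a hypothetical caller-supplied function that posts one
    newline-joined payload and returns True on success.
    Returns the number of lines written.
    """
    written = 0
    for batch in batched(lines, batch_size):
        payload = "\n".join(batch)
        delay = pause_s
        for attempt in range(max_retries + 1):
            if send(payload):
                written += len(batch)
                break
            if attempt == max_retries:
                raise RuntimeError(f"batch failed after {max_retries} retries")
            sleep(delay)   # wait before retrying the same batch
            delay *= 2     # exponential backoff
        sleep(pause_s)     # fixed pause after every successful batch
    return written
```

In a real script, send would wrap the HTTP POST to the write endpoint and
return False on a 500 or a timeout like the one shown above. The pause and
batch size would need tuning against the machine. Sorting the input by
timestamp first should also help, given Sean's point about chronological order.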

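On why the earlier import without ms precision was cheap: a rough worked
example, assuming the timestamps are integer milliseconds since the Unix epoch
and that a value sent without a precision is read as nanoseconds. The 7-day
shard group duration is an assumption about the default retention policy, so
check it with SHOW RETENTION POLICIES before relying on the count. The helper
names and the two dates are illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
SHARD_GROUP = timedelta(days=7)  # assumed shard group duration


def epoch_ms(dt: datetime) -> int:
    """Milliseconds since the Unix epoch (negative before 1970)."""
    return (dt - EPOCH) // timedelta(milliseconds=1)


def read_as_ns(value: int) -> datetime:
    """Interpret an integer as nanoseconds since the epoch.

    Truncated to microseconds, which is all datetime can represent.
    """
    return EPOCH + timedelta(microseconds=value // 1000)


first = datetime(1838, 1, 1, tzinfo=timezone.utc)
last = datetime(2016, 10, 11, tzinfo=timezone.utc)

# Span of the data when the ms values are misread as ns.
misread_span = read_as_ns(epoch_ms(last)) - read_as_ns(epoch_ms(first))

# Rough number of shard groups touched when the timestamps are read correctly.
groups_correct = (last - first) // SHARD_GROUP + 1

print("misread as ns, the data spans:", misread_span)
print("read correctly, shard groups touched (approx.):", groups_correct)
```

Under these assumptions the misread data collapses into a window of under two
hours around 1970, so it lands in one or two shard groups, while the correctly
stamped data is spread over thousands of them. That is the "wide range of
time" cost Sean mentions.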
