Hi all,
I'm planning to implement a data sink that forwards and stores events in
Redis[1][2], but I'd like to get your feedback and opinions before
proceeding.
The question I have is this: since Redis is merely a key-value store, and
we have a structured event to persist, what should the key and value be?
The possible approaches are as follows[3]:
1. Store the entire object as a JSON-encoded string in a single key.
    SET event:{id} '{"sensorId":"001", "temp":28}'
- Pro: faster when accessing all the fields of the event at once.
- Pro: works with nested objects (but I don't think we have any nested
objects).
- Pro: can set the TTL.
- Con: slower when accessing a single or subset of fields of the event.
- Con: JSON parsing is required to retrieve fields. However, it's quite
fast.
2. Store each object's properties as fields of a Redis hash.
    HMSET event:{id} sensorId "001"
    HMSET event:{id} temp "28"
- Pro: can set the TTL.
- Pro: no need to parse JSON strings.
- Pro: faster when accessing a single or subset of fields of the event.
- Con: slower when accessing all the fields of the event.
3. Store each object as a JSON string in a field of a single Redis hash.
    HMSET events {id1} '{"sensorId":"001", "temp":28}'
    HMSET events {id2} '{"sensorId":"002", "temp":32}'
- Pro: fewer keys to work with.
- Con: can't set the TTL per event (EXPIRE applies to the whole "events"
key, not to individual fields).
- Con: JSON parsing is required to retrieve fields.
- Con: slower when accessing a single or subset of fields of the event.
4. Store each property of each object in a dedicated key.
    SET event:{id}:sensorId "001"
    SET event:{id}:temp 28
- Pro: can set the TTL per field (but it's not necessary for our
scenario).
- Pro: no need to parse JSON strings.
- Pro: faster when accessing a single or subset of fields of the event.
- Con: slower when accessing all the fields of the event.
5. Use the RedisJSON[4][5] module and store each event as a JSON document.
    JSON.SET event:{id} . '{"sensorId":"001", "temp":28}'
- Pro: faster manipulation of JSON documents.
- Pro: faster when accessing single/multiple fields of the event.
- Pro: can set the TTL.
- Con: requires RedisJSON module.
IMO, 1 & 2 would be the best choices, given that they both allow a TTL to
be set for purging. Which do you think is best? Your feedback is highly
appreciated.
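To make the comparison between 1 and 2 a bit more concrete, here is a rough
Python sketch of the commands the sink would issue for each. It only builds
the command arguments and doesn't talk to a Redis server. The function names
(as_json_string, as_hash) and the ttl_seconds parameter are made up for
illustration and aren't part of any existing code.

```python
import json


def as_json_string(event_id, event, ttl_seconds):
    """Option 1: one string key holding the whole JSON-encoded event."""
    key = f"event:{event_id}"
    # SET accepts an EX argument, so value and TTL go out in one command.
    return [("SET", key, json.dumps(event), "EX", str(ttl_seconds))]


def as_hash(event_id, event, ttl_seconds):
    """Option 2: one hash per event, one hash field per property."""
    key = f"event:{event_id}"
    fields = []
    for name, value in event.items():
        # Hash values are stored as strings, so 28 becomes "28" and the
        # reader has to know the original type of each field.
        fields += [name, str(value)]
    # HMSET has no expiry argument, so the TTL needs a separate EXPIRE.
    return [("HMSET", key, *fields), ("EXPIRE", key, str(ttl_seconds))]
```

With these assumptions, option 1 needs a single command per event, whereas
option 2 needs two and drops the type information of the values.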
[1] https://redis.io/
[2] https://issues.apache.org/jira/browse/STREAMPIPES-121
[3] https://stackoverflow.com/questions/16375188/redis-strings-vs-redis-hashes-to-represent-json-efficiency
[4] https://redislabs.com/redis-enterprise/redis-json/
[5] https://oss.redislabs.com/redisjson/
Regards,
Grainier.