Hi all,

I've sent PR [1] with the initial implementation. Please review and merge.

[1] https://github.com/apache/incubator-streampipes-extensions/pull/12

Thanks,
Grainier.

On Mon, 11 May 2020 at 01:20, Dominik Riemer <[email protected]> wrote:

> Hi Grainier,
>
> very cool! A Redis sink would be awesome.
> Since I haven't worked a lot with Redis in the past, I don't have a strong
> opinion, just some thoughts:
> I guess the answer depends on how users will use events stored in Redis:
> whether they will need to access single fields or the whole event. I'd
> guess that most users will access whole events, which would lead to
> option 1.
> Maybe we could start with 1 and later on add an option in the pipeline
> element configuration where users can switch between both options?
>
> I'll be happy to help you with the SDK in case you have any questions - I
> know that our documentation has some potential for improvement, so feel
> free to ask 😉
>
> Dominik
>
>
> -----Original Message-----
> From: Grainier Perera <[email protected]>
> Sent: Sunday, May 10, 2020 6:20 PM
> To: [email protected]
> Subject: DataSink for Redis
>
> Hi all,
>
> I'm planning to implement a data sink that forwards and stores events in
> Redis[1][2]. But I'd like to get some feedback and opinions from you before
> proceeding.
>
> The question I have is: since Redis is essentially a key-value store, and
> we have a structured event to persist, what should the key and value be?
> The possible approaches are as follows[3]:
>
> 1. Store the entire object as a JSON-encoded string in a single key.
>
> * SET event:{id} '{"sensorId":"001", "temp":28}'*
>
>
>    - Pro: faster when accessing all the fields of the event at once.
>    - Pro: works with nested objects (but I don't think we have any nested
>    objects).
>    - Pro: can set the TTL.
>    - Con: slower when accessing a single field or a subset of fields of
>    the event.
>    - Con: JSON parsing is required to retrieve fields, although it is
>    quite fast.
>
>
> 2. Store each Object's properties in a Redis hash.
>
> * HMSET event:{id} sensorId "001"*
>
> * HMSET event:{id} temp "28"*
>
>
>    - Pro: can set the TTL.
>    - Pro: no need to parse JSON strings.
>    - Pro: faster when accessing a single field or a subset of fields of
>    the event.
>    - Con: slower when accessing all the fields of the event.
>
>
> 3. Store each Object as a JSON string in a Redis hash.
>
> * HMSET events {id1} '{"sensorId":"001", "temp":28}'*
>
> * HMSET events {id2} '{"sensorId":"002", "temp":32}'*
>
>
>    - Pro: fewer keys to work with.
>    - Con: can't set a TTL per event (EXPIRE applies to the whole "events"
>    key).
>    - Con: JSON parsing is required to retrieve fields.
>    - Con: slower when accessing a single field or a subset of fields of
>    the event.
>
>
> 4. Store each property of each Object in a dedicated key.
>
> * SET event:{id}:sensorId "001"*
>
> * SET event:{id}:temp 28*
>
>
>    - Pro: can set the TTL per field (but it's not necessary for our
>    scenario).
>    - Pro: no need to parse JSON strings.
>    - Pro: faster when accessing a single field or a subset of fields of
>    the event.
>    - Con: slower when accessing all the fields of the event.
>
>
> 5. Use the RedisJSON[4][5] module and store each event as a JSON document.
>
> * JSON.SET event . '{"sensorId":"001", "temp":28}'*
>
>
>    - Pro: faster manipulation of JSON documents.
>    - Pro: faster when accessing single/multiple fields of the event.
>    - Pro: can set the TTL.
>    - Con: requires RedisJSON module.
>
>
> IMO, options 1 and 2 are the best choices, given that both allow a TTL for
> purging. Which do you think is best? Your feedback is highly appreciated.
>
> [1] https://redis.io/
> [2] https://issues.apache.org/jira/browse/STREAMPIPES-121
> [3] https://stackoverflow.com/questions/16375188/redis-strings-vs-redis-hashes-to-represent-json-efficiency
> [4] https://redislabs.com/redis-enterprise/redis-json/
> [5] https://oss.redislabs.com/redisjson/
>
> Regards,
> Grainier.
>
>
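For reference, here is a minimal sketch of how options 1 and 2 from the quoted mail map to Redis commands, including the TTL and the idea of switching between both layouts through a configuration value. It is not taken from the PR: the function names, the "mode" values and the ttl_seconds parameter are all hypothetical, and the code only builds command argument lists rather than talking to a Redis server.

```python
import json


def string_commands(event_id, event, ttl_seconds=None):
    """Option 1: the whole event as one JSON string under a single key."""
    key = f"event:{event_id}"
    command = ["SET", key, json.dumps(event, separators=(",", ":"))]
    if ttl_seconds is not None:
        # SET accepts an EX argument, so value and TTL go in one command.
        command += ["EX", str(ttl_seconds)]
    return [command]


def hash_commands(event_id, event, ttl_seconds=None):
    """Option 2: one hash per event, one hash field per event property."""
    key = f"event:{event_id}"
    command = ["HMSET", key]
    for field, value in event.items():
        # Hash values are flat strings, so nested objects would not fit here.
        command += [field, str(value)]
    commands = [command]
    if ttl_seconds is not None:
        # HMSET has no TTL argument; EXPIRE on the key is a separate command.
        commands.append(["EXPIRE", key, str(ttl_seconds)])
    return commands


def build_commands(event_id, event, mode="string", ttl_seconds=None):
    """Pick the layout from a (hypothetical) sink configuration value."""
    builders = {"string": string_commands, "hash": hash_commands}
    if mode not in builders:
        raise ValueError(f"unknown mode: {mode}")
    return builders[mode](event_id, event, ttl_seconds)


if __name__ == "__main__":
    event = {"sensorId": "001", "temp": 28}
    for mode in ("string", "hash"):
        for command in build_commands("42", event, mode, ttl_seconds=3600):
            print(" ".join(command))
```

One visible difference between the two layouts: option 1 needs a single command per event even with a TTL, while option 2 needs a second EXPIRE command, so a real sink would probably want to send both in one pipeline or transaction.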
