Hi all, I've sent PR [1] with the initial implementation. Please review and merge.
[1] https://github.com/apache/incubator-streampipes-extensions/pull/12

Thanks,
Grainier.

On Mon, 11 May 2020 at 01:20, Dominik Riemer <[email protected]> wrote:

> Hi Grainier,
>
> very cool! A Redis sink would be awesome.
> Since I haven't worked a lot with Redis in the past, I don't have a strong
> opinion, just some thoughts:
> I guess the answer depends on how users will use events stored in Redis,
> whether they will need to access single fields or the whole event. I'd
> guess that most users will access whole events, which would lead to
> option 1.
> Maybe we could start with 1 and later on add an option in the pipeline
> element configuration where users can switch between both options?
>
> I'll be happy to help you with the SDK in case you have any questions - I
> know that our documentation has some potential for improvement, so feel
> free to ask 😉
>
> Dominik
>
>
> -----Original Message-----
> From: Grainier Perera <[email protected]>
> Sent: Sunday, May 10, 2020 6:20 PM
> To: [email protected]
> Subject: DataSink for Redis
>
> Hi all,
>
> I'm planning to implement a data sink that forwards and stores events in
> Redis [1][2], but I'd like to get some feedback and opinions from you
> before proceeding.
>
> The question I have is: since Redis is merely a key-value store and we
> have a structured event to persist, what should the key and value be?
> The possible approaches are as follows [3]:
>
> 1. Store the entire object as a JSON-encoded string in a single key.
>
>    SET event:{id} '{"sensorId":"001", "temp":28}'
>
>    - Pro: faster when accessing all the fields of the event at once.
>    - Pro: works with nested objects (but I don't think we have any nested
>      objects).
>    - Pro: can set the TTL.
>    - Con: slower when accessing a single field or a subset of fields of
>      the event.
>    - Con: JSON parsing is required to retrieve fields. However, it's quite
>      fast.
>
> 2. Store each object's properties in a Redis hash.
>
>    HMSET event:{id} sensorId "001"
>    HMSET event:{id} temp "28"
>
>    - Pro: can set the TTL.
>    - Pro: no need to parse JSON strings.
>    - Pro: faster when accessing a single field or a subset of fields of
>      the event.
>    - Con: slower when accessing all the fields of the event.
>
> 3. Store each object as a JSON string in a Redis hash.
>
>    HMSET events {id1} '{"sensorId":"001", "temp":28}'
>    HMSET events {id2} '{"sensorId":"002", "temp":32}'
>
>    - Pro: fewer keys to work with.
>    - Con: can't set the TTL.
>    - Con: JSON parsing is required to retrieve fields.
>    - Con: slower when accessing a single field or a subset of fields of
>      the event.
>
> 4. Store each property of each object in a dedicated key.
>
>    SET event:{id}:sensorId "001"
>    SET event:{id}:temp 28
>
>    - Pro: can set the TTL per field (but it's not necessary for our
>      scenario).
>    - Pro: no need to parse JSON strings.
>    - Pro: faster when accessing a single field or a subset of fields of
>      the event.
>    - Con: slower when accessing all the fields of the event.
>
> 5. Use the RedisJSON [4][5] module and store each event as a JSON
>    document.
>
>    JSON.SET event . '{"sensorId":"001", "temp":28}'
>
>    - Pro: faster manipulation of JSON documents.
>    - Pro: faster when accessing single/multiple fields of the event.
>    - Pro: can set the TTL.
>    - Con: requires the RedisJSON module.
>
> IMO, 1 and 2 would be the best choices, given that both allow setting a
> TTL for purging. Which do you think is best? Your feedback is highly
> appreciated.
>
> [1] https://redis.io/
> [2] https://issues.apache.org/jira/browse/STREAMPIPES-121
> [3] https://stackoverflow.com/questions/16375188/redis-strings-vs-redis-hashes-to-represent-json-efficiency
> [4] https://redislabs.com/redis-enterprise/redis-json/
> [5] https://oss.redislabs.com/redisjson/
>
> Regards,
> Grainier.
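
For anyone comparing options 1 and 2 from the quoted thread, here is a rough sketch of what each one would send to Redis for a single event. It is only an illustration in Python, not the code in the PR: the function names `string_commands` and `hash_commands` and the `ttl_seconds` parameter are made up for this example, and the functions only build the command arguments, they do not talk to a server. I've used HSET rather than HMSET on the assumption of Redis 4.0 or later, where HSET accepts several field/value pairs and HMSET is deprecated.

```python
import json


def string_commands(event_id, event, ttl_seconds=None):
    """Option 1: the whole event as one JSON string under a single key."""
    key = f"event:{event_id}"
    # Compact separators keep the stored JSON free of extra whitespace.
    cmd = ["SET", key, json.dumps(event, separators=(",", ":"))]
    if ttl_seconds is not None:
        # SET can carry the expiry itself via the EX option.
        cmd += ["EX", str(ttl_seconds)]
    return [cmd]


def hash_commands(event_id, event, ttl_seconds=None):
    """Option 2: one hash per event, one hash field per event property."""
    key = f"event:{event_id}"
    cmd = ["HSET", key]
    for field, value in event.items():
        # Hash values are plain strings, so type information is lost here.
        cmd += [field, str(value)]
    cmds = [cmd]
    if ttl_seconds is not None:
        # A hash needs a separate EXPIRE on the key; the TTL covers the whole hash.
        cmds.append(["EXPIRE", key, str(ttl_seconds)])
    return cmds


if __name__ == "__main__":
    event = {"sensorId": "001", "temp": 28}
    for build in (string_commands, hash_commands):
        for command in build("42", event, ttl_seconds=60):
            print(" ".join(command))
```

One difference this makes visible: option 1 sets value and TTL in a single command, while option 2 needs a second command for the TTL and stores every value as a string, so a reader has to know the field types to convert them back.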
