Hi Grainer,

The sink looks very cool, and I have merged your PR.

I have a question regarding the key field. 

Currently, users can enter either '-' or a 'runtimeName' in a 
requiredTextParameter.
When '-' is entered, a unique counter is used for the key, right?
The problem is that when a user enters a 'runtimeName', we cannot provide any 
input validation.
If the primary key is not part of the event, the user only sees an error once 
the pipeline is started and has to go back and edit the pipeline.
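
To check that I understand the current behaviour, here is a minimal Java 
sketch of how I picture the key being resolved. The class name 
RedisKeyResolver, the "event:" prefix, the counter starting at 0 and the 
event being a plain Map are all my own assumptions for illustration; this is 
not the code from the PR and not an SDK API.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration only; not part of the StreamPipes SDK.
final class RedisKeyResolver {
    // Placeholder value meaning "no key field, use a counter instead".
    static final String AUTO_KEY = "-";

    private final String keyField;
    private final AtomicLong counter = new AtomicLong();

    RedisKeyResolver(String keyField) {
        this.keyField = keyField;
    }

    // Returns the Redis key for one event (the "event:" prefix is an assumption).
    String resolve(Map<String, ?> event) {
        if (AUTO_KEY.equals(keyField)) {
            // '-' was entered: fall back to a unique, increasing counter.
            return "event:" + counter.getAndIncrement();
        }
        Object value = event.get(keyField);
        if (value == null) {
            // A free-text runtimeName cannot be validated up front,
            // so a missing field is only detected here, at runtime.
            throw new IllegalArgumentException(
                "Key field '" + keyField + "' is not present in the event");
        }
        return "event:" + value;
    }

    public static void main(String[] args) {
        RedisKeyResolver byField = new RedisKeyResolver("sensorId");
        System.out.println(byField.resolve(Map.of("sensorId", "001", "temp", 28)));
    }
}
```

The exception branch is the case that today only shows up after the pipeline 
has been started.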

Alternatively, we could use a mapping property for the key field. The user 
would then see a drop-down menu of all event properties and could select one. 
This way we can ensure that the key is part of the event, but users would no 
longer be able to select '-'.

What do you think is a common use case for the Redis sink?
Could a use case for Redis be to store the last event per asset (e.g. sensor 
or machine)?
In that case, we could use the mapping property solution and further extend it 
with a dimension property requirement.
Users could then select a property representing an identifier (e.g. a machine 
ID), and one entry would be created in Redis for each machine.
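
To illustrate the "last event per asset" idea, here is a rough Java sketch. 
An in-memory Map stands in for Redis, and the names LastEventPerAsset, 
onEvent, latest and the "event:" key prefix are hypothetical, chosen only for 
this example. Writing to the same key again simply overwrites the previous 
value, which is also how a Redis SET on an existing key behaves.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: an in-memory map stands in for the Redis key space.
final class LastEventPerAsset {
    private final Map<String, Map<String, Object>> store = new LinkedHashMap<>();
    private final String idField;

    // idField is the event property selected as identifier, e.g. "machineId".
    LastEventPerAsset(String idField) {
        this.idField = idField;
    }

    // Stores the event under a key derived from its identifier,
    // replacing any earlier event of the same asset.
    void onEvent(Map<String, ?> event) {
        Object id = event.get(idField);
        if (id == null) {
            throw new IllegalArgumentException(
                "Identifier '" + idField + "' is not present in the event");
        }
        store.put("event:" + id, new HashMap<String, Object>(event));
    }

    // Returns the most recent event of the given asset, or null if none was seen.
    Map<String, Object> latest(String id) {
        return store.get("event:" + id);
    }

    // Number of entries, i.e. number of distinct assets seen so far.
    int size() {
        return store.size();
    }

    public static void main(String[] args) {
        LastEventPerAsset sink = new LastEventPerAsset("machineId");
        sink.onEvent(Map.of("machineId", "m1", "temp", 28));
        sink.onEvent(Map.of("machineId", "m1", "temp", 31));
        System.out.println(sink.latest("m1"));
    }
}
```

With a mapping property restricted to dimension properties, the identifier 
would always exist in the event, so the exception above should not occur in 
practice.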


What do you think?

Philipp



> On 11. May 2020, at 17:51, Grainier Perera <grainier.per...@gmail.com> wrote:
> 
> Hi all,
> 
> I've sent PR [1] with the initial implementation. Please review and merge.
> 
> [1] https://github.com/apache/incubator-streampipes-extensions/pull/12
> 
> Thanks,
> Grainier.
> 
> On Mon, 11 May 2020 at 01:20, Dominik Riemer <rie...@apache.org> wrote:
> 
>> Hi Grainier,
>> 
>> very cool! A Redis sink would be awesome.
>> Since I haven't worked a lot with Redis in the past, I don't have a strong
>> opinion, just some thoughts:
>> I guess the answer depends on how users will use the events stored in
>> Redis: whether they will need to access single fields or the whole
>> event. I'd guess that most users will access whole events, which would
>> lead to option 1.
>> Maybe we could start with 1 and later on add an option in the pipeline
>> element configuration where users can switch between both options?
>> 
>> I'll be happy to help you with the SDK in case you have any questions - I
>> know that our documentation has some potential for improvement, so feel
>> free to ask 😉
>> 
>> Dominik
>> 
>> 
>> -----Original Message-----
>> From: Grainier Perera <grainier.per...@gmail.com>
>> Sent: Sunday, May 10, 2020 6:20 PM
>> To: dev@streampipes.apache.org
>> Subject: DataSink for Redis
>> 
>> Hi all,
>> 
>> I'm planning to implement a data sink that forwards and stores events in
>> Redis[1][2]. But I'd like to get some feedback and opinions from you before
>> proceeding.
>> 
>> My question is: since Redis is merely a key-value store, and we have a
>> structured event to be persisted, what should the key and value be?
>> The possible approaches are as follows[3]:
>> 
>> 1. Store the entire object as a JSON-encoded string in a single key.
>> 
>> * SET event:{id} '{"sensorId":"001", "temp":28}'*
>> 
>> 
>>   - Pro: faster when accessing all the fields of the event at once.
>>   - Pro: works with nested objects (but I don't think we have any nested
>>   objects).
>>   - Pro: can set the TTL.
>>   - Con: slower when accessing a single or subset of fields of the event.
>>   - Con: JSON parsing is required to retrieve fields. However, it's quite
>>   fast.
>> 
>> 
>> 2. Store each Object's properties in a Redis hash.
>> 
>> * HMSET event:{id} sensorId "001"*
>> 
>> * HMSET event:{id} temp "28"*
>> 
>> 
>>   - Pro: can set the TTL.
>>   - Pro: no need to parse JSON strings.
>>   - Pro: faster when accessing a single or subset of fields of the event.
>>   - Con: slower when accessing all the fields of the event.
>> 
>> 
>> 3. Store each Object as a JSON string in a Redis hash.
>> 
>> * HMSET events {id1} '{"sensorId":"001", "temp":28}'*
>> 
>> * HMSET events {id2} '{"sensorId":"002", "temp":32}'*
>> 
>> 
>>   - Pro: fewer keys to work with.
>>   - Con: can't set the TTL.
>>   - Con: JSON parsing is required to retrieve fields.
>>   - Con: slower when accessing a single or subset of fields of the event.
>> 
>> 
>> 4. Store each property of each Object in a dedicated key.
>> 
>> * SET event:{id}:sensorId "001"*
>> 
>> * SET event:{id}:temp 28*
>> 
>> 
>>   - Pro: can set the TTL per field (but it's not necessary for our
>>   scenario).
>>   - Pro: no need to parse JSON strings.
>>   - Pro: faster when accessing a single or subset of fields of the event.
>>   - Con: slower when accessing all the fields of the event.
>> 
>> 
>> 5. Use RedisJSON[4][5] module and store each event as a JSON.
>> 
>> * JSON.SET event . '{"sensorId":"001", "temp":28}'*
>> 
>> 
>>   - Pro: faster manipulation of JSON documents.
>>   - Pro: faster when accessing single/multiple fields of the event.
>>   - Pro: can set the TTL.
>>   - Con: requires RedisJSON module.
>> 
>> 
>> IMO, 1 & 2 would be the best choices, given that they both allow a TTL for
>> purging. Which do you think is best? Your feedback is highly appreciated.
>> 
>> [1] https://redis.io/
>> [2] https://issues.apache.org/jira/browse/STREAMPIPES-121
>> [3] https://stackoverflow.com/questions/16375188/redis-strings-vs-redis-hashes-to-represent-json-efficiency
>> [4] https://redislabs.com/redis-enterprise/redis-json/
>> [5] https://oss.redislabs.com/redisjson/
>> 
>> Regards,
>> Grainier.
>> 
>> 

