Hi Grainier,

thank you! I directly merged the pull request with the docker-compose file.
@Patrick, what else do we have to add when we want to use Redis in Kubernetes?
Do we also have to add a template in [1], or is the docker-compose file sufficient?

Philipp

[1] https://github.com/apache/incubator-streampipes-installer/tree/dev/helm-chart/templates/optional-external-services

On 2020/05/13 03:01:37, Grainier Perera <[email protected]> wrote:
> Hi Philipp,
>
> I've created an issue [1] and added a docker-compose file for Redis in
> PR [2]. Please review and merge.
>
> [1] https://issues.apache.org/jira/browse/STREAMPIPES-124
> [2] https://github.com/apache/incubator-streampipes-installer/pull/6
>
> Thanks,
> Grainier.
>
> On Wed, 13 May 2020 at 02:01, Philipp Zehnder <[email protected]> wrote:
>
> > Hi Grainier,
> >
> > your PR looks very good.
> > Do you have a docker-compose file for Redis?
> > I would like to add it to our CLI [1] in the service directory.
> >
> > This makes it easy for StreamPipes users to set up an instance and use your
> > new sink.
> > A user just has to add ‘redis’ to the system file and the container is
> > then started with the rest of the system.
> > We already provided docker-compose files for other DBs.
> >
> > Philipp
> >
> > [1] https://github.com/apache/incubator-streampipes-installer/tree/dev/cli
> >
> > > On 12. May 2020, at 18:09, Grainier Perera <[email protected]> wrote:
> > >
> > > Hi Philipp,
> > >
> > > I agree with your opinion on the key-field. So I've modified it with an
> > > option to either use auto-increment or use an existing event field as the
> > > key field [1]. Now it will have a radio button to select True/False on
> > > auto-increment. And if it's True, key-field will be ignored and a
> > > sequential numeric key will be used. Otherwise, it'll use the selected
> > > field as the key field.
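The key selection described in the message above can be sketched roughly as follows. This is a minimal Python illustration and not the sink's actual code: `make_key_selector`, `auto_increment` and `key_field` are hypothetical names chosen for this sketch, and starting the counter at 1 is an assumption.

```python
import itertools


def make_key_selector(auto_increment, key_field=None):
    """Return a function that maps an event (a dict) to its Redis key.

    Hypothetical helper; all names here are assumptions for illustration.
    """
    counter = itertools.count(1)  # sequential numeric keys: 1, 2, 3, ...

    def select_key(event):
        if auto_increment:
            # key_field is ignored when auto-increment is enabled
            return str(next(counter))
        if key_field not in event:
            # the runtime failure that selecting from known event fields avoids
            raise KeyError(f"key field {key_field!r} is not in the event")
        return str(event[key_field])

    return select_key


by_counter = make_key_selector(auto_increment=True, key_field="sensorId")
by_field = make_key_selector(auto_increment=False, key_field="sensorId")

event = {"sensorId": "001", "temp": 28}
print(by_counter(event), by_counter(event))  # 1 2
print(by_field(event))                       # 001
```

With auto-increment disabled, every event with the same field value overwrites the same key, which is what the "last event per asset" use case relies on.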
> > >
> > > When it comes to use-cases, a user can:
> > >
> > > 1. Store the last event per asset (asset id as the key-field,
> > > auto-increment disabled, index -1).
> > > 2. Collect all the events per asset for diagnostics, replaying,
> > > etc. (auto-increment enabled, different index per asset; an index is
> > > like a separate DB with a distinct keyspace, independent from the others [2]).
> > > 3. Collect recent events with data purging (similar to 1 and 2, but
> > > with an expiration time).
> > >
> > > So this new approach would allow all the above scenarios. What do
> > > you think?
> > >
> > > [1] https://github.com/apache/incubator-streampipes-extensions/pull/13
> > > [2] https://www.mikeperham.com/2015/09/24/storing-data-with-redis/
> > >
> > > Regards,
> > > Grainier.
> > >
> > > On Tue, 12 May 2020 at 12:36, Philipp Zehnder <[email protected]> wrote:
> > >
> > >> Hi Grainier,
> > >>
> > >> the sink looks very cool and I merged your PR.
> > >>
> > >> I have a question regarding the key field.
> > >>
> > >> Currently users can either select ‘-’ or a ‘runtimeName’ as a
> > >> requiredTextParameter.
> > >> When ‘-’ is selected a unique counter is used for the key, right?
> > >> The problem is that when a user selects a ‘runtimeName’ we cannot provide any
> > >> input validation.
> > >> If the primaryKey is not within the event, the user will see an error when
> > >> the pipeline is started and has to go back and edit the pipeline.
> > >>
> > >> Alternatively we could use a mapping property for the key field; then the
> > >> user would see a drop-down menu of all event properties and could select
> > >> one.
> > >> This way we can ensure that the key is within the event, but then we do
> > >> not have the chance to select ‘-’.
> > >>
> > >> What do you think is a common use case for the Redis sink?
> > >> Could a use case for Redis be to store the last event per asset (e.g. a
> > >> sensor or machine)?
> > >> Therefore, we could use the mapping property solution and further extend
> > >> it with a dimension property requirement.
> > >> Then users can select a property representing an identifier (e.g. a machine
> > >> id; for each machine an entry would be created in Redis).
> > >>
> > >> What do you think?
> > >>
> > >> Philipp
> > >>
> > >>> On 11. May 2020, at 17:51, Grainier Perera <[email protected]> wrote:
> > >>>
> > >>> Hi all,
> > >>>
> > >>> I've sent PR [1] with the initial implementation. Please review and
> > >>> merge.
> > >>>
> > >>> [1] https://github.com/apache/incubator-streampipes-extensions/pull/12
> > >>>
> > >>> Thanks,
> > >>> Grainier.
> > >>>
> > >>> On Mon, 11 May 2020 at 01:20, Dominik Riemer <[email protected]> wrote:
> > >>>
> > >>>> Hi Grainier,
> > >>>>
> > >>>> very cool! A Redis sink would be awesome.
> > >>>> Since I haven't worked a lot with Redis in the past, I don't have a strong
> > >>>> opinion, just some thoughts:
> > >>>> I guess the answer depends on how users will use events
> > >>>> stored in Redis, whether they will need to access single fields or the
> > >>>> whole event. I'd guess that most users will access whole events,
> > >>>> which would lead to option 1.
> > >>>> Maybe we could start with 1 and later on add an option in the pipeline
> > >>>> element configuration where users can switch between both options?
> > >>>>
> > >>>> I'll be happy to help you with the SDK in case you have any questions -
> > >>>> I know that our documentation has some potential for improvement, so feel
> > >>>> free to ask 😉
> > >>>>
> > >>>> Dominik
> > >>>>
> > >>>>
> > >>>> -----Original Message-----
> > >>>> From: Grainier Perera <[email protected]>
> > >>>> Sent: Sunday, May 10, 2020 6:20 PM
> > >>>> To: [email protected]
> > >>>> Subject: DataSink for Redis
> > >>>>
> > >>>> Hi all,
> > >>>>
> > >>>> I'm planning to implement a data sink that forwards and stores events into
> > >>>> Redis [1][2]. But I'd like to get some feedback and opinions from you before
> > >>>> proceeding.
> > >>>>
> > >>>> The question that I have is: since Redis is merely a key-value store, and
> > >>>> we have a structured event to be persisted, what would the key-value be?
> > >>>> The following are the possible approaches [3]:
> > >>>>
> > >>>> 1. Store the entire object as a JSON-encoded string in a single key.
> > >>>>
> > >>>>    SET event:{id} '{"sensorId":"001", "temp":28}'
> > >>>>
> > >>>> - Pro: faster when accessing all the fields of the event at once.
> > >>>> - Pro: works with nested objects (but I don't think we have any nested
> > >>>>   objects).
> > >>>> - Pro: can set the TTL.
> > >>>> - Con: slower when accessing a single or subset of fields of the event.
> > >>>> - Con: JSON parsing is required to retrieve fields. However, it's quite
> > >>>>   fast.
> > >>>>
> > >>>> 2. Store each Object's properties in a Redis hash.
> > >>>>
> > >>>>    HMSET event:{id} sensorId "001"
> > >>>>    HMSET event:{id} temp "28"
> > >>>>
> > >>>> - Pro: can set the TTL.
> > >>>> - Pro: no need to parse JSON strings.
> > >>>> - Pro: faster when accessing a single or subset of fields of the event.
> > >>>> - Con: slower when accessing all the fields of the event.
> > >>>>
> > >>>> 3. Store each Object as a JSON string in a Redis hash.
> > >>>>
> > >>>>    HMSET events {id1} '{"sensorId":"001", "temp":28}'
> > >>>>    HMSET events {id2} '{"sensorId":"002", "temp":32}'
> > >>>>
> > >>>> - Pro: fewer keys to work with.
> > >>>> - Con: can't set the TTL.
> > >>>> - Con: JSON parsing is required to retrieve fields.
> > >>>> - Con: slower when accessing a single or subset of fields of the event.
> > >>>>
> > >>>> 4. Store each property of each Object in a dedicated key.
> > >>>>
> > >>>>    SET event:{id}:sensorId "001"
> > >>>>    SET event:{id}:temp 28
> > >>>>
> > >>>> - Pro: can set the TTL per field (but it's not necessary for our
> > >>>>   scenario).
> > >>>> - Pro: no need to parse JSON strings.
> > >>>> - Pro: faster when accessing a single or subset of fields of the event.
> > >>>> - Con: slower when accessing all the fields of the event.
> > >>>>
> > >>>> 5. Use the RedisJSON [4][5] module and store each event as a JSON document.
> > >>>>
> > >>>>    JSON.SET event . '{"sensorId":"001", "temp":28}'
> > >>>>
> > >>>> - Pro: faster manipulation of JSON documents.
> > >>>> - Pro: faster when accessing single/multiple fields of the event.
> > >>>> - Pro: can set the TTL.
> > >>>> - Con: requires the RedisJSON module.
> > >>>>
> > >>>> IMO, 1 & 2 would be the best choices given that they both allow a TTL for
> > >>>> purging. What do you think is best? Your feedback is highly appreciated.
> > >>>>
> > >>>> [1] https://redis.io/
> > >>>> [2] https://issues.apache.org/jira/browse/STREAMPIPES-121
> > >>>> [3] https://stackoverflow.com/questions/16375188/redis-strings-vs-redis-hashes-to-represent-json-efficiency
> > >>>> [4] https://redislabs.com/redis-enterprise/redis-json/
> > >>>> [5] https://oss.redislabs.com/redisjson/
> > >>>>
> > >>>> Regards,
> > >>>> Grainier.
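Approaches 1 and 2 from the original message, each with a TTL for purging, can be sketched as the Redis commands a sink might issue per event. This is a minimal Python sketch under stated assumptions: it only builds the command argument lists and never contacts a Redis server, the function names are hypothetical, and the `event:{id}` key layout is taken from the examples in the message. It uses HSET with several field/value pairs, which Redis 4.0 and later accept in place of HMSET.

```python
import json


def json_string_commands(event_id, event, ttl_seconds):
    """Approach 1: whole event as one JSON string, TTL set via SET ... EX."""
    key = f"event:{event_id}"
    return [["SET", key, json.dumps(event), "EX", str(ttl_seconds)]]


def hash_commands(event_id, event, ttl_seconds):
    """Approach 2: one hash field per event property, TTL set via EXPIRE."""
    key = f"event:{event_id}"
    hset = ["HSET", key]
    for field, value in event.items():
        hset += [field, str(value)]  # hash values are stored as strings
    return [hset, ["EXPIRE", key, str(ttl_seconds)]]


event = {"sensorId": "001", "temp": 28}
for command in json_string_commands(7, event, 3600) + hash_commands(7, event, 3600):
    print(command)
```

Approach 1 needs a single command per event, while approach 2 needs a second command for the TTL and flattens every value to a string, so a reader of the hash has to know that `temp` was numeric.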
