Hi Johannes,

yes, this is a very good idea. We should refactor the file adapters to store their files in the service Dominik described.
I created an issue for that: STREAMPIPES-80: Use internal file service in file adapters.
Philipp

> On 19. Feb 2020, at 20:59, Johannes Tex <[email protected]> wrote:
>
> Hi,
>
> I also think a service for file handling would be a good solution.
>
> At the moment we also use files for the Adapters that are stored in the
> Worker.
> Maybe this would be another use case for a file service?
>
> Johannes
>
> On 2020/02/19 06:58:11, Dominik Riemer <[email protected]> wrote:
>> Hi Philipp,
>>
>> yes, I think it makes sense to have a single service for handling files.
>> When writing the CSVMetadataEnrichment component for Chris, I started to add
>> simple file management to the backend and also extended the SDK with
>> methods to receive files from the backend (see
>> CsvMetadataEnrichmentController and FileServingResource in the backend).
>>
>> We could extend this, isolate the file management into an individual
>> microservice and add a simple API in front of it that can be used by all
>> services that need to store or receive files (e.g., also for the included
>> assets of pipeline elements, which could be documentation, icons or ML
>> models).
>>
>> Concerning HDFS, in my opinion this might be an option, but as we don't yet
>> have very large amounts of data to store, it would probably be a bit of
>> overkill here (one more distributed system to manage).
>>
>> Dominik
>>
>> -----Original Message-----
>> From: Philipp Zehnder <[email protected]>
>> Sent: Tuesday, February 18, 2020 6:28 PM
>> To: [email protected]
>> Subject: STREAMPIPES-75: Extend data lake sink to store images
>>
>> Hi all,
>>
>> I finished the implementation to store images in files instead of Base64
>> strings in InfluxDB.
>>
>> For the first version I mounted a local volume and added the images in a
>> folder in this volume.
>> I think this is a good starting point because the images are stored in a
>> local volume on the same host as the sink.
>> Now the question is: how can users access those images? I would suggest
>> extending the data lake REST API for that.
>> Therefore, the backend must mount the same volume as the internal sink
>> container with the data lake sink.
>>
>> Does any of you have an alternative solution?
>>
>> @Dominik, you already implemented a StreamPipes-internal file storage.
>> Could we use that for the images as well or would the frequency be too high?
>>
>> @all What about HDFS? We could set up HDFS for files, similar to InfluxDB,
>> as a shared service between multiple containers.
>>
>>
>> Philipp
>>
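
To make the "single service for handling files" idea from the thread a bit more concrete, here is a minimal Java sketch of the storage core such a service might wrap. Everything in it is an assumption for illustration: the class name SimpleFileStore, its methods store/load/list, and the flat directory layout are hypothetical and are not taken from the StreamPipes code base. It only assumes that files live in one directory (for example on a mounted volume) and are addressed by a plain file name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/**
 * Hypothetical sketch of a shared file store (not StreamPipes API).
 * Assumption: all files live flat in one base directory, e.g. a mounted volume.
 */
final class SimpleFileStore {

    private final Path baseDir;

    SimpleFileStore(Path baseDir) throws IOException {
        // Create the directory if needed and keep a normalized absolute path.
        this.baseDir = Files.createDirectories(baseDir).toAbsolutePath().normalize();
    }

    /** Writes the content under the given name, replacing an existing file. */
    Path store(String filename, byte[] content) throws IOException {
        Path target = resolve(filename);
        Files.write(target, content);
        return target;
    }

    /** Reads the complete content of a previously stored file. */
    byte[] load(String filename) throws IOException {
        return Files.readAllBytes(resolve(filename));
    }

    /** Returns the names of all stored files in sorted order. */
    List<String> list() throws IOException {
        try (Stream<Path> files = Files.list(baseDir)) {
            return files.map(p -> p.getFileName().toString())
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    /** Maps a file name to a path directly inside baseDir; rejects anything else. */
    private Path resolve(String filename) {
        Path target = baseDir.resolve(filename).normalize();
        if (!baseDir.equals(target.getParent())) {
            throw new IllegalArgumentException("Invalid file name: " + filename);
        }
        return target;
    }
}
```

In this sketch a data lake sink would call store("image-1.png", bytes), and the backend (or an adapter) would call load("image-1.png") on the same directory. A REST API in front of it, as suggested in the thread, would map requests onto these three methods; that layer is left out here on purpose.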
