Hi Johannes,

yes, this is a very good idea. We should refactor the file adapters to store
their files in the service Dominik described.
I created an issue for that: STREAMPIPES-80: Use internal file service in file
adapters.

Philipp


> On 19. Feb 2020, at 20:59, Johannes Tex <[email protected]> wrote:
> 
> Hi,
> 
> I also think a service for file handling would be a good solution. 
> 
> At the moment, we also use files for the adapters, and these files are
> stored in the worker.
> Maybe this would be another use case for a file service?
> 
> Johannes 
> 
> On 2020/02/19 06:58:11, Dominik Riemer <[email protected]> wrote: 
>> Hi Philipp,
>> 
>> yes, I think it makes sense to have a single service for handling files.
>> When writing the CSVMetadataEnrichment component for Chris, I started to add
>> simple file management to the backend and also extended the SDK with
>> methods to retrieve files from the backend (see
>> CsvMetadataEnrichmentController and FileServingResource in the backend).
>> 
>> We could extend this, isolate the file management into an individual
>> microservice and add a simple API in front of it that can be used by all
>> services that need to store or retrieve files (e.g., also for the included
>> assets of pipeline elements, which could be documentation, icons or ML
>> models).
>> 
>> Concerning HDFS, in my opinion this might be an option, but as we don't have
>> very large amounts of data to store yet, it would probably be a bit of
>> overkill here (one more distributed system to manage).
>> 
>> Dominik
>> 
>> -----Original Message-----
>> From: Philipp Zehnder <[email protected]> 
>> Sent: Tuesday, February 18, 2020 6:28 PM
>> To: [email protected]
>> Subject: STREAMPIPES-75: Extend data lake sink to store images
>> 
>> Hi all,
>> 
>> I finished the implementation that stores images as files instead of as
>> Base64 strings in InfluxDB.
>> 
>> For the first version, I mounted a local volume and stored the images in a
>> folder in this volume.
>> I think this is a good starting point because the images are stored in a
>> local volume on the same host as the sink.
>> Now the question is: how can users access those images? I would suggest
>> extending the data lake REST API for that.
>> For this to work, the backend must mount the same volume as the internal
>> sink container that runs the data lake sink.
>> 
>> Does any of you have an alternative solution?
>> 
>> @Dominik, you already implemented a StreamPipes-internal file storage.
>> Could we use that for the images as well or would the frequency be too high?
>> 
>> @all What about HDFS? We could set up HDFS for files, similar to InfluxDB,
>> as a shared service between multiple containers.
>> 
>> 
>> Philipp
>> 
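
For illustration, the file-based image storage discussed in this thread could be sketched roughly as follows. This is only a minimal sketch under assumptions: the class name `ImageFileStore` and its methods `store` and `load` are hypothetical and not part of StreamPipes, and the base directory stands in for the mounted volume. The idea is that the sink decodes the Base64 string, writes the bytes to the volume, and keeps only the returned file name in the database, while a REST resource that mounts the same volume could use `load` to serve the image.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.UUID;

// Hypothetical helper for illustration only; not an existing StreamPipes class.
public class ImageFileStore {

    // Directory standing in for the mounted volume (assumption).
    private final Path baseDir;

    public ImageFileStore(Path baseDir) throws IOException {
        // Normalize once so the path check in load() compares like with like.
        this.baseDir = baseDir.toAbsolutePath().normalize();
        Files.createDirectories(this.baseDir);
    }

    /** Decodes a Base64 image, writes it to the volume, and returns the generated file name. */
    public String store(String base64Image, String extension) throws IOException {
        byte[] bytes = Base64.getDecoder().decode(base64Image);
        String fileName = UUID.randomUUID() + "." + extension;
        Files.write(baseDir.resolve(fileName), bytes);
        return fileName;
    }

    /** Reads the raw image bytes back, rejecting names that point outside the volume. */
    public byte[] load(String fileName) throws IOException {
        Path file = baseDir.resolve(fileName).normalize();
        if (!file.startsWith(baseDir)) {
            throw new IllegalArgumentException("File name escapes the base directory: " + fileName);
        }
        return Files.readAllBytes(file);
    }
}
```

Storing only the file name in the database keeps the stored events small; whether this holds up at a high image frequency is exactly the open question raised above.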

