Hi,

I created a wiki page [1]. Everyone is invited to contribute :)

[1] https://cwiki.apache.org/confluence/display/STREAMPIPES/Generic+Data+Store

Johannes

On 2020/02/24 22:54:14, "Dominik Riemer" <[email protected]> wrote:
> Hi Johannes,
>
> +1 for having both REST and streaming interfaces to store images. We can
> probably start with REST and add streaming interfaces later.
> I think the generic data store will be part of the general platform API
> service, where we can integrate all endpoints that will be required by
> external services (e.g., registered streams/processors/sinks, historical
> data, images).
> What do you think, should we create a wiki page to collect all requirements
> and design the endpoints we are going to need?
>
> Dominik
>
> -----Original Message-----
> From: Johannes Tex <[email protected]>
> Sent: Sunday, February 23, 2020 11:35 AM
> To: [email protected]
> Subject: Re: Image Labeling
>
> Hi,
>
> I think a simple generic data store API that supports the CRUD operations
> would be great.
>
> I think all CRUD operations should be available as a REST API, and the
> CREATE operation maybe additionally via a messaging protocol (Kafka, MQTT),
> which is used, e.g., by the Data-Lake-Sink to store the images. Or do you
> think that synchronous communication via REST is fast enough?
> In addition to reading entire files, the READ operation should have a stream
> interface that can be used directly by the adapters, for example.
>
> Johannes
>
>
> On 2020/02/21 18:49:58, Philipp Zehnder <[email protected]> wrote:
> > Hi,
> >
> > I also think we should store it either in a file in the same directory as
> > the image or in CouchDB.
> > For now I am not sure which solution is better. The only requirement is
> > that once a user downloads the data, the labels should be provided in a
> > COCO JSON file, but this is possible with both options.
> >
> > Since we now have multiple locations where we store data, we should
> > probably start a discussion of how to store application data within
> > StreamPipes.
> > It might make sense to have an internal (or external) API for components
> > and other services.
> > What do you think about that? What kind of features would such an API need?
> >
> > Philipp
> >
> > > On 19. Feb 2020, at 22:00, Johannes Tex <[email protected]> wrote:
> > >
> > > Hi,
> > >
> > > I'll start with @Dominik's question: The first intention was for it to be
> > > part of the Data-Explorer, with toggling between simple exploring and
> > > labeling. @Philipp opened an issue [STREAMPIPES-79] to refactor the data
> > > explorer; maybe in this context we could extend the data explorer with
> > > these two modes?
> > > To display images, for example, we need almost the same mechanism as is
> > > necessary for image labeling, except for the labeling itself. We also
> > > need to extend the datalake API for images, which leads to @Philipp's
> > > question.
> > >
> > > At the moment, the data lake API only supports data that can be
> > > aggregated (numeric data). For image labeling and viewing we need to
> > > extend the API. My proposal would be to create a paging API for images to
> > > receive the next, e.g., 10 images. It could look like this:
> > > "/datalake/<index>/<timestamp>/<page>". What do you think? As part of
> > > this necessary extension we can also create the API to save the
> > > annotations.
> > >
> > > I see three different options to save the annotations:
> > > * Influx -> save the annotation directly with the data point
> > >   - when exporting, a COCO file needs to be created
> > >   - needs an extra place to save (image) labels/categories
> > >   - needs to 'manipulate' the data point, which is not possible in Influx
> > >     (only delete and create a new one)
> > > * File
> > >   - need to handle a file
> > > * CouchDB
> > >   - file generation is needed
> > > My proposal is to use CouchDB to store the annotations.
> > >
> > > Johannes
> > >
> > >
> > > On 2020/02/17 21:12:38, Philipp Zehnder <[email protected]> wrote:
> > >> Hi Johannes,
> > >>
> > >> as for the API, do you think we can extend the dataset API, or should we
> > >> create a separate REST API for image annotation?
> > >>
> > >> Where do you plan to store the COCO annotation information? In files or
> > >> in a DB?
> > >>
> > >> Philipp
> > >>
> > >>> On 16. Feb 2020, at 19:51, Dominik Riemer <[email protected]> wrote:
> > >>>
> > >>> Hi Johannes,
> > >>> sounds good!
> > >>> I think bounding boxes and polygons are totally fine for the first
> > >>> prototype.
> > >>>
> > >>> How do you plan to integrate the labeling tool? Will it be part of the
> > >>> data explorer or do you plan to add a new component?
> > >>>
> > >>> Dominik
> > >>>
> > >>> On 2020/02/14 16:30:17, Johannes Tex <[email protected]> wrote:
> > >>>> Hi,
> > >>>>
> > >>>> Philipp started to extend the datalake sink to store images
> > >>>> [STREAMPIPES-75].
> > >>>> I have now started to create an image labeler that allows users to
> > >>>> label images in the datalake [STREAMPIPES-78]. The labels will be
> > >>>> stored in the COCO annotation format [1]. After labeling, the images
> > >>>> can be used to train an NN.
> > >>>>
> > >>>> The main features that the labeler should support:
> > >>>> - Labeling with bounding boxes
> > >>>> - Labeling with polygons
> > >>>>
> > >>>> Do you have additional features that should also be supported?
> > >>>>
> > >>>> Johannes
> > >>>>
> > >>>>
> > >>>> [1] http://cocodataset.org/#format-data
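
For illustration, the paging idea from the thread ("/datalake/<index>/<timestamp>/<page>", returning the next, e.g., 10 images) could be sketched roughly as below. This is only a sketch under assumptions, not the StreamPipes implementation: the ImageRecord type, the get_image_page function, the 0-based page numbering, and the use of epoch milliseconds are all hypothetical and were not specified in the discussion.

```python
from dataclasses import dataclass
from typing import List

PAGE_SIZE = 10  # "the next e.g. 10 images" from the thread


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical stand-in for one image stored under a data lake index."""
    timestamp: int  # assumed to be epoch milliseconds
    filename: str   # assumed reference to the stored image file


def get_image_page(records: List[ImageRecord], start_timestamp: int,
                   page: int, page_size: int = PAGE_SIZE) -> List[ImageRecord]:
    """Return page `page` (0-based, an assumption) of the images whose
    timestamp is at or after `start_timestamp`, oldest first."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    # Keep only images at or after the requested timestamp, sorted by time.
    candidates = sorted(
        (r for r in records if r.timestamp >= start_timestamp),
        key=lambda r: r.timestamp,
    )
    offset = page * page_size
    # Slicing past the end simply yields a shorter (or empty) page.
    return candidates[offset:offset + page_size]
```

A REST handler for the proposed path would then only need to parse <timestamp> and <page> from the URL and pass them to such a function. An empty result tells the client that there are no more images.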
