Hi

I created a Wiki page [1].
Everyone is invited to contribute :)

[1] https://cwiki.apache.org/confluence/display/STREAMPIPES/Generic+Data+Store

Johannes


On 2020/02/24 22:54:14, "Dominik Riemer" <[email protected]> wrote: 
> Hi Johannes,
> 
> +1 for having both REST and streaming interfaces to store images, we can 
> probably start with REST and add streaming interfaces later.
> I think the generic data store will be part of the general platform API 
> service, where we can integrate all endpoints that will be required by 
> external services (e.g., registered streams/processors/sinks, historical 
> data, images).
> What do you think, should we create a wiki page to collect all requirements 
> and design the endpoints we are going to need?
> 
> Dominik 
> 
> -----Original Message-----
> From: Johannes Tex <[email protected]> 
> Sent: Sunday, February 23, 2020 11:35 AM
> To: [email protected]
> Subject: Re: Image Labeling 
> 
> Hi,
> 
> I think a simple generic data store API that supports the CRUD operations 
> would be great.
> 
> I think all CRUD operations should be available as a REST API. The CREATE 
> operation could additionally be offered via a messaging protocol (Kafka, 
> MQTT), which would be used, e.g., by the Data-Lake-Sink to store the images. 
> Or do you think that synchronous communication via REST is fast enough?
> In addition to reading entire files, the READ operation should have a stream 
> interface that can be used directly by the adapters, for example.
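
To make the CRUD idea above concrete, here is a minimal in-memory sketch. Everything in it is hypothetical: the class name `InMemoryDataStore`, the method names, and the chunked `read_stream` are illustration only and not an existing StreamPipes API. A real store would sit behind REST (and possibly Kafka/MQTT) endpoints.

```python
import io
import uuid
from typing import Dict, Iterator


class InMemoryDataStore:
    """Hypothetical generic data store with CRUD operations (illustration only)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def create(self, data: bytes) -> str:
        """Store a new blob and return its generated id."""
        blob_id = uuid.uuid4().hex
        self._blobs[blob_id] = data
        return blob_id

    def read(self, blob_id: str) -> bytes:
        """Return the entire blob; raises KeyError if the id is unknown."""
        return self._blobs[blob_id]

    def read_stream(self, blob_id: str, chunk_size: int = 4096) -> Iterator[bytes]:
        """Yield the blob in chunks, e.g. for an adapter that reads incrementally."""
        buffer = io.BytesIO(self._blobs[blob_id])
        while True:
            chunk = buffer.read(chunk_size)
            if not chunk:
                break
            yield chunk

    def update(self, blob_id: str, data: bytes) -> None:
        """Replace an existing blob; raises KeyError if the id is unknown."""
        if blob_id not in self._blobs:
            raise KeyError(blob_id)
        self._blobs[blob_id] = data

    def delete(self, blob_id: str) -> None:
        """Remove a blob; raises KeyError if the id is unknown."""
        del self._blobs[blob_id]
```

The separate `read_stream` method mirrors the point about a stream interface for READ: callers that only need the whole file use `read`, while adapters can consume chunks as they arrive.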
> 
> Johannes
> 
> 
> On 2020/02/21 18:49:58, Philipp Zehnder <[email protected]> wrote: 
> > Hi,
> > 
> > I also think we should store it either in a file in the same directory as 
> > the image, or in CouchDB.
> > For now I am not sure what the better solution is. The only requirement is 
> > that once a user downloads the data, the labels should be provided in a 
> > Coco-JSON file, but this is possible with both options.
> > 
> > Since we now have multiple locations where we store data, we should 
> > probably start a discussion of how to store application data within 
> > StreamPipes.
> > It might make sense to have an internal (or external) API for components 
> > and other services.
> > What do you think about that? What kind of features would such an API need?
> > 
> > Philipp
> > 
> > > On 19. Feb 2020, at 22:00, Johannes Tex <[email protected]> wrote:
> > > 
> > > Hi,
> > > 
> > > I'll start with @Dominik's question: the first intention was for it to be 
> > > part of the Data Explorer, with a toggle between simple exploring and 
> > > labeling. @Philipp opened an issue [STREAMPIPES-79] to refactor the data 
> > > explorer; maybe in this context we could extend the data explorer to 
> > > support these two modes? 
> > > To display images, for example, we need almost the same mechanism as for 
> > > image labeling, except for the labeling itself. We also need to extend 
> > > the datalake API for images, which leads to @Philipp's question. 
> > > 
> > > At the moment, the data lake API only supports data that can be 
> > > aggregated (numeric data). For image labeling and viewing, we need to 
> > > extend the API. My proposal would be to create a paging API for images 
> > > that returns the next, e.g., 10 images. It could look like this: 
> > > "/datalake/<index>/<timestamp>/<page>". What do you think? As part of this 
> > > necessary extension, we can also create the API to save the annotations.
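
The paging proposal above could behave roughly like the following sketch. The thread does not define the path segments, so these are assumptions: `<timestamp>` is read as the start time, `<page>` is zero-based, and the page size is fixed at 10. The function name and the `(timestamp, image reference)` entry shape are hypothetical.

```python
from bisect import bisect_left
from typing import List, Tuple

# Hypothetical entry shape: (timestamp in ms, image reference)
ImageEntry = Tuple[int, str]


def get_image_page(entries: List[ImageEntry], start_timestamp: int,
                   page: int, page_size: int = 10) -> List[ImageEntry]:
    """Return one page of entries whose timestamp is >= start_timestamp.

    `entries` must be sorted by timestamp; `page` is zero-based (an assumption).
    """
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    timestamps = [ts for ts, _ in entries]
    # Index of the first entry at or after the requested start time
    first = bisect_left(timestamps, start_timestamp)
    begin = first + page * page_size
    return entries[begin:begin + page_size]
```

A request such as "/datalake/<index>/500/1" would then map to `get_image_page(entries_of_index, 500, 1)`. A page past the end simply returns an empty list.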
> > > 
> > > I see three different options to save the annotations:
> > > * Influx -> save the annotation directly with the data point
> > >    - need to create the COCO file when exporting
> > >    - need an extra place to save the (image) labels/categories
> > >    - need to 'manipulate' the data point, which is not possible in Influx 
> > >      (only delete and create a new one)
> > > * File
> > >    - need to handle a file
> > > * CouchDB
> > >    - file generation is needed
> > > My proposal is to use CouchDB to store the annotations. 
> > > 
> > > Johannes
> > > 
> > > 
> > > On 2020/02/17 21:12:38, Philipp Zehnder <[email protected]> wrote: 
> > >> Hi Johannes,
> > >> 
> > >> as for the API, do you think we can extend the dataset API, or should we 
> > >> create a separate REST API for image annotation?
> > >> 
> > >> Where do you plan to store the coco annotation information? In files or 
> > >> in a DB?
> > >> 
> > >> Philipp
> > >> 
> > >>> On 16. Feb 2020, at 19:51, Dominik Riemer <[email protected]> wrote:
> > >>> 
> > >>> Hi Johannes,
> > >>> sounds good!
> > >>> I think bounding boxes and polygons are totally fine for the first 
> > >>> prototype.
> > >>> 
> > >>> How do you plan to integrate the labeling tool? Will it be part of the 
> > >>> data explorer, or do you plan to add a new component?
> > >>> 
> > >>> Dominik
> > >>> 
> > >>> On 2020/02/14 16:30:17, Johannes Tex <[email protected]> wrote: 
> > >>>> Hi,
> > >>>> 
> > >>>> Philipp started to extend the datalake sink to store images 
> > >>>> [STREAMPIPES-75]. 
> > >>>> I have now started to create an image labeler that allows users to 
> > >>>> label images in the datalake [STREAMPIPES-78]. The labels will be 
> > >>>> stored in the COCO annotation format [1]. After labeling, the images 
> > >>>> can be used to train a neural network. 
> > >>>> 
> > >>>> The main features that the labeler should support:
> > >>>> - Labeling with bounding boxes
> > >>>> - Labeling with polygons
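
As a rough illustration of how these two label types look in the COCO annotation format, here is a minimal sketch. The field names (`bbox` as [x, y, width, height], `segmentation` as a list of flat polygons, `area`, `iscrowd`) follow the COCO object detection format. The helper function names are hypothetical, and computing the polygon area with the shoelace formula is a choice made for this example.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def bbox_annotation(ann_id: int, image_id: int, category_id: int,
                    x: float, y: float, width: float, height: float) -> Dict:
    """COCO-style annotation for a box given as top-left corner plus size."""
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x, y, width, height],
        "area": width * height,
        "segmentation": [],
        "iscrowd": 0,
    }


def polygon_annotation(ann_id: int, image_id: int, category_id: int,
                       points: List[Point]) -> Dict:
    """COCO-style annotation for a single polygon given as (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    n = len(points)
    # Shoelace formula: twice the signed area of the polygon
    twice_area = sum(xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i]
                     for i in range(n))
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        # COCO stores each polygon as a flat list [x1, y1, x2, y2, ...]
        "segmentation": [[coord for point in points for coord in point]],
        # Enclosing box derived from the polygon's extent
        "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
        "area": abs(twice_area) / 2,
        "iscrowd": 0,
    }
```

In a full COCO file these dictionaries would go into the top-level "annotations" list, next to the "images" and "categories" lists.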
> > >>>> 
> > >>>> Do you have additional features that should also be supported?
> > >>>> 
> > >>>> Johannes
> > >>>> 
> > >>>> 
> > >>>> [1] http://cocodataset.org/#format-data
> > >>>> 
> > >>>> 
> > >>>> 
> > >> 
> > >> 
> > >> 
> > 
> > 
> > 
> 
> 
