Hi Rupert,

Your instructions are very clear to me. I've just submitted a project proposal [1] following our discussions. I could not have made it without your help. Thanks a lot! Please let me know if anything in the proposal can be refined.
Yours truly,
Jenny

[1] https://www.google-melange.com/gsoc/proposal/review/student/google/gsoc2014/jeny_qiuxiang/5629499534213120

2014-03-11 16:09 GMT+08:00 Rupert Westenthaler <rupert.westentha...@gmail.com>:
> Hi Jenny
>
> On Sat, Mar 8, 2014 at 5:56 AM, Zhu Qiuxiang <jenny.qiuxi...@gmail.com> wrote:
> > Hi Rupert,
> >
> > Thanks for your instructions! I studied the two ontologies you pointed out.
> > They are useful for common media/fragment annotations, but maybe not enough
> > for face detection. For example, shall we define our own new face detection
> > ontology properties (with "fd" as the prefix) like "fd:hasFaceImage" and
> > "fd:hasFaceVideoSegment"? Or are there existing face detection ontologies
> > to reuse?
> >
>
> I do not have a good overview of the already available ontologies or of
> whether those could be reused. IMO this is part of the work of
> designing/implementing those engines.
>
> > Also, here are my further questions about the 3 possible engines you
> > mentioned:
> > 1) scene detection: to set detected faces in a context. This could be used
> > to group different faces found within the same scene.
> > - What is a "scene"? Is it a frame/image or a video segment? What is a
> > "context"?
> >
>
> I was referring to http://en.wikipedia.org/wiki/Shot_transition_detection
>
> > 2) face detection. Video segments showing a face could be marked with
> > MediaFragments URIs so the clients can easily play back the annotated
> > section in the browser.
> > - I can understand it. The input of the engine is a video, with the output
> > of MediaFragments URIs marking the video segments showing a face. Do you
> > mean showing the same face? What if a video segment contains multiple
> > faces? Will the video segments be extracted and stored as Content parts
> > (Blobs) of the input video?
>
> In my terminology, Face Detection is just the recognition that an
> Image/Video is showing a face.
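[Editor's note: a minimal sketch of the Media Fragment URIs discussed above. The base URIs, class, and method names are made up for illustration; only the `#t=` (temporal) and `#xywh=` (spatial) syntax comes from the W3C Media Fragments recommendation.]

```java
import java.util.Locale;

/**
 * Sketch: how an engine could mint W3C Media Fragment URIs for detected
 * faces, so clients can play back or display just the annotated region.
 * Not an existing Stanbol API; names are hypothetical.
 */
public class MediaFragmentUris {

    /** Temporal fragment: the video segment from startSec to endSec. */
    static String temporalFragment(String baseUri, double startSec, double endSec) {
        // Locale.ROOT ensures "." as the decimal separator in the URI
        return String.format(Locale.ROOT, "%s#t=npt:%.1f,%.1f", baseUri, startSec, endSec);
    }

    /** Spatial fragment: the pixel region of a face within an image or frame. */
    static String spatialFragment(String baseUri, int x, int y, int w, int h) {
        return String.format(Locale.ROOT, "%s#xywh=pixel:%d,%d,%d,%d", baseUri, x, y, w, h);
    }

    public static void main(String[] args) {
        // http://example.org/video.mp4#t=npt:12.0,21.5
        System.out.println(temporalFragment("http://example.org/video.mp4", 12.0, 21.5));
        // http://example.org/frame42.png#xywh=pixel:160,120,80,80
        System.out.println(spatialFragment("http://example.org/frame42.png", 160, 120, 80, 80));
    }
}
```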
> This does not necessarily mean that the engine can detect that the
> "same face" is shown in different shots of the same video. It also does
> not mean that the engine can recognize the person based on the face.
>
> IMO extracting the video segment showing a face does not make sense.
> Video players can anyway be used to play the annotated segment.
>
> > 3) extraction of images showing detected faces. This would be nice for
> > Clients as they can easily show detected faces to users.
> > - As discussed in the previous e-mail, the input of the engine is an
> > image, with the output of extracted images of the detected faces. For a
> > video, we would first need another FrameExtractionEngine to extract all
> > the frames as images, and then apply the face extraction engine to each
> > frame. This could be clean in terms of separation of functionality, but
> > may be poor in performance, because there are so many frames even for a
> > small video. OpenIMAJ adopts the VideoDisplayListener mechanism, so that
> > the frames of the video can be processed one by one in a flow [1].
> > What's your opinion?
> >
>
> +1 for using the VideoDisplayListener. Separation of concerns is
> great, but not applicable in this example. A FrameExtractionEngine
> would create way too much information.
>
> best
> Rupert
>
> > Yours truly,
> > Jenny
> >
> > [1] http://www.openimaj.org/tutorial/finding-faces.html
> >
> > 2014-03-05 13:30 GMT+08:00 Rupert Westenthaler <rupert.westentha...@gmail.com>:
> >
> >> Hi Jenny
> >>
> >> Thanks for your interest in Stanbol. I will try to give you some more
> >> information on the topic you are interested in. See my comments inline.
> >>
> >> On Tue, Mar 4, 2014 at 3:35 PM, Zhu Qiuxiang <jenny.qiuxi...@gmail.com>
> >> wrote:
> >> > Hi all,
> >> >
> >> > My full name is Qiuxiang Zhu (you can call me Jenny for short), who is
> >> > a Chinese student interested in participating in GSoC 2014.
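[Editor's note: for reference, the VideoDisplayListener flow endorsed above, combined with the HaarCascadeDetector from the OpenIMAJ finding-faces tutorial [1], looks roughly like the sketch below. It needs the OpenIMAJ dependencies on the classpath; the video path is a placeholder, and wiring the detections into Stanbol enhancement structures is left out.]

```java
import java.io.File;
import java.util.List;

import org.openimaj.image.FImage;
import org.openimaj.image.MBFImage;
import org.openimaj.image.colour.RGBColour;
import org.openimaj.image.colour.Transforms;
import org.openimaj.image.processing.face.detection.DetectedFace;
import org.openimaj.image.processing.face.detection.FaceDetector;
import org.openimaj.image.processing.face.detection.HaarCascadeDetector;
import org.openimaj.video.VideoDisplay;
import org.openimaj.video.VideoDisplayListener;
import org.openimaj.video.xuggle.XuggleVideo;

public class FaceListenerSketch {
    public static void main(String[] args) {
        // placeholder path; any video readable by Xuggle works
        XuggleVideo video = new XuggleVideo(new File("video.mp4"));
        VideoDisplay<MBFImage> display = VideoDisplay.createVideoDisplay(video);

        // 40 px minimum face size, as in the OpenIMAJ tutorial
        final FaceDetector<DetectedFace, FImage> detector = new HaarCascadeDetector(40);

        display.addVideoListener(new VideoDisplayListener<MBFImage>() {
            public void beforeUpdate(MBFImage frame) {
                // frames arrive one by one -- no FrameExtractionEngine needed
                List<DetectedFace> faces =
                        detector.detectFaces(Transforms.calculateIntensity(frame));
                for (DetectedFace face : faces) {
                    // face.getBounds() is the rectangle that could back a
                    // spatial Media Fragment (#xywh=...) annotation
                    frame.drawShape(face.getBounds(), RGBColour.RED);
                }
            }
            public void afterUpdate(VideoDisplay<MBFImage> display) { /* no-op */ }
        });
    }
}
```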
> >> > In recent years, I've been working on semantic web related projects,
> >> > most of which are small student projects funded by my university, plus
> >> > a bigger project developing a finance knowledge base (RDF/OWL) with my
> >> > tutor. I'm quite experienced with RDF, TopBraid Composer, Jena, SPARQL
> >> > and Linked Data.
> >> >
> >> > Apache Stanbol attracts me because it adopts semantic technologies for
> >> > content management, especially the Enhancer component to process
> >> > semantic data in a chain. I've read the related documents [1]. I can
> >> > also understand the source code of the Tika Engine [2].
> >> >
> >> > In GSoC 2014, I'd like to work on a similar engine, the "Face Detection
> >> > Engine based on OpenIMAJ" [3] (STANBOL-1006), which also deals with
> >> > Blobs. Could you please tell me more about the details of the project?
> >> > Here're my questions:
> >> >
> >> > 1) The input of the Face Detection Engine can be a ContentItem
> >> > containing the original images. Are the extracted face images
> >> > registered with predefined URIs as Content parts (Blobs) in the
> >> > ContentItem?
> >>
> >> In Stanbol, content is accessible as Blobs. A Blob provides the
> >> Content-Type and an InputStream to read the data. Both images and
> >> videos are possible inputs for a Face Detection Engine.
> >>
> >> > 2) What metadata can be enhanced for the Face Detection Engine? Are
> >> > there any face detection related ontologies to be reused?
> >>
> >> Extending the Stanbol Enhancement Structure for Image and Video
> >> Annotations is covered by STANBOL-1005. There are several existing
> >> ontologies and even Recommendations like MediaFragments [1] and the
> >> Ontology for Media Resources [2] that should be considered.
> >>
> >> [1] http://www.w3.org/TR/media-frags/
> >> [2] http://www.w3.org/TR/mediaont-10/
> >>
> >> > 3) How to deal with videos?
> >> > It looks like we should first (1) extract images/frames from the
> >> > videos, and then (2) apply the Face Detection Engine for face
> >> > recognition. Shall we separate (1) from (2), to make a Video Frame
> >> > Extraction Engine?
> >>
> >> AFAIK OpenIMAJ provides all the required functionality. Separating
> >> functionality into different engines is a good thing, as it allows
> >> users more flexibility when configuring chains.
> >>
> >> I see a lot of possible engines:
> >>
> >> * scene detection: to set detected faces in a context. This could be
> >> used to group different faces found within the same scene.
> >> * face detection: video segments showing a face could be marked with
> >> MediaFragments URIs so the clients can easily play back the annotated
> >> section in the browser.
> >> * extraction of images showing detected faces: this would be nice for
> >> clients as they can easily show detected faces to users.
> >>
> >> best
> >> Rupert
> >>
> >> > Yours truly,
> >> > Jenny
> >> >
> >> > [1] http://stanbol.apache.org/docs/trunk/components/enhancer/
> >> > [2] https://svn.apache.org/repos/asf/stanbol/trunk/enhancement-engines/tika/src/main/java/org/apache/stanbol/enhancer/engines/tika/TikaEngine.java
> >> > [3] https://issues.apache.org/jira/browse/STANBOL-1006
> >>
> >> --
> >> | Rupert Westenthaler rupert.westentha...@gmail.com
> >> | Bodenlehenstraße 11 ++43-699-11108907
> >> | A-5500 Bischofshofen
>
> --
> | Rupert Westenthaler rupert.westentha...@gmail.com
> | Bodenlehenstraße 11 ++43-699-11108907
> | A-5500 Bischofshofen
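[Editor's note: one possible shape for the annotations the candidate engines above would produce, sketched in Turtle. The `fd:` namespace and property name are hypothetical, taken from Jenny's question rather than any existing ontology; the `ma:` terms and the Media Fragment URI syntax follow the two W3C recommendations Rupert cites.]

```turtle
@prefix fd:  <http://example.org/ns/facedetection#> .  # hypothetical namespace
@prefix ma:  <http://www.w3.org/ns/ma-ont#> .          # Ontology for Media Resources
@prefix dct: <http://purl.org/dc/terms/> .

<http://example.org/video.mp4> a ma:MediaResource ;
    # a segment showing a detected face, addressed with a Media Fragment URI
    ma:hasFragment <http://example.org/video.mp4#t=npt:12.0,21.5> .

<http://example.org/video.mp4#t=npt:12.0,21.5> a ma:MediaFragment ;
    dct:source <http://example.org/video.mp4> ;
    # extracted face image, stored e.g. as a Content part (Blob)
    fd:hasFaceImage <http://example.org/faces/face-1.png> .
```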