On Fri, Apr 12, 2013 at 3:02 PM, Jimmy O'Regan <[email protected]> wrote:

> On 12 April 2013 02:58, Shivani Poddar <[email protected]> wrote:
> > Hi Jimmy,
> >
> > Yes, I did go through the initial theory in the paper about Linked Data
> > and how Sieve aims at resolving the conflicts which arise due to linked
> > data. What I wrote in my description was more of an involuntary stream of
> > thought on reading the idea. Sorry for being so ambiguous.
> > Yes, I am referring to the "linguistic structure" here.
> > (Please excuse my limited knowledge of the subject, but) I feel that, as
> > close as the linguistic structures of Portuguese and English seem to me,
> > the gap is much wider in the case of Eastern languages.
>
> That's what I thought, but I didn't want to assume. As Pablo has
> already noted, I think there is some confusion here, and I also think
> that mixing these ideas could be quite interesting.
>
> I'm sure that, if you look at something like
> http://dbpedia.org/page/David_Beckham, it's easy to imagine, given
> the type of text in the abstracts, that the information was extracted
> from the text (in a manner similar to that described in the opening
> paragraphs of http://en.wikipedia.org/wiki/Information_extraction).
>
> (I use a Basque example here, because (for short sentences at least)
> it can have similar sentence structure to Hindi - your name looks
> Indian to me, so I assume some familiarity with Hindi. Even if not, I
> hope that the difference to, say, English is apparent).
>

Yes, your guess is right: I am Indian and very familiar with the linguistic
structures of Hindi and of other languages that stem mainly from Sanskrit.
The Basque example you give below does convey the distinction between the
linguistic structure I was referring to and the one the project seems to
cover (now that I have more clarity about it :) ).



> Given an abstract like this:
> David Robert Joseph Beckham (Londres, Ingalaterra, 1975eko maiatzaren
> 2a) futbolari ingelesa da.
>
> and an extraction template something like this:
>
> [?name] ([?city], [?country], [?date]) futbolari ingelesa da.
>
> we could get similar information, but DBPedia's extraction framework
> actually uses things like infoboxes:
>
> {{Futbolari biografia infotaula
> | izena = David Beckham
> | izen osoa = David Robert Joseph Beckham
> | jaiotza data = [[1975]]eko [[maiatzaren 2]]a {{adina|1975|05|02}}
> | jaiotza hiria = [[Londres]], [[Ingalaterra]]
> }}
>
>
>
> If you're interested in information extraction of this kind, the good
> news is that we have the data from the infoboxes, and that could be
> used for semi-supervised creation of this kind of extraction template.
> If your idea was based around something related to this, that could
> make a great project.
>
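For reference, the extraction template quoted above could be applied to the
abstract roughly as follows. This is only a minimal sketch, not how DBpedia's
extraction framework works: the helper names (template_to_regex, extract) are
hypothetical, and it assumes each [?slot] can simply be turned into a lazy
named group in a regular expression.

```python
import re

# Matches placeholders of the form [?name] in a template string.
SLOT = re.compile(r"\[\?(\w+)\]")

def template_to_regex(template):
    """Hypothetical helper: compile a [?slot] template into a regex."""
    parts = []
    pos = 0
    for m in SLOT.finditer(template):
        # Literal text between slots is escaped so it matches verbatim.
        parts.append(re.escape(template[pos:m.start()]))
        # Each slot becomes a lazy named group.
        parts.append("(?P<%s>.+?)" % m.group(1))
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("^" + "".join(parts) + "$")

def extract(template, text):
    """Return a dict of slot values, or None if the text does not fit."""
    m = template_to_regex(template).match(text)
    return m.groupdict() if m else None

TEMPLATE = "[?name] ([?city], [?country], [?date]) futbolari ingelesa da."
ABSTRACT = ("David Robert Joseph Beckham (Londres, Ingalaterra, "
            "1975eko maiatzaren 2a) futbolari ingelesa da.")

fields = extract(TEMPLATE, ABSTRACT)
```

A template this rigid only fits sentences with exactly this shape, which is
presumably why learning many templates from data is the interesting part.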


This does seem to cover a major part of my interest. My eventual goals
(which are research-based) would definitely look at the amalgam of the 3
concepts Pablo mentioned but, as an immediate project, this seems very
interesting to me. I would like to take it up for the coming summer.
Also, by semi-supervised creation of templates, do you mean templates for
(say) Hindi only? Or would extending it to all languages be fine?
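
To make the question concrete, here is a naive sketch of how a template might
be derived from the infobox data: find the infobox values in the abstract and
replace them with slots. The function names (clean_value, parse_infobox,
induce_template) and the slot naming are assumptions for illustration only,
and a real approach would need far more than exact string matching.

```python
import re

INFOBOX = """{{Futbolari biografia infotaula
| izena = David Beckham
| izen osoa = David Robert Joseph Beckham
| jaiotza data = [[1975]]eko [[maiatzaren 2]]a {{adina|1975|05|02}}
| jaiotza hiria = [[Londres]], [[Ingalaterra]]
}}"""

ABSTRACT = ("David Robert Joseph Beckham (Londres, Ingalaterra, "
            "1975eko maiatzaren 2a) futbolari ingelesa da.")

def clean_value(value):
    """Strip nested {{...}} templates and [[...]] link markup."""
    value = re.sub(r"\{\{[^{}]*\}\}", "", value)
    value = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", value)
    return value.strip()

def parse_infobox(wikitext):
    """Collect '| key = value' lines into a dict of cleaned values."""
    fields = {}
    for line in wikitext.splitlines():
        m = re.match(r"\|\s*([^=]+?)\s*=\s*(.*)$", line)
        if m:
            key = m.group(1).replace(" ", "_")
            fields[key] = clean_value(m.group(2))
    return fields

def induce_template(abstract, fields):
    """Replace infobox values found in the abstract with [?key] slots."""
    template = abstract
    # Longest values first, so a short value cannot split a longer one.
    for key, value in sorted(fields.items(), key=lambda kv: -len(kv[1])):
        if value and value in template:
            template = template.replace(value, "[?%s]" % key)
    return template

template = induce_template(ABSTRACT, parse_infobox(INFOBOX))
```

Here city and country end up in a single slot, because the infobox stores them
in one field, and "izena" is unused because "David Beckham" does not occur
verbatim in the abstract. Nothing in the sketch is tied to Basque, but whether
that carries over to other languages is part of what I am asking.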



> In any case, I hope you'll continue to describe your idea, but in its
> own terms, rather than, e.g., in terms of similarities that you saw in
> the description of Sieve, that may not be apparent to us.
>
>
My idea does seem to be undergoing some refinement here. I will be careful
about what I propose to do.

Regards,
Shivani