Re: Named entity coref resolution based on dbpedia categories and rdf:type

Cristian Petroaca Tue, 25 Mar 2014 11:52:46 -0700

That worked. Thanks.

So, there are no exceptions during the startup of the launcher.
The Components tab in the Felix console shows 6 WeightedChains on the first
start, including the default one, but after my changes and a restart there
are only 5: the default one is missing altogether.



2014-03-24 20:18 GMT+02:00 Rupert Westenthaler <
[email protected]>:

> Hi Cristian,
>
> I have been seeing the same problem since last Friday. The solution
> mentioned in [1] works for me.
>
>     mvn -Djsse.enableSNIExtension=false {goals}
>
> No idea why HTTPS connections to GitHub currently cause this. I
> could not find anything related via Google. So I suggest using the
> system property for now. If this persists for longer we can adapt the
> build files accordingly.
>
> best
> Rupert
>
>
>
>
> [1]
> http://stackoverflow.com/questions/7615645/ssl-handshake-alert-unrecognized-name-error-since-upgrade-to-java-1-7-0
>
> On Mon, Mar 24, 2014 at 7:01 PM, Cristian Petroaca
> <[email protected]> wrote:
> > I did a clean on the whole project and now I wanted to do another "mvn
> > clean install" but I am getting this :
> >
> > "[INFO]
> > ------------------------------------------------------------------------
> > [ERROR] Failed to execute goal
> > org.apache.maven.plugins:maven-antrun-plugin:1.6:run (download) on project
> > org.apache.stanbol.data.opennlp.lang.es: An Ant BuildException has occured:
> > The following error occurred while executing this line:
> > [ERROR] C:\Data\Projects\Stanbol\main\data\opennlp\lang\es\download_models.xml:33:
> > Failed to copy
> > https://github.com/utcompling/OpenNLP-Models/raw/58ef0c60031403e66e47ae35edaf58d3478b67af/models/es/opennlp-es-maxent-pos-es.bin
> > to
> > C:\Data\Projects\Stanbol\main\data\opennlp\lang\es\downloads\resources\org\apache\stanbol\data\opennlp\es-pos-maxent.bin
> > due to javax.net.ssl.SSLProtocolException handshake alert : unrecognized_name"
> >
> >
> >
> > 2014-03-20 11:25 GMT+02:00 Rupert Westenthaler <
> > [email protected]>:
> >
> >> Hi Cristian,
> >>
> >> On Thu, Mar 20, 2014 at 10:00 AM, Cristian Petroaca
> >> <[email protected]> wrote:
> >> >
> >> > stanbol.enhancer.chain.weighted.chain=["tika;optional","langdetect","opennlp-sentence","opennlp-token","opennlp-pos","opennlp-ner","dbpediaLinking","entityhubExtraction","dbpedia-dereference","pos-chunker"]
> >> > service.ranking=I"-2147483648"
> >> > stanbol.enhancer.chain.name="default"
> >>
> >> This looks fine to me. Do you see any exceptions during the startup of
> >> the launcher? Can you check the status of this component in the
> >> Components tab of the Felix web console [1] (search for
> >> "org.apache.stanbol.enhancer.chain.weighted.impl.WeightedChain")? If
> >> you have several, you can find the correct one by comparing the
> >> "Properties" with those in the configuration file.
> >>
> >> I guess that the corresponding service is in the 'unsatisfied' state, as
> >> you do not see it in the web interface. But if this is the case you should
> >> also see the corresponding exception in the log. You can also manually
> >> stop/start the component. In this case the exception should be
> >> re-thrown and you do not need to search the log for it.
> >>
> >> best
> >> Rupert
> >>
> >>
> >> [1] http://localhost:8080/system/console/components
> >>
> >> >
> >> >
> >> >
> >> > 2014-03-20 7:39 GMT+02:00 Rupert Westenthaler <
> >> [email protected]
> >> >>:
> >> >
> >> >> Hi Cristian,
> >> >>
> >> >> you can not send attachments to the list. Please copy the contents
> >> >> directly to the mail
> >> >>
> >> >> thx
> >> >> Rupert
> >> >>
> >> >> On Wed, Mar 19, 2014 at 9:20 PM, Cristian Petroaca
> >> >> <[email protected]> wrote:
> >> >> > The config attached.
> >> >> >
> >> >> >
> >> >> > 2014-03-19 9:09 GMT+02:00 Rupert Westenthaler
> >> >> > <[email protected]>:
> >> >> >
> >> >> >> Hi Cristian,
> >> >> >>
> >> >> >> can you provide the contents of the chain after your modifications?
> >> >> >> Would be interesting to test why the chain is no longer active after
> >> >> >> the restart.
> >> >> >>
> >> >> >> You can find the config file in the 'stanbol/fileinstall' folder.
> >> >> >>
> >> >> >> best
> >> >> >> Rupert
> >> >> >>
> >> >> >> On Tue, Mar 18, 2014 at 8:24 PM, Cristian Petroaca
> >> >> >> <[email protected]> wrote:
> >> >> >> > Related to the default chain selection rules: before the restart I had
> >> >> >> > a chain with the name 'default', as in I could access it via
> >> >> >> > enhancer/chain/default.
> >> >> >> > Then I just added another engine to the 'default' chain. I assumed that
> >> >> >> > after the restart the chain with the 'default' name would be persisted,
> >> >> >> > so the first rule should have been applied after the restart as well.
> >> >> >> > But instead I cannot reach it via enhancer/chain/default anymore, so
> >> >> >> > it's gone.
> >> >> >> > Anyway, this is not a big deal, it's not blocking me in any way, I just
> >> >> >> > wanted to understand where the problem is.
> >> >> >> >
> >> >> >> >
> >> >> >> > 2014-03-18 7:15 GMT+02:00 Rupert Westenthaler
> >> >> >> > <[email protected]
> >> >> >> >>:
> >> >> >> >
> >> >> >> >> Hi Cristian
> >> >> >> >>
> >> >> >> >> On Mon, Mar 17, 2014 at 9:43 PM, Cristian Petroaca
> >> >> >> >> <[email protected]> wrote:
> >> >> >> >> > 1. Updated to the latest code and it's gone. Cool
> >> >> >> >> >
> >> >> >> >> > 2. I start the stable launcher -> create a new instance of the
> >> >> >> >> > PosChunkerEngine -> add it to the default chain. At this point
> >> >> >> >> > everything looks good and works ok.
> >> >> >> >> > After I restart the server the default chain is gone and instead I see
> >> >> >> >> > this in the enhancement chains page : all-active (default, id: 149,
> >> >> >> >> > ranking: 0, impl: AllActiveEnginesChain ). all-active did not contain
> >> >> >> >> > the 'default' word before the restart.
> >> >> >> >> >
> >> >> >> >>
> >> >> >> >> Please note the default chain selection rules as described at [1]. You
> >> >> >> >> can also access chains under '/enhancer/chain/{chain-name}'
> >> >> >> >>
> >> >> >> >> best
> >> >> >> >> Rupert
> >> >> >> >>
> >> >> >> >> [1] http://stanbol.staging.apache.org/docs/trunk/components/enhancer/chains/#default-chain
> >> >> >> >>
> >> >> >> >> > It looks like the config files are exactly what I need. Thanks.
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > 2014-03-17 9:26 GMT+02:00 Rupert Westenthaler <
> >> >> >> >> [email protected]
> >> >> >> >> >>:
> >> >> >> >> >
> >> >> >> >> >> On Sat, Mar 15, 2014 at 8:34 PM, Cristian Petroaca
> >> >> >> >> >> <[email protected]> wrote:
> >> >> >> >> >> > Thanks Rupert.
> >> >> >> >> >> >
> >> >> >> >> >> > A couple more questions/issues :
> >> >> >> >> >> >
> >> >> >> >> >> > 1. Whenever I start the stanbol server I'm seeing this in the
> >> >> >> >> >> > console output :
> >> >> >> >> >> >
> >> >> >> >> >>
> >> >> >> >> >> This should be fixed with STANBOL-1278 [1] [2]
> >> >> >> >> >>
> >> >> >> >> >> > 2. Whenever I restart the server the Weighted Chains get messed up. I
> >> >> >> >> >> > usually use the 'default' chain and add my engine to it so there are 11
> >> >> >> >> >> > engines in it. After the restart this chain now contains around 23
> >> >> >> >> >> > engines in total.
> >> >> >> >> >>
> >> >> >> >> >> I was not able to replicate this. What I tried was
> >> >> >> >> >>
> >> >> >> >> >> (1) start up the stable launcher
> >> >> >> >> >> (2) add an additional engine to the default chain
> >> >> >> >> >> (3) restart the launcher
> >> >> >> >> >>
> >> >> >> >> >> The default chain was not changed after (2) and (3). So I would need
> >> >> >> >> >> further information to find out why this is happening.
> >> >> >> >> >>
> >> >> >> >> >> Generally it is better to create your own chain instance than to modify
> >> >> >> >> >> one that is provided by the default configuration. I would also
> >> >> >> >> >> recommend that you keep your test configuration in text files and
> >> >> >> >> >> copy those to the 'stanbol/fileinstall' folder. Doing so saves you
> >> >> >> >> >> from manually re-entering the configuration after a software update.
> >> >> >> >> >> The production-mode section [3] provides information on how to do that.
> >> >> >> >> >>
> >> >> >> >> >> best
> >> >> >> >> >> Rupert
> >> >> >> >> >>
> >> >> >> >> >> [1] https://issues.apache.org/jira/browse/STANBOL-1278
> >> >> >> >> >> [2] http://svn.apache.org/r1576623
> >> >> >> >> >> [3] http://stanbol.apache.org/docs/trunk/production-mode
> >> >> >> >> >>
> >> >> >> >> >> > ERROR: Bundle org.apache.stanbol.enhancer.engine.topic.web [153]: Error starting
> >> >> >> >> >> > slinginstall:c:\Data\Projects\Stanbol\main\launchers\stable\target\stanbol\startup\35\org.apache.stanbol.enhancer.engine.topic.web-1.0.0-SNAPSHOT.jar
> >> >> >> >> >> > (org.osgi.framework.BundleException: Unresolved constraint in bundle
> >> >> >> >> >> > org.apache.stanbol.enhancer.engine.topic.web [153]: Unable to resolve 153.0:
> >> >> >> >> >> > missing requirement [153.0] package;
> >> >> >> >> >> > (&(package=javax.ws.rs)(version>=0.0.0)(!(version>=2.0.0))))
> >> >> >> >> >> > org.osgi.framework.BundleException: Unresolved constraint in bundle
> >> >> >> >> >> > org.apache.stanbol.enhancer.engine.topic.web [153]: Unable to resolve 153.0:
> >> >> >> >> >> > missing requirement [153.0] package;
> >> >> >> >> >> > (&(package=javax.ws.rs)(version>=0.0.0)(!(version>=2.0.0)))
> >> >> >> >> >> >         at org.apache.felix.framework.Felix.resolveBundle(Felix.java:3443)
> >> >> >> >> >> >         at org.apache.felix.framework.Felix.startBundle(Felix.java:1727)
> >> >> >> >> >> >         at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1156)
> >> >> >> >> >> >         at org.apache.felix.framework.StartLevelImpl.run(StartLevelImpl.java:264)
> >> >> >> >> >> >         at java.lang.Thread.run(Unknown Source)
> >> >> >> >> >> >
> >> >> >> >> >> > Despite this the server starts fine and I can use the enhancer fine.
> >> >> >> >> >> > Do you guys see this as well?
> >> >> >> >> >> >
> >> >> >> >> >> >
> >> >> >> >> >> > 2. Whenever I restart the server the Weighted Chains get messed up. I
> >> >> >> >> >> > usually use the 'default' chain and add my engine to it so there are 11
> >> >> >> >> >> > engines in it. After the restart this chain now contains around 23
> >> >> >> >> >> > engines in total.
> >> >> >> >> >> >
> >> >> >> >> >> >
> >> >> >> >> >> >
> >> >> >> >> >> >
> >> >> >> >> >> > 2014-03-11 9:47 GMT+02:00 Rupert Westenthaler <
> >> >> >> >> >> [email protected]
> >> >> >> >> >> >>:
> >> >> >> >> >> >
> >> >> >> >> >> >> Hi Cristian,
> >> >> >> >> >> >>
> >> >> >> >> >> >> NER Annotations are typically available as both
> >> >> >> >> >> >> NlpAnnotations.NER_ANNOTATION and fise:TextAnnotation [1] in the
> >> >> >> >> >> >> enhancement metadata. As you are already accessing the AnalyzedText, I
> >> >> >> >> >> >> would prefer using NlpAnnotations.NER_ANNOTATION.
> >> >> >> >> >> >>
> >> >> >> >> >> >> best
> >> >> >> >> >> >> Rupert
> >> >> >> >> >> >>
> >> >> >> >> >> >> [1] http://stanbol.apache.org/docs/trunk/components/enhancer/enhancementstructure.html#fisetextannotation
> >> >> >> >> >> >>
> >> >> >> >> >> >> On Mon, Mar 10, 2014 at 10:07 PM, Cristian Petroaca
> >> >> >> >> >> >> <[email protected]> wrote:
> >> >> >> >> >> >> > Thanks.
> >> >> >> >> >> >> > I assume I should get the Named entities using the same but with
> >> >> >> >> >> >> > NlpAnnotations.NER_ANNOTATION?
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >
> >> >> >> >> >> >> > 2014-03-10 13:29 GMT+02:00 Rupert Westenthaler <
> >> >> >> >> >> >> > [email protected]>:
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >> Hallo Cristian,
> >> >> >> >> >> >> >>
> >> >> >> >> >> >> >> NounPhrases are not added to the RDF enhancement results. You need to
> >> >> >> >> >> >> >> use the AnalyzedText ContentPart [1]
> >> >> >> >> >> >> >>
> >> >> >> >> >> >> >> here is some demo code you can use in the computeEnhancement method
> >> >> >> >> >> >> >>
> >> >> >> >> >> >> >>         AnalysedText at = NlpEngineHelper.getAnalysedText(this, ci, true);
> >> >> >> >> >> >> >>         Iterator<? extends Section> sections = at.getSentences();
> >> >> >> >> >> >> >>         if(!sections.hasNext()){ //process as single sentence
> >> >> >> >> >> >> >>             sections = Collections.singleton(at).iterator();
> >> >> >> >> >> >> >>         }
> >> >> >> >> >> >> >>
> >> >> >> >> >> >> >>         while(sections.hasNext()){
> >> >> >> >> >> >> >>             Section section = sections.next();
> >> >> >> >> >> >> >>             //iterate over the chunks (phrases) of the current section
> >> >> >> >> >> >> >>             Iterator<Span> chunks = section.getEnclosed(EnumSet.of(SpanTypeEnum.Chunk));
> >> >> >> >> >> >> >>             while(chunks.hasNext()){
> >> >> >> >> >> >> >>                 Span chunk = chunks.next();
> >> >> >> >> >> >> >>                 Value<PhraseTag> phrase = chunk.getAnnotation(NlpAnnotations.PHRASE_ANNOTATION);
> >> >> >> >> >> >> >>                 //skip chunks without a phrase annotation
> >> >> >> >> >> >> >>                 if(phrase != null && phrase.value().getCategory() == LexicalCategory.Noun){
> >> >> >> >> >> >> >>                     log.info(" - NounPhrase [{},{}] {}", new Object[]{
> >> >> >> >> >> >> >>                         chunk.getStart(),chunk.getEnd(),chunk.getSpan()});
> >> >> >> >> >> >> >>                 }
> >> >> >> >> >> >> >>             }
> >> >> >> >> >> >> >>         }
> >> >> >> >> >> >> >>
> >> >> >> >> >> >> >> hope this helps
> >> >> >> >> >> >> >>
> >> >> >> >> >> >> >> best
> >> >> >> >> >> >> >> Rupert
> >> >> >> >> >> >> >>
> >> >> >> >> >> >> >> [1] http://stanbol.apache.org/docs/trunk/components/enhancer/nlp/analyzedtext
> >> >> >> >> >> >> >>
> >> >> >> >> >> >> >> On Sun, Mar 9, 2014 at 6:07 PM, Cristian Petroaca
> >> >> >> >> >> >> >> <[email protected]> wrote:
> >> >> >> >> >> >> >> > I started to implement the engine and I'm having problems with getting
> >> >> >> >> >> >> >> > results for noun phrases. I modified the "default" weighted chain to
> >> >> >> >> >> >> >> > also include the PosChunkerEngine and ran a sample text : "Angela Merkel
> >> >> >> >> >> >> >> > visted China. The german chancellor met with various people". I expected
> >> >> >> >> >> >> >> > that the RDF XML output would contain some info about the noun phrases
> >> >> >> >> >> >> >> > but I cannot see any.
> >> >> >> >> >> >> >> > Could you point me to the correct way to generate the noun phrases?
> >> >> >> >> >> >> >> >
> >> >> >> >> >> >> >> > Thanks,
> >> >> >> >> >> >> >> > Cristian
> >> >> >> >> >> >> >> >
> >> >> >> >> >> >> >> >
> >> >> >> >> >> >> >> > 2014-02-09 14:15 GMT+02:00 Cristian Petroaca <
> >> >> >> >> >> >> >> [email protected]>:
> >> >> >> >> >> >> >> >
> >> >> >> >> >> >> >> >> Opened
> >> >> https://issues.apache.org/jira/browse/STANBOL-1279
> >> >> >> >> >> >> >> >>
> >> >> >> >> >> >> >> >>
> >> >> >> >> >> >> >> >> 2014-02-07 10:53 GMT+02:00 Cristian Petroaca <
> >> >> >> >> >> >> >> [email protected]>
> >> >> >> >> >> >> >> >> :
> >> >> >> >> >> >> >> >>
> >> >> >> >> >> >> >> >> Hi Rupert,
> >> >> >> >> >> >> >> >>>
> >> >> >> >> >> >> >> >>> The "spatial" dimension is a good idea. I'll also take a look at Yago.
> >> >> >> >> >> >> >> >>>
> >> >> >> >> >> >> >> >>> I will create a Jira with what we talked about here. It will probably
> >> >> >> >> >> >> >> >>> have just a draft-like description for now and will be updated as I go
> >> >> >> >> >> >> >> >>> along.
> >> >> >> >> >> >> >> >>>
> >> >> >> >> >> >> >> >>> Thanks,
> >> >> >> >> >> >> >> >>> Cristian
> >> >> >> >> >> >> >> >>>
> >> >> >> >> >> >> >> >>>
> >> >> >> >> >> >> >> >>> 2014-02-06 15:39 GMT+02:00 Rupert Westenthaler <
> >> >> >> >> >> >> >> >>> [email protected]>:
> >> >> >> >> >> >> >> >>>
> >> >> >> >> >> >> >> >>> Hi Cristian,
> >> >> >> >> >> >> >> >>>>
> >> >> >> >> >> >> >> >>>> definitely an interesting approach. You should have a look at Yago2
> >> >> >> >> >> >> >> >>>> [1]. As far as I can remember the Yago taxonomy is much better
> >> >> >> >> >> >> >> >>>> structured than the one used by dbpedia. Mapping suggestions of dbpedia
> >> >> >> >> >> >> >> >>>> to concepts in Yago2 is easy as both dbpedia and yago2 provide
> >> >> >> >> >> >> >> >>>> mappings [2] and [3]
> >> >> >> >> >> >> >> >>>>
> >> >> >> >> >> >> >> >>>> > 2014-02-05 15:39 GMT+02:00 Rafa Haro <[email protected]>:
> >> >> >> >> >> >> >> >>>> >>
> >> >> >> >> >> >> >> >>>> >> "Microsoft posted its 2013 earnings. The Redmond's company made a
> >> >> >> >> >> >> >> >>>> >> huge profit".
> >> >> >> >> >> >> >> >>>>
> >> >> >> >> >> >> >> >>>> That's actually a very good example. Spatial contexts are very
> >> >> >> >> >> >> >> >>>> important as they tend to be used often for referencing. So I would
> >> >> >> >> >> >> >> >>>> suggest treating the spatial context specially. For spatial Entities
> >> >> >> >> >> >> >> >>>> (like a City) this is easy, but even for others (like a Person,
> >> >> >> >> >> >> >> >>>> Company) you could use relations to spatial entities to define their
> >> >> >> >> >> >> >> >>>> spatial context. This context could then be used to correctly link
> >> >> >> >> >> >> >> >>>> "The Redmond's company" to "Microsoft".
> >> >> >> >> >> >> >> >>>>
> >> >> >> >> >> >> >> >>>> In addition I would suggest using the "spatial" context of each
> >> >> >> >> >> >> >> >>>> entity (basically relations to entities that are cities, regions,
> >> >> >> >> >> >> >> >>>> countries) as a separate dimension, because those are very often used
> >> >> >> >> >> >> >> >>>> for coreferences.
> >> >> >> >> >> >> >> >>>>
> >> >> >> >> >> >> >> >>>> [1] http://www.mpi-inf.mpg.de/yago-naga/yago/
> >> >> >> >> >> >> >> >>>> [2] http://downloads.dbpedia.org/3.9/links/yago_links.nt.bz2
> >> >> >> >> >> >> >> >>>> [3] http://www.mpi-inf.mpg.de/yago-naga/yago/download/yago/yagoDBpediaInstances.ttl.7z
> >> >> >> >> >> >> >> >>>>
> >> >> >> >> >> >> >> >>>>
> >> >> >> >> >> >> >> >>>> On Thu, Feb 6, 2014 at 10:33 AM, Cristian
> Petroaca
> >> >> >> >> >> >> >> >>>> <[email protected]> wrote:
> >> >> >> >> >> >> >> >>>> > There are several dbpedia categories for each entity, in this case for
> >> >> >> >> >> >> >> >>>> > Microsoft we have :
> >> >> >> >> >> >> >> >>>> >
> >> >> >> >> >> >> >> >>>> > category:Companies_in_the_NASDAQ-100_Index
> >> >> >> >> >> >> >> >>>> > category:Microsoft
> >> >> >> >> >> >> >> >>>> > category:Software_companies_of_the_United_States
> >> >> >> >> >> >> >> >>>> > category:Software_companies_based_in_Washington_(state)
> >> >> >> >> >> >> >> >>>> > category:Companies_established_in_1975
> >> >> >> >> >> >> >> >>>> > category:1975_establishments_in_the_United_States
> >> >> >> >> >> >> >> >>>> > category:Companies_based_in_Redmond,_Washington
> >> >> >> >> >> >> >> >>>> > category:Multinational_companies_headquartered_in_the_United_States
> >> >> >> >> >> >> >> >>>> > category:Cloud_computing_providers
> >> >> >> >> >> >> >> >>>> > category:Companies_in_the_Dow_Jones_Industrial_Average
> >> >> >> >> >> >> >> >>>> >
> >> >> >> >> >> >> >> >>>> > So we also have "Companies based in Redmond, Washington" which could be
> >> >> >> >> >> >> >> >>>> > matched.
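
A minimal sketch of how such a category could be matched against a phrase like "The Redmond's company". It assumes that location-bearing categories follow the "... based in X" / "... headquartered in X" wording seen in the list above; the class name, the regular expression and both helper methods are made up for illustration and are not part of Stanbol or dbpedia.

```java
import java.util.*;
import java.util.regex.*;

public class SpatialCategorySketch {

    // Assumed wording of location-bearing category labels (not a dbpedia rule).
    private static final Pattern LOCATION =
            Pattern.compile("(?:based|headquartered) in (?:the )?(.+)$");

    /** Collects lower-cased place names from labels such as "Companies_based_in_Redmond,_Washington". */
    static Set<String> places(List<String> categories) {
        Set<String> out = new LinkedHashSet<>();
        for (String category : categories) {
            // Turn the label into plain words and drop qualifiers such as "(state)".
            String label = category.replace('_', ' ').replaceAll("\\s*\\(.*?\\)", "");
            Matcher m = LOCATION.matcher(label);
            if (!m.find()) continue;
            for (String part : m.group(1).split(",")) {
                String place = part.trim().toLowerCase(Locale.ROOT);
                if (!place.isEmpty()) out.add(place);
            }
        }
        return out;
    }

    /** True when the noun phrase mentions one of the places as a whole word. */
    static boolean mentionsPlace(String nounPhrase, Set<String> places) {
        String text = nounPhrase.toLowerCase(Locale.ROOT);
        for (String place : places) {
            if (Pattern.compile("\\b" + Pattern.quote(place) + "\\b").matcher(text).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> microsoft = places(Arrays.asList(
                "Companies_based_in_Redmond,_Washington",
                "Software_companies_based_in_Washington_(state)",
                "Cloud_computing_providers"));
        System.out.println(microsoft);
        System.out.println(mentionsPlace("The Redmond's company", microsoft));
    }
}
```

Plain string matching like this would probably need to be replaced by real relations to spatial entities to be reliable; it only shows where the category labels already carry usable location words.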
> >> >> >> >> >> >> >> >>>> >
> >> >> >> >> >> >> >> >>>> >
> >> >> >> >> >> >> >> >>>> > There is still other contextual information from dbpedia which can be
> >> >> >> >> >> >> >> >>>> > used.
> >> >> >> >> >> >> >> >>>> > For example for an Organization we could also include :
> >> >> >> >> >> >> >> >>>> > dbpprop:industry = Software
> >> >> >> >> >> >> >> >>>> > dbpprop:service = Online Service Providers
> >> >> >> >> >> >> >> >>>> >
> >> >> >> >> >> >> >> >>>> > and for a Person (that's for Barack Obama) :
> >> >> >> >> >> >> >> >>>> >
> >> >> >> >> >> >> >> >>>> > dbpedia-owl:profession:
> >> >> >> >> >> >> >> >>>> >     dbpedia:Author
> >> >> >> >> >> >> >> >>>> >     dbpedia:Constitutional_law
> >> >> >> >> >> >> >> >>>> >     dbpedia:Lawyer
> >> >> >> >> >> >> >> >>>> >     dbpedia:Community_organizing
> >> >> >> >> >> >> >> >>>> >
> >> >> >> >> >> >> >> >>>> > I'd like to continue investigating this as I think that it may have some
> >> >> >> >> >> >> >> >>>> > value in increasing the number of coreference resolutions, and I'd like
> >> >> >> >> >> >> >> >>>> > to concentrate more on precision rather than recall, since we already
> >> >> >> >> >> >> >> >>>> > have a set of coreferences detected by the Stanford NLP tool and this
> >> >> >> >> >> >> >> >>>> > would be an addition to that (at least this is how I would like to use
> >> >> >> >> >> >> >> >>>> > it).
> >> >> >> >> >> >> >> >>>> >
> >> >> >> >> >> >> >> >>>> > Is it ok if I track this by opening a Jira? I could update it to show my
> >> >> >> >> >> >> >> >>>> > progress and also my conclusions, and if it turns out that it was a bad
> >> >> >> >> >> >> >> >>>> > idea then at least I'll end up with more knowledge about Stanbol in the
> >> >> >> >> >> >> >> >>>> > end :).
> >> >> >> >> >> >> >> >>>> >
> >> >> >> >> >> >> >> >>>> >
> >> >> >> >> >> >> >> >>>> > 2014-02-05 15:39 GMT+02:00 Rafa Haro
> >> >> >> >> >> >> >> >>>> > <[email protected]>:
> >> >> >> >> >> >> >> >>>> >
> >> >> >> >> >> >> >> >>>> >> Hi Cristian,
> >> >> >> >> >> >> >> >>>> >>
> >> >> >> >> >> >> >> >>>> >> The approach sounds nice. I don't want to be the devil's advocate but
> >> >> >> >> >> >> >> >>>> >> I'm just not sure about the recall using the dbpedia categories
> >> >> >> >> >> >> >> >>>> >> feature. For example, your sentence could be also "Microsoft posted its
> >> >> >> >> >> >> >> >>>> >> 2013 earnings. The Redmond's company made a huge profit". So, maybe
> >> >> >> >> >> >> >> >>>> >> including more contextual information from dbpedia could increase the
> >> >> >> >> >> >> >> >>>> >> recall but of course will reduce the precision.
> >> >> >> >> >> >> >> >>>> >>
> >> >> >> >> >> >> >> >>>> >> Cheers,
> >> >> >> >> >> >> >> >>>> >> Rafa
> >> >> >> >> >> >> >> >>>> >>
> >> >> >> >> >> >> >> >>>> >> On 04/02/14 09:50, Cristian Petroaca wrote:
> >> >> >> >> >> >> >> >>>> >>
> >> >> >> >> >> >> >> >>>> >>> Back with a more detailed description of the steps for making this kind
> >> >> >> >> >> >> >> >>>> >>> of coreference work.
> >> >> >> >> >> >> >> >>>> >>>
> >> >> >> >> >> >> >> >>>> >>> I will be using references to the following text in the steps below in
> >> >> >> >> >> >> >> >>>> >>> order to make things clearer : "Microsoft posted its 2013 earnings. The
> >> >> >> >> >> >> >> >>>> >>> software company made a huge profit."
> >> >> >> >> >> >> >> >>>> >>>
> >> >> >> >> >> >> >> >>>> >>> 1. For every noun phrase in the text which has :
> >> >> >> >> >> >> >> >>>> >>>      a. a determiner POS which implies reference to an entity local to
> >> >> >> >> >> >> >> >>>> >>> the text (such as "the, this, these") but not "another, every", etc.,
> >> >> >> >> >> >> >> >>>> >>> which imply a reference to an entity outside of the text.
> >> >> >> >> >> >> >> >>>> >>>      b. at least one other noun aside from the main required noun which
> >> >> >> >> >> >> >> >>>> >>> further describes it. For example I will not count "The company" as
> >> >> >> >> >> >> >> >>>> >>> being a legitimate candidate since this could create a lot of false
> >> >> >> >> >> >> >> >>>> >>> positives by considering the double meaning of some words such as "in
> >> >> >> >> >> >> >> >>>> >>> the company of good people".
> >> >> >> >> >> >> >> >>>> >>> "The software company" is a good candidate since we also have
> >> >> >> >> >> >> >> >>>> >>> "software".
> >> >> >> >> >> >> >> >>>> >>>
> >> >> >> >> >> >> >> >>>> >>> 2. match the nouns in the noun phrase to the contents of the dbpedia
> >> >> >> >> >> >> >> >>>> >>> categories of each named entity found prior to the location of the noun
> >> >> >> >> >> >> >> >>>> >>> phrase in the text.
> >> >> >> >> >> >> >> >>>> >>> The dbpedia categories are in the following format (for Microsoft for
> >> >> >> >> >> >> >> >>>> >>> example) : "Software companies of the United States".
> >> >> >> >> >> >> >> >>>> >>> So we try to match "software company" with that.
> >> >> >> >> >> >> >> >>>> >>> First, as you can see, the main noun in the dbpedia category has a
> >> >> >> >> >> >> >> >>>> >>> plural form and it's the same for all categories which I saw. I don't
> >> >> >> >> >> >> >> >>>> >>> know if there's an easier way to do this but I thought of applying a
> >> >> >> >> >> >> >> >>>> >>> lemmatizer on the category and the noun phrase in order for them to have
> >> >> >> >> >> >> >> >>>> >>> a common denominator. This also works if the noun phrase itself has a
> >> >> >> >> >> >> >> >>>> >>> plural form.
> >> >> >> >> >> >> >> >>>> >>>
> >> >> >> >> >> >> >> >>>> >>> Second, I'll need to use for comparison only the words in the category
> >> >> >> >> >> >> >> >>>> >>> which are themselves nouns and not prepositions or determiners such as
> >> >> >> >> >> >> >> >>>> >>> "of the". This means that I need to pos tag the categories' contents as
> >> >> >> >> >> >> >> >>>> >>> well.
> >> >> >> >> >> >> >> >>>> >>> I was thinking of running the pos and lemma on the dbpedia categories
> >> >> >> >> >> >> >> >>>> >>> when building the dbpedia backed entity hub and storing them for later
> >> >> >> >> >> >> >> >>>> >>> use - I don't know how feasible this is at the moment.
> >> >> >> >> >> >> >> >>>> >>>
> >> >> >> >> >> >> >> >>>> >>> After this I can compare each noun in the noun phrase with the
> >> >> >> >> >> >> >> >>>> >>> equivalent nouns in the categories and based on the number of matches I
> >> >> >> >> >> >> >> >>>> >>> can create a confidence level.
> >> >> >> >> >> >> >> >>>> >>>
> >> >> >> >> >> >> >> >>>> >>> 3. match the noun of the noun phrase with the
> >> >> >> >> >> >> >> >>>> >>> rdf:type
> >> >> >> >> from
> >> >> >> >> >> >> >> dbpedia
> >> >> >> >> >> >> >> >>>> of the
> >> >> >> >> >> >> >> >>>> >>> named entity. If this matches increase the
> >> >> confidence
> >> >> >> >> level.
> >> >> >> >> >> >> >> >>>> >>>
> >> >> >> >> >> >> >> >>>> >>> 4. If there are multiple named entities which
> >> can
> >> >> >> >> >> >> >> >>>> >>> match a
> >> >> >> >> >> >> certain
> >> >> >> >> >> >> >> >>>> noun
> >> >> >> >> >> >> >> >>>> >>> phrase then link the noun phrase with the
> >> closest
> >> >> >> >> >> >> >> >>>> >>> named
> >> >> >> >> >> entity
> >> >> >> >> >> >> >> prior
> >> >> >> >> >> >> >> >>>> to it
> >> >> >> >> >> >> >> >>>> >>> in the text.
> >> >> >> >> >> >> >> >>>> >>>
> >> >> >> >> >> >> >> >>>> >>> What do you think?
> >> >> >> >> >> >> >> >>>> >>>
> >> >> >> >> >> >> >> >>>> >>> Cristian
> >> >> >> >> >> >> >> >>>> >>>
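The heuristic in steps 2 to 4 above can be sketched roughly as follows. This is only an illustration in Python, not Stanbol code. The `lemma` function is a toy plural stripper standing in for a real lemmatizer, the stop-word list stands in for POS tagging, and the `Entity` class, the category strings, the 0.25 bonus for an rdf:type match and the 0.5 threshold are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Stand-in for POS tagging (assumption): drop words that are clearly not nouns.
STOP_WORDS = {"a", "an", "the", "of", "in", "and", "by", "for"}


def lemma(word):
    """Toy lemmatizer (assumption): lower-case and strip a plural ending."""
    w = word.lower()
    if w.endswith("ies") and len(w) > 4:
        return w[:-3] + "y"
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]
    return w


def content_lemmas(text):
    """Lemmas of the words in text that are not stop words."""
    return {lemma(t) for t in text.split() if t.lower() not in STOP_WORDS}


@dataclass
class Entity:
    """Hypothetical view of a linked named entity."""
    name: str
    offset: int        # position of the mention in the text
    rdf_types: list    # local names of rdf:type values, e.g. "Company"
    categories: list   # dbpedia category labels


def confidence(noun_phrase, entity):
    """Share of noun-phrase lemmas found in the categories (step 2),
    raised by an arbitrary bonus when the rdf:type matches (step 3)."""
    np_lemmas = content_lemmas(noun_phrase)
    if not np_lemmas:
        return 0.0
    cat_lemmas = set()
    for category in entity.categories:
        cat_lemmas |= content_lemmas(category)
    score = len(np_lemmas & cat_lemmas) / len(np_lemmas)
    if np_lemmas & {lemma(t) for t in entity.rdf_types}:
        score += 0.25
    return min(score, 1.0)


def resolve(noun_phrase, np_offset, entities, threshold=0.5):
    """Step 4: among matching entities mentioned before the noun phrase,
    pick the one closest to it. Returns None when nothing matches."""
    candidates = [e for e in entities
                  if e.offset < np_offset
                  and confidence(noun_phrase, e) >= threshold]
    return max(candidates, key=lambda e: e.offset, default=None)


# Made-up data for illustration only.
apple = Entity("Apple", 0, ["Company"],
               ["Software companies of the United States"])
cupertino = Entity("Cupertino", 20, ["City"],
                   ["Cities in Santa Clara County"])
match = resolve("The software company", 40, [apple, cupertino])
print(match.name if match else None)
```

With the made-up data above, the noun phrase resolves to the Apple entity because both of its nouns appear in the category label and the rdf:type also matches. A real implementation would take the lemmas and POS tags from the NLP chain rather than from these stand-ins.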
> >> >> >> >> >> >> >> >>>> >>> 2014-01-31 Cristian Petroaca <[email protected]>:
> >> >> >> >> >> >> >> >>>> >>>
> >> >> >> >> >> >> >> >>>> >>>> Hi Rafa,
> >> >> >> >> >> >> >> >>>> >>>>
> >> >> >> >> >> >> >> >>>> >>>> I don't yet have a concrete heuristic but I'm working on it.
> >> >> >> >> >> >> >> >>>> >>>> I'll provide it here so that you guys can give me feedback on
> >> >> >> >> >> >> >> >>>> >>>> it.
> >> >> >> >> >> >> >> >>>> >>>>
> >> >> >> >> >> >> >> >>>> >>>> What are "locality" features?
> >> >> >> >> >> >> >> >>>> >>>>
> >> >> >> >> >> >> >> >>>> >>>> I looked at BART and other coref tools such as ArkRef and
> >> >> >> >> >> >> >> >>>> >>>> CherryPicker and they don't provide such a coreference.
> >> >> >> >> >> >> >> >>>> >>>>
> >> >> >> >> >> >> >> >>>> >>>> Cristian
> >> >> >> >> >> >> >> >>>> >>>>
> >> >> >> >> >> >> >> >>>> >>>>
> >> >> >> >> >> >> >> >>>> >>>> 2014-01-30 Rafa Haro <[email protected]>:
> >> >> >> >> >> >> >> >>>> >>>>
> >> >> >> >> >> >> >> >>>> >>>>> Hi Cristian,
> >> >> >> >> >> >> >> >>>> >>>>>
> >> >> >> >> >> >> >> >>>> >>>>> Without having more details about your concrete heuristic, in
> >> >> >> >> >> >> >> >>>> >>>>> my honest opinion, such an approach could produce a lot of
> >> >> >> >> >> >> >> >>>> >>>>> false positives. I don't know if you are planning to use some
> >> >> >> >> >> >> >> >>>> >>>>> "locality" features to detect such coreferences but you need
> >> >> >> >> >> >> >> >>>> >>>>> to take into account that it is quite usual that coreferenced
> >> >> >> >> >> >> >> >>>> >>>>> mentions can occur even in different paragraphs. Although I'm
> >> >> >> >> >> >> >> >>>> >>>>> not an expert in Natural Language Understanding, I would say
> >> >> >> >> >> >> >> >>>> >>>>> it is quite difficult to get decent precision/recall rates for
> >> >> >> >> >> >> >> >>>> >>>>> coreferencing using fixed rules. Maybe you can try other tools
> >> >> >> >> >> >> >> >>>> >>>>> like BART (http://www.bart-coref.org/).
> >> >> >> >> >> >> >> >>>> >>>>>
> >> >> >> >> >> >> >> >>>> >>>>> Cheers,
> >> >> >> >> >> >> >> >>>> >>>>> Rafa Haro
> >> >> >> >> >> >> >> >>>> >>>>>
> >> >> >> >> >> >> >> >>>> >>>>> On 30/01/14 10:33, Cristian Petroaca wrote:
> >> >> >> >> >> >> >> >>>> >>>>>
> >> >> >> >> >> >> >> >>>> >>>>>> Hi,
> >> >> >> >> >> >> >> >>>> >>>>>>
> >> >> >> >> >> >> >> >>>> >>>>>> One of the necessary steps for implementing the Event
> >> >> >> >> >> >> >> >>>> >>>>>> extraction Engine feature
> >> >> >> >> >> >> >> >>>> >>>>>> (https://issues.apache.org/jira/browse/STANBOL-1121) is to
> >> >> >> >> >> >> >> >>>> >>>>>> have coreference resolution in the given text. This is
> >> >> >> >> >> >> >> >>>> >>>>>> provided now via the stanford-nlp project but as far as I saw
> >> >> >> >> >> >> >> >>>> >>>>>> this module is performing mostly pronominal (He, She) or
> >> >> >> >> >> >> >> >>>> >>>>>> nominal (Barack Obama and Mr. Obama) coreference resolution.
> >> >> >> >> >> >> >> >>>> >>>>>>
> >> >> >> >> >> >> >> >>>> >>>>>> In order to get more coreferences from the text I thought of
> >> >> >> >> >> >> >> >>>> >>>>>> creating some logic that would detect this kind of
> >> >> >> >> >> >> >> >>>> >>>>>> coreference:
> >> >> >> >> >> >> >> >>>> >>>>>> "Apple reaches new profit heights. The software company just
> >> >> >> >> >> >> >> >>>> >>>>>> announced its 2013 earnings."
> >> >> >> >> >> >> >> >>>> >>>>>> Here "The software company" obviously refers to "Apple".
> >> >> >> >> >> >> >> >>>> >>>>>> So I'd like to detect coreferences of Named Entities which
> >> >> >> >> >> >> >> >>>> >>>>>> are of the rdf:type of the Named Entity, in this case
> >> >> >> >> >> >> >> >>>> >>>>>> "company", and also have attributes which can be found in the
> >> >> >> >> >> >> >> >>>> >>>>>> dbpedia categories of the named entity, in this case
> >> >> >> >> >> >> >> >>>> >>>>>> "software".
> >> >> >> >> >> >> >> >>>> >>>>>>
> >> >> >> >> >> >> >> >>>> >>>>>> The detection of coreferences such as "The software company"
> >> >> >> >> >> >> >> >>>> >>>>>> in the text would also be done by either using the new Pos
> >> >> >> >> >> >> >> >>>> >>>>>> Tag Based Phrase extraction Engine (noun phrases) or by using
> >> >> >> >> >> >> >> >>>> >>>>>> a dependency tree of the sentence and picking up only
> >> >> >> >> >> >> >> >>>> >>>>>> subjects or objects.
> >> >> >> >> >> >> >> >>>> >>>>>>
> >> >> >> >> >> >> >> >>>> >>>>>> At this point I'd like to know if this kind of logic would be
> >> >> >> >> >> >> >> >>>> >>>>>> useful as a separate Enhancement Engine (in case the
> >> >> >> >> >> >> >> >>>> >>>>>> precision and recall are good enough) in Stanbol?
> >> >> >> >> >> >> >> >>>> >>>>>>
> >> >> >> >> >> >> >> >>>> >>>>>> Thanks,
> >> >> >> >> >> >> >> >>>> >>>>>> Cristian
> >> >> >> >> >> >> >> >>>> >>>>>>
> >> >> >> >> >> >> >> >>>> >>>>>>
> >> >> >> >> >> >> >> >>>> >>>>>>
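The noun-phrase detection mentioned in the original proposal could start from something like the sketch below. It is a plain Python illustration, not the Stanbol phrase extraction engine: it assumes the sentence has already been tokenized and tagged with Penn Treebank POS tags (the tags here are assigned by hand), and the function name `noun_phrases` is made up for the example.

```python
def noun_phrases(tagged_tokens):
    """Collect runs of optional determiner/adjectives followed by nouns.

    tagged_tokens is a list of (word, tag) pairs using Penn Treebank tags.
    """
    phrases = []
    current = []
    has_noun = False
    for word, tag in tagged_tokens:
        if tag.startswith("NN"):
            # Nouns (NN, NNS, NNP, ...) extend the current phrase.
            current.append(word)
            has_noun = True
        elif tag in ("DT", "JJ") and not has_noun:
            # Determiners and adjectives may only precede the nouns.
            current.append(word)
        else:
            # Anything else closes the phrase, if it contained a noun.
            if has_noun:
                phrases.append(" ".join(current))
            current = [word] if tag in ("DT", "JJ") else []
            has_noun = False
    if has_noun:
        phrases.append(" ".join(current))
    return phrases


# Hand-tagged version of the second example sentence.
sentence = [("The", "DT"), ("software", "NN"), ("company", "NN"),
            ("just", "RB"), ("announced", "VBD"), ("its", "PRP$"),
            ("2013", "CD"), ("earnings", "NNS"), (".", ".")]
print(noun_phrases(sentence))
```

Each phrase found this way would then be a candidate for matching against the named entities that precede it in the text. Restricting candidates to subjects and objects, as suggested for the dependency tree variant, would need a parser and is not covered here.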
> >> >> >> >> >> >> >> >>>> >>
> >> >> >> >> >> >> >> >>>>
> >> >> >> >> >> >> >> >>>>
> >> >> >> >> >> >> >> >>>>
> >> >> >> >> >> >> >> >>>> --
> >> >> >> >> >> >> >> >>>> | Rupert Westenthaler             [email protected]
> >> >> >> >> >> >> >> >>>> | Bodenlehenstraße 11                             ++43-699-11108907
> >> >> >> >> >> >> >> >>>> | A-5500 Bischofshofen
