[
https://issues.apache.org/jira/browse/STANBOL-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13795000#comment-13795000
]
Rupert Westenthaler edited comment on STANBOL-1132 at 10/15/13 8:36 AM:
------------------------------------------------------------------------
Hi Cristian
Had a look at you patch and IMO it is very good quality. Because of this and
because the patch only contains new features I decided to commit it directly to
trunk. This should make further work more simple (no merges needed, you can use
the default Stanbol Launcher for testing). However if you would prefer to
continue in a branch just let me know and I will also apply the patch to the
branch.
In the following a list of things I noticed while working through your patch:
1. Can you adapt the implementation of GrammaticalRelation to represent the
hierarchy similar to the Pos enum. Using a static {} block to build up the
transitive closure over parents instead of doing it in the #addParents()
method. The getParents() method would then use the static
EnumMap<GrammaticalRelation,EnumSet<GrammaticalRelation>>(GrammaticalRelation.class)
instead of a local HashMap. I would like to use EnumSet/-Map implementations
as they provide much better performance in contrast to HashSet/-Map
implementations.
2. Annotations should be immutable. So I would like to remove all add** set**
... methods from CorefTag, DependencyFeature and GramaticalReleationTag. In
addition Getter methods should return immutable wrappers. The best would be to
use Collections.unmodifiableSet(..) when setting fields in the Constructor. I
added according comments to the CorefTag
3. IMO DependencyFeatures is obsolete. That's because the Annotated interface
already supports multiple values and DependencyFeatures is just a collection of
multiple DependencyRelations. So instead of having a DependencyFeatures class I
would suggest to just add multiple DependencyRelations values. In detail I
suggest to
* rename DependencyRelation to DependencyRelationTag.
* DependencyRelationTag extends Tag<DependencyRelationTag>
* Change NlpAnnotations#DEPENDENCY_ANNOTATION to new
Annotation<DependencyFeatures>(
"stanbol.enhancer.nlp.dependency",
DependencyRelationTag.class);
* Simple add multiple DependencyRelationTag Values to a token. This has
also the advantage the iterations over those are sorted by the confidence of
those.
* simple add a new DependencyRelationTag
Again thanks for the patch. It is great to getting Coref and dependendy tree
support to the Stanbol NLP module!
was (Author: rwesten):
Hi Cristian
Had a look at you patch and IMO it is very good quality. Because of this and
because the patch only contains new features I decided to commit it directly to
trunk. This should make further work more simple (no merges needed, you can use
the default Stanbol Launcher for testing). However if you would prefer to
continue in a branch just let me know and I will also apply the patch to the
branch.
In the following a list of things I noticed while working through your patch:
1 Can you adapt the implementation of GrammaticalRelation to represent the
hierarchy similar to the Pos enum. Using a static {} block to build up the
transitive closure over parents instead of doing it in the #addParents()
method. The getParents() method would then use the static
EnumMap<GrammaticalRelation,EnumSet<GrammaticalRelation>>(GrammaticalRelation.class)
instead of a local HashMap. I would like to use EnumSet/-Map implementations
as they provide much better performance in contrast to HashSet/-Map
implementations.
2. Annotations should be immutable. So I would like to remove all add** set**
... methods from CorefTag, DependencyFeature and GramaticalReleationTag. In
addition Getter methods should return immutable wrappers. The best would be to
use Collections.unmodifiableSet(..) when setting fields in the Constructor. I
added according comments to the CorefTag
3. IMO DependencyFeatures is obsolete. That's because the Annotated interface
already supports multiple values and DependencyFeatures is just a collection of
multiple DependencyRelations. So instead of having a DependencyFeatures class I
would suggest to just add multiple DependencyRelations values. In detail I
suggest to
* rename DependencyRelation to DependencyRelationTag.
* DependencyRelationTag extends Tag<DependencyRelationTag>
* Change NlpAnnotations#DEPENDENCY_ANNOTATION to new
Annotation<DependencyFeatures>(
"stanbol.enhancer.nlp.dependency",
DependencyRelationTag.class);
* Simple add multiple DependencyRelationTag Values to a token. This has
also the advantage the iterations over those are sorted by the confidence of
those.
* simple add a new DependencyRelationTag
Again thanks for the patch. It is great to getting Coref and dependendy tree
support to the Stanbol NLP module!
> Add co-reference resolution and dependency tree support in the Stanbol NLP
> processing API
> -----------------------------------------------------------------------------------------
>
> Key: STANBOL-1132
> URL: https://issues.apache.org/jira/browse/STANBOL-1132
> Project: Stanbol
> Issue Type: New Feature
> Components: Enhancement Engines
> Reporter: Cristian Petroaca
> Assignee: Rupert Westenthaler
> Labels: co-reference, dependency-tree, nlp
> Attachments: coref_dependency_tree_datamodel.patch
>
>
> Extend the Stanbol NLP Processing API with annotations for co-reference
> resolution and dependency trees.
> Also, add support for JSON Serialisation/Parsing for the co-reference and
> dependency tree annotations so that the RESTful NLP Analysis Service can
> provide co-reference information.
--
This message was sent by Atlassian JIRA
(v6.1#6144)