I really appreciate your help and thank you very much Rupert.

All the steps helped me and I am up and debugging in Eclipse.

Thanks,
Harish

On Thu, Jul 26, 2012 at 10:30 PM, Rupert Westenthaler <
[email protected]> wrote:

> Hi,
>
> I do use Eclipse, and I usually do not care about classpath-related
> build problems in Eclipse as long as code suggestions still work.
>
> If I have problems
>
> 1. mvn eclipse:clean eclipse:eclipse
> 2. refreshing all projects in Eclipse
> 3. full project > clean
>
> usually solves those problems. NOTE that calling only "mvn
> eclipse:eclipse" may not solve problems, as it only adds new stuff to
> the project files but does not remove old ones. Note that I prefer
> NOT to use any Eclipse Maven plugin, as I had bad experiences with
> those. However, those cases were about two years ago, so such tools
> might have improved in the meantime.
>
> For Debugging I do use Eclipse:
>
> Unit tests work fine within Eclipse. If I want to debug a component
> within a Stanbol Server, I do the following:
>
> 1. Start the Stanbol Server in debug mode
>
>     java -Xmx1024m -XX:MaxPermSize=256m \
>         -Xdebug -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=n \
>         -jar org.apache.stanbol.launchers.full-0.10.0-incubating-SNAPSHOT.jar
>
> 2. Connect Eclipse to the Stanbol Server:
>     * Debug Configurations > Remote Java Application >> create new
>     * Socket Attach
>     * Host: localhost and Port as specified with address (8787 in the
>       example above)
>
> 3. Use the Sling installer Maven plugin to install/update the module
> with the component I am working on
>
>     mvn clean install -PinstallBundle \
>         -Dsling.url=http://localhost:8080/system/console
>
>     * Make sure to "disconnect" the debugger before calling this, as
>       the debugging might interfere with the update process of the module
> hope this helps
> best
> Rupert
>
> On Fri, Jul 27, 2012 at 3:56 AM, harish suvarna <[email protected]>
> wrote:
> > Hi,
> > I am trying to add Chinese language processing using some open-source
> > segmenters. I had some communication with Rupert. I am attaching Rupert's
> > suggestions. This way I may get some more suggestions and help, and
> > Rupert's ideas get distributed to all.
> >
> > I am also following Anuj's blog to learn about Stanbol content
> > enhancement engine development.
> >
> > I can successfully build Stanbol and play with the default chain.
> >
> > I am trying to create the eclipse project now. mvn eclipse:eclipse was
> > successful too. Then I imported the stanbol directory into eclipse
> > workspace.
> > In eclipse certain Stanbol projects are in red.
> >
> > Description    Resource    Path    Location    Type
> > The project cannot be built until its prerequisite
> > org.apache.stanbol.enhancer.servicesapi is built. Cleaning and building
> > all projects is recommended    org.apache.stanbol.enhancer.ldpath
> > Unknown    Java Problem
> > The project cannot be built until its prerequisite
> > org.apache.stanbol.entityhub.indexing.core is built. Cleaning and
> > building all projects is recommended
> > org.apache.stanbol.entityhub.indexing.destination.solryard
> > Unknown    Java Problem
> > The project cannot be built until its prerequisite
> > org.apache.stanbol.entityhub.core is built. Cleaning and building all
> > projects is recommended    org.apache.stanbol.entityhub.query.clerezza
> > Unknown    Java Problem
> > The project cannot be built until its prerequisite
> > org.apache.stanbol.entityhub.core is built. Cleaning and building all
> > projects is recommended    org.apache.stanbol.entityhub.ldpath
> > Unknown    Java Problem
> > The project cannot be built until its prerequisite
> > org.apache.stanbol.enhancer.servicesapi is built. Cleaning and building
> > all projects is recommended    org.apache.stanbol.enhancer.rdfentities
> > Unknown    Java Problem
> > The project cannot be built until its prerequisite
> > org.apache.stanbol.enhancer.servicesapi is built. Cleaning and building
> > all projects is recommended    org.apache.stanbol.enhancer.test
> > Unknown    Java Problem
> > The project cannot be built until its prerequisite
> > org.apache.stanbol.entityhub.core is built. Cleaning and building all
> > projects is recommended    org.apache.stanbol.entityhub.site.managed
> > Unknown    Java Problem
> > ....
> > ...
> >
> > Are any extra steps needed?
> > Should I try to build and debug inside Eclipse, or build using mvn and
> > debug in Eclipse? What do developers commonly do?
> >
> > -harish
> >
> >
> >
> > ==================== Previous communication ====================
> > Hi,
> >
> > There are no NER (Named Entity Recognition) models for Chinese text
> > available via OpenNLP. So the default configuration of Stanbol will
> > not process Chinese text. What you can do is to configure a
> > KeywordLinking Engine for Chinese text, as this engine can also process
> > text in unknown languages (see [1] for details).
> >
> > However, the KeywordLinking Engine also requires at least a tokenizer
> > for looking up words. As there is no specific OpenNLP Tokenizer for
> > Chinese text, it will use the default one, which uses a fixed set of
> > chars to split words (white spaces, hyphens ...). You may know better
> > how well this would work with Chinese texts. My assumption would be
> > that it is not sufficient - so results will be sub-optimal.
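
To illustrate the tokenizer point above, here is a minimal shell sketch. It
assumes the default tokenizer behaves roughly like a split on spaces, tabs
and hyphens; the exact character set used by Stanbol/OpenNLP is not stated
in this thread. The helper names split_tokens and count_tokens are made up
for this example and are not part of Stanbol or OpenNLP.

```shell
# Sketch only: split text on a fixed set of separator characters
# (space, tab, hyphen). This approximates a simple default tokenizer;
# it is NOT the actual Stanbol/OpenNLP implementation.
split_tokens() {
    # Replace every separator with a newline and squeeze repeated newlines.
    printf '%s\n' "$1" | tr -s ' \t-' '\n'
}

# Hypothetical helper: count the tokens produced by split_tokens.
count_tokens() {
    split_tokens "$1" | wc -l | tr -d ' '
}

count_tokens "Paris is the capital of France"   # prints 6
count_tokens "巴黎是法国的首都"                  # prints 1
```

The Chinese sentence contains none of the separator characters, so it comes
back as one single token. That is why word lookup would need a dedicated
segmenter for Chinese text.
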
> >
> > To apply Chinese optimization I see three possibilities:
> >
> > 1. add support for Chinese to OpenNLP (Tokenizer, Sentence detection,
> > POS tagging, Named Entity Detection)
> > 2. allow the KeywordLinkingEngine to use other already available tools
> > for text processing (e.g. stuff that is already available for
> > Solr/Lucene [2] or the paoding Chinese segmenter referenced in your
> > mail). Currently the KeywordLinkingEngine is hardwired with OpenNLP,
> > because representing Tokens, POS ... as RDF would be too much of an
> > overhead.
> > 3. implement a new EnhancementEngine for processing Chinese text.
> >
> > Hope this helps to get you started.
> >
> > best
> > Rupert
> >
> > [1] http://incubator.apache.org/stanbol/docs/trunk/multilingual.html
> > [2] http://wiki.apache.org/solr/LanguageAnalysis#Chinese.2C_Japanese.2C_Korean
> > harish suvarna
> > 6:33 PM (22 minutes ago)
> >
> > to Rupert
> > Thanks a lot Rupert.
> >
> > I am weighing between options 2 and 3. What is the difference? Option 2
> > sounds like enhancing KeywordLinkingEngine to deal with Chinese text. It
> > may be like paoding is hardcoded into KeywordLinkingEngine. Option 3 is
> > like a separate engine. But will I be able to use the Stanbol dbpedia
> > lookup using option 3?
> >
> > Btw, I created my own enhancement engine chains and I could see them
> > yesterday in localhost:8080. But today all of them have vanished and only
> > the default chain shows up. Can I dig them up somewhere in the stanbol
> > directory?
> >
> > -harish
> >
> > I just created the eclipse project
>
>
>
> --
> | Rupert Westenthaler             [email protected]
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen
>
