I am stuck on integrating external jars into the Stanbol dev environment.
I have two jar files: langdetect.jar for language detection, and
paoding.jar.
I wrote a small enhancer engine (just a replica of the existing langid
engine, for the sake of my learning). It compiles successfully with mvn, but
the tests inside the engine fail when they try to use a class from
langdetect.jar. I tried various techniques in the pom.xml file of the langid
engine.

    <dependency>
      <groupId>com.adobe.g11n</groupId>
      <artifactId>langdetect</artifactId>
      <version>1.0</version>
    </dependency>

I made sure that my local .m2 repo has this jar file by using the
mvn install:install-file command.

I also tried the system scope:

    <dependency>
      <groupId>com.adobe.g11n</groupId>
      <artifactId>langdetect</artifactId>
      <version>1.0</version>
      <scope>system</scope>
      <systemPath>${basedir}/src/lib/langdetect.jar</systemPath>
    </dependency>
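
One way to tell whether either declaration actually puts langdetect.jar on
the test classpath is to ask the JVM where it loads a class from. The
following is only a minimal sketch and not part of Stanbol or langdetect:
ClasspathCheck and locate are made-up names, and
com.adobe.g11n.langdetect.Detector is a placeholder for whichever class the
failing test really needs.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    /**
     * Returns the location (jar or directory) a class is loaded from,
     * or null if the class cannot be loaded with the current classpath.
     */
    static String locate(String className) {
        try {
            // initialize=false: only resolve the class, do not run static initializers
            Class<?> c = Class.forName(className, false,
                    ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes loaded by the bootstrap loader have no code source
            return src == null ? "(JDK runtime)" : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // ASSUMPTION: placeholder class name; pass the real one as an argument
        String name = args.length > 0
                ? args[0]
                : "com.adobe.g11n.langdetect.Detector";
        String where = locate(name);
        System.out.println(where == null
                ? name + " is NOT on the classpath"
                : name + " loaded from " + where);
    }
}
```

If locate returns null when called from the failing test, that would suggest
the problem is in the dependency declaration rather than in the engine code.
If it returns a path, the path shows which copy of the jar is being picked up.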

I do see an artifact-related warning:

Trying to get manifest from artifact
/Users/harishs/.m2/repository/org/slf4j/slf4j-api/1.6.1/slf4j-api-1.6.1.jar
[DEBUG] Artifact has no service component entry in manifest
/Users/harishs/.m2/repository/org/slf4j/slf4j-api/1.6.1/slf4j-api-1.6.1.jar
[DEBUG] Trying to get scrinfo from artifact
/Users/harishs/.m2/repository/org/slf4j/slf4j-api/1.6.1/slf4j-api-1.6.1.jar
[DEBUG] Artifact has no scrinfo file (it's optional):
/Users/harishs/.m2/repository/org/slf4j/slf4j-api/1.6.1/slf4j-api-1.6.1.jar
[DEBUG] Trying to get manifest from artifact /Users/harishs/langdetect.jar
[DEBUG] Artifact has no service component entry in manifest
/Users/harishs/langdetect.jar
[DEBUG] Trying to get scrinfo from artifact /Users/harishs/langdetect.jar
[DEBUG] Artifact has no scrinfo file (it's optional):
/Users/harishs/langdetect.jar

Any clues would be great.

-harish


On Sun, Jul 29, 2012 at 6:04 AM, harish suvarna <[email protected]> wrote:

> I really appreciate your help and thank you very much Rupert.
>
> All the steps helped me and I am up and debugging in Eclipse.
>
> Thanks,
> Harish
>
>
> On Thu, Jul 26, 2012 at 10:30 PM, Rupert Westenthaler <
> [email protected]> wrote:
>
>> Hi,
>>
>> I do use Eclipse and I usually do not care about classpath related
>> build problems in Eclipse as long as code suggestions do still work.
>>
>> If I have problems
>>
>> 1.  mvn eclipse:clean eclipse:eclipse
>> 2. refreshing all projects in eclipse
>> 3. full project > clean
>>
>> usually solves those problems. NOTE that only calling "mvn
>> eclipse:eclipse" may not solve problems, as it only adds new stuff to
>> the project files but does not remove old entries. Note that I do prefer
>> to NOT use any Eclipse maven plugin as I had bad experiences with
>> those. However, those cases were about two years ago, so such tools
>> might have improved in the meantime.
>>
>> For Debugging I do use Eclipse:
>>
>> Unit tests work fine within eclipse. If I want to debug a component
>> within a Stanbol Server I do the following
>>
>> 1. Start the Stanbol Server in debug mode
>>
>>     java -Xmx1024m -XX:MaxPermSize=256m \
>>         -Xdebug -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=n \
>>         -jar org.apache.stanbol.launchers.full-0.10.0-incubating-SNAPSHOT.jar
>>
>> 2. connect Eclipse to the Stanbol Server:
>>     * Debug Configurations > Remote Java Application >> create new
>>     * Socket Attach
>>     * Host: localhost and Port as specified with address (8787 in the
>>       example above)
>>
>> 3. using the sling installer maven plugin to install/update the module
>> with the component I am working on
>>
>>     mvn clean install -PinstallBundle \
>>         -Dsling.url=http://localhost:8080/system/console
>>
>>     * Make sure to "disconnect" the debugger before calling this, as
>>       the debugging might interfere with the update process of the module
>>
>> hope this helps
>> best
>> Rupert
>>
>> On Fri, Jul 27, 2012 at 3:56 AM, harish suvarna <[email protected]>
>> wrote:
>> > Hi,
>> > I am trying to add Chinese language processing using some opensource
>> > segmenters. I had some communication with Rupert. I am attaching
>> > Rupert's suggestions. This way I may get some more suggestions and help,
>> > and Rupert's ideas get distributed to all.
>> >
>> > I am also following Anuj's blog to learn about Stanbol content
>> > enhancement engine development.
>> >
>> > I can successfully build Stanbol and play with the default chain.
>> >
>> > I am trying to create the eclipse project now. mvn eclipse:eclipse was
>> > successful too. Then I imported the stanbol directory into eclipse
>> > workspace.
>> > In eclipse certain Stanbol projects are in red.
>> >
>> > Description    Resource    Path    Location    Type
>> > The project cannot be built until its prerequisite
>> > org.apache.stanbol.enhancer.servicesapi is built. Cleaning and building
>> > all projects is recommended    org.apache.stanbol.enhancer.ldpath
>> > Unknown    Java Problem
>> > The project cannot be built until its prerequisite
>> > org.apache.stanbol.entityhub.indexing.core is built. Cleaning and building
>> > all projects is recommended
>> > org.apache.stanbol.entityhub.indexing.destination.solryard
>> > Unknown    Java Problem
>> > The project cannot be built until its prerequisite
>> > org.apache.stanbol.entityhub.core is built. Cleaning and building all
>> > projects is recommended    org.apache.stanbol.entityhub.query.clerezza
>> > Unknown    Java Problem
>> > The project cannot be built until its prerequisite
>> > org.apache.stanbol.entityhub.core is built. Cleaning and building all
>> > projects is recommended    org.apache.stanbol.entityhub.ldpath
>> > Unknown    Java Problem
>> > The project cannot be built until its prerequisite
>> > org.apache.stanbol.enhancer.servicesapi is built. Cleaning and building
>> > all projects is recommended    org.apache.stanbol.enhancer.rdfentities
>> > Unknown    Java Problem
>> > The project cannot be built until its prerequisite
>> > org.apache.stanbol.enhancer.servicesapi is built. Cleaning and building
>> > all projects is recommended    org.apache.stanbol.enhancer.test
>> > Unknown    Java Problem
>> > The project cannot be built until its prerequisite
>> > org.apache.stanbol.entityhub.core is built. Cleaning and building all
>> > projects is recommended    org.apache.stanbol.entityhub.site.managed
>> > Unknown    Java Problem
>> > ....
>> > ...
>> >
>> > Are any extra steps needed?
>> > Should I try to build and debug inside eclipse, or build using mvn and
>> > debug in eclipse? What do developers commonly do?
>> >
>> > -harish
>> >
>> >
>> >
>> > ======================== Previous communication ========================
>> > Hi,
>> >
>> > There are no NER (Named Entity Recognition) models for Chinese text
>> > available via OpenNLP. So the default configuration of Stanbol will
>> > not process Chinese text. What you can do is to configure a
>> > KeywordLinking Engine for Chinese text, as this engine can also process
>> > text in unknown languages (see [1] for details).
>> >
>> > However, the KeywordLinking Engine also requires at least a tokenizer
>> > for looking up words. As there is no specific OpenNLP Tokenizer for
>> > Chinese text, it will use the default one that uses a fixed set of
>> > chars to split words (white spaces, hyphens ...). You may know better how
>> > well this would work with Chinese texts. My assumption would be that
>> > it is not sufficient - so results will be sub-optimal.
>> >
>> > To apply Chinese optimization I see three possibilities:
>> >
>> > 1. add support for Chinese to OpenNLP (Tokenizer, Sentence detection,
>> > POS tagging, Named Entity Detection)
>> > 2. allow the KeywordLinkingEngine to use other already available tools
>> > for text processing (e.g. stuff that is already available for
>> > Solr/Lucene [2] or the paoding Chinese segmenter referenced in your
>> > mail). Currently the KeywordLinkingEngine is hardwired with OpenNLP,
>> > because representing Tokens, POS ... as RDF would be too much of an
>> > overhead.
>> > 3. implement a new EnhancementEngine for processing Chinese text.
>> >
>> > Hope this helps to get you started.
>> >
>> > best
>> > Rupert
>> >
>> > [1] http://incubator.apache.org/stanbol/docs/trunk/multilingual.html
>> > [2] http://wiki.apache.org/solr/LanguageAnalysis#Chinese.2C_Japanese.2C_Korean
>> > harish suvarna
>> > 6:33 PM (22 minutes ago)
>> >
>> > to Rupert
>> > Thanks a lot Rupert.
>> >
>> > I am weighing options 2 and 3. What is the difference? Option 2
>> > sounds like enhancing KeywordLinkingEngine to deal with Chinese text. It
>> > may mean paoding is hardcoded into KeywordLinkingEngine. Option 3 is
>> > like a separate engine. But will I be able to use the Stanbol dbpedia
>> > lookup using option 3?
>> >
>> > Btw, I created my own enhancement engine chains and I could see them
>> > yesterday in localhost:8080. But today all of them have vanished and
>> > only the default chain shows up. Can I dig them up somewhere in the
>> > stanbol directory?
>> >
>> > -harish
>> >
>> > I just created the eclipse project
>>
>>
>>
>> --
>> | Rupert Westenthaler             [email protected]
>> | Bodenlehenstraße 11                             ++43-699-11108907
>> | A-5500 Bischofshofen
>>
>
>
