Hi Rupert,
One question, I can't find this bundle:

   - "Apache Stanbol Data: DBpedia.org defaultdata version
   (org.apache.stanbol.data.sites.dbpedia.default)"


But I see this one:

   - "Apache Stanbol Default Data (org.apache.stanbol.defaultdata)"

Are you referring to the latter?

BR
David

On Thu, Jul 21, 2011 at 7:10 AM, Rupert Westenthaler <
[email protected]> wrote:

> Hi David, all
>
> With the changes from yesterday (revision r1148947 and r1148948) it is
> now easily possible to deactivate the default configuration for
> dbPedia provided by the Stanbol launcher and to replace it with the
> one that uses the remote services with a local cache.
>
> Steps:
>
> 1. use the current launcher
> 2. go to the Bundle tab of the Apache Felix Webconsole
> 3. stop the Bundle "Apache Stanbol Data: DBpedia.org defaultdata
> version (org.apache.stanbol.data.sites.dbpedia.default)"
> 4. install and start the Bundle "Apache Stanbol Data: Remote
> DBpedia.org with local cache
> (org.apache.stanbol.data.sites.dbpedia.cached)". You can find this
> bundle in "{stanbol-trunk}/data/sites/dbpediacached".
>
> best
> Rupert
>
> On Mon, Jul 18, 2011 at 1:15 PM, Rupert Westenthaler
> <[email protected]> wrote:
> > Hi
> >
> > Rather than working on the Workaround I decided to invest some time in
> > finishing STANBOL-140 and implementing STANBOL-287.
> > Together with the proposal made in [1] to split up the default data in
> > several bundles this should solve the issues described/discussed here.
> >
> > best
> > Rupert
> >
> > [1] http://markmail.org/message/bf7qurmzos45h23b
> >
> > On Thu, Jul 14, 2011 at 8:34 AM, Rupert Westenthaler
> > <[email protected]> wrote:
> >> On Thu, Jul 14, 2011 at 8:30 AM, David Riccitelli
> >> <[email protected]> wrote:
> >>> Thanks Rupert,
> >>>
> >>> A description on how to do this is available in [1].
> >>>
> >>>
> >>> I can't see the [1] :-)
> >>
> >> does this count as missing attachment? ^^
> >>
> >> [1] http://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/yard/solr/src/main/resources/solr/core/
> >>
> >>>
> >>> David
> >>>
> >>> On Thu, Jul 14, 2011 at 8:56 AM, Rupert Westenthaler <
> >>> [email protected]> wrote:
> >>>
> >>>> Hi
> >>>>
> >>>> Yes this is possible, but would need (depending on the hardware) quite
> >>>> some time.
> >>>> A description on how to do this is available in [1].
> >>>>
> >>>> Instead of installing the dbpedia.solrindex.zip file as described in
> >>>> the readme, you could directly
> >>>>
> >>>> * shutdown stanbol
> >>>> * delete the "dbpedia_43k" index in
> >>>> "{stanbol-root}/sling/entityhub/solrYard/indexes"
> >>>> * copy the index located in the
> >>>> "{indexing-root}/indexing/destination/indexes" to
> >>>> "{stanbol-root}/sling/entityhub/solrYard/indexes" and rename it to
> >>>> "dbpedia_43k"
> >>>> * restart stanbol.
> >>>>
> >>>> After that Stanbol should use the new index.
> >>>>
> >>>> Copying the "dbpedia.solrindex.zip" to the datafiles directory and
> >>>> then changing the value of "Solr Index/Core" in the configuration of
> >>>> the SolrYard for dbPedia from "dbpedia_43k" to "dbpedia" should also
> >>>> work.
> >>>>
> >>>> best
> >>>> Rupert
> >>>>
> >>>> On Wed, Jul 13, 2011 at 11:58 AM, David Riccitelli
> >>>> <[email protected]> wrote:
> >>>> > Hi,
> >>>> >
> >>>> > As another workaround, I was thinking that I could actually generate
> >>>> > the DBpedia index locally with all the data using the dumps
> >>>> > (http://wiki.dbpedia.org/Downloads36), in a way similar to the
> >>>> > dbpedia_43k.
> >>>> >
> >>>> > What do you think?
> >>>> >
> >>>> > Thanks,
> >>>> > David
> >>>> >
> >>>> > On Wed, Jul 13, 2011 at 12:11 PM, Rupert Westenthaler <
> >>>> > [email protected]> wrote:
> >>>> >
> >>>> >> Hi
> >>>> >>
> >>>> >> I will try to find some time in the evening to reproduce this.
> >>>> >>
> >>>> >> On Wed, Jul 13, 2011 at 8:57 AM, David Riccitelli
> >>>> >> <[email protected]> wrote:
> >>>> >> > Thanks Rupert,
> >>>> >> >
> >>>> >> > I'm trying to follow your instructions but I encounter a couple of
> >>>> >> > issues (probably due to inexperience):
> >>>> >> >  [1] when dropping the config files, they enter some loop of
> >>>> >> > REGISTERING/UNREGISTERING (which I solve by stopping the
> >>>> >> > FileInstall bundle), is that normal?
> >>>> >>
> >>>> >> This is very strange and should not be caused by the FileInstaller.
> >>>> >> Maybe there is some loop between the Sling Installer (trying to
> >>>> >> install the default configuration) and the FileInstaller that may
> >>>> >> cause this under some circumstances.
> >>>> >>
> >>>> >> >  [2] after I restart Stanbol, and try to query an entity from the
> >>>> >> > entityhub I receive the following error:
> >>>> >> >
> >>>> >> > 13.07.2011 09:54:17.939 *WARN* [509017110@qtp-1586831707-0]
> >>>> >> > org.apache.felix.http.jetty /entityhub/sites/entity/
> >>>> >> > (java.lang.IllegalStateException: Unable to initialize the Cache with
> >>>> >> > Yard dbpediaCache! This is usually caused by Errors while reading the
> >>>> >> > Cache Configuration from the Yard.) java.lang.IllegalStateException:
> >>>> >> > Unable to initialize the Cache with Yard dbpediaCache! This is usually
> >>>> >> > caused by Errors while reading the Cache Configuration from the Yard.
> >>>> >> > at org.apache.stanbol.entityhub.core.site.CacheImpl.getCacheYard(CacheImpl.java:214)
> >>>> >> >
> >>>> >> >
> >>>> >> > Do I need to initialize the Cache in some way?
> >>>> >> >
> >>>> >> No, you do not. Prepared indexes include a document that
> >>>> >> provides a list of the indexed fields. In the future this may be used
> >>>> >> to determine whether a query can be successfully executed on the local
> >>>> >> index or not. In addition, this is used in case an Entity within the
> >>>> >> index is updated with a newer version.
> >>>> >> However, this configuration is optional. This Exception should only
> >>>> >> appear if the document is present but illegally formatted, and the
> >>>> >> SolrYard initialized for the dbpediaCache should be empty.
> >>>> >>
> >>>> >> Therefore I think it is somehow related to the above problem of
> >>>> >> overriding configurations.
> >>>> >>
> >>>> >> In general, the way the default configuration is loaded is
> >>>> >> sub-optimal at the moment. In particular, using a single defaultdata
> >>>> >> bundle for both the OpenNLP models and the dbpedia configuration +
> >>>> >> default index was not a good idea, because one cannot exclude/change
> >>>> >> the dbpedia stuff without affecting other components that depend on
> >>>> >> OpenNLP.
> >>>> >> Therefore I think we need to discuss how to better structure the
> >>>> >> configurations and data needed to run stanbol.
> >>>> >>
> >>>> >> There is also another issue: the SolrYard copies provided indexes
> >>>> >> only once and does not check for updates. This would make it hard
> >>>> >> to upgrade from the small index provided with the default data
> >>>> >> to a bigger version.
> >>>> >>
> >>>> >> Both of these things are related to the problems and need to be
> >>>> >> addressed before the first stanbol release. Independent of those, I
> >>>> >> will try to find a simple solution for what you intend to do.
> >>>> >>
> >>>> >> In the meantime I suggest you go for the initially proposed
> >>>> >> workaround.
> >>>> >>
> >>>> >> best
> >>>> >> Rupert Westenthaler
> >>>> >>
> >>>> >> > Thanks for your help,
> >>>> >> >
> >>>> >> > David
> >>>> >> >
> >>>> >> >
> >>>> >> > On Mon, Jul 11, 2011 at 11:42 PM, Rupert Westenthaler <
> >>>> >> > [email protected]> wrote:
> >>>> >> >
> >>>> >> >> Hi
> >>>> >> >>
> >>>> >> >> On Mon, Jul 11, 2011 at 8:17 PM, Andrea Giovanni Nuzzolese
> >>>> >> >> <[email protected]> wrote:
> >>>> >> >> > I solved it in the same way, but losing the caching capabilities.
> >>>> >> >> > Is there any possibility to keep both all the data and the cache?
> >>>> >> >> >
> >>>> >> >> > Andrea
> >>>> >> >> >
> >>>> >> >> > On Jul 11, 2011, at 4:08 PM, David Riccitelli wrote:
> >>>> >> >> >
> >>>> >> >> >> Ok, stopping the solrYard dbpedia_43k component solved it for me.
> >>>> >> >> >>
> >>>> >> >> >> Thanks,
> >>>> >> >> >> David
> >>>> >> >> >>
> >>>> >> >> >> On Mon, Jul 11, 2011 at 4:13 PM, David Riccitelli <
> >>>> >> >> >> [email protected]> wrote:
> >>>> >> >> >>
> >>>> >> >> >>> Hi Rupert,
> >>>> >> >> >>>
> >>>> >> >> >>> I recently updated the Stanbol install, and I found that the RDF
> >>>> >> >> >>> returned by the EntityHub is missing some props (specifically the
> >>>> >> >> >>> dbprop as far as I can see).
> >>>> >> >> >>>
> >>>> >> >> >>> This is the command that I use for testing:
> >>>> >> >> >>> curl -H "accept: application/rdf+xml" "http://localhost:8080/entityhub/site/dbpedia/entity?id=http://dbpedia.org/resource/Valentino_Rossi"
> >>>> >> >> >>>
> >>>> >> >> >>> which outputs the attached RDF file.
> >>>> >> >> >>>
> >>>> >> >> >>> I cleared all of the sling folder (rm -fr sling) and checked
> >>>> >> >> >>> with the SPARQL endpoint at DBpedia, but I wasn't able to fix it.
> >>>> >> >> >>>
> >>>> >> >> >>> Does this depend on the mapping.txt file?
> >>>> >> >> >>>
> >>>> >> >>
> >>>> >> >> If you plan to create your own dbpedia index, then the mapping.txt
> >>>> >> >> file would be the way to configure which properties are
> >>>> >> >> included/excluded.
> >>>> >> >> Typically dbprop values are low quality. They are just naive 1:1
> >>>> >> >> mappings of key value pairs as found in the info boxes. Because of
> >>>> >> >> this they are excluded from the indexes.
> >>>> >> >>
> >>>> >> >> At runtime the returned data depend on the used Cache strategy:
> >>>> >> >>
> >>>> >> >> Currently there are three possibilities (configured with the
> >>>> >> >> referenced Site):
> >>>> >> >> 1) no cache: both queries and retrieval use a remote service
> >>>> >> >> 2) used: Queries are executed by the remote service. Retrieved
> >>>> >> >> Entities are stored locally. The cached data depend on the mappings
> >>>> >> >> defined for the cache.
> >>>> >> >> 3) all: Both queries and retrieval are based on the cache. The
> >>>> >> >> remote service is only used as fallback in the case that the cache
> >>>> >> >> is not available (e.g. if you deactivate solrYard).
> >>>> >> >>
> >>>> >> >> So if you are fine with (2) then you could use the configuration
> >>>> >> >> as previously used by the stable launcher [1].
> >>>> >> >> I think the easiest way to install this is to add the
> >>>> >> >> Felix File Installer [2] to the Stanbol Environment. You will need
> >>>> >> >> to delete the current referencedSite for dbpedia first and then add
> >>>> >> >> the three configuration files as described by [1].
> >>>> >> >>
> >>>> >> >> If your requirements are not covered by the currently available
> >>>> >> >> options it would be nice if you could write a short user story,
> >>>> >> >> because I am thinking about how to improve this feature and input
> >>>> >> >> like that would be really valuable.
> >>>> >> >>
> >>>> >> >> best
> >>>> >> >> Rupert Westenthaler
> >>>> >> >>
> >>>> >> >> [1] The dbpedia config consists of three files: the referenced
> >>>> >> >> site, cache and solryard components with the "-dbpedia" endings.
> >>>> >> >>
> >>>> >> >> http://svn.apache.org/viewvc/incubator/stanbol/trunk/launchers/stable/src/main/resources/resources/config/?pathrev=1140181
> >>>> >> >>
> >>>> >> >> [2] http://felix.apache.org/site/apache-felix-file-install.html
> >>>> >> >>
> >>>> >> >> p.s. I keep this part because it describes very well how the
> >>>> >> >> cache strategy "used" works:
> >>>> >> >> >>>>> Hi David
> >>>> >> >> >>>>>
> >>>> >> >> >>>>> Assuming that you are using the default distribution of Apache
> >>>> >> >> >>>>> Stanbol.
> >>>> >> >> >>>>>
> >>>> >> >> >>>>> Requests for http://dbpedia.org/resource/Valentino_Rossi will be
> >>>> >> >> >>>>> - only the first time answered by retrieving the Entity from
> >>>> >> >> >>>>> DBpedia.org
> >>>> >> >> >>>>> - the information is cached in a local cache. In doing so, values
> >>>> >> >> >>>>> of the documents are filtered (see (a) for details)
> >>>> >> >> >>>>> - the cached version is returned
> >>>> >> >> >>>>>
> >>>> >> >> >>>>> (a) The default configuration for dbpedia stores all fields but
> >>>> >> >> >>>>> filters values for literals so that only values with the
> >>>> >> >> >>>>> language "en, de, fr, it, es" or no language are stored.
> >>>> >> >> >>>>>
> >>>> >> >> >>>>>
> >>>> >> >> >>>>> Assuming that you have started from zero when updating to a new
> >>>> >> >> >>>>> version, this also means that you have downloaded a new version
> >>>> >> >> >>>>> of this Entity from dbPedia.
> >>>> >> >> >>>>>
> >>>> >> >>
> >>>> >> >> --
> >>>> >> >> | Rupert Westenthaler             [email protected]
> >>>> >> >> | Bodenlehenstraße 11                             ++43-699-11108907
> >>>> >> >> | A-5500 Bischofshofen
> >>>> >> >>
> >>>> >> >
> >>>> >> >
> >>>> >> >
> >>>> >> > --
> >>>> >> > David Riccitelli
> >>>> >> >
> >>>> >> > Interact SpA
> >>>> >> > Via A. Bargoni 78 (scala F)
> >>>> >> > 00153 Roma
> >>>> >> >
> >>>> >> > T +39 06 58318 301
> >>>> >> > F +39 06 58318 303
> >>>> >> >



-- 
David Riccitelli
-----
Skype: ziodave
Twitter: @ziodave
LinkedIn: http://it.linkedin.com/in/riccitelli
-----
Interact SpA
Via A. Bargoni 78 (scala F)
00153 Roma

T +39 06 58318 301
F +39 06 58318 303
