Rupert,

You fixed it.

Works fine now on Windows 7 x64: building with the latest source and running an 
/engines page enhancement.

(no workaround en-*.bin files in sling/datafiles needed; it works with just a 
fresh sling dir from the launcher)

(and no "SolrIndex not installed" error messages)

Thanks,
Steve

-----Original Message-----
From: Rupert Westenthaler [mailto:[email protected]] 
Sent: Monday, July 18, 2011 2:33 AM
To: [email protected]
Subject: Re: index file issue on windows

Hi Steve

I think I have found the problem and provided a fix with revision 1147784 [1]

If you still have issues please feel free to reopen STANBOL-259 [2]

best
Rupert

[1] http://svn.apache.org/viewvc?view=revision&revision=1147784
[2] https://issues.apache.org/jira/browse/STANBOL-259

On Thu, Jul 14, 2011 at 8:37 AM, Steve Reiner 
<[email protected]> wrote:
> Rupert,
>
> Thanks for the help.
>
> I can use Stanbol running in a Linux vmware vm until next week.
>
> Thanks,
> Steve
> -----Original Message-----
> From: Rupert Westenthaler [mailto:[email protected]]
> Sent: Wednesday, July 13, 2011 11:22 PM
> To: [email protected]
> Subject: Re: index file issue on windows
>
> Hi
>
>
> On Thu, Jul 14, 2011 at 4:27 AM, Steve Reiner 
> <[email protected]> wrote:
>> I know I should add this to JIRA; I just want to make sure I didn't need 
>> some additional step to get the index to work.
>>
>> I was actually getting a different error: not the cache one, but 
>> "index not yet installed" when using /engines.
>>
>> On Windows with the latest code I get the "index not yet installed" error 
>> (and, weirdly, also with what I built on 7/10, which used to work with the 
>> sling/datafiles workaround on Windows). Linux with the 7/10 code is still 
>> fine:
>
> Did you delete the {stanbol}/sling folder after upgrading to the newest 
> version? If not, you might still be using the old version, because the /sling 
> folder contains a cache that is not overwritten with the new version just 
> because the launcher jar file is replaced.
>
>>
>> (org.apache.stanbol.enhancer.servicesapi.EngineException:
>> 'NamedEntityTaggingEngine' failed to process content item 
>> 'urn:content-item-sha1-88a2b5f6520df87e4567c06b48e742b7d1c71e9c' with type
>> 'text/plain': org.apache.stanbol.entityhub.servicesapi.yard.YardException:
>> SolrIndex entityhub is not available. The necessary Index is not yet
>> installed.) org.apache.stanbol.enhancer.servicesapi.EngineException:
>> 'NamedEntityTaggingEngine' failed to process content item 
>> 'urn:content-item-sha1-88a2b5f6520df87e4567c06b48e742b7d1c71e9c' with type
>> 'text/plain': org.apache.stanbol.entityhub.servicesapi.yard.YardException:
>> SolrIndex entityhub is not available. The necessary Index is not yet 
>> installed.
>>        at org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine.computeEnhancements(NamedEntityTaggingEngine.java:323)
>>
> Since revision 1144364 the BundleDataFileProvider (the one that seems not to 
> work on Windows) is also used to load the entityhub index.
> Therefore the initialization of the SolrYard used by the Entityhub will also 
> not work. As I am writing this, I realize that this would also prevent the 
> initialization of any other SolrYard (such as the dbpediaCache), because 
> the default initialization relies on the same BundleDataFileProvider to 
> load the required Solr configuration. So this would also explain the problems 
> you had with the workaround I suggested.
>
> The two required files are in this directory [1]. If you copy them to the 
> {stanbol-root}/datafiles directory it should solve the problem.
> After copying the files there you will need to deactivate/activate the 
> SolrYards so they pick up these files.
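
The copy step above could be scripted roughly as follows. This is only a sketch: the function name `copy_to_datafiles` and its parameters are made up for illustration, and the file names are left to the caller because the thread does not name the two files. It does not touch Stanbol itself, so the SolrYards would still need to be deactivated/activated afterwards.

```python
import shutil
from pathlib import Path


def copy_to_datafiles(source_files, stanbol_root):
    """Copy the given files into <stanbol_root>/datafiles and return the new paths.

    Hypothetical helper: neither the name nor the signature comes from Stanbol.
    """
    target_dir = Path(stanbol_root) / "datafiles"
    # Create the datafiles directory if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in source_files:
        src = Path(src)
        dest = target_dir / src.name
        shutil.copy2(src, dest)  # copy2 also keeps the file timestamps
        copied.append(dest)
    return copied
```

Usage would be something like `copy_to_datafiles([first_file, second_file], stanbol_root)`, where the two paths point at local copies of the files from [1].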
>
> [1] http://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/yard/solr/src/main/resources/solr/core/
>
>> I have workarounds in sling/datafiles (en-*.bin and 
>> dbpedia_43k.solrindex.zip).
>> (The change for STANBOL-259, as Fabian commented, didn't fix the 
>> en-*.bin load issue; I still needed the workaround.)
>>
>
> I will look into that next week when I am back in the office. I cannot do 
> much without access to a Windows box.
>
>> According to the "Stanbol Data File Provider" tab of the Felix web console, 
>> it seems to be looking for entityhub.solrindex.zip and not finding 
>> it (I tried copying dbpedia_43k.solrindex.zip to 
>> entityhub.solrindex.zip in datafiles but got the same error after restart 
>> and engine use).
>>
>
> You need to restart the SolrYard because it looks up the required files 
> during activation. Restarting the Engine will not cause the SolrYard 
> to be restarted.
>
>> I also tried after a clean start: blow away the sling dir, mvn clean, run the 
>> shell script to get the defaultdata files, mvn install 
>> -DskipTests, with MAVEN_OPTS=-Xmx1024M -XX:MaxPermSize=128M in the env.
>>
>
> I am really sorry for all this writing without coming up with a real 
> solution, but it is really hard to solve Windows-related problems without 
> access to a Windows box. So if you are not in a hurry, it might be more 
> effective to delay working on this until next week.
>
> best
> Rupert Westenthaler
>
>> Steve
>> -----Original Message-----
>> From: Steve Reiner [mailto:[email protected]]
>> Sent: Wednesday, July 13, 2011 12:09 AM
>> To: '[email protected]'
>> Subject: RE: EntityHub and DBpedia
>>
>> I am getting something like this too after updating with the code 
>> checked in yesterday. Problem wasn't there in the code the day before.
>>
>> (using /engines page)
>>
>> -----Original Message-----
>> From: David Riccitelli [mailto:[email protected]]
>> Sent: Tuesday, July 12, 2011 11:58 PM
>> To: [email protected]
>> Subject: Re: EntityHub and DBpedia
>>
>> Thanks Rupert,
>>
>> I'm trying to follow your instructions but I encounter a couple of 
>> issues (probably due to inexperience):
>>  [1] when dropping the config files, they enter some loop of 
>> REGISTERING/UNREGISTERING (which I solve by stopping the FileInstall 
>> bundle), is that normal?
>>  [2] after I restart Stanbol, and try to query an entity from the 
>> entityhub I receive the following error:
>>
>> 13.07.2011 09:54:17.939 *WARN* [509017110@qtp-1586831707-0] 
>> org.apache.felix.http.jetty /entityhub/sites/entity/
>> (java.lang.IllegalStateException: Unable to initialize the Cache with 
>> Yard dbpediaCache! This is usually caused by Errors while reading the 
>> Cache Configuration from the Yard.) java.lang.IllegalStateException:
>> Unable to initialize the Cache with Yard dbpediaCache! This is 
>> usually caused by Errors while reading the Cache Configuration from the Yard.
>> at org.apache.stanbol.entityhub.core.site.CacheImpl.getCacheYard(CacheImpl.java:214)
>>
>>
>> Do I need to initialize the Cache in some way?
>>
>> Thanks for your help,
>>
>> David
>>
>>
>> On Mon, Jul 11, 2011 at 11:42 PM, Rupert Westenthaler < 
>> [email protected]> wrote:
>>
>>> Hi
>>>
>>> On Mon, Jul 11, 2011 at 8:17 PM, Andrea Giovanni Nuzzolese 
>>> <[email protected]> wrote:
>>> > I solved it the same way, but lost the caching capabilities.
>>> > Is there any possibility to keep both all the data and the cache?
>>> >
>>> > Andrea
>>> >
>>> > On Jul 11, 2011, at 4:08 PM, David Riccitelli wrote:
>>> >
>>> >> Ok, stopping the solrYard dbpedia_43k component solved it for me.
>>> >>
>>> >> Thanks,
>>> >> David
>>> >>
>>> >> On Mon, Jul 11, 2011 at 4:13 PM, David Riccitelli < 
>>> >> [email protected]> wrote:
>>> >>
>>> >>> Hi Rupert,
>>> >>>
>>> >>> I recently updated the Stanbol install, and I found that the RDF returned
>>> >>> by the EntityHub is missing some props (specifically the dbprop, as far
>>> >>> as I can see).
>>> >>>
>>> >>> This is the command that I use for testing:
>>> >>> curl -H "accept: application/rdf+xml" "http://localhost:8080/entityhub/site/dbpedia/entity?id=http://dbpedia.org/resource/Valentino_Rossi"
>>> >>>
>>> >>> which outputs the attached RDF file.
>>> >>>
>>> >>> I cleared all of the sling folder (rm -fr sling) and checked with the
>>> >>> SPARQL endpoint at DBpedia, but I wasn't able to fix it.
>>> >>>
>>> >>> Does this depend on the mapping.txt file?
>>> >>>
>>>
>>> If you plan to create your own dbpedia index, then the mapping.txt 
>>> file would be the way to configure which properties are 
>>> included/excluded.
>>> Typically dbprop values are low quality. They are just naive 1:1 
>>> mappings of key-value pairs as found in the info boxes. Because of 
>>> this they are excluded from the indexes.
>>>
>>> At runtime the returned data depends on the Cache strategy used.
>>>
>>> Currently there are three possibilities (configured with the referenced 
>>> Site):
>>> 1) no cache: both queries and retrieval use a remote service.
>>> 2) used: Queries are executed by the remote service. Retrieved 
>>> Entities are stored locally. The cached data depends on the mappings 
>>> defined for the cache.
>>> 3) all: Both queries and retrieval are based on the cache. The 
>>> remote service is only used as a fallback in case the cache 
>>> is not available (e.g. if you deactivate the solrYard).
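
For entity retrieval, the three strategies could be modelled roughly like this. It is a simplified sketch and not Stanbol code: `CacheStrategy`, `get_entity`, the dict standing in for the local cache and the `remote` callable are all hypothetical names chosen for illustration, and queries are left out.

```python
from enum import Enum


class CacheStrategy(Enum):
    # Hypothetical names mirroring the three options listed above.
    NONE = "none"
    USED = "used"
    ALL = "all"


def get_entity(entity_id, strategy, cache, remote, cache_available=True):
    """Retrieve one entity according to the chosen cache strategy.

    `cache` is a plain dict standing in for the local yard and `remote`
    is a callable standing in for the remote service (both assumptions).
    """
    if strategy is CacheStrategy.NONE:
        # No cache: always ask the remote service, store nothing locally.
        return remote(entity_id)
    if strategy is CacheStrategy.USED:
        # Used: fetch remotely on first access, then serve the local copy.
        if entity_id not in cache:
            cache[entity_id] = remote(entity_id)
        return cache[entity_id]
    # All: the cache is authoritative; the remote service is only a
    # fallback when the cache itself is unavailable.
    if cache_available:
        return cache.get(entity_id)
    return remote(entity_id)
```

With "all", an entity missing from an available cache simply yields `None` in this sketch, which matches the observation that data excluded from the index is not returned.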
>>>
>>> So if you are fine with (2), then you could use the configuration 
>>> previously used by the stable launcher [1].
>>> I think the easiest way to install this is to add the 
>>> Felix File Installer [2] to the Stanbol Environment. You will need 
>>> to delete the current referencedSite for dbpedia first and then add 
>>> the three configuration files as described by [1].
>>>
>>> If your requirements are not covered by the currently available 
>>> options, it would be nice if you could write a short user story, 
>>> because I am thinking about how to improve this feature and input 
>>> like that would be really valuable.
>>>
>>> best
>>> Rupert Westenthaler
>>>
>>> [1] The dbpedia config consists of three files: the referenced site, 
>>> cache and solryard components with the "-dbpedia" endings.
>>>
>>> http://svn.apache.org/viewvc/incubator/stanbol/trunk/launchers/stable/src/main/resources/resources/config/?pathrev=1140181
>>>
>>> [2] http://felix.apache.org/site/apache-felix-file-install.html
>>>
>>> p.s. I keep this part because it describes very well how the cache 
>>> strategy "used" works:
>>> >>>>> Hi David
>>> >>>>>
>>> >>>>> Assuming that you are using the default distribution of Apache Stanbol:
>>> >>>>>
>>> >>>>> Requests for http://dbpedia.org/resource/Valentino_Rossi will be
>>> >>>>> - answered by retrieving the Entity from DBpedia.org only the first time
>>> >>>>> - the information is cached in a local cache. In doing so, the values 
>>> >>>>> of the documents are filtered (see (a) for details)
>>> >>>>> - the cached version is returned
>>> >>>>>
>>> >>>>> (a) The default configuration for dbpedia stores all fields but 
>>> >>>>> filters values for literals so that only values with the language
>>> >>>>> "en, de, fr, it, es" or no language are stored.
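
The literal filter described in (a) amounts to something like the following. This is only an illustration of the rule: representing a literal as a `(text, lang)` tuple and the names `ALLOWED_LANGUAGES` and `filter_literals` are assumptions, not the Entityhub data model or API.

```python
# Languages kept by the default dbpedia configuration, as listed in (a).
ALLOWED_LANGUAGES = {"en", "de", "fr", "it", "es"}


def filter_literals(values, allowed=ALLOWED_LANGUAGES):
    """Keep (text, lang) pairs whose language is allowed or missing.

    A lang of None stands for a literal without a language tag.
    """
    return [(text, lang) for text, lang in values
            if lang is None or lang in allowed]
```

For example, a label tagged "ja" would be dropped before caching, while an "en" label and an untagged value would be kept.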
>>> >>>>>
>>> >>>>>
>>> >>>>> Assuming that you started from zero when updating to a new version,
>>> >>>>> this also means that you have downloaded a new version of this 
>>> >>>>> Entity from DBpedia.
>>> >>>>>
>>>
>>> --
>>> | Rupert Westenthaler             [email protected] 
>>> | Bodenlehenstraße 11                             ++43-699-11108907
>>> | A-5500 Bischofshofen
>>>
>>
>>
>>
>> --
>> David Riccitelli
>>
>> Interact SpA
>> Via A. Bargoni 78 (scala F)
>> 00153 Roma
>>
>> T +39 06 58318 301
>> F +39 06 58318 303
>>
>>
>
>
>
> --
> | Rupert Westenthaler             [email protected] 
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen
>
>



-- 
| Rupert Westenthaler             [email protected] 
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
