Hi

You need to enable 'id-namespace' in the iditerator.properties file
and set its value to 'http://rdf.freebase.com/ns/' (the same value as
defined by http://prefix.cc/fb).
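
As a sketch of that change (assuming the rest of the file stays as you
posted it), the previously commented-out line would become:

```properties
# was: #id-namespace=http://freebase.com/
id-namespace=http://rdf.freebase.com/ns/
```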

This will ensure that the indexing tool looks for the correct Entity
URIs (e.g. 'http://rdf.freebase.com/ns/m.0kpv11' for
'10888430 m.0kpv11', the first line in the incoming_links.txt file).
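
If you want to sanity-check the expected URI before re-running the
indexer, here is a rough shell sketch of how it is put together from
one line of the file. This is not the indexer's actual code: the
function name 'entity_uri' is made up for illustration, and it assumes
an already trimmed line with a single space between the score (first)
and the local ID (second).

```shell
#!/bin/sh
# Hypothetical helper (not part of Stanbol): build the Entity URI for one
# "<score> <local-id>" line, assuming the id-namespace value given above.
ID_NAMESPACE='http://rdf.freebase.com/ns/'

entity_uri() {
    line=$1
    # drop everything up to and including the last space; the local ID remains
    local_id=${line##* }
    printf '%s%s\n' "$ID_NAMESPACE" "$local_id"
}

# first line of the incoming_links.txt file
entity_uri '10888430 m.0kpv11'
```

For the first line this prints 'http://rdf.freebase.com/ns/m.0kpv11',
which should be an Entity URI that actually occurs in the imported dump.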

best
Rupert


On Fri, May 22, 2015 at 3:43 PM, Rajan Shah <raja...@gmail.com> wrote:
> Hi Rupert,
>
> Thanks for the quick turnaround, I really appreciate your prompt response.
>
> Please find the requested files included at the end.
>
> Thanks in advance,
> Rajan
>
> *a. iditerator.properties*
>
> #NOTES:
> # Lines in this file start with spaces in cases the score is lower than one
> # million. because of that we need to trim leading spaces
> trimLine
> # after trimming the lines the
> #  -> first position is always an empty string
> #  -> score should be at the first position
> score-pos=1
> #  -> second position should be the local name of the entity
> id-pos=2
> #the file needs to be in the source (default="/indexing/resource") folder!
> source=incoming_links.txt
> encodeIds=false
> charset=UTF-8
> # set the separator to ' '
> separator=
> # and URLdecode the IDs
> decodeIds=false
>
> # freebase uses namespace prefixes for IDs, because of that we do not need
> # the id-namespace parameter. NOTE that the 'ns' prefix need to be set to
> # http://www.
> #id-namespace=http://freebase.com/
> ns-prefix-state=false
>
> *b. incoming_links.txt*
>
> Some lines are as follows:
>
> 10888430 m.0kpv11
> 3741261 m.019h
> 2667858 m.0775xx5
> 2667804 m.0775xvm
> 1875352 m.01xryvm
> 1739262 m.05zppz
> 1369590 m.01xrzlb
> 1336481 m.0g4g
> 1202333 m.04l
> 1093642 m.01xryw5
> 1079153 m.09gn
> 1070544 m.0kpv17
> 1066210 m.09c7w0
> 925879 m.01x32j1
> 922312 m.0jst35z
> 921239 m.08x8
> 864526 m.02nsjl9
> 832558 m.01xlj26
> 769191 m.02lx2r
> 736892 m.04m8
>
> On Fri, May 22, 2015 at 9:29 AM, Rupert Westenthaler <
> rupert.westentha...@gmail.com> wrote:
>
>> Hi Rajan,
>>
>> > *05:11:41,851 [Indexing: Finished Entity Logger Deamon] INFO
>> > impl.IndexerImpl - Indexed 0 items in 2059467sec (Infinityms/item):
>>
>> You have not indexed a single entity, so something in your indexing
>> configuration is wrong. Most likely you are not correctly building the
>> URIs of the entities from the incoming_links.txt file. Can you provide
>> me an example line of the 'incoming_links.txt' file and the contents
>> of the 'iditerator.properties' file? Those specify how Entity URIs are
>> built.
>>
>> Short answers to the other questions
>>
>>
>> On Fri, May 22, 2015 at 2:10 PM, Rajan Shah <raja...@gmail.com> wrote:
>> > it ran for almost 3 days and generated index.
>>
>> That's good. It means you now have the Freebase dump in your Jena
>> TDB triple store. You will not need to repeat this (until you want to
>> use a newer dump). On the next call to the indexing tool it will
>> immediately start with the indexing step.
>>
>>
>> >
>> > Couple questions come to mind:
>> >
>> > a. Is there any particular log/error file the process generates besides
>> > printing out on stdout/stderr?
>>
>> The indexer writes a zip archive with the IDs of all the indexed
>> entities (indexed-entities-ids.zip in your log). It is in the
>> indexing/destination folder.
>>
>> > b. Is it a must-have to have stanbol full launcher running all the time
>> > while indexing is going on?
>>
>> No Stanbol instance is needed by the indexing process.
>>
>> > c. Is it possible that the machine not being connected to the internet
>> > for a couple of minutes could cause some issues?
>>
>> No Internet connectivity is needed during indexing. Only if you want
>> to use the namespace prefix mappings of prefix.cc you need to have
>> internet connectivity when starting the indexing tool.
>>
>> best
>> Rupert
>>
>> >
>> > I would really appreciate, if you can shed some light on "what could be
>> > wrong" or "potential approach to nail down this issue"? If you need, I am
>> > happy to share any additional logs/properties.
>> >
>> > With best regards,
>> > Rajan
>> >
>> > *1. Configuration changes*
>> >
>> > a. set ns-prefix-state=false
>> > [within /indexing/config/iditerator.properties]
>> > b. add empty space mapping to http://rdf.freebase.com/ns/
>> > [within namespaceprefix.mappings]
>> > c. enable a bunch of properties within mappings.txt such as the following
>> >
>> > fb:music.artist.genre
>> > fb:music.artist.label
>> > fb:music.artist.album
>> >
>> > *2. Contents of indexing/dist directory*
>> >
>> > -rw-r--r--  108899 May 22 05:11 freebase.solrindex.zip
>> > -rw-r--r--  3457 May 22 05:11
>> > org.apache.stanbol.data.site.freebase-1.0.0.jar
>> >
>> > *3. Contents of /tmp/freebase/indexing/resources/imported directory*
>> >
>> > -rw-r--r--  1 31026810858 May 20 07:32 freebase.nt.gz
>> >
>> > *4. Contents of /tmp/freebase/indexing/resources directory*
>> >
>> > -rw-r--r--   1 1206745360 May 19 09:38 incoming_links.txt
>> >
>> > *5. The indexer log*
>> >
>> > *04:31:57,236 [Thread-3] INFO  jenatdb.RdfResourceImporter - Add:
>> > 570,850,000 triples (Batch: 2,604 / Avg: 3,621)*
>> > *04:32:00,727 [Thread-3] INFO  jenatdb.RdfResourceImporter - Filtered:
>> > 2429800000 triples (80.97554853864854%)*
>> > *04:32:01,157 [Thread-3] INFO  jenatdb.RdfResourceImporter - -- Finish
>> > triples data phase*
>> > *04:32:01,157 [Thread-3] INFO  jenatdb.RdfResourceImporter - ** Data:
>> > 570,859,352 triples loaded in 157,619.39 seconds [Rate: 3,621.76 per
>> > second]*
>> > *04:32:01,157 [Thread-3] INFO  jenatdb.RdfResourceImporter - -- Start
>> > triples index phase*
>> > *04:32:01,157 [Thread-3] INFO  jenatdb.RdfResourceImporter - -- Finish
>> > triples index phase*
>> > *04:32:01,157 [Thread-3] INFO  jenatdb.RdfResourceImporter - -- Finish
>> > triples load*
>> > *04:32:01,157 [Thread-3] INFO  jenatdb.RdfResourceImporter - ** Completed:
>> > 570,859,352 triples loaded in 157,619.39 seconds [Rate: 3,621.76 per
>> > second]*
>> > 04:32:56,880 [Thread-3] INFO  source.ResourceLoader -    ... moving
>> > imported file freebase.nt.gz to imported/freebase.nt.gz
>> > 04:32:56,883 [Thread-3] INFO  source.ResourceLoader -    - completed in
>> > 157675 seconds
>> > 04:32:56,883 [Thread-3] INFO  source.ResourceLoader -  > loading
>> > '/private/tmp/freebase/indexing/resources/rdfdata/fixit.sh' ...
>> > 04:32:56,944 [Thread-3] WARN  jenatdb.RdfResourceImporter - ignore File {}
>> > because of unknown extension
>> > 04:32:56,958 [Thread-3] INFO  source.ResourceLoader -    - completed in 0
>> > seconds
>> > 04:32:56,958 [Thread-3] INFO  source.ResourceLoader -  ... 2 files imported
>> > in 157675 seconds
>> > 04:32:56,958 [Thread-3] INFO  source.ResourceLoader - Loding 0 File ...
>> > 04:32:56,958 [Thread-3] INFO  source.ResourceLoader -  ... 0 files imported
>> > in 0 seconds
>> > 04:32:56,971 [main] INFO  impl.IndexerImpl -  ... delete existing
>> > IndexedEntityId file
>> > /private/tmp/freebase/indexing/destination/indexed-entities-ids.zip
>> > 04:32:56,982 [main] INFO  impl.IndexerImpl - Initialisation completed
>> > 04:32:56,982 [main] INFO  impl.IndexerImpl -   ... initialisation completed
>> > 04:32:56,982 [main] INFO  impl.IndexerImpl - start indexing ...
>> > 04:32:56,982 [main] INFO  impl.IndexerImpl - Indexing started ...
>> >
>> >
>> >
>> > 04:45:48,075 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'nsogi' valid , namespace '
>> > http://prefix.cc/nsogi:' invalid -> mapping ignored!
>> > 04:45:48,076 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'category' valid , namespace '
>> > http://dbpedia.org/resource/Category:' invalid -> mapping ignored!
>> > 04:45:48,077 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'chebi' valid , namespace '
>> > http://bio2rdf.org/chebi:' invalid -> mapping ignored!
>> > 04:45:48,077 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'hgnc' valid , namespace '
>> > http://bio2rdf.org/hgnc:' invalid -> mapping ignored!
>> > 04:45:48,077 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'dbptmpl' valid , namespace '
>> > http://dbpedia.org/resource/Template:' invalid -> mapping ignored!
>> > 04:45:48,077 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'dbc' valid , namespace '
>> > http://dbpedia.org/resource/Category:' invalid -> mapping ignored!
>> > 04:45:48,078 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'pubmed' valid , namespace '
>> > http://bio2rdf.org/pubmed_vocabulary:' invalid -> mapping ignored!
>> > 04:45:48,078 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'dbt' valid , namespace '
>> > http://dbpedia.org/resource/Template:' invalid -> mapping ignored!
>> > 04:45:48,078 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'dbrc' valid , namespace '
>> > http://dbpedia.org/resource/Category:' invalid -> mapping ignored!
>> > 04:45:48,078 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'call' valid , namespace '
>> > http://webofcode.org/wfn/call:' invalid -> mapping ignored!
>> > 04:45:48,078 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'dbcat' valid , namespace '
>> > http://dbpedia.org/resource/Category:' invalid -> mapping ignored!
>> > 04:45:48,084 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'affymetrix' valid , namespace '
>> > http://bio2rdf.org/affymetrix_vocabulary:' invalid -> mapping ignored!
>> > 04:45:48,084 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'bgcat' valid , namespace '
>> > http://bg.dbpedia.org/resource/Категория:' invalid -> mapping ignored!
>> > 04:45:48,084 [pool-1-thread-1] WARN  impl.NamespacePrefixProviderImpl -
>> > Invalid Namespace Mapping: prefix 'condition' valid , namespace '
>> > http://www.kinjal.com/condition:' invalid -> mapping ignored!
>> > 05:11:41,836 [Indexing: Entity Source Reader Deamon] INFO  impl.IndexerImpl
>> > - Indexing: Entity Source Reader Deamon completed (sequence=0) ...
>> > 05:11:41,838 [Indexing: Entity Source Reader Deamon] INFO  impl.IndexerImpl
>> > -  > current sequence : 0
>> > 05:11:41,838 [Indexing: Entity Source Reader Deamon] INFO  impl.IndexerImpl
>> > -  > new sequence: 1
>> > 05:11:41,838 [Indexing: Entity Source Reader Deamon] INFO  impl.IndexerImpl
>> > - Send end-of-queue to Deamons with Sequence 1
>> > 05:11:41,839 [Indexing: Entity Processor Deamon] INFO  impl.IndexerImpl -
>> > Indexing: Entity Processor Deamon completed (sequence=1) ...
>> > 05:11:41,839 [Indexing: Entity Processor Deamon] INFO  impl.IndexerImpl -
>> >  > current sequence : 1
>> > 05:11:41,839 [Indexing: Entity Processor Deamon] INFO  impl.IndexerImpl -
>> >  > new sequence: 2
>> > 05:11:41,839 [Indexing: Entity Processor Deamon] INFO  impl.IndexerImpl -
>> > Send end-of-queue to Deamons with Sequence 2
>> > 05:11:41,839 [Indexing: Entity Perstisting Deamon] INFO  impl.IndexerImpl -
>> > Indexing: Entity Perstisting Deamon completed (sequence=2) ...
>> > 05:11:41,839 [Indexing: Entity Perstisting Deamon] INFO  impl.IndexerImpl -
>> >  > current sequence : 2
>> > 05:11:41,839 [Indexing: Entity Perstisting Deamon] INFO  impl.IndexerImpl -
>> >  > new sequence: 3
>> > 05:11:41,839 [Indexing: Entity Perstisting Deamon] INFO  impl.IndexerImpl -
>> > Send end-of-queue to Deamons with Sequence 3
>> > *05:11:41,851 [Indexing: Finished Entity Logger Deamon] INFO
>> >  impl.IndexerImpl - Indexed 0 items in 2059467sec (Infinityms/item):
>> > processing:  -1.000ms/item | queue:  -1.000ms*
>> > 05:11:41,851 [Indexing: Finished Entity Logger Deamon] INFO
>> >  impl.IndexerImpl -   - source   :  -1.000ms/item
>> > 05:11:41,851 [Indexing: Finished Entity Logger Deamon] INFO
>> >  impl.IndexerImpl -   - processing:  -1.000ms/item
>> > 05:11:41,851 [Indexing: Finished Entity Logger Deamon] INFO
>> >  impl.IndexerImpl -   - store     :  -1.000ms/item
>> > 05:11:41,906 [Indexing: Finished Entity Logger Deamon] INFO
>> >  impl.IndexerImpl - Indexing: Finished Entity Logger Deamon completed
>> > (sequence=3) ...
>> > 05:11:41,906 [Indexing: Finished Entity Logger Deamon] INFO
>> >  impl.IndexerImpl -  > current sequence : 3
>> > 05:11:41,906 [Indexing: Finished Entity Logger Deamon] INFO
>> >  impl.IndexerImpl -  > new sequence: 4
>> > 05:11:41,906 [Indexing: Finished Entity Logger Deamon] INFO
>> >  impl.IndexerImpl - Send end-of-queue to Deamons with Sequence 4
>> > 05:11:41,910 [Indexer: Entity Error Logging Daemon] INFO  impl.IndexerImpl
>> > - Indexer: Entity Error Logging Daemon completed (sequence=4) ...
>> > 05:11:41,910 [Indexer: Entity Error Logging Daemon] INFO  impl.IndexerImpl
>> > -  > current sequence : 4
>> > 05:11:41,910 [main] INFO  impl.IndexerImpl -   ... indexing completed
>> > 05:11:41,910 [main] INFO  impl.IndexerImpl - start post-processing ...
>> > 05:11:41,910 [main] INFO  impl.IndexerImpl - PostProcessing started ...
>> > 05:11:41,910 [main] INFO  impl.IndexerImpl -   ... post-processing finished
>> > ...
>> > 05:11:41,911 [main] INFO  impl.IndexerImpl - start finalisation....
>> >
>> >
>> >
>> > On Wed, May 20, 2015 at 8:19 AM, Rupert Westenthaler <
>> > rupert.westentha...@gmail.com> wrote:
>> >
>> >> On Tue, May 19, 2015 at 7:04 PM, Rajan Shah <raja...@gmail.com> wrote:
>> >> > Hi Rupert and Antonio,
>> >> >
>> >> > Thanks a lot for the reply.
>> >> >
>> >> > I started to follow Rupert's suggestion, however it failed again at
>> >> >
>> >> > 10:56:34,152 [Thread-3] ERROR jena.riot - [line: 8722294, col: 88]
>> >> > illegal escape sequence value: $ (0x24)
>> >> >
>> >> > Is there any way it can be resolved for the entire file?
>> >> >
>> >> >
>> >>
>> >> The indexing tool uses Apache Jena, and those are Jena parsing errors,
>> >> so the Jena mailing lists would be the better place to look for
>> >> answers.
>> >> This specific issue looks like an invalid URI that is not fixed by the
>> >> fixit script.
>> >>
>> >>
>> >> > I requested access to the latest BaseKB bucket, as it doesn't seem
>> >> > to be open.
>> >> >
>> >> > s3cmd ls s3://basekb-now/2015-04-15-18-54/
>> >> >  --add-header="x-amz-request-payer: requester"
>> >> > ERROR: Access to bucket 'basekb-now' was denied
>> >> >
>> >> >
>> >> > *Couple additional questions:*
>> >> >
>> >> > *1. indexing enhancements:*
>> >> > What settings/properties can one tweak to get the most out of the
>> >> > indexing?
>> >> >
>> >>
>> >> In general you only want the information needed for your application
>> >> case in the index.
>> >> For EntityLinking only labels and type are required.
>> >> Additional properties will only be used for dereferencing Entities. So
>> >> this will depend on your application needs (your dereferencing
>> >> configuration).
>> >>
>> >> In general I try to exclude as much information as possible from the
>> >> index to keep the size of the Solr Index as small as possible.
>> >>
>> >> > a. for ex. domain specific such as Pharmaceutical, Law etc... within
>> >> > freebase
>> >> > b. potential optimizations to speed up the overall indexing
>> >>
>> >> Most of the time will be needed to load the Freebase dump into Jena
>> >> TDB. Even with an SSD equipped Server this will take several days.
>> >> Assigning more RAM will speed up this process as Jena TDB can cache
>> >> more things in RAM.
>> >>
>> >> Usually it is a good idea to cancel the indexing process after the
>> >> importing of the RDF data has finished (and the indexing of the
>> >> Entities has started). This is because after importing all the RAM will
>> >> be used by Jena TDB for caching data that is no longer needed in the
>> >> read-only operations during indexing. So a fresh start can speed up
>> >> the indexing part of the process.
>> >>
>> >> Also have a look at the Freebase Indexing Tool Readme
>> >>
>> >> >
>> >> > *2. demo:*
>> >> > I see that in recent GitHub commit(s) the eHealth and other demos
>> >> > have been commented out. How can I get the demo source code and
>> >> > other components for these demos? I prefer to build it myself to
>> >> > see the power of Stanbol.
>> >> >
>> >>
>> >> The eHealth demo is still in the 0.12 branch [1]. This is fully
>> >> compatible to the trunk version.
>> >>
>> >> > *3. custom vocabulary:*
>> >> > Suppose I have a custom vocabulary in CSV format. Is there a preferred
>> >> > way to upload it to Stanbol and have it recognize my entities?
>> >>
>> >> Google Refine[2] with the RDF extension [3]. You can also try to use
>> >> the (newer) Open Refine [4] with the RDF Refine 0.9.0 Alpha version
>> >> but AFAIK this combination is not so stable and might not work at all.
>> >>
>> >> * Google Refine allows you to import your CSV file.
>> >> * Clean it up (if necessary)
>> >> * The RDF extension allows you to map your CSV data to RDF
>> >> * based on this mapping you can save your data as RDF
>> >> * after that you can import the RDF data to Apache Stanbol
>> >>
>> >> hope this helps
>> >> best
>> >> Rupert
>> >>
>> >> >
>> >> > Thanks in advance,
>> >> > Rajan
>> >> >
>> >>
>> >>
>> >>
>> >> [1] http://svn.apache.org/repos/asf/stanbol/branches/release-0.12/demos/ehealth/
>> >> [2] https://code.google.com/p/google-refine/
>> >> [3] http://refine.deri.ie/
>> >> [4] http://openrefine.org/
>> >>
>> >> > On Tue, May 19, 2015 at 3:01 AM, Rupert Westenthaler <
>> >> > rupert.westentha...@gmail.com> wrote:
>> >> >
>> >> >> Hi Rajan,
>> >> >>
>> >> >> I think this is because you named your file
>> >> >> "freebase-rdf-latest-fixed.gz". Jena assumes RDF/XML if the RDF
>> >> >> format is not provided by the file extension. Renaming the file to
>> >> >> "freebase-rdf-latest-fixed.nt.gz" should fix this issue.
>> >> >>
>> >> >> The suggestion of Antonio to use BaseKB is also a valid option.
>> >> >>
>> >> >> best
>> >> >> Rupert
>> >> >>
>> >> >> On Tue, May 19, 2015 at 8:32 AM, Antonio David Perez Morales
>> >> >> <ape...@zaizi.com> wrote:
>> >> >> > Hi Rajan
>> >> >> >
>> >> >> > The Freebase dump contains some things that do not fit very well
>> >> >> > with the indexer.
>> >> >> > I advise you to use the dump provided by BaseKB (http://basekb.com)
>> >> >> > which is a curated Freebase dump.
>> >> >> > I did not have any problem indexing it using that dump.
>> >> >> >
>> >> >> > Regards
>> >> >> >
>> >> >> > On Mon, May 18, 2015 at 8:48 PM, Rajan Shah <raja...@gmail.com> wrote:
>> >> >> >
>> >> >> >> Hi,
>> >> >> >>
>> >> >> >> I am working on indexing Freebase data within EntityHub and
>> >> >> >> observed the following issue:
>> >> >> >>
>> >> >> >> 01:06:01,547 [Thread-3] ERROR jena.riot - [line: 1, col: 7 ]
>> >> >> >> Element or attribute do not match QName production:
>> >> >> >> QName::=(NCName':')?NCName.
>> >> >> >>
>> >> >> >> I would appreciate any help pertaining to this issue.
>> >> >> >>
>> >> >> >> Thanks,
>> >> >> >> Rajan
>> >> >> >>
>> >> >> >> *Steps followed:*
>> >> >> >>
>> >> >> >> *1. Initialization: *
>> >> >> >> java -jar org.apache.stanbol.entityhub.indexing.freebase-1.0.0-SNAPSHOT.jar init
>> >> >> >>
>> >> >> >> *2. Download the data:*
>> >> >> >> Download the data from https://developers.google.com/freebase/data
>> >> >> >>
>> >> >> >> *3. Performed execution of fbrankings-uri.sh*
>> >> >> >> It generated incoming_links.txt under the resources directory as
>> >> >> >> follows:
>> >> >> >>
>> >> >> >> 10888430 m.0kpv11
>> >> >> >> 3741261 m.019h
>> >> >> >> 2667858 m.0775xx5
>> >> >> >> 2667804 m.0775xvm
>> >> >> >> 1875352 m.01xryvm
>> >> >> >> 1739262 m.05zppz
>> >> >> >> 1369590 m.01xrzlb
>> >> >> >>
>> >> >> >> *4. Performed execution of fixit script*
>> >> >> >>
>> >> >> >> gunzip -c ${FB_DUMP} | fixit | gzip > ${FB_DUMP_fixed}
>> >> >> >>
>> >> >> >> *5. Rename the fixed file to freebase.rdf.gz and copy it *
>> >> >> >> to indexing/resources/rdfdata
>> >> >> >>
>> >> >> >> *6. config/iditer.properties file has following setting*
>> >> >> >> #id-namespace=http://freebase.com/
>> >> >> >> ns-prefix-state=false
>> >> >> >>
>> >> >> >> *7. Performed run of following command:*
>> >> >> >> java -jar -Xmx32g org.apache.stanbol.entityhub.indexing.freebase-1.0.0-SNAPSHOT.jar index
>> >> >> >>
>> >> >> >> The error dump on stdout is as follows:
>> >> >> >>
>> >> >> >> 01:37:32,884 [Thread-0] INFO  solryard.SolrYardIndexingDestination - ...
>> >> >> >> copy Solr Configuration form /private/tmp/freebase/indexing/config/freebase
>> >> >> >> to /private/tmp/freebase/indexing/destination/indexes/default/freebase
>> >> >> >> 01:37:32,895 [Thread-3] INFO  jenatdb.RdfResourceImporter -     - bulk
>> >> >> >> loading File freebase.rdf.gz using Format Lang:RDF/XML
>> >> >> >> 01:37:32,896 [Thread-3] INFO  jenatdb.RdfResourceImporter - -- Start
>> >> >> >> triples data phase
>> >> >> >> 01:37:32,896 [Thread-3] INFO  jenatdb.RdfResourceImporter - ** Load empty
>> >> >> >> triples table
>> >> >> >> *01:37:32,948 [Thread-3] ERROR jena.riot - [line: 1, col: 7 ] Element or
>> >> >> >> attribute do not match QName production: QName::=(NCName':')?NCName.*
>> >> >> >> 01:37:32,948 [Thread-3] INFO  jenatdb.RdfResourceImporter - -- Finish
>> >> >> >> triples data phase
>> >> >> >> 01:37:32,948 [Thread-3] INFO  jenatdb.RdfResourceImporter - -- Finish
>> >> >> >> triples load
>> >> >> >> 01:37:32,960 [Thread-3] INFO  source.ResourceLoader - Ignore Error for File
>> >> >> >> /private/tmp/freebase/indexing/resources/rdfdata/freebase.rdf.gz and
>> >> >> >> continue
>> >> >> >>
>> >> >> >> Additional Reference Point:
>> >> >> >>
>> >> >> >> *Original Freebase dump size:*  31025015397 May 14 18:10
>> >> >> >> freebase-rdf-latest.gz
>> >> >> >> *Fixed Freebase dump size:* 31026818367 May 15 12:45
>> >> >> >> freebase-rdf-latest-fixed.gz
>> >> >> >> *Incoming Links size: *1206745360 May 17 00:42 incoming_links.txt
>> >> >> >>
>> >> >> >
>> >> >> > --
>> >> >> >
>> >> >> > ------------------------------
>> >> >> > This message should be regarded as confidential. If you have
>> received
>> >> >> this
>> >> >> > email in error please notify the sender and destroy it immediately.
>> >> >> > Statements of intent shall only become binding when confirmed in
>> hard
>> >> >> copy
>> >> >> > by an authorised signatory.
>> >> >> >
>> >> >> > Zaizi Ltd is registered in England and Wales with the registration
>> >> number
>> >> >> > 6440931. The Registered Office is Brook House, 229 Shepherds Bush
>> >> Road,
>> >> >> > London W6 7AN.
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> | Rupert Westenthaler             rupert.westentha...@gmail.com
>> >> >> | Bodenlehenstraße 11                              ++43-699-11108907
>> >> >> | A-5500 Bischofshofen
>> >> >> | REDLINK.CO
>> >> >>
>> >>
>> ..........................................................................
>> >> >> | http://redlink.co/
>> >> >>
>> >>
>> >>
>> >>
>> >>
>>
>>
>>
>>



-- 
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO 
..........................................................................
| http://redlink.co/
