On 26/08/13 02:41, Tao (陶信东) wrote:
Thanks, Andy. I'm running your scripts and the following errors
appear.
Does it mean the error was fixed?
ERROR means a syntax error - parsing stops.
==== data/xaa (Mon Aug 26 09:30:02 CST 2013)
ERROR [line: 4692554, col: 35] Unknown char: \(92)
What is line 4692554?
stdin : 42.79 sec 4,692,540 triples 109,656.72 TPS
==== data/xab (Mon Aug 26 09:30:46 CST 2013)
stdin : 86.39 sec 9,999,991 triples 115,751.35 TPS
==== data/xac (Mon Aug 26 09:32:13 CST 2013)
ERROR [line: 2921254, col: 41] Unknown char: \(92)
and line 2921254 of this file?
(which version of Jena are you using?)
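One way to answer the "what is on that line?" question is to pull the reported positions out of the log and print the matching lines from each chunk. The following is only a sketch: the helper names error_positions and fetch_line are made up for illustration, and it assumes the line numbers in the ERROR messages are 1-based and that the chunks (data/xaa, data/xac) are plain text. The (92) in the message is the character code of a backslash.

```python
import re

# Matches the position part of entries such as:
#   ERROR [line: 4692554, col: 35] Unknown char: \(92)
ERROR_RE = re.compile(r"ERROR \[line: (\d+), col: (\d+)\]")


def error_positions(log_text):
    """Return a (line, col) pair for each ERROR entry found in the log text."""
    return [(int(line), int(col)) for line, col in ERROR_RE.findall(log_text)]


def fetch_line(path, line_number):
    """Return line `line_number` of `path` (assumed 1-based), or None if the file is shorter.

    The file is streamed, so a multi-gigabyte chunk is never held in memory.
    """
    with open(path, "rb") as chunk:
        for current, raw in enumerate(chunk, start=1):
            if current == line_number:
                return raw.decode("utf-8", errors="replace").rstrip("\r\n")
    return None
```

For example, print(fetch_line("data/xaa", 4692554)) should show the triple that the parser stopped on, and the reported column should point at or near the backslash.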
Andy
Thanks,
Tao
-----Original Message-----
From: Andy Seaborne [mailto:[email protected]]
Sent: Friday, August 23, 2013 12:14 AM
To: [email protected]
Subject: Re: Has anyone loaded freebase dump to TDB successfully?
On 22/08/13 10:57, Tao (陶信东) wrote:
I tried before but failed (due to some format errors). Now I want to
try again.
Do I have a chance to succeed?
Yes - there's a chance. But.
First, you need to make sure the data is clean - that is, it parses.
http://people.apache.org/~andy/Freebase20121223/Notes.txt
has my notes from that published version.
There is no point loading until the data parses cleanly. Use "riot --validate".
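As a rough illustration of that validation pass, the sketch below feeds one chunk of the dump to riot --validate on standard input and collects any ERROR lines it prints. It assumes riot is on the PATH and accepts the data on stdin; the function name validate_chunk and its run parameter are hypothetical, and it scans the output for "ERROR" rather than relying on an exit code.

```python
import subprocess


def validate_chunk(path, run=subprocess.run):
    """Pipe one chunk into `riot --validate` and return the ERROR lines it reports.

    `run` defaults to subprocess.run; it is a parameter only so the function
    can be exercised without a Jena installation.
    """
    with open(path, "rb") as chunk:
        result = run(
            ["riot", "--validate"],    # assumes riot is on the PATH
            stdin=chunk,               # the chunk is streamed to riot's stdin
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge streams: messages may go to either
            text=True,
        )
    return [line for line in result.stdout.splitlines() if "ERROR" in line]


# Usage (not run here), with the chunk names as assumptions:
#   for chunk in ["data/xaa", "data/xab", "data/xac"]:
#       print(chunk, validate_chunk(chunk) or "parses cleanly")
```

An empty list for every chunk is the signal that the data is ready to load.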
Then you need a big (a lot of RAM) 64 bit machine and JVM and patience.
It's a large load and TDB slows down at scale currently. There have been
reports of exceptionally good loading, but it was on a large, powerful machine.
More RAM = faster.
The loading may well take a very long time.
Andy
Thanks
Tao