That seems slow for the size.

We bulk load triples on Windows and get similar times to CentOS/Fedora on
the same hardware.

You can hack tdbloader2 to run on Windows, as basically you're
exploiting the OS sort, which on Windows is:

sort [/r] [/+n] [/m kilobytes] [/l locale] [/rec characters]
     [[drive1:][path1]filename1] [/t [drive2:][path2]]
     [/o [drive3:][path3]filename3]

Merge all the files together using copy *.txt newfile.txt. This assumes you
understand what merging does to the blank nodes.

Use uniq from the GNU utilities for Windows, or the following native script:

@ECHO ON

SET InputFile=C:\folder\path\Input.txt
::SET InputFile=%~1
SET OutputFile=C:\folder\path\Output.txt

SET PSScript=%Temp%\~tmpRemoveDupe.ps1
IF EXIST "%PSScript%" DEL /Q /F "%PSScript%"
ECHO Get-Content "%InputFile%" ^| Sort-Object ^| Get-Unique ^> "%OutputFile%">>"%PSScript%"

SET PowerShellDir=C:\Windows\System32\WindowsPowerShell\v1.0
CD /D "%PowerShellDir%"
Powershell -ExecutionPolicy Bypass -Command "& '%PSScript%'"

GOTO :EOF
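
If PowerShell is not an option, the same sort-and-de-duplicate step can be roughly sketched in Python. This is only an illustration: the function name sort_unique_lines is made up here, and it assumes a line-oriented file (one triple per line, as in N-Triples) that fits in memory. It also drops blank lines and sorts by code point, so the ordering may differ from Sort-Object.

```python
def sort_unique_lines(input_path, output_path):
    """Write the distinct, sorted lines of input_path to output_path.

    Hypothetical helper: holds every line in memory, so it only suits
    files much smaller than the available RAM.
    """
    # Text mode translates CRLF to LF, so Windows and Unix files compare equal.
    with open(input_path, encoding="utf-8") as src:
        lines = {line.rstrip("\n") for line in src}

    # Assumption: blank lines carry no data and can be dropped.
    lines.discard("")

    ordered = sorted(lines)
    with open(output_path, "w", encoding="utf-8", newline="\n") as dst:
        for line in ordered:
            dst.write(line + "\n")

    # Number of distinct lines written.
    return len(ordered)
```

Calling sort_unique_lines with the Input.txt and Output.txt paths from the script should leave one copy of each line in the output. For something the size of a 40GB load, an on-disk sort is still the better fit.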



If you use the SET InputFile=%~1 line instead, Windows will allow you to drag
and drop the source file into the CMD window... Got to be some advantage to
using Windows!?

Dick

On 25 Dec 2017 4:51 am, "Shengyu Li" <[email protected]>
wrote:

Hello,

I am uploading my .ttl data to my database. There are about 10,000 files in
total, and each file is about 4 MB. My new data is about 40 GB in total. My
original database is also about 40 GB. The server is on my local computer.

I use tdbloader.bat --loc to upload the data. After the "Finish quads load"
message, it pauses at this status for a long time (about half an hour for one
4 MB file; for 200 files at once (200 * 4 MB), the pause is 2 hours).
After the pause, control returns to the command prompt.
[image: Inline image 1]

I guess the pause means the database is organizing the data I just
uploaded, so it won't return for a long time. Am I right? Is there
any way to shorten the waiting time?

Thank you very much! Jena is really a useful thing!

Best,
Shengyu
