On 04/10/11 09:22, Shri :) wrote:
Hi All,

I did the bulk loading through the command-line utility (with timing) after tuning MySQL (buffer pool and key buffer size), and got a final loading time of roughly five and a half hours for ~24 million triples, which seems okay to me. Indexing the dataset took a further 4 hours.

Any comments here?

Windows ...


I am now querying this dataset through the command-line utility again, where the resulting tuples are printed along with the execution time. I would like to know whether this execution time includes the printing time as well (which I would *not* want it to). Kindly let me know.

Don't print the results.
  --results=none
or "count"
or a streaming format to a file.  "text" is not streaming.

See also --repeat=N
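Putting Andy's suggestions together, a query invocation might look like the sketch below. The flag names are assumptions based on the shared arq/sdbquery command framework, so check `sdbquery --help` on your install:

```shell
# Run the query 5 times against the store described in sdb2.ttl,
# discarding the results so printing does not count towards the time.
sdbquery --sdb=sdb2.ttl --query=query.rq --results=none --repeat=5 --time
```

Here `query.rq` is a placeholder for your query file, and `sdb2.ttl` is the store description used elsewhere in this thread.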

        Andy




Thanks to all of you for your advice; it was very helpful to me :)

BR,
Shri


On Fri, Sep 30, 2011 at 2:54 AM, Shri :)<[email protected]>  wrote:

Hello, sorry, my dataset is in .NT format.


On Fri, Sep 30, 2011 at 2:52 AM, Shri :)<[email protected]>  wrote:

Hi All,


@Damian: thanks for the link. I will now try increasing the buffer_pool_size and carry out the loading; I'll let you know how it goes.
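For reference, the InnoDB buffer pool is set in my.ini (for WAMP on Windows) or my.cnf. The values below are only illustrative assumptions for an 8 GB machine, not recommendations from this thread, so adjust them to your setup:

```ini
[mysqld]
# Give InnoDB a large share of RAM for table and index data
# (illustrative value for an 8 GB machine).
innodb_buffer_pool_size = 4G

# key_buffer_size only matters if the tables use MyISAM.
key_buffer_size = 256M

# Trade some crash durability for bulk-load speed.
innodb_flush_log_at_trx_commit = 2
```

MySQL must be restarted for these settings to take effect.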

@Andy: Are you using the SDB bulk loader or loading via your own code? What
format is the data in?
But why not use the sdbload tool? Take the source code and add whatever
extra timing you need (it already can print some timing info).


I am using the following code, which I don't think is very different
from the one that you suggested; *my data is in .TTL format*.
Here is the snippet of my code:

StoreDesc storeDesc = StoreDesc.read("sdb2.ttl");
IDBConnection conn = new DBConnection(DB_URL, DB_USER, DB_PASSWD, DB);
conn.getConnection();
SDBConnection sdbconn = SDBFactory.createConnection(conn.getConnection());
Store store = SDBFactory.connectStore(sdbconn, storeDesc);
Model model = SDBFactory.connectDefaultModel(store);

// read data into the database
InputStream inn = new FileInputStream("dataset_70000.nt");
long start = System.currentTimeMillis();
model.read(inn, "localhost", "TTL");
loadtime = ext.elapsedTime(start);

// close the database connection
store.close();
System.out.println("Loading time: " + loadtime);



@Dave: I think I followed the pattern suggested in the link that you gave
me (http://openjena.org/wiki/SDB/Loading_data); the above is the snippet
of my source code.
  And one more thing: I didn't understand "Are you wrapping the load
in a transaction to avoid auto-commit costs?". Can you please elaborate a
bit on this? Sorry, I am relatively a novice.
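In case it helps: "wrapping the load in a transaction" usually means turning off JDBC auto-commit, so MySQL does not commit (and flush to disk) after every single INSERT, and then committing once at the end of the load. A minimal sketch of that pattern around the load above, assuming SDBConnection exposes its underlying JDBC connection via getSqlConnection() (check your SDB version), and not runnable without a live MySQL instance:

```java
// ... obtain sdbconn, store and model exactly as in the snippet above ...
java.sql.Connection jdbc = sdbconn.getSqlConnection();

jdbc.setAutoCommit(false);                 // one transaction for the whole load
try {
    InputStream in = new FileInputStream("dataset_70000.nt");
    model.read(in, null, "N-TRIPLE");      // .nt data, so N-TRIPLE syntax
    jdbc.commit();                         // single commit at the end
} catch (Exception e) {
    jdbc.rollback();                       // undo the partial load on failure
    throw e;
} finally {
    jdbc.setAutoCommit(true);
}
```

The saving comes from paying the commit (log flush) cost once instead of once per statement.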


Any thoughts on this? Thank you very much! :)

BR,
shri








On Thu, Sep 29, 2011 at 12:00 AM, Shri :)<[email protected]>  wrote:


Hi Again,

I am supposed to evaluate the performance of a few triple stores as part of
my thesis work (a specification which, unfortunately, I cannot change).
One of them is Jena SDB with MySQL. I am using my own Java code to load
the data rather than the command-line tool, as I wanted to record the
loading time. The data I am loading is in .NT format.

I have 8 GB of RAM.

Any thoughts/suggestions on this? Thanks for your help.



On Wed, Sep 28, 2011 at 4:09 PM, Shri :)<[email protected]>  wrote:

Hi Everyone,

I am currently doing my master's thesis, in which I have to work with Jena
SDB using MySQL as the backend store. I have around 25 million triples to
load, which has taken more than 5 days on the Windows platform, whereas
according to the Berlin Benchmark it took only 4 hours to load the same
number of triples on Linux. This has left me confused: is the enormous
difference due to the platform, or should I do some performance
tuning/optimization to improve the load time?

Kindly give your suggestions/comments.

P.S. I am using WAMP.


Thanks

Shridevika






