Persistent Jena Fuseki on AWS

2015-11-27 Thread Davy Cox
I would like to set up a redundant Jena Fuseki (RDF/SPARQL) server on AWS.

I'd like to do this with the AWS container service (Docker), running the
server as a Docker container so that I can scale the Jena SPARQL server as
the load requires.

Locally I already use the "stain/jena-fuseki" Docker image, which works
great, but I'm puzzled about the best practice for saving the data in one
location while allowing access from multiple jena-fuseki Docker instances at
the same time. (I read somewhere that the data should only be written to by
one instance?)

I see Jena supports relational databases as a back-end via TDB (which would
probably allow me to use one of the AWS RDS databases), but I also read
somewhere that this is not recommended.

What would you recommend for setting up a multi-container jena-fuseki server
where the data is stored in one persistent location (like DynamoDB or S3?),
allowing scaling and redundancy as well as fault tolerance?

Thank you in advance!


Re: Persistent Jena Fuseki on AWS

2015-11-27 Thread A. Soroka
> I see Jena supports relational databases as a back-end via TDB (which would
> probably allow me to use one of the AWS RDS databases), but I also read
> somewhere that this is not recommended.

This isn’t quite accurate. Jena supports relational databases as backends via 
SDB, not TDB, and SDB isn’t recommended for performance reasons (possibly 
amongst others that I don’t know about). TDB is more performant and much closer 
to the front of development.

---
A. Soroka
The University of Virginia Library




Re: Persistent Jena Fuseki on AWS

2015-11-27 Thread Davy Cox
What would you suggest instead as a server deployment with SPARQL and RDF
support that allows high availability (HA) without expensive licensing?
I'm currently looking at Blazegraph (Bigdata).

Any suggestions?



Re: Check the Jena TDB files exist in a directory

2015-11-27 Thread Saikat Maitra
Hi Laurent

You can call dir.listFiles() on the directory; it returns an array of the
files in that directory. You can iterate over the array to check whether any
TDB file is present, and only then load the Dataset from it.
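
A minimal sketch of that check in plain Java, offered as an illustration
rather than as Jena's own API. The class name, the method name looksLikeTdb
and the list of marker file names are assumptions made for this example; check
which files your TDB version actually writes and adjust the list accordingly.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;

public class TdbDirectoryCheck {

    // ASSUMPTION: file names that a populated TDB directory typically contains.
    // Adjust this list to match what your TDB version actually writes.
    private static final List<String> MARKER_FILES =
            Arrays.asList("nodes.dat", "node2id.dat", "SPO.dat");

    /** Returns true only if dir is an existing directory holding at least one marker file. */
    public static boolean looksLikeTdb(File dir) {
        // listFiles() returns null when dir does not exist or is not a directory.
        File[] files = dir.listFiles();
        if (files == null) {
            return false;
        }
        for (File f : files) {
            if (f.isFile() && MARKER_FILES.contains(f.getName())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        if (looksLikeTdb(dir)) {
            // Only at this point would the dataset be opened, e.g. with
            // TDBFactory.createDataset(dir.getPath()) from Jena's TDB module.
            System.out.println("TDB files found in " + dir);
        } else {
            System.out.println("No TDB files in " + dir + "; not opening a dataset.");
        }
    }
}
```

Because the check only reads the directory listing, nothing is created on disk
when the directory is empty or missing.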

Regards
Saikat

On Fri, Nov 27, 2015 at 2:03 PM, Laurent Rucquoy <
laurent.rucq...@telemis.com> wrote:

> Hello,
>
> I want to load a Dataset from a directory, but not create a new one if
> there are no TDB files in this directory. Is there a way to check that TDB
> files actually exist in a directory, or to load a Dataset while
> preventing files from being created on disk if they don't exist?
>
> Thank you.
>
> Sincerely,
> Laurent
>


Re: Persistent Jena Fuseki on AWS

2015-11-27 Thread A. Soroka
This is a Jena-specific list, so maybe not the best place to research that 
question, but there is a new development for Jena called TDB2 that may be of 
interest, announced here:

https://mail-archives.apache.org/mod_mbox/jena-dev/201506.mbox/%3c5575b7b3.8020...@apache.org%3E

Andy would be able to say more about that.

---
A. Soroka
The University of Virginia Library




Check the Jena TDB files exist in a directory

2015-11-27 Thread Laurent Rucquoy
Hello,

I want to load a Dataset from a directory, but not create a new one if
there are no TDB files in this directory. Is there a way to check that TDB
files actually exist in a directory, or to load a Dataset while
preventing files from being created on disk if they don't exist?

Thank you.

Sincerely,
Laurent