Re: fuseki in HA
Hi Andy,

In the next few days I will study your proposal and the different possibilities.

Thank you,

David Molina Estrada
Software Architect

On 22/02/2018 13:15, Andy Seaborne <a...@apache.org> wrote:

> Hi David,
>
> This is one of the main use cases for:
>
>     https://afs.github.io/rdf-delta/
>
> and there is a Fuseki component in that build that incorporates the
> mechanism needed for 2+ Fusekis to propagate changes [3] (a custom
> service /dataset/patch that accepts patch files and applies them).
>
> The work has two parts - the data format needed to propagate changes
> (RDF Patch [1]) and a patch log server [2]. Keeping these two
> components separate is important because not all situations will want
> a patch server. Distribution using Hazelcast or Kafka, or publishing
> changes in the style of Atom/RSS, are good examples. By having a
> defined patch format, there is no reason why the various triplestores
> even have to all be Jena-based.
>
> Apache licensed, not part of the Jena project.
>
> Let me know what you think:
>
>     Andy
>
> [1] https://afs.github.io/rdf-delta/rdf-patch.html
> [2] https://afs.github.io/rdf-delta/rdf-patch-logs.html
> [3] https://github.com/afs/rdf-delta/tree/master/rdf-delta-fuseki
>
> Disclosure: this is part of my $job at TopQuadrant. There is no reason
> not to start publishing it to Maven Central - I just haven't had the
> need to so far. The RDF Patch work is based on previous work with Rob
> Vesse.
>
> On 21/02/18 12:32, DAVID MOLINA ESTRADA wrote:
>> Hi,
>>
>> I want to build an HA web application based on Fuseki servers, also
>> in HA. My idea is to create a Fuseki Docker image and deploy as many
>> instances as I need. For querying, all is OK, but I am trying to
>> define a mechanism (it may be based on topics with Hazelcast or
>> Kafka) to distribute changes to all nodes (both uploaded files and
>> SPARQL updates).
>>
>> Any recommendation or best practice? Has somebody done anything similar?
>> Thanks
>>
>> David Molina Estrada
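The "distribute every change to all nodes" mechanism discussed above can be sketched in miniature. This is only a toy illustration of the idea, not the rdf-delta API: in a real deployment each replica would be a Fuseki node and the change would arrive as a SPARQL update or an RDF patch over HTTP or a Hazelcast/Kafka topic; here plain Python sets stand in for the stores.

```python
# Toy sketch: keep several replicas in sync by applying the same
# change to every one of them. Replica stores are plain sets of
# (subject, predicate, object) tuples for illustration only.
replicas = [set(), set(), set()]

def apply_everywhere(added, removed=()):
    """Apply the same additions/removals to every replica."""
    for store in replicas:
        store.update(added)
        store.difference_update(removed)

apply_everywhere({("book1", "title", "A new title")},
                 removed={("book1", "title", "Old title")})
print(all(("book1", "title", "A new title") in s for s in replicas))  # True
```

The essential property is that every replica sees the same ordered stream of changes; a patch log server (or a Kafka topic) is one way to make that ordering explicit.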
Re: fuseki in HA
Hi David,

This is one of the main use cases for:

    https://afs.github.io/rdf-delta/

and there is a Fuseki component in that build that incorporates the mechanism needed for 2+ Fusekis to propagate changes [3] (a custom service /dataset/patch that accepts patch files and applies them).

The work has two parts - the data format needed to propagate changes (RDF Patch [1]) and a patch log server [2]. Keeping these two components separate is important because not all situations will want a patch server. Distribution using Hazelcast or Kafka, or publishing changes in the style of Atom/RSS, are good examples. By having a defined patch format, there is no reason why the various triplestores even have to all be Jena-based.

Apache licensed, not part of the Jena project.

Let me know what you think:

    Andy

[1] https://afs.github.io/rdf-delta/rdf-patch.html
[2] https://afs.github.io/rdf-delta/rdf-patch-logs.html
[3] https://github.com/afs/rdf-delta/tree/master/rdf-delta-fuseki

Disclosure: this is part of my $job at TopQuadrant. There is no reason not to start publishing it to Maven Central - I just haven't had the need to so far. The RDF Patch work is based on previous work with Rob Vesse.

On 21/02/18 12:32, DAVID MOLINA ESTRADA wrote:
> Hi,
>
> I want to build an HA web application based on Fuseki servers, also in HA. My idea is to create a Fuseki Docker image and deploy as many instances as I need. For querying, all is OK, but I am trying to define a mechanism (it may be based on topics with Hazelcast or Kafka) to distribute changes to all nodes (both uploaded files and SPARQL updates).
>
> Any recommendation or best practice? Has somebody done anything similar?
>
> Thanks
>
> David Molina Estrada
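For illustration, RDF Patch [1] is a line-oriented change log of additions and deletions bracketed by transaction markers. A minimal sketch of what a patch might look like (the example IRIs are invented; see [1] for the authoritative syntax):

```
TX .
A <http://example/book1> <http://purl.org/dc/terms/title> "A new title" .
D <http://example/book1> <http://purl.org/dc/terms/title> "Old title" .
TC .
```

A file of this shape is what the custom /dataset/patch service mentioned above would accept and apply to each receiving Fuseki.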
Re: Fuseki 2 HA or on-the-fly backups?
Great info, thanks.

> Some organisations achieve this by running a load balancer in front of
> several replicas then co-ordinating the update process.

So, they're running the same query against other nodes behind the load balancer to keep things in sync?

> You can do a live backup

So, an HTTP POST /$/backup/{name} initiates a backup and that results in a gzip-compressed N-Quads file. What does a restore look like from that file?

-J

On Mon, Aug 24, 2015 at 4:08 AM, Rob Vesse <rve...@dotnetrdf.org> wrote:

> Andy already answered 1 but more on 2.
>
> Assuming you use TDB, then in-memory checkpointing already happens. TDB
> caches data in memory but fundamentally is a persistent, disk-backed
> database that uses write-ahead logging for transactions and failure
> recovery, so this already happens automatically and is below the level
> of Fuseki (you get this behaviour wherever you use TDB, provided you
> use it transactionally, which Fuseki always does).
>
> Rob
>
> On 24/08/2015 05:51, Jason Levitt <slimands...@gmail.com> wrote:
>
>> Just wondering if there are any projects out there to provide:
>>
>> 1) HA (high availability) configuration of Fuseki, such as mirroring
>> or hot/standby failover.
>>
>> 2) Some kind of on-the-fly backup of Fuseki when it's running in RAM.
>> This might be similar to how Hadoop 1.x checkpoints the in-RAM
>> namenode data structures.
>>
>> BTW, are there any tools for testing the consistency of the Fuseki
>> data structures when Fuseki is temporarily halted?
>>
>> Cheers,
>>
>> Jason
Re: Fuseki 2 HA or on-the-fly backups?
On 24/08/15 16:15, Jason Levitt wrote:
> Great info, thanks.
>
>> Some organisations achieve this by running a load balancer in front of
>> several replicas then co-ordinating the update process.
>
> So, they're running the same query against other nodes behind the load
> balancer to keep things in sync?
>
>> You can do a live backup
>
> So, an HTTP POST /$/backup/{name} initiates a backup and that results in
> a gzip-compressed N-Quads file. What does a restore look like from that
> file?

You just load it into an empty database (tdbloader etc).

    Andy

> -J
>
> On Mon, Aug 24, 2015 at 4:08 AM, Rob Vesse <rve...@dotnetrdf.org> wrote:
>
>> Andy already answered 1 but more on 2.
>>
>> Assuming you use TDB, then in-memory checkpointing already happens. TDB
>> caches data in memory but fundamentally is a persistent, disk-backed
>> database that uses write-ahead logging for transactions and failure
>> recovery, so this already happens automatically and is below the level
>> of Fuseki (you get this behaviour wherever you use TDB, provided you
>> use it transactionally, which Fuseki always does).
>>
>> Rob
>>
>> On 24/08/2015 05:51, Jason Levitt <slimands...@gmail.com> wrote:
>>
>>> Just wondering if there are any projects out there to provide:
>>>
>>> 1) HA (high availability) configuration of Fuseki, such as mirroring
>>> or hot/standby failover.
>>>
>>> 2) Some kind of on-the-fly backup of Fuseki when it's running in RAM.
>>> This might be similar to how Hadoop 1.x checkpoints the in-RAM
>>> namenode data structures.
>>>
>>> BTW, are there any tools for testing the consistency of the Fuseki
>>> data structures when Fuseki is temporarily halted?
>>>
>>> Cheers,
>>>
>>> Jason
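Since the backup is just gzip-compressed N-Quads, it can be inspected or decompressed with standard tools before loading it into a fresh database. A minimal Python sketch (the backup file name and quad are made up for illustration; the tdbloader command in the comment assumes TDB):

```python
import gzip

# Hypothetical backup name; Fuseki's POST /$/backup/{name} operation
# produces a gzip-compressed N-Quads file of this general shape.
backup = "mydataset.nq.gz"

# Write a tiny stand-in backup so the sketch is self-contained.
quad = '<http://example/s> <http://example/p> "o" <http://example/g> .\n'
with gzip.open(backup, "wt", encoding="utf-8") as f:
    f.write(quad)

# "Restore" = decompress and bulk-load the file into an empty database:
#   gunzip mydataset.nq.gz
#   tdbloader --loc /path/to/new-db mydataset.nq
with gzip.open(backup, "rt", encoding="utf-8") as f:
    lines = f.read().splitlines()
print(len(lines))  # → 1 quad recovered from the backup
```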