Github user ejwhite922 commented on a diff in the pull request:
    --- Diff: extras/rya.manual/src/site/markdown/ ---
    @@ -0,0 +1,385 @@
    +[comment]: # Licensed to the Apache Software Foundation (ASF) under one
    +[comment]: # or more contributor license agreements.  See the NOTICE file
    +[comment]: # distributed with this work for additional information
    +[comment]: # regarding copyright ownership.  The ASF licenses this file
    +[comment]: # to you under the Apache License, Version 2.0 (the
    +[comment]: # "License"); you may not use this file except in compliance
    +[comment]: # with the License.  You may obtain a copy of the License at
    +[comment]: # 
    +[comment]: #
    +[comment]: # 
    +[comment]: # Unless required by applicable law or agreed to in writing,
    +[comment]: # software distributed under the License is distributed on an
    +[comment]: # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +[comment]: # KIND, either express or implied.  See the License for the
    +[comment]: # specific language governing permissions and limitations
    +[comment]: # under the License.
    +# Rya Streams
    +Introduced in 3.2.12
    +## Disclaimer
    +This is a Beta feature. We do not guarantee newer versions of Rya Streams
    +will be compatible with this version. You may need to remove all of your
    +queries and their associated data from your Rya Streams system and then 
    +reprocess them using the upgraded system.
    +# Table of Contents
    +- [Introduction](#introduction)
    +- [Architecture](#architecture)
    +- [Quick Start](#quick-start)
    +- [Use Cases](#use-cases)
    +- [Future Work](#future-work)
    +<div id='introduction'/>
    +## Introduction 
    +Rya Streams is a system that processes SPARQL queries over streams of RDF 
    +Statements that may have Visibilities attached to them. It does this by 
    +utilizing Kafka as a data processing platform.
    +There are three basic building blocks that the system depends on:
    +* **Streams Query** - This is a SPARQL query that is registered with Rya 
    +  It is associated with a specific Rya instance because that Rya instance 
    +  determines which Statements the query will evaluate. It has an ID that 
    +  uniquely identifies it across all of the queries that are managed by the 
    +  system, whether or not the system should be processing it, and whether or not
    +  the results of the query need to be inserted back into the Rya instance 
    +  source statements come from.
    +* **Query Change Log** - A list of changes that have been performed to the
    +  Streams Queries of a specific Rya Instance. This log contains the 
    +  truth about what queries are registered, which are running, and which 
    +  new statements that need to be inserted back into Rya.
    +* **Query Manager** - A daemon application that reacts to new/deleted 
    +  Change Logs as well as new entries within those logs. It starts and stops
    +  Kafka Streams processing jobs for the active Streams Queries. 
    +The Quick Start section explains how the Rya Streams Client is used to interact
    +with the system using a simple SPARQL query and some sample Statements.
    +<div id='architecture'/>
    +## Architecture ##
    +The following image is a high level view of how Rya Streams interacts with
    +Rya to process queries.
    +![alt text](../images/rya-streams-high-level.png)
    +1. The Rya Streams client is used to register/update/delete queries.
    +2. Rya Streams notices the change and starts/stops processing a query based on
    +   what the change was.
    +3. Statements are discovered for ingest by whatever application is loading 
    +   into Rya.
    +4. Those Statements are written to Rya Streams so that the Streams Queries 
    +   process them and produce results.
    +5. Those Statements are also written to the Rya instance for ad-hoc 
    +6. Rya Streams produces Visibility Binding Sets and/or Visibility Statements
    +   that are written back to Rya.
    +7. Those same Visibility Binding Sets and/or Visibility Statements are made
    +   available to other systems as well. 
    +<div id='quick-start'/>
    +## Quick Start ##
    +This tutorial demonstrates how to install and start the Rya Streams system. It
    +must be configured and running on its own before any Rya instances may use 
    +After performing the steps of this quick start, you will have installed 
    +system, demonstrated that it is functioning properly, and then may use the 
    +Shell to configure Rya instances to use it.  
    +This tutorial assumes you are starting fresh and have no existing Kafka or
    +Zookeeper data. The two services must already be installed and running.
    +### Step 1: Download the applications ###
    +You can fetch the artifacts you need to follow this Quick Start from our
    +[downloads page]( Click on the release of
    +interest and follow the "Central repository for Maven and other dependency 
    +managers" URL.
    +Fetch the following two artifacts:
    +Artifact Id | Type 
    +--- | ---
    +rya.streams.client | shaded jar
    +rya.streams.query-manager | rpm
    +### Step 2: Install the Query Manager ###
    +Copy the RPM to the CentOS 7 machine the Query Manager will be installed 
    +Install it using the following command:
    +yum install -y rya.streams.query-manager-3.2.12-incubating.noarch.rpm
    +It will install the program to **/opt/rya-streams-query-manager-3.2.12**. 
    +the directions that are in the README.txt file within that directory to 
    +### Step 3: Register a Streams Query with Rya Streams ###
    +Use the Rya Streams Client to register the following Streams Query:
    +SELECT * 
    +WHERE { 
    +    ?person <urn:talksTo> ?employee .
    +    ?employee <urn:worksAt> ?business 
    +}
    +We assume Kafka is running on the local machine using the standard Kafka 
    +Issue the following command:
    +java -jar rya.streams.client-3.2.12-incubating-SNAPSHOT-shaded.jar add-query \
    +    -i localhost -p 9092 -r rya-streams-quick-start -a true \
    +    -q "SELECT * WHERE { ?person <urn:talksTo> ?employee .?employee <urn:worksAt> ?business }"
    +The Query Manager should eventually see that this query was registered and 
    +start a Rya Streams job that will begin processing it using any Visibility 
    +Statements that have been loaded into Rya Streams. If no results are 
    +in step 6, then verify the Query Manager is working properly. Its logs can be
    +found in **/opt/rya-streams-query-manager-3.2.12/logs**.
    +### Step 4: Start watching for results ###
    +We need to fetch the Query ID of the query we want to watch. This can be looked
    +up by issuing the following command:
    +java -jar rya.streams.client-3.2.12-incubating-SNAPSHOT-shaded.jar list-queries \
    +    -i localhost -p 9092 -r rya-streams-quick-start
    +The client will print something that looks like this:
    +Queries in Rya Streams:
    +ID: 8dd689ee-9d16-4aa7-91c0-667cdb3ed81a    Is Active: true     Query: SELECT * WHERE { ?person <urn:talksTo> ?employee .?employee <urn:worksAt> ?business }
    +Now we know that if we want to reference the query we registered in step 3, we
    +need to use the Query ID **8dd689ee-9d16-4aa7-91c0-667cdb3ed81a**. Start 
    +watching the Stream Query's output using the following command:
    +java -jar rya.streams.client-3.2.12-incubating-SNAPSHOT-shaded.jar stream-results \
    +    -i localhost -p 9092 -r rya-streams-quick-start -q 
    +The command will not finish. The console will print results once they are 
    +by the Rya Streams job that is processing the query. Results will appear 
    +Statements have been loaded.
    +### Step 5: Load data into the input topic ###
    +In a new terminal, create a file named **quick-start-data.nt** that 
    +the following text: 
    +<urn:Bob> <urn:worksAt> <urn:TacoPlace> .
    +<urn:Charlie> <urn:worksAt> <urn:BurgerJoint> .
    +<urn:Eve> <urn:worksAt> <urn:CoffeeShop> .
    +<urn:Bob> <urn:worksAt> <urn:BurgerJoint> .
    +<urn:Alice> <urn:talksTo> <urn:Bob> .
    +For the sake of simplicity within this quick start, we aren't going to use 
    +visibilities when we load the statements. Load the file using the 
    +java -jar rya.streams.client-3.2.12-incubating-SNAPSHOT-shaded.jar load-statements \
    +    -i localhost -p 9092 -r rya-streams-quick-start -v "" -f 
    +### Step 6: Observe the results that appear ###
    +Go back to the terminal that was listening for results. The following 
    +will have appeared: 
    +  names: 
    +    [name]: person  ---  [value]: urn:Alice
    +    [name]: employee  ---  [value]: urn:Bob
    +    [name]: business  ---  [value]: urn:BurgerJoint
    +  Visibility: 
    +  names: 
    +    [name]: person  ---  [value]: urn:Alice
    +    [name]: employee  ---  [value]: urn:Bob
    +    [name]: business  ---  [value]: urn:TacoPlace
    +  Visibility: 
    +<div id='use-cases'/>
    +## Use Cases ##
    +### Alerting ###
    +An alerting system's job is to notify people, or another application, when 
    +something of interest has been observed. Alert latency is the amount of time
    +it takes for a system to issue the alert after the observations required to do
    +so have been made. How the alerts are used determines how much latency is 
    +to be tolerated. For example, if the alerting system is used to discover when a
    +fire has started in a building, then that latency period needs to be short in
    +order to save the building. However, some systems do not require low latency.
    +Status updates typically aren't time critical. An alert that your package has
    +been shipped doesn't need to be timely since your package still hasn't 
    +Rya Streams can be used to build alerting systems. The conditions 
    +generate alerts are encapsulated within a SPARQL query, the observations 
    +are watched are the RDF Statements, and the alert is a Binding Set or 
    +constructed Statement that is output by the query. It being low/high 
    +will depend on how much data needs to be processed and how performant the 
    +alerting queries are. 
    +### Inference Engine - Forward Chaining ###
    +An inference engine that uses forward chaining starts with a set of 
    +rules and some data. It uses the rules to generate new data until it 
    +its way to some end goal/answer. For example, suppose that the goal is to 
    +conclude the color of a pet named Fritz, given that he croaks and eats flies,
    +and that the rule base contains the following four rules:
    + 1. If X croaks and X eats flies - Then X is a frog
    + 2. If X chirps and X sings - Then X is a canary
    + 3. If X is a frog - Then X is green
    + 4. If X is a canary - Then X is yellow
    +If the following two facts are evaluated using this rule set:
    +* Fritz croaks
    +* Fritz eats flies
    +Then the following facts are inferred about Fritz:
    +* Fritz is a frog
    +* Fritz is green
    +Rya Streams can be used to build an inference engine that executes 
    +rule set. SPARQL queries are used to generate new Statements from the 
    +set of Statements. For example:
    +INSERT {
    +    ?x <urn:isA> <urn:frog> 
    --- End diff --
    Could use `a` or `rdf:type` instead of `<urn:isA>`:
    `?x a <urn:frog>`
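
To make the suggestion concrete, here is a minimal Python sketch of the Fritz rule set from the diff, with the frog and canary conclusions written as `rdf:type` triples instead of `<urn:isA>`. The predicates `urn:does`, `urn:eats` and `urn:hasColor`, the object URNs other than `urn:frog`, and the `forward_chain` helper are hypothetical names chosen for illustration; this is not how Rya Streams evaluates rules, only a picture of what the rules are meant to derive.

```python
# Full IRI that the SPARQL keyword `a` abbreviates.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Each rule is (conditions, conclusion). Conditions and conclusions are
# (predicate, object) pairs that apply to the same subject X.
# All predicate names here are assumptions made for this sketch.
RULES = [
    ([("urn:does", "urn:croak"), ("urn:eats", "urn:flies")], (RDF_TYPE, "urn:frog")),
    ([("urn:does", "urn:chirp"), ("urn:does", "urn:sing")], (RDF_TYPE, "urn:canary")),
    ([(RDF_TYPE, "urn:frog")], ("urn:hasColor", "urn:green")),
    ([(RDF_TYPE, "urn:canary")], ("urn:hasColor", "urn:yellow")),
]


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new triples are produced."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        subjects = {s for s, _, _ in known}
        for conditions, conclusion in rules:
            for s in subjects:
                # A rule fires for subject s when every condition triple is known.
                if all((s, p, o) in known for p, o in conditions):
                    new_triple = (s, conclusion[0], conclusion[1])
                    if new_triple not in known:
                        known.add(new_triple)
                        changed = True
    return known


if __name__ == "__main__":
    facts = {
        ("urn:Fritz", "urn:does", "urn:croak"),
        ("urn:Fritz", "urn:eats", "urn:flies"),
    }
    for triple in sorted(forward_chain(facts, RULES) - facts):
        print(triple)
```

Using `rdf:type` for the class membership keeps the derived Statements compatible with tools that already understand RDF typing, which is the point of preferring `a` over a custom `<urn:isA>` predicate.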

