Github user wmarshall484 commented on a diff in the pull request:

    
https://github.com/apache/incubator-quarks-website/pull/13#discussion_r57074986
  
    --- Diff: site/docs/recipe_source_function.md ---
    @@ -0,0 +1,86 @@
    +---
    +title: Recipe 2. Writing a Source Function
    +---
    +In the previous [Hello Quarks!](recipe_hello_quarks) example, we created a 
data source which only generates a single Java String and prints it to output. 
Yet Quarks sources can generate any data type, not just simple Java types 
such as Strings. Moreover, because the user supplies 
the code which generates the data, the user has complete flexibility over *how* 
the data is generated. This recipe demonstrates how a user could write such a 
custom data source.
    +
    +## Custom Source: Reading the Lines of a Web Page
    +{{site.data.alerts.note}} Quarks' API provides convenience methods for 
performing HTTP requests. For the sake of example we are writing an HTTP data 
source manually, but in principle there are easier methods. 
{{site.data.alerts.end}}
    +
    +One example of a custom data source could be retrieving the contents of a 
web page and printing each line to output. For example, the user could be 
querying the Yahoo Finance website for the most recent stock price data of Bank 
of America, Cabot Oil & Gas, and Freeport-McMoRan Inc:
    +
    +``` java
    +    public static void main(String[] args) throws Exception {
    +        DirectProvider dp = new DirectProvider();
    +        Topology top = dp.newTopology();
    +        
    +        final URL url = new 
URL("http://finance.yahoo.com/d/quotes.csv?s=BAC+COG+FCX&f=snabl");
    +    }
    +```
    +
    +Given the correctly formatted URL to request the data, we can use the 
*Topology.source* method to generate each line of the page as a data item on 
the stream. *Topology.source* takes a Java Supplier that returns an Iterable. 
The supplier is invoked once, and the items returned from the Iterable are used 
as the stream's data items. For example, the following *queryWebsite* method 
returns a supplier which queries a URL and returns an Iterable of its contents:
    +
    +``` java
    +    private static Supplier<Iterable<String>> queryWebsite(URL url) {
    +        return () -> {
    +            List<String> lines = new LinkedList<>();
    +            // try-with-resources closes the stream even if reading fails
    +            try (BufferedReader br = new BufferedReader(
    +                    new InputStreamReader(url.openStream()))) {
    +                // Collect every line of the response body
    +                for (String s = br.readLine(); s != null; s = br.readLine())
    +                    lines.add(s);
    +            } catch (Exception e) {
    +                // On failure, report the error and return the lines read so far
    +                e.printStackTrace();
    +            }
    +            return lines;
    +        };
    +    }
    +```
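The supplier contract described above (invoked once, with each item of the returned Iterable becoming one data item) can be tried without Quarks or a network connection. The following is only a minimal sketch under stated assumptions: `LineSourceSketch` and `linesOf` are hypothetical names, `java.util.function.Supplier` stands in for the supplier type that Quarks expects, and a `StringReader` with made-up sample lines replaces the web page.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class LineSourceSketch {

    /** Hypothetical helper: wraps a Reader factory in a supplier of its lines. */
    static Supplier<Iterable<String>> linesOf(Supplier<Reader> openReader) {
        return () -> {
            List<String> lines = new ArrayList<>();
            // try-with-resources closes the reader even if reading fails
            try (BufferedReader br = new BufferedReader(openReader.get())) {
                for (String s = br.readLine(); s != null; s = br.readLine()) {
                    lines.add(s);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return lines;
        };
    }

    public static void main(String[] args) {
        // Assumption: in-memory sample text stands in for the web page
        Supplier<Iterable<String>> supplier =
                linesOf(() -> new StringReader("first line\nsecond line\n"));

        // Mimic the contract: call get() once, then emit each item in turn
        for (String line : supplier.get()) {
            System.out.println(line);
        }
    }
}
```

Because the supplier only opens its input when `get()` is called, nothing is read until the caller asks for the data.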
    +
    + When invoking *Topology.source*, we can use *queryWebsite* to return the 
required supplier, passing in the URL.
    + 
    + ``` java
    +     public static void main(String[] args) throws Exception {
    +        DirectProvider dp = new DirectProvider();
    +        Topology top = dp.newTopology();
    +        
    +        final URL url = new 
URL("http://finance.yahoo.com/d/quotes.csv?s=BAC+COG+FCX&f=snabl");
    +        
    +        TStream<String> linesOfWebsite = top.source(queryWebsite(url));
    +    }
    + ```
    + 
    + Source methods such as *Topology.source* and *Topology.strings* return a 
*TStream*. If we print the *linesOfWebsite* stream to standard output and run 
the application, we can see that it correctly generates the data and feeds it 
into the Quarks runtime:
    +
    +Output:
    +
    +```java
    +"BAC","Bank of America Corporation Com",13.150,13.140,"12:00pm - 
<b>13.145</b>"
    +"COG","Cabot Oil & Gas Corporation Com",21.6800,21.6700,"12:00pm - 
<b>21.6775</b>"
    +"FCX","Freeport-McMoRan, Inc. Common S",8.8200,8.8100,"12:00pm - 
<b>8.8035</b>"
    +```
    +
    +## Polling Source: Reading Data Periodically
    +A much more common scenario for a developer is the periodic generation of 
data from a source operator: a data source may need to be polled every 5 
seconds, every 3 hours, or at any other interval. To this end, *Topology* 
exposes the *poll* method, which calls a function at the frequency of the user's 
choosing. For example, a user might want to query Yahoo Finance every two 
seconds to retrieve the most up-to-date ticker price for a stock:
    +
    +```java
    +    public static void main(String[] args) throws Exception {
    +        DirectProvider dp = new DirectProvider();
    +        Topology top = dp.newTopology();
    +        
    +        final URL url = new 
URL("http://finance.yahoo.com/d/quotes.csv?s=BAC+COG+FCX&f=snabl");
    +        
    +        TStream<Iterable<String>> source = top.poll(queryWebsite(url), 2, 
TimeUnit.SECONDS);
    +        source.print();
    +        
    +        dp.submit(top);
    +    }
    +```
    +
    +**Output:**
    +<br>
    +<img src="images/pollingSource.gif">
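To get a feel for what polling a supplier at a fixed period means, the idea can be approximated with the JDK alone. This is a rough sketch, not how Quarks implements *poll*: `PollSketch` and `pollTimes` are hypothetical names, a `ScheduledExecutorService` plays the role of the runtime, and a simple counter replaces the web query.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class PollSketch {

    /** Hypothetical helper: calls the supplier once per period, `count` times in total. */
    static <T> List<T> pollTimes(Supplier<T> data, long period, TimeUnit unit, int count) {
        List<T> results = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(count);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            // Guard against an extra run that may fire before shutdown takes effect
            if (done.getCount() > 0) {
                results.add(data.get());
                done.countDown();
            }
        }, 0, period, unit);
        try {
            done.await(); // wait until the supplier has been polled `count` times
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return results;
    }

    public static void main(String[] args) {
        // Assumption: a counter stands in for the periodically queried web page
        AtomicInteger counter = new AtomicInteger();
        List<Integer> polled = pollTimes(counter::incrementAndGet, 200, TimeUnit.MILLISECONDS, 3);
        System.out.println(polled);
    }
}
```

Unlike this sketch, which stops after a fixed number of polls so that it terminates, a polling source in a topology keeps producing items for as long as the topology runs.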
    +
    +It's important to note that calls to *DirectProvider.submit* are 
non-blocking; the main thread will exit, and the threads executing the topology 
will continue to run. (Also, to see changing stock prices, the above example 
needs to be run during open trading hours. Otherwise, it will simply return the 
same results every time the website is polled.)
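The non-blocking behaviour can be illustrated with a plain JDK analogy. The sketch below makes no claim about how *DirectProvider.submit* is implemented: `NonBlockingSubmitSketch` and `submitInBackground` are hypothetical names, and an `ExecutorService` stands in for the threads that execute a topology.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class NonBlockingSubmitSketch {

    /** Hypothetical stand-in for a non-blocking submit: starts the work and returns at once. */
    static Future<?> submitInBackground(Runnable work) {
        ExecutorService executor = Executors.newSingleThreadExecutor(); // non-daemon worker thread
        Future<?> handle = executor.submit(work);
        executor.shutdown(); // accept no new work; the submitted task still runs to completion
        return handle;
    }

    public static void main(String[] args) throws Exception {
        CountDownLatch release = new CountDownLatch(1);
        Future<?> job = submitInBackground(() -> {
            try {
                release.await(); // the background work keeps going until released
                System.out.println("background work finished");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // The call has already returned although the work is still pending
        System.out.println("submit returned, done = " + job.isDone()); // prints "submit returned, done = false"
        release.countDown();
        job.get(); // optional: block until the background work completes
    }
}
```

Because the worker thread is not a daemon thread, the JVM stays alive until the background work ends, even if `main` returns first.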
    --- End diff --
    
    Fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
