Github user wmarshall484 commented on a diff in the pull request:

    
https://github.com/apache/incubator-quarks-website/pull/13#discussion_r57074967
  
    --- Diff: site/docs/recipe_source_function.md ---
    @@ -0,0 +1,86 @@
    +---
    +title: Recipe 2. Writing a Source Function
    +---
    +In the previous [Hello Quarks!](recipe_hello_quarks) example, we created a 
data source which only generates a single Java String and prints it to output. 
Yet Quarks sources can generate data of any type, not just built-in Java types 
such as Strings. Moreover, because the user supplies the code which generates 
the data, the user has complete flexibility in *how* the data is generated. 
This recipe demonstrates how a user could write such a custom data source.
    +
    +## Custom Source: Reading the Lines of a Web Page
    +{{site.data.alerts.note}} Quarks' API provides convenience methods for 
performing HTTP requests. For the sake of example we are writing an HTTP data 
source manually, but in principle there are easier methods. 
{{site.data.alerts.end}}
    +
    +One example of a custom data source could be retrieving the contents of a 
web page and printing each line to output. For example, the user could be 
querying the Yahoo Finance website for the most recent stock price data of Bank 
of America, Cabot Oil & Gas, and Freeport-McMoRan Inc:
    +
    +``` java
    +    public static void main(String[] args) throws Exception {
    +        DirectProvider dp = new DirectProvider();
    +        Topology top = dp.newTopology();
    +        
    +        final URL url = new 
URL("http://finance.yahoo.com/d/quotes.csv?s=BAC+COG+FCX&f=snabl");
    +   }
    +```
    +
    +Given the correctly formatted URL to request the data, we can use the 
*Topology.source* method to generate each line of the page as a data item on 
the stream. *Topology.source* takes a Java Supplier that returns an Iterable. 
The supplier is invoked once, and the items returned from the Iterable are used 
as the stream's data items. For example, the following *queryWebsite* method 
returns a supplier which queries a URL and returns an Iterable of its contents:
    +
    +``` java
    +    private static Supplier<Iterable<String>> queryWebsite(URL url) 
throws Exception {
    +        return () -> {
    +            List<String> lines = new LinkedList<>();
    +            // try-with-resources closes the stream even if reading fails
    +            try (BufferedReader br = new BufferedReader(
    +                    new InputStreamReader(url.openStream()))) {
    +                for (String s = br.readLine(); s != null; s = br.readLine())
    +                    lines.add(s);
    +            } catch (Exception e) {
    +                e.printStackTrace();
    +            }
    +            return lines;
    +        };
    +    }
    +```
    +
    + When invoking *Topology.source*, we can use *queryWebsite* to return the 
required supplier, passing in the URL.
    + 
    + ``` java
    +     public static void main(String[] args) throws Exception {
    +        DirectProvider dp = new DirectProvider();
    +        Topology top = dp.newTopology();
    +        
    +        final URL url = new 
URL("http://finance.yahoo.com/d/quotes.csv?s=BAC+COG+FCX&f=snabl");
    +        
    +        TStream<String> linesOfWebsite = top.source(queryWebsite(url));
    +}
    --- End diff --
    
    Fixed.
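
The supplier pattern in the diff above can be tried without Quarks on the classpath. What follows is a minimal sketch, assuming `java.util.function.Supplier` as a stand-in for whatever supplier type *Topology.source* actually expects. The names `LineSourceSketch` and `linesOf` are hypothetical and not part of the recipe or the Quarks API. It reads from any `Reader`, so an in-memory string can stand in for the web page.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.function.Supplier;

class LineSourceSketch {

    // Hypothetical helper (an assumption, not a Quarks method): builds a
    // supplier that, each time it is invoked, opens a reader and returns
    // all of its lines as an Iterable.
    public static Supplier<Iterable<String>> linesOf(Callable<Reader> opener) {
        return () -> {
            List<String> lines = new ArrayList<>();
            // try-with-resources closes the reader even if reading fails
            try (BufferedReader br = new BufferedReader(opener.call())) {
                for (String s = br.readLine(); s != null; s = br.readLine()) {
                    lines.add(s);
                }
            } catch (Exception e) {
                // Surface the failure instead of returning a partial list
                throw new RuntimeException("failed to read source", e);
            }
            return lines;
        };
    }

    public static void main(String[] args) {
        // An in-memory string stands in for the contents of a web page
        Supplier<Iterable<String>> supplier =
                linesOf(() -> new StringReader("first\nsecond\nthird"));
        for (String line : supplier.get()) {
            System.out.println(line);
        }
    }
}
```

To read from a URL as in the recipe, the opener could presumably be written as `() -> new InputStreamReader(url.openStream())`. Rethrowing on failure rather than printing the stack trace is a design choice of this sketch, not something the recipe prescribes.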

