Yes, that is the main idea. Please see the architecture thread "Databricks Notebooks for Spark", and also read about IPython notebooks.
--Srinath

On Mon, Nov 23, 2015 at 5:07 PM, Seshika Fernando <[email protected]> wrote:

> Hi Srinath,
>
> The 'notebooks' that you talk about: are they similar to a sort of staging
> DAS configuration where we test out/try out things, and once we are happy
> we deploy that configuration to the respective pipelines (like Realtime)?
>
> Cannot access the doc. Can you share a 'view only' link?
>
> seshi
>
> On Mon, Nov 23, 2015 at 2:57 PM, Srinath Perera <[email protected]> wrote:
>
>> Hi All,
>>
>> I tried to write down the use cases, to start thinking about this
>> starting from what we discussed in the meeting. Please comment. (The doc is at
>> https://docs.google.com/document/d/1355YEXbhcd2fvS-zG_CiMigT-iTncxYn3DTHlJRTYyo/edit#
>> and the same content is below.)
>>
>> Thanks
>> Srinath
>>
>> Batch, Interactive, and Predictive Story
>>
>> 1. Data is uploaded to the system or sent as a data stream and collected
>>    for some time (in DAS).
>> 2. The Data Scientist comes in, selects a data set, looks at the schema of
>>    the data, and computes standard descriptive statistics such as mean,
>>    max, percentiles, and standard deviation.
>> 3. The Data Scientist cleans up the data using a series of transformations.
>>    This might include combining multiple data sets into one data set.
>>    [Notebooks]
>> 4. He can play with the data interactively.
>> 5. He visualizes the data in several ways. [Notebooks]
>> 6. If he needs descriptive statistics, he can export the data mutations
>>    in the notebooks as a script and schedule it.
>> 7. If what he needs is machine learning, he can initialize and run the
>>    ML Wizard from the Notebooks and create a model.
>> 8. He can export the model he created and any data mutation operations
>>    he did as a script, and deploy both the model and the data mutation
>>    operations in the CEP (Realtime Pipeline). This is the actual
>>    transaction flow.
>> 9. He can export the data mutation operations and machine learning model
>>    building logic as a script and schedule it to run periodically. This is
>>    the
>>
>> [image: NotebookPipeline.png]
>>
>> Realtime Story
>>
>> For the realtime story too, we can start with a data set, write realtime
>> queries, test them by replaying the data, and only then deploy the
>> queries. (We do this even now.) We can do the same.
>>
>> 1. The user starts with a dataset.
>> 2. He writes a set of queries using the dataset as a stream. Streams and
>>    datasets share the same record format. For example, consider the
>>    following data set.
>>
>>    We can consider this as a batch data set by taking it as a whole, or as
>>    a stream by taking it record by record.
>>
>>    For example, if we run the query
>>
>>    select * from CountryData where GDP>35000
>>
>>    it will provide the following results.
>>
>> 3. Tables created by replaying data with CEP queries can be visualized
>>    like other data (except that time is special).
>> 4. When the Data Scientist is happy, he can click a button and export the
>>    CEP queries as an execution plan and any charts as realtime gadgets.
>>    (One complication is that time is special, and we need to transform
>>    any visualization into a time-based visualization.)
>>
>> --
>> ============================
>> Blog: http://srinathsview.blogspot.com twitter:@srinath_perera
>> Site: http://people.apache.org/~hemapani/
>> Photos: http://www.flickr.com/photos/hemapani/
>> Phone: 0772360902
>>
>> _______________________________________________
>> Architecture mailing list
>> [email protected]
>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>
>
