Hi,

Following the discussion, I drew two sequence diagrams to make it clear how the chart will react when the pivot is changed.

Safe Level
https://issues.apache.org/jira/secure/attachment/12707251/Changing%20the%20pivot%20-%20Safe%20Level.png

In the Safe Level (the default level), only a limited amount of data is retrieved, just enough to draw the chart. At the initial stage, local storage does not contain any data. By the time you change the pivot, the data is already in local storage, so the graph can be redrawn from it. If that data is out-dated, we get it again from the back-end.

Restricted Level
https://issues.apache.org/jira/secure/attachment/12707250/Changing%20the%20pivot%20-%20Restricted%20Level.png

The user reaches the Restricted Level after successfully passing the Safe Level. At that point local storage holds up-to-date data, but this level uses all the data in the database, so charting will pull data from both local storage and the back-end.

Your ideas are most welcome.

On Wed, Mar 25, 2015 at 2:55 PM, madhuka udantha <madhukaudan...@gmail.com> wrote:

> Hi,
>
> I would like to know about the code structure and the Zeppelin architecture. Is there any good post, article, or wiki about them?
> Also, if there is a quick start guide for Zeppelin development, please share it with me.
>
> Thanks.
>
> On Mon, Mar 23, 2015 at 10:44 AM, madhuka udantha <madhukaudan...@gmail.com> wrote:
>
>> Hi, moon
>>
>> Yes, since
>>
>>> "Moving computation is cheaper than moving data"
>>
>> we can do the computation in the computing framework.
>>
>> Simple pivot changes or filtering can be handled in local storage with indexed databases, depending on the current user level.
>> As you saw, computations will be handled in the back ends.
>>
>> Great to hear about building a rich GUI; I will share my chart library ideas there.
>>
>> Your ideas are always welcome; they will be helpful for my task and draft proposal.
>>
>> Thanks
>>
>> On Mon, Mar 23, 2015 at 7:59 AM, moon soo Lee <m...@apache.org> wrote:
>>
>>> Hi, madhuka udantha
>>>
>>> I think your idea about the chart library and data transformation engine sounds cool. For the data transform modules, it's a good idea to make them pluggable into the data transform engine. But I'm not sure that getting the result locally and doing the transform there for pivot or filtering, to avoid running the query again, is a good idea. Zeppelin is (not limited to, but) trying to build an analytical environment on top of distributed computing frameworks like Spark, Flink, Ignite, etc. Most of the distributed computing frameworks Zeppelin is trying to integrate follow the same paradigm: "Moving computation is cheaper than moving data". In this setting, the size of the data that the transform engine needs to handle can easily be multiple TB, which would take a long time to copy to a local machine and process. So I think the transform module should run on the underlying distributed computing framework.
>>>
>>> And about the chart library, we have started a discussion thread about building a rich GUI inside the notebook. It might be related.
>>>
>>> Thanks,
>>> moon
>>>
>>> On Mon, Mar 23, 2015 at 2:27 AM madhuka udantha <madhukaudan...@gmail.com> wrote:
>>>
>>> > On Sun, Mar 22, 2015 at 7:05 PM, Corneau Damien <cornead...@apache.org> wrote:
>>> >
>>> > > Hi,
>>> > >
>>> > > Being able to aggregate on the query side is a great idea and would allow us to transfer less data as well as having a full query representation of the visualization.
>>> > >
>>> > > However, creating a SQL query dynamically is a pretty difficult task, and might be too much for that scope.
>>> > >
>>> > > Also I see some possible problems with this method:
>>> > > - Changing the pivot or simple filtering would mean running the query again
>>> > >
>>> > No, the query won't run again.
>>> > On the first run of the query, data is collected and stored locally in local storage [1] (using indexing techniques to make retrieval faster). So changing the pivot or simple filtering will use the local storage.
>>> > If any attribute or data is missing in local storage, only that part will be retrieved, which saves network bandwidth as well.
>>> > Does my explanation make sense?
>>> >
>>> > > - Being able to make a pivot-style SQL query would be really hard; we would need multiple sub-queries or sometimes even multiple queries (I tried a few times and could get the wanted result only with a visualization-side pivot).
>>> > > It would end up with really bad SQL queries, especially with the Hive SQL or Spark SQL limitations, and would take way more time to process.
>>> > >
>>> > Agreed. I'm not planning to use pivot-style queries.
>>> >
>>> > Any suggestions?
>>> >
>>> > Thanks.
>>> >
>>> > > On Sun, Mar 22, 2015 at 10:08 PM, IT CTO <goi....@gmail.com> wrote:
>>> > >
>>> > > > Hi,
>>> > > >
>>> > > > The chart library features sound promising.
>>> > > > As for the data engine, one thing that I think is missing is the ability to use the visualization to drive the aggregation in the SQL. Today, you first write the SQL, you execute it, *limited by the number of results sent to the client*, and then you use viz to understand the results.
>>> > > > Alternatively, if through the visualization I can generate a better SQL which returns an aggregated data-set, then I can analyze a bigger amount of data.
>>> > > >
>>> > > > I hope I was clear enough in my explanation :-)
>>> > > >
>>> > > > Eran
>>> > > >
>>> > > > On Fri, Mar 20, 2015 at 8:21 AM, madhuka udantha <madhukaudan...@gmail.com> wrote:
>>> > > >
>>> > > > > Hi,
>>> > > > >
>>> > > > > Here are my proposed ideas.
>>> > > > > According to the COMDEV-119 JIRA, charts have been hard-coded until now, and the data transformation issue was highlighted because different charts have different pivot fields, e.g. area charts, scatter charts, surface charts, bubble charts, radar charts, etc.
>>> > > > >
>>> > > > > To solve this I am introducing two major components: a 'Chart library' and a 'Data transformation engine'. The chart library is where the currently plugged-in charts are shown. There we can plug in chart types, and those can be reused.
>>> > > > >
>>> > > > > *Chart library features*
>>> > > > >
>>> > > > > - Users can select the chart from the library
>>> > > > > - Those charts are pluggable into the library
>>> > > > > - Charts can be plugged in by config (JSON) or through a UI wizard
>>> > > > > - The configuration/meta file of the chart contains interface, libs, themes and data transformation types/mappings
>>> > > > >
>>> > > > > *Data Transformation Engine*
>>> > > > > The 'Data transformation engine' contains data transformation modules. Those modules are also pluggable into the engine. Those have connections to charts.
>>> > > > > The data transformation engine sits between the data (SQL) and the chart.
>>> > > > > So this module converts the data and maps it to each chart's pivot fields.
>>> > > > >
>>> > > > > - This module will look at the pivot fields of the chart
>>> > > > > - Selected attributes of the SQL query
>>> > > > > - Attribute value operations improvement (string split, value aggregation, number rounding)
>>> > > > >
>>> > > > > Another improvement that I noticed is
>>> > > > >
>>> > > > > - Query editor auto-completion support (with Ctrl+Space)
>>> > > > >
>>> > > > > Your ideas are welcome here.
>>> > > > > Thanks
>>> > > > >
>>> > > > > On Fri, Mar 20, 2015 at 10:57 AM, madhuka udantha <madhukaudan...@gmail.com> wrote:
>>> > > > >
>>> > > > > > Hi All,
>>> > > > > >
>>> > > > > > I'm Udantha, an MSc student at the University of Moratuwa. This GSoC 2015 project, COMDEV-119, captures my interest.
>>> > > > > >
>>> > > > > > I have extensive experience with visualization techniques, having created numerous dashboards [1,2] with JavaScript, HTML5, AngularJS, D3 charting, etc.
>>> > > > > >
>>> > > > > > My current research area is big data, where I have worked with various types of data sets. I'm also working with cluster representation and classification techniques, where visualization amounts to a considerable part. I have been following COMDEV-119 (JIRA) with Alexander Bezzubov and CORNEAU Damien for more than a week.
>>> > > > > >
>>> > > > > > Thanks
>>> > > > > >
>>> > > > > > [1] http://wso2.com/products/user-engagement-server/
>>> > > > > > [2] https://github.com/wso2/jaggery
>>> > > > > > --
>>> > > > > > Cheers,
>>> > > > > > Madhuka Udantha
>>> > > > > > http://madhukaudantha.blogspot.com
>>> > > > > >
>>> > > > >
>>> > > > > --
>>> > > > > Cheers,
>>> > > > > Madhuka Udantha
>>> > > > > http://madhukaudantha.blogspot.com
>>> > > > >
>>> > > >
>>> > > > --
>>> > > > Eran | CTO
>>> > > >
>>> > >
>>> >
>>> > --
>>> > Cheers,
>>> > Madhuka Udantha
>>> > http://madhukaudantha.blogspot.com
>>> >
>>>
>>
>> --
>> Cheers,
>> Madhuka Udantha
>> http://madhukaudantha.blogspot.com
>>
>
> --
> Cheers,
> Madhuka Udantha
> http://madhukaudantha.blogspot.com
>

--
Cheers,
Madhuka Udantha
http://madhukaudantha.blogspot.com
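
P.S. To make the Safe Level flow at the top of this mail a bit more concrete, here is a rough sketch of one way it could work. It is not Zeppelin code. All the names (`getRows`, `pivot`, `fetchFromBackend`, `localStore`, and the `maxAgeMs` freshness window) are made up for illustration, the sample rows are invented, and an in-memory `Map` stands in for the browser's local storage so the sketch runs anywhere.

```typescript
// Hypothetical sketch only: none of these names come from Zeppelin.
type Row = Record<string, string | number>;

interface CacheEntry {
  rows: Row[];
  fetchedAt: number; // epoch milliseconds when the rows were fetched
}

// Stand-in for browser local storage, keyed by a query id.
const localStore = new Map<string, CacheEntry>();

// Stand-in for the back-end; records every query id it is asked for.
const backendCalls: string[] = [];
function fetchFromBackend(queryId: string): Row[] {
  backendCalls.push(queryId);
  return [
    { city: "Colombo", year: 2014, sales: 10 },
    { city: "Colombo", year: 2015, sales: 15 },
    { city: "Kandy", year: 2014, sales: 7 },
  ];
}

// Return rows from the local copy unless it is missing or out-dated.
function getRows(queryId: string, now: number, maxAgeMs: number): Row[] {
  const cached = localStore.get(queryId);
  if (cached !== undefined && now - cached.fetchedAt <= maxAgeMs) {
    return cached.rows;
  }
  const rows = fetchFromBackend(queryId);
  localStore.set(queryId, { rows, fetchedAt: now });
  return rows;
}

// Change the pivot locally: sum `valueField` grouped by `pivotField`.
function pivot(rows: Row[], pivotField: string, valueField: string): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    const key = String(row[pivotField]);
    const value = Number(row[valueField]);
    totals.set(key, (totals.get(key) ?? 0) + value);
  }
  return totals;
}

// First draw hits the back-end; the pivot change one second later does not.
const byCity = pivot(getRows("q1", 0, 60000), "city", "sales");
const byYear = pivot(getRows("q1", 1000, 60000), "year", "sales");
```

Only the second pivot is served from the local copy here. Once the copy is older than `maxAgeMs`, the next call goes to the back-end again, which is the "out-dated" branch in the Safe Level diagram.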