Hi Alex,

This is the email thread I used to collaborate with the Zeppelin community [1]. I am currently writing the proposal based on the points discussed here and on JIRA. If I have missed anything regarding the task, please feel free to share it. I'll share my proposal when I finish writing it.
Thanks.

[1] https://issues.apache.org/jira/browse/COMDEV-119

On Wed, Mar 25, 2015 at 11:09 PM, madhuka udantha <madhukaudan...@gmail.com> wrote:

> Hi,
>
> Following the discussion, and to give a clear picture, I drew two
> sequence diagrams that explain how the chart reacts to a pivot change.
>
> Safe Level
>
> https://issues.apache.org/jira/secure/attachment/12707251/Changing%20the%20pivot%20-%20Safe%20Level.png
>
> In the safe level (the default level) only a limited amount of data is
> retrieved (sufficient to draw the chart).
> At the initial stage local storage doesn't contain data. But when you
> make a pivot change, the data will be there to draw the graph. If the
> data is outdated we will get it from the back end.
>
> Restricted Level
>
> https://issues.apache.org/jira/secure/attachment/12707250/Changing%20the%20pivot%20-%20Restricted%20Level.png
>
> The user reaches the Restricted Level after successfully passing the
> Safe Level. By then local storage holds up-to-date data. But this level
> uses all the data in the database, so charting will grab the data from
> both local storage and the back end.
>
> Your ideas are most welcome.
>
> On Wed, Mar 25, 2015 at 2:55 PM, madhuka udantha <madhukaudan...@gmail.com> wrote:
>
>> Hi,
>>
>> I would like to know about the code structure and Zeppelin
>> architecture. Is there a good post, article, or wiki page about them?
>> Also, if there is a quick start guide for Zeppelin development,
>> please share it with me.
>>
>> Thanks.
>>
>> On Mon, Mar 23, 2015 at 10:44 AM, madhuka udantha <madhukaudan...@gmail.com> wrote:
>>
>>> Hi, moon
>>>
>>> Yes, since
>>>
>>>> "Moving computation is cheaper than moving data"
>>>
>>> we can do the computation in the computing framework.
>>>
>>> Simple pivot changes or filtering can be handled in local storage
>>> with indexing databases, depending on the current user level.
>>> As you said, computations will be handled in the back ends.
>>>
>>> Great to hear about building a rich GUI; I will share my chart
>>> library ideas there.
>>>
>>> Your ideas are always welcome; they will be helpful for my task and
>>> draft proposal.
>>>
>>> Thanks
>>>
>>> On Mon, Mar 23, 2015 at 7:59 AM, moon soo Lee <m...@apache.org> wrote:
>>>
>>>> Hi, madhuka udantha
>>>>
>>>> I think your idea about the chart library and data transformation
>>>> engine sounds cool. For the data transform modules, it's a good idea
>>>> to make them pluggable into the data transform engine. But I'm not
>>>> sure that getting the result locally and doing the transform for
>>>> pivot or filtering there, to avoid running the query again, is a
>>>> good idea, because Zeppelin is (not limited to, but) trying to build
>>>> an analytical environment on top of distributed computing frameworks
>>>> like Spark, Flink, Ignite, etc. Most of the distributed computing
>>>> frameworks Zeppelin is trying to integrate follow the same paradigm:
>>>> "Moving computation is cheaper than moving data". In this manner,
>>>> the size of the data that the transform engine needs to handle can
>>>> easily be multiple TB, which will take a long time to copy to the
>>>> local machine and process. So I think the transform module should
>>>> run on the underlying distributed computing framework.
>>>>
>>>> As for the chart library, we have started a discussion thread about
>>>> building a rich GUI inside the notebook. It might be related.
>>>>
>>>> Thanks,
>>>> moon
>>>>
>>>> On Mon, Mar 23, 2015 at 2:27 AM madhuka udantha <madhukaudan...@gmail.com> wrote:
>>>>
>>>> > On Sun, Mar 22, 2015 at 7:05 PM, Corneau Damien <cornead...@apache.org> wrote:
>>>> >
>>>> > > Hi,
>>>> > >
>>>> > > Being able to aggregate on the query side is a great idea and
>>>> > > would allow us to transfer less data as well as having a full
>>>> > > query representation of the visualization.
>>>> > >
>>>> > > However, creating a SQL query dynamically is a pretty difficult
>>>> > > task, and might be too much for that scope.
>>>> > >
>>>> > > Also I see some possible problems with this method:
>>>> > > - Changing the pivot or simple filtering would mean running the
>>>> > > query again
>>>> > >
>>>> > No, the query won't run again.
>>>> > In the first run of the query, data is collected and stored locally
>>>> > in local storage [1] (using indexing techniques to make retrieval
>>>> > faster), so changing the pivot or simple filtering will use the
>>>> > local storage.
>>>> > If any attribute or data is missing from local storage, then only
>>>> > that will be retrieved, which saves network bandwidth as well.
>>>> > Does my explanation make sense?
>>>> >
>>>> > > - Being able to make a pivot-style SQL query would be really hard;
>>>> > > we would need multiple sub-queries or even sometimes multiple
>>>> > > queries (I tried a few times and could get the result wanted only
>>>> > > with a visualization-side pivot).
>>>> > > It would end up with really bad SQL queries, especially with the
>>>> > > Hive SQL or Spark SQL limitations, and would take way more time
>>>> > > to process.
>>>> > >
>>>> > Agreed. I'm not planning to use pivot-style queries.
>>>> >
>>>> > Any suggestions?
>>>> >
>>>> > Thanks.
>>>> >
>>>> > > On Sun, Mar 22, 2015 at 10:08 PM, IT CTO <goi....@gmail.com> wrote:
>>>> > >
>>>> > > > Hi,
>>>> > > >
>>>> > > > The chart library features sound promising.
>>>> > > > As for the data engine, one thing that I think is missing is the
>>>> > > > ability to use the visualization to drive the aggregation in the
>>>> > > > SQL. Today, you first write the SQL, you execute it, *limited by
>>>> > > > the number of results sent to the client*, and then you use viz
>>>> > > > to understand the results.
>>>> > > > Alternatively, if through the visualization I can generate a
>>>> > > > better SQL which returns an aggregated data set, then I can
>>>> > > > analyze a bigger amount of data.
>>>> > > >
>>>> > > > I hope I was clear enough in my explanation :-)
>>>> > > >
>>>> > > > Eran
>>>> > > >
>>>> > > > On Fri, Mar 20, 2015 at 8:21 AM, madhuka udantha <madhukaudan...@gmail.com> wrote:
>>>> > > >
>>>> > > > > Hi,
>>>> > > > >
>>>> > > > > Here are the ideas I am proposing.
>>>> > > > > According to the COMDEV-119 JIRA, charts are hard-coded until
>>>> > > > > now, and a data transformation issue was highlighted since
>>>> > > > > different charts have different pivot fields, e.g. area charts,
>>>> > > > > scatter charts, surface charts, bubble charts, radar charts, etc.
>>>> > > > >
>>>> > > > > To solve this I am introducing two major components: the
>>>> > > > > 'Chart library' and the 'Data transformation engine'. The chart
>>>> > > > > library is where the currently plugged-in charts are shown.
>>>> > > > > There we can plug in chart types, and those can be reused.
>>>> > > > >
>>>> > > > > *Chart library features*
>>>> > > > >
>>>> > > > > - Users can select a chart from the library
>>>> > > > > - Charts are pluggable into the library
>>>> > > > > - Charts can be plugged in by config (JSON) or through a UI wizard
>>>> > > > > - The configuration/meta file of a chart contains its interface,
>>>> > > > >   libs, themes, and data transformation types/mappings
>>>> > > > >
>>>> > > > > *Data Transformation Engine*
>>>> > > > >
>>>> > > > > The 'Data transformation engine' contains data transformation
>>>> > > > > modules. Those modules are also pluggable into the engine, and
>>>> > > > > they have connections to charts. The data transformation engine
>>>> > > > > sits between the data (SQL) and the chart, so this module
>>>> > > > > converts the data and maps it to each chart's pivot fields.
>>>> > > > >
>>>> > > > > - This module will look at the pivot fields of the chart
>>>> > > > > - Selected attributes of the SQL query
>>>> > > > > - Attribute value operation improvements (string split, value
>>>> > > > >   aggregation, number rounding)
>>>> > > > >
>>>> > > > > Another improvement that I noticed is
>>>> > > > >
>>>> > > > > - Query editor auto-completion support (with Ctrl+Space)
>>>> > > > >
>>>> > > > > Your ideas are welcome here.
>>>> > > > > Thanks
>>>> > > > >
>>>> > > > > On Fri, Mar 20, 2015 at 10:57 AM, madhuka udantha <madhukaudan...@gmail.com> wrote:
>>>> > > > >
>>>> > > > > > Hi All,
>>>> > > > > >
>>>> > > > > > I'm Udantha, an MSc student at the University of Moratuwa.
>>>> > > > > > The GSoC 2015 project COMDEV-119 has captured my interest.
>>>> > > > > >
>>>> > > > > > I have extensive experience with visualization techniques,
>>>> > > > > > having created numerous dashboards [1,2] with JavaScript,
>>>> > > > > > HTML5, AngularJS, d3 charting, etc.
>>>> > > > > >
>>>> > > > > > My current research area is big data, where I have worked
>>>> > > > > > with various types of data sets. I'm also working on cluster
>>>> > > > > > representation and classification techniques, where
>>>> > > > > > visualization amounts to a considerable part. I have been
>>>> > > > > > following COMDEV-119 (JIRA) with Alexander Bezzubov and
>>>> > > > > > CORNEAU Damien for more than a week.
>>>> > > > > >
>>>> > > > > > Thanks
>>>> > > > > >
>>>> > > > > > [1] http://wso2.com/products/user-engagement-server/
>>>> > > > > > [2] https://github.com/wso2/jaggery
>>>> > > > > > --
>>>> > > > > > Cheers,
>>>> > > > > > Madhuka Udantha
>>>> > > > > > http://madhukaudantha.blogspot.com
>>>> > > > > >
>>>> > > > >
>>>> > > > > --
>>>> > > > > Cheers,
>>>> > > > > Madhuka Udantha
>>>> > > > > http://madhukaudantha.blogspot.com
>>>> > > > >
>>>> > > >
>>>> > > > --
>>>> > > > Eran | CTO
>>>> > > >
>>>> > >
>>>> >
>>>> > --
>>>> > Cheers,
>>>> > Madhuka Udantha
>>>> > http://madhukaudantha.blogspot.com
>>>> >
>>>>
>>>
>>> --
>>> Cheers,
>>> Madhuka Udantha
>>> http://madhukaudantha.blogspot.com
>>>
>>
>> --
>> Cheers,
>> Madhuka Udantha
>> http://madhukaudantha.blogspot.com
>>
>
> --
> Cheers,
> Madhuka Udantha
> http://madhukaudantha.blogspot.com
>

--
Cheers,
Madhuka Udantha
http://madhukaudantha.blogspot.com
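
P.S. To make the pivot discussion in this thread concrete, here is a minimal sketch (not Zeppelin code) of re-pivoting rows that one query run has already fetched, so that a pivot change needs no second query. The `pivot` function, the `cached_rows` sample, and the field names `year`, `region`, and `sales` are all hypothetical and only serve the illustration.

```python
from collections import defaultdict


def pivot(rows, key, group, value):
    """Sum `value` for every (key, group) pair found in already-fetched rows."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[key]][row[group]] += row[value]
    # Convert nested defaultdicts to plain dicts for easy inspection.
    return {k: dict(groups) for k, groups in table.items()}


# Hypothetical rows kept locally after the first (and only) query run.
cached_rows = [
    {"year": 2014, "region": "EU", "sales": 10.0},
    {"year": 2014, "region": "US", "sales": 5.0},
    {"year": 2015, "region": "EU", "sales": 7.0},
    {"year": 2015, "region": "EU", "sales": 3.0},
]

# Two different pivots computed from the same cached rows.
by_year = pivot(cached_rows, key="year", group="region", value="sales")
by_region = pivot(cached_rows, key="region", group="year", value="sales")
```

This only suits result sets small enough to hold on the client. As moon pointed out, aggregation over large data should stay in the underlying distributed computing framework.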