Re: [GSoC 2015][COMDEV-119] Zeppelin GSoC Project: add more D3 visualization

madhuka udantha Wed, 25 Mar 2015 02:28:19 -0700

Hi,

I would like to learn about the code structure and architecture of Zeppelin.
Is there a good post, article, or wiki page covering them?
Also, if there is a quick start guide for Zeppelin development,
please share it with me.


Thanks.

On Mon, Mar 23, 2015 at 10:44 AM, madhuka udantha <madhukaudan...@gmail.com>
wrote:

> Hi, moon
>
> Yes. Since
>
>> "Moving computation is cheaper than moving data"
>
> we can do the computation in the computing framework.
>
> Simple pivot changes or filtering can be handled in local storage with
> indexed databases, depending on the current user level.
> As you pointed out, computations will be handled in the back ends.
>
> Great to hear about the rich GUI discussion; I will share my chart library
> ideas there.
>
> Your ideas are always welcome; they will be helpful for my task and draft
> proposal.
>
> Thanks
>
> On Mon, Mar 23, 2015 at 7:59 AM, moon soo Lee <m...@apache.org> wrote:
>
>> Hi, madhuka udantha
>>
>> I think your idea about a chart library and data transformation engine
>> sounds cool. For the data transform modules, it's a good idea to make them
>> pluggable into the data transform engine. But I'm not sure that fetching
>> the result locally and doing the pivot or filter transform there, to avoid
>> running the query again, is a good idea. Zeppelin is (not limited to, but)
>> trying to build an analytical environment on top of distributed computing
>> frameworks such as Spark, Flink, Ignite, etc. Most of the distributed
>> computing frameworks Zeppelin is trying to integrate follow the same
>> paradigm: "Moving computation is cheaper than moving data". Under this
>> paradigm, the data that the transform engine needs to handle can easily be
>> multiple TB, which would take a long time to copy to a local machine and
>> process. So I think the transform module should run on the underlying
>> distributed computing framework.
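
A rough back-of-the-envelope check of the copy cost described above. The 2 TB result size and the 1 Gbit/s link are illustrative assumptions, not figures from this thread, and protocol overhead is ignored:

```python
def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal time to copy size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second


TB = 10**12    # decimal terabyte, in bytes
GBIT = 10**9   # 1 Gbit/s, an assumed link speed

seconds = transfer_seconds(2 * TB, GBIT)
hours = seconds / 3600
print(f"{hours:.1f} hours")  # prints "4.4 hours"
```

Even under these optimistic assumptions the copy alone takes hours, before any local pivoting or filtering starts.
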
>>
>> As for the chart library, we have started a discussion thread about
>> building a rich GUI inside the notebook. It might be related.
>>
>> Thanks,
>> moon
>>
>>
>>
>> On Mon, Mar 23, 2015 at 2:27 AM madhuka udantha <madhukaudan...@gmail.com>
>> wrote:
>>
>> > On Sun, Mar 22, 2015 at 7:05 PM, Corneau Damien <cornead...@apache.org>
>> > wrote:
>> >
>> > > Hi,
>> > >
>> > > Being able to aggregate on the query side is a great idea and would
>> > > allow us to transfer less data as well as having a full query
>> > > representation of the visualization.
>> > >
>> > > However, creating a SQL query dynamically is a pretty difficult task
>> > > and might be too much for that scope.
>> > >
>> > > Also I see some possible problems with this method:
>> > >  - Changing the pivot or simple filtering would mean running the query
>> > > again
>> > >
>> > No, the query won't run again.
>> > On the first run of the query, the data is collected and stored in local
>> > storage [1] (using indexing techniques to make retrieval faster), so
>> > changing the pivot or simple filtering will use the local storage.
>> > If any attribute or data is missing from local storage, only that will be
>> > retrieved, which saves network bandwidth as well.
>> > Does my explanation make sense?
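
A minimal sketch of the local storage approach described above, assuming a column-oriented cache with one value index per column. `LocalResultCache`, `fake_backend`, and the sample table are hypothetical names for illustration and not part of Zeppelin; the sketch also assumes that columns fetched later come back in the same row order as the earlier ones.

```python
from collections import defaultdict


class LocalResultCache:
    """Hypothetical client-side cache of one query result, indexed per column."""

    def __init__(self, fetch_columns):
        # fetch_columns(names) -> {name: [values...]} stands in for the back end.
        self._fetch_columns = fetch_columns
        self._columns = {}   # column name -> list of values
        self._index = {}     # column name -> {value: [row numbers]}
        self.fetches = []    # record of back-end calls, kept for illustration

    def _ensure(self, names):
        """Fetch only the columns that are not cached yet, then index them."""
        missing = [n for n in names if n not in self._columns]
        if not missing:
            return
        self.fetches.append(missing)
        for name, values in self._fetch_columns(missing).items():
            self._columns[name] = values
            index = defaultdict(list)
            for row, value in enumerate(values):
                index[value].append(row)
            self._index[name] = index

    def filter(self, column, value, select):
        """Return the selected columns for rows where column == value."""
        self._ensure([column, *select])
        rows = self._index[column].get(value, [])
        return [tuple(self._columns[c][r] for c in select) for r in rows]


# Stand-in for the remote query result (illustrative data).
TABLE = {"country": ["LK", "KR", "LK"], "year": [2014, 2014, 2015], "sales": [10, 20, 30]}


def fake_backend(names):
    return {n: TABLE[n] for n in names}


cache = LocalResultCache(fake_backend)
first = cache.filter("country", "LK", ["sales"])
second = cache.filter("country", "LK", ["year", "sales"])
print(first, second, cache.fetches)
```

The second call reuses the cached `country` and `sales` columns and asks the back end only for `year`, which is the bandwidth saving mentioned above.
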
>> >
>> >
>> >
>> > >  - Being able to make a pivot-style SQL query would be really hard;
>> > >    we would need multiple sub-queries or sometimes even multiple
>> > >    queries (I tried a few times and could get the wanted result only
>> > >    with a visualization-side pivot).
>> > >    It would end up with really bad SQL queries, especially with the
>> > >    Hive SQL or Spark SQL limitations, and would take way more time to
>> > >    process.
>> > >
>> > Agreed. I'm not planning to use pivot-style queries.
>> >
>> > Any suggestions?
>> >
>> >
>> > Thanks.
>> >
>> >
>> > > On Sun, Mar 22, 2015 at 10:08 PM, IT CTO <goi....@gmail.com> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > The chart library features sound promising.
>> > > > As for the data engine, one thing that I think is missing is the
>> > > > ability to use the visualization to drive the aggregation in the SQL.
>> > > > Today, you first write the SQL, you execute it, *limited by the number
>> > > > of results sent to the client*, and then you use the viz to understand
>> > > > the results.
>> > > > Alternatively, if through the visualization I can generate a better
>> > > > SQL query which returns an aggregated data set, then I can analyze a
>> > > > bigger amount of data.
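
One way to picture the idea above is a helper that turns a chart's key and value selections into a GROUP BY query, so only aggregated rows reach the client. This is a sketch under stated assumptions: `build_aggregate_sql` and the `bank`, `age`, and `balance` names are hypothetical, and a real implementation would need dialect-aware quoting for Hive SQL or Spark SQL.

```python
import re

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_AGGREGATES = {"sum", "avg", "min", "max", "count"}


def build_aggregate_sql(table, keys, value, aggregate):
    """Build a GROUP BY query from a chart's key fields and one value field."""
    # Accept only plain identifiers so user selections cannot inject SQL.
    for name in (table, value, *keys):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if aggregate not in _AGGREGATES:
        raise ValueError(f"unsupported aggregate: {aggregate!r}")
    if not keys:
        raise ValueError("at least one key field is required")
    key_list = ", ".join(keys)
    return (
        f"SELECT {key_list}, {aggregate.upper()}({value}) AS {value}_{aggregate} "
        f"FROM {table} GROUP BY {key_list}"
    )


sql = build_aggregate_sql("bank", ["age"], "balance", "avg")
print(sql)  # SELECT age, AVG(balance) AS balance_avg FROM bank GROUP BY age
```
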
>> > > >
>> > > > I hope I was clear enough in my explanation :-)
>> > > >
>> > > > Eran
>> > > >
>> > > >
>> > > > On Fri, Mar 20, 2015 at 8:21 AM, madhuka udantha
>> > > > <madhukaudan...@gmail.com> wrote:
>> > > >
>> > > > > Hi,
>> > > > >
>> > > > > Here are my proposed ideas.
>> > > > > According to the COMDEV-119 JIRA issue, charts have been hard-coded
>> > > > > until now, and a data transformation issue was highlighted because
>> > > > > different charts have different pivot fields, e.g. area charts,
>> > > > > scatter charts, surface charts, bubble charts, radar charts, etc.
>> > > > >
>> > > > > To solve this I am introducing two major components: a 'Chart
>> > > > > library' and a 'Data transformation engine'. The chart library
>> > > > > shows the charts that are currently plugged in. There we can plug
>> > > > > in chart types, and those can be reused.
>> > > > >
>> > > > > *Chart library features*
>> > > > >
>> > > > >    - Users can select a chart from the library
>> > > > >    - Charts are pluggable into the library
>> > > > >    - Charts can be plugged in via a config file (JSON) or a UI wizard
>> > > > >    - The configuration/meta file of a chart contains its interface,
>> > > > >    libs, themes, and data transformation types/mappings
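
The chart library features listed above could look roughly like this. It is only a sketch: the JSON keys (`name`, `interface`, `libs`, `themes`, `transformations`) and the `ChartLibrary` class are assumptions made for illustration, since the proposal does not define a schema.

```python
import json

# Assumed set of keys for a chart's configuration/meta file.
REQUIRED_KEYS = {"name", "interface", "libs", "themes", "transformations"}

# Illustrative meta file for a hypothetical bubble chart.
CHART_CONFIG_JSON = """
{
  "name": "bubble",
  "interface": {"pivot_fields": ["x", "y", "size"]},
  "libs": ["d3"],
  "themes": ["default"],
  "transformations": {"size": "round"}
}
"""


def load_chart_config(text):
    """Parse a chart meta file and check that the assumed keys are present."""
    config = json.loads(text)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"chart config is missing: {sorted(missing)}")
    return config


class ChartLibrary:
    """Registry of plugged-in chart types, keyed by chart name."""

    def __init__(self):
        self._charts = {}

    def plug(self, config):
        self._charts[config["name"]] = config

    def available(self):
        return sorted(self._charts)

    def pivot_fields(self, name):
        return self._charts[name]["interface"]["pivot_fields"]


library = ChartLibrary()
library.plug(load_chart_config(CHART_CONFIG_JSON))
print(library.available(), library.pivot_fields("bubble"))
```
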
>> > > > >
>> > > > >
>> > > > >
>> > > > > *Data Transformation Engine*
>> > > > > The 'Data transformation engine' contains data transformation
>> > > > > modules. Those modules are also pluggable into the engine and have
>> > > > > connections to charts. The data transformation engine sits between
>> > > > > the data (SQL) and the chart, so it converts the data and maps it
>> > > > > to each chart's pivot fields.
>> > > > >
>> > > > >    - This module will look at the pivot fields of the chart
>> > > > >    - Selected attributes of the SQL query
>> > > > >    - Improved attribute value operations (string split, value
>> > > > >    aggregation, number rounding)
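
The mapping step described above can be sketched as a small registry of pluggable value operations applied while query attributes are mapped onto pivot fields. `TransformEngine`, the module names, and the sample rows are hypothetical; value aggregation is left out to keep the sketch short.

```python
class TransformEngine:
    """Hypothetical engine: maps query attributes onto chart pivot fields."""

    def __init__(self):
        self._modules = {}   # module name -> function applied to one value

    def plug(self, name, func):
        """Register a pluggable transformation module."""
        self._modules[name] = func

    def apply(self, rows, mapping):
        """mapping: pivot field -> (source attribute, module name or None)."""
        shaped_rows = []
        for row in rows:
            shaped = {}
            for field, (attribute, module) in mapping.items():
                value = row[attribute]
                if module is not None:
                    value = self._modules[module](value)
                shaped[field] = value
            shaped_rows.append(shaped)
        return shaped_rows


engine = TransformEngine()
engine.plug("round", lambda v: round(v))                # number rounding
engine.plug("first_word", lambda v: v.split(" ")[0])    # string split

# Rows as they might come back from a SQL query (illustrative data).
rows = [
    {"job": "admin staff", "balance": 1350.6},
    {"job": "technician", "balance": 290.2},
]
mapping = {"x": ("job", "first_word"), "y": ("balance", "round")}
points = engine.apply(rows, mapping)
print(points)
```
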
>> > > > >
>> > > > >
>> > > > > Another improvement that I noticed is:
>> > > > >
>> > > > >    - Query editor auto-completion support (with Ctrl+Space)
>> > > > >
>> > > > >
>> > > > > Your ideas are welcome here
>> > > > > Thanks
>> > > > >
>> > > > > On Fri, Mar 20, 2015 at 10:57 AM, madhuka udantha
>> > > > > <madhukaudan...@gmail.com> wrote:
>> > > > >
>> > > > > > Hi All,
>> > > > > >
>> > > > > > I'm Udantha, an MSc student at the University of Moratuwa. This
>> > > > > > GSoC 2015 project, COMDEV-119, captures my interest.
>> > > > > >
>> > > > > > I have abundant experience with visualization techniques, having
>> > > > > > created numerous dashboards [1,2] with JavaScript, HTML5,
>> > > > > > AngularJS, D3 charting, etc.
>> > > > > >
>> > > > > > My current research area is big data, where I have worked with
>> > > > > > various types of data sets. I'm also working on cluster
>> > > > > > representation and classification techniques, where visualization
>> > > > > > makes up a considerable part. I have been following COMDEV-119
>> > > > > > (JIRA) with Alexander Bezzubov and Corneau Damien for more than a
>> > > > > > week.
>> > > > > >
>> > > > > > Thanks
>> > > > > >
>> > > > > > [1] http://wso2.com/products/user-engagement-server/
>> > > > > > [2] https://github.com/wso2/jaggery
>> > > > > > --
>> > > > > > Cheers,
>> > > > > > Madhuka Udantha
>> > > > > > http://madhukaudantha.blogspot.com
>> > > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Cheers,
>> > > > > Madhuka Udantha
>> > > > > http://madhukaudantha.blogspot.com
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Eran | CTO
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> > Cheers,
>> > Madhuka Udantha
>> > http://madhukaudantha.blogspot.com
>> >
>>
>
>
>
> --
> Cheers,
> Madhuka Udantha
> http://madhukaudantha.blogspot.com
>



-- 
Cheers,
Madhuka Udantha
http://madhukaudantha.blogspot.com
