Re: Graph Output formats

Ted Dunning Sat, 17 Sep 2011 18:23:39 -0700

I strongly recommend Google's visualization API.

This is divided into two parts, the reporting half and the data source half.
The reporting half is pretty good and very easy to use from JavaScript.  It
is the library that underlies pretty much all of Google's internal and
external web visualizations.

The data source half might actually be of more use for Mahout.  It provides
a simplified query language, query parsers, standard provisions for having
data sources that handle only a subset of the possible query language, and
shims that help provide the remaining bits of query semantics.
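
A minimal sketch of the "partial data source plus shim" idea follows.  It is
not the Google library's API: the class, the `capabilities` set, and
`run_query` are hypothetical names chosen for illustration.  The source here
can only project columns, so the shim fetches the extra column it needs,
filters the rows itself, and then drops that column again.

```python
# Hypothetical sketch; none of these names come from the Google library.
ROWS = [
    {"cluster": "a", "size": 120, "radius": 0.4},
    {"cluster": "b", "size": 15, "radius": 1.9},
    {"cluster": "c", "size": 300, "radius": 0.7},
]


class SelectOnlySource:
    """Data source that can project columns but cannot filter rows."""

    capabilities = {"select"}

    def fetch(self, select):
        return [{col: row[col] for col in select} for row in ROWS]


def run_query(source, select, where=None):
    """Push down what the source supports; finish the rest in the shim.

    `where` is an assumed (column, minimum) pair meaning column >= minimum.
    """
    if where is None:
        return source.fetch(select)
    if "where" in source.capabilities:
        # A more capable (hypothetical) source would take the filter directly.
        return source.fetch(select, where)
    column, minimum = where
    # Fetch the filter column too, even if the caller did not select it.
    needed = list(select) + ([column] if column not in select else [])
    kept = [row for row in source.fetch(needed) if row[column] >= minimum]
    # Project back down to exactly the columns that were asked for.
    return [{col: row[col] for col in select} for row in kept]


result = run_query(SelectOnlySource(), ["cluster"], where=("size", 100))
print(result)
```

The caller writes the same query regardless of how much of it the source can
execute natively, which is what lets the data layer be swapped or optimized
later.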

The great virtue of this layer is that it provides a very clean abstraction
layer that separates data and presentation.  That separation lets you be very
exploratory at the visualization layer while reconstructing the data layer
as desired for performance.

Together these layers make it quite plausible to handle millions of data
points by the very common strategy of handling lots of data at the data
layer, but only transporting modest amounts of summary data to the
presentation layer.
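
As a rough illustration of that strategy (an assumed example, not code from
the data source library), the data layer below reduces a million raw values
to a handful of histogram rows, and only those rows would be shipped to the
presentation layer.  The function name, bin width, and synthetic data are all
made up for the sketch.

```python
import random
from collections import Counter


def summarize(points, bin_width):
    """Reduce raw values to (bin_start, count) rows for the presentation layer."""
    counts = Counter(int(p // bin_width) for p in points)
    return [(b * bin_width, counts[b]) for b in sorted(counts)]


# Synthetic stand-in for a large data set held at the data layer.
random.seed(0)
points = [random.uniform(0, 100) for _ in range(1_000_000)]

# Only this small summary table needs to cross the wire.
summary = summarize(points, bin_width=10)
for bin_start, count in summary:
    print(bin_start, count)
```

The raw list never leaves the data layer; the chart only ever sees about ten
rows, however many points went into them.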

The data layer is also general enough that you could almost certainly use it
with alternative visualization layers.  For instance, you can specify that
data be returned in CSV format, which would make R usable for visualization.
JSON output makes Google's visualization code easy to use, and would also
make Processing or Processing.js quite usable.
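
To make the format point concrete, here is a small sketch in which one table
is rendered either as CSV or as JSON depending on what the consumer asks for.
The `render` function and the JSON layout are assumptions for illustration
only; the real Google data source response uses its own DataTable structure.

```python
import csv
import io
import json

COLUMNS = ["cluster", "points"]
ROWS = [["a", 120], ["b", 15]]


def render(columns, rows, fmt):
    """Serialize one table in the format requested by the consumer."""
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(columns)
        writer.writerows(rows)
        return buf.getvalue()
    if fmt == "json":
        # Simplified layout, not the actual wire format of the Google API.
        return json.dumps({"cols": columns, "rows": rows})
    raise ValueError("unsupported format: " + fmt)


csv_text = render(COLUMNS, ROWS, "csv")    # e.g. for R's read.csv
json_text = render(COLUMNS, ROWS, "json")  # e.g. for a JavaScript chart
print(csv_text)
print(json_text)
```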

I have ported the Java version of the data source code to build with Maven
using the standard directory layout, and have added a version of the MySQL
support code to allow integration with standard web service frameworks.  That
can be found on GitHub here:

https://github.com/tdunning/visualization-data-source

The original Google site on the subject is here:

http://code.google.com/apis/chart/

http://code.google.com/apis/chart/interactive/docs/dev/dsl_about.html

On Sat, Sep 17, 2011 at 1:23 PM, Grant Ingersoll <gsing...@apache.org> wrote:

> I'll be checking in an abstraction, people can implement writers as they
> see fit.
>
> FWIW, I'm mostly looking for something that can be used in a visualization
> toolkit, such as Gephi (although I'll be impressed if any of them can handle
> 7M points)
>
> -Grant
>
> On Sep 16, 2011, at 7:14 PM, Ted Dunning wrote:
>
> > Indeed.
> >
> > I strongly prefer the other two for expressivity.
> >
> > On Fri, Sep 16, 2011 at 4:37 PM, Jake Mannix <jake.man...@gmail.com> wrote:
> >
> >> On Fri, Sep 16, 2011 at 3:30 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> >>
> >>> I think that Avro and protobufs are the current best options for large
> >>> data assets like this.
> >>>
> >>
> >> (or serialized Thrift)
> >>
>
> --------------------------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
> Lucene Eurocon 2011: http://www.lucene-eurocon.com
>
>
