Hi Andy, quick question: does Spark Notebook include its own Spark engine, or 
do I need to install Spark separately and point to it from Spark Notebook? Thanks

From: Lin, Hao [mailto:hao....@finra.org]
Sent: Tuesday, December 08, 2015 7:01 PM
To: andy petrella; Jörn Franke
Cc: user@spark.apache.org
Subject: RE: Graph visualization tool for GraphX

Thanks Andy, I will certainly give your suggestion a try.

From: andy petrella [mailto:andy.petre...@gmail.com]
Sent: Tuesday, December 08, 2015 1:21 PM
To: Lin, Hao; Jörn Franke
Cc: user@spark.apache.org
Subject: Re: Graph visualization tool for GraphX

Hello Lin,

This is indeed a tough scenario when you have many vertices and (even worse) 
many edges...

So two-fold answer:
First, technically, there is graph plotting support in the Spark Notebook 
(https://github.com/andypetrella/spark-notebook/
 → check this notebook: 
https://github.com/andypetrella/spark-notebook/blob/master/notebooks/viz/Graph%20Plots.snb).
You can plot a graph from Scala, which is converted to a D3 force layout.
The number of points that are plotted is "sampled" using a `Sampler` 
that you can provide yourself. Which leads to the second fold of this answer.
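
To illustrate the sampling idea only: the sketch below is plain Python, not the Spark Notebook `Sampler` interface (whose signature is not shown in this thread). It assumes the graph is available as an iterable of (source, target) pairs; the function names and the synthetic edge list are hypothetical. It uses reservoir sampling so that the edges can be streamed once without knowing their count in advance.

```python
import random

def sample_edges(edges, max_edges, seed=0):
    """Keep at most max_edges edges, chosen uniformly at random (reservoir sampling)."""
    rng = random.Random(seed)
    reservoir = []
    for i, edge in enumerate(edges):
        if i < max_edges:
            # Fill the reservoir with the first max_edges edges.
            reservoir.append(edge)
        else:
            # Replace a kept edge with probability max_edges / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < max_edges:
                reservoir[j] = edge
    return reservoir

def nodes_of(edges):
    """Collect the vertices touched by a list of edges."""
    return {node for edge in edges for node in edge}

# Synthetic edge list, for illustration only.
edges = [(i, (i * 7 + 1) % 1000) for i in range(10_000)]
subset = sample_edges(edges, max_edges=500)
print(len(subset), "edges kept,", len(nodes_of(subset)), "vertices to draw")
```

Uniform edge sampling keeps the plot size bounded but tends to lose rare structure, which is one reason a summarizing approach is usually preferable for very large graphs.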

Plotting a large graph is rather tough because there is no real notion of 
dimension... there is always the option of digging into topological analysis 
theory to find a good homeomorphism ... but that won't be very efficient ;-D.
The best approach is to generalize/summarize the information; there 
are many techniques (which you can find mainly in geospatial viz and 
biology viz theories...).
It is best to check which one will match your need the fastest.
There are quick techniques, like using unsupervised clustering models and then 
plotting a Voronoi diagram (which can be approximated using a force layout).
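
One simple way to summarize a graph once cluster labels exist is to collapse every cluster into a single node and count the edges between clusters. The sketch below is a plain Python illustration under stated assumptions: the cluster assignment `cluster_of` is taken as given (it would come from whatever clustering model is used), edges are treated as undirected, and all names and data are hypothetical.

```python
from collections import Counter

def collapse_by_cluster(edges, cluster_of):
    """Aggregate vertex-level edges into weighted edges between clusters.

    Edges inside a cluster are dropped; each pair of distinct clusters
    keeps the number of original edges connecting them.
    """
    weights = Counter()
    for src, dst in edges:
        a, b = cluster_of[src], cluster_of[dst]
        if a != b:
            # Sort the pair so (A, B) and (B, A) share one key (undirected).
            weights[tuple(sorted((a, b)))] += 1
    return dict(weights)

# Toy data, for illustration only.
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 5), (2, 4)]
cluster_of = {1: "A", 2: "A", 3: "B", 4: "B", 5: "C"}
summary = collapse_by_cluster(edges, cluster_of)
print(summary)
```

The resulting cluster-level graph has one node per cluster, so it stays small enough to hand to a force layout even when the original graph does not.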

In general terms, I would say that multiscaling is intuitively what you want 
first. This is an interesting paper presenting the foundations: 
https://www.cs.ubc.ca/~tmm/courses/533-07/readings/auberIV03Seattle.pdf

Oh and BTW, to end this longish mail, while looking for new papers on that, I 
stumbled upon this one: 
http://vacommunity.org/egas2015/papers/IEEEEGAS2015-ScottLangevin.pdf
 which is using
1. Spark !!!
2. a tile-based approach (similar to tiling + pyramids in geospatial)
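
For readers unfamiliar with tiling + pyramids, here is a minimal sketch of the general idea in plain Python. It is not the method of the paper above; it assumes each vertex already has a 2D layout position in the unit square, and the function names and points are hypothetical. Zoom level z splits the square into 2**z by 2**z tiles, and each tile stores only a count, so a client fetches a few tiles instead of every vertex.

```python
from collections import Counter

def tile_counts(points, zoom):
    """Count points per tile; zoom level z has 2**z x 2**z tiles over the unit square."""
    n = 2 ** zoom
    counts = Counter()
    for x, y in points:
        # Clamp so that a coordinate of exactly 1.0 falls in the last tile.
        tx = min(int(x * n), n - 1)
        ty = min(int(y * n), n - 1)
        counts[(tx, ty)] += 1
    return dict(counts)

def pyramid(points, max_zoom):
    """Build one tile-count map per zoom level, from 0 (coarsest) to max_zoom."""
    return {z: tile_counts(points, z) for z in range(max_zoom + 1)}

# Toy layout positions, for illustration only.
points = [(0.1, 0.1), (0.2, 0.15), (0.8, 0.9), (0.55, 0.45)]
levels = pyramid(points, max_zoom=2)
for zoom, tiles in levels.items():
    print(zoom, tiles)
```

Because every level is a fixed-size aggregate, the per-request cost depends on the number of visible tiles rather than on the size of the graph.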

HTH

PS: regarding the Spark Notebook, you can always come and discuss on Gitter: 
https://gitter.im/andypetrella/spark-notebook


On Tue, Dec 8, 2015 at 6:30 PM Lin, Hao <hao....@finra.org> wrote:
Hello Jorn,

Thank you for the reply and for being tolerant of my oversimplified question. I 
should've been more specific. Though it is ~TB of data, there will be billions 
of records (edges) and 100,000 nodes. We need to visualize the social network 
graph, like what can be done with Gephi, which does not scale well enough to 
handle this amount of data. There will be dozens of users accessing it, and the 
response time is also critical. We would like to run the visualization tool on 
a remote EC2 server, so a web tool would be a good choice for us.

Please let me know if I need to be more specific ☺.  Thanks
hao

From: Jörn Franke [mailto:jornfra...@gmail.com]
Sent: Tuesday, December 08, 2015 11:31 AM
To: Lin, Hao
Cc: user@spark.apache.org
Subject: Re: Graph visualization tool for GraphX

I am not sure about your use case. How should a human interpret many terabytes 
of data in one large visualization? You have to be more specific: what part of 
the data needs to be visualized, what kind of visualization, what navigation 
you expect within the visualization, how many users, response time, web tool vs 
mobile vs desktop, etc.

On 08 Dec 2015, at 16:46, Lin, Hao <hao....@finra.org> wrote:
Hi,

Can anyone recommend a great graph visualization tool for GraphX that can 
handle truly large data (~TB)?

Thanks so much
Hao
Confidentiality Notice:: This email, including attachments, may include 
non-public, proprietary, confidential or legally privileged information. If you 
are not an intended recipient or an authorized agent of an intended recipient, 
you are hereby notified that any dissemination, distribution or copying of the 
information contained in or transmitted with this e-mail is unauthorized and 
strictly prohibited. If you have received this email in error, please notify 
the sender by replying to this message and permanently delete this e-mail, its 
attachments, and any copies of it immediately. You should not retain, copy or 
use this e-mail or any attachment for any purpose, nor disclose all or any part 
of the contents to any other person. Thank you.
--
andy

