For supervised learning, Spark ML support vector machines or neural networks 
could be candidates.
For unsupervised learning, it could be clustering.
For graph analysis of the followers, you can easily use Spark GraphX.
Keep in mind that each tweet contains a lot of metadata (location, followers, 
etc.) that is more or less structured.
For unstructured text analytics (e.g. the tweet text itself), I recommend 
Solr/Elasticsearch.
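
A minimal sketch of the supervised route, assuming Spark 2.2 or later (where 
LinearSVC is available in spark.ml) and a small hand-labelled sample. The 
object name, the score helper and the training sentences are all made up for 
illustration; a real job would need far more labelled tweets and some tuning.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LinearSVC
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}
import org.apache.spark.sql.SparkSession

object TweetTopicSketch {

  // Trains on a tiny hand-labelled sample and returns (text, prediction) pairs,
  // where prediction 1.0 means "sport" and 0.0 means "other".
  def score(spark: SparkSession, texts: Seq[String]): Array[(String, Double)] = {
    import spark.implicits._

    // Hypothetical training data, invented for this sketch.
    val training = Seq(
      ("great goal in the football match tonight", 1.0),
      ("the cricket team won the final", 1.0),
      ("tennis champion wins in straight sets", 1.0),
      ("new phone released with a bigger screen", 0.0),
      ("traffic is terrible on the motorway today", 0.0),
      ("election results are expected tomorrow", 0.0)
    ).toDF("text", "label")

    // Split the text into lower-cased words, hash them into a sparse
    // term-frequency vector, then fit a linear SVM on those features.
    val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
    val hashingTF = new HashingTF()
      .setInputCol("words")
      .setOutputCol("features")
      .setNumFeatures(4096)
    val svm = new LinearSVC().setMaxIter(20).setRegParam(0.1)

    val model = new Pipeline()
      .setStages(Array(tokenizer, hashingTF, svm))
      .fit(training)

    model.transform(texts.toDF("text"))
      .select("text", "prediction")
      .as[(String, Double)]
      .collect()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("TweetTopicSketch")
      .master("local[*]")
      .getOrCreate()

    score(spark, Seq("what a match, brilliant goal", "my phone screen is broken"))
      .foreach { case (text, prediction) => println(s"$prediction\t$text") }

    spark.stop()
  }
}
```

If a plain keyword match is enough, TwitterUtils.createStream also takes a 
filters argument (a Seq of keywords), so some of the noise can be dropped 
before any model is involved.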

However, I am not sure what exactly you want to do with the data.


> On 07 Jun 2016, at 13:16, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> 
> Hi,
> 
> This is really a general question.
> 
> I use Spark to get Twitter data. I had a quick look at it:
> 
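>     import org.apache.spark.streaming.{Seconds, StreamingContext}
>     import org.apache.spark.streaming.twitter.TwitterUtils
>     val ssc = new StreamingContext(sparkConf, Seconds(2))  // 2-second batches
>     val tweets = TwitterUtils.createStream(ssc, None)
>     val statuses = tweets.map(status => status.getText())
>     statuses.print()
>     ssc.start()
>     ssc.awaitTermination()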
> 
> Ok
> 
> I can also use Apache Flume to store the data in an HDFS directory:
> 
> $FLUME_HOME/bin/flume-ng agent --conf ./conf/ -f conf/twitter.conf 
> -Dflume.root.logger=DEBUG,console -n TwitterAgent
> That stores the Twitter data in binary format in the HDFS directory.
> 
> My question is pretty basic.
> 
> What is the best tool/language to dig into that data, for example Twitter 
> streaming data? I am getting all sorts of stuff coming in. Say I am only 
> interested in certain topics like sport. How can I separate the signal from 
> the noise, and with what tool and language?
> 
> Thanks
> Dr Mich Talebzadeh
>  
> LinkedIn  
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>  
> http://talebzadehmich.wordpress.com
>  
