That's interesting. It seems their website has not been updated for a few weeks. The last issue on the website is currently 10/2.
On Sun, Oct 23, 2016 at 8:05 PM, Josh Elser <[email protected]> wrote:
> Congrats, the 1.0.0-incubating release was picked up by Hadoop Weekly :)
> ---------- Forwarded message ----------
> From: "Hadoop Weekly" <[email protected]>
> Date: Oct 23, 2016 19:21
> Subject: Hadoop Weekly #191
> To: <[email protected]>
> Cc:
> > Hadoop Weekly
> > Issue #191
> > 23 October 2016
> >
> > This week's issue is short and sweet with a few technical posts, two
> > interesting news articles, and several exciting releases (including
> > Apache Kafka 0.10.1.0). With Spark Summit Europe this week, expect lots
> > of great content in the next issue. And if you're attending, please send
> > interesting slides/talks my way!
> >
> > Technical
> > =======
> >
> > Cloudera's CDH supports intra-node disk balancing since version 5.8.2
> > (it's also part of the 3.0.0 alpha Apache release). Using this feature,
> > a data node can rebalance data blocks across disks using the
> > `hdfs diskbalancer` command. This post describes how the tool works and
> > shows how to run it.
> >
> > http://blog.cloudera.com/blog/2016/10/how-to-use-the-new-hdfs-intra-datanode-disk-balancer-in-apache-hadoop/
> >
> > This post demonstrates the capabilities of the spark.ml library by
> > building a logistic regression model to predict malignancy of cases from
> > the Wisconsin Diagnostic Breast Cancer data set. The example code covers
> > parsing, exploring a dataset with built-in statistics, extracting
> > features from the input dataset, training the model, and evaluating the
> > model.
> >
> > https://www.mapr.com/blog/predicting-breast-cancer-using-apache-spark-machine-learning-logistic-regression
> >
> > The Amazon Big Data blog has a tutorial for running RStudio with sparklyr
> > on EMR. Thanks to a bootstrap action, a cluster complete with RStudio
> > running on the master can be launched with a single command.
> >
> > https://aws.amazon.com/blogs/big-data/running-sparklyr-rstudios-r-interface-to-spark-on-amazon-emr/
> >
> > The Databricks blog features a list of seven tips for debugging Apache
> > Spark code on Databricks. Most of the suggestions, like "Scale up Spark
> > jobs slowly for really large datasets" and "Examine the partitioning for
> > your dataset," are generally applicable to all Spark users.
> >
> > https://databricks.com/blog/2016/10/18/7-tips-to-debug-apache-spark-code-faster-with-databricks.html
> >
> > News
> > ====
> >
> > InfoQ has an interview with Yahoo VP of Engineering, Peter Cnudde. Topics
> > covered include Hadoop, Spark adoption at Yahoo (mostly for in-memory
> > computing, not for ETL), and Caffe-on-Spark for deep learning.
> >
> > https://www.infoq.com/articles/peter-cnudde-yahoo-big-data
> >
> > ZDNet contributor Tony Baer has read between the lines when it comes to
> > recent benchmarks by Cloudera and Hortonworks. The takeaways are as
> > follows: 1) "SQL's the gateway drug to Hadoop." 2) Cloudera is trying to
> > challenge Amazon (in this case Redshift), and 3) Hortonworks (via Hive's
> > Live Long and Prosper) has caught up on the investment Cloudera made in
> > Impala.
> >
> > http://www.zdnet.com/article/sql-on-hadoop-benchmarks-get-serious/
> >
> > Releases
> > =======
> >
> > Apache Kafka 0.10.1.0 was released this week. It contains improvements
> > from over 500 pull requests and the implementation of 15 Kafka
> > Improvement Proposals. The Confluent blog has the highlights of
> > additions/improvements to Kafka Server (time-based indexes, replication
> > quotas, and improved log compaction), improvements to Kafka client APIs
> > (interactive queries for Kafka Streams, improved memory management,
> > secure quotas, and more), and bug fixes.
> >
> > http://mail-archives.apache.org/mod_mbox/kafka-users/201610.mbox/%3CCAJL4t_oz9q4T9vn6Z-EBoazWJFyqHw4Y0L-PTowD%2BpFhcPv0VQ%40mail.gmail.com%3E
> > http://www.confluent.io/blog/announcing-apache-kafka-0-10-1-0/
> >
> > Apache Fluo (incubating) recently had its first release since entering
> > the incubator. Fluo is a tool for making "incremental updates to large
> > data sets stored in Apache Accumulo" a la Google's Percolator.
> >
> > https://fluo.apache.org/release/fluo-1.0.0-incubating/
> >
> > Apache Flume 1.7.0 was released. It adds support for a `taildir` source
> > and includes a number of improvements and bug fixes. Many of these are
> > around Flume's integration with Apache Kafka.
> >
> > http://flume.apache.org/releases/1.7.0.html
> >
> > Apache NiFi 0.7.1 was released as a follow-up to July's 0.7.0 release
> > (version 1.0.0 was also recently released, in August). This release adds
> > a number of improvements and bug fixes.
> >
> > https://cwiki.apache.org/confluence/display/NIFI/Release+Notes#ReleaseNotes-Version0.7.1
> >
> > Apache Giraph 1.2.0 was released. Highlights of the release include a
> > new blocks API, support for graphs that don't fit in memory, and the
> > addition of a new set of default configuration options based on
> > Facebook's experience with Giraph.
> >
> > https://blogs.apache.org/giraph/entry/giraph_1_2_0_release
> >
> > `deeplearning4j` is a deep learning implementation that integrates with
> > Hadoop and Spark and supports GPUs. Version 0.6.0 was recently released.
> >
> > https://github.com/deeplearning4j/deeplearning4j
> >
> > Events
> > =====
> > Curated by Datadog ( http://www.datadog.com )
> > UNITED STATES
> >
> > California
> > Uber Engineering Tech Talk Series (San Francisco) - Monday, October 24
> > http://www.meetup.com/UberEvents/events/234789134/
> >
> > Real-Time Streaming and Exactly-Once Semantics with Kafka (San Francisco)
> > - Tuesday, October 25
> > http://www.meetup.com/MemSQL/events/234405914/
> >
> > Building Your First Spark & C* App + SMACK Stack + The Cassandra Odyssey
> > (San Francisco) - Wednesday, October 26
> > http://www.meetup.com/SF-Spark-and-Friends/events/234932979/
> >
> > Apache YARN Committers/Contributors Meetup #4 (Sunnyvale) - Thursday,
> > October 27
> > http://www.meetup.com/Hadoop-Contributors/events/234971372/
> >
> > Washington
> > Kafka Palooza: LinkedIn, Microsoft Azure, MapR (Bellevue) - Monday,
> > October 24
> > http://www.meetup.com/Seattle-Apache-Kafka-Meetup/events/234836624/
> >
> > Nevada
> > PixieDust: Making Python Visualizations Easier for Jupyter Notebooks with
> > Spark (Las Vegas) - Monday, October 24
> > http://www.meetup.com/Data-Science-Las-Vegas/events/234557659/
> >
> > Texas
> > O&G Big Data Use Cases, by Hortonworks (Houston) - Thursday, October 27
> > http://www.meetup.com/Houston-Hadoop-Meetup-Group/events/234282996/
> >
> > Kansas
> > Using Data Quality to Support Analytics in Hadoop (Overland Park) -
> > Tuesday, October 25
> > http://www.meetup.com/Kansas-City-Big-Data-Projects-Group/events/234597551/
> >
> > Missouri
> > Using Data Quality to Support Analytics in Hadoop (Kansas City) -
> > Tuesday, October 25
> > http://www.meetup.com/Kansas-City-Big-Data-Projects-Group/events/234597347/
> >
> > Illinois
> > Big Data Streaming Platform Ecosystem (Chicago) - Tuesday, October 25
> > http://www.meetup.com/ChicagoRealTimeStreamingAnalytics/events/234676872/
> >
> > Apache Spark 101 (Chicago) - Tuesday, October 25
> > http://www.meetup.com/Chicago-Spark-Users/events/233999667/
> >
> > Ohio
> > October Edition of MOHUG (Dublin) - Tuesday, October 25
> > http://www.meetup.com/MOHUG-Mid-Ohio-Hadoop-User-Group/events/234416779/
> >
> > Florida
> > Apache Spark (Miami) - Wednesday, October 26
> > http://www.meetup.com/Miami-Hadoop-User-Group/events/234992451/
> >
> > New York
> > Lambda-in-a-Box: Merging Apache Spark & HBase into an Open-Source
> > Database (New York) - Thursday, October 27
> > http://www.meetup.com/mysqlnyc/events/233775657/
> >
> > October Data Engineering Meetup (New York) - Thursday, October 27
> > http://www.meetup.com/NYC-Data-Engineering/events/234946410/
> >
> > CANADA
> > Toronto Apache Spark #14 (Toronto) - Wednesday, October 26
> > http://www.meetup.com/Toronto-Apache-Spark/events/234878620/
> >
> > Introduction to MapR (Toronto) - Thursday, October 27
> > http://www.meetup.com/Toronto-MapR-User-Group/events/231648976/
> >
> > UNITED KINGDOM
> > Why SMACK for Fast Data (London) - Monday, October 24
> > http://www.meetup.com/skillsmatter/events/234588911/
> >
> > Building Scalable Systems in a Changing Data Landscape (London) -
> > Tuesday, October 25
> > http://www.meetup.com/data-science-lab/events/234754144/
> >
> > Spark Structured Streaming in Practice (London) - Wednesday, October 26
> > http://www.meetup.com/hadoop-users-group-uk/events/234876912/
> >
> > SPAIN
> > Season Premiere with Reynold Xin, Co-Founder & Chief Architect at
> > Databricks (Barcelona) - Thursday, October 27
> > http://www.meetup.com/Spark-Barcelona/events/234463208/
> >
> > Introduction to Kafka (Malaga) - Friday, October 28
> > http://www.meetup.com/Linux-Malaga/events/234826330/
> >
> > BELGIUM
> > Spark Pre-Summit Meetup (Brussels) - Tuesday, October 25
> > http://www.meetup.com/Spark-Belgium/events/234234256/
> >
> > Meeting on Streamsets, Datameer and Kudu (Kontich) - Tuesday, October 25
> > http://www.meetup.com/Belgium-Cloudera-User-Group/events/234618841/
> >
> > Spark & Machine Learning Meetup (Brussels) - Thursday, October 27
> > http://www.meetup.com/Data-Science-Community-Meetup/events/234173917/
> >
> > INDIA
> > Introduction to Spark & Use Cases (Hyderabad) - Monday, October 24
> > http://www.meetup.com/meetup-group-ytFpRTDs/events/234412261/
> >
> > AUSTRALIA
> > Rethink SQL for Big Data with Apache Drill (Barton) - Tuesday, October 25
> > http://www.meetup.com/Canberra-Big-Data-Converged-SQL-NoSQL-and-Real-Time/events/233463561/
> >
> > Spark Meetup October (Sydney) - Wednesday, October 26
> > http://www.meetup.com/Sydney-Apache-Spark-User-Group/events/233723585/
> >
> > Rethink SQL for Big Data with Apache Drill (Melbourne) - Thursday,
> > October 27
> > http://www.meetup.com/Melbourne-Big-Data-Converged-SQL-NoSQL-and-Real-Time/events/233463459/
> >
> > ESTONIA
> > Big Data: Spark and TensorFlow (Tallinn) - Monday, October 24
> > http://www.meetup.com/Advanced-Java-Estonia/events/234612322/
> >
> > If you didn't receive this email directly, and you'd like to subscribe to
> > weekly emails please visit http://hadoopweekly.com
