Repository: arrow-site
Updated Branches:
  refs/heads/asf-site a6214c739 -> 24caf72d8
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/24caf72d/feed.xml
----------------------------------------------------------------------
diff --git a/feed.xml b/feed.xml
index 5c51b44..112cfda 100644
--- a/feed.xml
+++ b/feed.xml
@@ -1,4 +1,4 @@
-<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.4.3">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2017-12-20T15:02:10-05:00</updated><id>/</id><entry><title type="html">Apache Arrow 0.8.0 Release</title><link href="/blog/2017/12/18/0.8.0-release/" rel="alternate" type="text/html" title="Apache Arrow 0.8.0 Release" /><published>2017-12-18T23:01:00-05:00</published><updated>2017-12-18T23:01:00-05:00</updated><id>/blog/2017/12/18/0.8.0-release</id><content type="html" xml:base="/blog/2017/12/18/0.8.0-release/"><!--
+<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.4.3">Jekyll</generator><link href="/feed.xml" rel="self" type="application/atom+xml" /><link href="/" rel="alternate" type="text/html" /><updated>2018-01-19T12:53:39-05:00</updated><id>/</id><entry><title type="html">Apache Arrow 0.8.0 Release</title><link href="/blog/2017/12/18/0.8.0-release/" rel="alternate" type="text/html" title="Apache Arrow 0.8.0 Release" /><published>2017-12-18T23:01:00-05:00</published><updated>2017-12-18T23:01:00-05:00</updated><id>/blog/2017/12/18/0.8.0-release</id><content type="html" xml:base="/blog/2017/12/18/0.8.0-release/"><!--
 -->

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/24caf72d/img/copy.png
----------------------------------------------------------------------
diff --git a/img/copy.png b/img/copy.png
index a1e0499..55ff71e 100644
Binary files a/img/copy.png and b/img/copy.png differ
http://git-wip-us.apache.org/repos/asf/arrow-site/blob/24caf72d/img/shared.png
----------------------------------------------------------------------
diff --git a/img/shared.png b/img/shared.png
index 7869dad..b079ad0 100644
Binary files a/img/shared.png and b/img/shared.png differ

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/24caf72d/index.html
----------------------------------------------------------------------
diff --git a/index.html b/index.html
index 52a5cc4..656832d 100644
--- a/index.html
+++ b/index.html
@@ -113,72 +113,80 @@
 </nav>
- <div class="container">
- <div class="jumbotron">
+ <div class="container">
+ <div class="jumbotron">
 <h1>Apache Arrow</h1>
- <p class="lead">Powering Columnar In-Memory Analytics</p>
+ <p class="lead">A cross-language development platform for in-memory data</p>
 <p>
- <a class="btn btn-lg btn-success" href="mailto:[email protected]" role="button">Join Mailing List</a>
- <a class="btn btn-lg btn-primary" href="/install/" role="button">Install (0.8.0 Release - December 18, 2017)</a>
+ <a class="btn btn-lg btn-success" style="white-space: normal;" href="mailto:[email protected]" role="button">Join Mailing List</a>
+ <a class="btn btn-lg btn-primary" style="white-space: normal;" href="/install/" role="button">Install (0.8.0 Release - 18 December 2017)</a>
 </p>
- </div>
- <h4><a href="/blog/"><strong>See Latest News</strong></a></h4>
- <div class="row">
+ </div>
+ <div class="row">
+ <div class="col-xs-12">
+ <h4>
+ <a href="/blog/"><strong>See Latest News</strong></a>
+ </h4>
+ </div>
+ </div>
+ <div class="row">
+ <div class="col-xs-12">
+ <p>Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication.
Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.</p>
+ </div>
+ </div>
+ <div class="row">
 <div class="col-lg-4">
- <h2>Fast</h2>
- <p>Apache Arrow™ enables execution engines to take advantage of
- the latest SIMD (Single input multiple data) operations included in modern
- processors, for native vectorized optimization of analytical data
- processing. Columnar layout is optimized for data locality for better
- performance on modern hardware like CPUs and GPUs.</p>
-
- <p>The Arrow memory format supports <strong>zero-copy reads</strong>
- for lightning-fast data access without serialization overhead.</p>
-
+ <h2>Fast</h2>
+ <p>Apache Arrow™ enables execution engines to take advantage of the latest SIMD (Single input multiple data) operations included in modern processors, for native vectorized optimization of analytical data processing. Columnar layout is optimized for data locality for better performance on modern hardware like CPUs and GPUs.</p>
+ <p>The Arrow memory format supports <strong>zero-copy reads</strong> for lightning-fast data access without serialization overhead.</p>
 </div>
 <div class="col-lg-4">
- <h2>Flexible</h2>
- <p>Arrow acts as a new high-performance interface between various
- systems. It is also focused on supporting a wide variety of
- industry-standard programming languages. Java, C, C++, Python, Ruby,
- and JavaScript implementations are in progress and more languages are
- welcome.</p>
+ <h2>Flexible</h2>
+ <p>Arrow acts as a new high-performance interface between various systems. It is also focused on supporting a wide variety of industry-standard programming languages. Java, C, C++, Python, Ruby, and JavaScript implementations are in progress and more languages are welcome.
+ </p>
 </div>
 <div class="col-lg-4">
- <h2>Standard</h2>
- <p>Apache Arrow is backed by key developers of 13 major open source
- projects, including Calcite, Cassandra, Drill, Hadoop, HBase, Ibis,
- Impala, Kudu, Pandas, Parquet, Phoenix, Spark, and Storm making it
- the de-facto standard for columnar in-memory analytics.</p>
+ <h2>Standard</h2>
+ <p>Apache Arrow is backed by key developers of 13 major open source projects, including Calcite, Cassandra, Drill, Hadoop, HBase, Ibis, Impala, Kudu, Pandas, Parquet, Phoenix, Spark, and Storm making it the de-facto standard for columnar in-memory analytics.</p>
+ <p>Learn more about projects that are <a href="/powered_by/">Powered By Apache Arrow</a></p>
 </div>
- </div> <!-- close "row" div -->
+ </div>
+ <!-- close "row" div -->
-<h2>Performance Advantage of Columnar In-Memory</h2>
-<div align="center">
- <img src="img/simd.png" alt="SIMD" style="width:60%" />
-</div>
-<h2>Advantages of a Common Data Layer</h2>
-
-<div class="row">
-<div class="col-lg-4" style="width:50%">
-<img src="img/copy2.png" alt="common data layer" style="width:100%" />
-<ul>
- <li>Each system has its own internal memory format</li>
- <li>70-80% computation wasted on serialization and deserialization</li>
- <li>Similar functionality implemented in multiple projects</li>
-</ul>
-</div>
-<div class="col-lg-4" style="width:50%">
-<img src="img/shared2.png" alt="common data layer" style="width:100%" />
-<ul>
- <li>All systems utilize the same memory format</li>
- <li>No overhead for cross-system communication</li>
- <li>Projects can share functionality (eg, Parquet-to-Arrow reader)</li>
-</ul>
-</div>
+ <div class="row">
+ <div class="col-xs-12">
+ <h2>Performance Advantage of Columnar In-Memory</h2>
+ </div>
+ <div class="col-lg-offset-2 col-lg-8 col-xs-12">
+ <img src="img/simd.png" alt="SIMD" class="img-responsive" />
+ </div>
+ </div>
+
+ <div class="row">
+ <div class="col-xs-12">
+ <h2>Advantages of a Common Data Layer</h2>
+ </div>
+ <div
class="col-lg-6 col-lg-offset-0 col-sm-8 col-sm-offset-2 col-xs-10 col-xs-offset-1">
+ <img src="img/copy.png" alt="common data layer" class="img-responsive" />
+ <ul>
+ <li>Each system has its own internal memory format</li>
+ <li>70-80% computation wasted on serialization and deserialization</li>
+ <li>Similar functionality implemented in multiple projects</li>
+ </ul>
+ </div>
+ <div class="col-lg-6 col-lg-offset-0 col-sm-8 col-sm-offset-2 col-xs-10 col-xs-offset-1">
+ <img src="img/shared.png" alt="common data layer" class="img-responsive" />
+ <ul>
+ <li>All systems utilize the same memory format</li>
+ <li>No overhead for cross-system communication</li>
+ <li>Projects can share functionality (eg, Parquet-to-Arrow reader)</li>
+ </ul>
+ </div>
+ </div>
 </div>
- </div> <!-- /container -->
- </body>
+<!-- /container -->
+
+</body>
 </html>

http://git-wip-us.apache.org/repos/asf/arrow-site/blob/24caf72d/powered_by/index.html
----------------------------------------------------------------------
diff --git a/powered_by/index.html b/powered_by/index.html
index ee6080d..afd9089 100644
--- a/powered_by/index.html
+++ b/powered_by/index.html
@@ -140,11 +140,9 @@ names, etc.) like “arrow-foo”. These are permitted. Nominative use of tradem
 in descriptions is also always allowed, as in “BigCoProduct is a widget for Apache Arrow”.</p>
-<h3 id="open-source-projects">Open Source Projects</h3>
-
-<p>To add yourself to the list, please email [email protected] with your
+<p>To add yourself to the list, please open a pull request adding your
 organization name, URL, a list of which Arrow components you are using, and a
-short description of your use case.</p>
+short description of your use case. See the following for some examples.</p>
 <ul>
 <li><strong><a href="https://parquet.apache.org/">Apache Parquet</a>:</strong> A columnar storage format available to any project
@@ -162,10 +160,23 @@ large-scale data processing.
Spark uses Apache Arrow to
 <li><strong><a href="https://github.com/dask/dask">Dask</a>:</strong> Python library for parallel and distributed execution of dynamic task graphs. Dask supports using pyarrow for accessing Parquet files</li>
+ <li><strong><a href="https://www.dremio.com/">Dremio</a>:</strong> A self-service data platform. Dremio makes it easy for
+users to discover, curate, accelerate, and share data from any source.
+It includes a distributed SQL execution engine based on Apache Arrow.
+Dremio reads data from any source (RDBMS, HDFS, S3, NoSQL) into Arrow
+buffers, and provides fast SQL access via ODBC, JDBC, and REST for BI,
+Python, R, and more (all backed by Apache Arrow).</li>
 <li><strong><a href="https://github.com/locationtech/geomesa">GeoMesa</a>:</strong> A suite of tools that enables large-scale geospatial query and analytics on distributed computing systems. GeoMesa supports query results in the Arrow IPC format, which can then be used for in-browser visualizations and/or further analytics.</li>
+ <li><strong><a href="http://gpuopenanalytics.com">GOAI</a>:</strong> Open GPU-Accelerated Analytics Initiative for Arrow-powered
+analytics across GPU tools and vendors</li>
+ <li><strong><a href="https://www.graphistry.com">Graphistry</a>:</strong> Supercharged Visual Investigation Platform used by
+teams for security, anti-fraud, and related investigations. The Graphistry
+team uses Arrow in its NodeJS GPU backend and client libraries, and is an
+early contributing member to GOAI and Arrow[JS] focused on bringing these
+technologies to the enterprise.</li>
 <li><strong><a href="https://github.com/gpuopenanalytics/libgdf">libgdf</a>:</strong> A C library of CUDA-based analytics functions and GPU IPC support for structured data. Uses the Arrow IPC format and targets the Arrow memory layout in its analytic functions. This work is part of the <a href="https://gpuopenanalytics.com/">GPU Open
@@ -176,6 +187,9 @@ handles.
This work is part of the <a href="https://gpuopenanalytics.com/">GPU Op
 <li><strong><a href="https://pandas.pydata.org">pandas</a>:</strong> data analysis toolkit for Python programmers. pandas supports reading and writing Parquet files using pyarrow. Several pandas core developers are also contributors to Apache Arrow.</li>
+ <li><strong><a href="https://quiltdata.com/">Quilt Data</a>:</strong> Quilt is a data package manager, designed to make
+managing data as easy as managing code. It supports Parquet format via
+pyarrow for data access.</li>
 <li><strong><a href="https://github.com/ray-project/ray">Ray</a>:</strong> A flexible, high-performance distributed execution framework with a focus on machine learning and AI applications. Uses Arrow to efficiently store Python data structures containing large arrays of numerical
@@ -193,31 +207,6 @@ Arrow Tables and RecordBatches in addition to the Python Database API
 Specification 2.0.</li>
 </ul>
-<h3 id="companies-and-organizations">Companies and Organizations</h3>
-
-<p>To add yourself to the list, please email [email protected] with your
-organization name, URL, a list of which Arrow components you are using, and a
-short description of your use case.</p>
-
-<ul>
- <li><strong><a href="https://www.dremio.com/">Dremio</a>:</strong> A self-service data platform. Dremio makes it easy for
-users to discover, curate, accelerate, and share data from any source.
-It includes a distributed SQL execution engine based on Apache Arrow.
-Dremio reads data from any source (RDBMS, HDFS, S3, NoSQL) into Arrow
-buffers, and provides fast SQL access via ODBC, JDBC, and REST for BI,
-Python, R, and more (all backed by Apache Arrow).</li>
- <li><strong><a href="http://gpuopenanalytics.com">GOAI</a>:</strong> Open GPU-Accelerated Analytics Initiative for Arrow-powered
-analytics across GPU tools and vendors</li>
- <li><strong><a href="https://www.graphistry.com">Graphistry</a>:</strong> Supercharged Visual Investigation Platform used by
-teams for security, anti-fraud, and related investigations. The Graphistry
-team uses Arrow in its NodeJS GPU backend and client libraries, and is an
-early contributing member to GOAI and Arrow[JS] focused on bringing these
-technologies to the enterprise.</li>
- <li><strong><a href="https://quiltdata.com/">Quilt Data</a>:</strong> Quilt is a data package manager, designed to make
-managing data as easy as managing code. It supports Parquet format via
-pyarrow for data access.</li>
-</ul>
-
 <hr/>
