This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-tvm-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new c506653  Build at Mon Mar 30 15:47:27 PDT 2020
c506653 is described below

commit c506653d9088268817ed126e709cd084331b1516
Author: tqchen <[email protected]>
AuthorDate: Mon Mar 30 15:47:27 2020 -0700

    Build at Mon Mar 30 15:47:27 PDT 2020
---
 2018/07/12/vta-release-announcement.html | 10 +++++-----
 2019/03/18/tvm-apache-announcement.html  |  2 +-
 atom.xml                                 | 14 +++++++-------
 rss.xml                                  | 16 ++++++++--------
 vta.html                                 |  4 ++--
 5 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/2018/07/12/vta-release-announcement.html b/2018/07/12/vta-release-announcement.html
index 9304549..08c2b6e 100644
--- a/2018/07/12/vta-release-announcement.html
+++ b/2018/07/12/vta-release-announcement.html
@@ -168,7 +168,7 @@
 
 <p>VTA is more than a standalone accelerator design: it’s an end-to-end 
solution that includes drivers, a JIT runtime, and an optimizing compiler stack 
based on TVM. The current release includes a behavioral hardware simulator, as 
well as the infrastructure to deploy VTA on low-cost FPGA hardware for fast 
prototyping. By extending the TVM stack with a customizable, open-source deep 
learning hardware accelerator design, we are exposing a transparent 
end-to-end deep learning stack from [...]
 
-<p style="text-align: center"><img 
src="http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_stack.png"
 alt="image" width="50%" /></p>
+<p style="text-align: center"><img 
src="https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_stack.png"
 alt="image" width="50%" /></p>
 
 <p>The VTA and TVM stack together constitute a blueprint for an end-to-end, 
accelerator-centric deep learning system that can:</p>
 
@@ -223,7 +223,7 @@ The extendability of the compiler stack, combined with the 
ability to modify the
 <p>The Vanilla Tensor Accelerator (VTA) is a generic deep learning accelerator 
built around a GEMM core, which performs dense matrix multiplication at a high 
computational throughput.
The design is inspired by mainstream deep learning accelerators such as 
Google’s TPU. The design adopts decoupled access-execute to hide memory access 
latency and maximize utilization of compute resources. More broadly, VTA can 
serve as a template deep learning accelerator design, 
exposing a clean tensor computation abstraction to the compiler stack.</p>
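
As a rough sketch of the kind of tensor computation abstraction such a design
exposes to the compiler stack, here is a plain TVM tensor-expression GEMM; the
shapes are illustrative and unrelated to VTA's actual tile sizes:

    import tvm
    from tvm import te

    # Illustrative shapes; VTA's real tiling comes from its hardware config.
    N = K = M = 1024
    A = te.placeholder((N, K), name="A")
    B = te.placeholder((K, M), name="B")
    k = te.reduce_axis((0, K), name="k")
    C = te.compute((N, M),
                   lambda i, j: te.sum(A[i, k] * B[k, j], axis=k),
                   name="C")

    s = te.create_schedule(C.op)
    # The lowered loop nest is what schedule transformations (tiling,
    # tensorization) would then map onto an accelerator's GEMM intrinsic.
    print(tvm.lower(s, [A, B, C], simple_mode=True))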
 
-<p style="text-align: center"><img 
src="http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_overview.png"
 alt="image" width="60%" /></p>
+<p style="text-align: center"><img 
src="https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_overview.png"
 alt="image" width="60%" /></p>
 
 <p>The figure above presents a high-level overview of the VTA hardware 
organization. VTA is composed of four modules that communicate with one another 
via FIFO queues and single-writer/single-reader SRAM memory blocks, allowing 
for task-level pipeline parallelism.
The compute module performs both dense linear algebra computation with its 
GEMM core and general computation with its tensor ALU.
@@ -240,7 +240,7 @@ The first approach, which doesn’t require special hardware 
is to run deep lear
 This simulator back-end is readily available for developers to experiment with.
The second approach relies on an off-the-shelf, low-cost FPGA development 
board – the <a href="http://www.pynq.io/">Pynq board</a>, which exposes a 
reconfigurable FPGA fabric and an ARM SoC.</p>
 
-<p style="text-align: center"><img 
src="http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_system.png"
 alt="image" width="70%" /></p>
+<p style="text-align: center"><img 
src="https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_system.png"
 alt="image" width="70%" /></p>
 
 <p>The VTA release offers a simple flow for compiling and deploying the VTA 
hardware design and TVM workloads on the Pynq platform, with the help of an RPC 
server interface.
The RPC server handles FPGA reconfiguration and offloads TVM module 
invocations onto the VTA runtime.
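
From the host side, that flow looks roughly like the sketch below, using TVM's
RPC client; the board address, port, and module name are hypothetical:

    import tvm
    from tvm import rpc

    # Hypothetical address and port of a Pynq board running the VTA RPC server.
    remote = rpc.connect("192.168.2.99", 9091)

    # Upload a module compiled on the host, then load it on the board.
    # "conv2d.o" is a placeholder artifact name.
    remote.upload("conv2d.o")
    mod = remote.load_module("conv2d.o")
    dev = remote.ext_dev(0)  # VTA is exposed as a TVM extension device
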
@@ -263,7 +263,7 @@ While this platform is meant for prototyping (the 2012 FPGA 
cannot compete with
 <p>A popular method for assessing how efficiently hardware is used is the 
roofline diagram: given a hardware design, how efficiently do different 
workloads utilize its compute and memory resources? The roofline plot below 
shows the throughput achieved on different convolution layers of the ResNet-18 
inference benchmark. Each layer has a different arithmetic intensity, i.e., 
ratio of compute to data movement.
In the left half, convolution layers are bandwidth-limited, whereas in the 
right half, they are compute-limited.</p>
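
As a worked example of the arithmetic-intensity calculation behind such a plot
(every number below is invented for illustration, not a measured VTA figure):

    # Arithmetic intensity and attainable throughput for one 3x3 conv layer,
    # roofline-style. All numbers are illustrative.
    flops = 2 * 64 * 64 * 56 * 56 * 3 * 3                    # 2*Cout*Cin*H*W*KH*KW
    bytes_moved = 4 * (64 * 56 * 56 * 2 + 64 * 64 * 3 * 3)   # fp32 in+out+weights
    intensity = flops / bytes_moved                          # FLOPs per byte

    peak_gops = 50.0   # hypothetical peak compute, GOPS
    peak_gbps = 4.0    # hypothetical DRAM bandwidth, GB/s
    attainable = min(peak_gops, intensity * peak_gbps)       # the roofline
    print(f"{intensity:.0f} FLOP/byte -> attainable {attainable:.0f} GOPS")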
 
-<p style="text-align: center"><img 
src="http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_roofline.png"
 alt="image" width="60%" /></p>
+<p style="text-align: center"><img 
src="https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_roofline.png"
 alt="image" width="60%" /></p>
 
 <p>The goal behind designing a hardware architecture and a compiler stack is 
to bring each workload as close as possible to the roofline of the target 
hardware.
The roofline plot shows the effects of having the hardware and compiler work 
together to maximize utilization of the available hardware resources.
@@ -272,7 +272,7 @@ The result is an overall higher utilization of the 
available compute and memory
 
 <h3 id="end-to-end-resnet-18-evaluation">End-to-end ResNet-18 evaluation</h3>
 
-<p style="text-align: center"><img 
src="http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_e2e.png"
 alt="image" width="60%" /></p>
+<p style="text-align: center"><img 
src="https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_e2e.png"
 alt="image" width="60%" /></p>
 
 <p>A benefit of having a complete compiler stack built for VTA is the ability 
to run end-to-end workloads. This is compelling in the context of hardware 
acceleration because we need to understand which performance bottlenecks and 
Amdahl’s-law limitations stand in the way of faster performance.
The bar plot above shows inference performance with and without offloading the 
ResNet convolutional layers to the FPGA-based VTA design, on the Pynq board’s 
ARM Cortex-A9 SoC.
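
To make the Amdahl's-law point concrete, a quick back-of-the-envelope sketch
(the fraction and speedup below are invented for illustration):

    # Amdahl's law: overall speedup when only a fraction p of inference time
    # (here, the convolutional layers) is accelerated by a factor s.
    p = 0.85   # hypothetical fraction of time spent in conv layers
    s = 10.0   # hypothetical speedup of the offloaded conv layers
    overall = 1.0 / ((1.0 - p) + p / s)
    print(f"overall speedup: {overall:.2f}x")  # ~4.26x; the CPU-side rest dominates
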
diff --git a/2019/03/18/tvm-apache-announcement.html b/2019/03/18/tvm-apache-announcement.html
index 98e350d..b154327 100644
--- a/2019/03/18/tvm-apache-announcement.html
+++ b/2019/03/18/tvm-apache-announcement.html
@@ -168,7 +168,7 @@
 
 <p style="text-align: center"><img src="/images/main/tvm-stack.png" 
alt="image" width="70%" /></p>
 
-<p>TVM stack began as a research project at the <a 
href="https://sampl.cs.washington.edu/">SAMPL group</a> of Paul G. Allen School 
of Computer Science &amp; Engineering, University of Washington. The project 
uses the loop-level IR and several optimizations from the <a 
href="http://halide-lang.org/">Halide project</a>, in addition to <a 
href="https://tvm.ai/about">a full deep learning compiler stack</a> to support 
machine learning workloads for diverse hardware backends.</p>
+<p>TVM stack began as a research project at the <a 
href="https://sampl.cs.washington.edu/">SAMPL group</a> of Paul G. Allen School 
of Computer Science &amp; Engineering, University of Washington. The project 
uses the loop-level IR and several optimizations from the <a 
href="http://halide-lang.org/">Halide project</a>, in addition to <a 
href="https://tvm.apache.org/about">a full deep learning compiler stack</a> to 
support machine learning workloads for diverse hardware backends.</p>
 
 <p>Since its introduction, the project has been driven by an open source 
community involving multiple industry and academic institutions. Currently, the 
TVM stack includes a high-level differentiable programming IR for high-level 
optimization, a machine-learning-driven program optimizer and VTA – a fully 
open-sourced deep learning accelerator. The community brings innovations from 
machine learning, compiler systems, programming languages, and computer 
architecture to build a full-stack open s [...]
 
diff --git a/atom.xml b/atom.xml
index 4a77194..775f322 100644
--- a/atom.xml
+++ b/atom.xml
@@ -4,7 +4,7 @@
  <title>TVM</title>
 <link href="https://tvm.apache.org" rel="self"/>
  <link href="https://tvm.apache.org"/>
- <updated>2020-03-30T11:16:12-07:00</updated>
+ <updated>2020-03-30T15:47:25-07:00</updated>
  <id>https://tvm.apache.org</id>
  <author>
    <name></name>
@@ -269,7 +269,7 @@ We show that automatic optimization in TVM makes it easy 
and flexible to support
 
 &lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;/images/main/tvm-stack.png&quot; alt=&quot;image&quot; 
width=&quot;70%&quot; /&gt;&lt;/p&gt;
 
-&lt;p&gt;TVM stack began as a research project at the &lt;a 
href=&quot;https://sampl.cs.washington.edu/&quot;&gt;SAMPL group&lt;/a&gt; of 
Paul G. Allen School of Computer Science &amp;amp; Engineering, University of 
Washington. The project uses the loop-level IR and several optimizations from 
the &lt;a href=&quot;http://halide-lang.org/&quot;&gt;Halide project&lt;/a&gt;, 
in addition to &lt;a href=&quot;https://tvm.ai/about&quot;&gt;a full deep 
learning compiler stack&lt;/a&gt; to support [...]
+&lt;p&gt;TVM stack began as a research project at the &lt;a 
href=&quot;https://sampl.cs.washington.edu/&quot;&gt;SAMPL group&lt;/a&gt; of 
Paul G. Allen School of Computer Science &amp;amp; Engineering, University of 
Washington. The project uses the loop-level IR and several optimizations from 
the &lt;a href=&quot;http://halide-lang.org/&quot;&gt;Halide project&lt;/a&gt;, 
in addition to &lt;a href=&quot;https://tvm.apache.org/about&quot;&gt;a full 
deep learning compiler stack&lt;/a&gt; to [...]
 
 &lt;p&gt;Since its introduction, the project has been driven by an open source 
community involving multiple industry and academic institutions. Currently, the 
TVM stack includes a high-level differentiable programming IR for high-level 
optimization, a machine-learning-driven program optimizer and VTA – a fully 
open-sourced deep learning accelerator. The community brings innovations from 
machine learning, compiler systems, programming languages, and computer 
architecture to build a full-stack  [...]
 
@@ -1276,7 +1276,7 @@ support, and can be used to implement convenient 
converters, such as
 
 &lt;p&gt;VTA is more than a standalone accelerator design: it’s an end-to-end 
solution that includes drivers, a JIT runtime, and an optimizing compiler stack 
based on TVM. The current release includes a behavioral hardware simulator, as 
well as the infrastructure to deploy VTA on low-cost FPGA hardware for fast 
prototyping. By extending the TVM stack with a customizable, open-source deep 
learning hardware accelerator design, we are exposing a transparent 
end-to-end deep learning stac [...]
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_stack.png&quot;
 alt=&quot;image&quot; width=&quot;50%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_stack.png&quot;
 alt=&quot;image&quot; width=&quot;50%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The VTA and TVM stack together constitute a blueprint for an end-to-end, 
accelerator-centric deep learning system that can:&lt;/p&gt;
 
@@ -1331,7 +1331,7 @@ The extendability of the compiler stack, combined with 
the ability to modify the
 &lt;p&gt;The Vanilla Tensor Accelerator (VTA) is a generic deep learning 
accelerator built around a GEMM core, which performs dense matrix 
multiplication at a high computational throughput.
The design is inspired by mainstream deep learning accelerators such as 
Google’s TPU. The design adopts decoupled access-execute to hide memory access 
latency and maximize utilization of compute resources. More broadly, VTA can 
serve as a template deep learning accelerator design, exposing a clean tensor 
computation abstraction to the compiler stack.&lt;/p&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_overview.png&quot;
 alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_overview.png&quot;
 alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The figure above presents a high-level overview of the VTA hardware 
organization. VTA is composed of four modules that communicate with one another 
via FIFO queues and single-writer/single-reader SRAM memory blocks, allowing 
for task-level pipeline parallelism.
The compute module performs both dense linear algebra computation with its 
GEMM core and general computation with its tensor ALU.
@@ -1348,7 +1348,7 @@ The first approach, which doesn’t require special 
hardware is to run deep lear
 This simulator back-end is readily available for developers to experiment with.
The second approach relies on an off-the-shelf, low-cost FPGA development 
board – the &lt;a href=&quot;http://www.pynq.io/&quot;&gt;Pynq board&lt;/a&gt;, 
which exposes a reconfigurable FPGA fabric and an ARM SoC.&lt;/p&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_system.png&quot;
 alt=&quot;image&quot; width=&quot;70%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_system.png&quot;
 alt=&quot;image&quot; width=&quot;70%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The VTA release offers a simple flow for compiling and deploying 
the VTA hardware design and TVM workloads on the Pynq platform, with the help 
of an RPC server interface.
The RPC server handles FPGA reconfiguration and offloads TVM module 
invocations onto the VTA runtime.
@@ -1371,7 +1371,7 @@ While this platform is meant for prototyping (the 2012 
FPGA cannot compete with
 &lt;p&gt;A popular method for assessing how efficiently hardware is used is 
the roofline diagram: given a hardware design, how efficiently do different 
workloads utilize its compute and memory resources? The roofline 
plot below shows the throughput achieved on different convolution layers of the 
ResNet-18 inference benchmark. Each layer has a different arithmetic intensity, 
i.e., ratio of compute to data movement.
In the left half, convolution layers are bandwidth-limited, whereas in the 
right half, they are compute-limited.&lt;/p&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_roofline.png&quot;
 alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_roofline.png&quot;
 alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The goal behind designing a hardware architecture and a compiler 
stack is to bring each workload as close as possible to the roofline of the 
target hardware.
The roofline plot shows the effects of having the hardware and compiler work 
together to maximize utilization of the available hardware resources.
@@ -1380,7 +1380,7 @@ The result is an overall higher utilization of the 
available compute and memory
 
 &lt;h3 id=&quot;end-to-end-resnet-18-evaluation&quot;&gt;End-to-end ResNet-18 
evaluation&lt;/h3&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_e2e.png&quot;
 alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_e2e.png&quot;
 alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;A benefit of having a complete compiler stack built for VTA is the 
ability to run end-to-end workloads. This is compelling in the context of 
hardware acceleration because we need to understand which performance 
bottlenecks and Amdahl’s-law limitations stand in the way of faster 
performance.
The bar plot above shows inference performance with and without offloading the 
ResNet convolutional layers to the FPGA-based VTA design, on the Pynq board’s 
ARM Cortex-A9 SoC.
diff --git a/rss.xml b/rss.xml
index 967dd59..dc52cef 100644
--- a/rss.xml
+++ b/rss.xml
@@ -5,8 +5,8 @@
         <description>TVM - </description>
         <link>https://tvm.apache.org</link>
        <atom:link href="https://tvm.apache.org" rel="self" 
type="application/rss+xml" />
-        <lastBuildDate>Mon, 30 Mar 2020 11:16:12 -0700</lastBuildDate>
-        <pubDate>Mon, 30 Mar 2020 11:16:12 -0700</pubDate>
+        <lastBuildDate>Mon, 30 Mar 2020 15:47:25 -0700</lastBuildDate>
+        <pubDate>Mon, 30 Mar 2020 15:47:25 -0700</pubDate>
         <ttl>60</ttl>
 
 
@@ -264,7 +264,7 @@ We show that automatic optimization in TVM makes it easy 
and flexible to support
 
 &lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;/images/main/tvm-stack.png&quot; alt=&quot;image&quot; 
width=&quot;70%&quot; /&gt;&lt;/p&gt;
 
-&lt;p&gt;TVM stack began as a research project at the &lt;a 
href=&quot;https://sampl.cs.washington.edu/&quot;&gt;SAMPL group&lt;/a&gt; of 
Paul G. Allen School of Computer Science &amp;amp; Engineering, University of 
Washington. The project uses the loop-level IR and several optimizations from 
the &lt;a href=&quot;http://halide-lang.org/&quot;&gt;Halide project&lt;/a&gt;, 
in addition to &lt;a href=&quot;https://tvm.ai/about&quot;&gt;a full deep 
learning compiler stack&lt;/a&gt; to support [...]
+&lt;p&gt;TVM stack began as a research project at the &lt;a 
href=&quot;https://sampl.cs.washington.edu/&quot;&gt;SAMPL group&lt;/a&gt; of 
Paul G. Allen School of Computer Science &amp;amp; Engineering, University of 
Washington. The project uses the loop-level IR and several optimizations from 
the &lt;a href=&quot;http://halide-lang.org/&quot;&gt;Halide project&lt;/a&gt;, 
in addition to &lt;a href=&quot;https://tvm.apache.org/about&quot;&gt;a full 
deep learning compiler stack&lt;/a&gt; to [...]
 
 &lt;p&gt;Since its introduction, the project has been driven by an open source 
community involving multiple industry and academic institutions. Currently, the 
TVM stack includes a high-level differentiable programming IR for high-level 
optimization, a machine-learning-driven program optimizer and VTA – a fully 
open-sourced deep learning accelerator. The community brings innovations from 
machine learning, compiler systems, programming languages, and computer 
architecture to build a full-stack  [...]
 
@@ -1271,7 +1271,7 @@ support, and can be used to implement convenient 
converters, such as
 
 &lt;p&gt;VTA is more than a standalone accelerator design: it’s an end-to-end 
solution that includes drivers, a JIT runtime, and an optimizing compiler stack 
based on TVM. The current release includes a behavioral hardware simulator, as 
well as the infrastructure to deploy VTA on low-cost FPGA hardware for fast 
prototyping. By extending the TVM stack with a customizable, open-source deep 
learning hardware accelerator design, we are exposing a transparent 
end-to-end deep learning stac [...]
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_stack.png&quot;
 alt=&quot;image&quot; width=&quot;50%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_stack.png&quot;
 alt=&quot;image&quot; width=&quot;50%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The VTA and TVM stack together constitute a blueprint for an end-to-end, 
accelerator-centric deep learning system that can:&lt;/p&gt;
 
@@ -1326,7 +1326,7 @@ The extendability of the compiler stack, combined with 
the ability to modify the
 &lt;p&gt;The Vanilla Tensor Accelerator (VTA) is a generic deep learning 
accelerator built around a GEMM core, which performs dense matrix 
multiplication at a high computational throughput.
The design is inspired by mainstream deep learning accelerators such as 
Google’s TPU. The design adopts decoupled access-execute to hide memory access 
latency and maximize utilization of compute resources. More broadly, VTA can 
serve as a template deep learning accelerator design, exposing a clean tensor 
computation abstraction to the compiler stack.&lt;/p&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_overview.png&quot;
 alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_overview.png&quot;
 alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The figure above presents a high-level overview of the VTA hardware 
organization. VTA is composed of four modules that communicate with one another 
via FIFO queues and single-writer/single-reader SRAM memory blocks, allowing 
for task-level pipeline parallelism.
The compute module performs both dense linear algebra computation with its 
GEMM core and general computation with its tensor ALU.
@@ -1343,7 +1343,7 @@ The first approach, which doesn’t require special 
hardware is to run deep lear
 This simulator back-end is readily available for developers to experiment with.
The second approach relies on an off-the-shelf, low-cost FPGA development 
board – the &lt;a href=&quot;http://www.pynq.io/&quot;&gt;Pynq board&lt;/a&gt;, 
which exposes a reconfigurable FPGA fabric and an ARM SoC.&lt;/p&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_system.png&quot;
 alt=&quot;image&quot; width=&quot;70%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_system.png&quot;
 alt=&quot;image&quot; width=&quot;70%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The VTA release offers a simple flow for compiling and deploying 
the VTA hardware design and TVM workloads on the Pynq platform, with the help 
of an RPC server interface.
The RPC server handles FPGA reconfiguration and offloads TVM module 
invocations onto the VTA runtime.
@@ -1366,7 +1366,7 @@ While this platform is meant for prototyping (the 2012 
FPGA cannot compete with
 &lt;p&gt;A popular method for assessing how efficiently hardware is used is 
the roofline diagram: given a hardware design, how efficiently do different 
workloads utilize its compute and memory resources? The roofline 
plot below shows the throughput achieved on different convolution layers of the 
ResNet-18 inference benchmark. Each layer has a different arithmetic intensity, 
i.e., ratio of compute to data movement.
In the left half, convolution layers are bandwidth-limited, whereas in the 
right half, they are compute-limited.&lt;/p&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_roofline.png&quot;
 alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_roofline.png&quot;
 alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;The goal behind designing a hardware architecture and a compiler 
stack is to bring each workload as close as possible to the roofline of the 
target hardware.
The roofline plot shows the effects of having the hardware and compiler work 
together to maximize utilization of the available hardware resources.
@@ -1375,7 +1375,7 @@ The result is an overall higher utilization of the 
available compute and memory
 
 &lt;h3 id=&quot;end-to-end-resnet-18-evaluation&quot;&gt;End-to-end ResNet-18 
evaluation&lt;/h3&gt;
 
-&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_e2e.png&quot;
 alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
+&lt;p style=&quot;text-align: center&quot;&gt;&lt;img 
src=&quot;https://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_e2e.png&quot;
 alt=&quot;image&quot; width=&quot;60%&quot; /&gt;&lt;/p&gt;
 
 &lt;p&gt;A benefit of having a complete compiler stack built for VTA is the 
ability to run end-to-end workloads. This is compelling in the context of 
hardware acceleration because we need to understand which performance 
bottlenecks and Amdahl’s-law limitations stand in the way of faster 
performance.
The bar plot above shows inference performance with and without offloading the 
ResNet convolutional layers to the FPGA-based VTA design, on the Pynq board’s 
ARM Cortex-A9 SoC.
diff --git a/vta.html b/vta.html
index e7ad980..ce54668 100644
--- a/vta.html
+++ b/vta.html
@@ -159,7 +159,7 @@ The current release includes a behavioral hardware 
simulator, as well as the inf
 By extending the TVM stack with a customizable, open-source deep learning 
hardware accelerator design, we are exposing a transparent end-to-end deep 
learning stack from the high-level deep learning framework, down to the actual 
hardware design and implementation.
This forms a truly end-to-end, software-to-hardware open source stack for 
deep learning systems.</p>
 
-<p style="text-align: center"><img 
src="http://raw.githubusercontent.com/uwsampl/web-data/master/vta/blogpost/vta_stack.png"
 alt="image" width="50%" /></p>
+<p style="text-align: center"><img 
src="https://raw.githubusercontent.com/uwsampl/web-data/master/vta/blogpost/vta_stack.png"
 alt="image" width="50%" /></p>
 
 <p>The VTA and TVM stack together constitute a blueprint for an end-to-end, 
accelerator-centric deep learning system that can:</p>
 
@@ -174,7 +174,7 @@ TVM is now an effort undergoing incubation at The Apache 
Software Foundation (AS
 driven by an open source community involving multiple industry and academic 
institutions
 under the Apache way.</p>
 
-<p>Read more about VTA in the <a 
href="https://tvm.ai/2018/07/12/vta-release-announcement.html">TVM blog 
post</a>, or in the <a href="https://arxiv.org/abs/1807.04188">VTA 
techreport</a>.</p>
+<p>Read more about VTA in the <a 
href="https://tvm.apache.org/2018/07/12/vta-release-announcement.html">TVM blog 
post</a>, or in the <a href="https://arxiv.org/abs/1807.04188">VTA 
techreport</a>.</p>
 
       </div>
     </div>
