This is an automated email from the ASF dual-hosted git repository.
sk0x50 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/ignite-website.git
The following commit(s) were added to refs/heads/master by this push:
new 6c4a7dd4c7 IGNITE-27061 Publish Schema-driven design blog post (#285)
6c4a7dd4c7 is described below
commit 6c4a7dd4c71a5e915fea10374bbca1c8e71d01f9
Author: jinxxxoid <[email protected]>
AuthorDate: Tue Nov 18 20:59:01 2025 +0400
IGNITE-27061 Publish Schema-driven design blog post (#285)
---
.../schema-design-for-distributed-systems-ai3.pug | 383 ++++++++++++++++
_src/_components/base.pug | 4 +-
_src/_components/templates/post.pug | 12 +-
blog/apache/index.html | 43 +-
blog/ignite/index.html | 41 +-
blog/index.html | 41 +-
...schema-design-for-distributed-systems-ai3.html} | 483 +++++++++++++--------
7 files changed, 724 insertions(+), 283 deletions(-)
diff --git a/_src/_blog/schema-design-for-distributed-systems-ai3.pug
b/_src/_blog/schema-design-for-distributed-systems-ai3.pug
new file mode 100644
index 0000000000..33e26c2b75
--- /dev/null
+++ b/_src/_blog/schema-design-for-distributed-systems-ai3.pug
@@ -0,0 +1,383 @@
+---
+title: "Schema Design for Distributed Systems: Why Data Placement Matters"
+author: "Michael Aglietti"
+date: 2025-11-18
+tags:
+ - apache
+ - ignite
+---
+
+p Discover how Apache Ignite 3 keeps related data together with schema-driven
colocation, cutting cross-node traffic and making distributed queries fast,
local and predictable.
+
+<!-- end -->
+
+h3 Schema Design for Distributed Systems: Why Data Placement Matters
+
+p You can scale out your database, add more nodes, and tune every index, but
if your data isn’t in the right place, performance still hits a wall. Every
distributed system eventually runs into this: joins that cross the network,
caches that can’t keep up, and queries that feel slower the larger your cluster
gets.
+
+p.
+ Most distributed SQL databases claim to solve scalability. They partition
data evenly, replicate it across nodes, and promise linear performance. But
#[em how] data is distributed and #[em which] records end up together matters
more than most people realize.
+ If related data lands on different nodes, every query has to travel the
network to fetch it, and each millisecond adds up.
+
+
+p.
+ That’s where #[strong data placement] becomes the real scaling strategy.
Apache Ignite 3 takes a different path with #[strong schema-driven colocation]
— a way to keep related data physically together. Instead of spreading rows
randomly across nodes, Ignite uses your schema relationships to decide where
data lives. The result: a 200 ms cross-node query becomes a 5 ms local read.
+
+hr
+
+h3 How Ignite 3 Differs from Other Distributed Databases
+
+p
+ strong Traditional Distributed SQL Databases:
+ul
+ li Hash-based partitioning ignores data relationships
+ li Related data scattered across nodes by default
+ li Cross-node joins create network bottlenecks
+ li Millisecond latencies due to disk-first architecture
+
+p
+ strong Ignite 3 Schema-Driven Approach:
+ul
+ li Colocation configuration in schema definitions
+ li Related data automatically placed together
+ li Local queries eliminate network overhead
+ li Microsecond latencies through memory-first storage
+
+hr
+
+h3 The Distributed Data Placement Problem
+
+p You’ve tuned indexes, optimized queries, and scaled your cluster—but latency
still creeps in. The problem isn’t your SQL — it’s where your data lives.
+
+p Traditional hash-based partitioning assigns each record to a node by hashing
its primary key. This keeps data evenly distributed, but it ignores
relationships, so records that applications frequently access together end up
on different nodes. The approach works well until you need to join data that
doesn’t share the same key. Then every query turns into a distributed
operation, and your network becomes the bottleneck.
+
+p Ignite 3 provides automatic colocation based on schema relationships. You
define relationships directly in your schema, and Ignite automatically places
related data on the same nodes using the specified colocation keys.
+
+p.
+ Using a #[a(href="https://github.com/maglietti/ignite3-chinook-demo") music
catalog example], we’ll demonstrate how schema-driven data placement reduces
query latency from 200 ms to 5 ms.
+
+blockquote
+ p.
+ This post assumes you have a basic understanding of how to get an Ignite 3
cluster running and have worked with the Ignite 3 Java API. If you’re new to
Ignite 3, start with the
#[a(href="https://ignite.apache.org/docs/ignite3/latest/quick-start/java-api")
Java API quick start guide] to set up your development environment.
+
+
+hr
+
+h3 How Ignite 3 Places Data Differently
+
+p Tables are distributed across multiple nodes using consistent hashing, but
with a key difference: your schema definitions control data placement. Instead
of accepting random distribution of related records, you declare relationships
in your schema and let Ignite handle placement automatically.
+
+p
+ strong Partitioning Fundamentals:
+ul
+ li Each table is divided into partitions (typically 64–1024 per table)
+ li Primary key hash determines which partition data goes into
+ li Partitions are distributed evenly across available nodes
+ li Each partition has configurable replicas for fault tolerance
+
+p
+ strong Data Placement Concepts:
+ul
+ li
+ strong Affinity
+ | – the algorithm that determines which nodes store which partitions
+ li
+ strong Colocation
+ | – ensuring related data from different tables gets placed on the same
nodes
+
+p.
+ The diagram below shows how colocation works in practice. Artist and Album
tables use different primary keys, but colocation strategy ensures albums are
partitioned by #[code ArtistId] rather than #[code AlbumId]:
+
+
+
+pre.mermaid.
+ %%{init: {
+ "themeVariables": { "fontSize": "24px" },
+ "flowchart": { "htmlLabels": true, "useMaxWidth": true, "nodeSpacing": 90,
"rankSpacing": 90, "diagramPadding": 28 }
+ }}%%
+ graph TB
+ subgraph "3-Node Cluster"
+ N1["Node 1 Partitions: 0,3,6,9"]
+ N2["Node 2 Partitions: 1,4,7,10"]
+ N3["Node 3 Partitions: 2,5,8,11"]
+ end
+
+ subgraph "Album Table - colocateBy ArtistId"
+ B1["AlbumId=101, ArtistId=1 hash(1)→P1 Node 2"]
+ B2["AlbumId=102, ArtistId=1 hash(1)→P1 Node 2"]
+ B3["AlbumId=201, ArtistId=22 hash(22)→P11 Node 3"]
+ B4["AlbumId=301, ArtistId=42 hash(42)→P6 Node 1"]
+ end
+
+ subgraph "Artist Table"
+ A1["ArtistId=1 hash→P1 Node 2"]
+ A2["ArtistId=22 hash→P11 Node 3"]
+ A3["ArtistId=42 hash→P6 Node 1"]
+ end
+
+ A1 -.->|"Same partition"| B1 -.-> N2
+ A1 -.->|"Same partition"| B2 -.-> N2
+ A2 -.->|"Same partition"| B3 -.-> N3
+ A3 -.->|"Same partition"| B4 -.-> N1
+
+p.
+ Colocation configuration in your schema ensures that Album records use the
#[code ArtistId] value (not #[code AlbumId]) for partition assignment. This
guarantees that Artist 1 and all albums with #[code ArtistId = 1] hash to the
same partition and therefore live on the same nodes.
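+
+p The partition assignment described above can be sketched in plain Java. This is a toy model, assuming the 12-partition, 3-node layout from the diagram: the class name, the modulo hash, and the round-robin node layout are illustrative assumptions, not Ignite's actual hash function or assignment algorithm.

```java
/** Toy model of colocated partitioning; not Ignite's real hash or assignment logic. */
final class ColocationSketch {
    static final int PARTITIONS = 12; // assumed: the 12 partitions drawn in the diagram
    static final int NODES = 3;       // assumed: the 3-node cluster drawn in the diagram

    /** Maps a colocation key to a partition using an illustrative modulo hash. */
    static int partitionFor(int colocationKey) {
        return Math.floorMod(Integer.hashCode(colocationKey), PARTITIONS);
    }

    /** Round-robin partition-to-node layout; nodes are numbered from 1. */
    static int nodeFor(int partition) {
        return partition % NODES + 1;
    }

    public static void main(String[] args) {
        int artistId = 1;
        int artistPartition = partitionFor(artistId);
        System.out.printf("Artist %d: P%d (node %d)%n",
            artistId, artistPartition, nodeFor(artistPartition));

        // Each row is {albumId, artistId}.
        int[][] albums = {{101, 1}, {102, 1}};
        for (int[] album : albums) {
            // With colocation, the partition is derived from artistId, not albumId.
            int colocated = partitionFor(album[1]);
            int byAlbumId = partitionFor(album[0]);
            System.out.printf("Album %d: colocated P%d (node %d), by AlbumId P%d (node %d)%n",
                album[0], colocated, nodeFor(colocated), byAlbumId, nodeFor(byAlbumId));
        }
    }
}
```

+p Because both tables feed the same #[code ArtistId] value into the same function, the artist row and its albums always resolve to the same partition, whatever the concrete hash happens to be.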
+
+
+hr
+
+h3 Distribution Zones and Data Placement
+
+p Distribution zones are cluster-level configurations that define how data is
distributed and replicated.
+
+blockquote
+ p.
+ #[strong Zone Creation Options:] Ignite 3 supports multiple approaches:
+ #[br]
+ 1. #[strong SQL DDL] – #[code CREATE ZONE] statements
+ #[br]
+ 2. #[strong Java Builder API] – programmatic #[code
ZoneDefinition.builder()]
+ #[br]
+ #[br]
+ We use the Java Builder API here for consistency with our programmatic
schema examples.
+
+
+p A distribution zone specifies:
+ul
+ li
+ strong Partition count
+ | – how many partitions your data is divided into (typically 64–1024 per
table)
+ li
+ strong Replica count
+ | – how many copies of each partition exist for fault tolerance
+ li
+ strong Node filters
+ | – which nodes can store data for this zone
+
+p First, create the distribution zones:
+
+pre
+ code.
+ // Create the standard zone for frequently updated data
+ ignite.catalog().createZone(ZoneDefinition.builder("MusicStore")
+ .replicas(2)
+ .storageProfiles("default")
+ .build());
+
+ // Create the replicated zone for reference data (replicas = cluster size)
+ ignite.catalog().createZone(ZoneDefinition.builder("MusicStoreReplicated")
+ .replicas(ignite.clusterNodes().size())
+ .storageProfiles("default")
+ .build());
+
+hr
+
+h3 Building Your Music Platform Schema
+
+blockquote
+ p.
+ #[strong Schema Creation:] Ignite 3 supports three approaches:
+ #[br]
+ 1. #[strong SQL DDL] – traditional #[code CREATE TABLE] statements
+ #[br]
+ 2. #[strong Java Annotations API] – POJO markup with #[code @Table],
#[code @Column], etc.
+ #[br]
+ 3. #[strong Java Builder API] – programmatic #[code
TableDefinition.builder()]
+ #[br]
+ #[br]
+ We use the Annotations API here for its clarity and type safety.
+
+p The Artist table establishes the partitioning strategy that dependent tables
will follow through colocation:
+
+pre
+ code.
+ @Table(zone = @Zone(value = "MusicStore", storageProfiles = "default"))
+ public class Artist {
+ @Id
+ @Column(value = "ArtistId", nullable = false)
+ private Integer ArtistId;
+
+ @Column(value = "Name", nullable = false, length = 120)
+ private String Name;
+
+ public Artist() {}
+
+ public Artist(Integer artistId, String name) {
+ this.ArtistId = artistId;
+ this.Name = name;
+ }
+
+ public Integer getArtistId() { return ArtistId; }
+ public void setArtistId(Integer artistId) { this.ArtistId = artistId; }
+ public String getName() { return Name; }
+ public void setName(String name) { this.Name = name; }
+ }
+
+hr
+
+h3 Parent–Child Colocation Implementation
+
+p When users search for "The Beatles", they expect both artist details and
album listings in the same query. Without colocation, this requires cross-node
joins that can take 40–200 ms.
+
+p We solve this by setting
+ code colocateBy
+ | in the
+ code @Table
+ | annotation:
+
+pre
+ code.
+ @Table(
+ zone = @Zone(value = "MusicStore", storageProfiles = "default"),
+ colocateBy = @ColumnRef("ArtistId")
+ )
+ public class Album {
+ @Id
+ @Column(value = "AlbumId", nullable = false)
+ private Integer AlbumId;
+
+ @Id
+ @Column(value = "ArtistId", nullable = false)
+ private Integer ArtistId;
+
+ @Column(value = "Title", nullable = false, length = 160)
+ private String Title;
+
+ @Column(value = "ReleaseDate", nullable = true)
+ private LocalDate ReleaseDate;
+
+ // Constructors and getters/setters...
+ }
+
+p The colocation field (
+ code ArtistId
+ | ) must be part of the composite primary key. Ignite uses the
+ code ArtistId
+ | value to ensure albums with the same artist live on the same nodes as
their corresponding artist record.
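+
+p The primary-key rule can be expressed as a simple subset check. The sketch below is a hypothetical helper written for illustration only (the class and method names are not part of the Ignite API); it assumes only the rule stated above, that every colocation column must also be a primary key column.

```java
import java.util.List;

/** Illustrative check of the colocation rule; not an Ignite API. */
final class ColocationKeyCheck {
    /** Returns true when every colocation column is also part of the primary key. */
    static boolean isValidColocation(List<String> primaryKey, List<String> colocateBy) {
        return !colocateBy.isEmpty() && primaryKey.containsAll(colocateBy);
    }

    public static void main(String[] args) {
        // Album as defined above: composite key (AlbumId, ArtistId), colocated by ArtistId.
        System.out.println(isValidColocation(List.of("AlbumId", "ArtistId"), List.of("ArtistId")));
        // A key of AlbumId alone would not satisfy the rule.
        System.out.println(isValidColocation(List.of("AlbumId"), List.of("ArtistId")));
    }
}
```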
+
+hr
+
+h3 Performance Impact: Memory-First + Colocation
+
+p Let’s quantify the effect of combining memory-first storage with
schema-driven colocation.
+
+p
+ strong Without Colocation – Data Scattered:
+pre
+ code.
+ Artist artist = artistView.get(null, artistKey); // Node 2
+ Collection<Album> albums = albumView.getAll(null, albumKeys); // Nodes 1,2,3
+ // Result: 3 network operations for related data
+ // Query time: 40–200 ms (network latency × nodes involved)
+
+p
+ strong With Memory-First + Colocation – Data Local:
+pre
+ code.
+ Artist artist = artistView.get(null, artistKey); // Node 2
+ Collection<Album> albums = albumView.getAll(null, albumKeys); // Node 2
+ // Result: 1 node involved, local memory access
+ // Query time: 1–5 ms (memory access + no network hops)
+
+ul
+ li
+ strong Query latency reduction:
+ | 200 ms → 5 ms (memory access + no network hops)
+ li
+ strong Network traffic elimination:
+ | related data queries become local operations
+ li
+ strong Resource efficiency:
+ | CPU focuses on serving requests instead of moving data
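+
+p To see where the "3 nodes versus 1 node" difference comes from, the sketch below counts the nodes contacted when reading one artist and three of its albums. It is a toy model under stated assumptions: 12 partitions, 3 nodes, a modulo hash, and made-up album IDs. It does not measure latency and does not use the Ignite API.

```java
import java.util.Set;
import java.util.TreeSet;

/** Toy model counting the nodes touched by one artist-plus-albums read. */
final class NodesTouchedSketch {
    static final int PARTITIONS = 12; // assumed partition count
    static final int NODES = 3;       // assumed cluster size

    /** Illustrative key-to-node mapping: modulo hash, round-robin layout, nodes numbered from 1. */
    static int nodeForKey(int key) {
        int partition = Math.floorMod(key, PARTITIONS);
        return partition % NODES + 1;
    }

    /** Nodes contacted to read one artist plus its albums. */
    static Set<Integer> nodesTouched(int artistId, int[] albumIds, boolean colocated) {
        Set<Integer> nodes = new TreeSet<>();
        nodes.add(nodeForKey(artistId));
        for (int albumId : albumIds) {
            // Colocated albums are placed by artistId; scattered albums by their own ID.
            nodes.add(nodeForKey(colocated ? artistId : albumId));
        }
        return nodes;
    }

    public static void main(String[] args) {
        int[] albumIds = {101, 102, 103};
        System.out.println("Scattered: " + nodesTouched(1, albumIds, false)); // prints "Scattered: [1, 2, 3]"
        System.out.println("Colocated: " + nodesTouched(1, albumIds, true));  // prints "Colocated: [2]"
    }
}
```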
+
+hr
+
+h3 Colocation Enables Compute-to-Data Processing
+
+p Schema-driven colocation doesn’t just optimize queries—it enables processing
where data lives:
+
+pre
+ code.
+ // Target the node that owns the partition for this artist
+ JobTarget target = JobTarget.colocated("Artist",
+ Tuple.create().set("ArtistId", artistId));
+
+ // Runs on the same node where artist and album data live
+ CompletableFuture<RecommendationResult> result = ignite.compute()
+ .executeAsync(target,
+ JobDescriptor.builder(AlbumRecommendationJob.class).build(),
+ preferences);
+
+
+p Instead of moving gigabytes of album data to a compute cluster, you move
kilobytes of logic to where the data already resides.
+
+hr
+
+h3 Implementation Guide
+
+p Deploy tables in dependency order to avoid colocation reference errors:
+
+pre
+ code.
+ try (IgniteClient client = IgniteClient.builder()
+ .addresses("127.0.0.1:10800")
+ .build()) {
+
+ // 1. Reference tables with no dependencies
+ client.catalog().createTable(Genre.class);
+
+ // 2. Root entities
+ client.catalog().createTable(Artist.class);
+
+ // 3. Dependent entities in hierarchy order
+ client.catalog().createTable(Album.class); // References Artist
+ client.catalog().createTable(Track.class); // References Album
+ }
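+
+p If the schema grows beyond a handful of tables, the creation order can be derived rather than maintained by hand. The sketch below is one possible approach, a depth-first topological sort over a hypothetical dependency map built from the comments in the snippet above (Track references Album, Album references Artist). The class and method names are illustrative and not part of the Ignite API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Orders tables so that each one is created after the tables it depends on. */
final class TableOrderSketch {
    static List<String> creationOrder(Map<String, List<String>> dependsOn) {
        Set<String> ordered = new LinkedHashSet<>();
        Set<String> inProgress = new LinkedHashSet<>();
        for (String table : dependsOn.keySet()) {
            visit(table, dependsOn, ordered, inProgress);
        }
        return new ArrayList<>(ordered);
    }

    private static void visit(String table, Map<String, List<String>> dependsOn,
                              Set<String> ordered, Set<String> inProgress) {
        if (ordered.contains(table)) {
            return; // already placed in the output
        }
        if (!inProgress.add(table)) {
            throw new IllegalStateException("Circular dependency at " + table);
        }
        // Place every parent before the table itself.
        for (String parent : dependsOn.getOrDefault(table, List.of())) {
            visit(parent, dependsOn, ordered, inProgress);
        }
        inProgress.remove(table);
        ordered.add(table);
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependsOn = new LinkedHashMap<>();
        dependsOn.put("Track", List.of("Album"));
        dependsOn.put("Album", List.of("Artist"));
        dependsOn.put("Artist", List.of());
        dependsOn.put("Genre", List.of());
        System.out.println(creationOrder(dependsOn));
    }
}
```

+p The resulting list can then drive the #[code createTable] calls in order; tables with no dependencies, such as Genre, can appear anywhere in it.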
+
+hr
+
+h3 Accessing Your Distributed Data
+
+p Ignite 3 provides multiple views of the same colocated data:
+
+pre
+ code.
+ // RecordViews for entity operations
+ RecordView<Artist> artists = client.tables()
+ .table("Artist")
+ .recordView(Artist.class);
+ RecordView<Album> albums = client.tables()
+ .table("Album")
+ .recordView(Album.class);
+
+ // Operations with partition keys route to single nodes
+ Artist beatles = new Artist(1, "The Beatles");
+ artists.upsert(null, beatles);
+
+ Album abbeyRoad = new Album(1, 1, "Abbey Road", LocalDate.of(1969, 9, 26));
+ albums.upsert(null, abbeyRoad); // Automatically colocated with artist
+
+
+hr
+
+h3 Summary
+
+p.
+ Data placement is where distributed performance is won or lost. With
#[strong schema-driven colocation], Apache Ignite 3 keeps related data together
on the same nodes, so your queries stay local, fast, and predictable.
+
+p Instead of tuning around network latency, you design for it once at the
schema level. Your joins stay local, your compute jobs run where the data
lives, and scaling stops being a tradeoff between performance and size.
+
+ul
+ li
+ strong Memory-first + colocation
+ | → microsecond access to related data
+ li
+ strong Schema-driven placement
+ | → predictable performance at scale
+ li
+ strong Compute-to-data
+ | → logic runs with data, not across the network
+ li
+ strong Unified platform
+ | → transactions, analytics, and compute together
+
+p When data lives together, your system scales naturally — without complexity
creeping in.
+
+p.
+ Explore the #[a(href="https://ignite.apache.org/docs/ignite3/latest/")
Ignite 3 documentation] for detailed examples and API references.
+
diff --git a/_src/_components/base.pug b/_src/_components/base.pug
index 16528f74e2..af7259b27b 100644
--- a/_src/_components/base.pug
+++ b/_src/_components/base.pug
@@ -26,4 +26,6 @@ html(lang="en")
script(src="/js/vendor/hystmodal/hystmodal.min.js")
script(src="/js/vendor/smoothscroll.js")
- script(src="/js/main.js?ver=" + config.version)
\ No newline at end of file
+ script(src="/js/main.js?ver=" + config.version)
+
+ block scripts
diff --git a/_src/_components/templates/post.pug
b/_src/_components/templates/post.pug
index a940bc08e9..3e69606d95 100644
--- a/_src/_components/templates/post.pug
+++ b/_src/_components/templates/post.pug
@@ -33,4 +33,14 @@ block main
a(href=`/blog/${tag}`)= tag
aside.blog__sidebar
- include ./../../_src/_components/templates/tags.pug
\ No newline at end of file
+ include ./../../_src/_components/templates/tags.pug
+
+block append scripts
+ script(src='https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.min.js')
+ script.
+ (function () {
+ mermaid.initialize({ startOnLoad: false, securityLevel: 'strict' });
+ if (document.querySelector('.mermaid')) {
+ mermaid.run({ querySelector: '.mermaid' });
+ }
+ }());
\ No newline at end of file
diff --git a/blog/apache/index.html b/blog/apache/index.html
index 1d831ec609..4f4fe9c3c2 100644
--- a/blog/apache/index.html
+++ b/blog/apache/index.html
@@ -122,7 +122,7 @@
<nav class="hdrmenu">
<ul class="flexi">
<li class="js-hasdrop"><a class="hdrmenu--expanded" href="/"
data-panel="getStarted">Get Started</a></li>
- <li class="js-hasdrop"><a class="hdrmenu__current
hdrmenu--expanded" href="/features" data-panel="features">Features</a></li>
+ <li class="js-hasdrop"><a class="hdrmenu--expanded"
href="/features" data-panel="features">Features</a></li>
<li class="js-hasdrop"><a class="hdrmenu--expanded"
href="/community.html" data-panel="community">Community</a></li>
<li><a href="/use-cases/provenusecases.html"
data-panel="">Powered By</a></li>
<li class="js-hasdrop"><a class="hdrmenu--expanded"
href="/resources.html" data-panel="resources">Resources</a></li>
@@ -341,6 +341,17 @@
<div class="blog__content">
<main class="blog_main">
<section class="blog__posts">
+ <article class="post">
+ <div class="post__header">
+ <h2><a
href="/blog/schema-design-for-distributed-systems-ai3.html"> Schema Design for
Distributed Systems: Why Data Placement Matters</a></h2>
+ <div>
+ November 18, 2025 by Michael Aglietti. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/schema-design-for-distributed-systems-ai3.html">Facebook</a><span>,
</span
+ ><a href="http://twitter.com/home?status= Schema Design for
Distributed Systems: Why Data Placement
Matters%20https://ignite.apache.org/blog/schema-design-for-distributed-systems-ai3.html">Twitter</a>
+ </div>
+ </div>
+ <div class="post__content"><p>Discover how Apache Ignite 3 keeps
related data together with schema-driven colocation, cutting cross-node traffic
and making distributed queries fast, local and predictable.</p></div>
+ <div class="post__footer"><a class="more"
href="/blog/schema-design-for-distributed-systems-ai3.html">↓ Read all</a></div>
+ </article>
<article class="post">
<div class="post__header">
<h2><a
href="/blog/getting-to-know-apache-ignite-3.html">Getting to Know Apache Ignite
3: A Schema-Driven Distributed Computing Platform</a></h2>
@@ -536,36 +547,6 @@
</div>
<div class="post__footer"><a class="more"
href="/blog/apache-ignite-2-3-more.html">↓ Read all</a></div>
</article>
- <article class="post">
- <div class="post__header">
- <h2><a
href="/blog/apache-ignite-community-news-september.html">Apache Ignite
Community News (Issue 3)</a></h2>
- <div>
- September 15, 2017 by Denis Magda. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/apache-ignite-community-news-september.html">Facebook</a><span>,
</span
- ><a href="http://twitter.com/home?status=Apache Ignite
Community News (Issue
3)%20https://ignite.apache.org/blog/apache-ignite-community-news-september.html">Twitter</a>
- </div>
- </div>
- <div class="post__content">
- <p><b>by Tom Diederich</b></p>
- <p>This is our third community update – there’s a
lot going on, so let's get started.</p>
- <p>Apache Ignite experts have already spoken at two meetups
this month, both in Silicon Valley, but there are several more scheduled this
month around the world.</p>
- <p></p>
- <p style="margin-bottom: 15pt">
- <span style="line-height: 19.5pt"
- ><span style="font-family: Helvetica"
- ><span style="color: #333333"
- >On <b>Sept. 9</b> Apache Ignite PMC chair Denis Magda
was the featured presenter at the<b> </b
- ><a
href="https://www.meetup.com/datariders/events/242523245/"
- ><b><span style="color: #467d76">Big Data and Cloud
Meetup</span></b></a
- >
- in Santa Clara, Calif. His talk, titled "Apache
Spark and Apache Ignite: Where Fast Data Meets the IoT," was highly rated
and we’re planning a hands-on workshop with meetup organizers for
- November.</span
- ></span
- ></span
- >
- </p>
- </div>
- <div class="post__footer"><a class="more"
href="/blog/apache-ignite-community-news-september.html">↓ Read all</a></div>
- </article>
</section>
<section class="blog__footer">
<ul class="pagination">
diff --git a/blog/ignite/index.html b/blog/ignite/index.html
index 894825c082..6c21066391 100644
--- a/blog/ignite/index.html
+++ b/blog/ignite/index.html
@@ -122,7 +122,7 @@
<nav class="hdrmenu">
<ul class="flexi">
<li class="js-hasdrop"><a class="hdrmenu--expanded" href="/"
data-panel="getStarted">Get Started</a></li>
- <li class="js-hasdrop"><a class="hdrmenu__current
hdrmenu--expanded" href="/features" data-panel="features">Features</a></li>
+ <li class="js-hasdrop"><a class="hdrmenu--expanded"
href="/features" data-panel="features">Features</a></li>
<li class="js-hasdrop"><a class="hdrmenu--expanded"
href="/community.html" data-panel="community">Community</a></li>
<li><a href="/use-cases/provenusecases.html"
data-panel="">Powered By</a></li>
<li class="js-hasdrop"><a class="hdrmenu--expanded"
href="/resources.html" data-panel="resources">Resources</a></li>
@@ -341,6 +341,17 @@
<div class="blog__content">
<main class="blog_main">
<section class="blog__posts">
+ <article class="post">
+ <div class="post__header">
+ <h2><a
href="/blog/schema-design-for-distributed-systems-ai3.html"> Schema Design for
Distributed Systems: Why Data Placement Matters</a></h2>
+ <div>
+ November 18, 2025 by Michael Aglietti. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/schema-design-for-distributed-systems-ai3.html">Facebook</a><span>,
</span
+ ><a href="http://twitter.com/home?status= Schema Design for
Distributed Systems: Why Data Placement
Matters%20https://ignite.apache.org/blog/schema-design-for-distributed-systems-ai3.html">Twitter</a>
+ </div>
+ </div>
+ <div class="post__content"><p>Discover how Apache Ignite 3 keeps
related data together with schema-driven colocation, cutting cross-node traffic
and making distributed queries fast, local and predictable.</p></div>
+ <div class="post__footer"><a class="more"
href="/blog/schema-design-for-distributed-systems-ai3.html">↓ Read all</a></div>
+ </article>
<article class="post">
<div class="post__header">
<h2><a
href="/blog/getting-to-know-apache-ignite-3.html">Getting to Know Apache Ignite
3: A Schema-Driven Distributed Computing Platform</a></h2>
@@ -518,34 +529,6 @@
</div>
<div class="post__footer"><a class="more"
href="/blog/apache-ignite-2-12-0.html">↓ Read all</a></div>
</article>
- <article class="post">
- <div class="post__header">
- <h2><a href="/blog/apache-ignite-2-11-1.html">Apache Ignite
2.11.1: Emergency Log4j2 Update</a></h2>
- <div>
- December 21, 2021 by Maxim Muzafarov. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/apache-ignite-2-11-1.html">Facebook</a><span>,
</span
- ><a href="http://twitter.com/home?status=Apache Ignite
2.11.1: Emergency Log4j2
Update%20https://ignite.apache.org/blog/apache-ignite-2-11-1.html">Twitter</a>
- </div>
- </div>
- <div class="post__content">
- <p>
- The new <a href="https://ignite.apache.org/">Apache
Ignite</a> 2.11.1 is an emergency release that fixes <a
href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-44228">CVE-2021-44228</a>,
- <a
href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-45046">CVE-2021-45046</a>,<a
href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-45105">CVE-2021-45105</a>
related to the ignite-log4j2 module
- usage.
- </p>
- <h3 id="apache-ignite-with-log4j-vulnerability">Apache Ignite
with Log4j Vulnerability</h3>
- <p>All the following conditions must be met:</p>
- <ul>
- <li>The Apache Ignite version lower than 2.11.0 is used
(since these vulnerabilities are already fixed in 2.11.1, 2.12, and upper
versions);</li>
- <li>The <code>ignite-logj42</code> is used by Apache Ignite
and located in the <code>libs</code> directory (by default it is located in the
<code>libs/optional</code>directory, so these deployments are not
affected);</li>
- <li>
- The Java version in use is older than the following
versions: <code>8u191</code>, <code>11.0.1</code>. This is due to the fact that
later versions set the JVM property
- <code>com.sun.jndi.ldap.object.trustURLCodebase</code> to
<code>false</code> by default, which disables JNDI loading of classes from
arbitrary URL code bases.
- </li>
- </ul>
- <p>NOTE: Relying only on the Java version as a protection
against these vulnerabilities is very risky and has not been tested.</p>
- </div>
- <div class="post__footer"><a class="more"
href="/blog/apache-ignite-2-11-1.html">↓ Read all</a></div>
- </article>
</section>
<section class="blog__footer">
<ul class="pagination">
diff --git a/blog/index.html b/blog/index.html
index fe1985ef38..0df8df2c2e 100644
--- a/blog/index.html
+++ b/blog/index.html
@@ -122,7 +122,7 @@
<nav class="hdrmenu">
<ul class="flexi">
<li class="js-hasdrop"><a class="hdrmenu--expanded" href="/"
data-panel="getStarted">Get Started</a></li>
- <li class="js-hasdrop"><a class="hdrmenu__current
hdrmenu--expanded" href="/features" data-panel="features">Features</a></li>
+ <li class="js-hasdrop"><a class="hdrmenu--expanded"
href="/features" data-panel="features">Features</a></li>
<li class="js-hasdrop"><a class="hdrmenu--expanded"
href="/community.html" data-panel="community">Community</a></li>
<li><a href="/use-cases/provenusecases.html"
data-panel="">Powered By</a></li>
<li class="js-hasdrop"><a class="hdrmenu--expanded"
href="/resources.html" data-panel="resources">Resources</a></li>
@@ -341,6 +341,17 @@
<div class="blog__content">
<main class="blog_main">
<section class="blog__posts">
+ <article class="post">
+ <div class="post__header">
+ <h2><a
href="/blog/schema-design-for-distributed-systems-ai3.html"> Schema Design for
Distributed Systems: Why Data Placement Matters</a></h2>
+ <div>
+ November 18, 2025 by Michael Aglietti. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/schema-design-for-distributed-systems-ai3.html">Facebook</a><span>,
</span
+ ><a href="http://twitter.com/home?status= Schema Design for
Distributed Systems: Why Data Placement
Matters%20https://ignite.apache.org/blog/schema-design-for-distributed-systems-ai3.html">Twitter</a>
+ </div>
+ </div>
+ <div class="post__content"><p>Discover how Apache Ignite 3 keeps
related data together with schema-driven colocation, cutting cross-node traffic
and making distributed queries fast, local and predictable.</p></div>
+ <div class="post__footer"><a class="more"
href="/blog/schema-design-for-distributed-systems-ai3.html">↓ Read all</a></div>
+ </article>
<article class="post">
<div class="post__header">
<h2><a
href="/blog/getting-to-know-apache-ignite-3.html">Getting to Know Apache Ignite
3: A Schema-Driven Distributed Computing Platform</a></h2>
@@ -518,34 +529,6 @@
</div>
<div class="post__footer"><a class="more"
href="/blog/apache-ignite-2-12-0.html">↓ Read all</a></div>
</article>
- <article class="post">
- <div class="post__header">
- <h2><a href="/blog/apache-ignite-2-11-1.html">Apache Ignite
2.11.1: Emergency Log4j2 Update</a></h2>
- <div>
- December 21, 2021 by Maxim Muzafarov. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/apache-ignite-2-11-1.html">Facebook</a><span>,
</span
- ><a href="http://twitter.com/home?status=Apache Ignite
2.11.1: Emergency Log4j2
Update%20https://ignite.apache.org/blog/apache-ignite-2-11-1.html">Twitter</a>
- </div>
- </div>
- <div class="post__content">
- <p>
- The new <a href="https://ignite.apache.org/">Apache
Ignite</a> 2.11.1 is an emergency release that fixes <a
href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-44228">CVE-2021-44228</a>,
- <a
href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-45046">CVE-2021-45046</a>,<a
href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-45105">CVE-2021-45105</a>
related to the ignite-log4j2 module
- usage.
- </p>
- <h3 id="apache-ignite-with-log4j-vulnerability">Apache Ignite
with Log4j Vulnerability</h3>
- <p>All the following conditions must be met:</p>
- <ul>
- <li>The Apache Ignite version lower than 2.11.0 is used
(since these vulnerabilities are already fixed in 2.11.1, 2.12, and upper
versions);</li>
- <li>The <code>ignite-logj42</code> is used by Apache Ignite
and located in the <code>libs</code> directory (by default it is located in the
<code>libs/optional</code>directory, so these deployments are not
affected);</li>
- <li>
- The Java version in use is older than the following
versions: <code>8u191</code>, <code>11.0.1</code>. This is due to the fact that
later versions set the JVM property
- <code>com.sun.jndi.ldap.object.trustURLCodebase</code> to
<code>false</code> by default, which disables JNDI loading of classes from
arbitrary URL code bases.
- </li>
- </ul>
- <p>NOTE: Relying only on the Java version as a protection
against these vulnerabilities is very risky and has not been tested.</p>
- </div>
- <div class="post__footer"><a class="more"
href="/blog/apache-ignite-2-11-1.html">↓ Read all</a></div>
- </article>
</section>
<section class="blog__footer">
<ul class="pagination">
diff --git a/blog/index.html
b/blog/schema-design-for-distributed-systems-ai3.html
similarity index 59%
copy from blog/index.html
copy to blog/schema-design-for-distributed-systems-ai3.html
index fe1985ef38..87fa0fae23 100644
--- a/blog/index.html
+++ b/blog/schema-design-for-distributed-systems-ai3.html
@@ -3,16 +3,11 @@
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0,
maximum-scale=1" />
- <title>Apache Ignite Blog</title>
- <meta property="og:title" content="Apache Ignite Blog" />
- <link rel="canonical" href="https://ignite.apache.org/blog" />
- <meta property="og:type" content="article" />
- <meta property="og:url" content="https://ignite.apache.org/blog" />
- <meta property="og:image" content="/img/og-pic.png" />
+ <title>Schema Design for Distributed Systems: Why Data Placement
Matters</title>
<link rel="stylesheet"
href="/js/vendor/hystmodal/hystmodal.min.css?ver=0.9" />
<link rel="stylesheet" href="/css/utils.css?ver=0.9" />
<link rel="stylesheet" href="/css/site.css?ver=0.9" />
- <link rel="stylesheet" href="/css/blog.css?ver=0.9" />
+ <link rel="stylesheet" href="../css/blog.css?ver=0.9" />
<link rel="stylesheet" href="/css/media.css?ver=0.9" media="only screen
and (max-width:1199px)" />
<link rel="icon" type="image/png" href="/img/favicon.png" />
<!-- Matomo -->
@@ -122,7 +117,7 @@
<nav class="hdrmenu">
<ul class="flexi">
<li class="js-hasdrop"><a class="hdrmenu--expanded" href="/"
data-panel="getStarted">Get Started</a></li>
- <li class="js-hasdrop"><a class="hdrmenu__current
hdrmenu--expanded" href="/features" data-panel="features">Features</a></li>
+ <li class="js-hasdrop"><a class="hdrmenu--expanded"
href="/features" data-panel="features">Features</a></li>
<li class="js-hasdrop"><a class="hdrmenu--expanded"
href="/community.html" data-panel="community">Community</a></li>
<li><a href="/use-cases/provenusecases.html"
data-panel="">Powered By</a></li>
<li class="js-hasdrop"><a class="hdrmenu--expanded"
href="/resources.html" data-panel="resources">Resources</a></li>
@@ -337,222 +332,317 @@
<div class="dropmenu__back"></div>
<header class="hdrfloat hdr__white jsHdrFloatBase"></header>
<div class="container blog">
- <section class="blog__header"><h1>Apache Ignite Blog</h1></section>
+ <section class="blog__header post_page__header">
+ <a href="/blog/">← Apache Ignite Blog</a>
+ <h1>Schema Design for Distributed Systems: Why Data Placement
Matters</h1>
+ <p>
+ November 18, 2025 by <strong>Michael Aglietti. Share in </strong><a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/schema-design-for-distributed-systems-ai3.html">Facebook</a><span>,
</span
+ ><a href="http://twitter.com/home?status= Schema Design for
Distributed Systems: Why Data Placement
Matters%20https://ignite.apache.org/blog/schema-design-for-distributed-systems-ai3.html">Twitter</a>
+ </p>
+ </section>
<div class="blog__content">
<main class="blog_main">
<section class="blog__posts">
<article class="post">
- <div class="post__header">
- <h2><a
href="/blog/getting-to-know-apache-ignite-3.html">Getting to Know Apache Ignite
3: A Schema-Driven Distributed Computing Platform</a></h2>
- <div>
- November 11, 2025 by Michael Aglietti. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/getting-to-know-apache-ignite-3.html">Facebook</a><span>,
</span
- ><a href="http://twitter.com/home?status=Getting to Know
Apache Ignite 3: A Schema-Driven Distributed Computing
Platform%20https://ignite.apache.org/blog/getting-to-know-apache-ignite-3.html">Twitter</a>
- </div>
- </div>
- <div class="post__content">
- <p>
- Apache Ignite 3 is a memory-first distributed SQL database
platform that consolidates transactions, analytics, and compute workloads
previously requiring separate systems. Built from the ground up, it represents
a complete
- departure from traditional caching solutions toward a
unified distributed computing platform with microsecond latencies and
collocated processing capabilities.
- </p>
- </div>
- <div class="post__footer"><a class="more"
href="/blog/getting-to-know-apache-ignite-3.html">↓ Read all</a></div>
- </article>
- <article class="post">
- <div class="post__header">
- <h2><a href="/blog/whats-new-in-apache-ignite-3-1.html">Apache
Ignite 3.1: Performance, Multi-Language Client Support, and Production
Hardening</a></h2>
- <div>
- November 3, 2025 by Evgeniy Stanilovskiy. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/whats-new-in-apache-ignite-3-1.html">Facebook</a><span>,
</span
- ><a href="http://twitter.com/home?status=Apache Ignite 3.1:
Performance, Multi-Language Client Support, and Production
Hardening%20https://ignite.apache.org/blog/whats-new-in-apache-ignite-3-1.html">Twitter</a>
- </div>
- </div>
- <div class="post__content">
+ <div>
+ <p>Discover how Apache Ignite 3 keeps related data together
with schema-driven colocation, cutting cross-node traffic and making
distributed queries fast, local and predictable.</p>
+ <!-- end -->
+ <h3>Schema Design for Distributed Systems: Why Data Placement
Matters</h3>
<p>
- Apache Ignite 3.1 improves the three areas that matter most
when running distributed systems: performance at scale, language flexibility,
and operational visibility. The release also fixes hundreds of bugs related to
data
- corruption, race conditions, and edge cases discovered since
3.0.
+ You can scale out your database, add more nodes, and tune
every index, but if your data isn’t in the right place, performance still hits
a wall. Every distributed system eventually runs into this: joins that cross the
+ network, caches that can’t keep up, and queries that feel
slower the larger your cluster gets.
</p>
- </div>
- <div class="post__footer"><a class="more"
href="/blog/whats-new-in-apache-ignite-3-1.html">↓ Read all</a></div>
- </article>
- <article class="post">
- <div class="post__header">
- <h2><a href="/blog/whats-new-in-apache-ignite-3-0.html">What's
New in Apache Ignite 3.0</a></h2>
- <div>
- February 24, 2025 by Stanislav Lukyanov. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/whats-new-in-apache-ignite-3-0.html">Facebook</a><span>,
</span
- ><a href="http://twitter.com/home?status=What's New in
Apache Ignite
3.0%20https://ignite.apache.org/blog/whats-new-in-apache-ignite-3-0.html">Twitter</a>
- </div>
- </div>
- <div class="post__content">
- <p>
- Apache Ignite 3.0 is the latest milestone in Apache Ignite
evolution that enhances developer experience, platform resilience, and
efficiency. In this article, we’ll explore the key new features and
improvements in Apache
- Ignite 3.0.
- </p>
- </div>
- <div class="post__footer"><a class="more"
href="/blog/whats-new-in-apache-ignite-3-0.html">↓ Read all</a></div>
- </article>
- <article class="post">
- <div class="post__header">
- <h2><a href="/blog/apache-ignite-2-17-0.html">Apache Ignite
2.17 Release: What’s New</a></h2>
- <div>
- February 13, 2025 by Nikita Amelchev. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/apache-ignite-2-17-0.html">Facebook</a><span>,
</span
- ><a href="http://twitter.com/home?status=Apache Ignite 2.17
Release: What’s
New%20https://ignite.apache.org/blog/apache-ignite-2-17-0.html">Twitter</a>
- </div>
- </div>
- <div class="post__content">
<p>
- We are happy to announce the release of <a
href="https://ignite.apache.org/">Apache Ignite </a>2.17.0! In this latest
version, the Ignite community has introduced a range of new features and
improvements to deliver a more
- efficient, flexible, and future-proof platform. Below, we’ll
cover the key highlights that you can look forward to when upgrading to the new
release.
+ Most distributed SQL databases claim to solve scalability.
They partition data evenly, replicate it across nodes, and promise linear
performance. But <em>how</em> data is distributed and <em>which</em> records
end up
+ together matters more than most people realize. If related
data lands on different nodes, every query has to travel the network to fetch
it, and each millisecond adds up.
</p>
- </div>
- <div class="post__footer"><a class="more"
href="/blog/apache-ignite-2-17-0.html">↓ Read all</a></div>
- </article>
- <article class="post">
- <div class="post__header">
- <h2><a
href="/blog/apache-ignite-net-intel-cet-fix.html">Ignite on .NET 9 and Intel
CET</a></h2>
- <div>
- November 22, 2024 by Pavel Tupitsyn. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/apache-ignite-net-intel-cet-fix.html">Facebook</a><span>,
</span
- ><a href="http://twitter.com/home?status=Ignite on .NET 9
and Intel
CET%20https://ignite.apache.org/blog/apache-ignite-net-intel-cet-fix.html">Twitter</a>
- </div>
- </div>
- <div class="post__content">
- <p>Old JDK code meets new Intel security feature, JVM + CLR in
one process, and a mysterious crash.</p>
- <p><a href="https://ptupitsyn.github.io/Ignite-on-NET-9/">Read
More...</a></p>
- </div>
- </article>
- <article class="post">
- <div class="post__header">
- <h2><a href="/blog/apache-ignite-2-16-0.html">Apache Ignite
2.16.0: Cache dumps, Calcite engine stabilization, JDK 14+ bug fixes</a></h2>
- <div>
- December 25, 2023 by Nikita Amelchev. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/apache-ignite-2-16-0.html">Facebook</a><span>,
</span
- ><a href="http://twitter.com/home?status=Apache Ignite
2.16.0: Cache dumps, Calcite engine stabilization, JDK 14+ bug
fixes%20https://ignite.apache.org/blog/apache-ignite-2-16-0.html">Twitter</a>
- </div>
- </div>
- <div class="post__content">
<p>
- As of December 25, 2023, <a
href="https://ignite.apache.org/">Apache Ignite </a>2.16 has been released. You
can directly check the full list of resolved <a
href="https://s.apache.org/j3brc">Important JIRA tasks </a>but
- let's briefly overview some valuable improvements.
+ That’s where <strong>data placement</strong> becomes the
real scaling strategy. Apache Ignite 3 takes a different path with
<strong>schema-driven colocation</strong> — a way to keep related data
physically together.
+ Instead of spreading rows randomly across nodes, Ignite uses
your schema relationships to decide where data lives. The result: a 200 ms
cross-node query becomes a 5 ms local read.
</p>
- <h3 id="cache-dumps">Cache dumps</h3>
+ <hr />
+ <h3>How Ignite 3 Differs from Other Distributed Databases</h3>
+ <p><strong>Traditional Distributed SQL Databases:</strong></p>
+ <ul>
+ <li>Hash-based partitioning ignores data relationships</li>
+ <li>Related data scattered across nodes by default</li>
+ <li>Cross-node joins create network bottlenecks</li>
+ <li>Millisecond latencies due to disk-first architecture</li>
+ </ul>
+ <p><strong>Ignite 3 Schema-Driven Approach:</strong></p>
+ <ul>
+ <li>Colocation configuration in schema definitions</li>
+ <li>Related data automatically placed together</li>
+ <li>Local queries eliminate network overhead</li>
+ <li>Microsecond latencies through memory-first storage</li>
+ </ul>
+ <hr />
+ <h3>The Distributed Data Placement Problem</h3>
+ <p>You’ve tuned indexes, optimized queries, and scaled your
cluster—but latency still creeps in. The problem isn’t your SQL — it’s where
your data lives.</p>
<p>
- Ignite has persistent cache <a
href="https://ignite.apache.org/docs/latest/snapshots/snapshots">snapshots
</a>and this feature is highly appreciated by Ignite users. This release
introduces another way to make a copy of
- user data - a cache dump.
+ Traditional hash-based partitioning distributes records
randomly across nodes based on primary key values. While this ensures even data
distribution, it scatters related records that applications frequently access
+ together. It’s a clever approach — until you need to join
data that doesn’t share the same key. Then every query turns into a distributed
operation, and your network becomes the bottleneck.
</p>
<p>
- The cache dump is essentially a file that contains all
entries of a cache group at the time of dump creation. Dump is consistent like
a snapshot, which means all entries that existed in the cluster at the moment
of dump
- creation will be included in the dump file. Meta information
of dumped caches and binary meta are also included in the dump.
+ Ignite 3 provides automatic colocation based on schema
relationships. You define relationships directly in your schema, and Ignite
automatically places related data on the same nodes using the specified
colocation keys.
</p>
- <p>Main differences from cache snapshots:</p>
- <ul>
- <li>Supports in-memory caches that a snapshot feature does
not support.</li>
- <li>Takes up less disk space. The dump contains only the
cache entries as-is.</li>
- <li>Can be used for offline data processing.</li>
- </ul>
- </div>
- <div class="post__footer"><a class="more"
href="/blog/apache-ignite-2-16-0.html">↓ Read all</a></div>
- </article>
- <article class="post">
- <div class="post__header">
- <h2><a
href="/blog/apache-ignite-net-dynamic-linq.html">Dynamic LINQ performance and
usability with Ignite.NET and System.Linq.Dynamic</a></h2>
- <div>
- May 22, 2023 by Pavel Tupitsyn. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/apache-ignite-net-dynamic-linq.html">Facebook</a><span>,
</span
- ><a href="http://twitter.com/home?status=Dynamic LINQ
performance and usability with Ignite.NET and
System.Linq.Dynamic%20https://ignite.apache.org/blog/apache-ignite-net-dynamic-linq.html">Twitter</a>
- </div>
- </div>
- <div class="post__content">
- <p>Dynamically building database queries can be necessary for
some use cases, such as UI-defined filtering. This can get challenging with
LINQ frameworks like EF Core and Ignite.NET.</p>
- <p><a
href="https://ptupitsyn.github.io/Dynamic-LINQ-With-Ignite/">Read
More...</a></p>
- </div>
- </article>
- <article class="post">
- <div class="post__header">
- <h2><a href="/blog/apache-ignite-2-13-0.html">Apache Ignite
2.13.0: new Apache Calcite-based SQL engine</a></h2>
- <div>
- April 28, 2022 by Nikita Amelchev. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/apache-ignite-2-13-0.html">Facebook</a><span>,
</span
- ><a href="http://twitter.com/home?status=Apache Ignite
2.13.0: new Apache Calcite-based SQL
engine%20https://ignite.apache.org/blog/apache-ignite-2-13-0.html">Twitter</a>
- </div>
- </div>
- <div class="post__content">
+ <p>Using a <a
href="https://github.com/maglietti/ignite3-chinook-demo">music catalog
example</a>, we’ll demonstrate how schema-driven data placement reduces query
latency from 200 ms to 5 ms.</p>
+ <blockquote>
+ <p>
+ This post assumes you have a basic understanding of how to
get an Ignite 3 cluster running and have worked with the Ignite 3 Java API. If
you’re new to Ignite 3, start with the
+ <a
href="https://ignite.apache.org/docs/ignite3/latest/quick-start/java-api">Java
API quick start guide</a> to set up your development environment.
+ </p>
+ </blockquote>
+ <hr />
+ <h3>How Ignite 3 Places Data Differently</h3>
<p>
- As of April 26, 2022, <a
href="https://ignite.apache.org/">Apache Ignite</a> 2.13 has been released. You
can directly check the full list of resolved <a
href="https://s.apache.org/x8u49">Important JIRA tasks</a> but here
- let's briefly overview some valuable improvements.
+ Tables are distributed across multiple nodes using
consistent hashing, but with a key difference: your schema definitions control
data placement. Instead of accepting random distribution of related records,
you declare
+ relationships in your schema and let Ignite handle placement
automatically.
</p>
- <h4>This is a breaking change release: The legacy service grid
implementation was removed.</h4>
- <h3 id="new-apache-calcite-based-sql-engine">New Apache
Calcite-based SQL engine</h3>
- <p>We've implemented a new experimental SQL engine based
on Apache Calcite. Now it's possible to:</p>
+ <p><strong>Partitioning Fundamentals:</strong></p>
<ul>
- <li>Get rid of some <a
href="https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine#IEP37:Newqueryexecutionengine-Motivation">H2
limitations</a>;</li>
- <li><a
href="https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine#IEP37:Newqueryexecutionengine-Implementationdetails">Optimize</a>
some query execution.</li>
+ <li>Each table is divided into partitions (typically 64–1024
per table)</li>
+ <li>Primary key hash determines which partition data goes
into</li>
+ <li>Partitions are distributed evenly across available
nodes</li>
+ <li>Each partition has configurable replicas for fault
tolerance</li>
</ul>
- <p>The current H2-based engine has fundamental limitations.
For example:</p>
+ <p><strong>Data Placement Concepts:</strong></p>
<ul>
- <li>some queries should be splitted into 2 phases (map
subquery and reduce subquery), but some of them cannot be effectively executed
in 2 phases.</li>
- <li>H2 is a third-party database product with not-ASF
license.</li>
- <li>The optimizer and other internal things are not supposed
to work in a distributed environment.</li>
- <li>It's hard to make Ignite-specific changes to the H2
code, patches are often declined.</li>
+ <li><strong>Affinity</strong> – the algorithm that
determines which nodes store which partitions</li>
+ <li><strong>Colocation</strong> – ensuring related data from
different tables gets placed on the same nodes</li>
</ul>
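+ <p>To make the affinity idea concrete, here is a minimal sketch of a partition-to-node mapping. It assumes a simple round-robin rule, which is a stand-in for illustration only; Ignite's real affinity function is more sophisticated. The class <code>AffinitySketch</code> and the method <code>nodesFor</code> are hypothetical names, and node indexes are zero-based.</p>

```java
import java.util.ArrayList;
import java.util.List;

class AffinitySketch {
    /**
     * Returns the node indexes that hold the given partition: the primary first,
     * then the backups. Round-robin placement is an assumption made for illustration.
     */
    static List<Integer> nodesFor(int partition, int nodeCount, int replicas) {
        if (replicas > nodeCount) {
            throw new IllegalArgumentException("replicas cannot exceed node count");
        }
        List<Integer> nodes = new ArrayList<>();
        for (int r = 0; r < replicas; r++) {
            // Each extra replica goes to the next node in the ring.
            nodes.add((partition + r) % nodeCount);
        }
        return nodes;
    }

    public static void main(String[] args) {
        int nodeCount = 3;
        int replicas = 2;
        for (int partition = 0; partition < 12; partition++) {
            System.out.println("P" + partition + " -> nodes " + nodesFor(partition, nodeCount, replicas));
        }
    }
}
```

+ <p>With three nodes and twelve partitions, node index 0 ends up as the primary for partitions 0, 3, 6, and 9, and each partition gets one backup on the neighbouring node.</p>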
- </div>
- <div class="post__footer"><a class="more"
href="/blog/apache-ignite-2-13-0.html">↓ Read all</a></div>
- </article>
- <article class="post">
- <div class="post__header">
- <h2><a href="/blog/apache-ignite-2-12-0.html">Apache Ignite
2.12.0: CDC, Index Query API, Vulnerabilities Fixes</a></h2>
- <div>
- January 14, 2022 by Nikita Amelchev. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/apache-ignite-2-12-0.html">Facebook</a><span>,
</span
- ><a href="http://twitter.com/home?status=Apache Ignite
2.12.0: CDC, Index Query API, Vulnerabilities
Fixes%20https://ignite.apache.org/blog/apache-ignite-2-12-0.html">Twitter</a>
- </div>
- </div>
- <div class="post__content">
<p>
- As of January 14, 2022, <a
href="https://ignite.apache.org/">Apache Ignite</a> 2.12 has been released. You
can directly check the full list of resolved <a
href="https://s.apache.org/0zyi2">Important JIRA tasks</a> but here
- let’s briefly overview some valuable improvements.
+ The diagram below shows how colocation works in practice.
The Artist and Album tables use different primary keys, but the colocation strategy
ensures albums are partitioned by <code>ArtistId</code> rather than
+ <code>AlbumId</code>:
</p>
- <h3 id="vulnerability-updates">Vulnerability Updates</h3>
+ <pre class="mermaid">
+%%{init: {
+ "themeVariables": { "fontSize": "24px" },
+ "flowchart": { "htmlLabels": true, "useMaxWidth": true, "nodeSpacing": 90,
"rankSpacing": 90, "diagramPadding": 28 }
+}}%%
+graph TB
+ subgraph "3-Node Cluster"
+ N1["Node 1 Partitions: 0,3,6,9"]
+ N2["Node 2 Partitions: 1,4,7,10"]
+ N3["Node 3 Partitions: 2,5,8,11"]
+ end
+
+ subgraph "Album Table - colocateBy ArtistId"
+ B1["AlbumId=101, ArtistId=1 hash(1)→P1 Node 2"]
+ B2["AlbumId=102, ArtistId=1 hash(1)→P1 Node 2"]
+ B3["AlbumId=201, ArtistId=22 hash(22)→P11 Node 3"]
+ B4["AlbumId=301, ArtistId=42 hash(42)→P6 Node 1"]
+ end
+
+ subgraph "Artist Table"
+ A1["ArtistId=1 hash→P1 Node 2"]
+ A2["ArtistId=22 hash→P11 Node 3"]
+ A3["ArtistId=42 hash→P6 Node 1"]
+ end
+
+ A1 -.->|"Same partition"| B1 -.-> N2
+ A1 -.->|"Same partition"| B2 -.-> N2
+ A2 -.->|"Same partition"| B3 -.-> N3
+ A3 -.->|"Same partition"| B4 -.-> N1
+</pre
+ >
<p>
- The Apache Ignite versions lower than 2.11.1 are vulnerable
to <a
href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-44832">CVE-2021-44832</a>
which is related to the <code>ignite-log4j2</code> module usage.
+ Colocation configuration in your schema ensures that Album
records use the <code>ArtistId</code> value (not <code>AlbumId</code>) for
partition assignment. This guarantees that Artist 1 and all albums with
+ <code>ArtistId = 1</code> hash to the same partition and
therefore live on the same nodes.
</p>
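+ <p>The effect of choosing the colocation key can be sketched in a few lines of plain Java. This is an illustration under stated assumptions: <code>ColocationSketch</code> and <code>partitionFor</code> are hypothetical names, the hash is a simple modulo stand-in rather than Ignite's actual hash function, and the partition count of 12 is arbitrary.</p>

```java
import java.util.Objects;

class ColocationSketch {
    static final int PARTITIONS = 12;

    /** Maps a colocation key to a partition. Stand-in hash, not Ignite's actual function. */
    static int partitionFor(Object colocationKey) {
        return Math.floorMod(Objects.hashCode(colocationKey), PARTITIONS);
    }

    public static void main(String[] args) {
        int artistId = 1;
        int[] albumIds = {101, 102};

        System.out.println("Artist " + artistId + " -> P" + partitionFor(artistId));
        for (int albumId : albumIds) {
            // Colocated: the album row is placed by its ArtistId column.
            int colocated = partitionFor(artistId);
            // Not colocated: the album row would be placed by its own AlbumId.
            int byAlbumId = partitionFor(albumId);
            System.out.println("Album " + albumId + ": colocated -> P" + colocated
                    + ", by AlbumId -> P" + byAlbumId);
        }
    }
}
```

+ <p>Because every album row is hashed on the same <code>ArtistId</code> value as its artist row, both land in one partition, whereas hashing on <code>AlbumId</code> would spread the albums across unrelated partitions.</p>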
- <p>The release also fixes 10+ CVE’s of various modules.
See <a
href="https://ignite.apache.org/releases/ignite2/2.12.0/release_notes.html">release
notes</a> for more details.</p>
- <h3 id="change-data-capture">Change Data Capture</h3>
+ <hr />
+ <h3>Distribution Zones and Data Placement</h3>
+ <p>Distribution zones are cluster-level configurations that
define how data is distributed and replicated.</p>
+ <blockquote>
+ <p>
+ <strong>Zone Creation Options:</strong> Ignite 3 supports
multiple approaches:
+ <br />
+ 1. <strong>SQL DDL</strong> – <code>CREATE ZONE</code>
statements
+ <br />
+ 2. <strong>Java Builder API</strong> – programmatic
<code>ZoneDefinition.builder()</code>
+ <br />
+ <br />
+ We use the Java Builder API here for consistency with our
programmatic schema examples.
+ </p>
+ </blockquote>
+ <p>A distribution zone specifies:</p>
+ <ul>
+ <li><strong>Partition count</strong> – how many partitions
your data is divided into (typically 64–1024 per table)</li>
+ <li><strong>Replica count</strong> – how many copies of each
partition exist for fault tolerance</li>
+ <li><strong>Node filters</strong> – which nodes can store
data for this zone</li>
+ </ul>
+ <p>First, create the distribution zones:</p>
+ <pre><code>// Create the standard zone for frequently updated
data
+ignite.catalog().create(ZoneDefinition.builder("MusicStore")
+.replicas(2)
+.storageProfiles("default")
+.build()).execute();
+
+// Create the replicated zone for reference data (replicas = cluster size)
+ignite.catalog().create(ZoneDefinition.builder("MusicStoreReplicated")
+.replicas(ignite.clusterNodes().size())
+.storageProfiles("default")
+.build()).execute();
+</code></pre>
+ <hr />
+ <h3>Building Your Music Platform Schema</h3>
+ <blockquote>
+ <p>
+ <strong>Schema Creation:</strong> Ignite 3 supports three
approaches:
+ <br />
+ 1. <strong>SQL DDL</strong> – traditional <code>CREATE
TABLE</code> statements
+ <br />
+ 2. <strong>Java Annotations API</strong> – POJO markup
with <code>@Table</code>, <code>@Column</code>, etc.
+ <br />
+ 3. <strong>Java Builder API</strong> – programmatic
<code>TableDefinition.builder()</code>
+ <br />
+ <br />
+ We use the Annotations API here for its clarity and type
safety.
+ </p>
+ </blockquote>
+ <p>The Artist table establishes the partitioning strategy that
dependent tables will follow through colocation:</p>
+ <pre><code>@Table(zone = @Zone(value = "MusicStore",
storageProfiles = "default"))
+public class Artist {
+ @Id
+ @Column(value = "ArtistId", nullable = false)
+ private Integer ArtistId;
+
+ @Column(value = "Name", nullable = false, length = 120)
+ private String Name;
+
+ public Artist() {}
+
+ public Artist(Integer artistId, String name) {
+ this.ArtistId = artistId;
+ this.Name = name;
+ }
+
+ public Integer getArtistId() { return ArtistId; }
+ public void setArtistId(Integer artistId) { this.ArtistId = artistId; }
+ public String getName() { return Name; }
+ public void setName(String name) { this.Name = name; }
+}
+</code></pre>
+ <hr />
+ <h3>Parent–Child Colocation Implementation</h3>
+ <p>When users search for "The Beatles", they expect both
artist details and album listings in the same query. Without colocation, this
requires cross-node joins that can take 40–200 ms.</p>
+ <p>We solve this by setting <code>colocateBy</code> in
the <code>@Table</code> annotation:</p>
+ <pre><code>@Table(
+ zone = @Zone(value = "MusicStore", storageProfiles = "default"),
+ colocateBy = @ColumnRef("ArtistId")
+)
+public class Album {
+ @Id
+ @Column(value = "AlbumId", nullable = false)
+ private Integer AlbumId;
+
+ @Id
+ @Column(value = "ArtistId", nullable = false)
+ private Integer ArtistId;
+
+ @Column(value = "Title", nullable = false, length = 160)
+ private String Title;
+
+ @Column(value = "ReleaseDate", nullable = true)
+ private LocalDate ReleaseDate;
+
+ // Constructors and getters/setters...
+}
+</code></pre>
<p>
- Change Data Capture (<a
href="https://en.wikipedia.org/wiki/Change_data_capture">CDC</a>) is a data
processing pattern used to asynchronously receive entries that have been
changed on the local node so that action can be
- taken using the changed entry.
+ The colocation field (<code>ArtistId</code>) must be part of
the composite primary key. Ignite uses the <code>ArtistId</code> value to ensure
albums with the same artist live on the same nodes as their corresponding artist
+ record.
</p>
- </div>
- <div class="post__footer"><a class="more"
href="/blog/apache-ignite-2-12-0.html">↓ Read all</a></div>
- </article>
- <article class="post">
- <div class="post__header">
- <h2><a href="/blog/apache-ignite-2-11-1.html">Apache Ignite
2.11.1: Emergency Log4j2 Update</a></h2>
- <div>
- December 21, 2021 by Maxim Muzafarov. Share in <a
href="http://www.facebook.com/sharer.php?u=https://ignite.apache.org/blog/apache-ignite-2-11-1.html">Facebook</a><span>,
</span
- ><a href="http://twitter.com/home?status=Apache Ignite
2.11.1: Emergency Log4j2
Update%20https://ignite.apache.org/blog/apache-ignite-2-11-1.html">Twitter</a>
- </div>
- </div>
- <div class="post__content">
+ <hr />
+ <h3>Performance Impact: Memory-First + Colocation</h3>
+ <p>Let’s quantify the effect of combining memory-first storage
with schema-driven colocation.</p>
+ <p><strong>Without Colocation – Data Scattered:</strong></p>
+ <pre><code>Artist artist = artistView.get(null, artistKey);
// Node 2
+Collection&lt;Album&gt; albums = albumView.getAll(null, albumKeys); // Nodes 1,2,3
+// Result: 3 network operations for related data
+// Query time: 40–200 ms (network latency × nodes involved)
+</code></pre>
+ <p><strong>With Memory-First + Colocation – Data
Local:</strong></p>
+ <pre><code>Artist artist = artistView.get(null, artistKey);
// Node 2
+Collection&lt;Album&gt; albums = albumView.getAll(null, albumKeys); // Node 2
+// Result: 1 node involved, local memory access
+// Query time: 1–5 ms (memory access + no network hops)
+</code></pre>
+ <ul>
+ <li><strong>Query latency reduction:</strong> 200 ms → 5 ms
(memory access + no network hops)</li>
+ <li><strong>Network traffic elimination:</strong> related
data queries become local operations</li>
+ <li><strong>Resource efficiency:</strong> CPU focuses on
serving requests instead of moving data</li>
+ </ul>
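+ <p>The difference in network fan-out between the two cases can be approximated with a small model. This sketch is an assumption-laden illustration, not a benchmark: <code>FanOutSketch</code>, <code>nodeFor</code>, and <code>nodesTouched</code> are hypothetical names, placement uses a modulo stand-in for Ignite's real hashing and affinity, and the cluster is fixed at 12 partitions on 3 nodes.</p>

```java
import java.util.Set;
import java.util.TreeSet;

class FanOutSketch {
    static final int PARTITIONS = 12;
    static final int NODES = 3;

    /** Stand-in placement: key -> partition -> node (not Ignite's real functions). */
    static int nodeFor(int key) {
        int partition = Math.floorMod(Integer.hashCode(key), PARTITIONS);
        return partition % NODES;
    }

    /** Distinct nodes that a lookup of rows with these placement keys has to contact. */
    static Set<Integer> nodesTouched(int[] placementKeys) {
        Set<Integer> nodes = new TreeSet<>();
        for (int key : placementKeys) {
            nodes.add(nodeFor(key));
        }
        return nodes;
    }

    public static void main(String[] args) {
        int[] albumIds = {101, 102, 103}; // three albums by the same artist
        int[] artistIds = {1, 1, 1};      // their shared ArtistId

        System.out.println("Placed by AlbumId:  nodes " + nodesTouched(albumIds));
        System.out.println("Placed by ArtistId: nodes " + nodesTouched(artistIds));
    }
}
```

+ <p>In this model, three albums placed by <code>AlbumId</code> require contacting three nodes, while the same albums placed by <code>ArtistId</code> require contacting one.</p>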
+ <hr />
+ <h3>Colocation Enables Compute-to-Data Processing</h3>
+ <p>Schema-driven colocation doesn’t just optimize queries—it
enables processing where data lives:</p>
+ <pre><code>// Target the node that owns the partition for this artist's key
+JobTarget target = JobTarget.colocated("Artist",
+Tuple.create().set("ArtistId", artistId));
+
+// Describe the job class that processes the artist's albums
+var job = JobDescriptor.builder(AlbumRecommendationJob.class).build();
+
+// Runs on the same node where artist and album data live
+CompletableFuture&lt;RecommendationResult&gt; result = ignite.compute()
+.executeAsync(target, job, preferences);
+
+</code></pre>
+ <p>Instead of moving gigabytes of album data to a compute
cluster, you move kilobytes of logic to where the data already resides.</p>
+ <hr />
+ <h3>Implementation Guide</h3>
+ <p>Deploy tables in dependency order to avoid colocation
reference errors:</p>
+ <pre><code>try (IgniteClient client = IgniteClient.builder()
+ .addresses("127.0.0.1:10800")
+ .build()) {
+
+ // 1. Reference tables with no dependencies
+ client.catalog().createTable(Genre.class);
+
+ // 2. Root entities
+ client.catalog().createTable(Artist.class);
+
+ // 3. Dependent entities in hierarchy order
+ client.catalog().createTable(Album.class); // References Artist
+ client.catalog().createTable(Track.class); // References Album
+}
+</code></pre>
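+ <p>When the schema grows beyond a handful of tables, the dependency order can be derived rather than maintained by hand. The following is a minimal sketch using only the Java standard library; <code>CreationOrderSketch</code> and <code>creationOrder</code> are hypothetical names, and the input map (table name to the table it references, or <code>null</code> for a root) is an assumed representation, not something the Ignite API provides.</p>

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CreationOrderSketch {
    /** Orders tables so that every referenced (parent) table comes before its dependents. */
    static List<String> creationOrder(Map<String, String> parentOf) {
        List<String> order = new ArrayList<>();
        Set<String> visiting = new HashSet<>();
        for (String table : parentOf.keySet()) {
            visit(table, parentOf, order, visiting);
        }
        return order;
    }

    private static void visit(String table, Map<String, String> parentOf,
                              List<String> order, Set<String> visiting) {
        if (order.contains(table)) {
            return; // already scheduled
        }
        if (!visiting.add(table)) {
            throw new IllegalStateException("Cyclic dependency at " + table);
        }
        String parent = parentOf.get(table);
        if (parent != null) {
            visit(parent, parentOf, order, visiting); // parent first
        }
        visiting.remove(table);
        order.add(table);
    }

    public static void main(String[] args) {
        Map<String, String> parentOf = new LinkedHashMap<>();
        parentOf.put("Track", "Album");
        parentOf.put("Album", "Artist");
        parentOf.put("Artist", null);
        parentOf.put("Genre", null);

        System.out.println(creationOrder(parentOf));
    }
}
```

+ <p>Given the tables above in arbitrary order, the sketch schedules Artist before Album and Album before Track, which is the order the <code>createTable</code> calls need.</p>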
+ <hr />
+ <h3>Accessing Your Distributed Data</h3>
+ <p>Ignite 3 provides multiple views of the same colocated
data:</p>
+ <pre><code>// RecordView for entity operations
+RecordView&lt;Artist&gt; artists = client.tables()
+.table("Artist")
+.recordView(Artist.class);
+RecordView&lt;Album&gt; albums = client.tables()
+.table("Album")
+.recordView(Album.class);
+
+// Operations with partition keys route to single nodes
+Artist beatles = new Artist(1, "The Beatles");
+artists.upsert(null, beatles);
+
+Album abbeyRoad = new Album(1, 1, "Abbey Road", LocalDate.of(1969, 9, 26));
+albums.upsert(null, abbeyRoad); // Automatically colocated with artist
+
+</code></pre>
+ <hr />
+ <h3>Summary</h3>
<p>
- The new <a href="https://ignite.apache.org/">Apache
Ignite</a> 2.11.1 is an emergency release that fixes <a
href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-44228">CVE-2021-44228</a>,
- <a
href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-45046">CVE-2021-45046</a>,<a
href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-45105">CVE-2021-45105</a>
related to the ignite-log4j2 module
- usage.
+ Data placement is where distributed performance is won or
lost. With <strong>schema-driven colocation</strong>, Apache Ignite 3 keeps
related data together on the same nodes, so your queries stay local, fast, and
+ predictable.
</p>
- <h3 id="apache-ignite-with-log4j-vulnerability">Apache Ignite
with Log4j Vulnerability</h3>
- <p>All the following conditions must be met:</p>
+ <p>Instead of tuning around network latency, you design for it
once at the schema level. Your joins stay local, your compute jobs run where
the data lives, and scaling stops being a tradeoff between performance and
size.</p>
<ul>
- <li>The Apache Ignite version lower than 2.11.0 is used
(since these vulnerabilities are already fixed in 2.11.1, 2.12, and upper
versions);</li>
- <li>The <code>ignite-logj42</code> is used by Apache Ignite
and located in the <code>libs</code> directory (by default it is located in the
<code>libs/optional</code>directory, so these deployments are not
affected);</li>
- <li>
- The Java version in use is older than the following
versions: <code>8u191</code>, <code>11.0.1</code>. This is due to the fact that
later versions set the JVM property
- <code>com.sun.jndi.ldap.object.trustURLCodebase</code> to
<code>false</code> by default, which disables JNDI loading of classes from
arbitrary URL code bases.
- </li>
+ <li><strong>Memory-first + colocation</strong> → microsecond
access to related data</li>
+ <li><strong>Schema-driven placement</strong> → predictable
performance at scale</li>
+ <li><strong>Compute-to-data</strong> → logic runs with data,
not across the network</li>
+ <li><strong>Unified platform</strong> → transactions,
analytics, and compute together</li>
</ul>
- <p>NOTE: Relying only on the Java version as a protection
against these vulnerabilities is very risky and has not been tested.</p>
+ <p>When data lives together, your system scales naturally —
without complexity creeping in.</p>
+ <p>Explore the <a
href="https://ignite.apache.org/docs/ignite3/latest/">Ignite 3
documentation</a> for detailed examples and API references.</p>
</div>
- <div class="post__footer"><a class="more"
href="/blog/apache-ignite-2-11-1.html">↓ Read all</a></div>
</article>
- </section>
- <section class="blog__footer">
- <ul class="pagination">
- <li><a class="current" href="/blog/">1</a></li>
- <li><a class="item" href="/blog/1/">2</a></li>
- <li><a class="item" href="/blog/2/">3</a></li>
- </ul>
+ <section class="blog__footer">
+ <ul class="pagination post_page">
+ <li><a href="/blog/apache">apache</a></li>
+ <li><a href="/blog/ignite">ignite</a></li>
+ </ul>
+ </section>
</section>
</main>
<aside class="blog__sidebar">
@@ -658,5 +748,14 @@
<script src="/js/vendor/hystmodal/hystmodal.min.js"></script>
<script src="/js/vendor/smoothscroll.js"></script>
<script src="/js/main.js?ver=0.9"></script>
+ <script
src="https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.min.js"></script>
+ <script>
+ (function () {
+ mermaid.initialize({ startOnLoad: false, securityLevel: 'strict' });
+ if (document.querySelector('.mermaid')) {
+ mermaid.run({ querySelector: '.mermaid' });
+ }
+ })();
+ </script>
</body>
</html>