This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new a9220c2056b [blog](update) Update MiniMax blog (#1067)
a9220c2056b is described below

commit a9220c2056babd48360d5e8d76d94f2ee0ddd408
Author: KassieZ <[email protected]>
AuthorDate: Fri Aug 30 14:09:33 2024 +0800

    [blog](update) Update MiniMax blog (#1067)
    
    # Versions
    
    - [ ] dev
    - [ ] 3.0
    - [ ] 2.1
    - [ ] 2.0
    - [x] blog
    # Languages
    
    - [ ] Chinese
    - [x] English
---
 ...d-built-a-pb-scale-logging-system-with-doris.md | 145 +++++++++++++++++++++
 blog/apache-doris-vs-rockset.md                    |   2 -
 blog/auto-partition-in-apache-doris.md             |   2 +-
 blog/migrate-lakehouse-from-bigquery-to-doris.md   |   2 +-
 blog/release-note-3.0.1.md                         |   2 +-
 .../gettingStarted/demo-block/latest.tsx           |  10 +-
 gettingStarted/demo-block/latest.tsx               |   7 +-
 src/components/recent-blogs/recent-blogs.data.ts   |   9 +-
 src/constant/newsletter.data.ts                    |  14 +-
 .../images/apache-doris-based-logging-system.png   | Bin 0 -> 126955 bytes
 .../images/minimax-migrated-from-loki-to-doris.png | Bin 0 -> 704790 bytes
 .../the-old-grafana-Loki-based-logging-system.png  | Bin 0 -> 116103 bytes
 static/images/why-Apache-Doris.png                 | Bin 0 -> 270135 bytes
 13 files changed, 166 insertions(+), 27 deletions(-)

diff --git a/blog/ai-unicorn-minimax-from-loki-and-built-a-pb-scale-logging-system-with-doris.md b/blog/ai-unicorn-minimax-from-loki-and-built-a-pb-scale-logging-system-with-doris.md
new file mode 100644
index 00000000000..0848e2f8aca
--- /dev/null
+++ b/blog/ai-unicorn-minimax-from-loki-and-built-a-pb-scale-logging-system-with-doris.md
@@ -0,0 +1,145 @@
+---
+{
+    'title': 'How AI unicorn MiniMax migrated from Loki and built a PB-scale logging system with Apache Doris',
+    'summary': "Serving a PB-scale data size with over 99.9% availability, Apache Doris is the vital signs monitor of MiniMax, a generative AI startup backed by Alibaba.",
+    'description': "Serving a PB-scale data size with over 99.9% availability, Apache Doris is the vital signs monitor of MiniMax, a generative AI startup backed by Alibaba.",
+    'date': '2024-08-29',
+    'author': 'Apache Doris',
+    'tags': ['Best Practice'],
+    'picked': "true",
+    'order': "1",
+    "image": '/images/minimax-migrated-from-loki-to-doris.png'
+}
+
+---
+
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+MiniMax, [a generative AI startup backed by Alibaba, Tencent, miHoYo, etc.](https://fortune.com/asia/2024/03/05/alibaba-leads-financing-round-chinese-ai-startup-minimax/), has been investing most of its efforts in MoE (Mixture of Experts) since before it became an industry consensus. In April 2024, MiniMax launched its first commercially deployed MoE-based LLM, **MiniMax-abab 6.5**, which contains over a trillion parameters and delivers performance comparable to GPT-4, Claude-3, and Gemini-1.5.
+
+As their LLM is getting more complex and called upon more frequently, it 
generates an exploding amount of logs from model training and inference. These 
logs provide the basis for performance monitoring, optimization, and 
troubleshooting. The existing Grafana Loki-based logging system of MiniMax 
faced performance and stability issues, so they planned for an upgrade. After 
looking at the common industry solutions, they came to Apache Doris.
+
+**Now, all of MiniMax's business lines have been integrated with the Apache 
Doris-based logging system, which serves a PB-scale data size with over 99.9% 
availability. The query latency on 100 million logs is within seconds.**
+
+## The old Grafana Loki-based logging system
+
+Loki, an open-source log aggregation system developed by the Grafana Labs team, was inspired by Prometheus. It does not index the log content itself, but instead builds indexes only on log labels and metadata.
+
+The major components of a Loki-based system typically include:
+
+- **Loki**: the main server responsible for log storage and querying.
+
+- **Promtail**: the agent layer for collecting logs and sending them to Loki.
+
+- **Grafana**: for user interface visualization.
+
+To deploy Grafana Loki, each cluster should be deployed with a complete set of 
log collectors and Loki log storage/query services. 
+
+Loki uses an Index + Chunk design for log storage, where during ingestion, the 
different log streams are dispersed across various Ingesters based on a hash of 
the log labels, and the Ingesters are responsible for writing the log data to 
object storage. During querying, the Querier retrieves the relevant Chunks from 
the object storage based on the Index, and then performs the log matching.
+
+![The old Grafana Loki-based logging system](/images/the-old-grafana-Loki-based-logging-system.png)
+
+Although Grafana Loki is positioned as a lightweight, horizontally scalable, 
and highly available log management system, it still faces some challenges in 
practical business use:
+
+- **Excessive query resource consumption**: Loki does not create indexes based 
on the log, but instead, it only performs preliminary filtering of logs at the 
label granularity. Thus, for searches on the logs, it applies the query 
mechanism to perform full-text regular expression matching on the entire log 
data set. This operation can lead to spikes in resource consumption, including 
CPU, memory, and network bandwidth. As the volume of data being queried and the 
query per second (QPS) inc [...]
+
+- **Complex architecture**: In addition to the modules shown in the above 
diagram, Loki also includes components like the Index Gateway, Memcache, and 
Compactor. The large number of architectural components makes the system 
challenging to operate and manage, and complex to configure.
+
+- **High maintenance cost and difficulty**: MiniMax has a large number of 
deployed clusters, and each cluster has differences in its system, resources, 
storage, and network environments. The need to deploy an independent Loki 
architecture in each cluster adds to the maintenance difficulty.
+
+## Why Apache Doris
+
+AI is one of the most data-intensive industries: its use cases are characterized by long processing pipelines, abundant contextual data, and large per-request data volumes. Thus, the volume of logs that MiniMax generates far exceeds that of non-AI software products with the same user base. This gigantic log volume requires MiniMax's logging system to be:
+
+- **High-performance**: They need the system to return query results on 100 
million log entries within seconds.
+
+- **Flexible**: The system should support log alerting and log metric queries, 
such as generating statistical trend lines for key terms.
+
+- **Low-cost**: The petabyte-scale raw log data continues to grow, so it's a 
make-or-break factor to keep the storage and computational costs within 
reasonable bounds.
+
+After an evaluation of mature logging system architectures in the industry, 
MiniMax identified the following key components typically found in leading log 
management solutions: 
+
+- **Collection agent**: collecting logs from service standard outputs and 
pushing the data into a central message queue.
+
+- **Message queue**: decoupling upstream and downstream components, absorbing 
spikes, and ensuring system stability even when downstream components are 
unavailable.
+
+- **Storage and query middleware**: storing and querying the log data. In a 
logging system, this middleware should be capable of inverted indexing to 
support efficient log searches.
+
+Based on this reference architecture, MiniMax decided to use iLogtail as the collection agent, Apache Kafka as the message queue, and **[Apache Doris](https://doris.apache.org) as the storage and query middleware**. The middleware decision was made after comparing Apache Doris and Elasticsearch, two representative technologies.
+
+![Why Apache Doris](/images/why-Apache-Doris.png)
+
+Apache Doris shows competitiveness in cost and performance. It stands out 
particularly in storage efficiency, write throughput, and aggregation. 
Additionally, its compatibility with the MySQL syntax makes it more 
user-friendly.
+
+## Apache Doris-based logging system
+
+![Apache Doris-based logging system](/images/apache-doris-based-logging-system.png)
+
+The new logging system of MiniMax, called Mlogs, is more streamlined, with a 
single architecture serving all clusters. The upper layer acts as the control 
plane for the logging system, which consists of the encapsulation of log query 
interfaces and the module for automatic configuration generation and 
distribution. The lower layer represents the data plane of the logging system, 
containing the log collection agent, message queue, log writer, and the 
**Apache Doris** database.
+
+Logs generated by the cluster services are collected by iLogtail and pushed to 
Kafka. Part of these logs is pulled from Kafka by the Mlogs Ingester and 
written to the Doris cluster via the Stream Load method of Apache Doris. The 
rest is directly subscribed to in real-time by Doris via Routine Load, pulling 
the message stream from Kafka. **Ultimately, Apache Doris handles the storage 
and querying of all log data, eliminating the need for separate deployments for 
each cluster.**
+
+## Hands-on experience from MiniMax
+
+**Log ingestion**
+
+The new architecture utilizes both the Routine Load and Stream Load methods of 
Apache Doris. Routine Load is ready to use out of the box and can directly 
handle JSON logs without the need for additional parsing. For more complex logs 
that require filtering and processing, MiniMax has introduced a log writer 
called Mlogs Ingester between Kafka and Doris. The Mlogs Ingester parses and 
processes the logs before writing them to Doris via Stream Load.
+
+**Log search**
+
+For log searches, MiniMax utilizes the inverted indexes and full-text regular 
expression query capabilities of Apache Doris.
+
+- The inverted index of Apache Doris fits into a wide range of use cases and 
delivers high query performance. It's mainly used in `MATCH` and `MATCH_PHRASE` 
queries.
+
+- Full-text regular expression query (`REGEXP`) provides higher precision but 
lower performance than token-based queries. It is suitable for smaller-scale 
queries where precision is critical.
+
+**Performance improvement**
+
+MiniMax implements **query truncation** to further accelerate queries. Log 
data is arranged linearly in chronological order. If a query requests data of a 
large range, it can consume excessive computation, storage, and network 
resources and potentially lead to query timeouts or even system unavailability. 
So they set and truncate the time range of the queries to prevent overly broad 
queries, and pre-calculate the data volume for all tables every 15 minutes to 
dynamically estimate the max [...]
+
+**Cost control**
+
+To cut down storage costs, MiniMax utilizes the **[tiered storage](https://doris.apache.org/docs/table-design/cold-hot-separation/)** capabilities of Apache Doris. They define data within the last 7 days as hot data and data older than 7 days as cold data. Data will be moved to object storage as soon as it turns cold. Furthermore, they archive object storage data that is over 30 days old and only restore the archived data when necessary.
+
+## Value to MiniMax
+
+Now, the Apache Doris-based logging system supports log data from all business lines within MiniMax, serving a **PB-scale data size** with over **99.9% availability**. It has also brought the following benefits to MiniMax:
+
+- **Simplified architecture**: The new system is easier to deploy and allows a 
single framework to serve all clusters. This reduces maintenance and management 
complexity, thus saving operational manpower and costs.
+
+- **Fast query response**: The new system can respond to keyword searches and 
aggregation queries from 1 billion log records within 2 seconds. Most log 
queries can return results within seconds, too.
+
+- **High write performance**: With the current hardware setups, the system can 
deliver a log write throughput of 10 GB/s, while maintaining data latency 
within seconds.
+
+- **Low storage costs**: The data compression ratio reaches 5:1 and tiered 
storage further reduces storage costs by 70%.
+
+## What's next
+
+After a successful initial experience with Apache Doris, MiniMax is proceeding with the next phase of its upgrade plan, which includes the following efforts:
+
+- **Log pre-processing**: introduce log sampling and structuring to improve 
data usability and storage efficiency.
+
+- **Tracing**: integrate the logging system with other observability systems 
(monitoring, alerting, tracing, etc.) to provide comprehensive operational 
insights.
+
+- **Lakehousing**: expand the use of Apache Doris to include big data processing and analysis within MiniMax, laying the foundation for a data lakehouse.
+
+If you have any questions or require assistance regarding Apache Doris, join the [community](https://join.slack.com/t/apachedoriscommunity/shared_invite/zt-2gmq5o30h-455W226d79zP3L96ZhXIoQ).
\ No newline at end of file
diff --git a/blog/apache-doris-vs-rockset.md b/blog/apache-doris-vs-rockset.md
index a7a443be944..a176b14afe8 100644
--- a/blog/apache-doris-vs-rockset.md
+++ b/blog/apache-doris-vs-rockset.md
@@ -6,8 +6,6 @@
     'date': '2024-06-24',
     'author': 'Zaki Lu',
     'tags': ['Top News'],
-    'picked': "true",
-    'order': "4",
     "image": '/images/doris-vs-rockset.jpeg'
 }
 
diff --git a/blog/auto-partition-in-apache-doris.md b/blog/auto-partition-in-apache-doris.md
index a3c74479dca..8ebd505ec35 100644
--- a/blog/auto-partition-in-apache-doris.md
+++ b/blog/auto-partition-in-apache-doris.md
@@ -7,7 +7,7 @@
     'author': 'Apache Doris',
     'tags': ['Tech Sharing'],
    'picked': "true",
-    'order': "2",
+    'order': "3",
     "image": '/images/auto-partition-in-apache-doris.jpg'
 }
 
diff --git a/blog/migrate-lakehouse-from-bigquery-to-doris.md b/blog/migrate-lakehouse-from-bigquery-to-doris.md
index 536ae4727a6..ee9abcd216b 100644
--- a/blog/migrate-lakehouse-from-bigquery-to-doris.md
+++ b/blog/migrate-lakehouse-from-bigquery-to-doris.md
@@ -7,7 +7,7 @@
     'author': 'Dien, Tran Thanh',
     'tags': ['Best Practice'],
     'picked': "true",
-    'order': "3",
+    'order': "4",
     "image": '/images/migrate-lakehouse-from-bigquery-to-apache-doris.jpg'
 }
 
diff --git a/blog/release-note-3.0.1.md b/blog/release-note-3.0.1.md
index 669c14c6d29..05aec267149 100644
--- a/blog/release-note-3.0.1.md
+++ b/blog/release-note-3.0.1.md
@@ -7,7 +7,7 @@
     'author': 'Apache Doris',
     'tags': ['Release Notes'],
     'picked': "true",
-    'order': "1",
+    'order': "2",
     "image": '/images/3.0.1.jpg'
 }
 ---
diff --git a/common_docs_zh/gettingStarted/demo-block/latest.tsx b/common_docs_zh/gettingStarted/demo-block/latest.tsx
index 7ed92ea380b..c2109942320 100644
--- a/common_docs_zh/gettingStarted/demo-block/latest.tsx
+++ b/common_docs_zh/gettingStarted/demo-block/latest.tsx
@@ -24,8 +24,8 @@ export default function Latest() {
                     </div>
                 </div> */}
                 <div className="home-page-hero-right">
-                    <a className="latest-button" href="https://ask.selectdb.com/">
-                        <div 
className="home-page-hero-button-label"><div>近期事件</div></div>
+                    <a className="latest-button" href="https://hdxu.cn/AfjED">
+                        <div 
className="home-page-hero-button-label"><div>近期活动</div></div>
                         <div className="latest-button-title">
                             {/* <div className="home-page-hero-button-icon">
                                 <svg width="24px" viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg">
@@ -33,10 +33,10 @@ export default function Latest() {
                                     <path fill="none" d="M0 0h24v24H0Z"></path>
                                 </svg>
                             </div> */}
-                            <div style={{ marginBottom: 10 }}>技术论坛全面升级上线!Ask 
and Discover</div>
+                            <div style={{ marginBottom: 10 }}>飞轮科技 x 字节跳动开源 
Meetup@北京站</div>
                         </div>
-                        <div style={{ fontSize: 12, marginBottom: 20 }}>联合众多 
Doris 
生态中的开发者、用户以及合作伙伴,共同发起和创建的问答社区。在这里,你可以自由的提出和讨论技术问题、分享和收获技术经验、与社区的小伙伴进行互动和交流。</div>
-                        <div style={{ fontSize: 14, marginBottom: 10 
}}>进入论坛</div>
+                        <div style={{ fontSize: 12, marginBottom: 20 
}}>来自抖音集团、飞轮科技、爱玛科技、中国电信、天翼云等多位行业技术专家,将为参会者带来多行业、跨领域的技术分享及落地实践。</div>
+                        <div style={{ fontSize: 14, marginBottom: 10 
}}>立即报名</div>
                     </a>
                     <a className="latest-button" 
href={`/zh-CN/docs${currentVersion === '' ? '' : 
`/${currentVersion}`}/releasenotes/v3.0/release-3.0.1`}>
                         <div 
className="home-page-hero-button-label"><div>版本发布</div></div>
diff --git a/gettingStarted/demo-block/latest.tsx b/gettingStarted/demo-block/latest.tsx
index 57acf7b8ea4..bcb88fc41d1 100644
--- a/gettingStarted/demo-block/latest.tsx
+++ b/gettingStarted/demo-block/latest.tsx
@@ -48,14 +48,9 @@ export default function Latest() {
                             </div> */}
                             <div style={{ marginBottom: 10 }}>Apache Doris 
3.0.1 just released</div>
                         </div>
-                        <div style={{ fontSize: 12, marginBottom: 20 }}>In 
this version, Apache Doris has improvements in compute-storage decoupling, 
lakehouse, semi-structured data analysis and more.</div>
+                        <div style={{ fontSize: 12, marginBottom: 20 }}>Apache 
Doris has improvements in compute-storage decoupling, lakehouse, 
semi-structured data analysis and more.</div>
                         <div style={{ fontSize: 14, marginBottom: 10 }}>Learn 
more</div>
                     </a>
-
-
-
-
-
                 </div>
                 {/* <div style={{ fontSize: '1rem', fontWeight: 500, width: 
600, marginTop: '1rem', color: '#1d1d1d' }}>学习路径</div> */}
 
diff --git a/src/components/recent-blogs/recent-blogs.data.ts b/src/components/recent-blogs/recent-blogs.data.ts
index 8d617b45591..29578311b49 100644
--- a/src/components/recent-blogs/recent-blogs.data.ts
+++ b/src/components/recent-blogs/recent-blogs.data.ts
@@ -1,4 +1,8 @@
 export const RECENT_BLOGS_POSTS = [
+    {
+        label: `Apache Doris 3.0.1 just released`,
+        link: 'https://doris.apache.org/blog/release-note-3.0.1',
+    },
     {
         label: 'Automatic and flexible data sharding: Auto Partition in Apache Doris',
         link: 'https://doris.apache.org/blog/auto-partition-in-apache-doris',
@@ -11,8 +15,5 @@ export const RECENT_BLOGS_POSTS = [
         label: 'Why Apache Doris is the Best Open Source Alternative to Rockset',
         link: 'https://doris.apache.org/blog/apache-doris-vs-rockset',
     },
-    {
-        label: `Steps to industry-leading query speed: evolution of the Apache Doris execution engine`,
-        link: 'https://doris.apache.org/blog/evolution-of-the-apache-doris-execution-engine',
-    }
+
 ];
diff --git a/src/constant/newsletter.data.ts b/src/constant/newsletter.data.ts
index 0eae1e7e59d..e0721c4c81b 100644
--- a/src/constant/newsletter.data.ts
+++ b/src/constant/newsletter.data.ts
@@ -1,4 +1,11 @@
 export const NEWSLETTER_DATA = [
+    {
+        tags: ['Top News'],
+        title: "How AI unicorn MiniMax migrated from Loki and built a PB-scale logging system with Apache Doris",
+        content: `Serving a PB-scale data size with over 99.9% availability, Apache Doris is the vital signs monitor of MiniMax, a generative AI startup backed by Alibaba.`,
+        to: '/blog/ai-unicorn-minimax-from-loki-and-built-a-pb-scale-logging-system-with-doris',
+        image: 'minimax-migrated-from-loki-to-doris.png',
+    },
     {
         tags: ['Release Note'],
         title: "Apache Doris version 3.0.1 just released",
@@ -21,13 +28,6 @@ export const NEWSLETTER_DATA = [
         to: '/blog/migrate-lakehouse-from-bigquery-to-doris',
         image: 'migrate-lakehouse-from-bigquery-to-apache-doris.jpg',
     },
-    {
-        tags: ['Top News'],
-        title: "Why Apache Doris is the Best Open Source Alternative to Rockset",
-        content: `Among of all the claim-to-be alternatives to Rockset, Apache Doris is one of the few that cover all the key features of Rockset.`,
-        to: '/blog/apache-doris-vs-rockset',
-        image: 'doris-vs-rockset.jpeg',
-    },
 
 
 ];
diff --git a/static/images/apache-doris-based-logging-system.png b/static/images/apache-doris-based-logging-system.png
new file mode 100644
index 00000000000..fdfae053c94
Binary files /dev/null and b/static/images/apache-doris-based-logging-system.png differ
diff --git a/static/images/minimax-migrated-from-loki-to-doris.png b/static/images/minimax-migrated-from-loki-to-doris.png
new file mode 100644
index 00000000000..9d8a4b33ace
Binary files /dev/null and b/static/images/minimax-migrated-from-loki-to-doris.png differ
diff --git a/static/images/the-old-grafana-Loki-based-logging-system.png b/static/images/the-old-grafana-Loki-based-logging-system.png
new file mode 100644
index 00000000000..eaa13b8b13b
Binary files /dev/null and b/static/images/the-old-grafana-Loki-based-logging-system.png differ
diff --git a/static/images/why-Apache-Doris.png b/static/images/why-Apache-Doris.png
new file mode 100644
index 00000000000..05cc69d78fb
Binary files /dev/null and b/static/images/why-Apache-Doris.png differ


