[GitHub] [incubator-hudi] yihua commented on a change in pull request #884: [HUDI-240] Translate Use Cases page

GitBox Thu, 12 Sep 2019 11:50:21 -0700

yihua commented on a change in pull request #884: [HUDI-240] Translate Use Cases page
URL: https://github.com/apache/incubator-hudi/pull/884#discussion_r323895886


 ##########
 File path: docs/use_cases.cn.md
 ##########
 @@ -4,73 +4,65 @@ keywords: hudi, data ingestion, etl, real time, use cases
 sidebar: mydoc_sidebar
 permalink: use_cases.html
 toc: false
-summary: "Following are some sample use-cases for Hudi, which illustrate the benefits in terms of faster processing & increased efficiency"
+summary: "下面展示一些使用Hudi的示例，示例说明了加快处理速度和提高效率的好处"
 
 ---
 
 
 
-## Near Real-Time Ingestion
+## 近实时摄取
 
-Ingesting data from external sources like (event logs, databases, external sources) into a [Hadoop Data Lake](http://martinfowler.com/bliki/DataLake.html) is a well known problem.
-In most (if not all) Hadoop deployments, it is unfortunately solved in a piecemeal fashion, using a medley of ingestion tools,
-even though this data is arguably the most valuable for the entire organization.
+将外部源(如事件日志，数据库，外部源)的数据摄取到[Hadoop数据湖](http://martinfowler.com/bliki/DataLake.html)是一个众所周知的问题。
+尽管这些数据对整个组织来说是最有价值的，但不幸的是，在大多数(可能不是全部)Hadoop部署中都使用零散的方式解决，即使用混合摄取工具。
 
 
-For RDBMS ingestion, Hudi provides __faster loads via Upserts__, as opposed costly & inefficient bulk loads. For e.g, you can read the MySQL BIN log or [Sqoop Incremental Import](https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_incremental_imports) and apply them to an
-equivalent Hudi table on DFS. This would be much faster/efficient than a [bulk merge job](https://sqoop.apache.org/docs/1.4.0-incubating/SqoopUserGuide.html#id1770457)
-or [complicated handcrafted merge workflows](http://hortonworks.com/blog/four-step-strategy-incremental-updates-hive/)
+对于RDBMS摄取，Hudi提供__通过更新插入更快地加载__，而不是昂贵且低效的批量加载。例如，您可以读取MySQL BIN日志或[Sqoop增量导入](https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_incremental_imports)并将其应用于
+DFS上的等效Hudi表。这比[批量合并任务](https://sqoop.apache.org/docs/1.4.0-incubating/SqoopUserGuide.html#id1770457)更快/更有效率或[复杂的手工合并工作流](http://hortonworks.com/blog/four-step-strategy-incremental-updates-hive/)
 
 
-For NoSQL datastores like [Cassandra](http://cassandra.apache.org/) / [Voldemort](http://www.project-voldemort.com/voldemort/) / [HBase](https://hbase.apache.org/), even moderately big installations store billions of rows.
-It goes without saying that __full bulk loads are simply infeasible__ and more efficient approaches are needed if ingestion is to keep up with the typically high update volumes.
+对于NoSQL数据存储，如[Cassandra](http://cassandra.apache.org/) / [Voldemort](http://www.project-voldemort.com/voldemort/) / [HBase](https://hbase.apache.org/)，即使是中等规模大小也会存储数十亿行。
+毫无疑问， __全量加载不可行__ 如果摄取需要跟上较高的更新量，那么则需要更有效的方法。
 
 Review comment:
   Add a “，” after “全量加载不可行” ("full bulk loads are infeasible")?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
