Sean-Gu commented on a change in pull request #982: Realtime doc for lambda mode
URL: https://github.com/apache/kylin/pull/982#discussion_r357547215
 
 

 ##########
 File path: website/_docs30/tutorial/lambda_mode_and_timezone_realtime_olap.md
 ##########
 @@ -0,0 +1,174 @@
+---
+layout: docs30
+title:  Lambda mode and Timezone in Real-time OLAP
+categories: tutorial
+permalink: /docs30/tutorial/lambda_mode_and_timezone_realtime_olap.html
+---
+
+Kylin v3.0.0 will release the real-time OLAP feature. With the newly added 
streaming receiver cluster, Kylin can query streaming data with sub-second 
latency. You can check [this tech 
blog](/blog/2019/04/12/rt-streaming-design/) for the overall design and core 
concepts. 
+
+If you want a step-by-step tutorial, please check [this 
tutorial](/docs30/tutorial/realtime_olap.html).
+In this article, we will introduce how to update segments and set the timezone 
for derived time columns in a real-time OLAP cube. 
+
+# Background
+
+Say we have Kafka messages that look like this:
+
+{% highlight Groff markup %}
+{
+    "s_nation":"SAUDI ARABIA",
+    "lo_supplycost":74292,
+    "p_category":"MFGR#0910",
+    "local_day_hour_minute":"09_21_44",
+    "event_time":"2019-12-09 08:44:50.000-0500",
+    "local_day_hour":"09_21",
+    "lo_quantity":12,
+    "lo_revenue":1411548,
+    "p_brand":"MFGR#0910051",
+    "s_region":"MIDDLE EAST",
+    "lo_discount":5,
+    "customer_info":{
+        "CITY":"CHINA    057",
+        "REGION":"ASIA",
+        "street":"CHINA    05721",
+        "NATION":"CHINA"
+    },
+    "d_year":1994,
+    "d_weeknuminyear":30,
+    "p_mfgr":"MFGR#09",
+    "v_revenue":7429200,
+    "d_yearmonth":"Jul1994",
+    "s_city":"SAUDI ARA15",
+    "profit_ratio":0.05263157894736842,
+    "d_yearmonthnum":199407,
+    "round":1
+}
+{% endhighlight %}
+
+This sample comes from SSB, with some additional fields such as 
*event_time*. The field *event_time* is the timestamp of the current event. 
+We assume that events come from countries in different timezones; 
"2019-12-09 08:44:50.000-0500" indicates an event that comes from the 
'America/New_York' timezone. You may have some events that come from 
'Asia/Shanghai' as well.
+
+*local_day_hour_minute* is a column whose value is in the local timezone; in 
this sample it is "GMT+8".
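
The relation between *event_time* and *local_day_hour_minute* in the sample message can be checked with a short script. This is only a sketch using the Python standard library: the function name `local_day_hour_minute` and the `offset_hours` parameter are hypothetical, and it does not claim to show how Kylin itself derives time columns.

```python
from datetime import datetime, timedelta, timezone

# Timestamp layout used by the sample message, e.g. "2019-12-09 08:44:50.000-0500".
EVENT_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f%z"


def local_day_hour_minute(event_time: str, offset_hours: int = 8) -> str:
    """Render an event timestamp as day_hour_minute in a fixed-offset zone.

    offset_hours=8 corresponds to the "GMT+8" zone mentioned in the text;
    the parameter name is an assumption made for this illustration.
    """
    # Parse the timestamp together with its UTC offset (-0500 in the sample).
    parsed = datetime.strptime(event_time, EVENT_TIME_FORMAT)
    # Shift the same instant into the target fixed-offset zone.
    local = parsed.astimezone(timezone(timedelta(hours=offset_hours)))
    return local.strftime("%d_%H_%M")


if __name__ == "__main__":
    print(local_day_hour_minute("2019-12-09 08:44:50.000-0500"))
```

For the sample message, 08:44 at UTC-5 is 13:44 UTC, which is 21:44 at GMT+8 on the same day, matching the `"local_day_hour_minute":"09_21_44"` value shown above.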
+
+### Question
+You want to do some real-time analysis, so you may consider using Realtime 
OLAP. But you may have some concerns, including:
+
+1. Since events come from different timezones, you may worry whether this 
will cause trouble or incorrect query results.
+2. In some cases, a Kafka message contains a value that is not actually what 
you want, say some dimension value is misspelled; how can you make 
corrections? (Or you want to retrieve some long-late messages that were dropped.)
 
 Review comment:
   2. How can I make corrections when Kafka messages contain a value that 
is not what I want, say ...
   3. How can I retrieve long-late messages that have been dropped?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
